Can I suggest going back to basics?
A single drive (HDD or SSD), with nothing else connected. Create a pool and fill it with rubbish.
If that works, add a few more drives and test again.
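The back-to-basics test above can be sketched roughly like this. The device name and pool name are placeholders, and this destroys whatever is on the disk:

```shell
# WARNING: destroys all data on the target disk. /dev/sdX and "scratch"
# are placeholders -- substitute your own device and pool name.
zpool create -f scratch /dev/sdX

# Fill it with rubbish and force the writes out to disk.
dd if=/dev/urandom of=/mnt/scratch/junk bs=1M count=10240 status=progress
zpool sync scratch

# Check for errors, then tear it down and repeat with more drives.
zpool status -v scratch
zpool destroy scratch
```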
I was already doing this with the extra SSDs that weren't in a zpool. No issues writing to them. I even moved all the miniSAS HD cables around to different cards, and even removed cards and added SAS expanders. I couldn't trigger a reboot until I imported Bunnies and wrote to it.
To check whether TrueNAS itself has an issue, you can boot another Linux distro that uses the same or a newer version of ZFS, import the Bunnies pool, and test writes.
If the resets continue, TrueNAS is not the problem, but it could still be a software issue - ZFS or something else.
Or you can try TrueNAS CORE, if its ZFS can import a pool created on SCALE, and check whether anything different shows up in the logs.
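For anyone trying this later: importing the pool read-only first is a safer way to poke at it from a rescue distro. The altroot path below is just an example; "Bunnies" is the pool from this thread:

```shell
# Scan for importable pools without actually importing anything.
zpool import

# Import read-only under a temporary mount point so nothing gets
# written until you're ready to test.
zpool import -o readonly=on -R /mnt/rescue Bunnies

# Once satisfied, re-import read/write and do a throwaway write test.
zpool export Bunnies
zpool import -R /mnt/rescue Bunnies
dd if=/dev/urandom of=/mnt/rescue/Bunnies/writetest bs=1M count=1024
```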
This was a great idea. Sadly, I already nixed the whole zpool.
Since that pool was the issue, I created and destroyed a bunch of zpools, running tons of benchmarks, until I was satisfied the new one wouldn't die.
After creating my new dRAID zpool with these SSDs, I went ahead and started copying data. It's been writing at ~2.0-2.5GB/s consistently with ZFS send/recv through TrueNAS from my HDD array. I'd say those are pretty good numbers considering the source drives.
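The copy itself was plain ZFS replication; a minimal version of that kind of transfer looks something like this (snapshot and dataset names are made up for illustration, not the ones actually used here):

```shell
# Snapshot the source dataset recursively, then replicate it to the
# new SSD pool. -R sends the full recursive stream; -uF on the receive
# side skips mounting and rolls back the target if needed.
zfs snapshot -r HDDPool/data@migrate
zfs send -R HDDPool/data@migrate | zfs recv -uF SSDPool/data
```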
I'm assuming something was botched with that zpool. Was it a TrueNAS or a ZFS issue? I can't tell anymore. It would've been good to know for posterity, but this sucked up 3 days of my time. I just want everything back to normal.
SSD zpool benchmarks
I did a ton of benchmarks and landed on these two configs:
Code:
## 40 x mirrors (only 4TB) @ 1M recordsize
`readwrite`
READ: bw=3892MiB/s (4081MB/s), 3892MiB/s-3892MiB/s (4081MB/s-4081MB/s), io=38.0GiB (40.8GB), run=10001-10001msec
WRITE: bw=4059MiB/s (4256MB/s), 4059MiB/s-4059MiB/s (4256MB/s-4256MB/s), io=39.6GiB (42.6GB), run=10001-10001msec
## 7 x draid2:5d:16c:1s & 2 x special mirror @ 1M recordsize
`readwrite`
READ: bw=3251MiB/s (3408MB/s), 3251MiB/s-3251MiB/s (3408MB/s-3408MB/s), io=31.8GiB (34.1GB), run=10002-10002msec
WRITE: bw=3436MiB/s (3603MB/s), 3436MiB/s-3436MiB/s (3603MB/s-3603MB/s), io=33.6GiB (36.0GB), run=10002-10002msec
## 57 x mirrors (17 x 2TB & 40 x 4TB) @ 1M recordsize & 16 jobs
`readwrite`
READ: bw=9514MiB/s (9976MB/s), 311MiB/s-1308MiB/s (326MB/s-1372MB/s), io=93.3GiB (100GB), run=10001-10042msec
WRITE: bw=9727MiB/s (10.2GB/s), 278MiB/s-1243MiB/s (292MB/s-1303MB/s), io=95.4GiB (102GB), run=10001-10042msec
`read`
READ: bw=22.0GiB/s (23.7GB/s), 339MiB/s-5145MiB/s (355MB/s-5395MB/s), io=221GiB (237GB), run=10001-10029msec
`write`
WRITE: bw=23.3GiB/s (25.0GB/s), 1396MiB/s-1592MiB/s (1464MB/s-1670MB/s), io=233GiB (250GB), run=10004-10014msec
`randread`
READ: bw=6460MiB/s (6774MB/s), 401MiB/s-409MiB/s (421MB/s-429MB/s), io=63.3GiB (67.9GB), run=10001-10028msec
`randwrite`
WRITE: bw=7584MiB/s (7953MB/s), 379MiB/s-630MiB/s (397MB/s-660MB/s), io=74.2GiB (79.7GB), run=10002-10023msec
## 7 x draid2:5d:16c:1s & 2 x special mirror @ 1M recordsize & 16 jobs
`readwrite`
READ: bw=11.6GiB/s (12.4GB/s), 390MiB/s-1320MiB/s (409MB/s-1384MB/s), io=116GiB (124GB), run=10002-10016msec
WRITE: bw=11.7GiB/s (12.5GB/s), 393MiB/s-1323MiB/s (413MB/s-1388MB/s), io=117GiB (126GB), run=10002-10016msec
`read`
READ: bw=17.7GiB/s (19.0GB/s), 551MiB/s-2811MiB/s (578MB/s-2947MB/s), io=177GiB (190GB), run=10001-10026msec
`write`
WRITE: bw=15.9GiB/s (17.0GB/s), 838MiB/s-1295MiB/s (878MB/s-1358MB/s), io=159GiB (171GB), run=10001-10016msec
`randread`
READ: bw=12.5GiB/s (13.5GB/s), 786MiB/s-819MiB/s (824MB/s-859MB/s), io=126GiB (135GB), run=10001-10015msec
`randwrite`
WRITE: bw=6626MiB/s (6948MB/s), 332MiB/s-633MiB/s (348MB/s-664MB/s), io=72.0GiB (77.3GB), run=10001-11130msec
## Benchmark script
zfs set primarycache=none Temp
fio --ioengine=libaio --filename=/mnt/Temp/performanceTest --direct=1 --sync=0 --rw=readwrite --bs=16M --numjobs=16 --iodepth=1 --runtime=10 --size=50G --time_based --name=fio
fio --ioengine=libaio --filename=/mnt/Temp/performanceTest --direct=1 --sync=0 --rw=read --bs=16M --numjobs=16 --iodepth=1 --runtime=10 --size=50G --time_based --name=fio
fio --ioengine=libaio --filename=/mnt/Temp/performanceTest --direct=1 --sync=0 --rw=write --bs=16M --numjobs=16 --iodepth=1 --runtime=10 --size=50G --time_based --name=fio
fio --ioengine=libaio --filename=/mnt/Temp/performanceTest --direct=1 --sync=0 --rw=randread --bs=16M --numjobs=16 --iodepth=1 --runtime=10 --size=50G --time_based --name=fio
fio --ioengine=libaio --filename=/mnt/Temp/performanceTest --direct=1 --sync=0 --rw=randwrite --bs=16M --numjobs=16 --iodepth=1 --runtime=10 --size=50G --time_based --name=fio
rm /mnt/Temp/performanceTest
zfs set primarycache=all Temp
I'm satisfied with getting an extra 100 TiB from using dRAID. The speeds are already at the max SMB Multichannel throughput I can support, and my NVMe drives in Windows are gen4, so each one individually should be able to saturate this link. I think I'm good!
One thing I'm surprised about is the relatively slow speeds. Considering each drive is directly addressable by the hardware, and PCIe 3.0 x8 is plenty of bandwidth for these SATA SSDs, I wish I could squeeze more speed out of them just for kicks. Oh well.
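As a back-of-the-envelope sanity check on that intuition (the per-lane and per-drive figures below are rough assumptions, not measurements from this setup):

```shell
#!/bin/sh
# Usable PCIe 3.0 bandwidth is roughly ~985 MB/s per lane after
# 128b/130b encoding; a SATA III SSD tops out around ~550 MB/s.
lanes=8
mb_per_lane=985
mb_per_ssd=550

hba_ceiling=$(( lanes * mb_per_lane ))        # MB/s through one x8 HBA
ssds_to_fill=$(( hba_ceiling / mb_per_ssd ))  # SSDs to saturate that HBA

echo "x8 HBA ceiling: ${hba_ceiling} MB/s"
echo "SATA SSDs to saturate it: ~${ssds_to_fill}"
```

So one x8 HBA stops scaling somewhere past a dozen or so SATA SSDs, which may partly explain why per-drive throughput looks modest when dozens of drives share each link through expanders.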