Unexpectedly Slow Read Performance with RaidZ (Scale)

im.thatoneguy

Dabbler
Joined
Nov 4, 2022
Messages
37
Intel Xeon Silver 4314 24MB 16-core 2.4GHz
256 GB ECC RAM
Supermicro X12SPI-TF Motherboard
Broadcom 3808 IT Mode HBA
Intel X550 10GbE (but the slowness is local, not even over SMB)
21x Western Digital WDC HC550 WUH721816ALE6L4
3x 7-wide RAIDZ2 vdevs

I've got three 7-drive RAIDZ2 vdevs and a 16-core processor, but I'm only seeing read speeds of ~600MB/s. Write speeds are actually higher, about 900MB/s, which is surprising since, if anything, the RAIDZ2 writes should be the ones getting bogged down by parity calculations?

Process: run the test once to create the data file, then read a decoy file so it gets fed into ARC (pushing the original out), then run the read test again on the original file.

Code:
sudo fio --filename=/mnt/pool/dataset/file.dat --rw=read --direct=1 --bs=1M --ioengine=libaio --numjobs=1 --group_reporting --name=seq_write --iodepth=32 --size=128G

sudo fio --filename=/mnt/pool/dataset/decoy.dat --rw=read --direct=1 --bs=1M --ioengine=libaio --numjobs=1 --group_reporting --name=seq_write --iodepth=32 --size=128G

sudo fio --filename=/mnt/pool/dataset/file.dat --rw=read --direct=1 --bs=1M --ioengine=libaio --numjobs=1 --group_reporting --name=seq_write --iodepth=32 --size=128G
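
(The matching sequential write test is just the same command with --rw=write swapped in, for anyone who wants to reproduce the ~900MB/s figure:)
Code:
sudo fio --filename=/mnt/pool/dataset/file.dat --rw=write --direct=1 --bs=1M --ioengine=libaio --numjobs=1 --group_reporting --name=seq_write --iodepth=32 --size=128G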


I also had zpool iostat 5 running in the background to confirm what's actually being read from the disks.
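
The verbose form gives a per-vdev/per-disk breakdown, which makes it easier to see whether reads are spread evenly across the three RAIDZ2 vdevs (pool name below is a placeholder):
Code:
zpool iostat -v pool 5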

If I let it read from ARC it hits 3,300MB/s, so the CPU doesn't seem to have any inherent issues.
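
Another thing I could try to take ARC completely out of the picture (instead of relying on the decoy read) is caching metadata only on the test dataset while benchmarking, then reverting afterwards:
Code:
sudo zfs set primarycache=metadata pool/dataset
# ... run the fio reads ...
sudo zfs inherit primarycache pool/dataset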

Am I just testing wrong? I'm not sure where to even start troubleshooting. My goal for the system was ~10Gb/s read/write, i.e. saturating the 10GbE link.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
OK, well I have a pool similar to yours (8 disks per VDEV rather than 7).

The read test I usually run looks like this:

Code:
fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=read --size=50g --io_size=1500g --blocksize=1M --iodepth=16 --direct=1 --numjobs=16 --runtime=120 --group_reporting

That assumes you have no compression on the dataset and have set its recordsize to 1M.
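
You can check both in one go (the dataset path is just an example):
Code:
zfs get compression,recordsize pool/dataset
zfs set recordsize=1M pool/dataset   # only applies to files written after the change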

What I see out of that is:
Code:
bw (  MiB/s): min=  319, max=16477, per=100.00%, avg=11145.28, stdev=206.48, samples=3808
iops        : min=  304, max=16464, avg=11138.68, stdev=206.56, samples=3808


direct=1 (which I see you also used) should avoid ARC (by requesting unbuffered I/O), and I see all the disks' activity lights almost fully lit the whole time, so I'm relatively confident that's what's happening. I guess that makes the size of your test file and the decoy read a little overkill.
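
If you want harder evidence than activity lights, watching the ARC hit/miss counters while the test runs is a decent cross-check (arcstat should be available on SCALE as part of OpenZFS; the argument is the sampling interval in seconds):
Code:
arcstat 1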
 

im.thatoneguy

Dabbler
Joined
Nov 4, 2022
Messages
37
Well you're getting 11GiB/s so you're definitely just testing your RAM. :D

Someone on Discord also randomly ran into the same thing and posted last night:

BloodyGent — Yesterday at 7:24 PM
Did a quick benchmark for large file transfers using SMB internally, to and from an NVMe pool. These are some quick results. Interesting that reads are so horrible with RAIDZ, below single-drive performance.

SMB large file transfer bandwidth:
Z1 (3+1) vdev: write 360-400 MB/s | read 80-110 MB/s
Z1 (2+1) vdev + hot spare: write 100-250 MB/s | read 80-110 MB/s
Z2 (2+2) vdev: write 340-380 MB/s | read UNTESTED
2+2 striped mirror vdevs: write 380-420 MB/s | read 580-650 MB/s

These are fast WD HC550 16TB HDDs with 512MB cache. Their peak bandwidth is around 260MB/s according to the spec sheet.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Well you're getting 11GiB/s so you're definitely just testing your RAM. :D
At least partially true...

The setting that's different between us (and that matters) is numjobs.

At 16, I get 16 "clients" requesting data, which gives at least a little (and in the case of 16, a lot of) benefit from data already sitting in the disks' cache, so I'm able to push at more or less the speed of the controller (which is 12Gb/s SAS).

If I push that number down to 1, I get more like 500MB/s; at 2, I get 1200MB/s.
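
If you want to see where that curve flattens out on your pool, a quick sweep over numjobs does it (file name and sizes here are just examples, trimmed down so the sweep doesn't take forever):
Code:
for j in 1 2 4 8 16; do
  fio --name=TEST_j${j} --eta-newline=5s --filename=fio-tempfile.dat --rw=read --size=50g --io_size=200g --blocksize=1M --iodepth=16 --direct=1 --numjobs=${j} --runtime=60 --group_reporting | grep 'READ: bw'
done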

I can confirm that ARC isn't coming into play: I ran your decoy process, and even with 16 jobs I get 10, 9 and 8GB/s across the 3 runs.
 