Write speeds of 490 MB/s? OK? 6 x SSD on 2 vdevs

daemonix

Dabbler
Joined
Jun 3, 2022
Messages
21
Hi all,
This is a new setup with 6x 8TB 870 QVO SSDs (I know, not the best, but that's what I have :) ).
I have 2 raidz1 vdevs with 3 disks per vdev (I've been reading about IOPS limits, so 2 vdevs should be a bit better, no?).
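(In plain zfs terms the layout is equivalent to something like the line below; the pool name and device names are just placeholders for whatever the system actually uses.)

zpool create tank raidz1 da0 da1 da2 raidz1 da3 da4 da5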

Writes over the network (10G) give me the same performance as a local dd (dd if=/dev/urandom of=ostechnix.bin bs=50M count=200), around 490 MB/s. Reads over SMB give me 1.1 GB/s, but I'm not sure if that's cached. The system has 64GB RAM.

From the "Reporting" each disk seems to be writing with 100ish mbyte/s and IO is super low.

Any RTFM I need to do?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
dd if=/dev/urandom
That's a CPU bottleneck if your pool can go faster than the CPU can generate random data (usually the case)... write the random data to a file first, then use that file as the input.
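Something along these lines, for example (paths and sizes are just placeholders; keep the source file somewhere fast, or small enough to sit in the RAM cache, so reading it doesn't become the new bottleneck):

# generate the random data once (this part is CPU-bound, but it only runs once)
dd if=/dev/urandom of=/tmp/random.bin bs=1M count=8192
# then time writing the pre-generated file to the pool
dd if=/tmp/random.bin of=/mnt/tank/dataset/test.bin bs=1M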

Or learn about and use fio to test your disks.

Since it seems your network can match the requirement, there's probably no point in running iperf3 yet.

With 6 disks but only the IOPS of 2 at your disposal, if the transfer you're doing is IOPS-heavy, you'll be slow.

Getting the same performance across the network as with dd suggests that it's not sync writes in either case (since dd wouldn't be).

Also, I would consider what you're getting to be "normal" for that pool setup: you won't benefit much from wide vdevs, so writes are probably limited to the throughput of roughly 2 disks (i.e. you're already near the max). As you can see, your reads are better because the load can be spread across more drives (maybe even all 6).
 

daemonix

Dabbler
Joined
Jun 3, 2022
Messages
21
Hi, Thanks for the reply!

iperf3, even with a single stream, gets 8.5 to 9.x Gbit/s, so I'm good there even without jumbo frames (command below for reference).
Regarding the disks: each 870 QVO seems to get (in large media reviews) 130 to 230 MB/s for random writes and 400+ MB/s sequential.
The disks are on an LSI 9300-8i.
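(The single-stream test is just iperf3 -s on one end and iperf3 -c <other-end-ip> -t 30 on the other; adding -P would run parallel streams, but that wasn't needed here.)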

What do you think is a better strategy to test performance? fio?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
fio every time for raw pool performance
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
I have been playing around with fio a little to amuse myself today on a Samsung 970 Evo Plus 1TB NVMe (just a single-disk pool).

Smaller blocks (aligned with the default dataset recordsize of 128K):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=write --size=50g --io_size=1500g --blocksize=128k --iodepth=16 --direct=1 --numjobs=16 --runtime=120 --group_reporting

bw ( MiB/s): min= 706, max=26151, per=99.62%, avg=2171.26, stdev=167.80, samples=3824
iops : min= 5648, max=209203, avg=17369.89, stdev=1342.36, samples=3824

Then for the same pool, larger blocks (1M, to go with what OpenZFS recommends for a pool of large files like media):

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=write --size=50g --io_size=1500g --blocksize=1m --iodepth=16 --direct=1 --numjobs=16 --runtime=120 --group_reporting

bw ( MiB/s): min= 684, max=13806, per=100.00%, avg=3112.64, stdev=124.31, samples=3824
iops : min= 684, max=13803, avg=3112.19, stdev=124.28, samples=3824

Interesting to see how IOPS trade off against throughput (shown as bandwidth, bw).
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
And because I have an 860 QVO 1TB in that same system:

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=write --size=50g --io_size=1500g --blocksize=128k --iodepth=16 --direct=1 --numjobs=16 --runtime=120 --group_reporting

bw ( MiB/s): min= 527, max=33447, per=100.00%, avg=3992.29, stdev=242.52, samples=3824
iops : min= 4215, max=267578, avg=31938.26, stdev=1940.18, samples=3824

fio --name TEST --eta-newline=5s --filename=fio-tempfile.dat --rw=write --size=50g --io_size=1500g --blocksize=1m --iodepth=16 --direct=1 --numjobs=16 --runtime=120 --group_reporting

bw ( MiB/s): min= 990, max=13193, per=100.00%, avg=1480.77, stdev=102.61, samples=3824
iops : min= 990, max=13191, avg=1480.69, stdev=102.58, samples=3824

I should add that the QVO is idle, while the Evo NVMe (in my previous post) is busy doing other tasks too, so it's maybe not a fair comparison.

Clearly the QVO struggles with larger blocks.

But it does a shade better when I give it a dataset aligned to a 1M recordsize (setup sketch after the numbers):

bw ( MiB/s): min= 1142, max=10814, per=99.97%, avg=1677.46, stdev=108.58, samples=3824
iops : min= 1142, max=10812, avg=1677.09, stdev=108.51, samples=3824
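In case anyone wants to reproduce that, a 1M-recordsize dataset is nothing special; something like the lines below does it (pool/dataset names are placeholders):

# create a dataset that stores files in up to 1M records
zfs create -o recordsize=1M tank/fio-1m
# or change an existing dataset (only affects newly written blocks)
zfs set recordsize=1M tank/fio-1m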
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
Just as a note: modern desktop-class NVMe drives may give very different write results depending on the percentage of used space, thanks to SLC caching. Usually it means the first 1/6 to 1/8 of the free space can be written extremely fast, but the rest will land somewhere between slow and extremely slow, depending on the controller, DRAM cache and firmware. That is OK for their typical use in desktops, but benchmarking them is a pain.
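One way to look past that cache with fio (just a sketch; the path and the 200g size are placeholders, pick a size well beyond any plausible SLC cache for the drive) is to log bandwidth over time instead of trusting the average:

fio --name=slc-test --filename=/mnt/tank/fio-tempfile.dat --rw=write --blocksize=1m --size=200g --iodepth=16 --direct=1 --group_reporting --write_bw_log=slc-test --log_avg_msec=1000

The resulting bandwidth log then shows where the sustained rate falls off once the cache fills.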
 

daemonix

Dabbler
Joined
Jun 3, 2022
Messages
21
For future reference: 6 QVO disks in 2 vdevs do indeed saturate a 10GbE link in both read and write, without any special SLOG etc.
 