
SOLVED TrueNAS Scale with Intel Optane 900P Slow Performance

Kartright

Dabbler
Joined
Feb 3, 2022
Messages
11
Hi everyone :smile:

I'm new to the forum and hope I'm doing this correctly!

I'm writing this post in the hope of getting some help from the experts with my current TrueNAS-SCALE-22.02-RC.2 build. I do not have any read or write speed figures for comparison, but the new NAS build feels very slow. It should be very fast, yet I am seeing big performance penalties.

The new NAS was built with the following hardware:
HPE DL380 Gen9 12xLFF Storage Server
2x Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
128GB DDR4 ECC
HPE Smart Array P840ar Controller (HBA mode)
HPE Ethernet 10Gb 2-port 530 FLR-SFP+ FLOM Adapter

The server boots from a pool that consists of the following hardware. It was automatically created and configured during the TrueNAS Scale installation.
1x Samsung Datacenter SSD PM893 240GB SATA 6G

The main storage pool consists of the following hardware.
12x Seagate Exos 16TB HDD SAS 12G, configured as ZFS RAIDz3 with sync=always
2x Intel Optane SSD 900P, configured as mirrored SLOG

After setting up and configuring TrueNAS Scale, I performed the following tests. I hope these tests are useful and represent the correct way to test a ZFS pool.

Direct ZFS pool tests:

Read:
Code:
dd if=tmp.dat bs=2048k of=/dev/null count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 14.2386 s, 7.5 GB/s

Without SLOG (sync=standard):
Code:
dd if=/dev/zero of=tmp.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 43.4737 s, 2.5 GB/s


With SLOG (sync=always):
Code:
dd if=/dev/zero of=tmp.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 121.531 s, 884 MB/s

SMB tests:

Read:
Code:
dd if=testwrite bs=2048k of=/dev/null count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 232.358 s, 462 MB/s


Write (without SLOG and sync=standard):
Code:
dd if=/dev/zero of=tmp_smb.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 124.622 s, 862 MB/s


Write (with SLOG and sync=always):
Code:
dd if=/dev/zero of=tmp_smb.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 124.535 s, 862 MB/s

NFS tests (tmp_nfs_a.dat):

Read:
Code:
dd if=tmp_nfs_a.dat bs=2048k of=/dev/null count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 273.052 s, 393 MB/s


Write (without SLOG and sync=standard):
Code:
dd if=/dev/zero of=tmp_nfs_a.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 116.46 s, 922 MB/s


Write (with SLOG and sync=always):
Code:
dd if=/dev/zero of=tmp_nfs_a.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 118.518 s, 906 MB/s

NFS tests (tmp_nfs_s.dat):

Read:
Code:
dd if=tmp_nfs_s.dat bs=2048k of=/dev/null count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 236.775 s, 453 MB/s


Write (without SLOG and sync=standard):
Code:
dd if=/dev/zero of=tmp_nfs_s.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 496.207 s, 216 MB/s


Write (with SLOG and sync=always):
Code:
dd if=/dev/zero of=tmp_nfs_s.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes (107 GB, 100 GiB) copied, 472.944 s, 227 MB/s

The ZFS Pool is configured in the following way:
Name: tank/nas
Sync: Always
Compression Level: LZ4
Enable Atime: Off
ZFS Deduplication: Off
Read-Only: Off
Exec: Off
Snapshot Directory: Invisible
Copies: 1
ACL Type: Off
Type: Generic

I read here in the forum that TrueNAS Scale is not tuned yet, so might this explain the performance results I get? If yes, what alternatives do I have?
I also saw in a similar thread that Intel Optane should perform much better than it does on my system. Any ideas?


Is the performance really bad, or am I misinterpreting something?
Do you see any problems with the current hardware and configuration?
Is this NAS performing well (compared to similar systems)?
Are the tests above useful?
What can be improved?
Should I access the NAS via NFS with the NFS sync option when the ZFS pool is already using sync=always? I saw very slow performance in my tests above.



Thanks in advance!
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
1,515
Kartright said:
Are the tests above useful?
What can be improved?
Should I access the NAS via NFS with the NFS sync option when the ZFS pool is already using sync=always? I saw very slow performance in my tests above.

dd is not a good way of testing NAS performance.
It is single threaded and only allows one IO at a time.

Better to use a dedicated performance testing tool like fio.
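
To illustrate the point about dd issuing one IO at a time, here is a minimal sketch of an fio job file that runs several workers with a deeper queue. The helper name, job name, worker count, queue depth, block size, file size and runtime are all illustrative assumptions, not values anyone in this thread recommended; the directory is the test path used later in the thread. Keep in mind that test files much smaller than RAM may be served from ARC on reads.

```shell
#!/bin/sh
# Sketch: generate an fio job file with several parallel workers.
# All numbers below are illustrative assumptions, not tuned recommendations.
write_fio_job() {
    job_file="$1"   # where to write the job file
    test_dir="$2"   # directory on the pool under test
    printf '%s\n' \
        '[global]' \
        "directory=$test_dir" \
        'ioengine=libaio' \
        'direct=1' \
        'bs=1M' \
        'size=8G' \
        'runtime=60' \
        'time_based' \
        'group_reporting' \
        '' \
        '[seqwrite]' \
        'rw=write' \
        'numjobs=4' \
        'iodepth=16' \
        > "$job_file"
}

# Write the job file; the path below is the one used in this thread.
write_fio_job /tmp/seqwrite.fio /mnt/tank/nas/test
# Then, on the NAS itself:  fio /tmp/seqwrite.fio
```

Running the same job once with numjobs=1 and iodepth=1 gives a baseline comparable to a single dd stream.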
 
Elliot Dierksen

Joined
Dec 29, 2014
Messages
1,054
I will steal some thunder from @jgreco and point out that the RAID card in HBA mode is not ideal.

https://www.truenas.com/community/r...bas-and-why-cant-i-use-a-raid-controller.139/

What is your storage use case? A 12-wide Z3 vdev isn't going to get you the most performance. I finally went with the consensus for the storage I use for my ESXi server and have 8 mirrored vdevs, and that helps a lot. I am assuming you are using the Optane as an SLOG (as I am doing); you may want to tune some of the variables that allow the system to hold more dirty data before writing it to spinning rust (aka hard drives). @morganL makes a good point about using a more purpose-built tool for disk testing. You should also use iperf between the target systems to make sure you are getting 90-95% of your hardware speed on the network connections.
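
On TrueNAS SCALE (Linux), OpenZFS exposes its module parameters under /sys/module/zfs/parameters. Which variables were meant above is not stated, so the sketch below assumes zfs_dirty_data_max, zfs_dirty_data_max_percent and zfs_txg_timeout, and it only reads them. The helper name is made up for illustration.

```shell
#!/bin/sh
# Sketch: print the OpenZFS write-buffering tunables (read-only).
# The choice of these three parameters is an assumption, see the text above.
show_zfs_write_tunables() {
    params_dir="${1:-/sys/module/zfs/parameters}"
    for name in zfs_dirty_data_max zfs_dirty_data_max_percent zfs_txg_timeout; do
        if [ -r "$params_dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$params_dir/$name")"
        else
            printf '%s=(not available)\n' "$name"
        fi
    done
}

show_zfs_write_tunables
```

Recording the current values first makes it easy to return to the defaults if a change does not help.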
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
15,876
Elliot Dierksen said:
I will steal some thunder from @jgreco and point out that the RAID card in HBA mode

No worries. You nailed it; the P840 is based off the trainwreck CISS driver and is therefore expected to be highly problematic. "HBA mode" is a synonym for "bald faced lie," or, if we're being generous, "merely lobotomized RAID card".

But I can circle around and nail the thing you missed, grin. ;-)

Kartright said:
HPE Ethernet 10Gb 2-port 530 FLR-SFP+ FLOM Adapter

These aren't particularly stellar either. They're based on some crappy QLogic chipset (now Broadcom 57810). There's a REASON they're selling for only $15.00 on eBay. They're not great cards. They weren't at the time either.

It could be that the Linux driver support for this network card is better than the FreeBSD support, but replacing it is recommended if problems persist.

morganL said:
dd is not a good way of testing NAS performance.
It is single threaded and only allows one IO at a time.

Well, that's true, but the poster is using RAIDZ which is optimized towards low numbers of primarily sequential workloads. Also, most people tend to care about that single threaded model more than full potential throughput of the pool anyways, because it's the copying data back and forth to the NAS via SMB that they care about, etc.

Kartright said:
Compression Level: LZ4

So you were really testing compression efficiency with your dd commands. Rather than using /dev/zero, create a temporary file from /dev/random (which will have limited speed, so the speed there is meaningless). Then copy THAT on your pool. If you make a 1GB random file, you can do something like

% cat myrandom1g{,,,,,,,,,,,,,,,,,,,,,,,,} | dd of=testwrite.dat bs=1048576

which is structured such that the "myrandom1g" file ends up in ARC, but doesn't compress much, so the dd test results in the actual write speed of the pool, more or less. You can of course add commas to suit. But it's also important to remember that ZFS benchmarks are often garbage, or at least not particularly trustworthy/repeatable/meaningful, even if you know exactly what you're doing.
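
The approach above can be sketched as two small helpers: one creates an incompressible source file, the other streams several copies of it into dd. This is only a sketch under assumptions: the function names are hypothetical, /dev/urandom is used in place of /dev/random, and a loop replaces the brace expansion so that it also works in a plain POSIX shell.

```shell
#!/bin/sh
# Sketch: write incompressible data so LZ4 cannot flatter the result.
make_random_file() {
    # $1 = output path, $2 = size in bytes
    head -c "$2" /dev/urandom > "$1"
}

write_copies() {
    # $1 = source file, $2 = number of copies, $3 = destination on the pool
    src="$1"; copies="$2"; dest="$3"
    i=0
    while [ "$i" -lt "$copies" ]; do
        cat "$src"
        i=$((i + 1))
    done | dd of="$dest" bs=1048576
}

# Example (sizes are illustrative; 1073741824 bytes = 1 GiB):
# make_random_file myrandom1g 1073741824
# write_copies myrandom1g 25 testwrite.dat
```

Because dd reads from a pipe here, it may report partial records (a count like "0+N"); the byte total and elapsed time are still the figures to look at.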
 

Kartright

Thanks for all the replies so far!

morganL said:
Better to use a dedicated performance testing tool like fio.

I tested using the tests from the following link:

Results are:

Random write test for IOP/s
Code:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=randwrite --ramp_time=4
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][98.1%][w=27.9MiB/s][w=7142 IOPS][eta 00m:03s]
test: (groupid=0, jobs=1): err= 0: pid=1998556: Fri Feb  4 16:35:13 2022
  write: IOPS=6839, BW=26.7MiB/s (28.0MB/s)(4027MiB/150724msec); 0 zone resets
    slat (usec): min=62, max=473862, avg=140.91, stdev=702.07
    clat (nsec): min=1070, max=536429, avg=2585.70, stdev=2237.90
     lat (usec): min=64, max=473877, avg=143.93, stdev=702.16
    clat percentiles (nsec):
     |  1.00th=[ 1448],  5.00th=[ 1576], 10.00th=[ 1672], 20.00th=[ 1912],
     | 30.00th=[ 2480], 40.00th=[ 2608], 50.00th=[ 2672], 60.00th=[ 2704],
     | 70.00th=[ 2768], 80.00th=[ 2896], 90.00th=[ 3088], 95.00th=[ 3344],
     | 99.00th=[ 4640], 99.50th=[ 5344], 99.90th=[12096], 99.95th=[15296],
     | 99.99th=[27008]
   bw (  KiB/s): min= 9539, max=35480, per=100.00%, avg=27361.58, stdev=3657.79, samples=301
   iops        : min= 2384, max= 8870, avg=6840.31, stdev=914.46, samples=301
  lat (usec)   : 2=22.35%, 4=76.07%, 10=1.45%, 20=0.11%, 50=0.02%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=4.37%, sys=49.85%, ctx=2066823, majf=1, minf=971
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1030819,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=26.7MiB/s (28.0MB/s), 26.7MiB/s-26.7MiB/s (28.0MB/s-28.0MB/s), io=4027MiB (4222MB), run=150724-150724msec
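
As a rough sanity check on the run above: with iodepth=1 only one request is in flight, so IOPS is approximately the inverse of the average total latency, and bandwidth is IOPS times block size. The helper names below are made up for illustration; the inputs are the avg lat (143.93 usec), IOPS (6839) and 4k block size reported by fio.

```shell
#!/bin/sh
# Sketch: relate queue-depth-1 latency, IOPS and bandwidth.
iops_from_latency_usec() {
    # $1 = average total latency per IO in microseconds
    awk -v lat="$1" 'BEGIN { printf "%.0f\n", 1000000 / lat }'
}

mbps_from_iops() {
    # $1 = IOPS, $2 = block size in bytes; result in MB/s (10^6 bytes)
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs / 1000000 }'
}

iops_from_latency_usec 143.93   # close to the 6839 IOPS fio reported
mbps_from_iops 6839 4096        # matches the 28.0 MB/s fio reported
```

In other words, this result is bounded by per-request latency at queue depth 1, not by the raw throughput of the disks.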

Random Read test for IOP/s
Code:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=randread --ramp_time=4
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [r(1)][81.0%][r=398MiB/s][r=102k IOPS][eta 00m:04s]
test: (groupid=0, jobs=1): err= 0: pid=2866151: Fri Feb  4 16:37:05 2022
  read: IOPS=65.7k, BW=257MiB/s (269MB/s)(3247MiB/12657msec)
    slat (usec): min=3, max=549, avg=13.95, stdev=18.16
    clat (nsec): min=604, max=444275, avg=731.07, stdev=1274.23
     lat (usec): min=3, max=551, avg=14.77, stdev=18.35
    clat percentiles (nsec):
     |  1.00th=[  628],  5.00th=[  636], 10.00th=[  644], 20.00th=[  644],
     | 30.00th=[  652], 40.00th=[  652], 50.00th=[  660], 60.00th=[  668],
     | 70.00th=[  684], 80.00th=[  876], 90.00th=[  924], 95.00th=[  948],
     | 99.00th=[ 1032], 99.50th=[ 1224], 99.90th=[ 2160], 99.95th=[ 6752],
     | 99.99th=[16320]
   bw (  KiB/s): min=196032, max=535048, per=98.18%, avg=257947.96, stdev=70103.51, samples=25
   iops        : min=49008, max=133762, avg=64486.88, stdev=17525.91, samples=25
  lat (nsec)   : 750=78.40%, 1000=20.19%
  lat (usec)   : 2=1.23%, 4=0.11%, 10=0.01%, 20=0.05%, 50=0.01%
  lat (usec)   : 500=0.01%
  cpu          : usr=8.81%, sys=91.17%, ctx=28, majf=0, minf=63
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=831353,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=257MiB/s (269MB/s), 257MiB/s-257MiB/s (269MB/s-269MB/s), io=3247MiB (3405MB), run=12657-12657msec

Mixed Random Workload
Code:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=readwrite --ramp_time=4
test: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [M(1)][93.8%][r=28.7MiB/s,w=28.3MiB/s][r=7355,w=7246 IOPS][eta 00m:05s]
test: (groupid=0, jobs=1): err= 0: pid=2867096: Fri Feb  4 16:39:01 2022
  read: IOPS=6896, BW=26.9MiB/s (28.2MB/s)(1936MiB/71868msec)
    slat (usec): min=3, max=43247, avg=12.07, stdev=62.48
    clat (nsec): min=679, max=452581, avg=1694.52, stdev=1770.21
     lat (usec): min=4, max=43250, avg=13.96, stdev=62.53
    clat percentiles (nsec):
     |  1.00th=[  860],  5.00th=[ 1012], 10.00th=[ 1624], 20.00th=[ 1656],
     | 30.00th=[ 1672], 40.00th=[ 1688], 50.00th=[ 1704], 60.00th=[ 1736],
     | 70.00th=[ 1752], 80.00th=[ 1768], 90.00th=[ 1800], 95.00th=[ 1832],
     | 99.00th=[ 2480], 99.50th=[ 2544], 99.90th=[ 3120], 99.95th=[ 9536],
     | 99.99th=[13888]
   bw (  KiB/s): min=18032, max=33200, per=99.92%, avg=27564.61, stdev=1953.09, samples=143
   iops        : min= 4508, max= 8300, avg=6891.09, stdev=488.27, samples=143
  write: IOPS=6886, BW=26.9MiB/s (28.2MB/s)(1933MiB/71868msec); 0 zone resets
    slat (usec): min=59, max=170528, avg=125.27, stdev=510.20
    clat (nsec): min=1142, max=446374, avg=2709.78, stdev=2012.39
     lat (usec): min=60, max=170540, avg=128.44, stdev=510.24
    clat percentiles (nsec):
     |  1.00th=[ 1496],  5.00th=[ 1816], 10.00th=[ 2576], 20.00th=[ 2640],
     | 30.00th=[ 2672], 40.00th=[ 2704], 50.00th=[ 2704], 60.00th=[ 2736],
     | 70.00th=[ 2768], 80.00th=[ 2832], 90.00th=[ 2928], 95.00th=[ 3056],
     | 99.00th=[ 3600], 99.50th=[ 3856], 99.90th=[ 5984], 99.95th=[11456],
     | 99.99th=[15808]
   bw (  KiB/s): min=17960, max=34048, per=99.90%, avg=27519.25, stdev=1861.64, samples=143
   iops        : min= 4490, max= 8512, avg=6879.76, stdev=465.42, samples=143
  lat (nsec)   : 750=0.02%, 1000=2.37%
  lat (usec)   : 2=49.44%, 4=47.95%, 10=0.16%, 20=0.06%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%
  cpu          : usr=6.88%, sys=45.60%, ctx=989819, majf=0, minf=91
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=495619,494904,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=26.9MiB/s (28.2MB/s), 26.9MiB/s-26.9MiB/s (28.2MB/s-28.2MB/s), io=1936MiB (2030MB), run=71868-71868msec
  WRITE: bw=26.9MiB/s (28.2MB/s), 26.9MiB/s-26.9MiB/s (28.2MB/s-28.2MB/s), io=1933MiB (2027MB), run=71868-71868msec

Sequential write test for throughput
Code:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4M --size=4G --readwrite=write --ramp_time=4
test: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [W(1)][42.9%][w=621MiB/s][w=155 IOPS][eta 00m:08s]
test: (groupid=0, jobs=1): err= 0: pid=3065934: Fri Feb  4 16:39:36 2022
  write: IOPS=148, BW=595MiB/s (624MB/s)(1140MiB/1917msec); 0 zone resets
    slat (msec): min=4, max=129, avg= 6.94, stdev= 8.81
    clat (nsec): min=1989, max=28696, avg=4774.37, stdev=2519.47
     lat (msec): min=4, max=129, avg= 6.74, stdev= 8.10
    clat percentiles (nsec):
     |  1.00th=[ 2064],  5.00th=[ 2640], 10.00th=[ 2896], 20.00th=[ 3376],
     | 30.00th=[ 3664], 40.00th=[ 4128], 50.00th=[ 4512], 60.00th=[ 4704],
     | 70.00th=[ 4832], 80.00th=[ 5152], 90.00th=[ 6432], 95.00th=[ 9280],
     | 99.00th=[16064], 99.50th=[22912], 99.90th=[28800], 99.95th=[28800],
     | 99.99th=[28800]
   bw (  KiB/s): min=614400, max=729088, per=100.00%, avg=666720.33, stdev=58000.40, samples=3
   iops        : min=  150, max=  178, avg=162.67, stdev=14.19, samples=3
  lat (usec)   : 2=0.70%, 4=37.32%, 10=58.45%, 20=2.82%, 50=0.70%
  cpu          : usr=3.91%, sys=65.97%, ctx=1218, majf=0, minf=61
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,284,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=595MiB/s (624MB/s), 595MiB/s-595MiB/s (624MB/s-624MB/s), io=1140MiB (1195MB), run=1917-1917msec

Sequential Read test for throughput
Code:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4M --size=4G --readwrite=read --ramp_time=4
test: (g=0): rw=read, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=0)
test: (groupid=0, jobs=1): err= 0: pid=3120437: Fri Feb  4 16:39:58 2022
  read: IOPS=507, BW=2030MiB/s (2128MB/s)(4096MiB/2018msec)
  cpu          : usr=0.35%, sys=99.50%, ctx=2, majf=0, minf=526
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1024,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=2030MiB/s (2128MB/s), 2030MiB/s-2030MiB/s (2128MB/s-2128MB/s), io=4096MiB (4295MB), run=2018-2018msec

Elliot Dierksen said:
You should also use iperf between the target systems to make sure you are getting 90-95% of your hardware speed on the network connections.

I ran iperf and have almost full 10Gbit.
Code:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.08 GBytes  9.30 Gbits/sec    0   1.93 MBytes
[  5]   1.00-2.00   sec  1.09 GBytes  9.39 Gbits/sec    0   2.04 MBytes
[  5]   2.00-3.00   sec  1.08 GBytes  9.28 Gbits/sec    0   2.04 MBytes
[  5]   3.00-4.00   sec  1.09 GBytes  9.34 Gbits/sec    0   2.04 MBytes
[  5]   4.00-5.00   sec  1.09 GBytes  9.34 Gbits/sec    0   2.14 MBytes
[  5]   5.00-6.00   sec  1.09 GBytes  9.39 Gbits/sec    1   1.62 MBytes
[  5]   6.00-7.00   sec  1.08 GBytes  9.29 Gbits/sec    5   1.22 MBytes
[  5]   7.00-8.00   sec  1.08 GBytes  9.28 Gbits/sec  473    939 KBytes
[  5]   8.00-9.00   sec  1.09 GBytes  9.39 Gbits/sec    0   1.20 MBytes
[  5]   9.00-10.00  sec  1.09 GBytes  9.32 Gbits/sec    0   1.23 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.9 GBytes  9.33 Gbits/sec  479             sender
[  5]   0.00-10.00  sec  10.9 GBytes  9.33 Gbits/sec                  receiver

iperf Done.
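
To put the iperf figure next to the dd results, it helps to convert Gbit/s into MB/s. The sketch below does that arithmetic with awk and expresses the tmp_smb.dat and tmp_nfs_a.dat write runs from the first post (862 MB/s and 922 MB/s) as a share of the measured link rate. The helper names are hypothetical and protocol overhead is ignored.

```shell
#!/bin/sh
# Sketch: compare network line rate with file-transfer throughput.
gbit_to_mbyte() {
    # $1 = rate in Gbit/s; result in MB/s (10^6 bytes per second)
    awk -v g="$1" 'BEGIN { printf "%.0f\n", g * 1000 / 8 }'
}

percent_of() {
    # $1 = measured MB/s, $2 = reference MB/s
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.0f\n", 100 * a / b }'
}

gbit_to_mbyte 9.33      # link rate measured by iperf, in MB/s
percent_of 862 1166     # tmp_smb.dat write as a share of that
percent_of 922 1166     # tmp_nfs_a.dat write as a share of that
```

By this estimate those two write runs already sit at roughly three quarters of what the link delivered in iperf.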

Elliot Dierksen said:
What is your storage use case?

It is my new primary NAS for my homelab (the old one ran out of space). It will store my media and will be used to store application data from my Docker containers.

Elliot Dierksen said:
I am assuming you are using the Optane as an SLOG (as I am doing); you may want to tune some of the variables that allow the system to hold more dirty data before writing it to spinning rust (aka hard drives).

Yes, I bought the 2x Intel Optane 900Ps for a mirrored SLOG. I know that a SLOG may not improve performance, so the main purpose for me is to enable sync=always and ensure the ZFS pool guarantees the data is safely written before acknowledging it to a client.

Elliot Dierksen said:
I will steal some thunder from @jgreco and point out that the RAID card in HBA mode is not ideal.

https://www.truenas.com/community/r...bas-and-why-cant-i-use-a-raid-controller.139/

jgreco said:
No worries. You nailed it; the P840 is based off the trainwreck CISS driver and is therefore expected to be highly problematic. "HBA mode" is a synonym for "bald faced lie," or, if we're being generous, "merely lobotomized RAID card".

But I can circle around and nail the thing you missed, grin. ;-)

That is very interesting, thank you for pointing me in that direction. Does that mean I need to swap it out now? 45Drives is using the LSI 9305-16i; would this be a good option? I'd like to have enough lanes on the HBA that the system supports SAS natively (without a SAS expander).

jgreco said:
These aren't particularly stellar either. They're based on some crappy QLogic chipset (now Broadcom 57810). There's a REASON they're selling for only $15.00 on eBay. They're not great cards. They weren't at the time either.

It could be that the Linux driver support for this network card is better than the FreeBSD support, but replacing it is recommended if problems persist.

I run full 10GBit with the HPE card (see iperf output above). I also have an Intel card lying around, I may test with it too, but with the HPE card I already have 10Gbit, so I don't think there is need to change it. What do you think?

jgreco said:
So you were really testing compression efficiency with your dd commands. Rather than using /dev/zero, create a temporary file from /dev/random (which will have limited speed, so the speed there is meaningless). Then copy THAT on your pool. If you make a 1GB random file, you can do something like

% cat myrandom1g{,,,,,,,,,,,,,,,,,,,,,,,,} | dd of=testwrite.dat bs=1048576

which is structured such that the "myrandom1g" file ends up in ARC, but doesn't compress much, so the dd test results in the actual write speed of the pool, more or less. You can of course add commas to suit. But it's also important to remember that ZFS benchmarks are often garbage, or at least not particularly trustworthy/repeatable/meaningful, even if you know exactly what you're doing.

I hope I tested correctly. I created a new file from /dev/random and copied it to my ZFS pool. Then I executed the above command, resulting in the following output.
Code:
root@truenas[/mnt/tank/nas/test]# cat myrandom1g{,,,,,,,,,,,,,,,,,,,,,,,,} | dd of=testwrite.dat bs=1048576
0+151172 records in
0+151172 records out
9907208192 bytes (9.9 GB, 9.2 GiB) copied, 53.0982 s, 187 MB/s


So are all these performance numbers right?
What are the next steps to improve?
 
Elliot Dierksen

Kartright said:
I ran iperf and have almost full 10Gbit.

The iperf numbers you show are good. The Intel and Chelsio cards have a lot more hours and community support. If you already have an Intel card, I would certainly give it a try. Sometimes issues with drivers or equipment that doesn't have as good of support don't manifest themselves until you are under load. The potential for damage is greater at that point.

Kartright said:
That is very interesting, thank you for pointing me in that direction. Does that mean I need to swap it out now? 45Drives is using the LSI 9305-16i; would this be a good option? I'd like to have enough lanes on the HBA that the system supports SAS natively (without a SAS expander).

That sounds like a good option to me. I would certainly take any opinions from @HoneyBadger or @jgreco to heart on those matters.
 

jgreco

Kartright said:
That is very interesting, thank you for pointing me in that direction. Does that mean I need to swap it out now? 45Drives is using the LSI 9305-16i; would this be a good option? I'd like to have enough lanes on the HBA that the system supports SAS natively (without a SAS expander).

The CISS driver was a well meaning but possibly misguided attempt to create a standard shim layer. Please don't put anything important or valuable on the system. The problem with this is that it's really a RAID driver and isn't as ... what's the word ...? maybe "meticulous"? ... about the details, and it is known to hide faults/problems/etc. If you were going to use this RAID controller for a standard RAID5 array, it's actually not bad, because it tries hard to present a clean storage system to the host. But that's not right for ZFS.

Kartright said:
That is very interesting, thank you for pointing me in that direction. Does that mean I need to swap it out now? 45Drives is using the LSI 9305-16i; would this be a good option? I'd like to have enough lanes on the HBA that the system supports SAS natively (without a SAS expander).

Yup.

Kartright said:
I run full 10GBit with the HPE card (see iperf output above). I also have an Intel card lying around, I may test with it too, but with the HPE card I already have 10Gbit, so I don't think there is need to change it. What do you think?

We've had lots of people show up with problems that traced back to the Broadcom/QLogic ethernet chipsets.

Intel and Chelsio both have authored their own drivers (at least initially) and have provided support to the community on an ongoing basis. As noted by @Elliot Dierksen it isn't JUST about iperf numbers; those are like the automobile "dummy light" for network problems, but network performance is a lot more complicated in practice.

It is perfectly possible that this PARTICULAR chipset is fine and will perform well. However, usually if that's the case, you wouldn't find it being blown out on eBay for $15/card. If you can try head-to-head comparisons against a decent Intel card, that's a good idea. I'm fine and dandy with the QLogic if it works. Network is a thing where it is primarily exasperating from a performance perspective if it doesn't work well. The HBA/RAID thing can be actively risking your data, by comparison, so of the two things, get the HBA thing remedied first.

Just my two cents.
 

Kartright

Elliot Dierksen said:
The iperf numbers you show are good. The Intel and Chelsio cards have a lot more hours and community support. If you already have an Intel card, I would certainly give it a try. Sometimes issues with drivers or equipment that doesn't have as good of support don't manifest themselves until you are under load. The potential for damage is greater at that point.

jgreco said:
We've had lots of people show up with problems that traced back to the Broadcom/QLogic ethernet chipsets.

Intel and Chelsio both have authored their own drivers (at least initially) and have provided support to the community on an ongoing basis. As noted by @Elliot Dierksen it isn't JUST about iperf numbers; those are like the automobile "dummy light" for network problems, but network performance is a lot more complicated in practice.

It is perfectly possible that this PARTICULAR chipset is fine and will perform well. However, usually if that's the case, you wouldn't find it being blown out on eBay for $15/card. If you can try head-to-head comparisons against a decent Intel card, that's a good idea. I'm fine and dandy with the QLogic if it works. Network is a thing where it is primarily exasperating from a performance perspective if it doesn't work well. The HBA/RAID thing can be actively risking your data, by comparison, so of the two things, get the HBA thing remedied first.

The Intel card is an Intel X520-DA2.

Elliot Dierksen said:
That sounds like a good option to me. I would certainly take any opinions from @HoneyBadger or @jgreco to heart on those matters.

I'm glad you think that the LSI 9305-16i is a good option, but it's rather "old", isn't it? Are there any "newer" LSI cards that should be considered? The server has PCIe 3.0 at most.

I'm still wondering, though, why the Intel Optane performs so badly. It has such good specs and so much write performance, but it is performing that badly in my system. Is there any reason for that? Shouldn't the Intel Optane improve the performance to a very high level to mitigate the problems from the HPE storage controller? @Elliot Dierksen and @jgreco, can you explain that to me please?
 

jgreco

Kartright said:
LSI 9305-16i is a good option, but it's rather "old", isn't it?

No, LSI 9211-8i is rather old. LSI 9305 is newer. After that, LSI really stopped making HBA products, because the 93xx will do full speed, so what's the point? LSI also got sold to Avago got sold to Broadcom and lots of talent probably left. HBA's aren't where they make money anyways. RAID cards are.

Kartright said:
Shouldn't the Intel Optane improve the performance to a very high level to mitigate the problems from the HPE storage controller?

Have you possibly mistaken SLOG for some sort of cache? It isn't. Async writes are ALWAYS faster than sync writes, which are the only time a SLOG device gets used. Adding a SLOG and turning on sync writes always slows the system down, no matter how fast the SLOG.
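
For anyone wanting to see this on their own system, the sketch below shows one way to check the sync policy and watch whether the log vdev is actually receiving writes. The function name is hypothetical; the pool and dataset names in the usage comment are the ones from this thread. The 5-second interval and 3 samples are arbitrary choices.

```shell
#!/bin/sh
# Sketch: inspect sync settings and log-device activity for a pool.
inspect_slog() {
    pool="$1"      # e.g. tank
    dataset="$2"   # e.g. tank/nas
    # Show the sync policy and logbias of the dataset.
    zfs get sync,logbias "$dataset"
    # List the vdev layout, including any "logs" section.
    zpool status "$pool"
    # Per-vdev IO statistics: 3 samples, 5 seconds apart.
    zpool iostat -v "$pool" 5 3
}

# Usage on the NAS:  inspect_slog tank tank/nas
```

Running this during an async write and again during a sync write should show the log devices idle in the first case and busy in the second.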


And the problems with the HPE controller are not really solvable in any case.
 

Kartright

jgreco said:
No, LSI 9211-8i is rather old. LSI 9305 is newer. After that, LSI really stopped making HBA products, because the 93xx will do full speed, so what's the point? LSI also got sold to Avago got sold to Broadcom and lots of talent probably left. HBA's aren't where they make money anyways. RAID cards are.

I thought there might be something newer, but that is a good option to go for. So I'll buy an LSI 9305-16i.

jgreco said:
Have you possibly mistaken SLOG for some sort of cache? It isn't. Async writes are ALWAYS faster than sync writes, which are the only time a SLOG device gets used. Adding a SLOG and turning on sync writes always slows the system down, no matter how fast the SLOG.

I knew that the SLOG isn't a write cache, as the write cache resides in RAM. I understood the SLOG as an additional drive next to the pool, where ZFS TXGs are saved during sync writes. What I did not know was that the SLOG is not on the data path. I thought it was, and that it improves write speeds on sync writes by sending a "data write OK" message when the data is written to the SLOG rather than when it is on the pool. Combining this misunderstanding with sync=always led me to the conclusion that the SLOG might mitigate the bad choice of HBA, since writes would depend only on the write speed of the SLOG.
As I have now learned from your article, the SLOG is not on the data path. This leads to the following conclusion: ZFS storage can only be as fast as the pool is, regardless of whether a SLOG is used. Makes sense now.
BTW: Your article is very informative and interesting. I learned a lot.

jgreco said:
And the problems with the HPE controller are not really solvable in any case.

Yes, that is what I learned today. I hope that the performance will be acceptable with the LSI 9305-16i. Do you have some numbers I can compare mine against once I've installed the LSI card?
 

jgreco

Like a benchmark? No. Benchmarks under ZFS don't do well because of all the complexity. Two runs rarely result in the same answers. You can get somewhat more conventional and consistent results from something like solnet-array-test-v2.sh but that's not a ZFS test.
 

Kartright

I understand. I ordered the HBA, I'll flash it to IT mode and keep this thread updated as soon as I have new results.

May I ask one additional question for now? In your post "What's all the noise about HBAs, and why can't I use a RAID controller?" you wrote:

The LSI 9300-8i (PCIe 3.0 based on LSI 12Gbps SAS3000) requires firmware 16.00.10.00 or the special 16.00.12.00 available via iXsystems.

Is the version 16.00.12.00 (the one available via iXsystems) suitable for the LSI 9305-16i? In other words, can I safely flash it to the controller, or should I use the firmware from LSI directly?
Thanks in advance.
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
15,876
I am happy to answer inquiries that are indicative of an attempt to do the homework but have ended in paranoia and caution. ;-) Paranoia and caution are very healthy.


The 9305-16i on the Broadcom firmware download site now lists the 16.00.12.00 firmware direct-from-Broadcom.


Click "Downloads", expand "Firmware"


The file

9305_16i_Pkg_P16.12_IT_FW_BIOS_for_MSDOS_Windows/Firmware/SAS9305_16i_IT_P/SAS9305_16i_IT_P.bin

within can be flashed to the controller regardless of OS; this package is "for" Windows but TrueNAS has the CLI "sas3flash" tool to flash this from FreeBSD or Linux, so you just need this file, and the ROM BIOS file:

9305_16i_Pkg_P16.12_IT_FW_BIOS_for_MSDOS_Windows/sasbios_rel/mptsas3.rom

if you're flashing the BIOS, which I recommend, but others don't.
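For illustration, the steps above could be sketched roughly as follows. This is a dry-run helper that only prints the commands it would run; the `-o -f <firmware> -b <bios>` invocation is the commonly documented sas3flash usage and should be treated as an assumption here, so check `sas3flash -help` on your system before flashing anything. The function name `flash_plan` is made up for this sketch, and the paths assume the Broadcom package was unpacked into the current directory.

```shell
#!/bin/sh
# Dry-run sketch: prints the sas3flash commands instead of executing them.
PKG="9305_16i_Pkg_P16.12_IT_FW_BIOS_for_MSDOS_Windows"
FW="$PKG/Firmware/SAS9305_16i_IT_P/SAS9305_16i_IT_P.bin"
BIOS="$PKG/sasbios_rel/mptsas3.rom"

flash_plan() {
    # $1 = "with-bios" to include the optional boot ROM, anything else to skip it
    echo "sas3flash -list"                      # record the current versions first
    if [ "$1" = "with-bios" ]; then
        echo "sas3flash -o -f $FW -b $BIOS"     # firmware plus ROM BIOS (assumed flags)
    else
        echo "sas3flash -o -f $FW"              # firmware only (assumed flags)
    fi
    echo "sas3flash -list"                      # confirm the new versions afterwards
}

flash_plan with-bios
```

Run the printed commands by hand, one at a time, once you have confirmed they match your controller and the tool's own help output.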
 

Kartright

Dabbler
Joined
Feb 3, 2022
Messages
11
As the controller is not that cheap, I'd better ask the experts; a little caution, just to be sure. Before writing my previous post, I downloaded the version from iXsystems, which only includes the firmware for the LSI 9300 controller. As I couldn't find a similar patched version from iXsystems that is confirmed to work on the LSI 9305, I wanted to ask whether the firmware from Broadcom is what this controller needs to deliver the expected performance. I have never had a ZFS-based system or an LSI HBA before, so unfortunately I have little experience so far.

I reviewed the firmware from Broadcom for the LSI 9305-16i, and TrueNAS Scale already comes with the flash utility, which is very practical. Thanks for your detailed explanation, especially the hint about the BIOS flash!
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
15,876
Thanks for your detailed explanation, especially the hint about the BIOS flash!

You can make your own determination on this. I like the BIOS because it lets me see disk problems before booting the NAS and to poke at stuff. Other people dislike it because it takes longer to boot (maybe 30 sec?) I never got that argument, since it isn't like you're booting the NAS up every time you want to access a file. I hope. Heh.
 

Kartright

Dabbler
Joined
Feb 3, 2022
Messages
11
I have the new LSI 9305-16i controller now and successfully upgraded it to the latest firmware (IT mode).
This is the output of the sas3flash -list command:
Code:
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.

        Adapter Selected is a Avago SAS: SAS3224(A1)

        Controller Number              : 0
        Controller                     : SAS3224(A1)
        PCI Address                    : 00:08:00:00
        SAS Address                    : 500XXXX-X-XXXX-XXXX
        NVDATA Version (Default)       : 10.00.00.05
        NVDATA Version (Persistent)    : 10.00.00.05
        Firmware Product ID            : 0x2228 (IT)
        Firmware Version               : 16.00.12.00
        NVDATA Vendor                  : LSI
        NVDATA Product ID              : SAS9305-16i
        BIOS Version                   : 08.37.02.00
        UEFI BSD Version               : 18.00.03.00
        FCODE Version                  : N/A
        Board Name                     : SAS9305-16i
        Board Assembly                 : 03-25703-02003
        Board Tracer Number            : SV62620477

        Finished Processing Commands Successfully.
        Exiting SAS3Flash.

With fio, the LSI results appear slightly slower than the HP's (sync=always and the same ZFS pool settings in both cases):
Code:
Random write test for IOP/s
==================
HP:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=randwrite --ramp_time=4
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][98.1%][w=27.9MiB/s][w=7142 IOPS][eta 00m:03s]
test: (groupid=0, jobs=1): err= 0: pid=1998556: Fri Feb  4 16:35:13 2022
  write: IOPS=6839, BW=26.7MiB/s (28.0MB/s)(4027MiB/150724msec); 0 zone resets
    slat (usec): min=62, max=473862, avg=140.91, stdev=702.07
    clat (nsec): min=1070, max=536429, avg=2585.70, stdev=2237.90
     lat (usec): min=64, max=473877, avg=143.93, stdev=702.16
    clat percentiles (nsec):
     |  1.00th=[ 1448],  5.00th=[ 1576], 10.00th=[ 1672], 20.00th=[ 1912],
     | 30.00th=[ 2480], 40.00th=[ 2608], 50.00th=[ 2672], 60.00th=[ 2704],
     | 70.00th=[ 2768], 80.00th=[ 2896], 90.00th=[ 3088], 95.00th=[ 3344],
     | 99.00th=[ 4640], 99.50th=[ 5344], 99.90th=[12096], 99.95th=[15296],
     | 99.99th=[27008]
   bw (  KiB/s): min= 9539, max=35480, per=100.00%, avg=27361.58, stdev=3657.79, samples=301
   iops        : min= 2384, max= 8870, avg=6840.31, stdev=914.46, samples=301
  lat (usec)   : 2=22.35%, 4=76.07%, 10=1.45%, 20=0.11%, 50=0.02%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=4.37%, sys=49.85%, ctx=2066823, majf=1, minf=971
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1030819,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=26.7MiB/s (28.0MB/s), 26.7MiB/s-26.7MiB/s (28.0MB/s-28.0MB/s), io=4027MiB (4222MB), run=150724-150724msec

LSI:
sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=testnew2 --bs=4k --size=4G --readwrite=randwrite --ramp_time=4
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
test: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][97.9%][w=27.0MiB/s][w=6914 IOPS][eta 00m:04s]
test: (groupid=0, jobs=1): err= 0: pid=1067055: Thu Feb 10 12:13:05 2022
  write: IOPS=5748, BW=22.5MiB/s (23.5MB/s)(4019MiB/178987msec); 0 zone resets
    slat (usec): min=65, max=31422, avg=168.23, stdev=140.11
    clat (nsec): min=1154, max=656163, avg=2589.97, stdev=1856.37
     lat (usec): min=66, max=31436, avg=171.28, stdev=140.37
    clat percentiles (nsec):
     |  1.00th=[ 1352],  5.00th=[ 1496], 10.00th=[ 1656], 20.00th=[ 2192],
     | 30.00th=[ 2448], 40.00th=[ 2512], 50.00th=[ 2576], 60.00th=[ 2736],
     | 70.00th=[ 2832], 80.00th=[ 2928], 90.00th=[ 3120], 95.00th=[ 3376],
     | 99.00th=[ 4512], 99.50th=[ 5088], 99.90th=[11328], 99.95th=[14656],
     | 99.99th=[24704]
   bw (  KiB/s): min=15432, max=30200, per=99.95%, avg=22982.62, stdev=2107.80, samples=357
   iops        : min= 3858, max= 7550, avg=5745.57, stdev=526.97, samples=357
  lat (usec)   : 2=17.91%, 4=80.46%, 10=1.51%, 20=0.10%, 50=0.02%
  lat (usec)   : 100=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=4.00%, sys=55.41%, ctx=2059230, majf=0, minf=1081
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,1028904,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=22.5MiB/s (23.5MB/s), 22.5MiB/s-22.5MiB/s (23.5MB/s-23.5MB/s), io=4019MiB (4214MB), run=178987-178987msec

Random Read test for IOP/s
==================
HP:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=randread --ramp_time=4
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [r(1)][81.0%][r=398MiB/s][r=102k IOPS][eta 00m:04s]
test: (groupid=0, jobs=1): err= 0: pid=2866151: Fri Feb  4 16:37:05 2022
  read: IOPS=65.7k, BW=257MiB/s (269MB/s)(3247MiB/12657msec)
    slat (usec): min=3, max=549, avg=13.95, stdev=18.16
    clat (nsec): min=604, max=444275, avg=731.07, stdev=1274.23
     lat (usec): min=3, max=551, avg=14.77, stdev=18.35
    clat percentiles (nsec):
     |  1.00th=[  628],  5.00th=[  636], 10.00th=[  644], 20.00th=[  644],
     | 30.00th=[  652], 40.00th=[  652], 50.00th=[  660], 60.00th=[  668],
     | 70.00th=[  684], 80.00th=[  876], 90.00th=[  924], 95.00th=[  948],
     | 99.00th=[ 1032], 99.50th=[ 1224], 99.90th=[ 2160], 99.95th=[ 6752],
     | 99.99th=[16320]
   bw (  KiB/s): min=196032, max=535048, per=98.18%, avg=257947.96, stdev=70103.51, samples=25
   iops        : min=49008, max=133762, avg=64486.88, stdev=17525.91, samples=25
  lat (nsec)   : 750=78.40%, 1000=20.19%
  lat (usec)   : 2=1.23%, 4=0.11%, 10=0.01%, 20=0.05%, 50=0.01%
  lat (usec)   : 500=0.01%
  cpu          : usr=8.81%, sys=91.17%, ctx=28, majf=0, minf=63
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=831353,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=257MiB/s (269MB/s), 257MiB/s-257MiB/s (269MB/s-269MB/s), io=3247MiB (3405MB), run=12657-12657msec

LSI:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=randread --ramp_time=4
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [r(1)][90.0%][r=108MiB/s][r=27.7k IOPS][eta 00m:05s]
test: (groupid=0, jobs=1): err= 0: pid=859569: Thu Feb 10 10:07:33 2022
  read: IOPS=23.4k, BW=91.3MiB/s (95.7MB/s)(3751MiB/41095msec)
    slat (usec): min=3, max=583, avg=40.94, stdev=13.31
    clat (nsec): min=616, max=455495, avg=976.73, stdev=1084.00
     lat (usec): min=3, max=585, avg=42.05, stdev=13.47
    clat percentiles (nsec):
     |  1.00th=[  644],  5.00th=[  700], 10.00th=[  916], 20.00th=[  932],
     | 30.00th=[  948], 40.00th=[  956], 50.00th=[  964], 60.00th=[  972],
     | 70.00th=[  980], 80.00th=[  996], 90.00th=[ 1020], 95.00th=[ 1064],
     | 99.00th=[ 1288], 99.50th=[ 1400], 99.90th=[ 6880], 99.95th=[15680],
     | 99.99th=[16768]
   bw (  KiB/s): min=81912, max=172240, per=99.60%, avg=93098.37, stdev=10122.34, samples=82
   iops        : min=20478, max=43060, avg=23274.46, stdev=2530.60, samples=82
  lat (nsec)   : 750=5.82%, 1000=77.17%
  lat (usec)   : 2=16.71%, 4=0.18%, 10=0.02%, 20=0.09%, 50=0.01%
  lat (usec)   : 500=0.01%
  cpu          : usr=4.82%, sys=95.18%, ctx=98, majf=0, minf=64
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=960345,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=91.3MiB/s (95.7MB/s), 91.3MiB/s-91.3MiB/s (95.7MB/s-95.7MB/s), io=3751MiB (3934MB), run=41095-41095msec

Mixed Random Workload
================
HP:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=readwrite --ramp_time=4
test: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [M(1)][93.8%][r=28.7MiB/s,w=28.3MiB/s][r=7355,w=7246 IOPS][eta 00m:05s]
test: (groupid=0, jobs=1): err= 0: pid=2867096: Fri Feb  4 16:39:01 2022
  read: IOPS=6896, BW=26.9MiB/s (28.2MB/s)(1936MiB/71868msec)
    slat (usec): min=3, max=43247, avg=12.07, stdev=62.48
    clat (nsec): min=679, max=452581, avg=1694.52, stdev=1770.21
     lat (usec): min=4, max=43250, avg=13.96, stdev=62.53
    clat percentiles (nsec):
     |  1.00th=[  860],  5.00th=[ 1012], 10.00th=[ 1624], 20.00th=[ 1656],
     | 30.00th=[ 1672], 40.00th=[ 1688], 50.00th=[ 1704], 60.00th=[ 1736],
     | 70.00th=[ 1752], 80.00th=[ 1768], 90.00th=[ 1800], 95.00th=[ 1832],
     | 99.00th=[ 2480], 99.50th=[ 2544], 99.90th=[ 3120], 99.95th=[ 9536],
     | 99.99th=[13888]
   bw (  KiB/s): min=18032, max=33200, per=99.92%, avg=27564.61, stdev=1953.09, samples=143
   iops        : min= 4508, max= 8300, avg=6891.09, stdev=488.27, samples=143
  write: IOPS=6886, BW=26.9MiB/s (28.2MB/s)(1933MiB/71868msec); 0 zone resets
    slat (usec): min=59, max=170528, avg=125.27, stdev=510.20
    clat (nsec): min=1142, max=446374, avg=2709.78, stdev=2012.39
     lat (usec): min=60, max=170540, avg=128.44, stdev=510.24
    clat percentiles (nsec):
     |  1.00th=[ 1496],  5.00th=[ 1816], 10.00th=[ 2576], 20.00th=[ 2640],
     | 30.00th=[ 2672], 40.00th=[ 2704], 50.00th=[ 2704], 60.00th=[ 2736],
     | 70.00th=[ 2768], 80.00th=[ 2832], 90.00th=[ 2928], 95.00th=[ 3056],
     | 99.00th=[ 3600], 99.50th=[ 3856], 99.90th=[ 5984], 99.95th=[11456],
     | 99.99th=[15808]
   bw (  KiB/s): min=17960, max=34048, per=99.90%, avg=27519.25, stdev=1861.64, samples=143
   iops        : min= 4490, max= 8512, avg=6879.76, stdev=465.42, samples=143
  lat (nsec)   : 750=0.02%, 1000=2.37%
  lat (usec)   : 2=49.44%, 4=47.95%, 10=0.16%, 20=0.06%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.01%
  cpu          : usr=6.88%, sys=45.60%, ctx=989819, majf=0, minf=91
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=495619,494904,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=26.9MiB/s (28.2MB/s), 26.9MiB/s-26.9MiB/s (28.2MB/s-28.2MB/s), io=1936MiB (2030MB), run=71868-71868msec
  WRITE: bw=26.9MiB/s (28.2MB/s), 26.9MiB/s-26.9MiB/s (28.2MB/s-28.2MB/s), io=1933MiB (2027MB), run=71868-71868msec

LSI:
sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=4G --readwrite=readwrite --ramp_time=4
test: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=0): [f(1)][100.0%][r=26.3MiB/s,w=26.8MiB/s][r=6722,w=6858 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=862535: Thu Feb 10 10:10:46 2022
  read: IOPS=6889, BW=26.9MiB/s (28.2MB/s)(1944MiB/72228msec)
    slat (usec): min=3, max=34604, avg=13.06, stdev=67.59
    clat (nsec): min=668, max=433936, avg=1630.22, stdev=1734.46
     lat (usec): min=4, max=34607, avg=14.87, stdev=67.66
    clat percentiles (nsec):
     |  1.00th=[  820],  5.00th=[ 1020], 10.00th=[ 1576], 20.00th=[ 1592],
     | 30.00th=[ 1592], 40.00th=[ 1608], 50.00th=[ 1624], 60.00th=[ 1640],
     | 70.00th=[ 1656], 80.00th=[ 1672], 90.00th=[ 1720], 95.00th=[ 1816],
     | 99.00th=[ 2320], 99.50th=[ 2416], 99.90th=[ 3152], 99.95th=[ 7008],
     | 99.99th=[12480]
   bw (  KiB/s): min=18504, max=33144, per=100.00%, avg=27566.56, stdev=1869.06, samples=144
   iops        : min= 4626, max= 8286, avg=6891.58, stdev=467.27, samples=144
  write: IOPS=6880, BW=26.9MiB/s (28.2MB/s)(1941MiB/72228msec); 0 zone resets
    slat (usec): min=60, max=187177, avg=125.11, stdev=421.97
    clat (nsec): min=1063, max=438415, avg=2344.43, stdev=1667.71
     lat (usec): min=62, max=187185, avg=127.98, stdev=422.01
    clat percentiles (nsec):
     |  1.00th=[ 1224],  5.00th=[ 1672], 10.00th=[ 2192], 20.00th=[ 2256],
     | 30.00th=[ 2288], 40.00th=[ 2288], 50.00th=[ 2320], 60.00th=[ 2352],
     | 70.00th=[ 2384], 80.00th=[ 2416], 90.00th=[ 2512], 95.00th=[ 2736],
     | 99.00th=[ 3376], 99.50th=[ 3504], 99.90th=[ 5024], 99.95th=[10560],
     | 99.99th=[14528]
   bw (  KiB/s): min=18232, max=32200, per=100.00%, avg=27530.99, stdev=1749.62, samples=144
   iops        : min= 4558, max= 8050, avg=6882.69, stdev=437.40, samples=144
  lat (nsec)   : 750=0.01%, 1000=2.43%
  lat (usec)   : 2=48.62%, 4=48.79%, 10=0.10%, 20=0.04%, 50=0.01%
  lat (usec)   : 500=0.01%
  cpu          : usr=6.07%, sys=45.98%, ctx=994047, majf=0, minf=120
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=497605,496956,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=26.9MiB/s (28.2MB/s), 26.9MiB/s-26.9MiB/s (28.2MB/s-28.2MB/s), io=1944MiB (2038MB), run=72228-72228msec
  WRITE: bw=26.9MiB/s (28.2MB/s), 26.9MiB/s-26.9MiB/s (28.2MB/s-28.2MB/s), io=1941MiB (2036MB), run=72228-72228msec

Sequential write test for throughput
=======================
HP:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4M --size=4G --readwrite=write --ramp_time=4
test: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [W(1)][42.9%][w=621MiB/s][w=155 IOPS][eta 00m:08s]
test: (groupid=0, jobs=1): err= 0: pid=3065934: Fri Feb  4 16:39:36 2022
  write: IOPS=148, BW=595MiB/s (624MB/s)(1140MiB/1917msec); 0 zone resets
    slat (msec): min=4, max=129, avg= 6.94, stdev= 8.81
    clat (nsec): min=1989, max=28696, avg=4774.37, stdev=2519.47
     lat (msec): min=4, max=129, avg= 6.74, stdev= 8.10
    clat percentiles (nsec):
     |  1.00th=[ 2064],  5.00th=[ 2640], 10.00th=[ 2896], 20.00th=[ 3376],
     | 30.00th=[ 3664], 40.00th=[ 4128], 50.00th=[ 4512], 60.00th=[ 4704],
     | 70.00th=[ 4832], 80.00th=[ 5152], 90.00th=[ 6432], 95.00th=[ 9280],
     | 99.00th=[16064], 99.50th=[22912], 99.90th=[28800], 99.95th=[28800],
     | 99.99th=[28800]
   bw (  KiB/s): min=614400, max=729088, per=100.00%, avg=666720.33, stdev=58000.40, samples=3
   iops        : min=  150, max=  178, avg=162.67, stdev=14.19, samples=3
  lat (usec)   : 2=0.70%, 4=37.32%, 10=58.45%, 20=2.82%, 50=0.70%
  cpu          : usr=3.91%, sys=65.97%, ctx=1218, majf=0, minf=61
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,284,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=595MiB/s (624MB/s), 595MiB/s-595MiB/s (624MB/s-624MB/s), io=1140MiB (1195MB), run=1917-1917msec

LSI:
sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4M --size=4G --readwrite=write --ramp_time=4
test: (g=0): rw=write, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=1): [W(1)][46.2%][w=629MiB/s][w=157 IOPS][eta 00m:07s]
test: (groupid=0, jobs=1): err= 0: pid=1086055: Thu Feb 10 10:11:51 2022
  write: IOPS=178, BW=718MiB/s (753MB/s)(1408MiB/1961msec); 0 zone resets
    slat (msec): min=3, max=117, avg= 5.58, stdev= 7.38
    clat (nsec): min=1817, max=446102, avg=5034.25, stdev=23656.84
     lat (msec): min=3, max=117, avg= 5.58, stdev= 7.39
    clat percentiles (usec):
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    3],
     | 30.00th=[    4], 40.00th=[    4], 50.00th=[    4], 60.00th=[    4],
     | 70.00th=[    4], 80.00th=[    5], 90.00th=[    5], 95.00th=[    7],
     | 99.00th=[   11], 99.50th=[   12], 99.90th=[  445], 99.95th=[  445],
     | 99.99th=[  445]
   bw (  KiB/s): min=442368, max=876544, per=96.64%, avg=710515.00, stdev=234410.46, samples=3
   iops        : min=  108, max=  214, avg=173.33, stdev=57.14, samples=3
  lat (usec)   : 2=0.57%, 4=74.07%, 10=23.36%, 20=1.71%, 500=0.28%
  cpu          : usr=3.26%, sys=78.17%, ctx=788, majf=0, minf=63
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,351,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=718MiB/s (753MB/s), 718MiB/s-718MiB/s (753MB/s-753MB/s), io=1408MiB (1476MB), run=1961-1961msec

Sequential Read test for throughput
=======================
HP:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4M --size=4G --readwrite=read --ramp_time=4
test: (g=0): rw=read, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=0)
test: (groupid=0, jobs=1): err= 0: pid=3120437: Fri Feb  4 16:39:58 2022
  read: IOPS=507, BW=2030MiB/s (2128MB/s)(4096MiB/2018msec)
  cpu          : usr=0.35%, sys=99.50%, ctx=2, majf=0, minf=526
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1024,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=2030MiB/s (2128MB/s), 2030MiB/s-2030MiB/s (2128MB/s-2128MB/s), io=4096MiB (4295MB), run=2018-2018msec

LSI:
root@truenas[/mnt/tank/nas/test]# sync;fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --filename=test --bs=4M --size=4G --readwrite=read --ramp_time=4
test: (g=0): rw=read, bs=(R) 4096KiB-4096KiB, (W) 4096KiB-4096KiB, (T) 4096KiB-4096KiB, ioengine=libaio, iodepth=1
fio-3.25
Starting 1 process
Jobs: 1 (f=0)
test: (groupid=0, jobs=1): err= 0: pid=1140319: Thu Feb 10 10:12:54 2022
  read: IOPS=517, BW=2069MiB/s (2169MB/s)(4096MiB/1980msec)
  cpu          : usr=0.00%, sys=99.90%, ctx=4, majf=0, minf=523
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=1024,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=2069MiB/s (2169MB/s), 2069MiB/s-2069MiB/s (2169MB/s-2169MB/s), io=4096MiB (4295MB), run=1980-1980msec

Do these performance values sound reasonable to anyone?
Is the performance I get with the LSI card that slow because TrueNAS Scale is not tuned yet (I read this somewhere, but don't know if it is correct)? Would it be better to give TrueNAS Core a try?
The LSI controller UEFI allows me to enable Disk Write Caching. Should this be enabled or left disabled? What's the common practice here?

You can make your own determination on this. I like the BIOS because it lets me see disk problems before booting the NAS and to poke at stuff. Other people dislike it because it takes longer to boot (maybe 30 sec?) I never got that argument, since it isn't like you're booting the NAS up every time you want to access a file. I hope. Heh.
The server does boot slower, that is true. For me, however, that is no reason not to use it; I think it's better to be able to control the card directly via UEFI, despite the slower boot.
 

Kartright

Dabbler
Joined
Feb 3, 2022
Messages
11
Additionally, I get the following output on the console. It's just a log info notice, but does anyone know whether it is important or in any way related to the "performance" problems I have?
Code:
mpt3sas_cm0: log_info(0x30030109): originator(IOP), code(0x03), sub_code(0x0109)
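As a rough aid for reading such messages, the 32-bit log_info word can be split into its fields with shell arithmetic. The field layout used here (bus type in the top 4 bits, then a 4-bit originator, an 8-bit code and a 16-bit sub-code, with originator 0/1/2 meaning IOP/PL/IR) is an assumption based on how the Linux mpt3sas driver unpacks the value; the helper name `decode_loginfo` is made up for this sketch, and it does not tell you what a particular sub-code means.

```shell
#!/bin/sh
# Split an mpt3sas log_info word into its fields (layout assumed, see above).
decode_loginfo() {
    v=$(( $1 ))                       # accepts hex such as 0x30030109
    bus=$(( (v >> 28) & 0xF ))        # bus type (0x3 is SAS)
    orig=$(( (v >> 24) & 0xF ))       # originator
    code=$(( (v >> 16) & 0xFF ))      # code
    sub=$(( v & 0xFFFF ))             # sub-code
    case $orig in
        0) name=IOP ;;
        1) name=PL ;;
        2) name=IR ;;
        *) name=unknown ;;
    esac
    printf 'bus_type=0x%x originator=%s code=0x%02x sub_code=0x%04x\n' \
        "$bus" "$name" "$code" "$sub"
}

decode_loginfo 0x30030109
# prints: bus_type=0x3 originator=IOP code=0x03 sub_code=0x0109
```

The decoded fields match what the driver already printed next to the raw value, which is a useful check that the layout assumption holds for this message.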

Do these performance values sound reasonable to anyone?
Is the performance I get with the LSI card that slow because TrueNAS Scale is not tuned yet (I read this somewhere, but don't know if it is correct)? Would it be better to give TrueNAS Core a try?
The LSI controller UEFI allows me to enable Disk Write Caching. Should this be enabled or left disabled? What's the common practice here?
Do any of the experts have an idea? Thanks in advance!
 

Kartright

Dabbler
Joined
Feb 3, 2022
Messages
11
I have now enabled disk write caching, since I read that ZFS is designed to work with the disk write cache enabled.

I have now switched to TrueNAS Core. The read results are slightly better; the rest is the same as on TrueNAS Scale. I executed the following tests on both systems for comparison:

Async:
Code:
fio --filename=test --ioengine=posixaio --rw=randread --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --ioengine=posixaio --rw=read --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --ioengine=posixaio --rw=write --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test

Test                 IOPS     MB/s
4K QD4 rnd read      39800    163
4K QD4 rnd write     10400    42.5
4K QD4 seq read      87200    357
4K QD4 seq write     57400    235
64K QD4 rnd read     34200    2242
64K QD4 rnd write    9303     610
64K QD4 seq read     30200    1979
64K QD4 seq write    12700    831
1M QD4 rnd read      5484     5751
1M QD4 rnd write     741      778
1M QD4 seq read      5723     6002
1M QD4 seq write     855      897
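As a sanity check on figures like these, throughput should be roughly IOPS multiplied by the block size. A minimal sketch, assuming the MB/s values are decimal megabytes (10^6 bytes) and the block sizes are 4 KiB, 64 KiB and 1 MiB; the helper name `iops_to_mbs` is made up for this example:

```shell
#!/bin/sh
# Convert an IOPS figure and a block size in bytes into decimal MB/s.
iops_to_mbs() {
    # $1 = IOPS, $2 = block size in bytes
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs / 1000000 }'
}

iops_to_mbs 39800 4096       # 4K random read   -> 163.0
iops_to_mbs 34200 65536      # 64K random read  -> 2241.3
iops_to_mbs 5484 1048576     # 1M random read   -> 5750.4
```

The computed values land within rounding distance of the reported throughput, which also helps when a copied table has lost its column separators.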

Sync:
Code:
fio --filename=test --sync=1 --rw=randread --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --sync=1 --rw=randwrite --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --sync=1 --rw=read --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --sync=1 --rw=write --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test

Test                 IOPS     MB/s
4K QD4 rnd read      20100    82.5
4K QD4 rnd write     4885     20.0
4K QD4 seq read      265000   1087
4K QD4 seq write     9959     40.8
64K QD4 rnd read     17000    1113
64K QD4 rnd write    3549     233
64K QD4 seq read     29900    1962
64K QD4 seq write    4373     287
1M QD4 rnd read      1959     2055
1M QD4 rnd write     634      665
1M QD4 seq read      1889     1981
1M QD4 seq write     651      683

I'm aware that these tests may not bypass the caches, so if you have further tests I could run, please tell me. I do not have performance results from a similar system, which is why I'm quite unsure whether the measured performance is "good". What do you think about the results?

Thanks in advance!
 

Kartright

Dabbler
Joined
Feb 3, 2022
Messages
11
I experimented a little. I swapped the LSI back out for the HP P840ar controller and could verify that the LSI 9305-16i HBA is definitely faster. I understand now why everyone recommends it.

I also tested different layouts using my 12 disks and my SLOG devices:
1) 12 disks in a single RAIDz2 vdev
2) two RAIDz2 vdevs of 6 disks each in the same pool

The results for RAIDz2 and the original RAIDz3 are equal; the only difference is that I lose only 2 disks to parity instead of 3 (more usable capacity left). Two vdevs of 6 disks each in RAIDz2 doubled the IOPS (as expected), but gave slightly slower sequential speeds, with 4 drives lost to parity in total.
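The capacity trade-off between these layouts can be estimated with simple arithmetic. A minimal sketch, assuming 16 TB disks and counting only whole parity disks per vdev; it ignores ZFS metadata, padding and reserved space, so real usable capacity will be lower. The helper name `usable_tb` is made up for this example:

```shell
#!/bin/sh
# Rough usable capacity: vdevs * (disks per vdev - parity disks) * TB per disk.
usable_tb() {
    # $1 = vdev count, $2 = disks per vdev, $3 = parity disks per vdev, $4 = TB per disk
    echo $(( $1 * ($2 - $3) * $4 ))
}

usable_tb 1 12 3 16    # one 12-disk RAIDz3 vdev   -> 144
usable_tb 1 12 2 16    # one 12-disk RAIDz2 vdev   -> 160
usable_tb 2 6 2 16     # two 6-disk RAIDz2 vdevs   -> 128
```

So the two-vdev layout trades about 32 TB of raw capacity against the single RAIDz2 vdev for the extra IOPS.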

There's one caveat. Keep your HBA cool. It is an embedded computer and throws off about 10 watts. Failure to have airflow directed over your HBA can cause overheat, and in extreme cases LSI HBA's have been found to vomit random bits all over, which isn't good for ZFS.
See: https://www.truenas.com/community/r...bas-and-why-cant-i-use-a-raid-controller.139/

After reading that again, I rethought the PCI device layout and moved the LSI HBA to a different slot. Before this test, the PCI device order in the server was the following:
1) Intel SSD 900P
2) Intel SSD 900P
3) LSI HBA

I changed to the following order:
1) LSI HBA
2) Intel SSD 900P
3) Intel SSD 900P

This definitely made a difference, as the performance increased. The system configuration did not change otherwise, so the only explanation I can think of is that the LSI runs much cooler now due to improved airflow (it is in the top slot now).

These are the "new" results I got:
Async:
Code:
fio --filename=test --ioengine=posixaio --rw=randread --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --ioengine=posixaio --rw=randwrite --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --ioengine=posixaio --rw=read --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test
fio --filename=test --ioengine=posixaio --rw=write --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=10G --runtime=300 && rm test

Test                 IOPS (old)   MB/s (old)   IOPS (new)   MB/s (new)
4K QD4 rnd read      39800        163          39800        163
4K QD4 rnd write     10400        42.5         17600        72.2
4K QD4 seq read      87200        357          87000        356
4K QD4 seq write     57400        235          57400        235
64K QD4 rnd read     34200        2242         33300        2181
64K QD4 rnd write    9303         610          15900        1041
64K QD4 seq read     30200        1979         30100        1974
64K QD4 seq write    12700        831          20400        1337
1M QD4 rnd read      5484         5751         7377         7736
1M QD4 rnd write     741          778          1715         1799
1M QD4 seq read      5723         6002         7150         7498
1M QD4 seq write     855          897          1899         1991
As you can see, the performance increased. Sync I/O speeds also went up.
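To put a number on the improvement, the relative change between an old and a new IOPS figure can be computed like this. It is a minimal sketch using a few of the async values from the results above; the helper name `pct_change` is made up for this example:

```shell
#!/bin/sh
# Percentage change from an old value to a new one, to one decimal place.
pct_change() {
    # $1 = old value, $2 = new value
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_change 10400 17600    # 4K random write      -> 69.2
pct_change 9303 15900     # 64K random write     -> 70.9
pct_change 5484 7377      # 1M random read       -> 34.5
pct_change 855 1899       # 1M sequential write  -> 122.1
```

Since ZFS benchmark runs vary from run to run, single-digit percentage differences are probably best treated as noise.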

I think I found the problem and can happily leave it at that now :)

PS: Maybe I'll switch back to TrueNAS Scale when the official release is here and retest ;)
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
15,876
back to the HP P840ar controller and could verify that the LSI 9305-16i HBA is definitely faster. I understand now why everyone recommends it.

That has nothing to do with why everyone recommends it. We recommend, virtually require, it -- because it works correctly, when most other controllers do not.


We recommend the LSI 2008 and 2308 based controllers too, which may well be slower than your HP. It's the fact that they're known to work correctly which is relevant, not the speed.
 