Help with varying (slow-ish) write speeds

azjet77

Cadet
Joined
Nov 20, 2023
Messages
8
Disclaimer: it's my first time running TrueNAS and, by extension, ZFS. I recently put together a new NAS machine; the specs are:
  • Chassis/MB
    • Supermicro CSE-836e16/X11SSH-GF-1585L
  • CPU
    • Xeon E3-1585L v5
  • RAM
    • 64GB ECC DDR4 2133MHz
  • Storage
    • TrueNAS Core on Crucial MX500
    • Pool = 4x 16TB 7.2k Seagate Exos X18 ST16000NM000J in RAIDZ1
  • Hard disk controllers
    • Inspur LSI SAS-3008 controller (in IT mode)
  • Network cards
    • Onboard 1GbE LAN
I created a pool with 1 vdev and then an archives dataset under it using compression=lz4, dedup=off, recordsize=512k. Then I ran rsync from my old NAS to transfer data to this one using rsync -rP -e 'ssh -c aes128-cbc -o compression=no'. Initially, the transfer speed (as reported by rsync) was 105-110MB/s, which is to be expected if it's filling up ZFS cache first and encountering no network issues. But some time later, I assume after the cache filled up, the speed dropped to 35-40MB/s and then slowly crept up to around 60-65MB/s.
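
For completeness, the dataset settings and the transfer roughly correspond to the following (a sketch only; "tank", the host and the paths are placeholders for the real names):

Code:
# dataset with the properties mentioned above; "tank" is a placeholder pool name
zfs create -o compression=lz4 -o dedup=off -o recordsize=512K tank/archives
zfs get compression,dedup,recordsize tank/archives

# transfer from the old NAS; user, host and paths are placeholders
rsync -rP -e 'ssh -c aes128-cbc -o compression=no' user@oldnas:/data/ /mnt/tank/archives/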

I collected some data using gstat during the slower writes:

Code:
truenas% gstat -s -b -f '^da'
dT: 1.041s  w: 1.000s  filter: ^da
 L(q)  ops/s    r/s     kB   kBps   ms/r    w/s     kB   kBps   ms/w   %busy Name
    0    271      0      0      0    0.0    271    665 180120    1.9   52.3  da0
    1    209      0      0      0    0.0    209    832 174281    3.7   78.1  da1
    0    274      0      0      0    0.0    274    658 180120    1.9   52.4  da2
    1    212      0      0      0    0.0    212    822 174400    3.7   78.0  da3
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da3p1
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da2p1
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da1p1
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da0p1
    1    212      0      0      0    0.0    212    822 174400    3.7   78.0  da3p2
    1    209      0      0      0    0.0    209    832 174281    3.7   78.1  da1p2
    0    271      0      0      0    0.0    271    665 180120    1.9   52.4  da0p2
    0    274      0      0      0    0.0    274    658 180120    1.9   52.4  da2p2
truenas% gstat -s -b -f '^da'
dT: 1.001s  w: 1.000s  filter: ^da
 L(q)  ops/s    r/s     kB   kBps   ms/r    w/s     kB   kBps   ms/w   %busy Name
    1    130      0      0      0    0.0    130    823 106927    3.2   41.0  da0
    1    118      0      0      0    0.0    118    854 100716    3.5   40.9  da1
    1    137      0      0      0    0.0    137    800 109505    3.0   41.0  da2
    1    127      0      0      0    0.0    127    830 105376    3.2   41.1  da3
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da3p1
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da2p1
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da1p1
    0      0      0      0      0    0.0      0      0      0    0.0    0.0  da0p1
    1    127      0      0      0    0.0    127    830 105376    3.2   41.1  da3p2
    1    118      0      0      0    0.0    118    854 100716    3.5   40.9  da1p2
    1    130      0      0      0    0.0    130    823 106927    3.2   41.0  da0p2
    1    137      0      0      0    0.0    137    800 109505    3.0   41.0  da2p2
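
For a pool-level view of the same writes, zpool iostat can be run alongside gstat (a minimal sketch; "tank" is a placeholder for the actual pool name):

Code:
# per-vdev/per-disk throughput, refreshed every second
zpool iostat -v tank 1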


I also did a test using fio:

Code:
truenas% fio --name=test --size=5g --rw=write --ioengine=posixaio --direct=1 --bs=512k
test: (g=0): rw=write, bs=(R) 512KiB-512KiB, (W) 512KiB-512KiB, (T) 512KiB-512KiB, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
test: Laying out IO file (1 file / 5120MiB)
Jobs: 1 (f=1): [W(1)][-.-%][w=822MiB/s][w=1643 IOPS][eta 00m:00s]
test: (groupid=0, jobs=1): err= 0: pid=33083: Fri Jan 26 09:18:07 2024
  write: IOPS=2933, BW=1467MiB/s (1538MB/s)(5120MiB/3491msec); 0 zone resets
    slat (usec): min=4, max=3745, avg=32.08, stdev=61.01
    clat (nsec): min=900, max=2169.1k, avg=307591.28, stdev=336626.21
     lat (usec): min=31, max=3971, avg=339.67, stdev=331.07
    clat percentiles (nsec):
     |  1.00th=[    988],  5.00th=[   1020], 10.00th=[   1048],
     | 20.00th=[  27008], 30.00th=[  34560], 40.00th=[ 109056],
     | 50.00th=[ 154624], 60.00th=[ 284672], 70.00th=[ 544768],
     | 80.00th=[ 602112], 90.00th=[ 651264], 95.00th=[1155072],
     | 99.00th=[1253376], 99.50th=[1269760], 99.90th=[1499136],
     | 99.95th=[1531904], 99.99th=[2007040]
   bw (  MiB/s): min=  813, max= 5019, per=100.00%, avg=1573.03, stdev=1689.42, samples=6
   iops        : min= 1626, max=10039, avg=3145.83, stdev=3378.65, samples=6
  lat (nsec)   : 1000=1.91%
  lat (usec)   : 2=16.64%, 4=0.49%, 10=0.08%, 20=0.23%, 50=19.09%
  lat (usec)   : 100=1.37%, 250=18.82%, 500=7.02%, 750=27.46%, 1000=1.40%
  lat (msec)   : 2=5.47%, 4=0.02%
  cpu          : usr=4.53%, sys=1.78%, ctx=10607, majf=1, minf=1
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,10240,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=1467MiB/s (1538MB/s), 1467MiB/s-1467MiB/s (1538MB/s-1538MB/s), io=5120MiB (5369MB), run=3491-3491msec


Is this to be expected with this configuration or is something else going on?
 

MrGuvernment

Patron
Joined
Jun 15, 2017
Messages
268
It depends on the file sizes you are copying over. If they are not gigabytes in size and are instead many smaller files in the MB/KB range, random read and write performance is not always the best. You maxed out the 1Gb link initially because things on your old NAS may have been in cache; keep in mind that TrueNAS's L2ARC is a read cache.
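
If you're not sure what the size mix looks like on the source side, a rough check is something like this (the path is just a placeholder):

Code:
# rough file count and total size of the data being copied; /data is a placeholder
find /data -type f | wc -l
du -sh /data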

Also, that fio test only runs a single thread, so you may not be seeing the true performance of your new system.
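
Something along these lines pushes several parallel writers at the pool and fsyncs at the end, so the result is less about RAM buffering (a sketch only; the directory path, job count and sizes are placeholders to adjust):

Code:
# 4 parallel 512k sequential writers with a deeper queue; --end_fsync makes fio
# flush at the end so buffered writes count against the runtime.
# /mnt/tank/archives is a placeholder path on the pool.
fio --name=test --directory=/mnt/tank/archives --size=5g --rw=write \
    --ioengine=posixaio --bs=512k --numjobs=4 --iodepth=16 \
    --end_fsync=1 --group_reporting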

I have some test options in my thread about performance and using fio, as well as a source someone else noted on another site, that may help you get some different numbers.

 