10GbE - 8Gbps with iperf, 1.3Mbps with NFS

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Common error. You'll need to target the storage device nvd0 with the diskinfo command.

The last device is the actual enclosure itself (ses0), which can be addressed with the sesutil command to do things like blink drive-bay LEDs.
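For example (da2 here is just a stand-in for whichever drive you want to light up):

Code:
# list the enclosure slots and which disks sit in them
sesutil map
# turn the locate/identify LED for a given disk on, then off again
sesutil locate da2 on
sesutil locate da2 off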
 
Joined
Apr 26, 2015
Messages
320
Thanks. I should have noticed that in the info I was able to pull up and in the disks list on the system.

Code:
# diskinfo -v nvd0
nvd0
        512                     # sectorsize
        29260513280             # mediasize in bytes (27G)
        57149440                # mediasize in sectors
        0                       # stripesize
        0                       # stripeoffset
        INTEL MEMPEK1J032GA     # Disk descr.
        PHBT8030015K032P        # Disk ident.
        Yes                     # TRIM/UNMAP support
        0                       # Rotation rate in RPM

I also read that the SLOG is assigned to a pool (or pools), and since I'm still waiting for the drives, maybe I can't set anything up yet.
That's IF I understood what was meant in the post.

The enclosure access is interesting.
That means it's similar to some of the embedded devices I work with, like mini routers and Pi boards, where I can control the LEDs (as you mentioned) and other hardware built into the unit.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Log devices (and cache devices) can be attached to and detached from pools without harming their contents, so you can attach it to your current pool and see if it has a positive effect. Once you get your new SSDs, you can detach it from the current pool and attach it to the SSD pool.
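On TrueNAS you'd normally do this through the web UI so the middleware stays aware of it, but under the hood it's roughly the following (yourpool and nvd0 are placeholders for your pool name and however the Optane shows up):

Code:
# add the Optane as a log vdev to the existing pool
zpool add yourpool log nvd0
# later: remove it again before moving it to the SSD pool
zpool remove yourpool nvd0
zpool status yourpool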

Re: sesutil - another user, @Ender117, created a neat little script to check pools and blink the LED of the failed drive. I'm not sure if it's been updated since release, but unless the commands have changed it should still work.
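The gist of it is something like this (a rough sketch, not his actual script - it assumes whole-disk daX names appear in zpool status; a pool built on gptid labels would need a glabel lookup first):

Code:
#!/bin/sh
# blink the locate LED of any disk zpool doesn't report as ONLINE
for disk in $(zpool status | awk '$1 ~ /^da[0-9]+$/ && $2 != "ONLINE" {print $1}'); do
        echo "${disk} is not ONLINE - turning on its locate LED"
        sesutil locate "${disk}" on
done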

 
Joined
Apr 26, 2015
Messages
320
Well, Happy New Year everyone. I've got some downtime this morning, so I thought I'd play with this since I got the SSDs yesterday.
I created the following pool and later added the SLOG, but I'm not sure I've done it right since there isn't much of a speed improvement.

Code:
# zpool status -v pool01
  pool: pool01
 state: ONLINE
config:

        NAME                                            STATE     READ WRITE CKSUM
        pool01                                          ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/5177bf3c-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
            gptid/523d87a8-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/5096ab16-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
            gptid/52148cbb-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/50c02623-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
            gptid/52633bce-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
          mirror-3                                      ONLINE       0     0     0
            gptid/514863e8-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
            gptid/51fb7702-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
          mirror-4                                      ONLINE       0     0     0
            gptid/51b7e69d-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
            gptid/526a2d98-6a94-11ec-b980-90b11c1dd891  ONLINE       0     0     0
        logs
          gptid/0b3db7a3-6b29-11ec-8cb3-90b11c1dd891    ONLINE       0     0     0

errors: No known data errors


I'm not even hitting 1Gbps, either when copying a file from the ESXi host to TrueNAS or when copying to the TrueNAS NFS share mounted in a VM.

[Screenshot attachment: 2022-01-01_104806.jpg]



And I see this in the logs. I guess I should have ordered a couple of extras; now I have to wait again for a replacement to come in.
Is it possible the performance is crummy because of just this one drive?

Device: /dev/da2, SMART Failure: WARNING: ascq=0x5

[Screenshot attachment: 2022-01-01_115417.jpg]


Code:
# smartctl -a /dev/da2
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p10 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               STEC
Product:              S842E800M2
Revision:             E4T1
Compliance:           SPC-4
User Capacity:        800,166,076,416 bytes [800 GB]
Logical block size:   512 bytes
LU is resource provisioned, LBPRZ=1
Rotation Rate:        Solid State Device
Form Factor:          2.5 inches
Logical Unit id:      0x5000a7203007e1ad
Serial number:        STM000176F52
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sat Jan  1 10:56:47 2022 PST
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: WARNING: ascq=0x5 [asc=b, ascq=5]

Current Drive Temperature:     34 C
Drive Trip Temperature:        75 C

Accumulated power on time, hours:minutes 17792:50
Elements in grown defect list: 2

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   88546974545 50384615  88546974545  88597359160   88597359160    1134504.022         389
write:         0        1         1         1          1     623811.367           0
verify: 1335998615      228  1335998615  1335998843   1335998843      12658.790          15

Non-medium error count:        0

No Self-tests have been logged


I decided to try removing that drive from the pool and making a simple pool. I only got up to 900+Mbps.

Code:
# zpool status -v pool01
  pool: pool01
 state: ONLINE
config:

        NAME                                            STATE     READ WRITE CKSUM
        pool01                                          ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/b493dcba-6b52-11ec-8cb3-90b11c1dd891  ONLINE       0     0     0
            gptid/b4a62c73-6b52-11ec-8cb3-90b11c1dd891  ONLINE       0     0     0
        logs
          gptid/b4804181-6b52-11ec-8cb3-90b11c1dd891    ONLINE       0     0     0

errors: No known data errors
 
Joined
Apr 26, 2015
Messages
320
I can't seem to find a solution/answer to this. I'm trying to create a smaller pool of three mirrors and add a SLOG, but I keep getting the stripe warning with no options to change anything.

I went ahead anyhow just to test. This is all being done with the MTU back at defaults, since the DC doesn't allow jumbo frames between different locations, so there's no point in testing with jumbo.
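To re-confirm the raw line rate at the default MTU before blaming the storage, I can run iperf3 again, something like this (10.0.0.10 standing in for the TrueNAS address):

Code:
# on TrueNAS
iperf3 -s
# from a VM on the ESXi host (or any other 10GbE client)
iperf3 -c 10.0.0.10 -t 30 -P 4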

I mounted the NFS share I created in a VM on the ESXi host, which is connected with a 10Gbps NIC.
I was able to hit just above 1Gbps, but it always drops to just under 300Mbps as the file is copied.

[Screenshot attachment: 2022-01-01_165151.jpg]


[Screenshot attachment: 2022-01-01_170652.jpg]


I'm quite discouraged and nervous - I've missed my deadline and have spent a lot of additional money without getting much further.
I must be missing something again.

Code:
root@truenas[~]# fio --name=pool01 --size=5g --rw=write --ioengine=posixaio --direct=1 --bs=1m
pool01: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=posixaio, iodepth=1
fio-3.27
Starting 1 process
pool01: Laying out IO file (1 file / 5120MiB)
Jobs: 1 (f=1): [W(1)][100.0%][eta 00m:00s]
pool01: (groupid=0, jobs=1): err= 0: pid=20820: Sat Jan  1 16:58:24 2022
  write: IOPS=249, BW=249MiB/s (261MB/s)(5120MiB/20534msec); 0 zone resets
    slat (usec): min=25, max=1507, avg=79.84, stdev=32.80
    clat (usec): min=351, max=3700.8k, avg=3927.01, stdev=105220.39
     lat (usec): min=379, max=3700.9k, avg=4006.86, stdev=105219.98
    clat percentiles (usec):
     |  1.00th=[    371],  5.00th=[    433], 10.00th=[    510],
     | 20.00th=[    523], 30.00th=[    537], 40.00th=[    545],
     | 50.00th=[    553], 60.00th=[    586], 70.00th=[    627],
     | 80.00th=[    652], 90.00th=[    676], 95.00th=[    709],
     | 99.00th=[   2737], 99.50th=[   2769], 99.90th=[   2933],
     | 99.95th=[3338666], 99.99th=[3707765]
   bw (  KiB/s): min=324982, max=1579596, per=100.00%, avg=918998.55, stdev=428309.41, samples=11
   iops        : min=  317, max= 1542, avg=897.00, stdev=418.16, samples=11
  lat (usec)   : 500=8.01%, 750=87.77%, 1000=0.06%
  lat (msec)   : 4=4.06%, >=2000=0.10%
  cpu          : usr=2.00%, sys=0.41%, ctx=5136, majf=1, minf=1
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,5120,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=249MiB/s (261MB/s), 249MiB/s-249MiB/s (261MB/s-261MB/s), io=5120MiB (5369MB), run=20534-20534msec
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
A SLOG device is pool-critical, which is why you are getting the warning. If you tick force you can still create the pool, which is fine for testing.
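If you were doing the same thing from the CLI it would look roughly like this (device names made up - adjust to your disks; the -f plays the same role as the force tickbox):

Code:
zpool create -f testpool \
        mirror da3 da4 \
        mirror da5 da6 \
        mirror da7 da8 \
        log nvd0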
 
Joined
Apr 26, 2015
Messages
320
Hi, that's what I figured - nothing to lose, just testing - so I went ahead, but I must still have something messed up since the speeds aren't great yet.
 
Joined
Apr 26, 2015
Messages
320
I noticed the 10GbE driver was a little older, so I updated it, but I'm still barely getting 1Gbps.
 
Joined
Apr 26, 2015
Messages
320
I would really appreciate some help from anyone following this, as I'm super late getting this installed and there's no point running the storage on 10G NICs when I can only get around 1Gbps.

I must be missing something, since I thought I followed all of the recommendations in this thread.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
For my hardware configuration (Pool design) - please see my signature

Running the same test as you on my pools:
fio --name=pool01 --size=5g --rw=write --ioengine=posixaio --direct=1 --bs=1m
pool01: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=posixaio, iodepth=1

BigPool, HDD, backed by Optane SLOG:
WRITE: bw=222MiB/s (233MB/s), 222MiB/s-222MiB/s (233MB/s-233MB/s), io=5120MiB (5369MB), run=23081-23081msec

SSDPool, SSD, backed by Optane SLOG
WRITE: bw=231MiB/s (243MB/s), 231MiB/s-231MiB/s (243MB/s-243MB/s), io=5120MiB (5369MB), run=22117-22117msec
So I get the same results as you from that test

NVMEPool for giggles. No SLOG, Sync=disabled
WRITE: bw=2129MiB/s (2232MB/s), 2129MiB/s-2129MiB/s (2232MB/s-2232MB/s), io=5120MiB (5369MB), run=2405-2405msec

New Test:
fio --bs=128k --direct=1 --directory=/mnt/lol/fio --gtod_reduce=1 --ioengine=posixaio --iodepth=32 --group_reporting --name=randrw --numjobs=12 --ramp_time=10 --runtime=60 --rw=randrw --size=256M --time_based

BigPool:
READ: bw=254MiB/s (266MB/s), 254MiB/s-254MiB/s (266MB/s-266MB/s), io=15.0GiB (16.1GB), run=60435-60435msec
WRITE: bw=255MiB/s (267MB/s), 255MiB/s-255MiB/s (267MB/s-267MB/s), io=15.0GiB (16.1GB), run=60435-60435msec

SSDPool:
READ: bw=1605MiB/s (1683MB/s), 1605MiB/s-1605MiB/s (1683MB/s-1683MB/s), io=94.1GiB (101GB), run=60033-60033msec
WRITE: bw=1605MiB/s (1682MB/s), 1605MiB/s-1605MiB/s (1682MB/s-1682MB/s), io=94.1GiB (101GB), run=60033-60033msec

NVMEPool, No SLOG, Sync=disabled (if it makes a difference)
READ: bw=308MiB/s (323MB/s), 308MiB/s-308MiB/s (323MB/s-323MB/s), io=18.3GiB (19.6GB), run=60699-60699msec
WRITE: bw=308MiB/s (323MB/s), 308MiB/s-308MiB/s (323MB/s-323MB/s), io=18.3GiB (19.6GB), run=60699-60699msec

So to summarise

Old Test (your testing command)
BigPool = 1.8Gb/s
SSDPool = 1.8Gb/s
NVMEPool = 17.0Gb/s !!!!

New Test
BigPool = 1.8 Gb/s
SSDPool = 12.9 Gb/s
NVMEPool = 2.5 Gb/s

My takeaway from this is that benchmarks are crap - what does anyone else think?
Maybe we could agree on a fio command that a group of us could post results for!
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Hmmm, that NVMe test is weird - the results feel wrong.
Repeating the NVMe tests:
Old: 7.4Gb/s - much different. Still quick though.
New: 2.6Gb/s
 
Joined
Apr 26, 2015
Messages
320
Is there anything else I can share about the test setup? Maybe I don't have the SLOG set up right, but even without it I now have all SSD drives with no encryption, so why am I still not seeing over 1Gbps? That's confusing.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Try running the fio command I used in the second set of tests. Does it give different results?
 
Joined
Apr 26, 2015
Messages
320
Sure, though without the --directory option since I'm not sure what that should be right now.


Code:
# fio --bs=128k --direct=1 --gtod_reduce=1 --ioengine=posixaio --iodepth=32 --group_reporting --name=randrw --numjobs=12 --ramp_time=10 --runtime=60 --rw=randrw --size=256M --time_based
randrw: (g=0): rw=randrw, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=posixaio, iodepth=32
...
fio-3.27
Starting 12 processes
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
Jobs: 12 (f=12): [m(12)][100.0%][eta 00m:00s]
randrw: (groupid=0, jobs=12): err= 0: pid=65554: Tue Jan  4 08:37:55 2022
  read: IOPS=603, BW=75.6MiB/s (79.3MB/s)(4994MiB/66027msec)
   bw (  MiB/s): min=  139, max= 2341, per=100.00%, avg=1076.09, stdev=61.61, samples=108
   iops        : min= 1108, max=18729, avg=8603.11, stdev=492.83, samples=108
  write: IOPS=610, BW=76.8MiB/s (80.5MB/s)(5069MiB/66027msec); 0 zone resets
   bw (  MiB/s): min=  167, max= 2359, per=100.00%, avg=1087.94, stdev=61.71, samples=108
   iops        : min= 1335, max=18870, avg=8697.67, stdev=493.75, samples=108
  cpu          : usr=0.09%, sys=0.26%, ctx=9279, majf=0, minf=1
  IO depths    : 1=2.1%, 2=4.7%, 4=10.0%, 8=21.1%, 16=55.4%, 32=6.6%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=96.4%, 8=0.6%, 16=0.4%, 32=2.6%, 64=0.0%, >=64=0.0%
     issued rwts: total=39818,40316,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=75.6MiB/s (79.3MB/s), 75.6MiB/s-75.6MiB/s (79.3MB/s-79.3MB/s), io=4994MiB (5237MB), run=66027-66027msec
  WRITE: bw=76.8MiB/s (80.5MB/s), 76.8MiB/s-76.8MiB/s (80.5MB/s-80.5MB/s), io=5069MiB (5315MB), run=66027-66027msec
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
The --directory option lets you indicate where the fio files will go; without it I don't know where you were testing. Just create a folder called IO on the top-level dataset/pool you are testing (it makes it easier to delete afterwards) so we know what the fio command is exercising. Adjust to match your setup.
For example, my SSD pool is called SSDPool, so I used the shell, created /mnt/SSDPool/IO, and pointed the fio test at that.

For all I know, you might have been testing your boot disk with that command.
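In your case that would be something like this, taking pool01 from your earlier zpool status (adjust the path if you want to test a child dataset instead):

Code:
mkdir -p /mnt/pool01/io
fio --bs=128k --direct=1 --directory=/mnt/pool01/io --gtod_reduce=1 \
        --ioengine=posixaio --iodepth=32 --group_reporting --name=randrw \
        --numjobs=12 --ramp_time=10 --runtime=60 --rw=randrw --size=256M --time_based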
 
Joined
Apr 26, 2015
Messages
320
Ok, done.

Code:
# fio --bs=128k --direct=1 --directory=/mnt/pool01/io --gtod_reduce=1 --ioengine=posixaio --iodepth=32 --group_reporting --name=randrw --numjobs=12 --ramp_time=10 --runtime=60 --rw=randrw --size=256M --time_based
randrw: (g=0): rw=randrw, bs=(R) 128KiB-128KiB, (W) 128KiB-128KiB, (T) 128KiB-128KiB, ioengine=posixaio, iodepth=32
...
fio-3.27
Starting 12 processes
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
randrw: Laying out IO file (1 file / 256MiB)
Jobs: 12 (f=12): [m(12)][100.0%][r=666MiB/s,w=660MiB/s][r=5325,w=5282 IOPS][eta 00m:00s]
randrw: (groupid=0, jobs=12): err= 0: pid=68841: Tue Jan  4 13:07:22 2022
  read: IOPS=5614, BW=702MiB/s (736MB/s)(41.2GiB/60054msec)
   bw (  KiB/s): min=277199, max=1780099, per=99.99%, avg=718838.66, stdev=26213.18, samples=1428
   iops        : min= 2158, max=13904, avg=5611.39, stdev=204.81, samples=1428
  write: IOPS=5611, BW=702MiB/s (736MB/s)(41.2GiB/60054msec); 0 zone resets
   bw (  KiB/s): min=286652, max=1782065, per=99.93%, avg=718326.28, stdev=25961.68, samples=1428
   iops        : min= 2232, max=13918, avg=5607.60, stdev=202.83, samples=1428
  cpu          : usr=1.20%, sys=2.19%, ctx=561857, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=71.3%, 32=28.5%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=93.9%, 8=4.4%, 16=1.6%, 32=0.1%, 64=0.0%, >=64=0.0%
     issued rwts: total=337176,336988,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=32

Run status group 0 (all jobs):
   READ: bw=702MiB/s (736MB/s), 702MiB/s-702MiB/s (736MB/s-736MB/s), io=41.2GiB (44.2GB), run=60054-60054msec
  WRITE: bw=702MiB/s (736MB/s), 702MiB/s-702MiB/s (736MB/s-736MB/s), io=41.2GiB (44.2GB), run=60054-60054msec
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Which is a very different figure from your first result.
702MiB/s ≈ 5.6 Gbps (actually a bit more - the converter I'm using, because I'm lazy, only takes MB/s rather than MiB/s, so the real figure is about 4.9% higher, roughly 5.9 Gbps).
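Working it out by hand:

Code:
# 702 MiB/s -> Gbit/s: 702 * 2^20 bytes * 8 bits / 10^9
echo "702 * 1048576 * 8 / 1000000000" | bc -l
# prints ~5.89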

So you are at 50%+ of your NIC speed now with a higher iodepth
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
So - question - whose benchmark is correct? (answer - probably neither, or rather "it depends")
 
Joined
Apr 26, 2015
Messages
320
Well, that's an interesting observation and calculation, but why am I still not seeing over 1Gbps in actual transfers?
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Ahh - I don't know.
Is db01 physical or virtual? Is there a direct path to the storage, or does it go through other switches? [You may have answered these before - if so, sorry.]
The pv command - what hardware was it running on?

I might try something similar on mine; I just don't have anything set up atm.
 