My new pool consists of 6 x 512GB Samsung EVOs with an Intel Optane 900P as SLOG:
Code:
  pool: easy
 state: ONLINE
  scan: resilvered 1.90G in 0 days 00:00:05 with 0 errors on Sat Apr 21 12:45:24 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        easy                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/59862696-44e2-11e8-bcdd-001b216cc170  ONLINE       0     0     0
            gptid/59ca03ea-44e2-11e8-bcdd-001b216cc170  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/5a1633e0-44e2-11e8-bcdd-001b216cc170  ONLINE       0     0     0
            gptid/5a6890fb-44e2-11e8-bcdd-001b216cc170  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/5abb3453-44e2-11e8-bcdd-001b216cc170  ONLINE       0     0     0
            gptid/5b07a7c9-44e2-11e8-bcdd-001b216cc170  ONLINE       0     0     0
        logs
          nvd0p1                                        ONLINE       0     0     0

errors: No known data errors
Here is the local performance of the pool with sync=disabled set:
FIO writing (local) with 16 IO depth: 2195MB/s
Code:
root@freenas:/mnt/easy/vmware-nfs/test # fio fio-seq-write.job
file1: (g=0): rw=write, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.0
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=1624MiB/s][r=0,w=6496 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=56186: Sun Apr 22 20:01:04 2018
  write: IOPS=8372, BW=2093MiB/s (2195MB/s)(123GiB/60001msec)
    clat (usec): min=23, max=536675, avg=112.91, stdev=1121.18
     lat (usec): min=24, max=536678, avg=117.48, stdev=1122.20
    clat percentiles (usec):
     |  1.00th=[   34],  5.00th=[   43], 10.00th=[   44], 20.00th=[   48],
     | 30.00th=[   52], 40.00th=[   60], 50.00th=[   72], 60.00th=[   91],
     | 70.00th=[  115], 80.00th=[  129], 90.00th=[  172], 95.00th=[  194],
     | 99.00th=[  537], 99.50th=[  922], 99.90th=[ 3032], 99.95th=[ 5473],
     | 99.99th=[20317]
   bw (  MiB/s): min=   12, max= 3290, per=99.42%, avg=2080.85, stdev=431.53, samples=119
   iops        : min=   49, max=13163, avg=8322.92, stdev=1726.09, samples=119
  lat (usec)   : 50=26.61%, 100=37.22%, 250=33.28%, 500=1.78%, 750=0.47%
  lat (usec)   : 1000=0.19%
  lat (msec)   : 2=0.27%, 4=0.10%, 10=0.05%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
  cpu          : usr=4.26%, sys=44.28%, ctx=1285338, majf=0, minf=0
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,502330,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2093MiB/s (2195MB/s), 2093MiB/s-2093MiB/s (2195MB/s-2195MB/s), io=123GiB (132GB), run=60001-60001msec
FIO reading (local) with 16 IO depth: 4165MB/s
Code:
root@freenas:/mnt/easy/vmware-nfs/test # fio fio-seq-read.job
file1: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.0
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=4046MiB/s,w=0KiB/s][r=16.2k,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=57281: Sun Apr 22 20:02:15 2018
  read: IOPS=15.9k, BW=3972MiB/s (4165MB/s)(233GiB/60001msec)
    clat (usec): min=52, max=25326, avg=62.48, stdev=28.12
     lat (usec): min=52, max=25326, avg=62.53, stdev=28.13
    clat percentiles (usec):
     |  1.00th=[   60],  5.00th=[   61], 10.00th=[   61], 20.00th=[   61],
     | 30.00th=[   61], 40.00th=[   62], 50.00th=[   62], 60.00th=[   62],
     | 70.00th=[   62], 80.00th=[   63], 90.00th=[   64], 95.00th=[   65],
     | 99.00th=[   88], 99.50th=[  113], 99.90th=[  174], 99.95th=[  215],
     | 99.99th=[  359]
   bw (  MiB/s): min= 2341, max= 4039, per=98.87%, avg=3926.77, stdev=246.27, samples=119
   iops        : min= 9365, max=16158, avg=15706.62, stdev=985.10, samples=119
  lat (usec)   : 100=99.27%, 250=0.70%, 500=0.03%, 750=0.01%, 1000=0.01%
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.01%, 50=0.01%
  cpu          : usr=1.04%, sys=98.66%, ctx=12352, majf=0, minf=64
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=953221,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=3972MiB/s (4165MB/s), 3972MiB/s-3972MiB/s (4165MB/s-4165MB/s), io=233GiB (250GB), run=60001-60001msec
This is about what you would expect from this pool, even though the 10GB file I used is a bit small for performance testing and some of the reads probably come out of the ARC.
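For completeness, the job files are plain sequential jobs. A sketch of fio-seq-write.job that reproduces the parameters fio reports above (psync, 256KiB blocks, iodepth 16, 10GiB file, 60 second time-based run) would look like this; my actual file may differ slightly:
Code:
; fio-seq-write.job (sketch; fio-seq-read.job is the same with rw=read)
[file1]
rw=write
bs=256k
ioengine=psync
iodepth=16
size=10g
runtime=60
time_based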
Now for the network part:
I mounted the pool over NFS on my two ESXi machines. All three machines are equipped with Intel X520-DA2 cards (10Gbps) and Intel AFBR-703SDZ-IN2 transceivers, connected with LC-to-LC OM3 fibre cables. The cards are directly connected; I am not using any switch.
I do not have any tunables set on FreeNAS, and I increased the number of threads in my NFS config to 32. I did some further testing with more threads on ESXi and FreeNAS, but that didn't seem to have any effect, so I left the FreeNAS NFS config at 32 threads.
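For reference, that thread count is the "Number of servers" setting under Services > NFS in the FreeNAS GUI, which ends up as the -n flag on nfsd. A quick way to sanity-check it from the shell (sysctl names from memory, they may differ between FreeBSD/FreeNAS versions):
Code:
# thread limits the kernel NFS server is running with ("Number of servers" in the GUI)
root@freenas:~ # sysctl vfs.nfsd.minthreads vfs.nfsd.maxthreads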
The ESXi machine I am using for testing also has no NFS-specific tunables changed.
I made a 50GB VMDK on the NFS share and attached it to a CentOS 7 machine. Nothing else is running on the NFS share, and the CentOS VM itself lives on local ESXi storage, so the only traffic to the share is the disk I am writing to and reading from. That disk is formatted with EXT4.
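Inside the guest the extra VMDK shows up as /dev/sdc (you can see it in the disk stats at the bottom of the fio output below). Setting it up was nothing special, roughly like this (the device name is from the fio disk stats; the mount point is just an example):
Code:
# format the attached VMDK with ext4 and mount it where the fio jobs run
[root@core ~]# mkfs.ext4 /dev/sdc
[root@core ~]# mkdir /benchmark
[root@core ~]# mount /dev/sdc /benchmark
[root@core ~]# cd /benchmark && fio fio-seq-write.job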
FIO writing (NFS) with 16 IO depth: 1141MB/s
Code:
[root@core benchmark]# fio fio-seq-write.job
file1: (g=0): rw=write, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.1
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [W(1)][100.0%][r=0KiB/s,w=1082MiB/s][r=0,w=4329 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=3106: Sun Apr 22 19:57:18 2018
  write: IOPS=4353, BW=1088MiB/s (1141MB/s)(63.8GiB/60001msec)
    clat (usec): min=79, max=93065, avg=226.05, stdev=1364.14
     lat (usec): min=80, max=93068, avg=228.87, stdev=1364.17
    clat percentiles (usec):
     |  1.00th=[   83],  5.00th=[   85], 10.00th=[   87], 20.00th=[   90],
     | 30.00th=[   93], 40.00th=[   96], 50.00th=[   99], 60.00th=[  104],
     | 70.00th=[  110], 80.00th=[  116], 90.00th=[  124], 95.00th=[  135],
     | 99.00th=[ 1090], 99.50th=[13829], 99.90th=[16450], 99.95th=[16909],
     | 99.99th=[18220]
   bw (  MiB/s): min=  866, max= 1878, per=99.88%, avg=1087.08, stdev=85.35, samples=120
   iops        : min= 3466, max= 7514, avg=4348.21, stdev=341.40, samples=120
  lat (usec)   : 100=51.57%, 250=47.05%, 500=0.10%, 750=0.23%, 1000=0.06%
  lat (msec)   : 2=0.08%, 4=0.01%, 10=0.22%, 20=0.68%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu          : usr=1.64%, sys=45.76%, ctx=3246, majf=0, minf=33
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,261215,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=1088MiB/s (1141MB/s), 1088MiB/s-1088MiB/s (1141MB/s-1141MB/s), io=63.8GiB (68.5GB), run=60001-60001msec

Disk stats (read/write):
  sdc: ios=0/129339, merge=0/48060, ticks=0/8463868, in_queue=8467668, util=99.80%
So far so good; this is close to the theoretical limit (1250MB/s) of the 10Gbps card.
FIO reading (NFS) with 16 IO depth: 376MB/s
Code:
[root@core benchmark]# fio fio-seq-read.job
file1: (g=0): rw=read, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=psync, iodepth=16
fio-3.1
Starting 1 process
file1: Laying out IO file (1 file / 10240MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=358MiB/s,w=0KiB/s][r=1433,w=0 IOPS][eta 00m:00s]
file1: (groupid=0, jobs=1): err= 0: pid=3112: Sun Apr 22 19:58:43 2018
  read: IOPS=1434, BW=359MiB/s (376MB/s)(21.0GiB/60001msec)
    clat (usec): min=403, max=4792, avg=695.46, stdev=129.56
     lat (usec): min=403, max=4793, avg=695.63, stdev=129.57
    clat percentiles (usec):
     |  1.00th=[  502],  5.00th=[  529], 10.00th=[  529], 20.00th=[  545],
     | 30.00th=[  586], 40.00th=[  660], 50.00th=[  709], 60.00th=[  742],
     | 70.00th=[  775], 80.00th=[  816], 90.00th=[  857], 95.00th=[  889],
     | 99.00th=[  979], 99.50th=[ 1004], 99.90th=[ 1074], 99.95th=[ 1090],
     | 99.99th=[ 1631]
   bw (  KiB/s): min=343552, max=446976, per=100.00%, avg=367395.09, stdev=20000.71, samples=120
   iops        : min= 1342, max= 1746, avg=1435.03, stdev=78.18, samples=120
  lat (usec)   : 500=0.98%, 750=61.49%, 1000=36.95%
  lat (msec)   : 2=0.57%, 4=0.01%, 10=0.01%
  cpu          : usr=0.41%, sys=3.17%, ctx=86093, majf=0, minf=97
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=86091,0,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
   READ: bw=359MiB/s (376MB/s), 359MiB/s-359MiB/s (376MB/s-376MB/s), io=21.0GiB (22.6GB), run=60001-60001msec

Disk stats (read/write):
  sdc: ios=86090/15, merge=0/78, ticks=58377/3, in_queue=58324, util=97.07%
Now this is where the problem starts: I can read at 4165MB/s in the local benchmark but only 376MB/s over NFS. That doesn't make sense to me, because I can write to the same NFS datastore at 1141MB/s.
So I did some further digging and ran some iperf benchmarks to and from FreeNAS:
FreeNAS > ESXi (5.33 Gbits/sec)
Code:
------------------------------------------------------------
Client connecting to 192.168.22.5, TCP port 5001
TCP window size: 32.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.22.10 port 60859 connected with 192.168.22.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  6.20 GBytes  5.33 Gbits/sec
ESXi > FreeNAS (9.38 Gbits/sec)
Code:
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[  4] local 192.168.22.10 port 5001 connected with 192.168.22.5 port 47769
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  10.9 GBytes  9.38 Gbits/sec
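These are plain single-stream iperf runs in both directions, essentially the following (192.168.22.10 is FreeNAS and 192.168.22.5 the ESXi host, as in the outputs above):
Code:
# FreeNAS > ESXi: iperf server on the ESXi host, client on FreeNAS
# (the iperf binary on ESXi is wherever you copied/found it; path omitted)
[root@esxi:~] iperf -s
root@freenas:~ # iperf -c 192.168.22.5

# ESXi > FreeNAS: server on FreeNAS, client on the ESXi host
root@freenas:~ # iperf -s
[root@esxi:~] iperf -c 192.168.22.10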
Now this is where it gets weird: I reach 9.38Gbits/sec when connecting from ESXi (client) to FreeNAS (server), but only 5.33Gbits/sec when connecting from FreeNAS (client) to ESXi (server). Notably, that slow direction (FreeNAS sending to ESXi) is the same direction the data flows during an NFS read. I'm not sure what causes this; maybe the ESXi environment just isn't strong enough to run an iperf server?
What else have I tried?
I also mounted the NFS share directly inside the same VM; reading and writing to that mount gave similar performance to the results above.
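That was a plain NFS mount straight from the CentOS guest, something along these lines (the FreeNAS IP and export path are taken from the outputs above; the mount point and options are just examples, assuming the guest reaches FreeNAS over the same storage network):
Code:
# mount the FreeNAS export directly inside the guest and run the same fio jobs
[root@core ~]# mkdir /mnt/nfs-test
[root@core ~]# mount -t nfs 192.168.22.10:/mnt/easy/vmware-nfs /mnt/nfs-test
[root@core ~]# cd /mnt/nfs-test && fio fio-seq-read.job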
I changed the following tunables inside ESXi (how I set them is sketched after this list):
NFS.MaxQueueDepth to various values
Increased the RX ring buffer to 4096
Net.TcpipHeapMax to 1536
Net.TcpipHeapSize to 32
None of these options actually increased performance.
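For the record, those changes were made roughly like this from the ESXi shell (the queue-depth value and the vmnic name are just examples; the heap settings need a host reboot to take effect):
Code:
# NFS queue depth (tried several values; 64 is just one example)
esxcli system settings advanced set -o /NFS/MaxQueueDepth -i 64
# TCP/IP heap settings (require a reboot)
esxcli system settings advanced set -o /Net/TcpipHeapMax -i 1536
esxcli system settings advanced set -o /Net/TcpipHeapSize -i 32
# RX ring buffer on the storage NIC (vmnic1 is an example; older builds use ethtool -G vmnic1 rx 4096)
esxcli network nic ring current set -n vmnic1 -r 4096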
The only thing left I can think of is to test performance against a bare-metal CentOS machine; maybe that way I can finally find out whether the bottleneck is in FreeNAS or in ESXi. But it still doesn't make sense to me why my read speeds are so much lower than my write speeds.
If anyone has a suggestion on what to check, it would be very welcome!