NFS write performance slow on 10gbps networking

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
We have two servers adjacent to one another, which was handy for performance testing. I will describe them as A and B (because those are literally their short hostnames).
A:
Supermicro X10DRi-T4+​
Xeon E5-2620 v4​
132G RAM​
Intel 2-port X520 NIC​
SLOG: dual Samsung SSD 970 PRO 512GB​
SAS3008 HBA​
46 4TB HGST ultrastars in a zpool of 2-disk mirrors (+2 spares)​
SC846BE1C-R1K03JBOD chassis contains half the disks​
B:
Supermicro X11SPH-nCTF​
Xeon Silver 4114​
132G RAM​
Chelsio T520-SO NIC​
SLOG: dual INTEL SSDPE21D280GA​
SAS3008 HBA​
30 6TB HGST ultrastars in a zpool of 2-disk mirrors (+2 spares)​
Both servers are wired via DACs to QFX switches, trunked, storage VLAN set to mtu 9000. I can "ping -D -s 8973" in both directions so connectivity and MTU seem fine.
iperf3 looks good - 0 retransmits in both directions and an average of about 9.8 Gbits/sec each way.
Writing locally to the zpool cranks out an easy 400MB/s.

Problem:
NFS write performance is low for both systems given their internal configuration and connection to 10g networking. NFS writes are much slower on A than on B.

For testing I created a 10G file from /dev/random, mounted the other host's pool over NFS, and used rsync to copy the test file across. This is what it looked like:

Code:
a# rsync --info=progress2 testfile /var/tmp/test/
 10,485,760,000 100%  143.72MB/s    0:01:09 (xfr#1, to-chk=0/1)

b# rsync --info=progress2 testfile /var/tmp/test/
 10,485,760,000 100%   35.86MB/s    0:04:38 (xfr#1, to-chk=0/1)


During the file copy I watched gstat; the SLOG devices on A were barely even hitting 4% busy, mostly closer to 2%. When the testfile was being copied to B, I saw its SLOG top out at 10%.

Questions:
  1. What kind of NFS write speed would you expect to see with the above hardware?
  2. What am I missing in the test setup above?
  3. Why is performance so low?
  4. What can I look for to explain the lower performance on the system with a higher spindle count?
Thanks in advance!
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
Attempted the same rsync task with sync turned off; also tried with compression turned off. Turning off sync led to a different busy profile in gstat as expected (the SLOG disks went idle, and the spinning disks were more consistently busy, but not highly so, maybe 10-ish percent), but neither change made any discernible difference in performance.
 
Joined
Dec 29, 2014
Messages
1,135
rsync isn't a good choice to test network performance. It spends a lot of time trying to figure out if it should copy files or not. rsync also isn't using NFS.
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
rsync isn't a good choice to test network performance. It spends a lot of time trying to figure out if it should copy files or not. rsync also isn't using NFS.
I've got the NFS shares mounted locally, so it's definitely using nfs (rsync /path/to/source /path/to/dest/). I could use "cp" I guess - it's one big file that I delete after copying so there isn't a lot of time spent enumerating directory trees.
 
Joined
Dec 29, 2014
Messages
1,135
rsync still isn't a good choice to test performance for the reasons I mentioned earlier. cp would certainly be a better bet.
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
Alright, well... cp doesn't have a great way to monitor rate, so I wrapped it in "time" and also used pv. Either way this gets measured, we're still at less than half of link speed in the faster direction, and still unsure why the slower direction is so much slower. Looking at the below, two questions:

1. Is 3.9 Gbit/s the expected speed when everything is healthy?
2. How do I chase down the write-speed bottleneck on host A?

Stats for a->b:
a# time cp testfile /var/tmp/test/

real 0m21.160s
user 0m0.008s
sys 0m12.368s

a# pv testfile > /var/tmp/test/testfile
9.77GiB 0:00:21 [ 463MiB/s] [================================================================================================>] 100%

Stats for b->a:
b# time cp testfile /var/tmp/test/

real 2m51.710s
user 0m0.009s
sys 2m47.609s
b# pv testfile > /var/tmp/test/testfile
9.77GiB 0:02:48 [59.3MiB/s] [================================================================================================>] 100%
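To double-check the unit math behind question 1, here's a quick conversion (the 463 and 59.3 MiB/s figures are the pv readings above; the function name is just for illustration):

```shell
# Convert a MiB/s rate to Gbit/s: 1 MiB = 2^20 bytes, 1 Gbit = 10^9 bits.
mibs_to_gbps() {
  awk -v r="$1" 'BEGIN { printf "%.2f\n", r * 1048576 * 8 / 1e9 }'
}

mibs_to_gbps 463    # a->b: prints 3.88, i.e. under half of the 10 Gbit/s link
mibs_to_gbps 59.3   # b->a: prints 0.50
```

So the "fast" direction is roughly 3.9 Gbit/s on a 10 Gbit/s link, and the slow direction is about 5% of line rate.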
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
59.3 MB/sec is so low, I would start with the blinky lights and /usr/bin/top. It should be obvious from the access lights if the disks are not performing. Look for patterns, walking pairs, etc... Run top in another window while performing your test. Look for processes blocked in iowait, etc...
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
59.3 MB/sec is so low, I would start with the blinky lights and /usr/bin/top. It should be obvious from the access lights if the disks are not performing. Look for patterns, walking pairs, etc... Run top in another window while performing your test. Look for processes blocked in iowait, etc...
The confusing issue here is that local write is fine: I can create a 10G file from /dev/urandom and copy it locally in about 30 seconds. Similarly, I can write that file out via NFS to another array at 406MiB/s (as of today), so I don't think the network infra is at issue. MTU 9000 is set for the LACP bonds on both hosts. It's only inbound NFS that's slow, and it is super slow: today's rate is 39.4MiB/s.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Ditching the SLOG is probably a pretty good suggestion, just to simplify the config. Past that, I'm thinking you need to dig in the logs for unusual messages, and maybe get a network analyzer on the wire somehow.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
NFS performance is always a function of parallelism... with "queue depth = 1", bandwidth is always low. If you test with tools like fio or vdbench, you can control the queue depth and get more accurate numbers. Copying files is generally bound to a single IO at a time.
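To illustrate, a minimal fio job file along those lines might look like this (the directory is a placeholder for wherever the remote pool is mounted on the client; run it with `fio nfs-write.fio`):

```ini
; Sequential 128K writes over the NFS mount with a deeper queue.
; directory= is an assumption: point it at your actual NFS mount point.
[nfs-seq-write]
directory=/mnt/b-test
rw=write
bs=128k
size=4g
ioengine=posixaio
iodepth=16
numjobs=4
group_reporting
end_fsync=1
```

Sweeping iodepth and numjobs (1, 4, 16...) would show how much of the gap is queue depth versus something else in the stack.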
 

Herr_Merlin

Patron
Joined
Oct 25, 2019
Messages
200
Remove the SLOGs or set sync to disabled, then test again. Those are not devices that are favorable as SLOG; they don't have the features you need.
 

alecz

Dabbler
Joined
Apr 2, 2021
Messages
18
The FreeBSD NFS server is "hardcoded" to use a 128K read and write size, which likely affects the write performance of ZFS over NFS:
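For what it's worth, the client side can pin the transfer sizes at mount time. A hypothetical fstab entry (host and paths assumed, not from the thread):

```
a:/mnt/tank/test  /mnt/a-test  nfs  rw,nfsv3,rsize=131072,wsize=131072  0  0
```

On FreeBSD, `nfsstat -m` on the client shows the sizes that were actually negotiated; values above what the server supports get clamped.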
 