NFS write performance slow on 10gbps networking

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
We have two servers adjacent to one another, which was handy for performance testing. I will describe them as A and B (because those are literally their short hostnames).
A:
Supermicro X10DRi-T4+​
Xeon E5-2620 v4​
132G RAM​
Intel 2-port X520 NIC​
SLOG: dual Samsung SSD 970 PRO 512GB​
SAS3008 HBA​
46 4TB HGST ultrastars in a zpool of 2-disk mirrors (+2 spares)​
SC846BE1C-R1K03JBOD chassis contains half the disks​
B:
Supermicro X11SPH-nCTF​
Xeon Silver 4114​
132G RAM​
Chelsio T520-SO NIC​
SLOG: dual INTEL SSDPE21D280GA​
SAS3008 HBA​
30 6TB HGST ultrastars in a zpool of 2-disk mirrors (+2 spares)​
Both servers are wired via DACs to QFX switches, trunked, storage VLAN set to mtu 9000. I can "ping -D -s 8973" in both directions so connectivity and MTU seem fine.
iperf3 looks good - 0 retransmits in both directions and an average of about 9.8 Gbits/sec each way.
Writing locally to the zpool cranks out an easy 400MB/s.

Problem:
NFS write performance is low for both systems given their internal configuration and connection to 10g networking. NFS writes are much slower on A than on B.

For testing I created a 10G file from /dev/random, mounted the other host's pool over NFS, and used rsync to copy the test file across. This is what it looked like:

Code:
a# rsync --info=progress2 testfile /var/tmp/test/
 10,485,760,000 100%  143.72MB/s    0:01:09 (xfr#1, to-chk=0/1)

b# rsync --info=progress2 testfile /var/tmp/test/
 10,485,760,000 100%   35.86MB/s    0:04:38 (xfr#1, to-chk=0/1)


During the file copy I watched gstat; the SLOG devices on A were barely even hitting 4% busy, mostly closer to 2%. When the testfile was being copied to B, I saw its SLOG top out at 10%.

Questions:
  1. What kind of NFS write speed would you expect to see with the above hardware?
  2. What am I missing in the test setup above?
  3. Why is performance so low?
  4. What can I look for to explain the lower performance on the system with a higher spindle count?
Thanks in advance!
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
Attempted the same rsync task with sync turned off; also tried with compression turned off. Turning off sync led to a different busy profile in gstat as expected (the SLOG disks went idle, and the spinning disks were more consistently busy, but not highly so, maybe 10-ish percent), but neither change made any discernible difference in performance.
 
Joined
Dec 29, 2014
Messages
1,135
rsync isn't a good choice to test network performance. It spends a lot of time trying to figure out if it should copy files or not. rsync also isn't using NFS.
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
rsync isn't a good choice to test network performance. It spends a lot of time trying to figure out if it should copy files or not. rsync also isn't using NFS.
I've got the NFS shares mounted locally, so it's definitely using nfs (rsync /path/to/source /path/to/dest/). I could use "cp" I guess - it's one big file that I delete after copying so there isn't a lot of time spent enumerating directory trees.
 
Joined
Dec 29, 2014
Messages
1,135
rsync still isn't a good choice to test performance for the reasons I mentioned earlier. cp would certainly be a better bet.
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
Alright, well... cp doesn't have a great way to monitor rate, so I wrapped it in "time" and also used pv. Either way this gets measured, we're still at less than half of link speed in the faster direction, and still unsure why the slower direction is so much slower. Looking at the below, two questions:

1. Is 3.9 Gbit/s the expected speed when everything is healthy?
2. How do I chase down the write-speed bottleneck on host A?

Stats for a->b:
a# time cp testfile /var/tmp/test/

real 0m21.160s
user 0m0.008s
sys 0m12.368s

a# pv testfile > /var/tmp/test/testfile
9.77GiB 0:00:21 [ 463MiB/s] [================================================================================================>] 100%

Stats for b->a:
b# time cp testfile /var/tmp/test/

real 2m51.710s
user 0m0.009s
sys 2m47.609s
b# pv testfile > /var/tmp/test/testfile
9.77GiB 0:02:48 [59.3MiB/s] [================================================================================================>] 100%
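To double-check the unit math behind question 1, here's a quick conversion (the 463 and 59.3 MiB/s figures are the pv readings above; the function name is just for illustration):

```shell
# Convert a MiB/s rate to Gbit/s: 1 MiB = 2^20 bytes, 1 Gbit = 10^9 bits.
mibs_to_gbps() {
  awk -v r="$1" 'BEGIN { printf "%.2f\n", r * 1048576 * 8 / 1e9 }'
}

mibs_to_gbps 463    # a->b: prints 3.88, i.e. under half of the 10 Gbit/s link
mibs_to_gbps 59.3   # b->a: prints 0.50
```

So the "fast" direction is roughly 3.9 Gbit/s on a 10 Gbit/s link, and the slow direction is about 5% of line rate.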
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
59.3 MB/sec is so low, I would start with the blinky lights and /usr/bin/top. It should be obvious from the access lights if the disks are not performing. Look for patterns, walking pairs, etc... Run top in another window while performing your test. Look for processes blocked in iowait, etc...
 

taupehat

Explorer
Joined
Dec 20, 2016
Messages
54
59.3 MB/sec is so low, I would start with the blinky lights and /usr/bin/top. It should be obvious from the access lights if the disks are not performing. Look for patterns, walking pairs, etc... Run top in another window while performing your test. Look for processes blocked in iowait, etc...
The confusing issue here is that local write is fine: I can create a 10G file from /dev/urandom and copy it locally in about 30 seconds. Similarly, I can write that file out via NFS to another array at 406MiB/s (as of today), so I don't think the network infra is at issue. MTU 9000 is set for the LACP bonds on both hosts. It's only inbound NFS that's slow, and it is super slow: today's rate is 39.4MiB/s.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Ditching the SLOG is probably a pretty good suggestion, just to simplify the config. Past that, I'm thinking you need to dig in the logs for unusual messages, and maybe get a network analyzer on the wire somehow.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
NFS performance is always a function of parallelism... with "queue depth = 1", bandwidth is always low. If you test with tools like fio or vdbench, you can control the queue depth and get more accurate numbers. Copying files is generally bound to a single IO at a time.
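To illustrate, a minimal fio job file along those lines might look like this (the directory is a placeholder for wherever the remote pool is mounted on the client; run it with `fio nfs-write.fio`):

```ini
; Sequential 128K writes over the NFS mount with a deeper queue.
; directory= is an assumption: point it at your actual NFS mount point.
[nfs-seq-write]
directory=/mnt/b-test
rw=write
bs=128k
size=4g
ioengine=posixaio
iodepth=16
numjobs=4
group_reporting
end_fsync=1
```

Sweeping iodepth and numjobs (1, 4, 16...) would show how much of the gap is queue depth versus something else in the stack.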
 

Herr_Merlin

Patron
Joined
Oct 25, 2019
Messages
200
Remove the SLOGs or set sync to disabled, then test again. Those are not devices that are favorable as SLOG; they don't have the features you need.
 

alecz

Dabbler
Joined
Apr 2, 2021
Messages
18
The FreeBSD NFS server is "hardcoded" to use a 128K read and write size, which likely affects the write performance of ZFS over NFS:
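For what it's worth, the client side can pin the transfer sizes at mount time. A hypothetical fstab entry (host and paths assumed, not from the thread):

```
a:/mnt/tank/test  /mnt/a-test  nfs  rw,nfsv3,rsize=131072,wsize=131072  0  0
```

On FreeBSD, `nfsstat -m` on the client shows the sizes that were actually negotiated; values above what the server supports get clamped.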
 