TrueNAS slow performance with NFS

yucong111

Dabbler
Joined
Jan 4, 2023
Messages
17
# hardware and system version
CPU: Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz * 2
MEMORY: 128 GB
POOL: 2 TB * 10 (RAIDZ3) + 512 GB * 2 L2ARC
TRUENAS: TrueNAS-13.0-U3.1
# dd test performance
# on TrueNAS local
root@freenas[/mnt/dev1/docker]# dd if=/dev/zero of=test.dat bs=1M count=400 conv=fsync oflag=direct
400+0 records in
400+0 records out
419430400 bytes transferred in 0.126497 secs (3315728885 bytes/sec)
root@freenas[/mnt/dev1/docker]# dd if=/dev/zero of=test.dat bs=512K count=800 conv=fsync oflag=direct
800+0 records in
800+0 records out
419430400 bytes transferred in 0.128923 secs (3253346675 bytes/sec)
root@freenas[/mnt/dev1/docker]# dd if=/dev/zero of=test.dat bs=4K count=20000 conv=fsync oflag=direct
20000+0 records in
20000+0 records out
81920000 bytes transferred in 0.135577 secs (604232041 bytes/sec)
# on NFS client (1Gb network)
[root@node1 docker]# dd if=/dev/zero of=test.dat bs=1M count=400 conv=fsync oflag=direct
400+0 records in
400+0 records out
419430400 bytes (419 MB) copied, 4.01762 s, 104 MB/s
[root@node1 docker]# dd if=/dev/zero of=test.dat bs=512K count=800 conv=fsync oflag=direct
800+0 records in
800+0 records out
419430400 bytes (419 MB) copied, 4.39617 s, 95.4 MB/s
[root@node1 docker]# dd if=/dev/zero of=test.dat bs=4K count=20000 conv=fsync oflag=direct
20000+0 records in
20000+0 records out
81920000 bytes (82 MB) copied, 6.95326 s, 11.8 MB/s
# nfs
# client mount
192.168.10.16:/mnt/dev1/docker on /docker type nfs4 (rw,noatime,nodiratime,vers=4.1,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.10.120,local_lock=none,addr=192.168.10.16,_netdev)
# other
The TrueNAS pool already has sync=disabled set.

The 4K test is very slow on the NFS client. Can someone help me troubleshoot?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The 4K test is very slow on the NFS client. Can someone help me troubleshoot?

Sure, easy peasy. Drop the "conv=fsync" and "oflag=direct" nonsense. That's going to be stupid slow on a RAIDZ3. Using only the in-pool ZIL, it's going to be painful. You can get yourself a decent SLOG device if you need the sync writes.

See


Once you do that, you'll then run into a different problem because you're using /dev/zero, but we can talk about that in a bit.
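
For reference, once those flags are dropped, the re-test is just the same commands without them (same path and sizes as the tests above; this is only a sketch of the re-test, not a benchmarking recommendation):

# on the NFS client, plain buffered writes (no fsync, no O_DIRECT)
dd if=/dev/zero of=test.dat bs=1M count=400
dd if=/dev/zero of=test.dat bs=4K count=20000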
 

yucong111

Dabbler
Joined
Jan 4, 2023
Messages
17
Sure, easy peasy. Drop the "conv=fsync" and "oflag=direct" nonsense. That's going to be stupid slow on a RAIDZ3. Using only the in-pool ZIL, it's going to be painful. You can get yourself a decent SLOG device if you need the sync writes.

See


Once you do that, you'll then run into a different problem because you're using /dev/zero, but we can talk about that in a bit.
Thank you for your advice, I will add a SLOG device later. But for the 4K test, I still don't understand why, on the same pool, I can get 600 MB/s locally on TrueNAS but only 11 MB/s on the NFS client. I would have expected around 100 MB/s over the 1Gb network.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Thank you for your advice, I will add a SLOG device later. But for the 4K test, I still don't understand why, on the same pool, I can get 600 MB/s locally on TrueNAS but only 11 MB/s on the NFS client.

Because you're testing the ZFS host's compression speed when doing it locally. /dev/zero produces extremely compressible data, which ZFS will by default compress. Compression can be temporarily turned off on a dataset to "fix" this.
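
A quick way to try that, assuming the dataset is dev1/docker (inferred from the /mnt/dev1/docker path above, so double-check the name), would be roughly:

# note the current value so it can be restored after the test
zfs get compression dev1/docker
zfs set compression=off dev1/docker
# ... re-run the dd tests ...
zfs set compression=lz4 dev1/docker   # or whatever value the first command reported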

A RAIDZ3 array of ten disks is only going to be capable of speeds a few times the speed of the underlying component devices, so if your components are capable of 100MBytes/sec sustained, you might find the RAIDZ3 only able to do 300MBytes/sec or thereabouts.

All your dd test sizes are wack, by the way. If you have a 128GB system, you should use a large multiple such as 1TB for the target filesize to reduce the ARC cache influences on any read tests you later do. All your tests are completing in well under a second and well within ARC.
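
As a rough sketch, a more meaningful streaming test would write incompressible data to a file well over the 128GB of RAM, for example (size illustrative; note /dev/urandom itself can become the bottleneck, so a tool like fio with random data is often the better choice):

# ~256GiB of incompressible data, on the TrueNAS host or the client
dd if=/dev/urandom of=bigtest.dat bs=1M count=262144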
 

yucong111

Dabbler
Joined
Jan 4, 2023
Messages
17
Because you're testing the ZFS host's compression speed when doing it locally. /dev/zero produces extremely compressible data, which ZFS will by default compress. Compression can be temporarily turned off on a dataset to "fix" this.

A RAIDZ3 array of ten disks is only going to be capable of speeds a few times the speed of the underlying component devices, so if your components are capable of 100MBytes/sec sustained, you might find the RAIDZ3 only able to do 300MBytes/sec or thereabouts.

All your dd test sizes are wack, by the way. If you have a 128GB system, you should use a large multiple such as 1TB for the target filesize to reduce the ARC cache influences on any read tests you later do. All your tests are completing in well under a second and well within ARC.
Hello,
I read through some documentation. With sync=disabled, ZFS keeps the ZIL in memory, so adding a SLOG device may not improve performance right now, and a SLOG mainly helps write performance anyway. My real problem is that when the NFS client runs ls on a directory containing about 200k files it is very slow, and that is a read operation. Is there a way to improve that?
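
For reference, this is roughly how I am checking it (dataset name inferred from the mount shown earlier):

# on TrueNAS: confirm the dataset settings
zfs get sync,compression dev1/docker
# on the NFS client: time a listing of the ~200k files
time ls -l /docker > /dev/null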
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
RAIDZ is very poor at small transactions, which also includes metadata type operations.

Lots of RAM (128GB is probably fine) and persistent L2ARC for metadata will get you large improvements. Your RAIDZ with 200k files is going to take a very long time, maybe an hour or more, to do an "ls -l" the first time you do so after the NAS reboots, because none of the metadata will be in ARC. Using a persistent L2ARC or a special vdev for metadata will increase performance substantially; each has its pros and cons.
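
As a rough sketch of the two options (pool name inferred from /mnt/dev1/docker, device names are placeholders, and the exact sysctl/tunable name should be double-checked for your TrueNAS release):

# option A: dedicate the existing L2ARC to metadata and make it persistent across reboots
zfs set secondarycache=metadata dev1
sysctl vfs.zfs.l2arc.rebuild_enabled=1   # persistent L2ARC (OpenZFS 2.x); add as a tunable so it survives reboots
# option B: add a mirrored special vdev for metadata (note: it cannot be removed later from a pool with RAIDZ vdevs)
zpool add dev1 special mirror da20 da21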
 