slow SMB read (but not write) in a striped mirror of NVMe SSDs

metebalci

Dabbler
Joined
Jan 18, 2023
Messages
28
I don't understand why the read speed is slow for this pool.

TrueNAS is running as a VM with 4 cores and 32 GB of memory, and an Intel 10G NIC assigned via PCIe passthrough.

All 4 disks in the pool below are Samsung PM981 NVMe SSDs, installed on an x16 Gen3 adapter and assigned to TrueNAS via PCIe passthrough.

Code:
root@tnas[~]# zpool status fast
  pool: fast
 state: ONLINE
config:

        NAME                                            STATE     READ WRITE CKSUM
        fast                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/db594e70-a209-11ed-8822-649d99b180d2  ONLINE       0     0     0
            gptid/db58ad97-a209-11ed-8822-649d99b180d2  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/db5a6840-a209-11ed-8822-649d99b180d2  ONLINE       0     0     0
            gptid/db56fd03-a209-11ed-8822-649d99b180d2  ONLINE       0     0     0

errors: No known data errors


This is the write from my Windows PC to TrueNAS. They are on the same VLAN, connected to the same switch. The file (64gb.bin) is a 64 GB file. Dirty data max is ~10 GB, and it consistently syncs ~2 GB in ~1 second, so there is no write throttle or delay happening. Hence, I think, the write speed is stable.

fastds-write.png
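For reference, the dirty data settings I am referring to can be checked with sysctl on Core (FreeBSD); just a sketch of the relevant knobs:

Code:
# ZFS write throttle / dirty data tunables (TrueNAS Core / FreeBSD)
sysctl vfs.zfs.dirty_data_max          # max dirty data before the write throttle kicks in
sysctl vfs.zfs.dirty_data_max_percent  # cap as a percentage of RAM
sysctl vfs.zfs.txg.timeout             # how often a txg is forced to sync (seconds)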


This is the read. I also tried copying just after a reboot and it is the same.

fastds-read.png


I also tried this with a 16 GB file, which I believe fully fits in ARC; a second copy shows 0 misses in arcstat, but the read performance is still the same, ~400 MB/s.
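To be clear about how I checked this, I just watched arcstat (the utility that ships with OpenZFS) while doing the second copy; roughly:

Code:
# print ARC stats once per second while the copy runs
arcstat 1
# columns of interest: read, miss, miss% (and arcsz to confirm the file is cached)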
 

Tony-1971

Contributor
Joined
Oct 1, 2016
Messages
147
Maybe you hit the max write speed of your Windows PC?
Best Regards,
Antonio
 

metebalci

Dabbler
Joined
Jan 18, 2023
Messages
28
Maybe you hit the max write speed of your Windows PC?
Best Regards,
Antonio

I forgot to mention that; it could be, but it is also an NVMe SSD.

1675245533307.png


I should also add that I tried the same with a 3x striped HDD pool, and the result was similar (naturally the write was a bit slower, but the read was again ~350 MB/s). That makes me think this is not related to the pool, but I couldn't find an answer yet.
 

metebalci

Dabbler
Joined
Jan 18, 2023
Messages
28
I did a very simple test with NFS (v3), with a Linux client on the same VLAN. It is a bit better (~800 MB/s write, ~500 MB/s read) but still not what I expect. Again, as far as I can tell, the file is entirely in ARC (I see zero misses in arcstat).
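The test itself was nothing fancy, roughly along these lines (host and paths are placeholders, not my actual layout):

Code:
# Linux client, NFSv3 mount
mount -t nfs -o vers=3 tnas:/mnt/fast/test /mnt/test
# write
dd if=/dev/zero of=/mnt/test/16g.bin bs=1M count=16384 conv=fsync
# read (drop the client page cache first so the data really comes over the network)
echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/test/16g.bin of=/dev/null bs=1M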
 

Tony-1971

Contributor
Joined
Oct 1, 2016
Messages
147
You can try running a local filesystem performance test with the fio command, and then iperf3 in both directions (as client and as server).
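For example (hostname is a placeholder):

Code:
# on TrueNAS
iperf3 -s
# on the Windows PC: normal direction, reverse direction, then parallel streams
iperf3 -c tnas
iperf3 -c tnas -R
iperf3 -c tnas -P 4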
 

metebalci

Dabbler
Joined
Jan 18, 2023
Messages
28
I already checked with iperf3 a few times and didn't see an issue that would cause something like this. Normally it is around 8-9 Gb/s in either direction with a single flow; on Windows I sometimes see 6 Gb/s, but I'm not sure if that is related to the iperf3 build, Windows, or the NIC. Parallel flows always reach >9 Gb/s.

Also, I don't see anything wrong with fio, but I'm not sure if I'm using the proper parameters.

Code:
fio --name=test --size=16g --rw=write --ioengine=posixaio --direct=1 --bs=1m

WRITE: bw=3000MiB/s (3146MB/s), 3000MiB/s-3000MiB/s (3146MB/s-3146MB/s), io=16.0GiB (17.2GB), run=5461-5461msec


Code:
fio --name=test --size=16g --rw=read --ioengine=posixaio --direct=1 --bs=1m

READ: bw=6297MiB/s (6603MB/s), 6297MiB/s-6297MiB/s (6603MB/s-6603MB/s), io=16.0GiB (17.2GB), run=2602-2602msec
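For comparison, a test with a deeper queue would be something like this; I am not sure these are the "proper" parameters either, just a sketch:

Code:
# sequential read with a deeper queue
fio --name=seqread --size=16g --rw=read --ioengine=posixaio --direct=1 --bs=1m --iodepth=8
# write test that also counts the final flush to disk
fio --name=seqwrite --size=16g --rw=write --ioengine=posixaio --direct=1 --bs=1m --iodepth=8 --end_fsync=1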
 

Tony-1971

Contributor
Joined
Oct 1, 2016
Messages
147
6 Gb/s on a single flow can explain the low read speed (NFS and Samba are single-flow).
You can also check whether the problem is related to CPU speed, rather than the iperf3 build, Windows, or the NIC.
 

metebalci

Dabbler
Joined
Jan 18, 2023
Messages
28
First, regarding a CPU limitation: I don't see any core go over 50% on TrueNAS, and I don't see my desktop CPU go over 15%. Both CPUs are quite new (EPYC 7313P and Ryzen 3700X) with base frequencies >3 GHz, so I don't think anything is CPU limited.

It seems like there is something going on with how SMB uses the network/NIC. Instead of just using a file copy, I mapped the SMB share as a drive and checked the speed with CrystalDiskMark:

1675336092612.png


A bit higher, but the SEQ1M Q1T1 read is similar to what I see, and SEQ1M Q8T1 reaches the limit. I understand SMB might be single-thread/single-flow, but I still don't get why the write speed is 2x the read speed on a basic Windows file copy.

Edit: I wondered about all Q=1 or 2 and T=1 or 2 combinations, so here they are:

1675336950042.png
 

metebalci

Dabbler
Joined
Jan 18, 2023
Messages
28
I tried SMB multichannel: I configured the server side (TrueNAS) and confirmed it is set up correctly on the client (Windows), but the result is the same. I did notice, however, that even with SMB multichannel I see 4 sockets opened; more than one is used concurrently when writing, but mostly only 1 (though different ones) is used for reading. I have a feeling RSS is not being used properly on the Windows side, but I could not dig into that much. I thought maybe the RSS hash is only on the IP address on Windows (I have not yet found how to see or modify that), and since the server had only one IP, I also tried connecting the other port so the server has 2 IPs visible from the client, but that did not change anything either.
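For completeness, this is roughly how I checked the client side (PowerShell as admin); I am not sure these cmdlets expose the RSS hashing policy I am wondering about:

Code:
# does the client see multiple channels to the server?
Get-SmbMultichannelConnection
# which client NICs does SMB consider RSS-capable?
Get-SmbClientNetworkInterface
# RSS settings of the 10G adapter itself (adapter name is a placeholder)
Get-NetAdapterRss -Name "Ethernet 10G"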
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Have you played with jumbo frames?
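E.g. after setting MTU 9000 on both ends (and the switch), you can verify the path with a don't-fragment ping from the Windows side (hostname is a placeholder):

Code:
# 8972 = 9000 - 20 (IP header) - 8 (ICMP header); -f sets don't-fragment
ping -f -l 8972 tnas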
 

Morris

Contributor
Joined
Nov 21, 2020
Messages
120
When you copy to the NAS, the data goes to RAM first and is written to disk later. When you read, you are reading from disk. RAM is much faster than even an NVMe drive.
 

metebalci

Dabbler
Joined
Jan 18, 2023
Messages
28
Sorry for the very late reply. I parked this test and started using TrueNAS Core with 4x 2-way mirrored HDDs. Recently I did some tests (SMB, CrystalDiskMark, SEQ1M Q8T1), and one thing that is obvious is that I need SMB multichannel in my setup (10G network).

Without SMB multichannel, I hit a limit of around 800 MB/s; it is the same for read and write, and the same with or without caches (ARC only, I don't have L2ARC).

When caches are disabled, with SMB multichannel (I tried various combinations: 1 or 2 NICs at client and server, etc.), the write throughput increases a bit, up to 1 GB/s (which should really be my theoretical maximum, since the HDDs are 260 MB/s on spec), but the read speed stays around 800 MB/s. This still puzzles me, because a 4x 2-way mirror setup should have 2x the read speed; either I'm missing something or my test is inappropriate for this. I was expecting reads to saturate the 10G network quickly.

When caches are enabled, with SMB multichannel, 10G is saturated: I see ~1.1 GB/s for both read and write.
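For reference, enabling multichannel on the TrueNAS side was essentially just auxiliary parameters on the SMB service; a sketch of what I mean (the interface speed/capability hint may or may not be needed, and the IP is a placeholder):

Code:
# SMB service -> Auxiliary Parameters
server multi channel support = yes
# optional interface hint for multichannel
interfaces = "192.168.10.10;speed=10000000000,capability=RSS"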
 