Poor 10Gb network performance on TrueNAS SCALE

Benji99

Cadet
Joined
Jan 30, 2023
Messages
9
Hi folks,

I'm not particularly happy with the network performance of my TrueNAS SCALE build and was hoping it could be improved.

My TrueNAS SCALE server specs are:
X99 motherboard
64 GB of RAM
Xeon E5-2690v3 CPU
Intel X710 quad-port SFP+ NIC (2 ports in use via SFP+ DACs, connected to a UniFi Aggregation switch)

NIC details
Code:
sudo lspci | grep X710
04:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
04:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
04:00.2 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)
04:00.3 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 02)

sudo find /sys | grep drivers.*04:00.0
/sys/bus/pci/drivers/i40e/0000:04:00.0
sudo find /sys | grep drivers.*04:00.1
/sys/bus/pci/drivers/i40e/0000:04:00.1

sudo lshw -class network
  *-network:0
       description: Ethernet interface
       product: Ethernet Controller X710 for 10GbE SFP+
       vendor: Intel Corporation
       physical id: 0
       bus info: pci@0000:04:00.0
       logical name: ens4f0
       version: 02
       serial: 40:a6:b7:3b:2d:e8
       size: 10Gbit/s
       capacity: 10Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi msix pciexpress vpd bus_master cap_list rom ethernet physical fibre 10000bt-fd
       configuration: autonegotiation=off broadcast=yes driver=i40e driverversion=5.15.79+truenas duplex=full firmware=6.80 0x800042e1 0.385.97 ip=192.168.6.5 latency=0 link=yes multicast=yes speed=10Gbit/s
       resources: irq:29 memory:90000000-907fffff memory:92800000-92807fff memory:fb980000-fb9fffff memory:92000000-921fffff memory:92820000-9289ffff
  *-network:1
       description: Ethernet interface
       product: Ethernet Controller X710 for 10GbE SFP+
       vendor: Intel Corporation
       physical id: 0.1
       bus info: pci@0000:04:00.1
       logical name: ens4f1
       version: 02
       serial: 40:a6:b7:3b:2d:e9
       size: 10Gbit/s
       capacity: 10Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi msix pciexpress vpd bus_master cap_list rom ethernet physical fibre 10000bt-fd
       configuration: autonegotiation=off broadcast=yes driver=i40e driverversion=5.15.79+truenas duplex=full firmware=6.80 0x800042e1 0.385.97 latency=0 link=yes multicast=yes speed=10Gbit/s
       resources: irq:29 memory:90800000-90ffffff memory:92808000-9280ffff memory:fb900000-fb97ffff memory:92200000-923fffff memory:928a0000-9291ffff




I've run tests from two other machines:
1.) An Ubuntu Server VM hosted directly on the TrueNAS SCALE server, using the 2nd SFP+ NIC. It has 6 vCPU cores and 8 GB of RAM.
2.) A Lenovo M920q (i7-9700T) with 32 GB of RAM and a Supermicro AOC-STGN-I2S SFP+ card, with Ubuntu Server freshly installed.

I have a few datasets on my TrueNAS server, but I'm trying in particular to optimize my NVMe dataset, which has 2x 1TB NVMe drives in a RAIDZ1 vdev.

It looks like internally the NVMe pool is operating at an acceptable speed, with ~1.3 GB/s reported:

Code:
time dd if=/dev/zero of=/mnt/NVME_Pool/Docker_Data/testfile bs=16k count=128k
131072+0 records in
131072+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.6583 s, 1.3 GB/s


Here's the same test from machine 1.), over its NFS mount:
Code:
dd if=/dev/zero of=/mnt/nfsmount/testfile bs=16k count=128k
131072+0 records in
131072+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 10.1924 s, 211 MB/s


and 2.)
Code:
dd if=/dev/zero of=/mnt/NVME_Pool/Docker_Data/testfile bs=16k count=128k
131072+0 records in
131072+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.6583 s, 1.3 GB/s


iperf3 tests:
From 1.)
Code:
iperf3 -P 20 -c 192.168.6.5 -p 8008
Connecting to host 192.168.6.5, port 8008
[  5] local 192.168.6.7 port 36192 connected to 192.168.6.5 port 8008
<snip>
[ 43] local 192.168.6.7 port 36364 connected to 192.168.6.5 port 8008
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  15.6 MBytes   131 Mbits/sec    0    202 KBytes
<snip>
[ 43]   0.00-1.00   sec  11.5 MBytes  96.2 Mbits/sec    0    168 KBytes
[SUM]   0.00-1.00   sec   275 MBytes  2.30 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   1.00-2.00   sec  24.3 MBytes   204 Mbits/sec    0    325 KBytes
[<snip>
[ 43]   1.00-2.00   sec  14.9 MBytes   125 Mbits/sec    0    175 KBytes
[SUM]   1.00-2.00   sec   348 MBytes  2.92 Gbits/sec    4
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   2.00-3.00   sec  19.3 MBytes   162 Mbits/sec    0    325 KBytes
<snip>
[ 43]   2.00-3.00   sec  12.3 MBytes   103 Mbits/sec    0    184 KBytes
[SUM]   2.00-3.00   sec   278 MBytes  2.33 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   3.00-4.00   sec  18.1 MBytes   152 Mbits/sec    0    325 KBytes
<snip>
[ 43]   3.00-4.00   sec  12.3 MBytes   103 Mbits/sec    0    184 KBytes
[SUM]   3.00-4.00   sec   284 MBytes  2.38 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   4.00-5.00   sec  23.0 MBytes   193 Mbits/sec    0    325 KBytes
<snip>
[ 43]   4.00-5.00   sec  15.0 MBytes   126 Mbits/sec    0    191 KBytes
[SUM]   4.00-5.00   sec   365 MBytes  3.07 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   5.00-6.00   sec  22.2 MBytes   186 Mbits/sec    0    325 KBytes
<snip>
[ 43]   5.00-6.00   sec  14.0 MBytes   118 Mbits/sec    0    197 KBytes
[SUM]   5.00-6.00   sec   344 MBytes  2.88 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   6.00-7.00   sec  21.7 MBytes   182 Mbits/sec    0    325 KBytes
<snip>
[ 43]   6.00-7.00   sec  15.8 MBytes   133 Mbits/sec    0    197 KBytes
[SUM]   6.00-7.00   sec   359 MBytes  3.01 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   7.00-8.00   sec  20.3 MBytes   171 Mbits/sec    0    325 KBytes
<snip>
[ 43]   7.00-8.00   sec  13.9 MBytes   117 Mbits/sec    0    197 KBytes
[SUM]   7.00-8.00   sec   343 MBytes  2.88 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   8.00-9.00   sec  22.3 MBytes   187 Mbits/sec    0    325 KBytes
<snip>
[ 43]   8.00-9.00   sec  15.4 MBytes   129 Mbits/sec    0    208 KBytes
[SUM]   8.00-9.00   sec   365 MBytes  3.06 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   9.00-10.00  sec  18.8 MBytes   158 Mbits/sec    0    325 KBytes
<snip>
[ 43]   9.00-10.00  sec  16.0 MBytes   135 Mbits/sec    0    270 KBytes
[SUM]   9.00-10.00  sec   306 MBytes  2.57 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   206 MBytes   172 Mbits/sec    0             sender
<snip>
[ 43]   0.00-10.00  sec   140 MBytes   117 Mbits/sec                  receiver
[SUM]   0.00-10.00  sec  3.19 GBytes  2.74 Gbits/sec    4             sender
[SUM]   0.00-10.00  sec  3.17 GBytes  2.72 Gbits/sec                  receiver

iperf Done.


From 2.)
Code:
iperf3 -P 20 -c 192.168.6.5 -p 8008
Connecting to host 192.168.6.5, port 8008
[  5] local 192.168.4.219 port 40520 connected to 192.168.6.5 port 8008
<snip>
[ 43] local 192.168.4.219 port 40698 connected to 192.168.6.5 port 8008
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  36.5 MBytes   306 Mbits/sec    0   1.28 MBytes
<snip>
[ 43]   0.00-1.00   sec  28.7 MBytes   241 Mbits/sec    0    834 KBytes
[SUM]   0.00-1.00   sec   491 MBytes  4.11 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   1.00-2.00   sec  27.5 MBytes   231 Mbits/sec    0   1.28 MBytes
<snip>
[ 43]   1.00-2.00   sec  27.5 MBytes   231 Mbits/sec    0    899 KBytes
[SUM]   1.00-2.00   sec   431 MBytes  3.62 Gbits/sec  225
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   2.00-3.00   sec  23.8 MBytes   199 Mbits/sec    0   1.35 MBytes
<snip>
[ 43]   2.00-3.00   sec  22.5 MBytes   189 Mbits/sec    0    899 KBytes
[SUM]   2.00-3.00   sec   417 MBytes  3.50 Gbits/sec   90
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   3.00-4.00   sec  22.5 MBytes   189 Mbits/sec    0   1.35 MBytes
<snip>
[ 43]   3.00-4.00   sec  21.2 MBytes   178 Mbits/sec   18    191 KBytes
[SUM]   3.00-4.00   sec   419 MBytes  3.51 Gbits/sec  108
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   4.00-5.00   sec  22.5 MBytes   189 Mbits/sec    0   1.35 MBytes
<snip>
[ 43]   4.00-5.00   sec  21.2 MBytes   178 Mbits/sec   27    629 KBytes
[SUM]   4.00-5.00   sec   432 MBytes  3.63 Gbits/sec  117
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   5.00-6.00   sec  21.2 MBytes   178 Mbits/sec    0   1.35 MBytes
<snip>
[ 43]   5.00-6.00   sec  23.8 MBytes   199 Mbits/sec    0    629 KBytes
[SUM]   5.00-6.00   sec   452 MBytes  3.80 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   6.00-7.00   sec  22.5 MBytes   189 Mbits/sec    0   1.35 MBytes
<snip>
[ 43]   6.00-7.00   sec  28.8 MBytes   241 Mbits/sec    0    629 KBytes
[SUM]   6.00-7.00   sec   485 MBytes  4.07 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   7.00-8.00   sec  25.0 MBytes   210 Mbits/sec    0   1.35 MBytes
<snip>
[ 43]   7.00-8.00   sec  23.8 MBytes   199 Mbits/sec    0    629 KBytes
[SUM]   7.00-8.00   sec   434 MBytes  3.64 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   8.00-9.00   sec  21.2 MBytes   178 Mbits/sec    0   1.43 MBytes
<snip>
[ 43]   8.00-9.00   sec  22.5 MBytes   189 Mbits/sec    0    877 KBytes
[SUM]   8.00-9.00   sec   418 MBytes  3.50 Gbits/sec  405
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   9.00-10.00  sec  22.5 MBytes   189 Mbits/sec    0   1.43 MBytes
<snip>
[ 43]   9.00-10.00  sec  25.0 MBytes   210 Mbits/sec    0    877 KBytes
[SUM]   9.00-10.00  sec   429 MBytes  3.60 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   245 MBytes   206 Mbits/sec    0             sender
<snip>
[ 43]   0.00-10.00  sec   242 MBytes   203 Mbits/sec                  receiver
[SUM]   0.00-10.00  sec  4.30 GBytes  3.70 Gbits/sec  945             sender
[SUM]   0.00-10.00  sec  4.24 GBytes  3.64 Gbits/sec                  receiver

iperf Done.


I'll be getting another system online with a 10Gb card soon, so I'll be able to test without TrueNAS in the mix, but in the meantime, any thoughts?
Could there be a NIC misconfiguration or an outdated NIC driver? (The sort of checks I could run are sketched below.)
Better NFS/dataset settings (although doesn't the iperf3 test bypass the datasets and sharing protocols entirely)?
I don't think I'm hitting any CPU or RAM limitations, and the NICs clearly negotiated at 10Gb, so I'm really not sure what's going on...
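For reference, here are the kinds of NIC-side checks I can run (interface name taken from the lshw output above; this is just a sketch of what I'd look at, nothing has been tuned yet):
Code:
# Driver and firmware versions currently loaded
sudo ethtool -i ens4f0

# Offload settings (TSO/GSO/GRO/LRO)
sudo ethtool -k ens4f0 | grep -E 'segmentation|offload'

# Ring buffer sizes (current vs. hardware maximum)
sudo ethtool -g ens4f0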

And just for context: what I'm trying to achieve is the fastest possible remote storage for a Proxmox cluster, and perhaps also for Kubernetes/Docker persistent data hosted on the Proxmox nodes. I'm planning to add an x16 NVMe adapter with 4x NVMe drives, but clearly I'm not even close to saturating the current NVMe setup.

Any help appreciated!
 

Attachments

  • Screenshot 2023-03-10 at 6.03.22 PM.png (887.3 KB)
  • Screenshot 2023-03-10 at 6.03.47 PM.png (87.3 KB)
  • Screenshot 2023-03-10 at 6.03.36 PM.png (804.8 KB)
  • Screenshot 2023-03-10 at 6.04.25 PM.png (1.4 MB)

boostedn

Dabbler
Joined
Mar 9, 2023
Messages
14
I don't know the exact issue here, but I recently upgraded my network to 10G as well, and I also run a combination of Ubiquiti and TP-Link hardware. I found that enabling Flow Control globally on every switch drastically improves transfer speeds. You can leave Jumbo Frames unchecked.

If there are any other switches from other manufacturers downstream, you will need to enable Flow Control on those as well. On the client/server side, I did not have to do anything else.

My setup is a UDM Pro (SFP+ with a DAC to my TP-Link switch) -> TP-Link to my server with a DAC, then TP-Link to my client machine with an SFP+ to Ethernet converter and Cat6e cable. All connected at 10G.

In the UniFi controller it's located under Settings -> Networks -> Global Switch Settings: check the box, then save and wait for all the switches to update. As soon as I enabled it (and on my TP-Link as well) I was able to achieve 1.1 GB/s transfer speeds. My TrueNAS server only has 2.5G ports, but I max those out with the same settings using SFP+ to Ethernet modules from WiiTek.
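If you want to double-check that the TrueNAS side is actually negotiating pause frames, ethtool can show it (interface name assumed from the lshw output earlier in the thread; just a sketch):
Code:
# Show current flow-control (pause frame) settings on the NAS interface
ethtool -a ens4f0

# Enable RX/TX pause frames if they show as off (the link may bounce briefly)
sudo ethtool -A ens4f0 rx on tx on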



[Attachment: flowcontrol.png]
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
One comment: using /dev/zero as a source for timing tests can cause false readings. You may know this already, but if compression is enabled on the pool/dataset (you don't specify), then /dev/zero will compress down to nothing.

Code:
time dd if=/dev/zero of=/mnt/NVME_Pool/Docker_Data/testfile bs=16k count=128k
131072+0 records in
131072+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 1.6583 s, 1.3 GB/s
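A quick way to rule that out, assuming the pool/dataset names from your post, is to check the compression property and repeat the write with data that doesn't compress (fio shown here as a sketch; any incompressible source works):
Code:
# Is compression enabled on this dataset?
zfs get compression NVME_Pool/Docker_Data

# Sequential write test with incompressible (random) buffers
fio --name=seqwrite --directory=/mnt/NVME_Pool/Docker_Data \
    --rw=write --bs=16k --size=2G --refill_buffers --end_fsync=1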


Other than that, I don't have any suggestions.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
I too have noticed similar performance concerns with single-client, single-stream testing. Scaling out wider than that, performance was more in line with what I had expected. Before realizing that, I had resorted to installing a 100 Gigabit Ethernet card in my server (40 Gigabit to the switch) and a 25 Gigabit Ethernet card in my PC (10 Gigabit to the switch). But even that didn't resolve the issue, and I wish I hadn't bothered spending the extra money.

Performance didn't scale at all.
[Attachment: 1679368667549.png]


You really need to recognize that a single stream is limited and bottlenecked elsewhere in the stack; multiple streams scale far better. Adding
Code:
-P 10


to your iperf3 command will result in greater performance.
[Attachment: 1679369024806.png]


Unfortunately this is not a limit in TrueNAS SCALE, but rather something a little more complicated than that... I would challenge you to replicate the test on some other operating systems, but I can almost guarantee that you will see similar results. Single-threaded tests are single-threaded tests...
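If you do run that comparison on another OS, a minimal pair of runs like this (server address and port taken from your tests above) makes the single-stream vs. multi-stream gap obvious:
Code:
# One stream
iperf3 -c 192.168.6.5 -p 8008 -t 30

# Eight parallel streams
iperf3 -c 192.168.6.5 -p 8008 -t 30 -P 8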
 

kawal

Cadet
Joined
Aug 10, 2023
Messages
3
I had the same issue with iperf and real transfers being slow. I would get max 3Gb/s speeds with TrueNAS SCALE, but 9.8-9.9Gb/s using iperf3 -P 5 (and the same 3-4Gb/s performance in single streams). I tried all kinds of "improvements". Now I'm using OMV and get full speed transferring files without any tweaking.
HW:
HP DL380 Gen8, 2x Xeon E5-2680v2, 128GB RAM, P822/2GB RAID card on OMV with 12x 8TB SAS drives in hardware RAID 6 with XFS.
On TrueNAS I used an HP H220 HBA card in IT mode with RAIDZ2.
I started with OMV but moved to TrueNAS in hopes of better integration with Nvidia and Plex. I was surprised by the lacking performance in network transfers, so back to OMV6 I went.
I used to be a FreeNAS user a while ago and wanted to try it again.

kawal
 

Attachments

  • Speed test Desktop to OMV2.jpg (22.1 KB)

kawal

Cadet
Joined
Aug 10, 2023
Messages
3
I did iperf3 on OMV too, and a single stream will not get the full 10Gb/s either, but actual transfers are way faster.
 

Attachments

  • iperf omv.PNG (53.9 KB)
Joined
Dec 29, 2014
Messages
1,135
I have a vague recollection of iperf not performing as well as iperf3, back when I was running my TN on HP G6/G7 hardware. That was several years ago. Do you get different results from iperf3 versus the older version of iperf? You have to use iperf3 on both sides, since it listens on different ports than the older iperf.
 

kawal

Cadet
Joined
Aug 10, 2023
Messages
3
I have a vague recollection of iperf not performing as well as iperf3, back when I was running my TN on HP G6/G7 hardware. That was several years ago. Do you get different results from iperf3 versus the older version of iperf? You have to use iperf3 on both sides, since it listens on different ports than the older iperf.
Sorry, I did use iperf3, as seen in the screenshots.
 
Joined
Dec 29, 2014
Messages
1,135
Yeah, like I said earlier, you need to use -P to see what the theoretical maximum performance of your networking is.
I have seen this too. On the aforementioned HP platform, there was a net gain up to a certain number of threads. Once I went past that, adding threads reduced performance. It probably varies by CPU.
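A quick sweep like this will show where adding streams stops helping on a given CPU (server address and port taken from earlier in the thread; just a sketch):
Code:
for p in 1 2 4 8 16 32; do
    echo "=== ${p} parallel streams ==="
    iperf3 -c 192.168.6.5 -p 8008 -P "$p" -t 10 | tail -n 4
done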
 