SOLVED Disable hardware offload (bandwidth still meh...)

HolyK

Ninja Turtle
Moderator
OS: TrueNAS-12.0-U2 on SSD Intel 520 180GB
MB: SuperMicro MBD-X10SL7-F - Intel C222
CPU: Intel Core i3-4130
RAM: Kingston Value 16GB (2x8GB) DDR3 1333 ECC + SK Hynix 16GB (2x 8GB) DDR3-1600 ECC
PSU: Enermax ErPRO80+ 350W
Pool1: Mirror / WD "White" WD120EMAZ + WD120EMFZ 12TB
Pool2: RAIDZ2 / 6x WD "White" WD120EMAZ 12TB
Case: Fractal Design DEFINE R4 Black Pearl
Network: VLAN separation of host and jail stacks. Jails on a separate VLAN + subnet.

So as you might know, VNET has a bug in FreeBSD 12.0 (FreeBSD Bugzilla 235607) which causes very poor network performance when VNET + bridge is used.

Example of iperf between Jail (server) and Win7 (client) ... 4.19 Mbits/sec ... note that this is on a Gbit LAN ... and it completely breaks the TCP connection...
Client side
iperf.2.1.1.exe -c xxx.xxx.xxx.xxx -i 1 -p 5001 -w 512K -f m -t 10
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 0.500 MByte
------------------------------------------------------------
[ 1] local yyy.yyy.yyy.yyy port 53346 connected with xxx.xxx.xxx.xxx port 5001
tcp write failed: Software caused connection abort
[ ID] Interval Transfer Bandwidth
[ 1] 0.00-1.00 sec 0.500 MBytes 4.19 Mbits/sec
[ 1] 0.00-1.00 sec 0.500 MBytes 4.19 Mbits/sec

Server side

[ ID] Interval Transfer Bandwidth
[ 1] 0.00-1.00 sec 31.4 KBytes 257 Kbits/sec

And another example of iperf between Jail (server) and Debian Linux (client, a small ARM device) ... 11.2 Mbits/sec. It does not abort, but the speed is still terrible.
Client side
iperf -c xxx.xxx.xxx.xxx -i 1 -p 5001 -w 512K -f m -t 10
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 0.41 MByte (WARNING: requested 0.50 MByte)
------------------------------------------------------------
[ 3] local zzz.zzz.zzz.zzz port 44463 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 0.75 MBytes 6.32 Mbits/sec
[ 3] 1.0- 2.0 sec 1.50 MBytes 12.6 Mbits/sec
[ 3] 2.0- 3.0 sec 1.50 MBytes 12.6 Mbits/sec
[ 3] 3.0- 4.0 sec 1.75 MBytes 14.7 Mbits/sec

Server side

[ ID] Interval Transfer Bandwidth
[ 3] 0.00-10.04 sec 13.5 MBytes 11.2 Mbits/sec

The general recommendation is to disable hardware offloading, but the question is where? (A sketch of the relevant commands follows this list.)
- Yes, it is clearly needed on the host interface on which the VLAN/bridge sits, but what about the rest?
- Should it be explicitly disabled on the VLAN and the bridge as well? Neither seems to have any offload flags active even when the option is not explicitly disabled (ticked) in the GUI.
- What about the Windows VMs? (I see HW offloads (TSO/LRO) listed as "enabled" under the VirtIO NIC.)
- Jails do not seem to have it enabled if the host interface has it disabled.
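For reference, on the host NIC "disable hardware offloading" essentially means clearing the offload capability flags. A minimal sketch, assuming igb0 as in the config below; the exact set of capabilities your NIC exposes may differ, and in the TrueNAS GUI the "Disable Hardware Offloading" checkbox on the interface is the persistent way to do the same thing:

ifconfig igb0 -rxcsum -txcsum -tso -lro -vlanhwtso -vlanhwcsum
ifconfig igb0 | grep options    # verify the offload capabilities are gone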

Anyway, after disabling the HW offload on the host NIC it looks way better, but the bandwidth is still disappointing... (about 600 Mbit/s between a client and the Jail, but ~950 Mbit/s between a client and the TrueNAS host itself).
Is the vnet/bridge really so broken that it eats almost 50% of the bandwidth as overhead? Whaaaat?!
(All tests were done with only one Jail active and without any bhyve VM online. The Jail itself holds only a small MySQL DB with one client and no major transfers...)

//EDIT: OK, I figured it out a few minutes after submitting the topic (after 2-3 hours of composing and testing...)

The problem is that by default iperf opens a single connection, which is handled by a single core on the client/(router)/server (here the bottleneck seems to be the pfSense router, where the J3160 CPU can apparently handle ~600 Mbit/s on a single stream). When I ran iperf with parallel connections it utilized multiple cores and I got ~900 Mbit/s, which seems reasonable... (example invocation below)
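For anyone repeating the test, this is roughly the invocation that got there. A sketch only: -P 4 is an arbitrary example, and the number of parallel streams needed to saturate the link may differ on other hardware.

iperf.2.1.1.exe -c xxx.xxx.xxx.xxx -i 1 -p 5001 -w 512K -f m -t 10 -P 4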

Oh, and a second thing... I was getting 950 Mbit/s between the clients and the TrueNAS host because both sides are on the same VLAN, so that traffic went only through the switch. The Jails are on a different VLAN, so their traffic went through the router, where the bottleneck was (see the traceroute sketch below).
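A quick way to confirm which path the traffic actually takes (a sketch using the placeholder addresses from above):

traceroute xxx.xxx.xxx.xxx    # to the Jail: the pfSense router appears as an intermediate hop
traceroute aaa.aaa.aaa.aaa    # to the TrueNAS host: same VLAN/subnet, no intermediate hop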

So all good after all... well, the Win10 VM speed is still poor, but that seems to be more of a bhyve issue or something...

I will leave the rest of the original post in spoiler tags below just for historical purposes (yes, you can laugh at me now :D)

Iperf between Jail (server) and Win7 (client) ... 613 Mbits/sec
Client side
iperf.2.1.1.exe -c xxx.xxx.xxx.xxx -i 1 -p 5001 -w 512K -f m -t 10
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 0.500 MByte
------------------------------------------------------------
[ 1] local yyy.yyy.yyy.yyy port 54149 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval Transfer Bandwidth
[ 1] 0.00-1.00 sec 75.5 MBytes 633 Mbits/sec
[ 1] 1.00-2.00 sec 76.3 MBytes 640 Mbits/sec
[ 1] 2.00-3.00 sec 74.4 MBytes 624 Mbits/sec

Server side

[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.01 sec 732 MBytes 613 Mbits/sec

Iperf between Jail (server) and Debian (client) ... 600 Mbits/sec
Client side
iperf -c xxx.xxx.xxx.xxx -i 1 -p 5001 -w 512K -f m -t 10
------------------------------------------------------------
Client connecting to xxx.xxx.xxx.xxx, TCP port 5001
TCP window size: 0.41 MByte (WARNING: requested 0.50 MByte)
------------------------------------------------------------
[ 3] local zzz.zzz.zzz.zzz port 45077 connected with xxx.xxx.xxx.xxx port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 70.9 MBytes 595 Mbits/sec
[ 3] 1.0- 2.0 sec 71.9 MBytes 603 Mbits/sec
[ 3] 2.0- 3.0 sec 70.6 MBytes 592 Mbits/sec

Server side

[ ID] Interval Transfer Bandwidth
[ 1] 0.00-10.01 sec 715 MBytes 600 Mbits/sec

OK, now the iperf between the TrueNAS host (server) and Win + Debian as clients ... 941 Mbits/sec and 814 Mbits/sec, all good!
Win7 client
iperf.2.1.1.exe -c aaa.aaa.aaa.aaa -i 1 -p 5001 -w 512K -f m -t 10
------------------------------------------------------------
Client connecting to aaa.aaa.aaa.aaa, TCP port 5001
TCP window size: 0.500 MByte
------------------------------------------------------------
[ 1] local yyy.yyy.yyy.yyy port 54459 connected with aaa.aaa.aaa.aaa port 5001
[ ID] Interval Transfer Bandwidth
[ 1] 0.00-1.00 sec 112 MBytes 937 Mbits/sec
[ 1] 1.00-2.00 sec 113 MBytes 944 Mbits/sec
[ 1] 2.00-3.00 sec 111 MBytes 933 Mbits/sec

Debian client

iperf -c aaa.aaa.aaa.aaa -i 1 -p 5001 -w 512K -f m -t 10
------------------------------------------------------------
Client connecting to aaa.aaa.aaa.aaa, TCP port 5001
TCP window size: 0.41 MByte (WARNING: requested 0.50 MByte)
------------------------------------------------------------
[ 3] local zzz.zzz.zzz.zzz port 42450 connected with aaa.aaa.aaa.aaa port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0- 1.0 sec 96.9 MBytes 813 Mbits/sec
[ 3] 1.0- 2.0 sec 95.9 MBytes 804 Mbits/sec
[ 3] 2.0- 3.0 sec 97.5 MBytes 818 Mbits/sec

Server side for both
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.10 GBytes 941 Mbits/sec
[ 5] 0.0-10.0 sec 971 MBytes 814 Mbits/sec

So what am I missing? My network is simple... pfSense as a router (physical device, Gbit Intel NICs, CPU J3160, 8GB RAM) + an HP ProCurve 1810G switch. A few VLANs, some FW rules, nothing fancy... Interface configs below:

TrueNAS host:
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: LAN
options=8138b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER>
ether hh:hh:hh:hh:hh:hh
inet aaa.aaa.aaa.aaa netmask 0xffffff00 broadcast aaa.aaa.aaa.255
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=9<PERFORMNUD,IFDISABLED>

vlan50: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: vlan50 for Jails
ether hh:hh:hh:hh:hh:hh
groups: vlan
vlan: 50 vlanpcp: 0 parent interface: igb0
media: Ethernet autoselect (1000baseT <full-duplex>)
status: active
nd6 options=9<PERFORMNUD,IFDISABLED>

bridge50: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: bridge50 for Jails
ether hh:hh:hh:hh:hh:hh
id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
member: vnet0.29 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
ifmaxaddr 0 port 7 priority 128 path cost 2000
member: vlan50 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
ifmaxaddr 0 port 5 priority 128 path cost 20000
groups: bridge
nd6 options=9<PERFORMNUD,IFDISABLED>

vnet0.29: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: associated with jail: kodidb as nic: epair0b
options=8<VLAN_MTU>
ether hh:hh:hh:hh:hh:hh
hwaddr hh:hh:hh:hh:hh:hh
groups: epair
media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
status: active
nd6 options=1<PERFORMNUD>

Jail:
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
inet 127.0.0.1 netmask 0xff000000
groups: lo
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

pflog0: flags=0<> metric 0 mtu 33160
groups: pflog

epair0b: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=8<VLAN_MTU>
ether hh:hh:hh:hh:hh:hh
hwaddr hh:hh:hh:hh:hh:hh
inet xxx.xxx.xxx.xxx netmask 0xffffff00 broadcast xxx.xxx.xxx.255
groups: epair
media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
status: active
nd6 options=1<PERFORMNUD>

Any ideas?

Thank you in advance for comments...
 

Patrick M. Hausen

Hall of Famer
Anyway, after disabling the HW offload on the host NIC it looks way better, but the bandwidth is still disappointing... (about 600 Mbit/s between a client and the Jail, but ~950 Mbit/s between a client and the TrueNAS host itself).
Is the vnet/bridge really so broken that it eats almost 50% of the bandwidth as overhead?
I would not have put it that way, but yes, it is. The bridge runs on a single CPU core with a giant lock in FreeBSD 12. The bridge code was completely reworked by Kristof Provost, sponsored by the FreeBSD Foundation. The speedup is about 5x on a moderately sized system, with even higher factors expected on platforms with a really high core count.
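(If you want to see this for yourself, one rough way, not specific to TrueNAS, is to watch per-CPU load and kernel threads during an iperf run; on FreeBSD 12 you should see a single kernel thread saturating one core while the other cores stay mostly idle:)

top -SHP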

Unfortunately (for us) the reworked bridge code is only available in FreeBSD 13 and onward and cannot be backported due to missing kernel interfaces in FreeBSD 12.

I expect the bridge in TrueNAS to be on par with any other platform performance-wise once we get to run TrueNAS on FreeBSD 13. When that might be, only iXsystems can answer.
 