Bhyve network issue

Lorem-Ipsum

Dabbler
Joined
Mar 26, 2016
Messages
14
Hi All,

I'm currently having a strange network performance issue with a couple of bhyve VMs.

My TrueNAS server is dual-homed.
I have 2 x 1GbE NICs in an LACP lagg.
There are two VLAN interfaces bound to the lagg.

Both bhyve guests use the same VLAN interface as SMB and the other services.
The second VLAN interface is used only for NFS storage.

The guests have similar specifications:
1 GB RAM
1 vCPU
4 cores
VirtIO NIC

Both of the Bhyve guests are running Ubuntu 18.04 and exhibit the same issue.
Network upload speed (from the guest out to the network) is fine.
Network download speed (from the network to the guest) is horrific.

Here's an iperf test from one of the guests using an iperf server on the network:

Code:
TCP window size:  416 KByte (default)
------------------------------------------------------------
[  5] local 10.200.100.104 port 52688 connected with 10.200.101.101 port 5001
[  4] local 10.200.100.104 port 5001 connected with 10.200.101.101 port 36412
[ ID] Interval       Transfer     Bandwidth
[  5]  0.0-10.0 sec  1.09 GBytes   939 Mbits/sec
[  4]  0.0-10.3 sec  7.62 MBytes  6.22 Mbits/sec
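
For reference, the figures above come from a simultaneous bidirectional run with iperf 2; the commands were roughly along these lines (server address as in the output above):

Code:
# on the iperf server elsewhere on the network
iperf -s

# from inside the guest: -d runs the upload and download tests at the same time
iperf -c 10.200.101.101 -d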


Other (non-bhyve) virtual machines on the network perform normally.
The TrueNAS host does not exhibit the issue.

Has anyone seen anything like this before?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
We'll need a bit more info before we can weigh in. What model of 1 Gbps NIC are you running on the host? This sounds at first blush like an MTU fragmentation issue. Only certain NIC drivers have good support for the FreeBSD vlan driver and can natively handle long buffers.

In practice, bhyve's not been well-tested outside of bare-metal access to host NICs, and you're going through 2 layers of indirection with both LAGG and VLANs.
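
In the meantime, a quick way to rule MTU fragmentation in or out is a don't-fragment ping at the full 1500-byte frame size; the target below is just the iperf server from your output:

Code:
# from the Ubuntu guest: 1472 bytes of payload + 28 bytes of IP/ICMP header = 1500
ping -M do -s 1472 -c 4 10.200.101.101

# the equivalent on the TrueNAS host itself
ping -D -s 1472 -c 4 10.200.101.101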
 

Lorem-Ipsum

Dabbler
Joined
Mar 26, 2016
Messages
14
NIC details as follows:

Code:
1% pciconf -lv | grep -A1 -B3 network
em0@pci0:3:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82574L Gigabit Network Connection'
    class      = network
    subclass   = ethernet
em1@pci0:4:0:0: class=0x020000 card=0x000015d9 chip=0x10d38086 rev=0x00 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82574L Gigabit Network Connection'
    class      = network
    subclass   = ethernet


I've had this setup in place for several years through a number of versions of FreeNAS and only recently updated to TrueNAS-12.0-U1.1.
As far as I can tell, this has only become an issue since then, so it's possibly a recent bug or something of that ilk.

EDIT: Here's a full ifconfig from the TrueNAS host to show all interfaces:

Code:
1% ifconfig
em0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: member of lagg0
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether 00:25:90:aa:21:88
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
em1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: member of lagg0
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether 00:25:90:aa:21:88
        hwaddr 00:25:90:aa:21:89
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x3
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
lagg0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: lagg0
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether 00:25:90:aa:21:88
        laggproto lacp lagghash l2,l3,l4
        laggport: em0 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        laggport: em1 flags=1c<ACTIVE,COLLECTING,DISTRIBUTING>
        groups: lagg
        media: Ethernet autoselect
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
vlan100: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: vlan100
        options=401<RXCSUM,LRO>
        ether 00:25:90:aa:21:88
        inet 10.200.100.101 netmask 0xffffff00 broadcast 10.200.100.255
        groups: vlan
        vlan: 100 vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
vlan101: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: vlan101
        options=401<RXCSUM,LRO>
        ether 00:25:90:aa:21:88
        inet 10.200.101.1 netmask 0xffffff00 broadcast 10.200.101.255
        inet 10.200.101.204 netmask 0xffffff00 broadcast 10.200.101.255
        inet 10.200.101.203 netmask 0xffffff00 broadcast 10.200.101.255
        inet 10.200.101.201 netmask 0xffffff00 broadcast 10.200.101.255
        groups: vlan
        vlan: 101 vlanpcp: 0 parent interface: lagg0
        media: Ethernet autoselect
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:ca:15:e7:f9:00
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: vnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 9 priority 128 path cost 2000000
        member: vlan101 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 7 priority 128 path cost 10000
        groups: bridge
        nd6 options=1<PERFORMNUD>
vnet0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether fe:a0:98:4f:cf:60
        hwaddr 58:9c:fc:10:8b:15
        groups: tap
        media: Ethernet autoselect
        status: active
        nd6 options=1<PERFORMNUD>
        Opened by PID 2366
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether 02:ca:15:e7:f9:01
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto stp-rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: vnet1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 11 priority 128 path cost 2000000
        member: vlan100 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 6 priority 128 path cost 10000
        groups: bridge
        nd6 options=1<PERFORMNUD>
vnet1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        ether fe:a0:98:0e:d8:e8
        hwaddr 58:9c:fc:10:2c:2c
        groups: tap
        media: Ethernet autoselect
        status: active
        nd6 options=1<PERFORMNUD>
        Opened by PID 2400


The additional inet addresses on vlan101 are for jails.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
If you haven't upgraded your pool, you can revert to 11.3-U5. The em driver has full VLAN processing in hardware.
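
If you're not sure whether the pool flags have already been bumped, zpool upgrade with no arguments only reports status, so it's safe to run:

Code:
# with no arguments this lists pools that do NOT yet have all
# supported features enabled - it does not change anything
zpool upgrade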
 

Lorem-Ipsum

Dabbler
Joined
Mar 26, 2016
Messages
14
Hi, sorry I forgot to come back and update this post.

Unfortunately, I have already upgraded the pool, as the issue did not present itself initially.
I rarely use much bandwidth within the guests (one acts as a corosync member as a tie-break for a cluster on the network, the other is occasionally used as a test web server), so the issue did not become apparent until I came to patch the guests.

I updated to TrueNAS 12.0-U2 today, but unfortunately the issue persists.

I'll see if I can do some more testing, i.e. setting up a new VM from scratch to see whether the issue is limited to the existing VMs.
 

jayecin

Explorer
Joined
Oct 12, 2020
Messages
79
Do you actually need the LACP lagg? I know it's often fun to use LACP because you have the extra hardware and why not, but I've found that most times LACP just causes more problems than it's worth. Unless you have a bunch of hosts talking to a single server, or you need the redundancy, you really aren't getting a performance boost with LACP.
 

Lorem-Ipsum

Dabbler
Joined
Mar 26, 2016
Messages
14
I can certainly test without the LACP to see if it's the cause of the performance issues.

I'd prefer to keep it if possible for both performance and redundancy.

My TrueNAS server fills multiple roles: it is primarily a file server, but it is also used for backup storage from multiple Proxmox clusters, and both 1 Gbit NICs are heavily utilized when backups are running.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Samuel Tai said:
In practice, bhyve's not been well-tested outside of bare-metal access to host NICs, and you're going through 2 layers of indirection with both LAGG and VLANs.
Sorry, Samuel, I disagree. Bhyve uses tap(4) and if_bridge(4), and there's nothing special about that when used with lagg(4) and/or vlan(4). Most of the time, as @Jon Moog suggested, it is necessary to disable HW offloading on the physical parent interface(s).

@Lorem-Ipsum see my screenshot:
[Attached screenshot: Bildschirmfoto 2021-02-09 um 19.48.57.png]
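
For reference, the command-line equivalent of that checkbox is roughly the following - just a sketch, on TrueNAS the GUI setting is what actually persists:

Code:
# turn off the usual offload features on the physical LACP members
ifconfig em0 -rxcsum -txcsum -tso -lro -vlanhwcsum
ifconfig em1 -rxcsum -txcsum -tso -lro -vlanhwcsum
# and on the lagg itself if required
ifconfig lagg0 -rxcsum -txcsum -tso -lro -vlanhwcsum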

HTH,
Patrick
 

Lorem-Ipsum

Dabbler
Joined
Mar 26, 2016
Messages
14
Patrick M. Hausen said:
Sorry, Samuel, I disagree. Bhyve uses tap(4) and if_bridge(4), and there's nothing special about that when used with lagg(4) and/or vlan(4). Most of the time, as @Jon Moog suggested, it is necessary to disable HW offloading on the physical parent interface(s).

@Lorem-Ipsum see my screenshot:
[attachment 45017]

HTH,
Patrick

Thanks Patrick.

With hardware offloading disabled for the em0, em1 and lagg0 interfaces, the network speed in the guest is back up to gigabit.
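
A quick way to confirm the change took effect is that RXCSUM, TXCSUM and LRO no longer show up in the options line of the parent interfaces:

Code:
ifconfig em0 | grep options
ifconfig lagg0 | grep options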

I'm glad the performance is back, but I am curious why this worked in previous builds of FreeNAS and only became a problem when I updated to TrueNAS.
Performance to the TrueNAS host was also unaffected.

Perhaps a bug has been introduced somewhere along the way?

Edit: I did notice this in the em(4) man page:

The em device driver first appeared in FreeBSD 4.4. em was merged with the igb device driver and converted to the iflib framework in FreeBSD 12.0.
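
If you want to check which driver generation a given port is actually running, the device sysctls are one place to look; on a 12.x kernel the iflib-based em driver should also expose an iflib sub-tree (unit 0 used as an example):

Code:
# description string for the first em port
sysctl dev.em.0.%desc
# list any iflib-specific knobs the 12.x driver exposes for that port
sysctl dev.em.0 | grep iflib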

Edit2: So I went digging:

Looks like I'm not the only one to run into this:
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
@Lorem-Ipsum That's a long story. With the upgrade from FreeNAS 11.3 to TrueNAS 12.0 you also went from FreeBSD 11.3 to FreeBSD 12.2, and all of the network drivers got a major overhaul in FreeBSD 12. While that produced a regression for you (and others), I am a bit reluctant to call it a bug, because hardware offloading for a network interface does not make sense if the host is not the recipient of the payload, e.g. when further layer-2 interfaces like lagg(4) and if_bridge(4) are stacked on top of the physical interface. Why should the interface hardware and driver interfere with anything related to, e.g., TCP? The path is em - lagg - bridge - tap - VM: the point at which an OS would even be interested in anything but the link-layer header is four layers further up the stack.

So, currently, in my opinion this is mainly a documentation problem.
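
On a plain FreeBSD box the usual way to make this permanent would be per-interface options in rc.conf, roughly as below; on TrueNAS, stick to the "Disable Hardware Offloading" checkbox, because the middleware owns the interface configuration:

Code:
# /etc/rc.conf on a stock FreeBSD host - not applicable to TrueNAS
ifconfig_em0="up -rxcsum -txcsum -tso -lro -vlanhwcsum"
ifconfig_em1="up -rxcsum -txcsum -tso -lro -vlanhwcsum"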
 