vnet jail bridged to LAN has very slow access to LAN/WAN

Ray DeMoss

Dabbler
Joined
Jul 11, 2017
Messages
11
I am recovering from a crashed TrueNAS system where I lost my jails. It's a long story and not really important. This is my primary home server; it hosts many local LAN shares and, when everything was working, it ran 5 jails. Currently I have one jail set up that runs a web server which downloads files. Before the crash, I observed nearly full ISP bandwidth to the Internet from within the jail running the same software. Here is the relevant info, but please ask for anything that may be missing.

  • I have one NIC connected to the LAN. This is a 10Gbit interface ix0 connected to a 10Gbit switch connected to the pfSense router.
  • I have a Google Fiber connection to the Internet with 2Gbit down and 1Gbit up.
  • From a root shell on the TrueNAS server, I run a command-line speed test to confirm the core functionality is working, as shown below
  • I am only running IPv4 and everything is on one network. There are no VLANs or LAGGs on any interfaces. This is a home server so I keep it pretty simple.
  • The MTU is set to 1500 everywhere, although the host interface appears to support jumbo frames.
  • The NIC is an Intel X540 card and offloading is enabled. This was also enabled in my previous build.
root@truenas:~ # ./speedtest-cli
Retrieving speedtest.net configuration...
Testing from Google Fiber (136.xxx.xxx.xxx)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by UTOPIA Fiber (SLC, UT) [7.62 km]: 5.159 ms
Testing download speed...
Download: 1923.62 Mbit/s
Testing upload speed...
Upload: 834.26 Mbit/s
root@truenas:~ #
  • The jail is set up using vnet0 and bridge0.
  • Here is the ifconfig output from the host system:
ix0: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4a538b9<RXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,NOMAP>
        ether xx:xx:xx:xx:xx:60
        inet 192.168.10.96 netmask 0xffffff00 broadcast 192.168.10.255
        media: Ethernet autoselect (10Gbase-T <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        ether xx:xx:xx:xx:xx:91
        id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
        maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
        root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
        member: vnet0.7 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 10 priority 128 path cost 2000
        member: ix0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
                ifmaxaddr 0 port 5 priority 128 path cost 2000
        groups: bridge
        nd6 options=9<PERFORMNUD,IFDISABLED>
vnet0.7: flags=8963<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: associated with jail: sabnzbd as nic: epair0b
        options=8<VLAN_MTU>
        ether xx:xx:xx:xx:xx:38
        hwaddr xx:xx:xx:xx:xx:0a
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=9<PERFORMNUD,IFDISABLED>
  • I then run the same speed test from inside the jail
root@testjail:~ # ./speedtest-cli
Retrieving speedtest.net configuration...
Testing from QuickWeb Hosting Solutions (184.170.241.12)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by CopperNet Systems, Inc. (Kearny, AZ) [113.56 km]: 78.688 ms
Testing download speed...
Download: 77.04 Mbit/s
Testing upload speed...
Upload: 68.72 Mbit/s
  • In the previous build of this jail, I would see 1.5 to 1.8 Gbit down
  • The previous build of the jail also used vnet0 and bridge0.
  • The VNET and BRIDGE were created by TrueNAS when I created the jail. Neither existed prior to the creation of this testjail.
  • Here are the NIC interfaces inside the jail:
root@testjail:~ # ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
epair0b: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether xx:xx:xx:xx:xx:39
        hwaddr xx:xx:xx:xx:xx:0b
        inet 192.168.10.40 netmask 0xffffff00 broadcast 192.168.10.255
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=1<PERFORMNUD>

I've been reading the forums for hours, and while others have posted similar questions, their situations have been fairly different. Some use VLANs, some use LAGG or other configurations. Mine is pretty straightforward, with the vnet connected to the bridge connected to the ix0 interface, and I'm still hitting a network performance issue.

I would really appreciate any suggestions to improve or compare performance. Does this look like it's configured properly? I'm pretty advanced with network hardware, but I'm not an expert with virtualized components like the VNETs or BRIDGEs.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
I note that the test server used differs in the two tests, UTOPIA Fiber vs CopperNet Systems (7 vs 113 km away from you)... maybe it's worth forcing the same one for both?

Nothing else obvious I can see that would make it different.
 

Ray DeMoss

Dabbler
Joined
Jul 11, 2017
Messages
11
I note that the test server used differs in the two tests, UTOPIA Fiber vs CopperNet Systems (7 vs 113 km away from you)... maybe it's worth forcing the same one for both?
The host system says it's in Utah, and it is, but the jail thinks it's in Arizona near Phoenix. I don't know why; the bridge must be obfuscating the geo-location of the external IP address. When I run the list command to show the speed test servers, the resulting list only shows the 10 servers closest to where it thinks its geo-location is. So the host will only run against the closest 10 servers in Utah, and the jail will only run against the closest 10 servers in Arizona. I tried to use the Utah server ID in the jail and it said "ERROR: No matched servers". When I reversed it and ran the host against the Arizona server, it said the same thing. Strange, but I'll keep trying or maybe find another test.

I did run the speed test against the 10 servers available to the jail, and the results were about the same, usually between 40 and 100 Mbit/s up and down.

I haven't tried this, but is it possible, without breaking the jail, to try different network configurations, like switching to the NAT feature or DHCP? I'm assuming it's currently just bridged to the LAN. I'm not married to the VNET configuration if something else will work better.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
What's your objective here?

Are you trying to prove that jail networking is slow? (what you're testing here isn't achieving that)

Are you just interested in seeing the same numbers from the host and the jail? (I question the value in working toward that, but whatever makes you happy). I suggest you look into the options for speedtest-cli and see if you can specify the server to use for the test on the command line by ID, which I guess you could grab on the host and then use in the jail too.

 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
Two problems:

1. The IP address must be on bridge0 instead of ix0.
2. Hardware offloading must be disabled for ix0.
 

Ray DeMoss

Dabbler
Joined
Jul 11, 2017
Messages
11
What's your objective here?
@sretalla

Thanks for the response; I appreciate the help. As I said in my opening post, my previous jail ran a download server where it downloads files and runs a bunch of scripts. I lost the drives that hosted my jails, so I am rebuilding each lost jail, and I started with this one. My previous downloading jail did run at near line speed over vnet0 to bridge0 to ix0, typically at about 1 to 1.5 Gbit/s for years, so I know this works. When I created the new replacement jail and tested it, I noticed I was only getting about 15-20% of my previous download rate, so I started troubleshooting where the slowdown is occurring.

I used the same script to build the new jail as the previous one, with a few exceptions like the upgraded 13.0-RELEASE and a changed IP address. The previous jail was built on the 12.x releases and upgraded over time. Running speedtest-cli is a troubleshooting step: when the downloads seemed slow, I tested to confirm where the slowdown happens. I tested the router, and that was full speed. I tested the TrueNAS server as root, and that ran at full speed; then I tested inside the jail, and it ran slow. I know the jail can run fast because I've seen it run at ~1.5 Gbit/s down. So I am trying to figure out why this jail is slower by a significant margin than the previous jail. That is my objective.

Are you just interested in seeing the same numbers from the host and the jail? (I question the value in working toward that, but whatever makes you happy). I suggest you look into the options for speedtest-cli and see if you can specify the server to use for the test at the command by ID, which I guess you could grab on the host and then use in the jail too.
Please refer to my 3rd post, which explains exactly this. If you run the list command, it only shows you the 10 closest servers to your geo-location or where it thinks your geo-location is. It doesn't show you all of them. The online help for speedtest-cli doesn't say this or explain this at all but if you run the command, that is what it shows. There is an online list of all the speed-test servers with their server IDs that you can supposedly use on the command line. If you use a server ID that is not on the closest 10 servers list, you get an error. This is not explained in the speedtest-cli documentation. Maybe they haven't updated their docs yet. I am open to an alternate suggestion for a speed test. I can use iperf to my router to baseline the issue, but I think the speed test servers likely have a very large pipe to the Internet so they can test high-speed connections, because that is their purpose. If I were seeing roughly ~500 Mbit/s to Arizona, I would not have made this post, but the jail is slow. I typically get Gbit+ connections to the Netherlands and Germany, where some of my colleagues live/work, and we routinely exchange large software images and logs.

Two problems:

1. The IP address must be on bridge0 instead of ix0.
2. Hardware offloading must be disabled for ix0.
@Patrick M. Hausen

I've read through many of your other posts and I appreciate your extensive knowledge. Just to be clear, when you say the jail's IP address must be on bridge0, do you mean a different subnet from ix0? If so, I can easily change that. I've read other explanations for disabling hardware offloading when there are other network layers involved, like LAGG or VLAN tags, which I am not running. I get that it doesn't provide much benefit when too many software layers of the OSI model are involved. Having said that, in my previous build with HW offloading enabled I was easily seeing 1 Gbit/s+ speeds in the jail. Was this just a fluke? Did I just get lucky? I've worked in the computer industry for 28 years on storage hardware, and software still baffles me at times. I set up my previous jail some 3 years ago and it just worked with very little tweaking.
  • Is there a better option than using VNET with the BRIDGE adapter?
  • Can I change the networking stack inside the jail to test different options? Are there any good guides to do this?
  • Should I just destroy the jail and tweak my iocage script and rebuild it?
 
Last edited:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,702
when you say the Jails IP address must be on bridge0, do you mean a different subnet from ix0?
ix0 should have NO address, and the bridge should have the one currently assigned to ix0.

Please refer to my 3rd post, which explains exactly this. If you run the list command, it only shows you the 10 closest servers to your geo-location or where it thinks your geo-location is. It doesn't show you all of them. The online help for speedtest-cli doesn't say this or explain this at all but if you run the command, that is what it shows. There is an online list of all the speed-test servers with their server IDs that you can supposedly use on the command line. If you use a server ID that is not on the closest 10 servers list, you get an error. This is not explained in the speedtest-cli documentation. Maybe they haven't updated their docs yet
What I was suggesting was to find the list from the host, then use the ID for that server in the jail... if that doesn't work, too bad I guess.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
@Ray DeMoss Not the jail's IP address. That is on the epair0b vnet interface inside the jail. The host's IP address that is currently on ix0. A bridge member interface MUST NOT have an IP address. The FreeBSD documentation explicitly says so.

So you need to remove the 192.168.10.96 from ix0 and put it on bridge0.
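From a shell, that change would look roughly like this (a sketch using the addresses shown earlier in the thread; on TrueNAS the persistent change belongs in the network configuration UI, since a one-off ifconfig won't survive a reboot):

```shell
# Sketch only: move the host IP off the bridge member and onto the bridge.
# Run from a root shell; addresses are the ones shown earlier in this thread.
ifconfig ix0 inet 192.168.10.96 -alias   # remove the IP from the member NIC
ifconfig bridge0 inet 192.168.10.96/24   # assign it to the bridge instead
```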

As for the hardware offloading, all sorts of TCP offloading to the network hardware instead of the main CPU only makes sense when the host is the final destination of the relevant TCP connections. Which is not the case if the host serves as a bridge or a router. And since FreeBSD does not cope with that situation automatically very well, and the TCP offloading interferes with the traffic destined for the jail, the best advice is to simply disable hardware offloading globally. This will put a bit more of a burden on your main CPU, but for most modern systems that can be neglected.
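A sketch of what disabling the common offload features on ix0 looks like from a shell (TrueNAS also exposes a per-interface hardware-offloading toggle in the UI for making this persistent):

```shell
# Sketch only: turn off checksum, TSO, and LRO offloads on the bridge member NIC.
ifconfig ix0 -rxcsum -txcsum -tso -lro -vlanhwtso
```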

HTH,
Patrick
 

Ray DeMoss

Dabbler
Joined
Jul 11, 2017
Messages
11
@sretalla @Patrick M. Hausen

After a bit of reading up on bridge devices, I see what you're talking about; however, I propose a minor change to the plan. The Intel X540 NIC has dual 10GbE ports, and the second port is not used at the moment. Since I use ix0 for all my SMB and NFS shares as well as the web UI management, I want to leave that alone. I can set up bridge0 to use ix1, currently unused, as the bridge member and set the IP on the bridge. I will recreate the jail, and all future jails will run over their own network port. I will globally disable TCP offloading. I am assuming that TrueNAS will route traffic to the shares over ix0 and the traffic to the jails and VMs over the ix1 port.

I really appreciate your input. If this plan sounds good to you, I will set it up and update the thread in a couple of days with the results.

FYI... just some observations. When a bridge device does not exist and you create a jail, TrueNAS creates a new bridge device, and this bridge does not show up in the web GUI. The member configuration automatically includes the NIC it detects with an IP address. I assumed this was working as it should, but apparently the default behavior does not follow the best practices as you've described them.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
The default behaviour of TrueNAS does not follow FreeBSD mandatory practices. I have been complaining and filing tickets about this for years ... The "no IP addresses on bridge members" constraint has been there since the bridge feature was introduced.

The automatic creation of bridges is specifically "entertaining" because STP defaults to off in FreeBSD and with the right topology just creating a jail can introduce a loop in your network. :smile:

Well, your approach is perfectly reasonable. Only thing to keep in mind: you must set vnet_default_interface to "none" for each of your jails and reference the bridge you manually created explicitly down in the "network" section, option "interfaces" of your jail.
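As a sketch, those two jail properties can be set per jail with iocage (the jail name myjail is a placeholder):

```shell
# Sketch only: pin the jail's vnet to the manually created bridge and
# stop iocage from auto-selecting the default-gateway interface.
# "myjail" is a placeholder jail name.
iocage set vnet_default_interface=none myjail
iocage set interfaces="vnet0:bridge0" myjail
```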

Happy bridging.

Edit: the bridge0 interface does not need an IP address in your case. Just configure the ix1 interface "up" and disable hardware offloading.
 
Last edited:

Ray DeMoss

Dabbler
Joined
Jul 11, 2017
Messages
11
@Patrick M. Hausen

I attempted to set up bridge0 on ix1 in the same subnet as ix0, and TrueNAS didn't like that; it wants bridge0 on a new subnet. This means I need a VLAN, which is fine. I know how to set up VLANs on pfSense.
Edit: the bridge0 interface does not need an IP address in your case. Just configure the ix1 interface "up" and disable hardware offloading.
I just noticed this edit, and this will also require a VLAN. What is the best practice to configure ix1 on a VLAN in the TrueNAS WebGUI? I played with it a little and noticed that the VLAN interface in the WebGUI can be set up to use ix1 as the member interface. Is it basically the same advice to apply the IP address to the VLAN interface and not on the physical port? After I set up the VLAN, how do I set up the bridge? Do I just use the new VLAN interface as a member of the bridge?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
No, you definitely don't need a VLAN. No IP addresses on ix1 and bridge0. You need to reboot with all jails autostart disabled first to get rid of the automatically created bridge0 interface. Then create bridge0 in the UI and add ix1 as a member.
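For reference, the manual shell equivalent looks roughly like this (a sketch only; the persistent configuration should be done in the TrueNAS UI as described above):

```shell
# Sketch only: replace the auto-created bridge with one whose only member is ix1.
# Run from a root shell with all jails stopped and autostart disabled.
ifconfig bridge0 destroy      # remove the auto-created bridge
ifconfig bridge0 create       # recreate it empty
ifconfig bridge0 addm ix1 up  # add ix1 as the only member and bring the bridge up
ifconfig ix1 up               # bring the member NIC up (no IP address on it)
```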

If you intend to use ix1 as a layer 2 connection for your jails why would you need an IP address on the host? ix0 already has got one in that subnet. You cannot have two interfaces in the same subnet.
 

Ray DeMoss

Dabbler
Joined
Jul 11, 2017
Messages
11
@Patrick M. Hausen

I was able to delete the auto-created bridge0 using the ifconfig bridge0 destroy command. I also removed the jails, so there are no jails at the moment. I can recreate the jail in about 5 minutes with my script. So ix1 and the manually configured bridge0 are set up, and no IP addresses are assigned.

Then I started the installation of the jail via my basic jail creation script, and my network crashed. Nothing could ping anything else. After some troubleshooting, I disconnected the cable to ix1 and everything started working again. When I looked at the network settings, the jail script used bridge0 for the VNET, but it somehow made ix0 a member of the bridge, which is what I believe crashed the network. There is nothing in my script that says to do this. The jail did not finish its installation after the network configuration, but it's up and running enough to get to its console. This is starting to seem like work.

I removed ix0 from bridge0. I did forget to reboot after creating bridge0, but no jails were on the system at the time.
 
Last edited:

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
If vnet_default_interface is set to "auto" (the default), then iocage will make the interface that has the default gateway a member of the jail bridge. With two interfaces in bridge0 you created a loop, hence the crash. Note what I wrote about the interface settings for the jail: vnet_default_interface=none, interfaces=vnet0:bridge0. These settings are mandatory. I did not grasp that you were using a script and assumed the UI. You are familiar with iocage set then, I figure.
 

Ray DeMoss

Dabbler
Joined
Jul 11, 2017
Messages
11
If vnet_default_interface is set to "auto" (the default), then iocage will make the interface that has the default gateway a member of the jail bridge. With two interfaces in bridge0 you created a loop, hence the crash. Note what I wrote about the interface settings for the jail: vnet_default_interface=none, interfaces=vnet0:bridge0. These settings are mandatory. I did not grasp that you were using a script and assumed the UI. You are familiar with iocage set then, I figure.

If I need to do anything more than twice, I will put it in a shell script. I really needed that one variable, vnet_default_interface=none, to make this work. Once I added that to the iocage create command line, it did not add ix0 to the bridge0 interface. After that, everything just worked: the jail was created normally and the networking worked properly.
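For anyone following along, a create invocation with that property would look something like this (a sketch: the jail name, release, and network parameters are illustrative, not the exact script from this thread):

```shell
# Sketch only: create a VNET jail attached to the manually created bridge0,
# without letting iocage auto-add the default-gateway NIC to the bridge.
iocage create -n testjail -r 13.0-RELEASE \
    vnet=on \
    vnet_default_interface=none \
    interfaces="vnet0:bridge0" \
    ip4_addr="vnet0|192.168.10.40/24" \
    defaultrouter=192.168.10.1
```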

Just for fun, I ran the speed test again, and the results are below. I also note that the geo-location of the jail for this test was properly located in the SLC, Utah area.

[root@testjail ~]# ./speedtest.py
Retrieving speedtest.net configuration...
Testing from Google Fiber (xx.xx.xx.73)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by Voonami, Inc. (Salt Lake City, UT) [3.08 km]: 5.758 ms
Testing download speed...
Download: 1025.37 Mbit/s
Testing upload speed...
Upload: 726.10 Mbit/s

Thanks for the help and suggestions.

In summary for those interested in this thread, here are the steps I took with the advice of @Patrick M. Hausen and @sretalla.
  • Removed the auto-created bridge device linked to the main NIC interface used to manage the TrueNAS server and to serve the SMB and NFS shares.
  • Physically connected a second NIC port to the network switch.
  • Did not assign an IP address to the second NIC.
  • Enabled the second NIC with the ifconfig <interface> up command.
  • Manually created a new bridge device with the second NIC port as a member; did not add an IP address to the bridge device either.
  • Created the new jail using vnet0:bridge0. If you create the jail with a script, add vnet_default_interface=none to the iocage create command; the vnet=vnet0 parameter does not include the bridge. You can also use iocage set interfaces="vnet0:bridge0" examplejail.
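The steps above, condensed into shell form (a sketch only: the interface names and addresses follow this thread, the jail name and release are placeholders, and persistent interface settings belong in the TrueNAS network UI rather than in one-off ifconfig commands):

```shell
# Condensed sketch of the working setup summarized above.
ifconfig bridge0 destroy               # drop the auto-created bridge (jails stopped)
ifconfig ix1 up                        # second NIC up, no IP address assigned
ifconfig bridge0 create
ifconfig bridge0 addm ix1 up           # bridge0 with ix1 as its only member, no IP
iocage create -n examplejail -r 13.0-RELEASE vnet=on \
    vnet_default_interface=none interfaces="vnet0:bridge0" \
    ip4_addr="vnet0|192.168.10.40/24" defaultrouter=192.168.10.1
```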
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
If I need to do anything more than twice, I will put it in a shell script. I really needed that one variable, vnet_default_interface=none, to make this work. Once I added that to the iocage create command line, it did not add ix0 to the bridge0 interface. After that, everything just worked: the jail was created normally and the networking worked properly.
Of course you need that setting. It's supposed to work that way.

Now, whether the vnet_default_interface=auto feature makes sense at all can be debated. iocage was developed by a former iXsystems employee in lockstep with FreeNAS/TrueNAS. And Brandon was not experienced in either networking or FreeBSD, unfortunately. So there is much ad-hockery in the code to force things into a working state somehow.

This non-compliant setup works out of the box under these conditions:
  • single NIC
  • no VLANs
  • no IPv6
which seems to be what was tested :wink:

What this actually violates/breaks is all multicast. So with the out-of-the-box TN configuration you cannot run mDNS or other media server "things" relying on multicast in a jail, and neither can you run IPv6, because that uses a lot of multicast instead of broadcast.

That's the reason why the IP address must be on the bridge if you share the interface between host and jails. If you have a dedicated jail/VM NIC, it's of course all layer 2 only.
 

Teeps

Dabbler
Joined
Sep 13, 2015
Messages
37
  • Created the new jail using vnet0:bridge0. If you create the jail with a script, add vnet_default_interface=none to the iocage create command; the vnet=vnet0 parameter does not include the bridge. You can also use iocage set interfaces="vnet0:bridge0" examplejail.

Really interesting thread. Care to share your full jail creation portion of your script?
 