After some hours of hunting for the root cause we figured it out ... So for future generations, below is an explanation of what the (hell) was going on and how to fix/overcome that annoying behavior.
Symptoms:
- The physical NIC gets cycled (up/down) when the first Jail with VNET starts OR the last Jail stops.
- All network communication gets interrupted for ~10-15 secs (GUI/SSH drops, as well as CIFS and everything else).
- Starting or stopping a second or any additional Jail does NOT cause any issues. Only the first start and the last stop do.
Cause:
- By default FreeBSD brings up the network interfaces with a bunch of OPTIONS enabled (see below). That works OK until you actually start the first Jail with VNET enabled. Apparently VNET does not work well with certain options, so it forces the parent interface to disable some of them. When this happens the parent NIC needs to restart in order to apply the changes. This is what was causing the network disconnects.
- Once the first Jail was running, any further Jails did not cause issues: the NIC options were already disabled, so nothing had to be done on the parent NIC when starting another Jail. Just create a new vnet, add it to the bridge and all good.
- A similar story happens in the opposite direction. When you stop the second Jail nothing happens with the options (because there is another Jail still running), but as soon as you stop the last Jail the host system re-activates the options which were removed before ... and YES, this requires another cycle of the parent NIC, causing another network outage ...
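To make the first-start/last-stop pattern concrete, here is a toy model in Python. It only mirrors the behavior described above; it is not how FreeBSD implements anything, and the class name `ParentNic` and its fields are made up for illustration.

```python
class ParentNic:
    """Toy model of the symptoms described in this post (hypothetical names)."""

    def __init__(self):
        self.running_vnet_jails = 0
        self.cycles = 0  # each cycle stands for one ~10-15 s network outage

    def start_jail(self):
        # Only the first VNET jail forces the options off, which cycles the NIC.
        if self.running_vnet_jails == 0:
            self.cycles += 1
        self.running_vnet_jails += 1

    def stop_jail(self):
        if self.running_vnet_jails == 0:
            raise RuntimeError("no jail is running")
        self.running_vnet_jails -= 1
        # Only the last VNET jail lets the options come back, cycling the NIC again.
        if self.running_vnet_jails == 0:
            self.cycles += 1
```

Starting three jails and stopping all three in this model gives exactly two cycles, one on the first start and one on the last stop, which matches the symptoms.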
Solution (workaround?):
Disable hardware offloading. Two reasons. First, the TrueNAS docs mention it (as it is causing various issues):
"Disabling this is only recommended when the interface is managing Jails, Plugins, or Virtual Machines."
Second, there is actually a BUG in FreeBSD 12.0 that causes even crazier issues. So yes, disable it and move on.
Go to GUI -> Network -> Interfaces -> edit the interface which is used as the parent for Jails and check "Disable Hardware Offloading". Confirm the warning and save (confirm test, save, save...).
By doing so you will disable the following NIC options: RXCSUM, LRO, VLAN_HWTSO, RXCSUM_IPV6
But that is not the whole story. There are additional options which are enabled by default and get disabled when the first Jail starts. To prevent them from being disabled/enabled every time a Jail starts/stops, we will just disable them by default.
So go to the interface config again and add
-rxcsum -txcsum -rxcsum6 -txcsum6
to the Options field.
Now save, confirm everything and after that reboot the NAS! This is important, because if you don't, these four excluded options are not persistent. It will allow you to start a Jail w/o interruption, BUT if you stop the Jail these flags get enabled again (I guess the state is cached/stored somewhere). After the reboot, try to stop/start your Jails. It should not interrupt your network.
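If you want to double-check after the reboot instead of eyeballing the output, a small Python sketch like the one below can report which of the four options are still enabled. It assumes that `ifconfig <interface>` prints the enabled options as a comma-separated list inside `options=...<...>` and that the interface is `igb0` as in my case; the function names are my own.

```python
import re
import subprocess

# Option names as ifconfig prints them (taken from the lists in this post).
SHOULD_BE_OFF = {"RXCSUM", "TXCSUM", "RXCSUM_IPV6", "TXCSUM_IPV6"}


def enabled_options(ifconfig_output):
    """Return the names inside the first options=...<A,B,C> field, or an empty set."""
    match = re.search(r"options=\w+<([^>]*)>", ifconfig_output)
    if match is None:
        return set()
    return {name for name in match.group(1).split(",") if name}


def still_enabled(ifconfig_output, unwanted=SHOULD_BE_OFF):
    """Return the unwanted options that are still present, sorted by name."""
    return sorted(enabled_options(ifconfig_output) & set(unwanted))


def check_interface(name="igb0"):
    """Run ifconfig for one interface (assumed name) and check its options."""
    result = subprocess.run(
        ["ifconfig", name], capture_output=True, text=True, check=True
    )
    return still_enabled(result.stdout)
```

Calling `check_interface("igb0")` on the NAS should return an empty list once the settings have stuck. Run it again after stopping the last Jail to see whether anything crept back.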
Details about the interface options ...
ifconfig <interface> (e.g. ifconfig igb0) shows these ... note that the output might be slightly different, depending on the NIC capabilities:
- Initial (default NAS w/o Jails running)
RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6
- After disabling Hardware Offload, these are removed: RXCSUM, LRO, VLAN_HWTSO, RXCSUM_IPV6
- And after adding the extra options, these are removed as well: TXCSUM, TSO4, TSO6, TXCSUM_IPV6
So the output of ifconfig is:
- Initial:
RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6
- HW Offload disabled:
TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,WOL_MAGIC,VLAN_HWFILTER,TXCSUM_IPV6
- Further options disabled:
VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER
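For anyone who wants to compare their own NIC against the three lists above, here is a short Python sketch that diffs the stages. The option strings are copied from this post; the helper name `removed` is just for illustration.

```python
INITIAL = (
    "RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,"
    "LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6"
).split(",")
OFFLOAD_DISABLED = (
    "TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,"
    "WOL_MAGIC,VLAN_HWFILTER,TXCSUM_IPV6"
).split(",")
FURTHER_DISABLED = (
    "VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,WOL_MAGIC,VLAN_HWFILTER"
).split(",")


def removed(before, after):
    """Options present in `before` but missing in `after`, in original order."""
    return [opt for opt in before if opt not in after]


print(removed(INITIAL, OFFLOAD_DISABLED))
# ['RXCSUM', 'LRO', 'VLAN_HWTSO', 'RXCSUM_IPV6']
print(removed(OFFLOAD_DISABLED, FURTHER_DISABLED))
# ['TXCSUM', 'TSO4', 'TSO6', 'TXCSUM_IPV6']
```

Note that TSO4 and TSO6 disappear in the second step even though they are not in the Options field. My guess is that they depend on TX checksum offload, but I have not verified that.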
So that's about it ... hope it helps :]
//Note: The second part of the OP's question, about not being able to stop the Jail, is a separate/unrelated issue. That one is caused by the custom script