Hi, (Apologies if this case has been covered elsewhere, but I couldn't find anything directly relevant by searching.)
TrueNAS Core 13 seems to misconfigure the bridge between the jails and the VM after bhyve crashes, leaving the jails running but disconnected from the network.
There are 2 physical interfaces and 2 VLAN interfaces:
igb0 - connected to the main network. No VLANs. igb0 has a static IP address (10.1.1.32/24) configured in Network/Interfaces.
igb1 - configured (but currently NOT connected) for a network containing 2 VLANs (11 and 99) that do not route to the 10.1.1.0/24 network. Physically disconnected to aid debugging this problem.
vlan11 - Assigned to VLAN 11 on igb1. Down, as igb1 is disconnected.
vlan99 - Assigned to VLAN 99 on igb1. Down, as igb1 is disconnected.
There are 3 running jails, each with a single vnet interface configured (by default, I think) to use VNET, with the IPv4 interface set to "vnet0" and the network properties' interfaces set to 'vnet0:bridge0'.
There are 2 VMs:
- A FreeBSD VM not running (but configured) with a VirtIO interface onto igb0 and also interfaces onto 'vlan11' and 'vlan99'
- A Windows 10 VM running, with a VirtIO interface onto igb0
(All config has been done via the UI, not via the command line.)
In the "working" state, bridge0 exists, is up, and has all the member interfaces I'd expect:
Code:
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 58:9c:fc:10:ff:e1
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: vnet0.3 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 11 priority 128 path cost 2000
	member: vnet0.2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 10 priority 128 path cost 2000
	member: vnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 9 priority 128 path cost 2000000
	member: vnet0.1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 8 priority 128 path cost 2000
	member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 1 priority 128 path cost 20000
	groups: bridge
	nd6 options=9<PERFORMNUD,IFDISABLED>
But there is a 'random' problem with Windows or bhyve crashing (not sure which, might be both). TrueNAS keeps running and the jails all keep running, BUT when bhyve starts again, bridge1 is created, the VM is assigned to bridge1, and the igb0 connection is moved from bridge0 to bridge1. That leaves the jails in bridge0 without a connection to igb0:
Code:
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 58:9c:fc:10:ff:e1
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: vnet0.3 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 12 priority 128 path cost 2000
	member: vnet0.2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 11 priority 128 path cost 2000
	member: vnet0.1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 8 priority 128 path cost 2000
	groups: bridge
	nd6 options=9<PERFORMNUD,IFDISABLED>
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	ether 58:9c:fc:10:ff:fa
	id 00:00:00:00:00:00 priority 32768 hellotime 2 fwddelay 15
	maxage 20 holdcnt 6 proto rstp maxaddr 2000 timeout 1200
	root id 00:00:00:00:00:00 priority 32768 ifcost 0 port 0
	member: vnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 10 priority 128 path cost 2000000
	member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
	        ifmaxaddr 0 port 1 priority 128 path cost 20000
	groups: bridge
	nd6 options=9<PERFORMNUD,IFDISABLED>
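In case it helps anyone spot the same state from a script rather than by eye, here is a rough Python sketch that reads `ifconfig` text and reports any bridge that has members but no physical uplink. The function names (`bridge_members`, `bridges_without_uplink`) and the trimmed `SAMPLE` text are my own illustration, not anything TrueNAS provides, and it assumes the usual multi-line `ifconfig` layout with indented `member:` lines.

```python
import re

# Trimmed copy of the "broken" state above; only the lines the parser needs.
SAMPLE = """\
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    member: vnet0.3 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
    member: vnet0.2 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
    member: vnet0.1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
bridge1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    member: vnet0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
    member: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
"""


def bridge_members(ifconfig_text):
    """Map each bridge interface name to the list of its member interfaces."""
    bridges = {}
    current = None
    for line in ifconfig_text.splitlines():
        # Interface headers start in column 0, e.g. "bridge0: flags=..."
        header = re.match(r"^(\S+): flags=", line)
        if header:
            name = header.group(1)
            current = name if name.startswith("bridge") else None
            if current is not None:
                bridges[current] = []
            continue
        # Member lines are indented, e.g. "    member: igb0 flags=..."
        member = re.match(r"^\s+member: (\S+)", line)
        if member and current is not None:
            bridges[current].append(member.group(1))
    return bridges


def bridges_without_uplink(bridges, uplink="igb0"):
    """Return the bridges that have members but do not include the uplink."""
    return sorted(
        name for name, members in bridges.items()
        if members and uplink not in members
    )


if __name__ == "__main__":
    print(bridges_without_uplink(bridge_members(SAMPLE)))
```

On a live system the `SAMPLE` text would be replaced with the real output of `ifconfig`; a non-empty result for the jails' bridge would correspond to the disconnected state described above.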
A further reboot of TrueNAS and the bridging looks as it should again, but rebooting the whole thing because Windows (or bhyve) has had an issue doesn't seem like it ought to be necessary. I know the trigger for this is Windows/bhyve failing (and I'm trying to find the root cause for that), but it seems something in TrueNAS is also not handling it right, and severing the jails compounds the problem. I'm also assuming that the reconfiguration of the bridge is a symptom of Windows/bhyve failing and not the other way around.
Any clues/suggestions?
Some specific questions...
1/
Should I manually configure the bridge, explicitly, and manually set it for all jails & VM?
2/
Should I configure a 2nd bridge (with a number unlikely to be auto-assigned, e.g. bridge9) specifically for the Windows VM, so that TrueNAS has no need to mess with the 'main' bridge if Windows/bhyve fails?
Thanks!