possible network bug

nikkon

Contributor
Joined
Dec 16, 2012
Messages
171
Hi all,
I have a problem that reproduces every time.
Version: TrueNAS-13.0-U3.1
I have 2 interfaces: em0, connected to a non-VLAN switch port, and em1, connected to a VLAN port.
Every time I need to move a jail from one network to another, the jail fails to get DHCP and the interface is stuck with the old IP from the previous network.
I will open a bug for this. However, is there a way I can HUP the routing/mapping process? The only fix so far is restarting the NAS, which is expensive.

Thank you
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
It would be helpful to include more information, specifically the subnets (which you can anonymize). What I am looking for is whether you have both network ports on the same subnet.

Further, why are you changing network ports?
 

nikkon

You are right. Let me bring more data and clarity.
These are the 2 physical interfaces I have in my NAS. As it's all internal, there's no problem sharing the subnets :)
igb0 is part of 172.16.10.0/24 - gw *.10.1
igb1 has a VLAN attached and is part of 172.16.20.0/29 - gw *.20.1
1677506529046.png


If, for example, I try to change the vnet_default_interface of a jail (coruscant in this example) from igb0:LAN to vlan10:VPN_LAN and restart the jail, it does not get an IP from 172.16.20.0/29.
1677507098283.png


1677507138719.png

It keeps adding the previous IP from the 172.16.10.0/24 network.

Jail logs look like this:
/var/log/messages:
Feb 27 15:13:16 coruscant syslogd: kernel boot file is /boot/kernel/kernel
Feb 27 15:17:46 coruscant dhclient[35415]: receive_packet failed on epair0b: Device not configured
Feb 27 15:17:46 coruscant dhclient[35415]: ioctl(SIOCGIFFLAGS) on epair0b: Operation not permitted
Feb 27 15:17:46 coruscant dhclient[35415]: Interface epair0b no longer appears valid.
Feb 27 15:17:46 coruscant dhclient[35415]: No live interfaces to poll on - exiting.
Feb 27 15:17:46 coruscant dhclient[35415]: exiting.
Feb 27 15:17:46 coruscant dhclient[35415]: connection closed
Feb 27 15:17:46 coruscant dhclient[35415]: exiting.
Feb 27 15:17:46 coruscant syslogd: exiting on signal 15
Feb 27 15:18:13 coruscant syslogd: kernel boot file is /boot/kernel/kernel

Jail interface after jail restart:
ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
epair0b: flags=8863<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 72:85:c2:13:ea:b5
        hwaddr 02:09:b8:eb:78:0b
        inet 172.16.10.18 netmask 0xffffff00 broadcast 172.16.10.255
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=1<PERFORMNUD>

If I restart TrueNAS, everything works. It looks to me like a flush somewhere isn't happening.
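
To double-check that the address on epair0b really belongs to the old network and not to the VLAN one, the values from the output above can be tested against both subnets. This is only a minimal sketch in Python using the addresses quoted in this post; the helper names `hex_netmask_to_prefixlen` and `matching_subnet` are made up for illustration.

```python
import ipaddress

# Subnets quoted above: igb0 (LAN) and vlan10 on igb1 (VPN_LAN).
SUBNETS = {
    "igb0:LAN": ipaddress.ip_network("172.16.10.0/24"),
    "vlan10:VPN_LAN": ipaddress.ip_network("172.16.20.0/29"),
}


def hex_netmask_to_prefixlen(hex_mask):
    """Convert an ifconfig-style netmask such as 0xffffff00 to a prefix length."""
    return bin(int(hex_mask, 16)).count("1")


def matching_subnet(address):
    """Return the label of the subnet containing address, or None (hypothetical helper)."""
    ip = ipaddress.ip_address(address)
    for label, network in SUBNETS.items():
        if ip in network:
            return label
    return None


# Values taken from the epair0b entry of the ifconfig output above.
print(hex_netmask_to_prefixlen("0xffffff00"))  # 24
print(matching_subnet("172.16.10.18"))         # igb0:LAN
```

So after the move the jail still holds a /24 address from the igb0 network, which matches what the screenshots show.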
 

Arwen

Looks like a bug. You can report it to iXsystems via the "Report a Bug" link at the top of the Forum pages.

Other than that, I don't have any suggestions. Perhaps someone else will.
 

nikkon

Thanks. I'll open a bug request.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Have you created the necessary bridge interfaces and assigned them in the "interfaces" section of your jail config?
 

nikkon

Wait, what? Do I need to bridge the 2 interfaces?
Why is that?
 

nikkon

If I bridge them, then what IP should the bridge get?
 

Patrick M. Hausen

TrueNAS will auto create bridge interfaces. How else do you think the jails get connected to the wire? The code doing this is unfortunately full of idiosyncrasies, assumptions and ad hoc solutions, so it only works reliably in the case of a single physical interface.

Be back in a couple of minutes for more ... typing on my iPad isn't that much fun.
 

nikkon

I get that. The bridge seems to be the right one.
 

Patrick M. Hausen

So ... if I get you correctly you want to connect VNET jails to separate interfaces, one with VLAN, one without - but that doesn't really matter. I must admit that I do not use the plugins and actively discourage their use. For the reasons see this thread started by @danb35:

Anyway, jails themselves are a fantastic rock solid technology and VNET is the best networking system since sliced bread.

As you probably already know a VNET jail gets its own network stack independent of the host OS. VNET interfaces come as "epair" instances. epair(4) is a virtual patch cable. One end ends up in the jail and can be given an IP address, routes can be added etc. The other end is on the host and needs to be plugged into ... something. That something is a virtual switch. In FreeBSD that means an if_bridge(4).

What you want in the end is something like
Code:
bridge0 ---+--- igb0
           |
           +--- epair of jail1
           |
           +--- epair of jail2
           ...

bridge1 ---+--- vlan10 ---> igb1
           |
           +--- epair of jail3
           |
           +--- epair of jail4

and the like. I assume this is more or less what you see when you type ifconfig on the host system.

Now there's a problem. In an attempt to be helpful and autoconfigure everything without bothering the user with bridges and the like, TrueNAS ends up with a configuration, the one you are currently running, that violates fundamental constraints of the FreeBSD network stack. That constraint is not a bug. It is a well documented feature that has been in place for ages and that iXsystems decided to ignore. I have been beating this drum for years ...

The point is: a member interface of a bridge interface (e.g. igb0 or vlan10) must not have a layer 3 address (IPv4 or IPv6 address). Must not. Not open for debate. The configuration you currently run with the IP configuration on the member interfaces breaks multicast.

Many users (and probably the developers when this was first implemented) simply never notice because they do not run any multicast applications over IPv4 and do not run IPv6 at all. Yes, no proper multicast support implies breakage of all of IPv6.

So, according to the FreeBSD documentation all IP addresses - v4 and v6 alike - must go on the bridge interfaces and not on any of the members.
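
To make that rule concrete, here is a rough Python sketch that scans `ifconfig` output from the host and reports bridge members that still carry an IP address. The sample text is hypothetical and abbreviated, not taken from the system in this thread, and the function names are invented for illustration. It treats any `inet` or `inet6` line on a member as a violation.

```python
import re

# Hypothetical, abbreviated host `ifconfig` output for illustration only;
# it is NOT taken from the system discussed in this thread.
SAMPLE = """\
igb0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tinet 172.16.10.2 netmask 0xffffff00 broadcast 172.16.10.255
bridge0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
\tmember: igb0 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
\tmember: vnet0.1 flags=143<LEARNING,DISCOVER,AUTOEDGE,AUTOPTP>
vnet0.1: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
"""

HEADER = re.compile(r"^(\S+): flags=")    # start of an interface entry
MEMBER = re.compile(r"^member: (\S+)")    # bridge member line
ADDRESS = re.compile(r"^(inet6?) (\S+)")  # IPv4 or IPv6 address line


def parse_ifconfig(text):
    """Return ({interface: [addresses]}, {bridge: [members]}) from ifconfig text."""
    addresses, members = {}, {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            addresses.setdefault(current, [])
            continue
        if current is None:
            continue
        member = MEMBER.match(line)
        address = ADDRESS.match(line)
        if member:
            members.setdefault(current, []).append(member.group(1))
        elif address:
            addresses[current].append(address.group(2))
    return addresses, members


def members_with_addresses(text):
    """List (bridge, member, addresses) for bridge members that carry an IP address."""
    addresses, members = parse_ifconfig(text)
    return [
        (bridge, member, addresses[member])
        for bridge, member_list in members.items()
        for member in member_list
        if addresses.get(member)
    ]


print(members_with_addresses(SAMPLE))
# [('bridge0', 'igb0', ['172.16.10.2'])]
```

An empty list is what you want to see once the addresses have been moved to the bridge interfaces.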

To achieve that, the simplest way to describe it in this medium:
  • reboot with all jails autostart set to disabled, if you run VMs, disable them too
  • this gets rid of the automatically created bridge interfaces
  • then create bridge interfaces manually - create bridge0, set igb0 as member, test and save - nothing bad should happen
  • then move the IP address from igb0 to bridge0 - delete IP address(es) from igb0, put "up" in the options field, enter the IP address(es) for bridge0, test and save. Possibly increase the timeout for the test and save dance a bit, because of ARP caches. Also, if necessary, delete the ARP cache of your desktop system. You should be able to get into the UI and save with all IP addressing on the bridge interface. I'll get to the MAC address of the bridge interface later
  • repeat that dance for vlan10 and bridge1 - all IP addresses must be on bridge1 and none on vlan10
  • if that was successful then for your jails
    • set vnet_default_interface to "none"
    • set interfaces to "vnet0:bridge0" or "vnet0:bridge1", respectively depending on which bridge you want the jail to connect to
    • you can connect a jail to more than one interface: "vnet0:bridge0,vnet1:bridge1" - easy peasy now that everything is set up correctly
Then re-enable your jails, DHCP will work, IPv6 will work - static as well as SLAAC, ... joy, world peace, ...
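
As a small illustration of the `interfaces` value from the list above, this Python sketch splits the string into vnet/bridge pairs and reports bridges that do not exist yet. It only shows the string format as written in this post; it is not how iocage itself parses the property, and both function names are made up.

```python
def parse_interfaces(value):
    """Split an `interfaces` value like "vnet0:bridge0,vnet1:bridge1" into a dict."""
    mapping = {}
    for pair in value.split(","):
        vnet, sep, bridge = pair.strip().partition(":")
        if not sep or not vnet or not bridge:
            raise ValueError(f"expected 'vnetN:bridgeN', got {pair!r}")
        mapping[vnet] = bridge
    return mapping


def bridges_missing(value, existing_bridges):
    """Return bridges named in `value` that are not in `existing_bridges`."""
    wanted = parse_interfaces(value).values()
    return sorted(set(wanted) - set(existing_bridges))


print(parse_interfaces("vnet0:bridge0,vnet1:bridge1"))
# {'vnet0': 'bridge0', 'vnet1': 'bridge1'}
print(bridges_missing("vnet0:bridge0,vnet1:bridge1", ["bridge0"]))
# ['bridge1']
```

A check like this before starting a jail would catch a jail pointed at a bridge that was never created in the UI.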

To get the bridge interfaces to have stable MAC addresses identical with the physical interfaces you connect them to, you need to set two tunables and reboot.

Variable: if_bridge_load
Value: YES
Type: LOADER

Variable: net.link.bridge.inherit_mac
Value: 1
Type: SYSCTL

HTH,
Patrick
 

nikkon

I’ll do this tomorrow and follow up with the results.
thank you Patrick
 

nikkon

I did all this and restarted the machine. It now seems to work. Switching a jail between the two bridges requires stopping the jail, applying the new settings, and starting it again.
 

nikkon

@Patrick M. Hausen sorry to bother you with a stupid question: everything works, but I did all my config via the CLI (faster), and the problem is that it is not persistent if I need to reboot, as I did today.
It is not reflected in the web UI and I have no clue how to save it :)
Any guidance here?
Thanks in advance.
 

Patrick M. Hausen

You must do it in the UI. The UI persists its settings in a TrueNAS proprietary database from which the running configuration is generated anew at each boot. By proprietary I mean not standard FreeBSD/Unix/etc. config files. It's SQLite and everything is open source, but the structure is not documented apart from the source code.

The CLI gives you a very clear warning about this.
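
If you only want to look at what has been persisted, without changing anything, a read-only peek at the SQLite file is possible. The following Python sketch just lists table names. The path `/data/freenas-v1.db` is an assumption about where TrueNAS CORE keeps its config database, and the "network" filter string is a guess, so verify both on your own system. Nothing here writes to the database.

```python
import os
import sqlite3

# Assumption: the usual location of the config database on TrueNAS CORE.
# Verify the path on your own system before relying on it.
DB_PATH = "/data/freenas-v1.db"


def list_tables(db_path, name_filter=""):
    """Return table names from an SQLite file, opened read-only."""
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows if name_filter in name]


if __name__ == "__main__":
    if os.path.exists(DB_PATH):
        # Only table names are printed; nothing is written.
        for table in list_tables(DB_PATH, "network"):
            print(table)
```

Changes still have to go through the UI; this is only for seeing whether a saved setting actually landed in the database.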
 

nikkon

It seems to save it, but after a reboot it is back to square one. Can I manually modify the SQL? Is there a hack I can do?
 

Patrick M. Hausen

You can configure one system via UI, dump and reverse engineer the database structure, then apply to your production system :tongue:

Seriously just use the UI as I outlined above in detail.
 

nikkon

Ack. Let me check this again. Last time I did it manually from the CLI because the UI kept reverting to the previous config.
 

nikkon

This seems wrong:
one interface is on DHCP, the second is just down.

1677966179887.png
 

Patrick M. Hausen

You know the "test", then reconnect, then "save" mechanism for all network changes? And there's a 60 second timeout that you can increase before clicking "test".
 