Jail IPv6 NDP not learning MAC

Junicast

Patron
Joined
Mar 6, 2015
Messages
206
Hi,

I have been seeing this problem since FreeNAS 11.3. It only affects jail guests; FreeNAS itself and bhyve guests are not affected.
I just upgraded to 11.3U1, but the problem persists.

I run my network mostly on IPv6, which is why my jails are configured with static IPv6 addresses. This is an example configuration of a jail:
Code:
...
interfaces:vnet0:bridge5
ip4:new
ip4_addr:10.10.101.173/24
ip4_saddrsel:1
ip6:new
ip6_addr:2001:6666:1111:1b::1aee/64
ip6_saddrsel:1
ip_hostname:0
vnet:1
vnet0_mac:7085c26cd053 7085c26cd054
vnet1_mac:none
vnet2_mac:none
vnet3_mac:none
vnet_default_interface:auto
vnet_interfaces:none
...

As you can see, my jails are attached to bridges that are in turn bound to VLAN interfaces, since I work with VLANs quite a lot. This guest is connected to bridge5, which represents VLAN 5. bhyve guests on the same bridge work flawlessly.

The actual problem is the following: my router cannot find my jails via the Neighbor Discovery Protocol (NDP). My router is an OpenWrt 19.07.01 device.
It sends out Neighbor Solicitations like this one:
Code:
22:39:34.308307 IP6 fe80::feec:daff:fe7b:3798 > ff02::1:ff00:1aee: ICMP6, neighbor solicitation, who has 2001:6666:1111:1b::1aee, length 32

This is what tcpdump shows when I sniff on the jail's vnet0.x device on the FreeNAS host itself. I cannot run tcpdump within the jail; it gives me an error:
Code:
tcpdump: (there are no BPF devices)
The solicitation is never answered.
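
For readers wondering why the solicitation goes to ff02::1:ff00:1aee rather than to the jail's address: that is the solicited-node multicast group, built from the fixed prefix ff02::1:ff00:0/104 plus the low 24 bits of the target address. A minimal sketch using Python's standard ipaddress module (the function name is made up for illustration):

```python
import ipaddress

def solicited_node_multicast(addr: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group for an IPv6 unicast address."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF   # last 24 bits of the target
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))   # fixed /104 prefix
    return ipaddress.IPv6Address(base | low24)

# The jail address from the configuration above.
print(solicited_node_multicast("2001:6666:1111:1b::1aee"))  # ff02::1:ff00:1aee
```

So the jail only sees the solicitation if its interface has joined that group and the bridge in between forwards the multicast frame; both are worth checking, although this sketch does not prove either is the cause here.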

When the jail itself tries to reach the internet, it sends Neighbor Solicitations of its own. During this exchange the OpenWrt router learns the jail's MAC address, but this is only a very temporary solution. As long as the jail makes outgoing connections, incoming connections work too, since my router keeps relearning the MAC address. About 10 seconds after the jail stops communicating, the router forgets the MAC and inbound communication fails.
The NDP entry on my router looks like this:
Code:
root@kukilala:~# ip -6 n s|grep 1aee
2001:6666:1111:1b::1aee dev br-lan  INCOMPLETE


Does anyone have an idea what might be going on here? To me it looks like a FreeNAS bug, but I'm not quite sure because other local clients are able to learn the jail's MAC address.
 

Louis2

Contributor
Joined
Sep 7, 2019
Messages
177
I just posted a help request on the pfSense forum, because I think I have the same or at least a related problem ....


Whatever it is, it is damn serious: ^IPV6 Neighbor Solicitation Not answered !?? => No :confused: :confused:^

I do not completely understand what you did, but it seems to me an illegal workaround! (Could you nevertheless explain in more detail what you did?)
This looks like the same problem I reported here:


In short, the jail is dropping the neighbor solicitation because it doesn't recognize the IP fe80:: address. If you manually assign an fe80:: address to the jail, then it works for me.
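
Whether a given address set contains a link-local (fe80::/10) address can be checked mechanically. A minimal sketch, assuming Python's standard ipaddress module; the helper name has_link_local and the hand-assigned address fe80::1aee are hypothetical, while the other two addresses are taken from the first post:

```python
import ipaddress

def has_link_local(addresses):
    """Return True if any address in the list falls inside fe80::/10."""
    return any(ipaddress.IPv6Address(a).is_link_local for a in addresses)

# Source address of the router's solicitation in the first post.
print(ipaddress.IPv6Address("fe80::feec:daff:fe7b:3798").is_link_local)  # True

# A jail with only the static global address configured.
print(has_link_local(["2001:6666:1111:1b::1aee"]))                 # False
# The same jail after a link-local address was added by hand (hypothetical value).
print(has_link_local(["2001:6666:1111:1b::1aee", "fe80::1aee"]))   # True
```

On a real jail the equivalent check is simply looking for an fe80:: line in the ifconfig output of the jail's interface.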


The issue should be fixed, in TrueNAS and/or probably in FreeBSD!

I found a couple of links earlier describing more or less the same issue:
# https://forums.freebsd.org/threads/freebsd-is-not-answering-neighbour-solicitation.77814/
# https://forums.freebsd.org/threads/freebsd-12-not-answering-neighbor-solicitation.69035/
# https://www.mail-archive.com/freebsd-net@freebsd.org/msg63870.html
# https://www.mail-archive.com/freebsd-net@freebsd.org/msg63848.html
# https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233283
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Please, one of the people experiencing this problem post the output of ifconfig on the NAS host while the jails are running. Thanks.
 

Louis2

Please, one of the people experiencing this problem post the output of ifconfig on the NAS host while the jails are running. Thanks.

Patrick, since I had run a lot of tests before I hit the described issue, I decided to do a clean install and reconfigure things, just to make sure that the communication problem was not caused by some previous test.
- I did a clean 12U7 install and configured the network and a first jail. For this (re)test I tried to install the NextCloud plugin,
- which did not work ..... because of the described problem .... (I traced the communication using the packet capture on pfSense)

As requested, here is the ifconfig output of the host system (lion):
 

Attachments

  • 20211211_ifconfig.txt
    7.9 KB · Views: 152

Louis2

I should add the following vnet part to the posted ifconfig

vnet0.5: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
description: associated with jail: RedZoneGlobal as nic: epair0b
options=8<VLAN_MTU>
ether 02:c1:87:05:d0:c8
hwaddr 02:e6:3e:2d:d4:0a
groups: epair
media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
status: active
nd6 options=1<PERFORMNUD>

I also experimented with adding fe80::d/64 as an extra IPv6 address to the jail. That did not help. However, I noticed strange things I had not noticed before, such as ARP messages where IPv4 address A is asking where IP address A is (!!??), and some multicast messages. I removed fe80::d/64 from the jail definition again, since it did not improve the situation (perhaps the opposite).

However, the majority of the messages are simply an unanswered cry of "please tell me how to reach my router" (ICMPv6 Neighbor Solicitation).

Bottom line: unless someone has the solution, I cannot get (a static) IPv6 working, which I need in order to make e.g. a jail-hosted website work. I probably have to wait for the FreeBSD 13 based version, which hopefully works.

Note that the problem is not limited to jails; solicitation messages originating from the host are not answered either.
 

Louis2

Some additional findings:
- the problem is not limited to jails, but also occurs between the TrueNAS Core host and the pfSense router
- my (latest) TrueNAS version is based on 12.2-RELEASE-p11 and my pfSense version is based on FreeBSD 12.3
- solicitation messages sent from pfSense towards TrueNAS are answered (unlike the other way around)
- if I open the edit page of a running jail (you cannot edit a running jail's config, but you can look at the parameter settings), the IPv6 part is greyed out
- the problem also exists between my other TrueNAS system (also running 12U7) and pfSense. Note that the two TrueNAS systems use different 10G cards (Mellanox fiber vs. Intel X550 RJ45)

Up to now I captured the traffic using the pfSense packet capture function; however, using ^tcpdump -ni vlan10 icmp6 | fgrep neighbor^ you can also capture the traffic from the TrueNAS command line.

What I did is the following test:

test-1) I opened two command line windows on TrueNAS
> in the first one I started ^tcpdump -ni vlan10 icmp6 | fgrep neighbor^
> in the second one ^ping6 www.google.com^
> result: pings are NOT answered and the trace shows that the solicitations are not answered

test-2)
> I still have the TrueNAS command line with tcpdump active
> now I open a command line on pfSense and start a ping towards TrueNAS
> result: solicitations are sent from pfSense and .... are answered by TrueNAS

Below a copy of the trace

Test-1 (truenas sends solicitations)
13:04:04.977961 IP6 A:B:C:10::25 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has A:B:C:10::1, length 32
13:04:05.999748 IP6 A:B:C:10::25 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has A:B:C:10::1, length 32
13:04:07.134956 IP6 A:B:C:10::25 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has A:B:C:10::1, length 32
13:04:08.194709 IP6 A:B:C:10::25 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has A:B:C:10::1, length 32

Test-2 (pfSense sends solicitations)
13:04:08.751257 IP6 A:B:C:10::1 > ff02::1:ff00:25: ICMP6, neighbor solicitation, who has A:B:C:10::25, length 32
13:04:08.751287 IP6 A:B:C:10::25 > A:B:C:10::1: ICMP6, neighbor advertisement, tgt is A:B:C:10::25, length 32
13:04:13.800706 IP6 A:B:C:10::25 > A:B:C:10::1: ICMP6, neighbor solicitation, who has A:B:C:10::1, length 32
13:04:13.800825 IP6 A:B:C:10::1 > A:B:C:10::25: ICMP6, neighbor advertisement, tgt is A:B:C:10::1, length 24
13:04:36.774099 IP6 A:B:C:10::1 > A:B:C:10::25: ICMP6, neighbor solicitation, who has A:B:C:10::25, length 32
13:04:36.774117 IP6 A:B:C:10::25 > A:B:C:10::1: ICMP6, neighbor advertisement, tgt is A:B:C:10::25, length 24
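
Reading such a trace by eye gets tedious on longer captures. The following is a rough sketch, not part of the original test, of how saved tcpdump lines could be scanned for solicited targets that never show up in an advertisement. The function name and the 2001:db8:10:: sample addresses (documentation prefix) are made up, and the matching is deliberately naive: an advertisement anywhere in the capture counts as an answer.

```python
import re

# Patterns for the one-line summaries tcpdump prints for NDP messages.
NS = re.compile(r"neighbor solicitation, who has (\S+?),")
NA = re.compile(r"neighbor advertisement, tgt is (\S+?),")

def unanswered_targets(lines):
    """Return solicited targets that never appear as the target of an advertisement."""
    asked, answered = [], set()
    for line in lines:
        ns = NS.search(line)
        na = NA.search(line)
        if ns and ns.group(1) not in asked:
            asked.append(ns.group(1))
        elif na:
            answered.add(na.group(1))
    return [target for target in asked if target not in answered]

capture = [
    "13:04:04.977961 IP6 2001:db8:10::25 > ff02::1:ff00:1: ICMP6, neighbor solicitation, who has 2001:db8:10::1, length 32",
    "13:04:08.751257 IP6 2001:db8:10::1 > ff02::1:ff00:25: ICMP6, neighbor solicitation, who has 2001:db8:10::25, length 32",
    "13:04:08.751287 IP6 2001:db8:10::25 > 2001:db8:10::1: ICMP6, neighbor advertisement, tgt is 2001:db8:10::25, length 32",
]
print(unanswered_targets(capture))  # ['2001:db8:10::1']
```

Feeding it the output of the tcpdump command used above (saved to a file and read line by line) would list the router address during test-1 and nothing once the advertisements appear.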
 

Patrick M. Hausen

@Louis2 The most important part of the ifconfig output is missing, namely which bridge the vnet interface of your jail is a member of. That's why I asked for the output with the jails up and running ...

OK, so let's start with a perfectly working IPv6 configuration, OK?

1. Physical interface
Bildschirmfoto 2021-12-12 um 13.37.34.png
Disable hardware offloading, no IP address!

2. VLAN interface
Bildschirmfoto 2021-12-12 um 13.38.11.png
No IP address! Please ignore the "lagg0" parent - if you run without link aggregation, just picture "igb0" here.

3. Bridge interface
Bildschirmfoto 2021-12-12 um 13.38.30.png
If the NAS host needs to communicate in that VLAN, the IP configuration must go here.

4. Jail configuration
Bildschirmfoto 2021-12-12 um 13.39.12.png
Bildschirmfoto 2021-12-12 um 13.39.37.png
And now for the important bit! Inside the jail make sure this is in /etc/rc.conf: ifconfig_epair0b_ipv6="inet6 auto_linklocal accept_rtadv autoconf"

Voila. Bridged jail with IPv6 on a VLAN. Working here since FreeNAS 11.0 without a hitch. If you don't want SLAAC you will probably need to remove everything except "auto_linklocal".
 

Louis2

Patrick

To start with: many thanks for your support!

Yep, I initially produced the ifconfig output without running jails; I corrected that in my next reply. The reason is that I had not yet defined any jails at that moment. I reinstalled the system from scratch to be quite sure it was clean, without leftovers from previous tests.

Additional tests taught me that the problem is not just jail related. It also affects the TrueNAS host!!
More findings:

1) the problem is not only present on my test system, but also on my actual used NAS
- both NAS-systems are running 12U7
- both systems are via switches connected to the same pfSense (running latest release based on FreeBSD12.3)

2) If I initiate the test from the pfSense side it works :)
- and the pings started earlier from the TrueNAS side "magically" start working as well (of course)

Below my findings and reactions when reading your post above
1) Physical interface. Yep, that is how it is and was configured

2) Yep, I did not assign any ip to the involved vlans
Remarks:
- In contrast, the VLANs related to the "TrueNAS host" do have IPs assigned
- and those "TrueNAS host" related VLANs are not assigned to a bridge
- I am not using a lagg on the NAS
- I am using an Intel X550 board, where ix0 is the trunk
- vnet requires hardware offloading to be turned off :mad: (performance impact!),
For that reason I plan to use ix0 as the vnet trunk and ix1 as the interface towards the host itself

3) I am using bridges for all vlans used in favor of the jails. I have a couple of them
- vlan-x with bridge-x, vlan-y with bridge-y etc
- the bridges do NOT have ip's assigned

4) Basic jail properties
- Here are significant differences!!!
a) I do not use Router Advertisements; note, however, that pfSense supports them
b) I do not have the Berkeley Packet Filter active, since it seems that:
- it is only required for promiscuous mode (not used)
- or perhaps DHCP (not used)
- and it costs performance
c) I need fixed addresses, because
- http etc. is forwarded to the fixed IPV6 address supposed to be available in the jails

Note: I would love to have MAC-based firewall rules to control outgoing traffic. Reason: IPv6 cannot be filtered based on source address. However, pfSense (pf under the GUI) does not provide that option.

d) If I used DHCPv6, I could perhaps use the DUID to automatically assign a fixed IPv6 address
- However ..... in that case I need to know the DUID first ...... (and the DUID needs to be stable)
- SLAAC is different: it is intended to let a server choose its own IP address
(which is exactly what I cannot use!)
- Nevertheless, within pfSense it seems to be the same service!? So perhaps there is an option!??

e) As you might have guessed, I configured IPv6 the same way as you defined IPv4
- and that should ..... work ..... however .... not up to now (and probably never in the current release)
- Whatever; what I can and probably will try is the RA method or perhaps DHCPv6, which seem to be the only working options

Note: I really hope to see a FreeBSD 13 based TrueNAS release very soon, and that it also solves this problem!
FreeBSD 13 seems to have significant network improvements and, among other things, will hopefully also support my 2.5G Realtek NIC
(which I plan to use as startup / emergency NIC, now that I have to use my second 10G port as the high-performance host connection).
I will switch to TrueNAS 13 as soon as there is a workable beta.

5) Jail Network properties
- I was especially surprised to see the bridge number !!!!!!!! That is the default bridge .... not the VLAN bridge!
- and that bridge is also potentially carrying traffic from multiple VLANs ... which is so damn wrong (and insecure).
- I am using different resolver settings, but that is not the problem

6) Suggested rc.conf settings
- I did add ^ifconfig_epair0b_ipv6="auto_linklocal"^ to a jail:
^# 202111212 LvB On Advice Patrick
ifconfig_epair0b_ipv6="auto_linklocal"^

- and restarted the jail .... no luck :frown:

The settings should (assuming they would solve the problem) also be added to the host's "/etc/rc.conf",
or more precisely to "/conf/base/etc/rc.conf", since that file overrides rc.conf on each restart.
Or even better, if that is an option, be defined with "tunables".

The current conclusion seems to be that static IPs do not work :( at all :mad: :mad:
So the only option at the moment seems to be RA or perhaps DHCPv6.

Patrick again many thanks for your support!
 

Patrick M. Hausen

- vnet requires hardware offloading to be turned off :mad: (performance impact!),
The performance impact is not noticeable. Your CPU will have a little bit more work. But hardware offloading only makes sense if the TCP connection terminates on the host - which it doesn't. The host only serves as a bridge/vswitch to the jails. That's why you must turn it off.

- vlan-x with bridge-x, vlan-y with bridge-y etc
- the bridges do NOT have ip's assigned
Good.

b) I do not have the Berkeley Packet Filter active, since it seems that:
- and it costs performance
Who told you that? The BPF setting in the jail properties only allows the jail to use BPF. BPF only gets activated if you run an application that uses it inside the jail. I like to be able to use tcpdump inside jails, that's why I have that setting.

c) I need fixed addresses, because
- http etc. is forwarded to the fixed IPV6 address supposed to be available in the jails
Me too. All of my jails' addresses are static, because SLAAC addresses are static. This is not IPv4 DHCP with a pool at work. SLAAC addresses are derived from
  • the prefix - static in my and I assume also your case if you run public services
  • an EUI-64 address derived from the 48 bit MAC address of the jail - which is also statically configured
I have been using SLAAC for servers for years. Just enable it, boot, look up the IPv6 address, then put that into DNS.
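
The derivation described above can be reproduced offline, which lets you predict a jail's SLAAC address before booting it. A minimal sketch, assuming the classic modified EUI-64 scheme (flip the universal/local bit of the MAC and insert ff:fe in the middle) and no privacy extensions; the function name, the documentation prefix 2001:db8:0:1::/64 and the MAC are hypothetical values chosen for illustration:

```python
import ipaddress

def slaac_eui64(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with the modified EUI-64 interface ID of a MAC address."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("EUI-64 SLAAC expects a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02                                           # flip the universal/local bit
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])   # insert ff:fe in the middle
    return ipaddress.IPv6Address(int(net.network_address) | int.from_bytes(iid, "big"))

# Hypothetical prefix and MAC, not taken from any system in this thread.
print(slaac_eui64("2001:db8:0:1::/64", "02:ff:60:d0:71:96"))
# 2001:db8:0:1:ff:60ff:fed0:7196
```

As long as the prefix and the jail's configured MAC stay the same, the result stays the same, which is why such an address can safely go into DNS.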

5) Jail Network properties
- I was especially surprised to see the bridge number !!!!!!!! That is the default bridge .... not the VLAN bridge!
- and that bridge is also potentially carrying traffic from multiple VLANs ... which is so damn wrong (and insecure).
Wrong. This is the VLAN bridge because I statically created and configured it to have VLAN 2 as its member. See my screenshot. So it's the bridge for that VLAN and there's nothing dangerous about my setup. I have only one VLAN with jails and VMs. So I decided to name the bridge "bridge0". 0 is just a number. Of course in your case with multiple VLANs it makes perfect sense to name the bridge for VLAN x "bridgex".

"bridge0" is only created as the default bridge if you don't explicitly configure anything else. I agree with you that this part of TrueNAS needs a serious redesign.

The settings should (assuming they would solve the problem) also be added to the host's "/etc/rc.conf",
or more precisely to "/conf/base/etc/rc.conf", since that file overrides rc.conf on each restart.
Or even better, if that is an option, be defined with "tunables".
No, no, no ... the point of vnet is that each jail has got its own independent IP stack for both IPv4 and IPv6. Why would any setting you apply to the host have any effect on the jails? It just doesn't.

The current conclusion seems to be that static IPs do not work :( at all :mad: :mad:

OK, so now to that claim. I used one of my jails, named "rdp" (I run Apache Guacamole in there) - looked up its SLAAC address and then using that address configured it this way.

Jail settings in the UI:
Bildschirmfoto 2021-12-12 um 20.12.44.png

Settings inside the jail's /etc/rc.conf:
ifconfig_epair0b_ipv6="inet6 2003:a:d59:3880:ff:60ff:fed0:7196/64 auto_linklocal"

Start jail, ping6 from outside, ssh, ping6 from inside - all working.

HTH,
Patrick

P.S. What does not work but should is putting a link local address into the Default Router field. Like fe80::3eec:efff:fe00:5430%epair0b. I will have to investigate, because we are using that in our data centre with standard FreeBSD instead of TrueNAS. So I will need to find out if there's anything different in their iocage vs. ours. But with a GUA as the gateway it definitely works. Or use SLAAC :wink:
 

Patrick M. Hausen

Update: so all of the iocage controlled settings in my plain FreeBSD production environment and on my TrueNAS are the same. Yet the former works with a default gateway link local address and the latter doesn't unless I use SLAAC - which I do with TrueNAS, but static with link local gateway should work just as well.

So it's down to different iocage versions, I guess.
 

Louis2

Patrick,

I think you overlooked two things; perhaps my explanation was not clear enough:
- the problem is NOT only related to the jails, but also to the TrueNAS host; that is the reason I was thinking about how to configure the host (the host's rc.conf and so on)
- when I was writing about (v)lan performance, my idea was:
> performance-wise, the connection to the host (the NAS functionality) is the important one, so for that interface I do not want performance degradation. I therefore dedicate ix0 with hardware acceleration to that purpose => ix0 => vlan-1 => host
> the jails are mostly not used for performance-critical applications, so I set up a trunk with all jail related vlans on ix1 => ix0 => vlan-2, vlan-3 etc => bridge-1, bridge-2 etc => vnet => jail-1, jail-2 etc

I will think a bit more about what you wrote tomorrow and probably try to test SLAAC with a jail.
 

Patrick M. Hausen

Got it. I need more time to look into this. The FreeBSD stack definitely is not broken. I run >50 hosts with >500 jails - all with static addresses because my colleagues overruled me with respect to the use of SLAAC. All hosts and jails alike have a link local default gateway with a GUA address. Many hosts and many jails are IPv6 only with a single jail running NAT64 for outbound and an SNI proxy for inbound connections via IPv4.

So that's why I am easily triggered by claims that the FreeBSD stack was somehow broken. It isn't. The performance gains we will get for the bridge interface with FreeBSD 13 are an order of magnitude.

At home with a large TrueNAS installation I run all IPv6 hosts with SLAAC. My desktop Mac will use privacy extensions and change addresses at will, while all my servers, jails, VMs, FreeBSD, Linux ... have automatically assigned but still static addresses as long as the MAC addresses don't change.

As for the hardware acceleration or not point - have you done any measurements? According to my experience we are talking single digit percentage here, so why bother? As I wrote it is not noticeable under common circumstances.
 

Louis2

Patrick,

I am trying to set up a new multipurpose system. That takes much longer than expected :smile:
So the system is far from ready.

However, I recently swapped the interface card in my PC from a TN4010 NBASE-T card to an Intel X550, which has more offloading (it is a server card). Running Windows 10, that card swap made a lot of performance difference ...
 

Patrick M. Hausen

Measurements with FreeBSD and hardware offloading on vs. hardware offloading off. Everything else doesn't count.
 

Louis2

I keep trying to get it working ..... the way it should .....

Anyway, a few findings:

Debugging option (maybe)
There seems to be a debugging option for neighbor discovery; however, I did not manage to get it working .... and I doubt it provides more info than a Wireshark trace anyway
root@lion[~]# sysctl net.inet6.icmp6.nd6_debug
net.inet6.icmp6.nd6_debug: 0
root@lion[~]# sysctl net.inet6.icmp6.nd6_debug=1
net.inet6.icmp6.nd6_debug: 0 -> 1

Tunable (I tried it .... and if it works, then only for the host)
variable: net.inet6.icmp6.nd6_debug
value: 1
type: sysctl
description: Debug neighbor discovery

SLAAC
I tried that for one of the jails ..... and perhaps that works .... but not the way I want:
- a few settings
Inside the jail make sure this is in /etc/rc.conf: ifconfig_epair0b_ipv6="inet6 auto_linklocal accept_rtadv autoconf"
- jail basic properties: vnet, Berkeley Packet Filter, autoconfigure IPv6, and I added the IPv6 default gateway
- in pfSense I activated the DHCPv6 server and RA (Managed RA flags)
.... not sure; I tried multiple options and not everything worked. The first and perhaps only positive result was with the option ^unmanaged RA-flags^

I ran into situations where the jail was recognized based on a link-local address but did NOT have a normal IPv6 address, so it could not communicate over the internet. Later on, probably due to changed RA settings, there was also a normal IPv6 address.

However, that is a SLAAC address and not the IPv6 address I had in mind :mad:
I can only hope that I can change that via DHCP (assigning an address based on the DUID)

Whatever, I should define the address .... and not the server ....

I have to do more testing; I am glad that this perhaps works, but it remains ridiculous and a severe bug
 

Patrick M. Hausen

Unmanaged is correct for SLAAC. Managed means DHCPv6. You don't need to set the gateway if you have accept_rtadv.

Read my post above where I set a static address. That did work here ...
 

Louis2

I did some more testing ...... and even rolled back the system due to strange effects ....
- SLAAC or DHCP ... I am losing sight of what really happens ...... do provide addresses, among them link-local ones.
- Additionally, the explicitly provided addresses are also shown in the GUI and in ifconfig

However, that does not change the fact that the "Neighbor Solicitation problem" is still there and that, despite the availability of IPv6 link-local addresses and global IPv6 addresses, even a "ping6 to its own gateway" fails (now and then I saw that working, I do not know why)

All in all ..... I have to decide whether I wait for the TrueNAS 13 release, hoping that it solves the problems, or switch to another environment, accepting that I lose the beloved TrueNAS NAS application.
 

Patrick M. Hausen

You aren't messing with ipfw, are you?
 

Louis2

No, I installed 12U7 from scratch, to be sure that it was completely clean. I did check "ipfw list". There is only one rule, which accepts everything.
 