Network failover trouble (active-backup)

MichaelDE

Cadet
Joined
Feb 26, 2021
Messages
9
Hello Community,

Unfortunately, I have network problems with TrueNAS-SCALE-22.12.4.2. Because I cannot solve the problem myself, I am asking for your assistance.

I am trying to get a failover network configuration to work. Both links by themselves work, but in bonding mode "fault-tolerance" the passive link does not take over when I disable the active link on the switch ;(

Can you identify the problem?
Thanks & best regards
Michael

cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v5.15.131+truenas

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enp9s0 (primary_reselect always)
Currently Active Slave: enp9s0
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp9s0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:07:43:0b:bc:8a
Slave queue ID: 0

#doesn't work :(
4: enp9s0: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state DOWN group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff
5: enp9s0d1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff permaddr 00:07:43:0b:bc:8b
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff

#works fine
4: enp9s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::207:43ff:fe0b:bc8a/64 scope link
       valid_lft forever preferred_lft forever
5: enp9s0d1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8b brd ff:ff:ff:ff:ff:ff
9: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8b brd ff:ff:ff:ff:ff:ff

#works fine
4: enp9s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff
5: enp9s0d1: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 9000 qdisc mq master bond0 state DOWN group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff permaddr 00:07:43:0b:bc:8b
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff

#works fine
4: enp9s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff
5: enp9s0d1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff permaddr 00:07:43:0b:bc:8b
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:07:43:0b:bc:8a brd ff:ff:ff:ff:ff:ff
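
One way to sanity-check a status dump like the one from /proc/net/bonding/bond0 above is to parse it and look at the link-monitoring settings. In the Linux bonding driver, an MII polling interval of 0 means MII link monitoring is disabled, so a bond that also has no ARP monitoring may never notice that a link went down. Whether that is the cause here is only an assumption; nobody in this thread has confirmed it. The sketch below is a minimal Python example: the function names `parse_bond_status` and `find_problems` are made up for illustration, `SAMPLE` is copied from the dump above, and the "ARP Polling Interval (ms)" key is assumed to appear only when ARP monitoring is configured.

```python
# Minimal sketch: parse /proc/net/bonding/<bond> text and flag missing link monitoring.
# SAMPLE is the dump posted above; on a live system you could read the file instead.
SAMPLE = """\
Ethernet Channel Bonding Driver: v5.15.131+truenas

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: enp9s0 (primary_reselect always)
Currently Active Slave: enp9s0
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

Slave Interface: enp9s0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:07:43:0b:bc:8a
Slave queue ID: 0
"""


def parse_bond_status(text):
    """Return (bond_settings, slaves): a dict of bond-level keys and a list of per-slave dicts."""
    bond, slaves, current = {}, [], None
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines
        # Split on the first colon only, so MAC addresses stay intact in the value.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = {"name": value}
            slaves.append(current)
        elif current is not None:
            current[key] = value  # lines after a "Slave Interface" belong to that slave
        else:
            bond[key] = value  # lines before the first slave are bond-level settings
    return bond, slaves


def find_problems(bond, slaves):
    """Return human-readable warnings for settings that can prevent failover."""
    problems = []
    miimon = int(bond.get("MII Polling Interval (ms)", "0"))
    # Assumption: this key is only printed when ARP monitoring is configured.
    arp_interval = int(bond.get("ARP Polling Interval (ms)", "0"))
    if miimon == 0 and arp_interval == 0:
        problems.append("no link monitoring: MII and ARP polling intervals are both 0")
    if len(slaves) < 2:
        problems.append(f"only {len(slaves)} slave interface(s) listed; failover needs at least 2")
    return problems


if __name__ == "__main__":
    bond, slaves = parse_bond_status(SAMPLE)
    for problem in find_problems(bond, slaves):
        print("WARNING:", problem)
```

The dump above lists only one slave section, which may simply mean the paste was cut short, so the second warning should be read with that in mind.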
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Let's start with the config. How is it set up in the web UI?

 

MichaelDE

Hello morganL,

thank you very much for your feedback.
I have attached the two screenshots.

best regards
Michael
 

Attachments

  • config.png (59.9 KB)
  • widget.png (15 KB)

MrGuvernment

Patron
Joined
Jun 15, 2017
Messages
268
Is your switch configured for anything, such as LACP?

Next thought: why not just have both links active in the bond (if that works)?
 

MichaelDE

Hello MrGuvernment,

no LACP is configured on the switch ports, only the tagged VLANs are on them.

I would like to configure the active / passive failover so that I can better control the data flow.

best regards
Michael
 

MrGuvernment

What is it about the data flow that you wish to control? Active/active simply gives you double the bandwidth, and if one link dies, traffic fails over to the remaining active link anyway.

Do you have spanning tree enabled? Can you monitor your ports via your switch to see if port blocking is happening?

Personally, I would LACP / LAGG those NICs on the switch side, configure the TrueNAS side as a matching group, tag your VLANs on it, and be done.
 

MichaelDE

Hello MrGuvernment,

thank you for your advice.
The two links are on different switches. The backup link should only be used for maintenance purposes. Otherwise, the traffic should run via the preferred switch. For this reason, an active-active configuration makes no sense for me.
Spanning tree is activated and configured correctly. The same setup works for other systems without any problems. Port blocking does not occur, I have checked the logs accordingly.

Do you have any tips for troubleshooting the active-backup configuration?

Best regards
Micha
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
You cannot have a bond link to two different switches unless the two switches support something commonly called "multi chassis LACP" or also "stacking".
 

MichaelDE

Hello Patrick M. Hausen,

I don't think that statement is correct. An active-passive configuration is also possible without multi-chassis link aggregation.
This also works without any problems on TrueNAS CORE systems.

best regards
Micha
 

Patrick M. Hausen

Even in environments where it works I consider it a hack. I recommend not running multiple links without LACP ever. Unless you are using ESXi without vCenter, of course. :confused:
 

MichaelDE

This is not a hack, it is part of the specification ....

Unfortunately, your personal interpretation does not help to solve the technical problem ;(

best regards
Michael
 

Patrick M. Hausen

Where in that paragraph is anything but "the network switch" mentioned? Single switch it is.
 

MrGuvernment

In addition, the TrueNAS documentation refers to lagg(4)*.
Please take a look at the example.
*https://wiki.debian.org/Bonding#Configuration_-_Example_2_.28.22Laptop-Mode.22.29

From a network view, this is no different than 2 switches ....

Routing tables. This is likely why your links do not fail over. Your routing table and ARP entries are broadcast to one switch; when you then disconnect it, the bond breaks, because your network still thinks the connection is on the same switch, which it is not. The second switch has no idea about the first one, hence "stacking".

In proper networks, you must "stack" switches so they "appear" as one to the equipment that connects to them. Only when the switches are properly configured can you split links between them for redundancy, as you want to do.

But you said you had this working in Core?
 

MichaelDE

Hello MrGuvernment,

Thank you for your message.
Yes, as I said, it works with TrueNAS CORE, and our ESXi servers are also operated with such an active / passive connection.

best regards
Michael
 