Setting up link aggregation with 2 NICs and no other access

the_jest

Explorer
Joined
Apr 16, 2017
Messages
71
This is a followup to what I asked in the LACP thread yesterday, broken out as it wasn't really the point there.

I have a FreeNAS box with two NICs, and no serial port, keyboard/monitor, or other way of accessing it currently configured. My main, and only current, connection is on eth0 set to a static 192.168.1.10. I'd like to add igb0 as a failover, using the same IP address as the interface, as this is how the rest of the network expects to find the box. (Though I was successfully talked out of it, I'm also curious about setting up the two NICs as an LACP aggregation.)

The docs warn that before creating a link aggregation, you should be careful about changing the network interface used by the web interface. But they don't say how to get around this. What's the right way to set this up? Going through the motions, I see that I can add a lagg0 interface, specify the protocol (LACP, failover, etc.), associate both NICs, and specify an IP address. What will happen if I specify the same address as eth0 is currently on (which is what I ultimately want)? Or if I give another address, will the original be inaccessible?

I don't want to create lagg0 and then find myself kicked offline with no way to connect until I finish a procedure that I can no longer finish.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
If you set the web GUI address (System | General) to 0.0.0.0, then as long as connectivity exists on an IP, the FreeNAS GUI will be there.

As a couple of other notes, it's not generally recommended to do LACP with different NIC drivers (igb0 and eth0 indicates that's the case for you; the recommended setup would be igb0 and igb1, for example). It would also be interesting to know what you're expecting this scenario to cover: what would need to fail for this failover to happen?
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
With LACP, there is a protocol that monitors the links, and any kind of failure to reach the server over one link will take that link out of commission.
I am running LACP “just for the heck of it”. Active/active, first configured on the server and then the switch. The server becomes unreachable after creating the lagg, and reachable again after configuring the switch, provided everything was done right.

I have no experience with failover, and I’d expect it to kick in for link down (layer 1) only. I expect it to be L2, which means there is an IP associated with lagg0 and no need for a secondary IP.

Changing a device to link aggregation without having some kind of secondary access - IPMI, serial console, heck even keyboard and monitor - is risky. One mistake or unexpected behavior, and you’re cut off.

You can mitigate your risk by testing your steps outside production first, having good documentation for your steps, and implementing a “rip cord” of sorts - a crontab that is set to a few minutes in the future and restores the original config, tested outside production as well. That way, if you lose access, you just walk away for a few minutes, have a tea, and come back to the system the way it was.
If your change worked and survives tests, you remove the rip cord crontab entry. Which means you want it far enough in the future that you can do the changes and test them, but not so far that you have a prolonged wait time on failure.
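
A minimal sketch of building such a rip cord entry, assuming the restore itself lives in a script. The path /root/restore_network.sh is hypothetical, and what that script has to do to actually bring the old config back is not settled in this thread. The code only computes a crontab line pinned to a time a few minutes ahead:

```python
from datetime import datetime, timedelta


def ripcord_cron_line(now: datetime, delay_minutes: int, command: str) -> str:
    """Build a crontab entry pinned to the date and time now + delay_minutes."""
    fire_at = now + timedelta(minutes=delay_minutes)
    # Crontab fields: minute hour day-of-month month day-of-week command
    return f"{fire_at.minute} {fire_at.hour} {fire_at.day} {fire_at.month} * {command}"


if __name__ == "__main__":
    # Hypothetical restore script; its contents are system-specific.
    print(ripcord_cron_line(datetime.now(), 15, "/root/restore_network.sh"))
```

Because the day and month are pinned, the entry fires once at the chosen time and again a year later if it is never removed, so it should be deleted once the change has been verified.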

My typical “rip cord” would be a boot environment, but without IPMI, that doesn’t work for you.
Edit: Might still work - use midclt to select the known good boot env as active, and reboot, in your crontab entry. Definitely test that outside production first.

How long is your maintenance window for this?
 

the_jest

sretalla said: "If you set the web GUI address (System | General) to 0.0.0.0, then as long as connectivity exists on an IP, the FreeNAS GUI will be there."

OK, that was already the situation. Does "as long as connectivity exists on an IP" incorporate setting up the lagg0 interface?

sretalla said: "As a couple of other notes, it's not generally recommended to do LACP with different NIC drivers (igb0 and eth0 indicates that's the case for you; the recommended setup would be igb0 and igb1, for example). It would also be interesting to know what you're expecting this scenario to cover: what would need to fail for this failover to happen?"

I mistyped--it's actually igb0 and em0, but point taken.

As far as the practical case--this is mainly just for me to learn a bit more about this; I don't have any performance or reliability issues I'm trying to address. But the scenario for failover would be the usual, I guess--physical failure of the first interface, or accidental removal of the network cable, or something like that. But mainly, because I have a box with two network ports, and I figured I should do something with the second one, and this seemed the most straightforward thing.
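
For illustration, a rough mental model of failover mode as the FreeBSD lagg(4) man page describes it: traffic uses the first (master) port and moves to the next port only when the master becomes unavailable. This is a toy model, not the kernel's implementation, and the function name and the link-state dictionary are made up for the example:

```python
from typing import Dict, List, Optional


def active_port(ports: List[str], link_up: Dict[str, bool]) -> Optional[str]:
    """Return the first port, in configured order, whose link is up."""
    for port in ports:
        if link_up.get(port, False):
            return port
    return None  # no port has link, so the lagg carries no traffic


ports = ["em0", "igb0"]  # em0 listed first, so it acts as the master port
print(active_port(ports, {"em0": True, "igb0": True}))    # em0
print(active_port(ports, {"em0": False, "igb0": True}))   # igb0
print(active_port(ports, {"em0": False, "igb0": False}))  # None
```

In this model a pulled cable or a dead port on em0 moves traffic to igb0, which matches the scenarios listed above.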
 

the_jest

Yorick said:
"Changing a device to link aggregation without having some kind of secondary access - IPMI, serial console, heck even keyboard and monitor - is risky. One mistake or unexpected behavior, and you’re cut off.

You can mitigate your risk by testing your steps outside production first, having good documentation for your steps, and implementing a “rip cord” of sorts - a crontab that is set to a few minutes in the future and restores the original config, tested outside production as well. That way, if you lose access, you just walk away for a few minutes, have a tea, and come back to the system the way it was.
If your change worked and survives tests, you remove the rip cord crontab entry. Which means you want it far enough in the future that you can do the changes and test them, but not so far that you have a prolonged wait time on failure.

[...]

How long is your maintenance window for this?"

Thanks. I could add a keyboard and monitor in an emergency, I suppose. But I like the crontab ripcord idea.

This is a home server, and while there are other people and services that use this box, there's no specific maintenance window. If it takes ten minutes instead of thirty seconds, it's no big deal. If I have to reassemble the box and reinstall the OS from scratch, it would be a big deal for me :-/
 

Yorick

This "goes without saying" and I'll say it anyway: Take a backup of your config and store it outside the unit. In fact you should, ideally, have this set up as a daily job that syncs to the cloud.

The likelihood that you'll need to reinstall FreeNAS is vanishingly small. It could happen if you boot from USB and your USB dies during this procedure. Beyond that, I don't see a lot of scenarios.

And now you know why IPMI is desirable even in a home setup :).
 

the_jest

OK, I spent a bit of time on the weekend on this, and I confess I don't see how it's possible.

I did back up my config, and I wrote a crontab line to copy this backup onto the regular config, a few minutes in the future, as @Yorick suggested. However, when I try to add the lagg interface and enter 192.168.1.10/24 as the IP address, I get a validation error: "[EINVAL] interface_create.aliases.0: The network 192.168.1.0/24 is already in use by another interface." This occurs with any address on the subnet; trying .99 (because .10 is already in use by em0) gives the same error.
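
To illustrate why .99 fails the same way as .10, here is a small sketch of the kind of check the error message implies: a new address is rejected if its network overlaps one already configured on another interface. This is an assumption about the behaviour, not the actual FreeNAS validation code, and the function name is hypothetical:

```python
import ipaddress
from typing import Dict, List


def conflicting_interfaces(candidate: str, existing: Dict[str, str]) -> List[str]:
    """Return interfaces whose configured network overlaps the candidate's network."""
    new_net = ipaddress.ip_interface(candidate).network
    return [
        name
        for name, addr in existing.items()
        if ipaddress.ip_interface(addr).network.overlaps(new_net)
    ]


existing = {"em0": "192.168.1.10/24"}  # the current setup from this thread
print(conflicting_interfaces("192.168.1.99/24", existing))  # ['em0']
print(conflicting_interfaces("192.168.2.10/24", existing))  # []
```

Under that assumption the host part of the address is irrelevant: anything in 192.168.1.0/24 collides with em0 until em0 gives up its address.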

But if I remove 192.168.1.10 from em0 first, I will of course lose connectivity, and won't be able to create the lagg interface.

How am I meant to do this?

Separately--and this might indicate my ignorance of some very basic things--I'm getting no signal when I plug the box into a monitor, either with VGA or HDMI. I don't know if I have to have it plugged in at boot time? My motherboard doesn't have a serial console, or IPMI.
 
Joined
Dec 29, 2014
Messages
1,135
I can't recall if you can add additional interfaces to a LAGG after creation. If you can, then you could create a one-port LAGG with a different IP address on the same network. After that is up, delete the other interface's configuration and add that interface to the LAGG. You will have to change your switch port config one port at a time as well. Otherwise you will have to do it from the console.
 

Yorick

the_jest said: "I wrote a crontab line to copy this backup onto the regular config, a few minutes in the future, ..."

Make sure you test this really well. As in, does it actually recover your old config? My expectation is that it'd take a middleware call or reboot to get back to the config you had, and since you don't have access, all of that would need to be scripted.
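
One way to check from another machine whether the box came back on its old address after the rip cord fires is a plain TCP connection attempt. A minimal sketch, assuming the web GUI listens on port 80 at 192.168.1.10 (the port is an assumption, and the helper name is made up):

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and timeouts.
        return False


# Example call from another machine on the LAN (address taken from this thread):
#   is_reachable("192.168.1.10", 80)
```

A successful connection only shows that something answers on that address and port. It does not prove that the whole old config was restored.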
 