New installation NIC and networking confusion.

jrodgers

Cadet
Joined
Nov 1, 2021
Messages
4
Hi.
I have installed TrueNAS 12.0-U6 and have had it working for several weeks on a Supermicro server. The server has 4 Intel integrated NIC cards in it. Last Friday, I got the bright idea that I would separate the NIC interfaces to a setup like this:

igb0 Mgmt VLAN 10 on the 192.168.10.x/25 subnet
igb1 unused
igb2-3 as a LAG group using LACP for aggregation on VLAN 20, to talk to the other servers and accept general storage traffic on the 192.168.20.x/25 subnet.

I thought that seemed reasonable and simple to do. TrueNAS is confirmed to be working with the NIC cards and can communicate on all of them and get either static or DHCP addressing, depending on what I tried.

The problem occurs when I try to make any changes through the GUI. I lose comms to the box and cannot regain control of it to fix it. Sometimes, if I make a small change, I can get it to revert changes and recover. On other instances, I have completely reinstalled to get back to a known working starting point.

For some reason, when I plug a KVM into the server, I cannot get the serial interface on the screen, even if I enable it in the GUI beforehand. The only way I can get to the text configuration screen is through SSH, which is fine until I change a network interface: networking restarts and leaves me stranded without a way to talk to the box. Then something has to happen (reinstall, reboot) to get back in.

So, as of now, I can statically set a DHCP reservation for the VLAN 10 interface, but I cannot figure out how to make it "VLAN aware". It seems that this should be a tick box or something in the port config screen in the GUI. If I try to "add a VLAN interface" as in the documentation, then it complains that I am trying to put it on the already assigned subnet.

I can configure the LACP port aggregation in the switch and get that side working, but as soon as I try to turn it on in TrueNAS, I am locked out again, even though I should still be able to get in on igb0 mgmt VLAN 10.

Does anyone have any ideas on how to do this better? I have spent all weekend monkeying around with this and I am out of ideas as to what to try next.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
You need to remove the IP address from the physical interface first, then assign it to the newly created VLAN interface. That's what the separate "test" and "save" stages are for, so you won't lock yourself out.
After clicking "test", quickly reconfigure your switch so the port connected to igb0 now carries VLAN 10 tagged, reload the page, and click "save" if that worked.
If not, TrueNAS will revert the network settings to the state they were in before.
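
As a rough illustration of why the GUI complains about an "already assigned subnet" while the address is still on igb0, here is a minimal sketch using Python's standard ipaddress module. The host addresses and the helper find_overlap are made up for illustration (only the 192.168.10.x/25 subnet comes from this thread), and this is not TrueNAS's actual validation code.

```python
import ipaddress

def find_overlap(existing, candidate):
    """Return the name of an existing interface whose subnet overlaps
    the candidate address's subnet, or None if there is no overlap."""
    cand_net = ipaddress.ip_interface(candidate).network
    for name, addr in existing.items():
        if ipaddress.ip_interface(addr).network.overlaps(cand_net):
            return name
    return None

# Hypothetical starting state: igb0 still holds the management address.
interfaces = {"igb0": "192.168.10.5/25"}

# Adding a VLAN interface in the same subnet conflicts with igb0.
conflict_before = find_overlap(interfaces, "192.168.10.5/25")

# Step 1: remove the address from the physical interface.
del interfaces["igb0"]

# Step 2: the same address can now be assigned to the VLAN interface.
conflict_after = find_overlap(interfaces, "192.168.10.5/25")

print(conflict_before)  # igb0
print(conflict_after)   # None
```

Doing both steps in one pending change and then using "test" is what keeps the box reachable while the address moves.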
 

jrodgers

Cadet
Joined
Nov 1, 2021
Messages
4
Thanks, I will give that a try later this evening when I get home.

As far as the console goes, do you know of a way to get the text console to show up on the serial console interface on the server? I have console redirection enabled in BIOS and I think I should be able to get a text console when I plug a keyboard and monitor into the machine, but I don't. The only way I can get to it is to SSH into the machine and start /etc/netcli over the SSH connection.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The BIOS stops running when the kernel starts running. You need to configure FreeNAS for serial console. There's a tickbox in the configuration, System -> Advanced -> Enable Serial Console ...
 

jrodgers

Cadet
Joined
Nov 1, 2021
Messages
4
Thanks for the pointer, Patrick. I was able to set up the VLAN and the LAG using the two-step process you outlined.

A new question has come up, though. The LAGG interface and the VLAN tied to it show up/up in TrueNAS, but I can only ping the box through the management VLAN interface, not on the LAGG/VLAN interface. The switch is configured with matching LACP settings, yet I cannot ping the address assigned to the LAGG/VLAN interface.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
That depends on too many factors for a quick diagnosis. Can you provide a plan of your network? There must be some device routing between those VLANs, right?
 

jrodgers

Cadet
Joined
Nov 1, 2021
Messages
4
This is set up on my home network. The configuration is a fairly simple router on a stick, where I have a pfSense router/firewall appliance defining the subnets and VLANs on the network, and it is connected to a TP-Link T1600-52TS switch. The TN box is connected to the switch by four Gb NIC ports. I have had this setup working for some time now with the VLAN definitions and subnets configured. The new thing was to try to LAG three of the ports on the TN box to the switch.

TN is throwing errors that the "LAGG switch ports are misconfigured, check your switch or cabling". I am assuming that I have the switch LAG group set up correctly, per the TP-Link instructions, but that I have a misconfigured hashing algorithm. That seems reasonable to me because I don't fully understand LAG and how it works other than at a high level. I think the hash algo is wrong, but I have tried all the options available in the switch port config for the LAG, thinking that I would hit on the right combination. I didn't. I also can't find anywhere in the TN documentation what TN expects for a hash algo. I am beginning to wonder if maybe it is a hardware driver problem with the Intel i350 NICs in the TN box. Apparently there were driver issues with these in a previous version of FreeBSD, but they were marked as fixed in v12.

Again, I am trying to use LACP on both ends, and have not tried to do a failover static LAG configuration yet.

I have the management interface on igb0 set up on VLAN 20, and the other three, igb1-3, set up on VLAN 10 for the data traffic of the storage network. I can ping back and forth between the VLANs on other devices on the network, so I know that the network is alright. I keep coming back to the LAG config and the hash algo in the switch and on the TN box.

Once I get the LAG working on TN, the intention is to do the same on similar hardware running Proxmox VE virtualization servers and use the TN as a place to back up and store stuff, etc.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
There is not really such a thing as having a misconfigured hashing algorithm. Each device is allowed to select its own. IEEE 802.3ad is clear in that no particular algorithm is mandated. The purpose of hashing is simply to make sure that all packets for a given network flow go out the same physical ethernet port. The goal is to keep packets from being delivered out-of-order, which causes mayhem.

Even if you had the same algorithm on each side, this would not cause the packets for a given flow to traverse the same wire in both directions, since on one side you are hashing {hostA,hostB} and coming the other way this is {hostB,hostA}.
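
A toy sketch of that idea in Python, assuming a CRC32 over the address pair as the hash. Real implementations (FreeBSD's lagg, the switch firmware) use their own hash functions and inputs; pick_port, the salt parameter, and the addresses are hypothetical and only model "each side picks its own algorithm".

```python
import zlib

def pick_port(src, dst, num_ports, salt=b""):
    """Map a flow (src, dst) to one of num_ports member links
    using a deterministic hash of the address pair."""
    key = salt + src.encode() + b"->" + dst.encode()
    return zlib.crc32(key) % num_ports

# Hypothetical hosts on the storage VLAN.
host_a, host_b = "192.168.20.10", "192.168.20.20"

# Same flow, same direction: every packet lands on the same member port,
# so packets within the flow cannot be reordered across links.
ports = {pick_port(host_a, host_b, 2) for _ in range(1000)}

# Each end hashes independently: one side hashes {A,B}, the other {B,A},
# possibly with a different algorithm (modelled here by a different salt).
nas_port = pick_port(host_a, host_b, 2, salt=b"nas")
switch_port = pick_port(host_b, host_a, 2, salt=b"switch")

print(len(ports))             # 1
print(nas_port, switch_port)  # need not match; either outcome is valid
```

Whether the two directions land on the same wire is irrelevant to LACP; the only requirement is that each sender is consistent per flow.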

The hash is based on the RSS hash from the network card, if available. It is not a particularly configurable option, though you can set "net.link.lagg.default_use_flowid" to 0 to force software computation of the hash (which will be slower).

I'm not aware of where your quoted "misconfigured" error message would be coming from.
 