
Multiple network interfaces on a single subnet

Status
Not open for further replies.

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
We've had a number of people over the years bring up the topic of multiple network interfaces on a single subnet.

Windows administrators and storage admins often set things up this way, and it works to varying degrees, but it is sadly not proper IP networking practice.

Multiple interfaces on a single network (broadcast domain) are officially supported via LACP (IEEE 802.3ad link aggregation). You configure a single lagg interface with several Ethernet member interfaces, and that will work. The lagg shows up to the OS as a single connection to the network.

Multiple IP addresses on a single subnet are supported through IP aliases. You do not need multiple physical interfaces on the network. Multiple physical interfaces on the same network may not work the way you anticipate.

You can place the network IP and aliases on a LACP link and that works.
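As a concrete sketch, a FreeBSD-style configuration for "multiple ports, multiple IPs, one subnet" looks something like this. The interface names and addresses are hypothetical, and on FreeNAS you would do the equivalent through the GUI rather than by hand:

```shell
# Bring up the member ports, create an LACP lagg from them, and put
# BOTH the primary address and the alias on the lagg - never on the members.
ifconfig igb0 up
ifconfig igb1 up
ifconfig lagg0 create
ifconfig lagg0 laggproto lacp laggport igb0 laggport igb1
ifconfig lagg0 inet 192.168.1.10/24
ifconfig lagg0 alias 192.168.1.11/24
```

The switch ports on the other end must be configured for LACP as well, or the lagg will not come up properly.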

LACP has a limitation in that traffic is not "load balanced" packet by packet. Packets for a given destination are deterministically hashed out a specific interface in order to prevent out-of-order packet delivery. This means LACP is probably not useful for small deployments (fewer than 10 hosts, maybe) but is a massive win with large numbers of busy clients.
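Here is a toy sketch of why a single client never exceeds one link's worth of bandwidth: the egress member is chosen by a deterministic hash of the flow. Real implementations hash MAC/IP/port tuples; this just XORs the last octets of two made-up addresses:

```shell
# Deterministic egress selection: the same src/dst pair always maps
# to the same lagg member, so one flow is confined to one link.
src=10        # last octet of the source IP (assumed)
dst=23        # last octet of the destination IP (assumed)
nports=2      # number of members in the lagg
port=$(( (src ^ dst) % nports ))
echo "flow $src->$dst always egresses member em$port"
```

Change either address and the flow may (or may not) land on the other member; only with many distinct clients does the distribution even out.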

Now there will invariably be some people who will say ... "WRONG! I know that multiple interfaces are legal!" At which point we come to an impasse between what kind-of works and what actually happens. People see things like the term "bind to the IP address". What most people want this to mean is that when you bind to an IP address that's assigned to a specific card, traffic goes in/out that card. This doesn't happen. Sorry.

I don't actually care to debate with you why you feel multiple non-LACP interfaces "should" work a certain way. I don't actually care if you think it worked for you under Windows or on some storage platform. I'm here to document the way it works on FreeBSD, Linux, Apple, and most other UNIX variants. I didn't make these design decisions but I am spending some time here to help you understand them. So please don't belabor the point.

FreeBSD is a sophisticated, modern operating system, and the way traffic is routed is based on an abstracted networking subsystem. Multiple interfaces on a single subnet is an oxymoron, and in order to make it "work right" that way, there would need to be another selection layer in the network stack to handle traffic delivery, in addition to the route table lookup: a layer that notices a given destination has multiple possible links and then searches to see which of those links is best, which adds a lot of complexity. Modern systems typically don't do this because it's slow, because standards such as LACP already exist to do it, because 10GbE interfaces are available, and because rational network designs like the one I suggest above can use multiple separate subnets instead. There are plenty of ways to "do it better."

Almost all modern UNIX operating systems use abstracted networking stacks. Output traffic is handled through the route table, so the outbound load balancing you hope to get by putting two interfaces on a single network ... just doesn't happen. The authors of the modern stacks know 802.3ad exists; it isn't 1989 anymore and RFC 1122 stupidity is pointless. Making multiple interfaces work through a SECOND mechanism - in particular the "route it out the same interface" mechanism many people in this situation seem to expect - would require each packet to be looked up again in a different way, dramatically reducing throughput.

Apple has their own explanation here: http://support.apple.com/kb/TS3679

Microsoft explains how this nonstandard configuration actually works on Windows: https://support.microsoft.com/en-us/kb/175767

National Instruments writes a small book on the topic: http://www.ni.com/white-paper/12558/en/

It's a terminology thing. People want to assume that when you bind to an IP address that's assigned to a specific card, traffic goes in/out that card. This does not happen.

You can certainly "bind to an IP address" in UNIX, but in FreeBSD it's an abstraction that makes no guarantees as to the physical handling of the traffic. This is really useful in many environments for lots of stuff. For example, you can get network redundancy for a server by putting your service addresses on a loopback interface and then advertising that into your interior gateway protocol using OSPF or some other IGP. The address that all external traffic uses isn't configured on ANY Ethernet interface, yet userland applications "bind to the IP address" just as if it were an alias (or the primary address) on an Ethernet interface.

Anyways, the point is that application-level binding is pretty much not closely coupled to physical interfaces. FreeBSD supports a rich set of network interfaces, including point-to-point links (PPP, SLIP, parallel port cable), ATM, etc., and the networking subsystem presents it all as an abstraction to userland. Since a lot of IP configuration is driven by what's defined for physical interfaces, this leads to operational and terminology confusion.
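For the curious, the loopback trick above boils down to something like this (the address is an RFC 5737 documentation placeholder; the OSPF advertisement itself lives in your routing daemon's configuration, which is not shown):

```shell
# Put the service address on lo0; no Ethernet interface carries it.
ifconfig lo0 alias 203.0.113.10/32
# Daemons then bind() to 203.0.113.10 exactly as if it were an alias
# on an Ethernet interface; which physical port the packets actually
# use is decided by the routing layer (OSPF or whatever IGP you run).
```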

Basically, for the issue at hand, there are two key bits:

Input traffic from an Ethernet network to a host is controlled by ARP. The ARP subsystem publishes MAC addresses in response to requests from the Ethernet network, and this happens infrequently (meaning far less than once a second). ARP controls packet ingress. The system ARP table maintains a list of learned and published addresses; when an ARP request is received, the system compares it to the system's interfaces and responds with the MAC address of the matching interface.

Now, this process works pretty much the way the OP would expect, but it can be subverted. For example, if I have a server with two interfaces, em0=192.168.1.1/24 and em1=192.168.1.2/24, and I set a client's ARP table with a manual entry for 192.168.1.1 pointing at the MAC address of the server's em1, then traffic for 192.168.1.1 from client to server enters the server on em1. And everything works fine. The UNIX networking layer doesn't think this odd or anything; even if you have a userland app that is bound to 192.168.1.1, it all works.
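The ARP-subversion experiment described above can be reproduced from a client with a static ARP entry. The addresses match the em0/em1 example; the MAC address is made up:

```shell
# On the client: force 192.168.1.1 to resolve to the MAC of the
# server's em1 (the MAC shown is hypothetical).
arp -s 192.168.1.1 00:25:90:aa:bb:02
# Traffic for 192.168.1.1 now enters the server on em1, and the
# server's network stack serves it without complaint.
```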

Output traffic to an ethernet network is controlled by the routing table, and the routing table is keyed by destination IP address. Basically when you do a "route get ip.ad.dr.ess" the system does a routing table lookup, similar to what happens when a packet is output. The source IP address isn't considered because IP routing happens based on destination. So as long as the routing code picks AN interface that's on the network in question, packets will be delivered, and that's pretty reasonable.
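You can watch this destination-keyed lookup for yourself; the exact output format varies by FreeBSD version, so treat this as illustrative:

```shell
# Ask the kernel which route (and therefore which interface) it would
# use for a destination. Note the query contains no source address.
route get 192.168.1.50
# The reply names exactly one interface for that destination, no
# matter how many of your interfaces sit on 192.168.1.0/24.
```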

If you want to have multiple interfaces on a network, you should use link aggregation.

If you want to have multiple IP addresses on a network, you should define ifconfig aliases.

You can do other things, but then you're fighting the way it was designed, and then it may not work as you expect.

I do not wish to entertain a debate as to whether or not this is "right" or "wrong." It is the way that modern UNIX systems work, and such debate would be pointless. I am happy to discuss ways to do your IP networking within this framework to make all your systems happy though.

Updated 06/2016; bonus points and thanks to @Mirfster for tenacious Google-fu.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Good post jgreco
 

tanik1

Contributor
Joined
Mar 31, 2013
Messages
163
ok that clears it up i guess. thanks
 

FreeNASftw

Contributor
Joined
Mar 1, 2015
Messages
124
Thank fk you posted this (I know I'm late finding it) - I'll be saving this link for future 'discussions' on the topic!
 

CSP-on-FN

Dabbler
Joined
Apr 16, 2015
Messages
15
Yes, I'll say thank you too for this superbly full explanation.

Shouldn't this be made a 'sticky' somewhere? (Ooops! It's already pinned.)

Yes, I had installed two NICs in my own FreeNAS box, then given each of them a different IP address on the same subnet -- and connected them both to my HP 1810 switch. So I'd done exactly what I *now* know was (on a FreeNAS server) the *wrong* thing to do!! So - until I solved this problem myself, eventually - I saw for myself the weird network-crippling symptoms caused by trying to run with a double-NIC arrangement.

I had searched hard for clarity on this (double-NIC) theme a few months ago but found almost nothing specific to FreeNAS. (I'd obviously not discovered this post from jgreco.) This post would have saved me many, many hours of head-scratching, network diagnosis and scouring of my logs if I'd discovered it earlier! On the plus side, there's that usual silver lining that comes from battling to fix any weird bug ... In other words, I'm now better educated on Ethernet Layer 2 Flow Control and 802.3x Pause Frames - none of which I'd had to battle with (at any real depth) during my two decades or so working on Novell networks and Windows networks!

(Before raising a query here on this Forum to ask for help with this (around January), I wanted first to build a decent picture of the problem, and to be able to list all the tests I'd made to prove that it wasn't caused (for example) by a ZFS ARC issue, or a disk-write problem, or a cabling problem ... etc., etc., but I'd discovered the fix and cured it myself before I'd reached the point of shouting for help.) (And anyway - when circumstances allow - I always prefer to 'know' my own problems in depth before asking for help on them :smile: )

I've waffled far too long below on the problems I got by running this double-NIC arrangement, but now - of course - I'm using just *one* NIC on my server, and everything is back to full speed and running trouble free.

Here's the TL;DR blurb ...

After many months of trouble-free networking with this double-NIC arrangement (which was weird, in retrospect!), but corresponding roughly with one of the FreeNAS system updates, I started getting huge drop-offs in data-writes to the FreeNAS server - and *only* to my FreeNAS server. The transfer (write) rates would drop from circa 750 Mbits/s to circa 40 Mbits/s, yet (in my interpretation) there were no relevant error messages, and no obvious clues in my FreeNAS logs of complaints or clashes or problems ... as to why these drop-offs were occurring. (And remember, these double-NICs had been (apparently) trouble-free for over a year beforehand.)

I did discover that by pausing my data-transfer to the server for around 20 or 30 seconds, I could then resume the transfer at full speed - but that full-speed resumption only lasted for another minute or so - and then it dropped off again.

To cut this long story short(er) - by watching the Port Status pages on my HP Switch I saw that the Port to which my FreeNAS server was connected was showing high numbers of Ethernet Pause Frames (aka Pause Packets) being received *and* transmitted, accompanied by a similar (correlated) rise in the reported "Packets Received With Error" (on the same Switch Port). In other words, I was seeing the effect of Level 2 / 802.3x Ethernet Flow Control on my network.

However, this wasn't just a matter of disabling the HP 1810 Switch option labelled as "Flow Control" (it's OFF by default anyway), because with Flow Control turned OFF on my Switch, these Ethernet Layer 2 Pause Frames would be propagated (correctly) to all of the other NICs on my network, causing my whole system (on that subnet) to grind to a halt. I had to keep Flow Control turned ON on my Switch so that this problem affected only my FreeNAS server.

Yes, while testing, I did try lots of permutations of turning Flow Control ON and OFF on (some of) the NICs on my other Windows computers, and - because I could find no other way to toggle the Flow Control setting on my FreeNAS NICs - I toyed with asking a question here about using the FreeNAS GUI Tunables feature to get the equivalent of the (BSD-specific?) "fc=0" (Flow Control OFF) on my FreeNAS NICs! I even tried adding "dev.igb.0.fc=0" and "dev.igb.1.fc=0" to my server's /etc/sysctl.conf file (and rebooting) - but (of course, in hindsight) none of these tweaks made any difference. (I'm certain the fc=0 entries in my /etc/sysctl.conf file were ignored by FreeNAS anyway, so I removed them.)
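For anyone landing here with the same problem: on stock FreeBSD the igb(4) driver does expose flow control as a runtime sysctl (whether the FreeNAS releases of that era honored it when set via /etc/sysctl.conf is another question), along these lines:

```shell
# Inspect and change 802.3x flow control on an igb port at runtime
# (0 = off, 3 = full rx/tx pause, per the FreeBSD igb driver).
sysctl dev.igb.0.fc          # show the current setting
sysctl dev.igb.0.fc=0        # disable pause-frame flow control
# To persist across reboots, the same assignment goes in sysctl.conf:
#   dev.igb.0.fc=0
```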

I'll spare you any more detail here because (yawwnnn!) my many "what if" tests and my head-scratching and my note taking on this problem spanned many weeks on and off, and I never did build the coherent picture of this problem that I'd want to share on this Forum. My 'findings' are now all moot anyway :smile: and my interim findings would probably raise more questions than they'd answer.

Apologies for such a long post, and I salute you if you've read down this far!!!
Colin P.

(Cosmetic edit on 11 October 2017 just to remove some of my over-used single-quotes.)
 

OBRI

Dabbler
Joined
Apr 15, 2017
Messages
30
Hello jgreco,

many thanks for this detailed post.
I'm not sure if I understood it right, so I have a few question marks in mind. Perhaps you can help me.

In my configuration, I have a server mainboard with two different network chips (hmm, three with IPMI).

My idea was to use igb0 only for the internal traffic in my home net (SMB, FreeNAS WebGUI) and igb1 only for the external connection to the internet (sending the status e-mails, updating FreeNAS, ..), for security reasons.

So I blocked igb0 in my router (no connection to the internet) and bound the SMB service and the FreeNAS WebGUI to igb0.
And then I configured igb1 in my router to allow outgoing connections.

But this doesn't work. FreeNAS can't reach the internet, and I can reach the WebGUI via both igb1 and igb0.

I'm not really sure if the reason is what you explained above. But if it is, how can I reach my goal?

I would like to use one network card only internally (home net) and one for the external traffic. Is this possible?

Thanks in advance!

My System is:

obri
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,828
I'll only add that I had a LAGG set up as a failover, with the Chelsio 10G as the primary interface and the on-board 1G as a backup (set up via console). That worked really well until a system update made my FreeNAS unreachable. The only option was to nuke the LAGG.

No amount of resetting all network parameters (i.e. the settings for IGB0, CXL0, LAGG) solved the issue. The failover LAGG on my machine is currently dead and the only active network interfaces are the Chelsio 10G + IPMI.

I thought using the LAGG this way was a 'legal' approach to having multiple network interfaces connect to the same subnet?
 

RegularJoe

Patron
Joined
Aug 19, 2013
Messages
330
Well, the new standard TRILL is being supplanted by SPB (802.1aq), which has been used in some large venues. I can't wait for FreeNAS to implement Open vSwitch so that we can start doing this with FreeNAS!
 

magev958

Cadet
Joined
Feb 1, 2017
Messages
2
Did you get this idea to work? I would like to do the same, with one NIC for internal traffic and one for the internet.

 

RegularJoe

Patron
Joined
Aug 19, 2013
Messages
330
If you have a smart switch (Cisco), just put both in an EtherChannel and use 2 VLANs. If your EtherChannel load balance is src-dst-ip you should get some load distribution. If you're doing anything on the internet you might want to run a FreeBSD-based firewall; pfSense is the gold standard and works well. I use both on VMware, so if you have the RAM you can run both on one piece of hardware. I trust VMware interfaces, whereas with Windows and Linux interfaces I am too afraid they are going to someday default and start leaking data to the outside world.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
I thought using the LAGG this way was a 'legal' approach to having multiple network interfaces connect to the same subnet?

The setup described is a single network interface on the subnet - the LAG (IEEE 802.3ad) itself. That's the whole point. How packets get onto and off the network under the sheets is irrelevant, which is why this should be done via this type of abstraction.

As for why it didn't work? It's quite possible that mixing and matching interfaces broke something. LACP itself requires matching port speeds within a group, and experience with FreeBSD suggests that you should also use the same interface type (igb, em, ix, etc.) of quality server-grade Ethernet hardware. That's really the only way I'd expect it to work reliably.
 

silversword

Dabbler
Joined
Nov 9, 2017
Messages
21
Ok, I've read this article at least 4 times...just wanted to confirm one thing:

So it's because of this:
Output traffic to an ethernet network is controlled by the routing table, and the routing table is keyed by destination IP address. Basically when you do a "route get IP.ad.dr.ess" the system does a routing table lookup, similar to what happens when a packet is output. The source IP address isn't considered because IP routing happens based on destination. So as long as the routing code picks AN interface that's on the network in question, packets will be delivered, and that's pretty reasonable.

is why, even if you have two NICs with isolated service bindings to specific IPs:
em0 = x.x.x.1 = SMB, APFS
em1 = x.x.x.2 = NFS, iSCSI

multiple NICs are still a non-supported configuration? Because even if the SAMBA service is only bound to one adapter, 'nix will ignore which service is bound to which adapter for outbound return traffic and output the data on whatever NIC the routing code happens to pick, so the client sometimes gets return traffic from unexpected source IPs and drops it?
 

RegularJoe

Patron
Joined
Aug 19, 2013
Messages
330
The best you can do is LACP/PAgP and EtherChannel and hope you have enough sessions to load up all your ports. With VMware and iSCSI (multiple subnets, RR, balance every xx packets, and do the stupid iSCSI port binding) you can make it work as well, but that is an edge case.
 

Alister

Explorer
Joined
Sep 18, 2011
Messages
52
I'm wondering if now that freenas supports bhyve VMs, this holds?

My box has a Realtek Nic [Disabled at present] and a Intel Server 1000 NIC installed. What I'd like to do is

Intel NIC -> FreeNAS
Realtek -> VM (probably Linux Mint but poss Win7)

They'll be on the same subnet, would this be better than just using the Virtual NIC and the Intel?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
is why even if you have two NIC with isolated service bindings to specific IP's:
em0 = x.x.x.1 = SMB, APFS
em1 = x.x.x.2 = NFS, iscsi

the multiple NIC's is still a non-supported configuration because even if the SAMBA service is only bound to one adapter 'nix will ignore which service is bound to which adapter for outbound return traffic and randomly output the data on whatever NIC happens to get it

It's because the idea of "bound to one adapter" is a fallacy. Services are bound to IP addresses through the system bind() function, which does not include any reference to a physical adapter. The network stack's job is to take care of interacting with the physical network. The network stack handles multiple interfaces on a single network just fine, but does not behave in the manner some people expect - in particular, it makes no guarantees as to which physical interface traffic will ingress or egress. It just guarantees it will appear on the appropriate network, as determined by how your system is configured. So if you want multiple interfaces on a single subnet, use LACP (for multiple physical interfaces) and/or IP aliases (for multiple IP addresses).
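One way to see that bindings are address-based rather than interface-based is FreeBSD's socket lister (what you'd actually see listed depends on the services you run):

```shell
# List listening IPv4 sockets. Each line shows ADDRESS:port - there is
# no column for a physical interface, because the socket layer doesn't
# record one.
sockstat -4 -l
# A service "bound to" an alias shows only that IP; nothing anywhere
# in the socket says "em0" or "em1".
```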

thus the client gets return traffic sometimes from unexpected source IP's and drops it?

The return traffic source IP created by the UNIX system should always be correct, so, no, that isn't an issue.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
I'm wondering if now that freenas supports bhyve VMs, this holds?

My box has a Realtek Nic [Disabled at present] and a Intel Server 1000 NIC installed. What I'd like to do is

Intel NIC -> FreeNAS
Realtek -> VM (probably Linux Mint but poss Win7)

They'll be on the same subnet, would this be better than just using the Virtual NIC and the Intel?
Have you tried this? Has it worked for you?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
I'm wondering if now that freenas supports bhyve VMs, this holds?

My box has a Realtek Nic [Disabled at present] and a Intel Server 1000 NIC installed. What I'd like to do is

Intel NIC -> FreeNAS
Realtek -> VM (probably Linux Mint but poss Win7)

They'll be on the same subnet, would this be better than just using the Virtual NIC and the Intel?

A virtual machine is a different machine. It's like putting your desktop and a laptop on the same network and wondering if this thread means there is a problem. The VM has its own network stack, and it therefore follows that none of the issues discussed herein are relevant.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Please, change to proper OS with proper network capability.
Its strange that only FreeNAS/FreeBSD can't use mutiple NICs on same subnet.
You may want to re-read this thread.
This is funny because BSD has been the network gold standard for so long and has been used by so many ISPs for core networking for so long...
 