
Multiple network interfaces on a single subnet

We've had a number of people over the years bring up the topic of multiple network interfaces on a single subnet.

This configuration is often attempted by Windows administrators or storage admins, where it works to varying degrees, but it is sadly not proper IP networking practice.

Multiple interfaces on a single network (broadcast domain) is officially supported via LACP (IEEE 802.3ad link aggregation). You configure a single lagg interface over several Ethernet interfaces, and that works. The lagg shows up to the OS as a single connection to the network.
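For the record, a minimal /etc/rc.conf sketch of such a lagg; the igb NIC names and the address are illustrative, so adjust them for your hardware:

```
# /etc/rc.conf -- two member ports aggregated into one logical interface
ifconfig_igb0="up"
ifconfig_igb1="up"
cloned_interfaces="lagg0"
# laggproto lacp requires the switch ports to be configured for 802.3ad too
ifconfig_lagg0="laggproto lacp laggport igb0 laggport igb1 192.168.1.10/24"
```

If the switch side isn't configured for LACP as well, the lagg will not come up.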

Multiple IP addresses on a single subnet are supported through IP aliases. You do not need multiple physical interfaces on the network. Multiple physical interfaces on the same network may not work the way you anticipate.
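A quick sketch of aliases in /etc/rc.conf, again with illustrative names and addresses; by convention, additional addresses on the same subnet take a /32 mask:

```
# /etc/rc.conf -- one physical interface, several addresses on one subnet
ifconfig_em0="inet 192.168.1.10/24"
ifconfig_em0_alias0="inet 192.168.1.11/32"
ifconfig_em0_alias1="inet 192.168.1.12/32"
```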

You can place the network IP and aliases on a LACP link and that works.
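The same alias syntax rides on the lagg, for example (illustrative):

```
# /etc/rc.conf -- extra address on the aggregated link
ifconfig_lagg0_alias0="inet 192.168.1.11/32"
```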

LACP has a notable limitation in that packets are not "load balanced" individually. Traffic for a given destination is deterministically hashed out a specific member interface in order to prevent out-of-order packet delivery. This means LACP is probably not useful for small deployments (fewer than 10 hosts, maybe) but is a massive win with large numbers of busy clients.
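You can influence which header fields feed the hash. A sketch, assuming a FreeBSD lagg named lagg0:

```
# Hash on L2+L3+L4 headers so that distinct TCP/UDP flows between the same
# pair of hosts can be distributed across different member ports.
ifconfig lagg0 lagghash l2,l3,l4
```

Even so, any single flow still rides a single member port; that's the point, since it preserves per-flow packet ordering.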

Now there will invariably be some people who will say ... "WRONG! I know that multiple interfaces are legal!" At which point we come to an impasse about what kind-of works versus what actually happens. People see terms like "bind to the IP address." What most people want this to mean is that when you bind to an IP address that's assigned to a specific card, traffic goes in and out that card. This doesn't happen. Sorry.

I don't actually care to debate with you why you feel multiple non-LACP interfaces "should" work a certain way. I don't actually care if you think it worked for you under Windows or on some storage platform. I'm here to document the way it works on FreeBSD, Linux, Apple, and most other UNIX variants. I didn't make these design decisions but I am spending some time here to help you understand them. So please don't belabor the point.

FreeBSD is a sophisticated, modern operating system, and traffic is routed through an abstracted networking subsystem. Multiple interfaces on a single subnet is an oxymoron: to make it "work right" that way, the network stack would need another selection layer to handle traffic delivery, in addition to the route table lookup. You essentially need a layer that notices that a given destination has multiple possible links, and then searches to see if any of those links is best, which adds a lot of complexity. Modern systems typically don't do this because it's slow, because standards such as LACP already exist to do it, because 10GbE interfaces are available, and because rational network designs like what I suggest above can use multiple separate subnets. There are multiple ways to "do it better."

Almost all modern UNIX operating systems use abstracted networking stacks. Output traffic is handled through the route table, so the outbound load balancing that you are hoping to see by putting two interfaces on a single network ... just doesn't happen. The authors of the modern stacks know 802.3ad exists; it isn't 1989 anymore and RFC1122 stupidity is pointless. Making multiple interfaces work through a SECOND mechanism - in particular the "route it out the same interface" mechanism many people in this situation seem to expect - would require each packet to be looked up again in a different way, dramatically reducing throughput.

Apple has their own explanation here: http://support.apple.com/kb/TS3679

Microsoft explains how this nonstandard configuration actually works on Windows: https://support.microsoft.com/en-us/kb/175767

National Instruments writes a small book on the topic: http://www.ni.com/white-paper/12558/en/

Ubuntu also has an explanation.

It's a terminology thing. People want to assume that when you bind to an IP address that's assigned to a specific card, traffic goes in and out that card. It does not.

You can certainly "bind to an IP address" in UNIX, but in FreeBSD it's an abstraction that makes no guarantees as to the physical handling of the traffic. This is really useful in many environments. For example, you can get network redundancy for a server by putting your service addresses on a loopback interface and then advertising that into your interior gateway protocol using OSPF or some other IGP. The address that all external traffic uses isn't configured on ANY Ethernet interface, yet userland applications "bind to the IP address" just as if it were an alias (or the primary address) on an Ethernet interface.

The point is that application-level binding is not closely coupled to physical interfaces. FreeBSD supports a rich set of network interfaces, including point-to-point links (PPP, SLIP, parallel port cable), ATM, etc., and the networking subsystem presents all of it as an abstraction to userland. Since so much of the IP configuration is driven by what's defined for physical interfaces, this leads to operational and terminology confusion.
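To make the loopback trick concrete, here is a sketch; the interface name, the address, and the choice of routing daemon are all illustrative, and the daemon configuration itself varies too much to show here:

```
# Create a second loopback and put the service address on it (FreeBSD).
ifconfig lo1 create
ifconfig lo1 inet 203.0.113.10/32
# A routing daemon (FRRouting, bird, etc.) then advertises 203.0.113.10/32
# into OSPF; clients reach the address via whichever physical path the IGP
# currently prefers, and no Ethernet interface carries the address directly.
```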

Basically, for the issue at hand, there are two key bits:

Input traffic from an Ethernet network to a host is controlled by ARP. The ARP subsystem publishes MAC addresses in response to requests from the Ethernet network, and this happens infrequently (far less than once a second). ARP controls packet ingress. The system ARP table maintains a list of learned and published addresses, and when an ARP request is received, the system compares the requested IP against its interface addresses and responds with the MAC address of the matching interface. This process works pretty much the way the OP would expect, but it can be subverted. For example, if I have a server with two interfaces, em0=192.168.1.1/24 and em1=192.168.1.2/24, and I give a client's ARP table a manual entry for 192.168.1.1 pointing at the MAC address of the server's em1, then traffic for 192.168.1.1 from client to server enters the server on em1. And everything works fine. The UNIX networking layer doesn't think this is odd; even if you have a userland app bound to 192.168.1.1, it all works.
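You can demonstrate that subversion from the client side; the MAC address below is made up and stands in for em1's real hardware address:

```
# On the client: force 192.168.1.1 to resolve to the server's em1 MAC.
arp -d 192.168.1.1
arp -s 192.168.1.1 00:25:90:aa:bb:cc
ping 192.168.1.1   # traffic now enters the server via em1, and it all works
```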

Output traffic to an Ethernet network is controlled by the routing table, and the routing table is keyed by destination IP address. When you do a "route get ip.ad.dr.ess", the system does a routing table lookup, similar to what happens when a packet is output. The source IP address isn't considered, because IP routing happens based on destination. So as long as the routing code picks AN interface that's on the network in question, packets will be delivered, and that's pretty reasonable.
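You can watch this yourself. Assuming the em0/em1 example above, ask the kernel what it would do for some host on that subnet:

```
# Which interface would the kernel pick for this destination?
route get 192.168.1.50
# Look at the "interface:" line of the output: there is one connected route
# for 192.168.1.0/24, so every outbound packet for that subnet leaves via
# that one interface, regardless of which local IP a socket is bound to.
```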

If you want to have multiple interfaces on a network, you should use link aggregation.

If you want to have multiple IP addresses on a network, you should define ifconfig aliases.

You can do other things, but then you're fighting the way it was designed, and then it may not work as you expect.

I do not wish to entertain a debate as to whether or not this is "right" or "wrong." It is the way that modern UNIX systems work, and such debate would be pointless. I am happy to discuss ways to do your IP networking within this framework to make all your systems happy though.

Updated 06/2016; bonus points and thanks to @Mirfster for tenacious Google-fu.
Author: jgreco


Latest reviews

Excellent overview of BSD-specific network design and operation. Confirmation that multiple interfaces on the same domain = aggregation (LACP primarily).
All I can say is that it works even on a Synology, it works on Windows (of course), it works on EMC SANs, IBM SANs, Pure Storage SANs, and Nimble SANs, and probably on many more platforms. AFAIK they are all Unix-based, and many of them are also hyper fast (you state that things get slow if the stack has to choose paths). In 20 years of enterprise IT, this is the first system I have encountered that doesn't allow this.

Show us the RFCs which prohibit this; maybe you're reading them wrong.
jgreco
Purpose-built SAN systems have custom IP stacks designed for high-performance direct I/O to minimize latency. As outlined in the resource, general-purpose Linux/FreeBSD/Windows/etc. have abstracted I/O stacks that allow all sorts of neat stuff like software interfaces for tunnels, but this comes at the cost of complexity. I included a number of references, and I don't feel the need to dredge through RFCs right now to prove what is generally accepted behaviour by the major IP stacks.
Excellent post, answered all my questions on the topic and a few I didn't know I should have been asking.