MPIO iSCSI Network Performance

Status
Not open for further replies.

Jeff Fanelli

Cadet
Joined
Sep 19, 2013
Messages
7
Folks, I've sent an apology to the moderator and jgreco, and I'd like to apologize here as well. I've asked jgreco to continue this (technology) thread offline for now :)

-jeff
 

viniciusferrao

Contributor
Joined
Mar 30, 2013
Messages
192
I'm replying to this thread just to get help from experts.

In a scenario where I want to use 4 network interfaces on the storage, 2 switches, and 2 network ports on each host, how should I handle this?

It's clear to me that I should use 4 different subnets on the storage, but how should I handle the traffic? Routing isn't an option either, right?

Perhaps the right option is to create four VLANs, put each storage interface in one of those VLANs, split two VLANs to each switch, and finally configure the host gigE ports as VLAN trunks carrying two VLANs each.
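Roughly what I have in mind, as a sketch; the interface names, VLAN IDs and subnets below are just placeholders I made up:

```python
# Hypothetical sketch of the four-VLAN idea; interface names, VLAN IDs
# and subnets are placeholders, not a real configuration.
storage_ports = {
    "igb0": {"vlan": 101, "switch": "sw1", "subnet": "10.0.101.0/24"},
    "igb1": {"vlan": 102, "switch": "sw1", "subnet": "10.0.102.0/24"},
    "igb2": {"vlan": 103, "switch": "sw2", "subnet": "10.0.103.0/24"},
    "igb3": {"vlan": 104, "switch": "sw2", "subnet": "10.0.104.0/24"},
}

# Each host has two gigE ports, one per switch, trunked for the two
# VLANs that live on that switch.
host_ports = {
    "vmnic0": {"switch": "sw1", "trunked_vlans": [101, 102]},
    "vmnic1": {"switch": "sw2", "trunked_vlans": [103, 104]},
}

# Any (host port, storage port) pair that shares a VLAN is an iSCSI path.
paths = [(h, s)
         for h, hp in host_ports.items()
         for s, sp in storage_ports.items()
         if sp["vlan"] in hp["trunked_vlans"]]
print(paths)  # four paths, two per host port
```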

I don't know if the idea is clear, or if it's right. That's why I'm looking for advice.

Thanks,
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's not clear to me that you want four different subnets. Maybe you do, maybe you don't.

You could do two LACP interfaces, each with two ethernet ports to one switch, and two subnets (one per LACP interface). This is easy to manage, and as long as there are more than a few hosts it probably works out well.
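As a rough sketch of how that first option behaves under failures (the lagg/igb names and subnets here are placeholders, not a recipe):

```python
# Rough model of the "two LACP bundles, two subnets" option; lagg/igb
# names and subnets are placeholders, not a configuration.
storage = {
    "lagg0": {"members": ["igb0", "igb1"], "switch": "sw1", "subnet": "10.0.1.0/24"},
    "lagg1": {"members": ["igb2", "igb3"], "switch": "sw2", "subnet": "10.0.2.0/24"},
}

def portal_up(lagg, failed_links):
    # An LACP bundle keeps passing traffic as long as at least one member
    # link is alive, so the iSCSI portal bound to it stays reachable.
    return any(m not in failed_links for m in lagg["members"])

# One physical link dies: the portal stays up and MPIO never sees a
# path failure at all.
print(portal_up(storage["lagg0"], {"igb0"}))           # True
# Both links in one bundle die: that portal drops and MPIO fails over
# to the portal on the other subnet.
print(portal_up(storage["lagg0"], {"igb0", "igb1"}))   # False
```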

You could also do something more complicated like what you suggest, which ought to work but has more moving parts (which means more things that could break).

Do not try to do any routing. Let the MPIO handle failures and if a subnet breaks, then let it break and go fix it.

Please do not resurrect old threads. It is called necro-posting and considered kind of rude.
 

zeroluck

Dabbler
Joined
Feb 12, 2015
Messages
43
If that were true, there wouldn't have been a pressing need for 802.3ad. The fact that some vendors have made a feature work does not make it a good idea; basically, if you look at what happens at layer 2, it becomes a royal mess, especially with an OS like FreeBSD, whose network stack is abstracted so that it isn't tied to a single type of networking hardware.

Almost all modern UNIX operating systems use abstracted networking stacks. Output traffic is handled through the route table, so the outbound load balancing that you are hoping to see by putting two interfaces on a single network ... just doesn't happen. The authors of the modern stacks know 802.3ad exists; it isn't 1989 anymore and RFC1122 stupidity is pointless. Making multiple interfaces work through a SECOND mechanism - in particular the "route it out the same interface" mechanism many people in your situation seem to expect - would require each packet to be looked up again in a different way, dramatically reducing throughput.
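If you want to see why, here is a toy model of the route lookup; it is not the actual FreeBSD code, and the addresses and interface names are made up:

```python
# Toy model (not the actual FreeBSD code) of why two NICs on one subnet
# don't give you outbound load balancing: egress is picked by a route
# lookup, and one connected route wins for the whole subnet.
import ipaddress

# Two interfaces on the same /24 both install a connected route for it,
# but only one entry is used when sending.
route_table = [
    (ipaddress.ip_network("10.0.1.0/24"), "em0"),   # first NIC
    (ipaddress.ip_network("10.0.1.0/24"), "em1"),   # second NIC, same prefix
]

def egress_for(dst):
    dst = ipaddress.ip_address(dst)
    matches = [(net, ifc) for net, ifc in route_table if dst in net]
    # Longest-prefix match; on a tie, table order decides, so every packet
    # to this subnet leaves through the same interface.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

for host in ("10.0.1.10", "10.0.1.20", "10.0.1.30"):
    print(host, "->", egress_for(host))    # always em0
```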

Multipath I/O on separate networks, preferably on separate switches and separate cards, gives you a heck of a lot of resilience, a feature you just don't get if you try to stick it all on one wire.

The fact that the GUI used to allow you to do it and doesn't now is unfortunate; it never should have allowed it in the first place, because it yields a broken networking configuration in a way you wouldn't expect.

Apple sums the topic up very nicely.

I am not going to follow up further on this topic. This is one of those things that's a matter of "what can be made to work" vs "what is correct." I too can make stupid things work. Non-ECC memory can work in a server. RAID5 can recover a 10TB+ filesystem. etc. Doesn't make it correct, and doesn't mean it will work reliably.

Thread necromancer here, but I'm deploying FreeNAS for some less-than-critical things at my company, so the subject has been on my mind a lot recently. I wanted to weigh in on this real quick...

Many vendors do this, and there is a redundancy benefit that hasn't been mentioned here. When you have 2 network cards on an ESXi host and 2 network cards on a SAN or NAS using MPIO and a single subnet, you have more paths available and thus more redundancy: each ESXi iSCSI NIC has two paths to the SAN. The downside of having separate subnets is that they can't see each other (without routing, which is not recommended for iSCSI), so if you lose one NIC on the SAN and one NIC on the ESXi host on opposite subnets, you have an APD (all paths down) situation. If you've ever had this happen with ESXi on iSCSI, you know that is a terrible thing: things start freezing and freaking out, orphaned LUNs, etc. It's a mess. As unlikely as that is to happen, people who have worked in or run their own datacenter know that if it can happen, eventually it will.
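To put numbers on that, here is a toy sketch of the path counting; the NIC names and subnet labels are made up:

```python
# Toy sketch of the path counting above; NIC names and subnet labels
# are made up for illustration.
def paths(host_nics, san_nics):
    # A path exists when the host NIC and SAN NIC share a subnet.
    return [(h, s) for h, hn in host_nics.items()
                   for s, sn in san_nics.items() if hn == sn]

# Single subnet: every NIC reaches every NIC -> 4 paths.
single = paths({"vmk1": "A", "vmk2": "A"}, {"san0": "A", "san1": "A"})

# Two subnets: 2 paths. Lose vmk1 and san1 (on opposite subnets) and
# nothing survives -> all paths down.
dual = paths({"vmk1": "A", "vmk2": "B"}, {"san0": "A", "san1": "B"})
surviving = [(h, s) for h, s in dual if h != "vmk1" and s != "san1"]

print(len(single), len(dual), len(surviving))   # 4 2 0
```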

That said, I would challenge the statement "just because vendors do it doesn't make it a good idea". While that's true, it implies they are doing something that is flawed. That's not true; they understand the problem and have worked around it to make it easier to configure and more redundant, which is a benefit for their customers.

Apple's explanation of this topic is somewhat simplified and doesn't really explain anything (what did we expect, it's Apple). What they are actually talking about is something called ARP flux. ARP flux can be avoided with careful configuration, and I can almost guarantee that is how storage vendors are doing it, since it goes without saying that these SANs are running some kind of UNIX or Linux at the core (mine are, I can SSH into them).

Here are a few articles that explain, with examples, how ARP flux works and how to get around it: http://www.jefflane.org/multiple-nics-same-subnet-avoiding-arp-flux
http://blog.cj2s.de/archives/29-Preventing-ARP-flux-on-Linux.html
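For reference, this is roughly what those articles boil down to on Linux, sketched as a tiny script; it's illustration only, not a FreeNAS recommendation (FreeNAS is FreeBSD and has no /proc/sys/net/ipv4):

```python
# Illustration only, not a FreeNAS recommendation (FreeNAS is FreeBSD and
# has no /proc/sys/net/ipv4): the Linux knobs the linked articles use to
# reduce ARP flux when several NICs share a subnet.
from pathlib import Path

settings = {
    # Answer ARP only if the target IP is configured on the interface
    # the request came in on.
    "/proc/sys/net/ipv4/conf/all/arp_ignore": "1",
    # Pick the best local source address for ARP announcements instead of
    # whatever address happens to be on the sending interface.
    "/proc/sys/net/ipv4/conf/all/arp_announce": "2",
}

for path, value in settings.items():
    Path(path).write_text(value)   # requires root on a Linux box
    print(path, "=", value)
```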
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Thread necromancer here, but I'm deploying FreeNAS for some less-than-critical things at my company, so the subject has been on my mind a lot recently. I wanted to weigh in on this real quick...

Many vendors do this, and there is a redundancy benefit that hasn't been mentioned here. When you have 2 network cards on an ESXi host and 2 network cards on a SAN or NAS using MPIO and a single subnet, you have more paths available and thus more redundancy: each ESXi iSCSI NIC has two paths to the SAN. The downside of having separate subnets is that they can't see each other (without routing, which is not recommended for iSCSI), so if you lose one NIC on the SAN and one NIC on the ESXi host on opposite subnets, you have an APD (all paths down) situation. If you've ever had this happen with ESXi on iSCSI, you know that is a terrible thing: things start freezing and freaking out, orphaned LUNs, etc. It's a mess. As unlikely as that is to happen, people who have worked in or run their own datacenter know that if it can happen, eventually it will.

No, that's just the idiot's way to do it. You can use multiple portals and LACP, and that works just as well - actually better, because the paths don't go down.

That said, I would challenge the statement "just because vendors do it doesn't make it a good idea". While that's true, it implies they are doing something that is flawed. That's not true; they understand the problem and have worked around it to make it easier to configure and more redundant, which is a benefit for their customers.

Right, and then when a standards-based method to do this came around, they said "oh, no, we've already got this figured out, we're not going to bother with that."

It certainly isn't MORE redundant, because with LACP you can actually have both portals remain active even with just a single physical interface left in each bundle. So that's just a silly claim.

Apple's explanation of this topic is somewhat simplified and doesn't really explain anything (what did we expect, it's Apple). What they are actually talking about is something called ARP flux. ARP flux can be avoided with careful configuration, and I can almost guarantee that is how storage vendors are doing it, since it goes without saying that these SANs are running some kind of UNIX or Linux at the core (mine are, I can SSH into them).

Here are a few articles that explain, with examples, how ARP flux works and how to get around it: http://www.jefflane.org/multiple-nics-same-subnet-avoiding-arp-flux
http://blog.cj2s.de/archives/29-Preventing-ARP-flux-on-Linux.html

Not useful, and also not the real problem - FreeBSD does allow you to nail ARP to a specific interface, but it's messy, nongeneralized, and generally a bad thing to do. The real problem here is outbound interface selection. There are solutions to that problem too, but again: messy, nongeneralized, and generally a bad thing to do. The underlying problem is that fighting the modern networking paradigm is itself a bad idea.

Put differently: my dad drove stick, and I learned to drive stick, but the new cars are automatics, and I don't try to manually shift gears on them - my Highlander Hybrid doesn't even *have* gears (it's a CVT).

Use your equipment the way it was designed to be used to minimize the chances something bad will happen. Again, this is a boring conversation and everything you really need to know has already been summarized for you:

https://forums.freenas.org/index.php?threads/multiple-network-interfaces-on-a-single-subnet.20204/

Stupid IP tricks are a dime a dozen. If you want to see stupid IP tricks, sometime ask me about guerilla subnetting of a Class B (really!) IP space. The real networking guys know it is better to work within the standards to reduce the likelihood of inadvertent cockups. The standards were devised because people wanted frameworks that would work in a portable manner, not just stupid hacks to make things work in specific cases.
 