Super Weird Networking?

Joined
Oct 2, 2019
Messages
9
I posted before on this... problem... and didn't get any response.

And then I figured out how to make it work, or so I thought, and that solution worked for 2/6 servers, a completely different solution connected a third, and nothing seems to work for the remaining three.

So I stand up a new FreeNAS box built on an R510 with 12x 3TB disks. I follow the guides and think I've done pretty well. I give FreeNAS an available IP on the existing storage network and add it to the Dynamic Discovery of a host that's already working on the storage network with a Dell EqualLogic. It finds nothing new. So I play around with stuff for a while, try moving the FreeNAS to a new network (172.31.1.0/24 instead of the 172.31.0.0/24 the existing Storage Network is on). Set up an ESXi host to match. Nada.

What ends up working is a second "test switch" and FreeNAS/ESXi hosts on a different storage network (not sure if the second part was important, I'm shooting in the dark because that's all I have.) Then I decide to try moving all the connections on the test switch back to the storage network switch and lo and behold, it still works. (The storage switch isn't flat and it won't talk to me at its IP address for some reason and I just haven't broken down to shove a console connection into it. No idea what might be going on inside it.)

I figure I have it licked - just duplicate the working settings from one host to another and done, right? Except no, none of those hosts will work.

I take an existing physical interface, remove it from the vSwitch it's on, create a new vSwitch, add all the networking and that now-spare physical interface, add it to the software iSCSI, add the Dynamic Discovery IP of the FreeNAS... nothing. And I have two hosts that have separate vSwitches for the FreeNAS and the existing SAN, and one that has both on the same vSwitch (with different IPs on the VMkernel ports). No idea which way is "right" (if any), but both work.

I try plugging in a cable to a previously unused NIC interface on one host and creating a brand-new vSwitch for it. Set up in Storage Adapters, nothing again.

So I get to thinking, what's different about these three from the other three? The only thing I can think of is that all three of those started working when they were plugged into that test switch with the FreeNAS.


Someone out there has to know more about this than me (can't be all that many who know less.)

Do separate SANs have to be on separate networks? Do I need to have separate switches, or is it possible/likely there's something in the switch config on this old storage network switch that keeps hosts from finding the FreeNAS but doesn't stop them from staying connected once they have?

I'm about 90% sure I've convinced the guy in charge he's gotta spring for a 10GbE switch that should solve all these problems as well as making the entire network not suck in all caps, but the other 10% is largely based on me demonstrating that everything works and the only thing holding it all back from being awesome is the 10 gigabit network. This is one of those jobs where I'm here to fix Z, but we have to convince them to fix A-Y, sequentially, and individually, while making it look like I'm totally focused on Z the whole time. "Right, so we've made great headway on Z. You can see by this graph and the multi-color pie chart we're nearly there, but over here that problem J is holding things back."

FreeNAS was the answer to the "can't you just put bigger disks in the EqualLogic?" Oh, you mean the 7-year old SAN with a failed drive that has more data on it than the collective remainder of all your storage capacity in the building? Actually, I have a better idea... have you ever heard of open source?
 
Joined
Oct 2, 2019
Messages
9
No, we've implemented a series of different work-arounds we try until one works. We do not have a consistent solution that works for all hosts.
 
Joined
Dec 29, 2014
Messages
1,135
Is it possible you have a mix of hosts/network ports where some allow jumbo frames, and some do not? There are far fewer moving parts if the NAS and the hosts are on the same network. If that isn't working, it would seem that there is something odd either in the interface configs of the NAS or hosts, or perhaps the switch ports. Whether jumbo frames have to be enabled on a port-by-port basis depends on the individual switch. I have seen in a number of customer environments that trying to do jumbo frames where something along the path doesn't support them is always a big mess. For that reason, I always steer away from jumbo frames. I can get reasonably sustained bursts of 9G+ reads off my FreeNAS without jumbo frames.
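
If you want to test for an MTU mismatch before touching the switch, the usual trick is a ping with the "don't fragment" bit set and a payload sized to exactly fill the frame. The largest payload is the MTU minus the 20-byte IPv4 header and the 8-byte ICMP header. Here is a rough Python sketch that does the arithmetic and prints a candidate vmkping command to run from the ESXi shell. The target address 172.31.1.10 and the VMkernel port name vmk1 are placeholders I made up for illustration, not values from this thread, so substitute your own.

```python
# Header sizes are fixed by the protocols:
# 20 bytes for an IPv4 header without options, 8 bytes for an ICMP echo header.
IPV4_HEADER = 20
ICMP_HEADER = 8


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    if mtu <= IPV4_HEADER + ICMP_HEADER:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return mtu - IPV4_HEADER - ICMP_HEADER


def vmkping_command(target_ip: str, mtu: int, vmk: str = "vmk1") -> str:
    """Build a vmkping command line for the given MTU.

    -I selects the VMkernel interface, -d sets the don't-fragment bit,
    -s sets the payload size. target_ip and vmk are hypothetical placeholders.
    """
    return f"vmkping -I {vmk} -d -s {icmp_payload_for_mtu(mtu)} {target_ip}"


if __name__ == "__main__":
    # Standard frames first, then jumbo frames.
    for mtu in (1500, 9000):
        print(vmkping_command("172.31.1.10", mtu))
```

If the 1500-byte test gets replies and the 9000-byte test does not, something along that path (vSwitch, VMkernel port, physical switch port, or the NAS interface) is most likely not passing jumbo frames.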
 
Joined
Oct 2, 2019
Messages
9
I bet that's what it is. Since I can't read the config on that switch I can't be sure, but that would certainly explain why some hosts work without any trouble, and others don't seem to work no matter what I do - some of the ports are set up for jumbo frames and some aren't, and since all the settings for the old SAN were configured for jumbos I set everything for the new one up the same way.

It'll probably be a little while before I can take things down for maintenance again to fix it properly, but I'll let you know how it goes.
 
Joined
Oct 2, 2019
Messages
9
Sorry I didn't update this sooner.

I'm pretty sure that was the problem - we ended up running the FreeNAS entirely on 10GbE, got new switches, NICs, and cables to support it and everything works as you'd expect it to. For a little while we ran it smoothly (though slowly) on 1GbE by ditching the jumbo frames and that worked too.

We could probably run both FreeNAS and the old SAN on the same logical network but ultimately decided it just made more sense to carve out a new Class C just for it.

Thanks for the help!
 
Joined
Oct 2, 2019
Messages
9
Nope, just old enough to have habits.

I generally type like I talk and I talk to a lot of suits on a daily basis, so saying "Class C" is less "techy" than saying "slash 24". I can tell them that Class A is huge, Class C is small, and Class B is in the middle. The explanation for "slash 24" goes into what a subnet is and the people with all the decision power over whether or not the project gets approved, funded, and so on go glassy-eyed and lose interest.

It's fun to use exclusionary vocabulary, but in the interest of making the business function more effectively I've been training myself out of that habit.
 
Joined
Dec 29, 2014
Messages
1,135
Joined
Oct 2, 2019
Messages
9
What's the subnet mask of a Class C network? And what's the subnet of /24 CIDR notation?

The entire point I was making was that in the interest of making my job easier, the company I work for run more smoothly, and generally not being a dick to everyone who isn't an IT pro, I have the habit of using language that makes IT more accessible to the non-IT people in the room.

If you want to quibble over details of whether it's technically correct to call a /24 network a "Class C" network because that's outdated and/or doesn't use the correct private IP address for a Class C network that's fine. I'm just going to continue trying to make everyone where I work operate as a team instead of perpetuating an Us/Them attitude between the IT and non-IT people. If that means that sometimes someone gets to play the "well that's not entirely accurate" game because my habit is to use more accessible language that's fine.

Most places it's not the IT guys who are in charge of which IT guys get laid off. The guys who do make those decisions want simple answers, and they do not care what the technical distinctions between Class C and /24 are. They just know one of them sounds a lot simpler to understand.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
A Class C address actually refers to a specific portion of the IPv4 address space (first octet high bits 110), see RFC 791, not just the size of the block.

It's not possible to have a Class C address in, for ex., Class A "10-net" space or Class B "172.16-net" space. You can absolutely allocate a /24 chunk of 172.31-net space, but it's not "carving out a Class C". It's "carving a /24 out of 1918 Class-B space".
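
To make the distinction concrete, here is a small Python sketch using the standard `ipaddress` module. The `classful_class` helper is my own illustrative function, not part of any library. It reads the historical class off the high bits of the first octet, while the prefix length (/24) is a separate property of the network.

```python
import ipaddress


def classful_class(address: str) -> str:
    """Return the historical address class based on the first octet's high bits.

    Illustrative helper only. RFC 791 defines the A, B, and C formats;
    D and E were assigned out of the remaining space later.
    """
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet >> 7 == 0b0:
        return "A"
    if first_octet >> 6 == 0b10:
        return "B"
    if first_octet >> 5 == 0b110:
        return "C"
    if first_octet >> 4 == 0b1110:
        return "D"
    return "E"


net = ipaddress.ip_network("172.31.1.0/24")

# The mask comes from the prefix length, not from the address class.
print(net.netmask)                                          # 255.255.255.0
# The class comes from the address bits, not from the mask.
print(classful_class(str(net.network_address)))             # B
# The /24 sits inside the RFC 1918 block 172.16.0.0/12.
print(net.subnet_of(ipaddress.ip_network("172.16.0.0/12")))  # True
# A /24 from 192.168-net has the same mask but really is in Class C space.
print(classful_class("192.168.1.0"))                        # C
```

So 172.31.1.0/24 and 192.168.1.0/24 share a 255.255.255.0 mask, but only the second lives in what used to be Class C space.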

I don't really care what your organizational issues are, and you are free to perpetuate misinformation within your team. Here, however, in these forums, misinformation is typically corrected so as not to perpetuate incorrectness. Many of the users here are hobbyists and actively learn their networking and server stuff from posts here. "Using language that makes IT more accessible to the non-IT people" is a poor excuse. In such a world, you are equating a car, a pickup, and a van because they have four wheels and an engine. I find most people can understand the difference between things if it is appropriately explained, so there's a lot of that that goes on here.

Please expect that you may be corrected if you post technically inaccurate things, and try not to take such offense.
 