Help speed up my Dell C6220 hypervisors

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I'm using three nodes of a Dell PowerEdge C6220 unit as a Proxmox VE cluster (which uses my TrueNAS CORE box for some of its storage, hence a tenuous connection to this forum). ISOs and some low-demand VM disk images are stored on the TrueNAS server, but most of the live disk images are stored on a Ceph pool consisting of three 2 TB Samsung 860 EVO SSDs, one in each node. The nodes are linked together via 10 Gb fiber with Chelsio T420 cards. And while performance is a whole lot better than it was with spinners, I'd still like to speed it up further--and also expand the VM disk storage.

If I want faster performance than SATA SSDs, the obvious (AFAIK) next step is NVMe SSDs, but these servers don't have any NVMe support (and that also means that 4x 2.5" bays per node are pretty useless, and also that I lose hot-swap capability for my VM storage devices). No problem, there are PCIe-NVMe cards--but the servers only have a single PCIe slot (3.0, x16, half-height), which is currently occupied by the NICs. Well, there's a solution to that too: the servers have "mezzanine" slots, and there are Intel X520 NICs that fit them. They're on eBay and not too expensive, though units with the bracket I'd need do seem to cost more.

So that's sorted, then. Buy mezzanine X520 cards x3, buy PCIe-NVMe cards x3, buy NVMe SSDs x3, profit! Well, opposite of profit, actually, but... It's getting expensive, but it seems straightforward enough. The last sticking point is that I'm not seeing cards that would let me use two NVMe drives on a single card. I'm seeing plenty of cards with two slots, but the second slot is SATA-only and needs a cable run to a SATA port on the mainboard--which seems to completely defeat the purpose of the whole exercise. So I think that leads to the real questions:
  • Are there cards which would let me use two NVMe SSDs in a single half-height PCIe slot?
  • Alternatively, am I going about this in completely the wrong way, and should I be thinking in a different direction entirely?
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
are you sitting down?
Well, I don't normally use the computer standing...
How about:
That does look like it should do the trick. The card that would handle four would be great, but there isn't room in this system for that.
Does the server support bifurcation?
This sounds like an important question, and I don't know--should it be listed on the datasheet or something? I don't recall seeing it, but I'll check.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
It's a very important question. If yes, then you can use nice cheap cards.
If no, then you need a PCIe switching card, and these are a lot more expensive; it needs a PLX/PEX-type switch chip.
Supermicro make one (I don't know the part number).
Intel used to make one, the AXX4PX8HDAIC, which, if you can find one, ought to suit you perfectly: it's low profile and reportedly handles four NVMe drives.

Some background info: https://forums.servethehome.com/ind...apters-that-do-not-require-bifurcation.31172/

You also might need M.2 to U.2 adapters
 
Last edited:

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Thanks. So it sounds like the options are:
  • Single NVMe adapter. Cheap, and doesn't need any special motherboard/chipset/BIOS features to work.
  • Non-switching multiple-NVMe adapter. Cheap, but needs bifurcation support. Based on what I've been able to find about this feature, if the platform supports it, there should be BIOS settings controlling it; I don't see anything in the product manuals about it, which doesn't seem encouraging. (A quick way to check whether bifurcation actually took is sketched at the end of this post.)
  • Switching multiple NVMe adapter. Expensive, but doesn't need any special features.
I also looked up C6220 and bifurcation, and it's a definite maybe.
I tried searching and, while Google is giving hits, I haven't found anything that actually contains "C6220" (which actually should be "C6220 II"; I'd forgotten I had the later version) and "bifurcation." Sounds like time to start looking through the BIOS pages--at least I have a spare node I can do that with, so I don't have to take down a Proxmox node.
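For whenever a multi-drive card does go in, a rough way to confirm that bifurcation actually took would be to check that each NVMe controller trained at x4. Here's an untested sketch of what I have in mind (assumes a Linux host with pciutils installed, run as root so lspci can report link status; nothing about it is Proxmox-specific):

```python
#!/usr/bin/env python3
"""Rough sketch: list NVMe controllers and the PCIe width each one trained at."""
import re
import subprocess

def nvme_link_status():
    # -D shows full PCI addresses; -vv includes the LnkSta line per device
    out = subprocess.run(
        ["lspci", "-D", "-vv"], capture_output=True, text=True, check=True
    ).stdout
    results = []
    for block in out.split("\n\n"):
        lines = block.splitlines()
        if not lines or "Non-Volatile memory controller" not in lines[0]:
            continue
        addr = lines[0].split()[0]
        m = re.search(r"LnkSta:\s*Speed\s+([\d.]+GT/s)[^,]*,\s*Width\s+(x\d+)", block)
        results.append((addr, m.group(1) if m else "?", m.group(2) if m else "?"))
    return results

if __name__ == "__main__":
    for addr, speed, width in nvme_link_status():
        note = "" if width == "x4" else "  <-- expected x4; check bifurcation/slot wiring"
        print(f"{addr}  {speed:>8}  {width}{note}")
```

My understanding is that on a passive multi-M.2 card without working bifurcation, only the first drive shows up at all--so "fewer drives than expected" is the real tell, and the width check is just confirmation.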
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
So, regarding bifurcation: I haven't looked at the details for Xeon E5 v1/v2, but I think the layout is the same as for v3/v4. That means that the CPU has three ports, two x16 and one x8, and all of them can bifurcate down to x4 lanes (so x4/x4/x4/x4 and x4/x4). Now, the major catch is that before the popularization of NVMe SSDs, the vast majority of applications would have a single PCIe device per slot - the only complication being slots that share lanes with others. So, the system firmware would not support configuring the CPU for weirder things like x4/x4/x4/x4, unless a root port was being shared by two on-board NICs and two x4 physical ports or something.
Dell Gen13 systems got a firmware update mid-cycle that exposed all the bifurcation options, but I don't think Gen12 was getting new features by then.

Now, I don't know what the basis for Dell's firmware is, but if it's some highly-customized AMI derivative, I suspect that configuring this by hand would not be difficult. Broadly speaking, the steps are:
  1. Decompile the firmware image using AMI's tools to analyze the settings that exist.
  2. Find the settings for the PCIe configuration and what options are available (steps 2 and 3 are sketched after this list).
  3. Figure out the addresses in the system firmware settings memory region.
  4. Manually edit them with arcane UEFI tools and confirm that the system works as expected and is stable.
  5. (optional) Make the changes persistent by creating a new firmware image and flashing it to your hardware.
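For steps 2 and 3, once you have a human-readable IFR dump of the Setup module (UEFITool plus one of the IFR extractor tools will get you that), a quick-and-dirty script can flag candidate options and the variable offsets you'd later poke. The dump format varies between tool versions, so treat the keywords and regex below as an illustrative sketch rather than something guaranteed to match your output:

```python
#!/usr/bin/env python3
"""Illustrative helper: grep an IFR text dump for bifurcation-ish options
and report the variable-store offset mentioned on each matching line."""
import re
import sys

# Heuristic keywords; adjust for whatever strings your dump actually uses.
KEYWORDS = ("bifurcat", "iou", "port config", "x4x4")

def scan(path):
    hits = []
    with open(path, encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh, 1):
            low = line.lower()
            if not any(k in low for k in KEYWORDS):
                continue
            # Dumps typically mention something like "VarOffset: 0x1A3" -- illustrative only.
            off = re.search(r"(?:VarOffset|Offset)[:=]\s*(0x[0-9A-Fa-f]+)", line)
            hits.append((lineno, off.group(1) if off else "?", line.strip()))
    return hits

if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: scan_ifr.py <ifr_dump.txt>")
    for lineno, offset, text in scan(sys.argv[1]):
        print(f"line {lineno:>6}  offset {offset:>8}  {text}")
```

The offsets it turns up are what you'd then change with the usual UEFI variable-editing tools in step 4.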

Sidenote: I'm not entirely sure how these things work on newer systems that expect them. I can imagine a number of scenarios:
  • It doesn't, and relies on manual configuration. Cheap, easy, most likely to work.
  • PCIe init trickery, like trying to init x4/x4/x4/x4 first and then falling back to x8/x4/x4, etc. Messy, likely to break things.
  • SMBus trickery. I seem to recall some vendors using small I2C EEPROMs to identify M.2 and similar breakout cards to the system firmware. Very much non-standard, but reliable within an ecosystem.

Very hacky idea, moderately crazy too:
(and that also means that 4x 2.5" bays per node are pretty useless, and also that I lose hot-swap capability for my VM storage devices)
IF the Gen13 equivalent of your system has a backplane in the same format that takes PCIe SSDs, you can try acquiring a few and swapping them out. They should be U.2, so PCIe would be wired in parallel to SATA/SAS. That would give you access to four x4 SSDs by using a U.2 adapter instead of an M.2 one.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
@danb35 So - how did it all turn out?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Thanks for the reminder--I got busy with other things and forgot to post back here. I don't see anything in the BIOS settings that looks anything like bifurcation, and I am running the latest BIOS version. I'm not sure I'd be up for building my own custom BIOS, which seems to be what Eric's suggesting. In that case, the cheap multi-NVMe cards seem to be out; my options would be a single-NVMe card (cheap, but no room for future expansion) or the expensive switching multi-NVMe card. The single card is cheap enough that I wouldn't much mind tossing it down the road. So I guess the remaining question is whether I need two NVMe SSDs per node right now.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
But that four-port switching card... four NVMe drives per server of NVMe goodness - it's just so shiny, and you'd only need three of them (plus the 12 NVMe drives you'd just have to fill them with)

:cool:

Oh - there is a two-port switching card, but it's not nearly as shiny (though it is half the price)
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The thing is, there's a significant price premium on 4 TB SSDs, such that one of them doesn't cost a whole lot less than two 2 TB SSDs plus the four-port switching card. And since increasing capacity is one of the objectives of the whole exercise, that might point in favor of the card and the 2 TB drives--maybe one drive at a time. This is turning into an expensive proposition; maybe I can spread it out a bit.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Well, that was a far more serious answer than my response deserved.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Maybe, but you aren't wrong--I like the shiny as much as the next guy, and I'd love to fill this up with three of those fully-populated cards. But then reality intrudes, and at over $1k each (populated with 4x 2 TB SSDs), I don't see it right now.

But spread over the course of a number of months, well, I think I'm starting to talk myself into it... Maybe the NICs first, and one of those cards, get a couple of SSDs once the card gets here...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Well, I pulled the trigger on one of those four-SSD cards. Now to find some of the mezzanine NICs that include the relevant bracket...
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The saga continues...

The four-SSD card arrived yesterday, somewhat sooner than expected. And I found on eBay mezzanine NICs that appeared to have the relevant bracket:

So down one of the nodes this morning, slide out the tray, and... it doesn't look right. It doesn't look right at all. And after disassembling the tray a good bit, I realize that's because it just isn't right. The NIC has the right part number, but it doesn't have the right bracket at all; there's just no way the screw holes will line up where they need to go. I could just ghetto the thing, I suppose, but I'd rather it be right. So, off to (1) figure out what bracket I need, and (2) buy a couple.

This looks like exactly the right bracket, but it's the wrong interface:

...which has the wheels in my head turning, and that probably means this is going to get (more) expensive for me. But Mellanox twin-40Gbit cards with the mezzanine interface and the right bracket seem to be widely available and inexpensive, whereas 10Gbit NICs with the mezzanine interface and the right bracket appear to be unobtainium. So what if...

@jgreco, as (to the best of my knowledge) our resident networking guru, am I thinking correctly that I could buy a few of those NICs, QSFP+ optics from fs.com, fiber patch cables, and a switch like this:

...and then patch it in to my existing 10Gbit infrastructure? And other than being gross overkill, any real reason not to do it, or major issues I'm missing there?

Edit 2: ...or something like this instead?

No new switch, no new optics, just new NICs and adapters. But that sounds suspiciously close to "too easy", so I'm a little skeptical.
 
Last edited:

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
No new switch, no new optics, just new NICs and adapters. But that sounds suspiciously close to "too easy", so I'm a little skeptical.
Well, if the ConnectX-2 supports breaking out the QSFP+ ports, it really should be that easy. Modern stuff generally does, but the ConnectX-2 is somewhat older.
and a switch like this:
Careful with that one, they make it sound like it has DC PSUs, which I presume would do you little good. I've seen a lot of discussion around the Arista switches on STH and I get the general impression that research is advisable so as to not accidentally end up in the deep end of the pool.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
they make it sound like it has DC PSUs, which I presume would do you little good.
Yeah, I'd noticed that after I posted the link. I guess it depends on the expected DC input voltage, but I'd probably need to look for something different--though there doesn't seem to be a lot out there with more QSFP+ ports than SFP+ ports.
if the ConnectX-2 supports breaking out the QSFP+ ports
...and this is where I'm getting kind of concerned, because I'm seeing references to lots of things I don't quite understand, and I'm having trouble finding much in the way of specs or documentation for that card. Googling the Dell p/n gives me a bunch of resellers (most using the same, wrong, stock photo) with very sparse (and uniformly wrong--there's no way the card weighs anything close to 5 lb) specs. But I see references to InfiniBand, with some cards being InfiniBand-only, others Ethernet-only, and yet others configurable one way or the other. I also see references to the CX-2 cards doing 40 Gbit/sec on InfiniBand but only 10 Gbit/sec on Ethernet. I think I'm bordering on an area where I don't know what I don't know.
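From what I can piece together, the protocol-switchable (VPI) cards handled by the Linux mlx4 driver expose the port protocol through sysfs, so once a card is actually in hand it should be quick to tell which flavor I've got. A rough sketch of what I'd try--the PCI address is a placeholder, and whether a given ConnectX-2 SKU exposes this at all is exactly the open question:

```python
#!/usr/bin/env python3
"""Rough sketch: show (and optionally switch) the port protocol on a
ConnectX-2/3 driven by mlx4. VPI-capable cards expose mlx4_port1/mlx4_port2
files that accept "ib", "eth" or "auto"; single-protocol SKUs won't let you
change it. The PCI address is a placeholder -- find yours with lspci."""
import sys
from pathlib import Path

PCI_ADDR = "0000:03:00.0"  # placeholder; adjust for the actual card

def port_file(port: int) -> Path:
    return Path(f"/sys/bus/pci/devices/{PCI_ADDR}/mlx4_port{port}")

def show():
    for port in (1, 2):
        f = port_file(port)
        current = f.read_text().strip() if f.exists() else "not exposed (wrong driver, or not switchable?)"
        print(f"port {port}: {current}")

def set_eth(port: int) -> None:
    # Needs root; expect an error on cards that can't run Ethernet on that port.
    port_file(port).write_text("eth\n")

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "eth":
        set_eth(1)
    show()
```

If the mlx4_port files aren't there at all, or writing "eth" gets rejected, that would suggest a single-protocol card rather than a configurable one.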
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
There are a few threads about the Mellanox ConnectX cards, and it all sounds somewhat technically complex and prone to just not working.

I have thought about three of that type of card plus DACs to connect my two ESXi servers to the NAS (for iSCSI), just to see if it works, but honestly I just don't need it (40Gb), and I have 10Gb working just fine.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Current Proxmox is built on Debian 11 and doesn't lock out apt like SCALE does, so I feel relatively OK about making the hardware "work" with the OS, at least to the point of having working drivers available. But I feel less OK about (1) whether these particular cards support Ethernet at all, and (2) if they do, whether they also support breaking out the QSFP+ ports as Eric mentions. And more troublesome is that my Google-fu is failing me in ascertaining the answer to either of those questions. If the answers to both were known to be "yes", I think that's my solution--sure, I'd be wasting 3/4 of the performance capability of the NICs, but that's capability I don't have now, and LAN throughput isn't what's limiting me anyway.

Maybe the answer is to just buy one of the NICs and an adapter, and see how it goes. I started a thread on STH, but no replies as yet--though it was just last night I posted:

...but while browsing around there, I found this:

Uh oh...
 
Last edited:

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Maybe the answer is to just buy one of the NICs and an adapter, and see how it goes.
This is the way I'm heading--ordered one of the NICs last night. If it shows up in Proxmox and can be set to Ethernet mode, I'll see about an adapter. If I can get a link on that, I should be set, and will order the rest of the NICs and adapters. Maybe it'd be best to use my (currently-unused) fourth node as the test bed...
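Once the card lands in the spare node, the sanity check is basically: is it visible on the PCI bus, did a driver bind to it, and does the resulting interface link up at 10Gb through the QSFP-to-SFP+ adapter. Something like this rough sketch--it only reads sysfs, so it's harmless to run:

```python
#!/usr/bin/env python3
"""Rough sketch: find Mellanox PCI devices, report the bound driver, and show
the operstate/speed of any network interfaces they created."""
import subprocess
from pathlib import Path

def mellanox_pci():
    out = subprocess.run(["lspci", "-D"], capture_output=True, text=True, check=True).stdout
    return [line.split()[0] for line in out.splitlines() if "Mellanox" in line]

def report(addr):
    dev = Path(f"/sys/bus/pci/devices/{addr}")
    driver = (dev / "driver").resolve().name if (dev / "driver").exists() else "none"
    ifaces = [p.name for p in (dev / "net").iterdir()] if (dev / "net").is_dir() else []
    print(f"{addr}: driver={driver} interfaces={ifaces or 'none'}")
    for name in ifaces:
        state = Path(f"/sys/class/net/{name}/operstate").read_text().strip()
        try:
            speed = Path(f"/sys/class/net/{name}/speed").read_text().strip() + " Mb/s"
        except OSError:  # speed isn't readable when there's no link
            speed = "unknown (no link?)"
        print(f"  {name}: state={state} speed={speed}")

if __name__ == "__main__":
    addrs = mellanox_pci()
    if not addrs:
        print("No Mellanox device found on the PCI bus")
    for addr in addrs:
        report(addr)
```

If that reports 10000 Mb/s with the adapter and an SFP+ optic in place, that's the green light to order the rest.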

And probably going to pick up a Brocade ICX6610--at about $250, that would replace four different switches in my server room.
 