Help speed up my Dell C6220 hypervisors

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The NIC arrived today--and in addition to the part numbers that were shown in the seller's photos, there's also a model number that wasn't visible: MCQH29-XFR. Searching for that led me to a manual:

It appears this card was initially sold for the C6100, a predecessor to the C6220. But the manual appears to confirm that the card does Ethernet at 10 Gbit/sec, which is a good sign. Let's see if I can make it work.

Edit: First, it physically fits, which is a good start. Second, Windows recognizes it:
[screenshot: Windows showing the Mellanox card]

I have a QSFP+-to-SFP+ adapter on the way, so I'll check for a network connection when it gets here.
 
Last edited:

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Looks like a good start
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Agreed. I'm a little concerned about the "IPoIB" part, but the manual suggests it will autodetect IB or Ethernet. I guess I'll find out.

Otherwise, it looks like the Mellanox Firmware Tools can set the card to use Ethernet--I'd just need to find a copy of the latest version 3 release, as they dropped support for the CX-2 cards with version 4.
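
If I can track down that 3.x release, the command should (in theory) look something like this--the device name here is just a placeholder (mst status should list the real one), and the LINK_TYPE values are 1=InfiniBand, 2=Ethernet, 3=auto-sense:
Code:
# Sketch only -- device name below is a guess; run "mst status" to find the real one
mlxconfig -d mt26428_pciconf0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2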
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Thanks, I'd found that previously--which is why the test machine is running Windows at the moment. But for whatever reason, it only supports CX-3 adapters:
Code:
PS C:\Users\Dan Brown> mlxconfig -h
    NAME:
        mlxconfig
    SYNOPSIS:
        mlxconfig [-d <device> ] [-y] <s[et] <parameters to set>|q[uery]|r[eset]>

    OPTIONS:
        -d|--dev <device>               : Perform operation for a specified mst device.
        -y|--yes                        : Answer yes in prompt.
        -v|--version                    : Display version info.
        -h|--help                       : Display help message.

    COMMANDS:
        q[uery]                  : query current supported configurations.
        s[et]                    : set configurations to a specific device.
        r[eset]                  : reset configurations to their default value.
        clear_semaphore          : clear the tool semaphore.

    Supported Configurations:
        SRIOV                    : SRIOV_EN=<1|0> NUM_OF_VFS=<NUM>
        WOL_PORT1                : WOL_MAGIC_EN_P1
        WOL_PORT2                : WOL_MAGIC_EN_P2=<1|0>
        VPI_SETTINGS_PORT1       : LINK_TYPE_P1=<1|2|3> , 1=Infiniband 2=Ethernet 3=VPI(auto-sense).
        VPI_SETTINGS_PORT2       : LINK_TYPE_P2=<1|2|3>
        BAR_SIZE                 : LOG_BAR_SIZE=<Base_2_log_in_mb> , example: for 8Mb bar size set LOG_BAR_SIZE=3

    Examples:
        To query current Configuration     : mlxconfig -d mt4099_pciconf0 query
        To set Configuration               : mlxconfig -d mt4099_pciconf0 set SRIOV_EN=1 NUM_OF_VFS=16 WOL_MAGIC_EN_P1=1
        To reset Configuration             : mlxconfig -d mt4099_pciconf0 reset

    Supported devices:
        ConnectX3, ConnectX3-Pro (FW 2.31.5000 and above).


The MFT Release Notes state that the CX-2 was removed with 4.0.0, so it should have been supported by 3.8.0 (which was the last 3.x release). But for some reason, at least this build only supports the CX-3 cards. My first thought was to wonder whether it was actually the correct version, but it looks like it is:
Code:
PS C:\Users\Dan Brown> mlxconfig -v
mlxconfig, mft 3.8.0-56, built on Jan  8 2015, 17:59:51. Git SHA Hash: 76bc2611be7804738dcb777d3e9b5b2670bbbd9e


And in other news, the adapter arrived today, a couple of days earlier than expected. Woohoo! Out to the shop, plug it into one of the sockets, plug an SFP+ DAC into that and into my Mikrotik switch, and... nothing. No link light on the card or the switch, and Windows shows that the cable is unplugged, regardless of which port on the NIC I plug it into. Bother.

Edit: Cables are always the first troubleshooting step, so I tried replacing the DAC with another--same result. Then I dug out an optic for the Mikrotik switch, a Finisar optic for the Mellanox end, and an OM4 patch cable--still the same result. I think we can rule out the cable as the source of the problem.
 
Last edited:

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
OK, with some prodding from the folks at STH, it looks like I have a working solution. See:

In short, install sysfsutils on Proxmox, then edit /etc/sysfs.conf to add these two lines:
Code:
bus/pci/devices/0000\:82\:00.0/mlx4_port1 = eth
bus/pci/devices/0000\:82\:00.0/mlx4_port2 = eth

...and Bob's your proverbial uncle. I'd still like to be able to flip the relevant firmware bit so it defaults to this, but this has it working, it survives a reboot, and it gets pretty close to line speed using iperf3. That's at least a 90% solution.
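
For the record, sysfsutils just replays those settings at boot; the same thing can apparently be applied on the fly by writing to the sysfs attributes directly, and iperf3 between two hosts is how I checked throughput (the PCI address is specific to my box, and the IP below is a placeholder):
Code:
# Apply immediately, no reboot needed
echo eth > /sys/bus/pci/devices/0000:82:00.0/mlx4_port1
echo eth > /sys/bus/pci/devices/0000:82:00.0/mlx4_port2

# Throughput sanity check: server on one host, client on the other
iperf3 -s
iperf3 -c 192.168.1.10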
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Very shiny indeed. But I'm thinking I then need to (1) figure out what OCuLink is, (2) figure out a way to put up to eight of those devices inside the half-of-1U nodes, and (3) pay for all that. But it does look like a nice tool for some needs.

I'm out of town for a few days, but I'm figuring I'll order the rest of the Mellanox NICs and adapters once I get back, and then start in earnest on the NVMe conversion.

But I've also seen some improvements from an unexpected direction, and that was replacing the switch(es) in the server room. Previously, I'd had a hodgepodge of a Dell X1052, two Mikrotik switches, and a small Unifi PoE switch out there, and through some combination of poor network design and the limited capabilities of those devices, I was getting very poor throughput over what should have been 10Gbit network connections. But while trying to sort out the Mellanox NIC issue at STH, I ran across this thread about Brocade ICX switches and ordered an ICX6610-48P for the server room. Now, instead of 12 SFP+ ports across three devices (four of which were used for connecting the switches to each other), I have 16 of them in a single switch. Instead of four switches there, I have one. And instead of getting only gigabit speeds between my TrueNAS box and any of the Proxmox hosts, I'm now getting 10 gigabit.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Well, that's a major improvement anyway.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Agreed. It also simplifies administration. Doesn't do much to improve noise levels, but the rest of the contents of that rack are pretty noisy too.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
(1) figure out what OCuLink is,
Just another connector with another set of crazy trapdoors for those of us who haven't researched it in depth. There's weird stuff like "you need this SFF-8643 to Oculink cable instead of that one" with no discernible reason attached. If anyone has a good resource on what the hell the deal is with Oculink, I'd appreciate it.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Well, everything is moving slower than expected, but that shouldn't be too surprising, I guess. I have all the Mellanox NICs and all the QSA-SFP+ adapters--at two adapters per NIC, the adapters cost more than the NICs, but the total is still just over what the Intel NICs would cost without the bracket. Kudos to the seller for shipping the adapters DHL Express from China; less kudos to DHL for scanning them as delivered two days before they actually delivered them. But they're all here and in good condition, and one of them is installed in one of my nodes. A few tweaks to /etc/network/interfaces to account for the different device names, and everything's up and running. Woohoo!
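
For anyone following along, the interfaces tweak is just pointing the existing bridge at the new device name; a sketch, with hypothetical names and addresses (ip link will show the actual ones):
Code:
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.11/24
        gateway 192.168.1.1
        # was "bridge-ports eno1"; the Mellanox port enumerated under a new name
        bridge-ports enp130s0
        bridge-stp off
        bridge-fd 0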

The Samsung 970 EVO Plus should be here tomorrow, if I can believe Amazon.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Which NVMe to PCIe adapters did you get?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
So, the SSD arrived today. Putting it into the card was easy enough. But when I went to put the card in the chassis, it turned out to interfere with the latch on one of the DIMM sockets (if it isn't one thing, it's another). Remove the DIMM, the latch sits a bit lower, and now I can install the card. I'm down 4 GB in that node now, but "down 4 GB" still leaves about 80 GB. I added the new SSD to the Ceph pool, and it's rebalancing the pool now. Now repeat two more times...
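
For reference, adding the disk as a Ceph OSD on Proxmox is a one-liner; something like this, where the device name is whatever the new SSD enumerates as:
Code:
# Create an OSD on the new NVMe, then watch the rebalance progress
pveceph osd create /dev/nvme0n1
ceph -s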
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
That's a PITA.
I put a simple NVMe-to-PCIe card in my main TN, looked at the now-invisible and unusable bulk of the SATA ports, and thought "good job I don't need them".
Right-angled cables might work, but I'm not convinced.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Exactly
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I haven't gotten around to getting the PCIe-NVMe cards for the other two nodes, but I just had occasion to add a second NVMe drive to the card in the first node this morning. The system picked it up automatically, I added it to the Ceph pool, and away it went. So the PCIe switching part of the card does seem to be working--good to know.
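
A quick way to confirm both drives really are sitting behind the card's PCIe switch (nvme list needs the nvme-cli package):
Code:
# Each drive should appear as its own PCIe device...
lspci | grep -i 'non-volatile'
# ...and as its own controller/namespace to the NVMe driver
nvme list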
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
In the interests of keeping up with the "Shiny"
That's even shinier than the PCIe cards you are buying.
 