
Mellanox ConnectX-4 100GbE with FreeNAS 11.1 U4

Kenny Simpson

Neophyte
Joined
May 19, 2017
Messages
6
I installed a ConnectX-4 card in an x16 slot, but no new interfaces are showing up. The logs do show that the mlx5_core driver attaches and recognizes which port has a module plugged in:

mlx5_core0: <mlx5_core> mem 0xf8000000-0xf9ffffff irq 64 at device 0.0 numa-domain 1 on pci13
mlx5_core0: INFO: firmware version: 12.17.2032
mlx5_core0: INFO: Module 0, status: plugged
mlx5_core1: <mlx5_core> mem 0xf6000000-0xf7ffffff irq 68 at device 0.1 numa-domain 1 on pci13
mlx5_core1: INFO: firmware version: 12.17.2032
mlx5_core1: INFO: Module 1, status: unplugged

I've tried 'kldload mlx5en', but that yields:
kldload: can't load mlx5en: module already loaded or in kernel

'kldstat -v' does not show such a module, but neither does it show mlx5_core.

How can this card be enabled to provide ethernet interfaces?
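
For anyone checking the same thing on their own box, the module status lines in the log above can be pulled out with a few lines of Python. This is only a sketch: the sample text is copied from the log in this post, and the helper name `module_status` is made up for illustration.

```python
import re

# Sample lines copied from the boot log quoted above.
LOG = """\
mlx5_core0: INFO: firmware version: 12.17.2032
mlx5_core0: INFO: Module 0, status: plugged
mlx5_core1: INFO: firmware version: 12.17.2032
mlx5_core1: INFO: Module 1, status: unplugged
"""

# Matches lines such as "mlx5_core0: INFO: Module 0, status: plugged".
MODULE_RE = re.compile(r"^(mlx5_core\d+): INFO: Module (\d+), status: (\w+)$")


def module_status(log_text):
    """Return {device: (module_number, status)} for every module status line."""
    result = {}
    for line in log_text.splitlines():
        match = MODULE_RE.match(line.strip())
        if match:
            device, module, status = match.groups()
            result[device] = (int(module), status)
    return result


status = module_status(LOG)
print(status)
```

On a live system the same function could be fed the text of the kernel message buffer instead of the hard-coded sample.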
 

Kenny Simpson

Neophyte
Joined
May 19, 2017
Messages
6
It looks like the cards come by default set to be Infiniband, and their firmware tools must be used to reconfigure the card to be in Ethernet mode: http://www.mellanox.com/page/management_tools
'mlxconfig -d pci0:134:0:0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2' is what was needed for me.

Note however that their tools use 'python' and expect that to mean python2 - so I did need to edit some of the tools to say python2 instead.
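
A small sketch of how that command line is put together, in case more than one card needs the same change. It only builds and prints the command, it does not run it. Assumptions: the value 2 for Ethernet comes from the command above, the value 1 for InfiniBand is assumed and should be confirmed with `mlxconfig -d <device> query`, and the helper name `link_type_cmd` is hypothetical. The device string will differ per system, and a reboot is likely needed before the new port type takes effect.

```python
# Assumption: 2 = Ethernet is taken from the command in this post;
# 1 = InfiniBand is assumed here and should be verified with
# "mlxconfig -d <device> query" before relying on it.
LINK_TYPES = {"ib": 1, "eth": 2}


def link_type_cmd(device, link_type, ports=(1, 2)):
    """Build the mlxconfig argument list that sets LINK_TYPE_P<n> per port."""
    value = LINK_TYPES[link_type]
    settings = [f"LINK_TYPE_P{port}={value}" for port in ports]
    return ["mlxconfig", "-d", device, "set", *settings]


# Device name as used in this post; it will be different on other machines.
cmd = link_type_cmd("pci0:134:0:0", "eth")
print(" ".join(cmd))
```

Printing the command first and running it by hand keeps a typo in the device name from touching the wrong card.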
 

skyyxy

Member
Joined
Jul 16, 2016
Messages
122
It looks like the cards come by default set to be Infiniband, and their firmware tools must be used to reconfigure the card to be in Ethernet mode: http://www.mellanox.com/page/management_tools
'mlxconfig -d pci0:134:0:0 set LINK_TYPE_P1=2 LINK_TYPE_P2=2' is what was needed for me.

Note however that their tools use 'python' and expect that to mean python2 - so I did need to edit some of the tools to say python2 instead.
Hi Kenny, is it working now? And what about the 100GbE speed? I also want to buy one. Thanks a lot.
 

Kenny Simpson

Neophyte
Joined
May 19, 2017
Messages
6
Yes, it works and connects at 100GbE. With FreeNAS 11.1-U5 the machine can only push ~60 Gb/s in iperf before maxing out the CPU (an older, slower Xeon), but it pushes 20-30 Gb/s pretty steadily all day long.

After this initial config, the card has given no issues at all - very nice card, everything just works.
 

skyyxy

Member
Joined
Jul 16, 2016
Messages
122
Yes, it works and connects at 100GbE. The machine is only able to push ~60Gb in iperf due to maxing out CPU (older/slower xeon) with FreeNas 11.1-U5, but is able to push 20-30Gb pretty steadily all day long.

After this initial config, the card has given no issues at all - very nice card, everything just works.
That's awesome! I will buy one later. Thank you so much.
 

bolan0000

Newbie
Joined
Dec 2, 2018
Messages
2
Yes, it works and connects at 100GbE. The machine is only able to push ~60Gb in iperf due to maxing out CPU (older/slower xeon) with FreeNas 11.1-U5, but is able to push 20-30Gb pretty steadily all day long.

After this initial config, the card has given no issues at all - very nice card, everything just works.
It is too slow, and it is a waste of money to buy a ConnectX-4 100GbE card! Have you ever tried to find out what is the limit?
 

purduephotog

Member
Joined
Jan 14, 2013
Messages
73
I'm trying to do 40GbE and not having any luck.
I'd love to know the upper limits. I'd be willing to direct-attach the important machines, if nothing else, to grab data.
 

skyyxy

Member
Joined
Jul 16, 2016
Messages
122
I'm trying to do 40GbE and not having any luck.
I'd love to know the upper limits. I'd be willing to direct-attach the important machines, if nothing else, to grab data.
I tried an Intel XL710 40GbE card, but also got only about 1 GB/s. Maybe FreeNAS needs some tunable parameters? Might the 100GbE card need them too? Thanks.
 

skyyxy

Member
Joined
Jul 16, 2016
Messages
122
It is too slow, and it is a waste of money to buy a ConnectX-4 100GbE card! Have you ever tried to find out what is the limit?
Have you set any parameters in FreeNAS for the 40GbE card? Thanks.
 

Kenny Simpson

Neophyte
Joined
May 19, 2017
Messages
6
It is too slow, and it is a waste of money to buy a ConnectX-4 100GbE card! Have you ever tried to find out what is the limit?
The limit is pretty clearly the slow cpu trying to push it (CPU E5-2620 v2 @ 2.10GHz)
We are in the process of getting beefier cpus to push this further (couple generations newer and running at higher clock speeds).

As for 40 vs 100, realize that 40 is really 4x10, and 100 is 4x25, so the single-channel limit for 100 is 2.5x what it is for 40 (or 10).
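
The lane arithmetic above, written out as a quick check. This only restates the 4x10 and 4x25 split from this post; the function name `per_lane_gbps` is made up for illustration, and it says nothing about how a given NIC spreads traffic over the lanes.

```python
def per_lane_gbps(total_gbps, lanes=4):
    """Nominal rate of one lane when a link is built from equal lanes."""
    return total_gbps / lanes


lane_40 = per_lane_gbps(40)    # 40GbE as 4 lanes of 10
lane_100 = per_lane_gbps(100)  # 100GbE as 4 lanes of 25
ratio = lane_100 / lane_40

print(f"40GbE lane: {lane_40:g} Gb/s, 100GbE lane: {lane_100:g} Gb/s, ratio: {ratio:g}x")
```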

We are able to saturate a 25Gb client connection when there is a single iperf client.

On another machine, we did try LACP across two 10Gb connections. This seemed to eat CPU very quickly, and the most aggregate throughput I was able to see was ~8Gb, which is less than what a single 10Gb link was able to push. We have since dropped the LACP and fallen back to a single 10Gb link (which it can saturate with ~30% CPU). Maybe we could split the networks and put half the clients on one interface and half on the other, but dropping in a better NIC is cleaner.
 

alpha754293

Neophyte
Joined
Jul 18, 2019
Messages
8
My apologies for resurrecting an old topic, but I found this via a Google search as I am preparing to embark on possibly a similar journey as this.

Just as a note though, the other reason why the Infiniband might not have been working for you originally was because you might not have had a subnet manager running on your IB network.

You can enable that with:

Code:
node1# /etc/init.d/opensmd start


And that'll start the OpenSM subnet manager and then if you run:

Code:
node1# ibv_devinfo


You should see that the ports are now active.

It is true that with Ethernet you don't need to run a subnet manager, but the Mellanox 100 GbE switches are more expensive per port than their 100 Gbps IB switches.
 

alpha754293

Neophyte
Joined
Jul 18, 2019
Messages
8
The limit is pretty clearly the slow cpu trying to push it (CPU E5-2620 v2 @ 2.10GHz)
We are in the process of getting beefier cpus to push this further (couple generations newer and running at higher clock speeds).

As for 40 vs 100, realize that 40 is really 4x10, and 100 is 4x25, so the single-channel limit for 100 is 2.5x what it is for 40 (or 10).

We are able to saturate a 25Gb client connection when there is a single iperf client.

On another machine, we did try to do lacp of two 10Gb connections. This seemed to eat cpu very quickly and the most aggregated throughput I was able to see was ~8Gb. This is less than what a single 10Gb was able to push. We have since dropped the lacp and fell back to single 10Gb (which it can saturate with ~30% cpu). Maybe we could split the networks and put half the clients on one interface and half on the other, but dropping in a better nic is cleaner.
Sorry for replying to this now rather old thread, but the CPU utilization that you are seeing is likely also due to the fact that TrueNAS/FreeNAS doesn't support RDMA (at least not on the Infiniband side of things).

There is, however, supposed to be support for RoCE, but I currently don't have my IB card connected in a point-to-point fashion to test 100 GbE over RoCE.

I was able to write 10 GiB of zeros over NFS at around 3 Gbps, but I was able to read it back to /dev/null at around 53.6 Gbps.
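
As a rough sanity check on those two rates, the sketch below converts them into wall-clock time for the 10 GiB file. Assumptions: 1 GiB is 2**30 bytes, 1 Gbps is 10**9 bits per second, and the rates are treated as constant over the whole transfer. The helper name `transfer_seconds` is made up for illustration.

```python
GIB = 2**30  # bytes in one GiB


def transfer_seconds(size_bytes, rate_gbps):
    """Seconds needed to move size_bytes at rate_gbps (10**9 bits per second)."""
    return size_bytes * 8 / (rate_gbps * 1e9)


size = 10 * GIB
write_s = transfer_seconds(size, 3.0)   # NFS write rate quoted above
read_s = transfer_seconds(size, 53.6)   # NFS read rate quoted above

print(f"write: {write_s:.1f} s, read: {read_s:.1f} s")
```

So the write takes a bit under half a minute while the read finishes in under two seconds.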

Granted, that's not particularly useful in "real" situations, and I wasn't about to try and create like a 16 GB RAM drive (I only have 32 GB of RAM installed which is the max that's supported, I think, by my system's platform being that I was only using an Intel Core i7-3930K on an Asus X79 Sabertooth motherboard.)

As such, with four mechanically rotating hard drives on both ends, my ability to test the system at full line speed is severely limited.

However, to your point about it being 4x EDR (4x 25 Gbps link), that's handled by the NIC itself, i.e. for me to be able to hit the 53.6 Gbps read rate of a 10 GiB file of zeros, I didn't have to do anything extra/special/unique about it.

It's just a "vanilla" NFS export.

If you're running 100 GbE and you're running RoCE and if your clients are Windows clients, try and see if you can get SMB 3.0/SMB direct working on your system. It MAY or MAY NOT work.
 

Rand

Neophyte Sage
Joined
Dec 30, 2013
Messages
885
There is no RoCE support in FN/TN

Only thing you could do to get any kind of RDMA would be to use Chelsios with iWarp...
 

alpha754293

Neophyte
Joined
Jul 18, 2019
Messages
8
There is no RoCE support in FN/TN

Only thing you could do to get any kind of RDMA would be to use Chelsios with iWarp...
This was according to what I found on the mlx5ib man pages (here: https://www.freebsd.org/cgi/man.cgi....2-RELEASE+and+Ports&arch=default&format=html)


The mlx5ib driver provides support for infiniband and Remote DMA over Converged Ethernet, RoCE, for PCI Express network adapters based on ConnectX-4 and ConnectX-4 LX.
So....my thought process was that if you can set or change the port type from IB to ETH, then according to the mlx5ib man pages, you should be able to run RoCE.

I don't have a point-to-point 100 GbE connection that I can test that out with (cuz my IB cables are plugged into an IB switch and it would be quite a bit of a pain to take them out to test this as a point-to-point connection), so I can't confirm nor validate what's stated in the man pages for mlx5ib as to whether RoCE actually works or not.

It would be nice, but by the same token, since NFSoRDMA doesn't work (at least not in TrueNAS Core 12.0 U1.1), I'm not sure it would matter. I also suspect that SMB Direct doesn't work from a system running TrueNAS Core 12.0 U1.1, and there is next to no documentation on how to get it up and running: even the Mellanox documentation does its own thing, which may or may not work with the mlx5ib and/or ipoib kernel modules that ship with the OS "out of the box".
 

Rand

Neophyte Sage
Joined
Dec 30, 2013
Messages
885
OK, to be honest I have not tested whether the NIC itself might be RoCE capable, since the problem is that there are no RoCE/RDMA-enabled services running on TNC (neither SMB nor NFS/iSCSI, let alone NVMe-oF).
I opened a feature request a while ago that was not too well received (i.e. not enough people voted for it).
They picked it up for TrueNAS SCALE (where it is of course significantly easier to implement, since Linux already provides RDMA-enabled services), but I have not seen anything for FreeBSD (RDMA clients yes, but no server capabilities).

Current driver on TNC 12 for a CX5 is
mlx5en: Mellanox Ethernet driver 3.5.2 (September 2019)

So the IB driver is not loaded (as IB is not supported in the first place)
 