Infiniband support on the horizon?

Jason Bacon

Cadet
Joined
Sep 11, 2013
Messages
7
Just wondering if Infiniband support has been explored (again) lately.

The stock FreeBSD support has matured to the point where Infiniband can be enabled/updated in a few minutes (by using kernel modules rather than rebuilding the kernel) and offers reasonable performance. IB support is enabled in userland by default now. I updated the wiki page to explain the details:

https://wiki.freebsd.org/InfiniBand
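
For anyone who wants the short version, enabling it for an mlx4-based ConnectX card boils down to roughly the following (the module names here assume ConnectX-3/mlx4 hardware and may differ for other HCAs; see the wiki for specifics):

# In /boot/loader.conf, to load the drivers at boot:
mlx4ib_load="YES"     # InfiniBand support for mlx4 HCAs
ipoib_load="YES"      # IP-over-InfiniBand network interface

# Or load them on a running system without a reboot:
kldload mlx4ib ipoib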

I'm testing on a PowerEdge R730xd with a Mellanox FDR card in one of our HPC clusters.

From pciconf -lv:

vendor = 'Mellanox Technologies'
device = 'MT27500 Family [ConnectX-3]'
class = network

Below is an NFSv4 benchmark comparison from the same compute node to a CentOS server (NFS over RDMA) and a FreeBSD server (NFS over TCP), with connected mode on all nodes. As you can see, overall performance is comparable except for rewrite. I have not put much effort into tuning at this time; this is mostly default settings, although I played with the MTU to balance stability and performance. The CentOS default MTU of 65520 caused periodic network hangups on the FreeBSD server (from which it always recovered after a few minutes). Reducing it to a quarter of that (16380) eliminated the instability with very little loss of throughput.
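
For reference, the MTU change on the FreeBSD side is just an ifconfig setting; the interface name ib0 and the address below are placeholders for illustration:

# One-off change on the IPoIB interface:
ifconfig ib0 mtu 16380

# Persistent across reboots, in /etc/rc.conf:
ifconfig_ib0="inet 10.10.0.10 netmask 255.255.255.0 mtu 16380"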

CentOS 7 / XFS server:

Averages of 3 trials:
125.03 GiB   write     4.00 MiB blocks   145903.00 ms   877.53 MiB/s
1024         seek      4.00 MiB blocks       23.98 ms   170.67 MiB/s
125.03 GiB   read      4.00 MiB blocks   236010.00 ms   542.49 MiB/s
125.03 GiB   rewrite   4.00 MiB blocks   158151.00 ms   809.57 MiB/s

FreeBSD 12 / ZFS server:

Averages of 3 trials:
125.03 GiB   write     4.00 MiB blocks   174645.00 ms   733.11 MiB/s
1024         seek      4.00 MiB blocks       14.67 ms   273.07 MiB/s
125.03 GiB   read      4.00 MiB blocks   225402.00 ms   568.03 MiB/s
125.03 GiB   rewrite   4.00 MiB blocks   413798.00 ms   309.41 MiB/s

At the moment I'm torture-testing the server with a genome assembler called canu and some particularly low-quality input data that causes canu's I/O to go through the roof. This assembly brought down our CentOS NFS servers until I switched them to NFS over RDMA. So far so good with FreeBSD. I had one incident where the FreeBSD IB interface shut down when it ran out of buffer space, but that was on a run that I had improperly restarted, causing even more excessive I/O. That issue will have to be diagnosed, of course, but right now it's handling the toughest load we've ever encountered in our HPC service.
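
For anyone curious about the CentOS side, switching those servers to NFS over RDMA was essentially a mount-option change. A rough sketch (host name, export path, and mount point are placeholders; 20049 is the standard NFS/RDMA port):

# On the CentOS NFS server: load the RDMA transport and tell nfsd to listen on it
modprobe svcrdma
echo "rdma 20049" > /proc/fs/nfsd/portlist

# On the CentOS clients: mount using the RDMA transport
modprobe xprtrdma
mount -t nfs -o proto=rdma,port=20049 nfs-server:/export /mnt/export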
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Search the forum for Infiniband. There are several threads about it and you can make it work, but there will likely never be any official support for it.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080

Jason Bacon

Cadet
Joined
Sep 11, 2013
Messages
7
Thanks for the feature request link.

Regarding the (un)likelihood of future support, does iXsystems not see a market for FreeNAS in High Performance Computing? Infiniband is the dominant interconnect in HPC, and HPC uses *a lot* of storage. Many HPC admins are not computer gurus by choice, but scientists who get stuck with the job and prefer appliances over BYO solutions so they can focus on their science. A large percentage of HPC sites are in academia, where some of the other storage appliances are prohibitively expensive and the lower cost of a FreeNAS box would be welcome.

I'm personally content configuring a stock FreeBSD box, but I just wanted to draw your attention back to this as it seems like it would be relatively easy to support at this point and might sell more TrueNAS appliances.

Cheers,

JB
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
It's supported in TrueNAS, though, isn't it?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
It's supported in TrueNAS, though, isn't it?
From what I remember, and it might have changed, they don't use Infiniband even in TrueNAS implementations. This is the quote:
iXsystems doesn't have so much as a single Infiniband card or customer to justify that investment
That was in October of 2015, though. Most people say that 10Gb SFP+ is the better way to go, but if this poster has an installed base that is already on Infiniband, I do recall that some users have posted about being able to get it working.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Regarding the (un)likelihood of future support, does iXsystems not see a market for FreeNAS in High Performance Computing? Infiniband is the dominant interconnect in HPC, and HPC uses *a lot* of storage. Many HPC admins are not computer gurus by choice, but scientists who get stuck with the job and prefer appliances over BYO solutions so they can focus on their science. A large percentage of HPC sites are in academia, where some of the other storage appliances are prohibitively expensive and the lower cost of a FreeNAS box would be welcome.
You might want to point that out to them in the feature request. Before it would be worth iXsystems' time to implement, they would need to feel confident about being able to sell some hardware. They make nothing on FreeNAS (it's free), so they would need to sell some servers (and they make good ones at a competitive price), or sell systems that include TrueNAS. TrueNAS is the more sophisticated software that is similar to FreeNAS, but they provide support services with it, kind of like the Red Hat scheme where they give away CentOS but charge for RHEL.
I'm personally content configuring a stock FreeBSD box, but I just wanted to draw your attention back to this as it seems like it would be relatively easy to support at this point and might sell more TrueNAS appliances.
I am all for having more functionality integrated in FreeNAS. There is some possibility that my next job is going to be supporting an HPC cluster.
 

Jason Bacon

Cadet
Joined
Sep 11, 2013
Messages
7
10Gb Ethernet is generally a better option outside HPC, but it can't compete with Infiniband's 1 microsecond latency and higher throughput for distributed parallel programs (e.g. MPI) on an HPC cluster. Almost nobody in HPC is going to bother with separate high-speed networks for storage and IPC, so a storage appliance that can't plug into the existing Infiniband fabric will never be attractive to most HPC admins. BTW, many modern host adapters can operate in either Ethernet or Infiniband mode, but only one at a time. So you can build a storage box and then plug it into either type of network without swapping out the card. I think this was mentioned in some of the other threads on Infiniband.
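
For what it's worth, on ConnectX-3 class cards the port personality can be flipped with Mellanox's mlxconfig tool (part of their MFT package). Roughly like this; the MST device path below is a placeholder, and the card needs a reset/reboot for the change to take effect:

# Show the current port configuration (LINK_TYPE_P1: 1 = InfiniBand, 2 = Ethernet, 3 = auto/VPI)
mlxconfig -d /dev/mst/mt4099_pci_cr0 query

# Switch port 1 to Ethernet; set LINK_TYPE_P1=1 to go back to InfiniBand
mlxconfig -d /dev/mst/mt4099_pci_cr0 set LINK_TYPE_P1=2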
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
BTW, many modern host adapters can operate in either Ethernet or Infiniband mode, but only one at a time. So you can build a storage box and then plug it into either type of network without swapping out the card.
While this is true, I don't think the card that iXsystems generally recommends is one that can switch:
https://www.amazon.com/FreeNAS-Dual-Port-Upgrade-Ports-Twinax/dp/B011APKCHE/
I think this was mentioned in some of the other threads on Infiniband.
Sounds like you have looked into this a bit. I have not finished reviewing the thread, but you might want to have a look at this one. They are discussing making a custom build of FreeNAS that includes Infiniband support.
https://www.ixsystems.com/community/threads/infiniband-support.15573/page-3
 

Jason Bacon

Cadet
Joined
Sep 11, 2013
Messages
7
I believe the FreeBSD OFED stack was ported largely by Mellanox engineers, so you'd want to go with one of their recent ConnectX HCAs.

http://www.mellanox.com/page/products_dyn?product_family=193&mtag=freebsd_driver

In theory the OFED stack should support cards from other vendors, but I don't know who might be testing them. We use ConnectX cards here.

You should not need to download drivers from the Mellanox site, though. They are included in the FreeBSD src tree.
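
A quick way to sanity-check that the in-tree drivers and the OFED userland see the card (the interface name and exact output will vary by card and firmware):

# Confirm the kernel modules are loaded
kldstat | grep -E 'mlx4|ipoib'

# List the HCA, its firmware, and port state via the verbs userland
ibv_devinfo

# The IPoIB interface then shows up like any other NIC
ifconfig ib0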
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
From what I remember, and it might have changed, they don't use Infiniband even in TrueNAS implementations. This is the quote:
That was in October of 2015 though. Most people are saying that 10Gb SFP+ is the better way to go, but if this poster has an installed base that is already on Infiniband I do recall that some users have posted about being able to get it working.
I was mixing it up with Fibre Channel in my head, so you're probably right.
 