Jason Bacon
Just wondering if InfiniBand support has been explored (again) lately.
The stock FreeBSD support has matured to the point where InfiniBand can be enabled and updated in a few minutes (using kernel modules rather than rebuilding the kernel) and offers reasonable performance. IB support is now enabled in the userland build by default. I updated the wiki page to explain the details:
https://wiki.freebsd.org/InfiniBand
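For anyone who wants to try it, enabling IB boils down to a couple of lines; this is a sketch from my ConnectX-3 configuration, so other HCAs will need their own driver module in place of mlx4ib:

# /boot/loader.conf: load the ConnectX-3 driver and IPoIB at boot
mlx4ib_load="YES"
ipoib_load="YES"

# Or load them on a running system without rebooting:
kldload mlx4ib
kldload ipoib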
I'm testing on a PowerEdge R730xd with a Mellanox FDR card in one of our HPC clusters.
From pciconf -lv:
vendor = 'Mellanox Technologies'
device = 'MT27500 Family [ConnectX-3]'
class = network
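Once the modules are loaded, ibv_devinfo from the OFED userland is a quick way to confirm the HCA is visible and the port is up (illustrative; exact output varies by card and firmware):

# hca_id should list the adapter (e.g. mlx4_0) and state should be PORT_ACTIVE
ibv_devinfo | grep -E 'hca_id|state'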
Below is a benchmark comparison using NFSv4 from the same compute node to a CentOS server (NFS over RDMA) and a FreeBSD server (NFS over TCP), with connected mode on all nodes. As you can see, overall performance is comparable except for rewrite. I have not put much effort into tuning at this point; this is mostly default settings, although I played with the MTU to balance stability and performance. The CentOS default MTU of 65520 caused periodic network hangups on the FreeBSD server (from which it always recovered after a few minutes). Reducing it to a quarter of that (16380) eliminated the instability with very little loss of throughput.
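Concretely, the FreeBSD side of that interface setup amounts to something like this (interface name and address are placeholders for my cluster's values; 16380 is the MTU that proved stable):

# /etc/rc.conf: IPoIB interface with the reduced MTU
ifconfig_ib0="inet 10.0.0.10 netmask 255.255.255.0 mtu 16380"

# Or adjust a live interface:
ifconfig ib0 mtu 16380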
CentOS 7 / XFS server:
Averages of 3 trials:
Amount      Operation  Block size       Time          Throughput
125.03 GiB  write      4.00 MiB blocks  145903.00 ms  877.53 MiB/s
1024        seek       4.00 MiB blocks      23.98 ms  170.67 MiB/s
125.03 GiB  read       4.00 MiB blocks  236010.00 ms  542.49 MiB/s
125.03 GiB  rewrite    4.00 MiB blocks  158151.00 ms  809.57 MiB/s
FreeBSD 12 / ZFS server:
Averages of 3 trials:
Amount      Operation  Block size       Time          Throughput
125.03 GiB  write      4.00 MiB blocks  174645.00 ms  733.11 MiB/s
1024        seek       4.00 MiB blocks      14.67 ms  273.07 MiB/s
125.03 GiB  read       4.00 MiB blocks  225402.00 ms  568.03 MiB/s
125.03 GiB  rewrite    4.00 MiB blocks  413798.00 ms  309.41 MiB/s
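For reference, the client mounts behind these numbers were of roughly this shape (hostnames and export paths are placeholders; 20049 is the conventional NFS/RDMA port on Linux):

# CentOS compute node -> CentOS server, NFSv4 over RDMA
mount -t nfs -o vers=4.0,proto=rdma,port=20049 centos-nfs:/export /mnt/rdma

# CentOS compute node -> FreeBSD server, NFSv4 over TCP via IPoIB
mount -t nfs -o vers=4.0,proto=tcp freebsd-nfs:/export /mnt/tcp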
At the moment I'm torture-testing the server with a genome assembler called canu and some particularly low-quality input data that causes canu's I/O to go through the roof. This assembly brought down our CentOS NFS servers until I switched them to NFS over RDMA. So far so good with FreeBSD. I had one incident where the FreeBSD IB interface shut down when it ran out of buffer space, but that was on a run that I had improperly restarted, causing even more excessive I/O. That issue will have to be diagnosed, of course, but right now it's handling the toughest load we've ever encountered in our HPC service.
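If anyone else hits the same buffer-space shutdown, mbuf exhaustion is the first thing I plan to rule out; the standard base tools are enough for that (nothing IB-specific here):

# Watch mbuf/cluster usage and look for denied allocation requests
netstat -m | grep -E 'mbuf|denied'

# Check the configured cluster limit
sysctl kern.ipc.nmbclusters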