10GbE performance (iperf = good, data copy = slow)

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
I believe I've solved the problem as well as I'll be able to given the limitations of my current hardware. As a quick recap, last night I modified several FreeNAS tunables and was able to bump the speed from 75 MBps to 105 MBps on an rsync between the FN server and Linux server. That's an increase of almost 40%, certainly non-trivial.
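
(For reference, the post doesn't say exactly which FreeNAS tunables were changed. The lines below are only typical FreeBSD-side 10GbE buffer tunables often suggested in guides, shown purely as an illustration of the kind of thing involved, not necessarily what was used here; values are examples.)
Code:
# illustrative FreeBSD sysctl tunables commonly suggested for 10GbE (not necessarily the ones used)
kern.ipc.maxsockbuf=16777216        # maximum socket buffer size
net.inet.tcp.recvbuf_max=16777216   # maximum TCP receive buffer (autotuning cap)
net.inet.tcp.sendbuf_max=16777216   # maximum TCP send buffer (autotuning cap)
net.inet.tcp.recvspace=4194304      # default TCP receive buffer
net.inet.tcp.sendspace=4194304      # default TCP send buffer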

This evening I did some research into tuning 10GbE on the Linux side and used a Mellanox guide to tweak some tunables there. The document I used is here, and below is what I entered in Ubuntu:
Code:
sysctl -w net.ipv4.tcp_timestamps=0
sysctl -w net.ipv4.tcp_sack=1
sysctl -w net.core.netdev_max_backlog=250000
sysctl -w net.core.rmem_max=4194304
sysctl -w net.core.wmem_max=4194304
sysctl -w net.core.rmem_default=4194304
sysctl -w net.core.wmem_default=4194304
sysctl -w net.core.optmem_max=4194304
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"
sysctl -w net.ipv4.tcp_low_latency=1


With both sets of tunables active I was able to measure 155 MBps on an rsync between the two servers, and the Linux server was running at ~97% CPU so it's maxed out. At least this speed proves that everything is running at 10GbE as it's too fast for gigabit, so it looks like nothing is wrong with my ESXi configuration or OS drivers, hooray! Soon I'll be upgrading the CPU in the Linux server and reconfiguring it with FreeNAS to support ZFS replication, but at least now I'm comfortable knowing I'm getting everything the old unit is capable of, and the bottleneck is no longer the NIC for the time being.
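
For reference, those sysctl -w settings don't survive a reboot on the Linux side. A minimal sketch for making them stick (assuming a systemd-based Ubuntu; the file name is arbitrary) is to put the same key = value pairs into a file under /etc/sysctl.d/ and reload:
Code:
# persist the tuning across reboots (only a few of the keys shown; add the rest the same way)
cat > /etc/sysctl.d/90-10gbe-tuning.conf <<'EOF'
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 65536 4194304
EOF

# reload all sysctl configuration files
sysctl --system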
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
If Linux was running at 90%+ CPU, that tells me it's not using any offloading. Sounds like a card/driver issue. Did you run iperf in TCP mode, or was it UDP? Also, checking for excessive interrupts may be worthwhile. You should be getting far better speeds without all the fiddling.
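
Something like this would show it (a sketch; the interface name enp3s0 and the IP are placeholders):
Code:
# check which offloads the NIC/driver actually has enabled (interface name is a placeholder)
ethtool -k enp3s0 | grep -E 'tcp-segmentation-offload|generic-receive-offload|large-receive-offload'

# look for excessive interrupt counts from the Mellanox card
grep -i mlx /proc/interrupts

# run iperf explicitly in TCP mode against the other box (which runs: iperf -s)
iperf -c 10.0.0.2 -t 30 -P 4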
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
What kind of Mellanox cards are you using? What version of Linux is the client running, and does it have the same model Mellanox card? What are the specs of the client (CPU, RAM, etc.), and what CPU is in the host? We can play with tunables on both systems until we're blue in the face, but until we know the underlying hardware on both ends we won't know if we're close to what we can expect.
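
For gathering that info on the Linux side, something like this covers it (a sketch; the interface name is a placeholder):
Code:
# NIC model and driver/firmware version
lspci | grep -i mellanox
ethtool -i enp3s0

# CPU, memory, and distro version
lscpu
free -h
lsb_release -a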
 
Johnnie Black

Joined
May 10, 2017
Messages
838
With both sets of tunables active I was able to measure 155 MBps on an rsync between the two servers,

Rsync is great, but it's not the fastest. I could never get more than around 150 MB/s for a single rsync session. I could get some improvement by changing the SSH cipher, but it's still a lot slower than, for example, scp. These are a few tests I made recently between a FreeNAS server and a couple of Linux servers directly connected with 10GbE Mellanox NICs.

FreeNAS as source (its CPU has hardware AES support), Linux as destination, tested on destination servers with and without hardware AES support in the CPU; the test was transferring a single 4.3 GB file.
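
Forcing the cipher looks roughly like this (host and paths are placeholders); in the rsync results below the cipher is the SSH cipher, passed through with -e, since rsync's own -c flag means checksum:
Code:
# scp with an explicit cipher
scp -c aes128-ctr file.bin user@10.0.0.2:/mnt/tank/

# rsync over ssh with an explicit cipher
rsync -a --progress -e "ssh -c aes128-gcm@openssh.com" file.bin user@10.0.0.2:/mnt/tank/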

Code:
No hardware AES on dest:

scp default cipher                                100% 4181MB 150.1MB/s   00:27
scp -c aes128-ctr                                 100% 4181MB 192.8MB/s   00:21
scp -c aes128-gcm@openssh.com                     100% 4181MB 185.2MB/s   00:22
rsync default cipher                              4,384,403,013 100%  112.54MB/s    0:00:37
rsync -c aes128-gcm@openssh.com                   4,384,403,013 100%  142.83MB/s    0:00:29


Hardware AES on dest:

scp default cipher                                100% 4181MB 194.0MB/s   00:21
scp -c aes128-ctr                                 100% 4181MB 394.2MB/s   00:10
scp -c aes128-gcm@openssh.com                     100% 4181MB 431.7MB/s   00:09
rsync default cipher                              4,384,403,013 100%  120.25MB/s    0:00:34
rsync -c aes128-gcm@openssh.com                   4,384,403,013 100%  144.51MB/s    0:00:28


So based on these results and my googling, I believe it won't be easy to get much more than around 150 MB/s from a single rsync. What I've been doing to get around that is running simultaneous rsync transfers, around 5 or so, which gets me to around 350 MB/s. There are various ways of doing it; a rough sketch of one is below.
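
For example, something along these lines (paths and host are placeholders; assumes simple directory names), one rsync per top-level directory, up to five at a time:
Code:
# run up to 5 rsync sessions in parallel, one per top-level directory of the dataset
ls /mnt/tank/dataset | xargs -P5 -I{} \
    rsync -a "/mnt/tank/dataset/{}" user@10.0.0.2:/mnt/backup/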
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
Johnnie Black, you're completely correct, and that's good background info for anyone who stumbles into this thread. In my case it's not as relevant, because I'm doing a local rsync with no encryption at all. Specifically, instead of rsyncing to a remote host at IP:/folder, I mount the remote host into the local file system and rsync to the mount point, "fooling" rsync into treating it as a local transfer with no encryption (roughly as in the sketch below).
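
In other words, something like this (host and paths are placeholders; the mount could just as well be CIFS):
Code:
# mount the remote dataset locally over NFS (host/paths are placeholders)
mkdir -p /mnt/remote
mount -t nfs 10.0.0.2:/mnt/tank/backup /mnt/remote

# rsync to the mount point -- rsync sees a local path, so no ssh/encryption is involved
rsync -a --progress /data/ /mnt/remote/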

In the early days of troubleshooting this, before I saw the difference between local and remote rsync and the encryption slowdown, I assumed the limitation was the rsync encryption as well, but I was able to disprove that in a couple of different ways. First, changing the encryption cipher made no difference to the rsync speed at all. Second, trying to rsync to IP:/folder actually slowed everything *way* down, since that brought in the encryption baggage compared to the described "local" method. The final nail in the coffin was that no matter what method I tried, from dd to scp to plain old cp, mounted via NFS or CIFS, etc., everything maxed out at the exact same 75 MBps, with no more than 35% CPU utilization on the Linux machine. So clearly the problem was with the network; it just wasn't clear what the cause was.

I even tried multiple transfers, as described in some posts above, but that barely changed anything. The best I could do before changing any tunables was two streams at 45-50 MBps each, or three streams at 30-35 MBps each. Better than the 75 MBps of a single stream, but still pretty pathetic, and clearly still capped artificially by something in the network rather than maxing out my CPU...

kdragon75, I realize that 150 MBps even after fiddling with tunables is nothing to brag about; however, I'm confident now that I'm at least maxing out the abilities of my old Linux server. Clearly the network bottleneck was alleviated by the tunables, and the current bottleneck is my junk CPU on the Linux side. Once I upgrade that CPU it will be interesting to see whether the network becomes a bottleneck again, and at what speed, but for now the important point is that all the equipment is clearly linked up at 10GbE. 150 MBps is "slow", but it's too fast for gigabit. That rules out misconfiguration in my ESXi infrastructure, OS drivers, etc., which is a relief.

For the record, my Linux server is a Supermicro mobo with a dual-core 2.2 GHz Celeron, running mdadm RAID-6. I built it in 2008, and as much as I *really* wanted to use ZFS back then, FreeNAS didn't really exist as a usable product and ZFS support on Linux was barely in its infancy, so I didn't trust it. I wasn't going to build a Solaris server, so here we are :) It's an old dog, but it's done its job for me for over 10 years now.

When I built the new FreeNAS server to replace it, I grabbed a couple of cheap Mellanox ConnectX-2 cards on eBay. They're similarly long in the tooth, but they work well enough in Linux and FreeNAS, so for the $35 or so I spent on two cards and a fiber cable it was a fun experiment. I plan to upgrade the CPU in the Linux machine, install FreeNAS, and use it as a replication target over 10GbE for the new primary server. Even if the speed tops out at only, say, 3 Gbps, the intent was to keep the replication traffic off my primary LAN. I could have done that with a gigabit NIC just the same; using 10GbE was more of a "because I can and it's cheap" kind of thing.

Long story short, after tweaking the tunables I'm getting all I can out of my network for now, until I put more horsepower into the Linux machine. Whenever I do that I'll try to drop back in and let folks know what kinds of speeds I can get, but in the meantime I'm satisfied with what I have, and I've proven that the system isn't grossly misconfigured.
 

acquacow

Explorer
Joined
Sep 7, 2018
Messages
51
dual-core Celeron, 2.2 GHz, mdadm RAID-6

Yeah, that could be a slight bottleneck.

But then again, I'm using 10-core 1.9 GHz Xeons and maxing out 10GbE just fine. I guess I just have enough core overhead that it's not an issue. I also only have PCIe flash at all of my endpoints; the drives max out around 1.4 GB/sec.

I've found rsync to be slow, especially with small files, but I mostly do transfers over Samba with multichannel enabled (sketched below). That has no trouble hitting full 10GbE speeds with large files.
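
In case anyone wants to try it, multichannel is just a global smb.conf option (a sketch; on FreeNAS it would go in the SMB auxiliary parameters, and support varies by Samba version and platform):
Code:
# /etc/samba/smb.conf (or FreeNAS auxiliary parameters) -- enable SMB3 multichannel
[global]
    server multi channel support = yes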

Also, a buddy just linked me this, you may find it interesting:
https://calomel.org/aesni_ssl_performance.html
 