10GbE performance (iperf = good, data copy = slow)

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
I built a new FreeNAS (on ESXi) server, 11.2, and I'm trying to copy all the data from my old Ubuntu server to it. There's a point-to-point 10GbE cable between them, and when I run iperf in either direction it measures 4gbps. I'm happy with that; it's a limitation of virtualizing the NIC in ESXi instead of passing it through.

In any case, I've tried about a dozen different ways to copy the data from the old server to the new, but almost all of them max out at 75 MBps (600 Mbps) or less. I've confirmed that the transfer is using the 10GbE path (both FreeNAS & Ubuntu show traffic on the 10GbE interfaces, not on the 1GbE), but I can't get the speed up no matter what I've tried.

Here are several of the things I've tried, with some notes/thoughts about what I think it tells me:
1.) Mount the FreeNAS pool under Ubuntu using NFS, run rsync on Ubuntu as a "local" transfer = 75-80 MBps
2.) Don't mount anything, run rsync on Ubuntu as a "remote" transfer to 'user@FreeNAS:/mnt/Tank' = 35-40 MBps

Clearly mounting the share and doing a local transfer is faster. Presumably because it skips the SSH tunnel, encryption, etc.
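For reference, the two invocations were roughly of this shape (paths and folder names here are simplified placeholders, not my real layout):
Code:
# 1.) NFS mount on Ubuntu, then a "local" rsync into it
sudo mkdir -p /mnt/tank
sudo mount -t nfs FreeNAS:/mnt/Tank /mnt/tank
rsync -a --progress /srv/data/ /mnt/tank/data/

# 2.) no mount, rsync straight over SSH
rsync -a --progress /srv/data/ user@FreeNAS:/mnt/Tank/data/
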

3.) Mount the FreeNAS pool under Ubuntu using NFS, run dd on Ubuntu = 75-80 MBps
4.) Mount the FreeNAS pool under Ubuntu using NFS, run cp on Ubuntu = I forget the speed, but it wasn't any higher, might have been a little lower

The above two show it's not a limitation of rsync, I think, since 'dd' runs the same speed. I didn't expect 'cp' to be fast, but I did it for the heck of it.
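(For the dd test it was something like the following, with placeholder paths; copying a real file rather than /dev/zero avoids ZFS compression on the pool inflating the number:)
Code:
# large block size keeps the request size big; file names are hypothetical
dd if=/srv/data/large-file.iso of=/mnt/tank/large-file.iso bs=1M
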

5.) Mount the Ubuntu share under FreeNAS using mount_smbfs, run rsync on FreeNAS as a "local" transfer = 75-80 MBps

The CPU usage on Ubuntu was never higher than 30-35% during any of the above methods, but I thought I'd try moving some execution off to FreeNAS instead since it's a beefier server. Basically #5 is #1 in reverse, hoping to offload everything but the actual data movement from Ubuntu. The above result seems to show that it's not a CPU limitation. The usage stayed around 30% on Ubuntu, and speed didn't improve at all.

6.) SSH into Ubuntu and do Step #1 three times on three different folders, hoping to get ~3x aggregate speed (ie: parallel transfers)

CPU utilization on Ubuntu went up to 60-70% or so, as expected, but transfer speed ended up being split between the three, totaling 80MBps, the same as before. So the limit is definitely not CPU/mem on the old server, it must be something about the 10GbE network path?
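(That test was basically three backgrounded rsyncs, along these lines, with made-up folder names:)
Code:
# run three copies concurrently, one per top-level folder
rsync -a /srv/data/folder1/ /mnt/tank/folder1/ &
rsync -a /srv/data/folder2/ /mnt/tank/folder2/ &
rsync -a /srv/data/folder3/ /mnt/tank/folder3/ &
wait   # the three together still totaled only ~80 MBps
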

7.) Using a Windows workstation on the 1GbE network, mount both old and new servers' Samba shares, then drag and drop folders from old to new = 90 MBps

This last one is the real kicker. It uses the 1GbE network instead of the 10GbE network, but manages to run ~15% faster, clearly limited by the 1GbE interface on the Windows machine. I can't for the life of me understand why a point-to-point transfer over the 10GbE network goes even slower than this drag and drop over 1GbE through Windows and multiple switches. I would suspect something misconfigured about the 10GbE interface, maybe even misconfigured in ESXi, but using iperf between FreeNAS and Ubuntu (in either direction) I get 4gbps no problem...? I'd be thrilled to get even half that moving real data across, which would be more than 3x what I'm actually getting now. Does anyone have any suggestions?

I'm really hoping this is one of those "you idiot" moments where I've overlooked something very simple. It's sad and humorous at the same time that my best bet for copying data at this point is a drag and drop in Windows to get around the "slower" 10GbE link, lol :) It's no huge deal since this is a one-time transfer and even if it takes days it won't bother me. But in the future I want to run rsync backups from FreeNAS to Ubuntu over the 10GbE link, and I'd like to get as much bandwidth as possible out of the link. Thanks!
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Can you post a sanitized copy of your VM's .vmx file? Or at least the output of lspci in FreeNAS?
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
Here's the result of lspci. I can take a look at pulling the VMX and sanitizing it later if necessary.
Code:
root@freenas:~ # lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 01)
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 01)
00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 08)
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 08)
00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 08)
00:07.7 System peripheral: VMware Virtual Machine Communication Interface (rev 08)
00:0f.0 VGA compatible controller: VMware SVGA II Adapter
00:10.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 01)
00:11.0 PCI bridge: VMware PCI bridge (rev 02)
00:15.0 PCI bridge: VMware PCI Express Root Port (rev 01)
       ... removed 20+ of the same PCI bridge lines at addresses 15, 16, 17, 18 ...
00:18.0 PCI bridge: VMware PCI Express Root Port (rev 01)
02:00.0 USB controller: VMware USB1.1 UHCI Controller
02:01.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
02:02.0 USB controller: VMware USB2 EHCI Controller
02:04.0 SATA controller: VMware SATA AHCI controller
02:05.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
0b:00.0 SATA controller: Intel Corporation C610/X99 series chipset sSATA Controller [AHCI mode] (rev 05)


That didn't look right to me, since it showed two gigabit controllers, which made me think to dump ifconfig as well. From that it looks like em1 (my 10GbE interface) autonegotiated 1000baseT.
Code:
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=9b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM>
        ether 00:0c:29:b1:d6:89
        hwaddr 00:0c:29:b1:d6:89
        inet 192.168.99.10 netmask 0xffffff00 broadcast 192.168.99.255
        nd6 options=9<PERFORMNUD,IFDISABLED>
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active


I'm looking into that now to see if I can tweak ESXi to get FreeNAS to report 10GbE (I swear it used to before I did the 11.2 upgrade...?). It still seems weird that iperf can do 4gbps over what appears to be a gigabit link...? I feel like you sent me down a productive path though; I'll report back if I figure out anything else useful. Thanks!
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
I built a new FreeNAS (on ESXi) server, 11.2, and I'm trying to copy all the data from my old Ubuntu server to it. There's a point-to-point 10GbE cable between them, and when I run iperf in either direction it measures 4gbps. I'm happy with that; it's a limitation of virtualizing the NIC in ESXi instead of passing it through.
...snip...
I get iperf transfer rates close to line speed over the 10G interfaces on my virtualized FreeNAS instances. I had to configure jumbo frames to get it:

https://forums.freenas.org/index.ph...ceiving-network-data.53722/page-2#post-378891

I got less than half this rate -- ~4.5Gb/s -- using the standard 1500-byte MTU. Sounds like you may have the same problem.
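Roughly what that looks like, as a quick shell sketch (interface name is from my own VMXNET3 setup; the persistent way is the interface options in the FreeNAS GUI plus the vSwitch MTU in ESXi), along with a ping that proves jumbo frames actually pass end to end:
Code:
# on FreeNAS (FreeBSD): bump the interface MTU and confirm it took
ifconfig vmx0 mtu 9000
ifconfig vmx0 | grep mtu

# from the Ubuntu side: 8972 = 9000 minus IP+ICMP headers; -M do forbids fragmentation
ping -M do -s 8972 192.168.99.10
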

Nevertheless... I never get anything close to 10G transfer rates in real life, whether via SMB, rsync, or replication. No idea why, and I've pretty much given up in despair at ever utilizing the full potential of my network. Ah, well.

After seeing your lspci output, I'm wondering how you have your FreeNAS VM configured. Specifically, are you sure you're not passing the NIC through? My FreeNAS VMs all report the VMware VMXNET3 ethernet controller when I run lspci.

Good luck!
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
You're totally right Spearfoot. Seeing that lspci output shocked me, because I know it used to be set up with the VMXNET3 driver. I just went into my ESXi config and somehow that adapter type had been changed to e1000. I just changed it back to VMXNET3 and I'm rebooting FreeNAS, so hopefully this is the "you idiot" moment I was hoping for :) It's just odd, because I haven't knowingly changed anything on the ESXi side in months. Obviously the 11.2 upgrade isn't going to jack with my ESXi settings, so I have no idea how they changed.

I'll tinker around on this path and report back whether it's working or not. Thanks again to both of you!
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
This tale only gets weirder. I reset the adapter in ESXi to VMXNET3, and it's now detected as that in FN lspci. I re-ran iperf and now I measure 5.11gbps, up a little bit from the 4gbps I had before. However, I just kicked off the same rsync command (it resumed where it left off) but it's only running at 50-55 MBps, down from the previous 75 MBps. I totally thought that once we identified the misconfigured GbE controller this would resolve itself... ugh.

I am only using 1500 MTU for what it's worth, but frankly I just wanted to see it "working properly" first before I went into more exotic tweaks like MTU. Another interesting note: if I drag and drop between servers from Windows on the 1GbE network *while* the rsync is running over the 10GbE, the rsync rate drops from 55 to 30 MBps, but the Windows copy goes at 88 MBps no problem. The Ubuntu CPU spikes to 70% from 30%, but clearly the servers have the resources to push more data between them. That 10GbE path is just gimped for some reason...
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
For reference, if anyone wants to see the new info:
lspci entry:
Code:
13:00.0 Ethernet controller: VMware VMXNET3 Ethernet Controller (rev 01)


ifconfig:
Code:
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=60039b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:0c:29:b1:d6:89
        hwaddr 00:0c:29:b1:d6:89
        inet 192.168.99.10 netmask 0xffffff00 broadcast 192.168.99.255
        nd6 options=9<PERFORMNUD,IFDISABLED>
        media: Ethernet autoselect
        status: active
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
Sorry for all the back-to-back posts, but just another data point. The 55 MBps I had above was when running rsync on FreeNAS, from an SMB-mounted filesystem on the Ubuntu box to the Tank. I just re-tested the other direction, running rsync from Ubuntu, from its own filesystem to the NFS-mounted Tank, and got the original 75 MBps.

Before fixing the adapter driver they were both giving 75 MBps, so it's still weird that now running the copy from FreeNAS got slower, but at least I'm back running at the same speed I was while I continue to troubleshoot :)
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
At this point, please provide a *detailed* map of the network including all hardware, uplink settings, physical switch config, and addressing.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Also what is the guest OS setting for the VM in ESXi? (all part of the .vmx file)
 

acquacow

Explorer
Joined
Sep 7, 2018
Messages
51
If you ever need to compare data points, I'm running FreeNAS 11.1-U6 and ESXi 5.5 and can saturate 10gige between my ESXi VMs and FreeNAS just fine.

This is only with large files though, like ISOs. You'll never saturate 10gige with small files. eg: my user profile folder only transfers around ~30MB/sec due to all the small files in it.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
If you ever need to compare data points, I'm running FreeNAS 11.1-U6 and ESXi 5.5 and can saturate 10gige between my ESXi VMs and FreeNAS just fine.

This is only with large files though, like ISOs. You'll never saturate 10gige with small files. eg: my user profile folder only transfers around ~30MB/sec due to all the small files in it.
I haven't seen any hardware listed yet... Therefore your performance experience is not applicable.
 

acquacow

Explorer
Joined
Sep 7, 2018
Messages
51
My FreeNAS box is a Xeon D-1541 with 32GB of DDR; I have a RAIDZ1 SSD tier and a RAIDZ2 8-HDD tier.
ESXi box is a Xeon E5-2648L v2 with 96GB of DRAM and 3.2TB of PCIe flash with an Intel X540 NIC.
Switch is a Netgear 10-gig XS708T.
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
Sorry it's been a busy week with the holidays. The good news is it didn't take that long to rsync ~5TB between the two servers, I just let it run overnight one night, but I never did figure out why the 10 GbE link is going so slowly.

I'll try to answer some of the questions from above. First, the FreeNAS VM is configured as a FreeBSD 64-bit guest OS. That seemed like the closest match when I built the VM long ago; however, I've always had a notification saying "The configured guest OS (FreeBSD (64-bit)) for this virtual machine does not match the guest that is currently running (FreeBSD 11.2-STABLE)", so maybe I've been foolish for ignoring that all along? :)

General hardware description of the ESXi server is a Supermicro X10SRL-F, 8-core Xeon, 64GB ECC, with an LSI SAS HBA and one of the two on-board Intel HBAs passed through to FN. The LSI connects to the backplane of the SM 846 chassis and the Intel HBA has an s3700 SSD connected for ZIL/SLOG.

As far as the 10GbE networking, I hate to say it, but it's about as simple as it gets. Starting with the physical hardware I have a Mellanox 10GbE card in the motherboard owned by ESXi as vmnic2. I have a second identical Mellanox 10GbE card sitting in my Linux server, and a fiber running between them. It's point to point at the physical level, no other switches or anything between the cards in the ESXi and Linux servers.

Back in ESXi, unfortunately I can't pass the 10GbE NIC through to FreeNAS because it creates some kind of IRQ errors from what I remember. I researched that back when I first built the machine and at the time it seemed like that was "normal" for my configuration and not something that I could fix. Now that ESXi 6.7 is out (I'm running 6.5) I've considered updating and trying pass through again, but it doesn't really concern me because unlike the storage adapters I didn't really need FreeNAS to own the NIC...

So, the 10Gb vmnic2 is configured with auto-negotiate disabled and speed set to 10,000 Mbps, full duplex under ESXi. It is then connected as the uplink for a vSwitch1 I created. The original vSwitch0 is used for all the 1GbE networking for the VMs. Here are the settings of vSwitch1:
Code:
vSwitch Details
    MTU    9000
    Ports    3072 (3057 available)
    Link discovery    Listen / Cisco discovery protocol (CDP)
    Attached VMs    1 (1 active)
    Beacon interval    1
NIC teaming policy
    Notify switches    Yes
    Policy    Route based on originating port ID
    Reverse policy    Yes
    Failback    Yes
Security policy
    Allow promiscuous mode    No
    Allow forged transmits    No
    Allow MAC changes    No
Shaping policy
    Enabled    No


vSwitch1 only has one other thing connected to it, which is the FreeNAS VM, NIC #2. That NIC is configured with the VMXNET3 type in ESXi and inside FreeNAS it detects it as vmx0:

Code:
vmx0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 9000
        options=60079b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,TSO6,LRO,RXCSUM_IPV6,TXCSUM_IPV6>
        ether 00:0c:29:b1:d6:89
        hwaddr 00:0c:29:b1:d6:89
        inet 192.168.99.10 netmask 0xffffff00 broadcast 192.168.99.255
        nd6 options=9<PERFORMNUD,IFDISABLED>
        media: Ethernet autoselect
        status: active


One thing that strikes me as odd in the above is the "SIMPLEX" tag. That doesn't seem normal, but my 1GbE ports also say simplex, so maybe that's just a FreeBSD thing I'm not familiar with. For what it's worth, just for fun I've tried manually setting the media type to 10GbaseT from the shell, but it won't take. I don't know if there's a way to tell what speed it's getting from the "autoselect"? As you can see though, from the virtual 10GbE NIC inside FreeNAS the connection is to the virtual 10GbE switch inside ESXi, and then directly out via fiber to the Linux server's 10GbE NIC.
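In case it's useful to anyone else, these are the kinds of checks I've been poking at (interface names here are just examples):
Code:
# FreeBSD side: list every media type the vmx driver claims to support
ifconfig -m vmx0

# Linux side: ethtool reports the negotiated speed of the physical Mellanox port
ethtool ens1f0
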

When I run iperf, I usually see ~9 Gbps if the Linux client is testing the FreeNAS server, and ~3 Gbps if the FreeNAS client is testing the Linux server:
Code:
marc@Backup:~$ iperf -c 192.168.99.10
------------------------------------------------------------
Client connecting to 192.168.99.10, TCP port 5001
TCP window size: 96.1 KByte (default)
------------------------------------------------------------
[  3] local 192.168.99.19 port 57336 connected with 192.168.99.10 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  10.2 GBytes  8.74 Gbits/sec
marc@Backup:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.99.19 port 5001 connected with 192.168.99.10 port 63856
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  3.48 GBytes  2.98 Gbits/sec


So far, no matter what I've tried with actual data transfers though, I keep capping right around 100 MBps, just like you'd expect from a GbE NIC. Even running multiple transfers, say an rsync from FN to Linux and another rsync from Linux to FN, there's plenty of CPU to spare, but they'll share the bandwidth and run at ~50 MBps each. The bottleneck is definitely in the network, and whatever the root cause is the symptom seems to be that my 10GbE link only runs at 1GbE, except when I use iperf where it works great...?
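One more thing I still want to try, since a single TCP stream can be capped by the socket buffer sizes: comparing one iperf stream against several, and against a larger window (plain iperf2 flags):
Code:
# on FreeNAS:
iperf -s -w 1M

# on Ubuntu: 4 parallel streams, 1 MB window, 30-second run
iperf -c 192.168.99.10 -w 1M -P 4 -t 30
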
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
Something hit me and I remembered that with older versions of FN there were some networking tunables and something about mbufs where you had to tweak one or both to get good 10GbE performance. I did a little searching and came up with this list of tunables:
Code:
sysctl kern.ipc.somaxconn=2048
sysctl kern.ipc.maxsockbuf=16777216
sysctl net.inet.tcp.recvspace=4194304
sysctl net.inet.tcp.sendspace=2097152
sysctl net.inet.tcp.sendbuf_max=16777216
sysctl net.inet.tcp.recvbuf_max=16777216
sysctl net.inet.tcp.sendbuf_auto=1
sysctl net.inet.tcp.recvbuf_auto=1
sysctl net.inet.tcp.sendbuf_inc=16384
sysctl net.inet.tcp.recvbuf_inc=524288


Here are my current values for comparison:
Code:
kern.ipc.somaxconn: 128
kern.ipc.maxsockbuf: 2,097,152
net.inet.tcp.recvspace: 65,536
net.inet.tcp.sendspace: 32,768
net.inet.tcp.sendbuf_max: 2,097,152
net.inet.tcp.recvbuf_max: 2,097,152
net.inet.tcp.sendbuf_auto: 1
net.inet.tcp.recvbuf_auto: 1
net.inet.tcp.sendbuf_inc: 8,192
net.inet.tcp.recvbuf_inc: 16,384


The tunable values above were for FN 9.x, but I wonder if I should try to plug them in even though I'm running 11.2? It looks like the socket buffer maximums are increased 8x, the max connection backlog 16x, and the send & receive space values 64x. If I did decide to try these, would I put them in through the GUI in the System -> Tunables page? Just add the variable name and the new value I want there, then reboot after they're all in? Or do I need to edit /etc/sysctl.conf manually?

Or am I barking up an obsolete tree here and these tunables are no longer relevant in FN 11? :)
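(If I do try them, I figure the low-risk way is to set them temporarily at the shell first and read them back, then only make them permanent via the GUI tunables once they prove out. Something like:)
Code:
# applies immediately but is lost on reboot
sysctl net.inet.tcp.recvspace=4194304
# read it back to confirm it took
sysctl net.inet.tcp.recvspace
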
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Remove all network tunables and reset everything to an MTU of 1500. Newer versions of FreeBSD handle 10GbE much more readily than older releases did.

Basically start with a clean, un-tweaked state and work from there.
 

acquacow

Explorer
Joined
Sep 7, 2018
Messages
51
Well, now that I know your hardware, I can add this.

I had virtualized FreeNAS on an X9SRL-F with an E5-2648L v2 cpu running ESX 5.5. I didn't pass through the 10GigE card and just used the VMWare virtual nic in FreeNAS.

I had MTU of 9000 set in FreeNAS and in the vswitch/vnic for the 10gigE virtual network in ESX.

I can set this up again if you'd like to compare notes.

Considering I'm on a generation older hardware, you should be able to max 10gigE just fine.

-- Dave
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
kdragon, I already have a clean, un-tweaked state. I was asking above whether I should add those tunables to attempt to make things better. It sounds like your vote on that is no? I might try it anyway over the next few days, since I'm fresh out of other options...

Also, I had the MTU at 1500 when I created this thread, and bumped it to 9000 last week to see if it made any difference. It made iperf run faster, but had no effect, positive or negative, on the speed of rsync. Makes sense, since the problem seems to be the 10GbE link running at 1GbE. I just can't figure out where in the chain it thinks it's gigabit.

acquacow, it sounds like your setup on the X9 board was very similar to mine. I agree with you that it seems I should be able to do this with no problems. I wonder if there's anything on the Linux side that I should be looking at as well; maybe the FreeNAS/ESXi side has been a red herring all along..? I'll give that a look over the next few days, and at some point, once I know the data is secure on the FreeNAS server, I plan to rebuild the Linux server using FreeNAS as well. Maybe things will change once it's talking FN to FN. Fingers crossed, lol.
 

Magius

Explorer
Joined
Sep 29, 2016
Messages
70
Just for fun I entered the tunables from above and re-ran the test. I didn't enter them as persistent through the GUI, just temporarily using sysctl at the shell. Re-running iperf, the speed from the FN client to the Linux server nearly doubled, from 3 Gbps to 5.72 Gbps. The speed in the other direction was unchanged at 9 Gbps.

I then re-ran one of my rsync commands on the Linux host, pushing data over to FN, and instead of the 75 MBps I was getting before, it went at 105 MBps. That's a substantial improvement, but still pretty much what a maxed gigabit line would do. The CPU usage on the Linux server was ~70% instead of ~30-35% like before, so it's possible that's affecting the transfer, but I'd still think it should go a little faster until it maxes the CPU.

I still feel like there's a network bottleneck somewhere, but the tunables definitely helped. I'll probably make them permanent in the GUI and just cut my losses until I upgrade the Linux server to FN and start testing ZFS replications.
 