Poor 10G network performance

Status
Not open for further replies.

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,466
Although this involves my FreeNAS box (specs in my sig if relevant), I believe it's only tangentially--but there are some folks around here who just know this stuff quite a bit better than I, and I'm hoping someone can chime in with some helpful information or suggestions.

My problem, in short, is that I'm running into poor network performance between my FreeNAS server and a guest on a Proxmox VE 4.1 host. For connections running through a Dell 5524 switch, I'm seeing widely variable transfer rates in a range (as measured by iperf) of 432 Mb/sec to 1.8 Gb/sec. For a 10G connection, this seems entirely too low.

With the benefit of some troubleshooting from the Proxmox forums (https://forum.proxmox.com/threads/poor-network-performance-on-guest.25502/), it appears there are two distinct issues: (1) throughput to the guest is significantly slower than throughput to the host, using the same physical connections; and (2) throughput through the switch is significantly slower than throughput over a direct NIC-to-NIC connection. It is the latter that I'd like to focus on here.

The FreeNAS box and the PVE host each have a Chelsio T420-SO-CR 2-port 10G NIC installed. One port of each NIC is connected to the switch using a 2M twinax patch cable. The other port of each NIC is connected to the other NIC using the same kind of cable. The two ports are configured on different subnets--the port connected to the switch is on 192.168.1.0/24; the direct-connected port is on 192.168.2.0/24.

Here are the (rough) numbers I'm seeing:
  • VM <-> FreeNAS via switch: 430 Mb/sec - 2 Gb/sec
  • VM <-> FreeNAS via direct connection: ~ 5.5-6 Gb/sec
  • Host <-> FreeNAS via switch: 4.2-4.5 Gb/sec
  • Host <-> FreeNAS via direct connection: 9.4 Gb/sec
The VM <-> FreeNAS via switch number is, as you see, highly variable. I believe the 430 Mb/sec number is an anomaly; I've only seen that once. But I have seen 800-900 Mb/sec several times, so even taking that as the low end, it's pretty variable.

So, connections through the switch are half the speed, or less, of direct connections. That sounds like a problem for a switch that's supposed to handle 10G. The switch's administration page indicates an operating temp of 46 C, so I wouldn't think it's overheating.
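
To put a number on "half the speed, or less," here is a quick sketch of the arithmetic using the rough ranges listed above. The dictionary layout and the function name are just for illustration, and the figures are my approximate iperf readings, not careful measurements.

```python
# (low, high) of the rough ranges reported above, in Gb/sec.
rates = {
    ("VM", "switch"): (0.43, 2.0),
    ("VM", "direct"): (5.5, 6.0),
    ("Host", "switch"): (4.2, 4.5),
    ("Host", "direct"): (9.4, 9.4),
}

def switch_to_direct_ratio(endpoint):
    """Return (worst, best) ratio of switched to direct throughput."""
    sw_lo, sw_hi = rates[(endpoint, "switch")]
    di_lo, di_hi = rates[(endpoint, "direct")]
    # Worst case: slowest switched run against fastest direct run, and vice versa.
    return sw_lo / di_hi, sw_hi / di_lo

for endpoint in ("VM", "Host"):
    worst, best = switch_to_direct_ratio(endpoint)
    print(f"{endpoint}: switched path runs at {worst:.0%} to {best:.0%} of direct")
```

Even in the best case, neither the host nor the VM reaches half of its direct-connection rate when going through the switch.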

I understand there can be compatibility issues with SFP+, and I may not be able to determine that my patch cables are 100% compatible with both the NICs and the switch. I can look for Dell-branded optics for the switch, Chelsio-branded optics for the NICs, and fiber to hook it all up, but that's looking like $150 at least to test it.

I guess it's possible that the switch itself is defective. It's under warranty, but I'm thinking I'll have a hard time getting service without a hard failure.

Thoughts?
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
"It's under warranty, but I'm thinking I'll have a hard time getting service without a hard failure."
If it's under warranty, I'd call Dell and ask why a direct connection is fine, but when a switch is involved, performance is degraded.

I'd start looking at:
physical (SFP+ modules or twinax) issues.
switch configuration (STP? Jumbo frames? Have you looked at the port interface stats for fragmentation?)
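
On the jumbo frames question, a rough back-of-the-envelope sketch may help set expectations. It estimates the best-case TCP throughput on a 10G link for a given MTU. The assumptions here are mine, not from your setup: IPv4, TCP timestamps enabled, no VLAN tag, and the function name is only illustrative.

```python
# Per-frame overhead on the wire, in bytes (standard Ethernet, no VLAN tag).
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
IP_TCP_HEADERS = 20 + 20 + 12    # IPv4, TCP, TCP timestamp option (assumed enabled)

def max_tcp_goodput_gbps(mtu, line_rate_gbps=10.0):
    """Best-case TCP payload rate for full-size frames at the given MTU."""
    payload = mtu - IP_TCP_HEADERS
    wire_bytes = mtu + ETH_OVERHEAD
    return line_rate_gbps * payload / wire_bytes

for mtu in (1500, 9000):
    print(f"MTU {mtu}: about {max_tcp_goodput_gbps(mtu):.2f} Gb/sec")
```

If the direct link is running at the default 1500 MTU, the 9.4 Gb/sec you measured is essentially line rate, so the NICs and cables look capable of full speed on that path. Jumbo frames would only buy a few percent on top of that, which is far too little to explain the gap through the switch unless an MTU mismatch is causing fragmentation or drops.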
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Do you have congestion in the switch ports? If you do, do you have flow control enabled on the NICs and the switch? What config do you have on the switch ports?
 