Upgrade recommendations

Joined
May 10, 2017
Messages
838
I just ran iperf between my two FreeNAS units, and got pretty darn good results.

Very similar to what I get, though in my case between FreeNAS and Win10:

Code:
root@tower7:~ # iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 4.00 MByte (default)
------------------------------------------------------------
[  4] local 10.0.0.5 port 5001 connected with 10.0.0.90 port 50831
[ ID] Interval	   Transfer	 Bandwidth
[  4]  0.0-10.0 sec  11.1 GBytes  9.53 Gbits/sec
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
Very similar to what I get, though in my case between FreeNAS and Win10:

And with a fresh install without autotune ...

Code:
iperf -c 10.10.10.152
------------------------------------------------------------
Client connecting to 10.10.10.152, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 10.10.10.220 port 40337 connected with 10.10.10.152 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  10.8 GBytes  9.31 Gbits/sec



Now it's time to find the ESXi driver for the T320 and try to match that speed on a VM, as this proves the hardware is working ;)

Thank you all for your help with my research. I really appreciate it.
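A quick sanity check before hunting for drivers: from the ESXi host shell you can see whether ESXi already has a working driver bound to the card, and whether the card is visible on the PCI bus at all. These are stock ESXi commands, nothing T320-specific:

Code:
# NICs that ESXi has bound a driver to
esxcli network nic list

# whether the Chelsio shows up on the PCI bus at all
lspci | grep -i chelsio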
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I just ran iperf between my two FreeNAS units, and got pretty darn good results.
I still have not figured out what the problem is in my configuration. This is between my two FreeNAS systems that are connected by 10Gb ports on the Aruba switch.
Code:
------------------------------------------------------------
Client connecting to 192.168.1.103, TCP port 5001
TCP window size: 32.8 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.23 port 13404 connected with 192.168.1.103 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  5.94 GBytes  5.10 Gbits/sec

Code:
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 4.00 MByte (default)
------------------------------------------------------------
[  4] local 192.168.1.103 port 5001 connected with 192.168.1.23 port 13404
[ ID] Interval	   Transfer	 Bandwidth
[  4]  0.0-10.0 sec  5.94 GBytes  5.10 Gbits/sec

I am tempted to think that the Mellanox card might be at fault. I have a Chelsio card in one NAS and a Mellanox in the other.
 
Joined
Dec 29, 2014
Messages
1,135
Do you have SSH enabled so you can have multiple sessions up? I don't know if it will prove anything, but what happens if you try iperf to itself on the 10G LAN segment on each box? Does it make any difference if you use a Twinax cable between the boxes (if that will reach)?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Do you have SSH enabled
Absolutely. The first thing I do after installing Windows is install Cygwin, because I like its SSH client better than PuTTY, but I use them both.
I don't know if it will prove anything, but what happens if you try iperf to itself on the 10G LAN segment on each box?
I think you are saying to have the system be both client and server. I know that is possible, but I don't know if it will give useful results. I can try that when I get home, but that won't be until after 8 PM.
Does it make any difference if you use a Twinax cable between the boxes (if that will reach)?
I don't have one long enough to go direct from one to the other bypassing the switch. Is that what you mean? I do have a fiber with transceivers. I could give that a try, but I want to put the tunables that @Johnnie Black shared into the systems, reboot and test first, just to see if that is where it is at. I had been using some tunables that were posted by @jgreco a while back, but I can't find that post. These are different and I am curious if it makes a difference.

I will have to get back to this later because I am at work now.
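For anyone following along, the tunables in question are the usual 10GbE buffer-size sysctls. These are illustrative values only, not necessarily the exact ones from either of those posts:

Code:
# typical FreeNAS/FreeBSD 10GbE tunables (type: sysctl) - illustrative values
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_inc=524288
net.inet.tcp.sendspace=262144
net.inet.tcp.recvspace=262144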
 
Joined
Dec 29, 2014
Messages
1,135
I think you are saying to have the system be both client and server. I know that is possible, but I don't know if it will give useful results.

Yes, that is exactly what I am suggesting. I don't know if it will prove anything either. That SHOULD actually put the packets on the wire. Since you are only involving one system at a time, that MIGHT help identify which one is holding your performance back.
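If it helps, the self-test is just two SSH sessions on the same box, with the server bound to the 10G interface's address (192.168.1.103 below is the primary NAS's address from earlier in the thread; substitute your own):

Code:
# session 1: iperf server bound to the 10G interface's IP
iperf -s -B 192.168.1.103

# session 2: iperf client pointed at the same IP
iperf -c 192.168.1.103 -t 10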

I don't have one long enough to go direct from one to the other bypassing the switch. Is that what you mean? I do have a fiber with transceivers.

Yes, that is what I am suggesting there as well. I would try the iperf to itself first since that is fairly easy and non-impactful. My gut is that it is something with one of the NICs, but it is worth taking the switch out of the equation if nothing conclusive shows from the self-iperf test. I would be more inclined to suspect the Mellanox NIC, but that is just because the Chelsios tend to work so well. I did a test from my HP DL320 G7 with a Mellanox NIC to my FreeNAS servers and got good throughput. I don't know if the ESXi 6.0 Mellanox drivers get better performance than the FreeBSD Mellanox drivers.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I would be more inclined to suspect the Mellanox NIC, but that is just because the Chelsios tend to work so well.
I have a second Chelsio card, and I have been meaning to change to it, but I just have not taken the time to pull the covers off the system. If not tonight, maybe this weekend. I have been stuck at half of 10Gb speed for around a year, and I was blaming my home-brew switch for it. Now that the switch is gone, it would be nice to finally figure out where the problem is, especially after seeing the results @melloa posted above.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
And with a fresh install without autotune ...

Code:
iperf -c 10.10.10.152
------------------------------------------------------------
Client connecting to 10.10.10.152, TCP port 5001
TCP window size: 1.00 MByte (default)
------------------------------------------------------------
[  3] local 10.10.10.220 port 40337 connected with 10.10.10.152 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  10.8 GBytes  9.31 Gbits/sec



Now it's time to find the ESXi driver for the T320 and try to match that speed on a VM, as this proves the hardware is working ;)

Thank you all for your help with my research. I really appreciate it.
Great job there. I hope to get my system working that well soon.
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
Great job there. I hope to get my system working that well soon.

I'm only half done. The test above was between two bare-metal FreeNAS servers with T320s via the Aruba switch. I now have two other problems to solve:

1 - ESXi 6.5 -> no drivers for my T320 and no $ for a T520, so I'm using the Mellanox. With that I was able to get 9.3 Gb/s one way (server to VM), but I'm still working on VM to server (getting 1.9 Gb/s ...)
2 - My Linux workstation, with a T320, is also only getting ~1.9 Gb/s.

More to come ;)
 
Joined
Dec 29, 2014
Messages
1,135
With that I was able to get one way (server to VM) @ 9.3 Gb/s, but still working on VM to server (getting 1.9 Gb/s ...)

This isn't as specific a comment as I would like, but there is definitely something in ESXi that throttles actions in the hypervisor itself in favor of the functions that service virtual machines. I noticed that particularly when testing from the ESXi host that has my 3 production VMs mounted from FreeNAS. They aren't very busy, so the decrease in iperf throughput between that host and my identically configured 2nd ESXi host wasn't explainable any other way for me. I would love to have a more technical explanation, but it exceeds my somewhat limited ESXi knowledge.
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
but there is definitely something in ESXi that throttles actions in the hypervisor itself in favor of the functions that service virtual machines

I'm not an ESXi expert myself; I'm learning as I go. I'll post my observations if I'm successful so others can duplicate the setup.

One thing though: I was able to increase the VM's tx throughput to 9 Gb/s by turning traffic shaping on. I haven't done any other tweaks yet, and the rx throughput is still holding at 1.9 Gb/s.

[Screenshot attached: upload_2018-10-18_14-0-21.png]
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I still have not figured out what the problem is in my configuration.

Long thread... I haven't been keeping up, but have you tried adjusting the TCP congestion control algorithm?

https://forums.freenas.org/index.ph...low-only-to-freenas-sanity-check.43811/page-2

When I was tuning 10GbE in ESXi, I had an issue where it worked 10/10 (up/down) from Ubuntu but only 1/10 between a FreeNAS VM and bare-metal FreeNAS.

Was quite strange.

Congestion control fixed it. And I think I later confirmed that Ubuntu's default was indeed the new setting... was it cubic?

Maybe :)
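If anyone wants to try it, on FreeBSD/FreeNAS it is just a couple of commands. This is a rough sketch from the shell; cubic needs its kernel module loaded, and on FreeNAS you would normally make it persistent via the Tunables page (cc_cubic_load as a loader tunable, net.inet.tcp.cc.algorithm as a sysctl) rather than at the prompt:

Code:
# see which congestion control algorithms are currently available
sysctl net.inet.tcp.cc.available

# load the CUBIC module and switch to it (newreno is the FreeBSD default)
kldload cc_cubic
sysctl net.inet.tcp.cc.algorithm=cubic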
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
When I was tuning 10GbE in ESXi, I had an issue where it worked 10/10 (up/down) from Ubuntu but only 1/10 between a FreeNAS VM and bare-metal FreeNAS.

One question I have not asked ... what NICs are you guys using on ESXi 6.5? I just got one of these to test, as my T320 is unsupported by VMware: https://www.ebay.com/p/Chelsio-T520...0-1160-50/11019429541?iid=123402002557&chn=ps

My Mellanox is supported, so I will keep it for the VMs and pass that through to FreeNAS ... maybe that will improve my performance.

I will also try the above suggestion.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Yes, that is exactly what I am suggesting. I don't know if it will prove anything either. That SHOULD actually put the packets on the wire. Since you are only involving one system at a time, that MIGHT help identify which one is holding your performance back.
It is a little odd how the results come out.

This is from the primary NAS to itself:
Code:
# iperf -c 192.168.1.103
------------------------------------------------------------
Client connecting to 192.168.1.103, TCP port 5001
TCP window size: 47.9 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.103 port 61184 connected with 192.168.1.103 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  49.7 GBytes  42.5 Gbits/sec

This is from the backup NAS to itself:
Code:
 # iperf -c 192.168.1.23
------------------------------------------------------------
Client connecting to 192.168.1.23, TCP port 5001
TCP window size: 47.9 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.23 port 45422 connected with 192.168.1.23 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  55.5 GBytes  47.5 Gbits/sec

This is from my Windows laptop to itself, with its integrated 1Gb NIC:
Code:
> iperf3 -c 192.168.1.3
Connecting to host 192.168.1.3, port 5201
[  4] local 192.168.1.3 port 49661 connected to 192.168.1.3 port 5201
[ ID] Interval		   Transfer	 Bandwidth
[  4]   0.00-1.00   sec   139 MBytes  1.16 Gbits/sec
[  4]   1.00-2.00   sec   144 MBytes  1.21 Gbits/sec
[  4]   2.00-3.00   sec   152 MBytes  1.27 Gbits/sec
[  4]   3.00-4.01   sec   154 MBytes  1.28 Gbits/sec
[  4]   4.01-5.01   sec   155 MBytes  1.30 Gbits/sec
[  4]   5.01-6.01   sec   158 MBytes  1.32 Gbits/sec
[  4]   6.01-7.00   sec   156 MBytes  1.32 Gbits/sec
[  4]   7.00-8.00   sec   158 MBytes  1.32 Gbits/sec
[  4]   8.00-9.00   sec   159 MBytes  1.33 Gbits/sec
[  4]   9.00-10.00  sec   159 MBytes  1.34 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval		   Transfer	 Bandwidth
[  4]   0.00-10.00  sec  1.50 GBytes  1.28 Gbits/sec				  sender
[  4]   0.00-10.00  sec  1.50 GBytes  1.28 Gbits/sec				  receiver
iperf Done.

So, I changed the tunables on both of the NAS systems and rebooted.
This is what I got from the Windows Desktop, with a 10Gb Mellanox card, to the Primary NAS, also with a Mellanox card:
Code:
>iperf -c 192.168.1.103
------------------------------------------------------------
Client connecting to 192.168.1.103, TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.11 port 61288 connected with 192.168.1.103 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  4.59 GBytes  3.95 Gbits/sec

This is what I get from the backup NAS, with the Chelsio card to the primary NAS:
Code:
# iperf -c 192.168.1.103
------------------------------------------------------------
Client connecting to 192.168.1.103, TCP port 5001
TCP window size: 2.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.23 port 32462 connected with 192.168.1.103 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  5.83 GBytes  4.99 Gbits/sec

So, I am still unsure what the issue is that makes my network slow. Any ideas?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I did another test that I realized I had not done yet.

From the Windows 10 system to the Primary NAS, with the Mellanox card:
Code:
>iperf -c 192.168.1.103
------------------------------------------------------------
Client connecting to 192.168.1.103, TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.11 port 61508 connected with 192.168.1.103 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  4.69 GBytes  4.03 Gbits/sec

From the Windows 10 system to the Backup NAS, with the Chelsio card:
Code:
>iperf -c 192.168.1.23
------------------------------------------------------------
Client connecting to 192.168.1.23, TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.11 port 61510 connected with 192.168.1.23 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  9.79 GBytes  8.41 Gbits/sec

That is a big difference. I think I need to change out that Mellanox card...
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I changed the Mellanox card in the primary NAS, and these are the numbers now:

From Windows to the Primary NAS:
Code:
>iperf -c 192.168.1.103
------------------------------------------------------------
Client connecting to 192.168.1.103, TCP port 5001
TCP window size:  208 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.11 port 61563 connected with 192.168.1.103 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  9.28 GBytes  7.97 Gbits/sec

From the backup NAS to the Primary NAS, both using Chelsio cards:
Code:
# iperf -c 192.168.1.103
------------------------------------------------------------
Client connecting to 192.168.1.103, TCP port 5001
TCP window size: 2.00 MByte (default)
------------------------------------------------------------
[  3] local 192.168.1.23 port 31612 connected with 192.168.1.103 port 5001
[ ID] Interval	   Transfer	 Bandwidth
[  3]  0.0-10.0 sec  11.0 GBytes  9.42 Gbits/sec
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Scratch that. After another reboot, the primary is right back to where it was, even with the Chelsio card, and the tunables are unchanged.
It doesn't make sense to me and I have to get some sleep.
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
My Mellanox is supported, so I will keep it for the VMs and pass that through to FreeNAS ... maybe that will improve my performance.

Just took a step back and will take a sabbatical until the T5 arrives.

It is a little odd how the results come out.

I was experiencing the same with my Mellanox on ESXi: the rx and tx results were very different. I'm moving away from it and will try the T5 in passthrough.

With both bare-metal servers (FreeNAS 11.1 and FreeBSD 11.2) using Chelsio T3s, I'm getting 9.5 Gb/s. The only odd thing is that I had to load the driver manually on a fresh FreeBSD install ... it worked, but the installer didn't find the card?!
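In case it helps anyone, loading it by hand was roughly along these lines (the T3 generation uses the cxgb(4) driver on FreeBSD, while T4/T5 cards use cxgbe(4)):

Code:
# load the Chelsio T3 driver for the running system
kldload if_cxgb

# make it persistent across reboots
echo 'if_cxgb_load="YES"' >> /boot/loader.conf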

With all the time and $ we are investing in these tests, it would be nice to have a resource with proven working configurations and hardware, as I've suggested. The feedback I received was that it has all been discussed here on the networking forum, so I will try to post a summary after I have mine all working.
 
Joined
Dec 29, 2014
Messages
1,135
It is a little odd how the results come out.

This is from the primary NAS to itself:

Clearly either the driver or FreeNAS is not putting the packets on the wire. I thought that might be the case. I think it is FreeBSD doing that since Windows doesn't get anywhere near the same level when talking to itself.
 

melloa

Wizard
Joined
May 22, 2016
Messages
1,749
I wanted to publicly thank you guys for all the patience and help.

Today, I'm running at:

[Screenshot attached: upload_2018-10-23_23-23-16.png]


I'm compiling my notes from my compatibility tests and will post them to the forum.
 