Super slow LAN speed - or is it?

Status
Not open for further replies.

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
I've looked at the other threads and have tried a bunch of things suggested there that didn't work, so here goes...

We have a FreeNAS 9.3 setup that was working fine in our old office. We moved upstairs a few weeks ago, and the whole server room was reconfigured. All the hardware is the same, just in a new location. In the old office, we would easily get 100MB/s throughput to the FreeNAS box over our gigabit network. A few weeks before the move, a drive failed. After the move, I swapped that drive and it took about 24 hours to resilver. During that time we were getting about 1MB/s throughput on the high end, but that made sense because the pool was in degraded mode. After resilvering, we're getting an initial burst of speed over 100MB/s, but it rapidly drops to 10MB/s, where it remains for the duration of the file transfer. That's 1/10 of what we had before the drive swap.

I realize this is not a high end setup, but it has worked fine for us for many months.

Hardware details:
Motherboard: Honestly, can't remember. Something cheap we used for years with OpenFiler before switching to FreeNAS last year. Is there a command line tool I can use to get this info? I'd rather not pull the server out of the rack.
Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz
2GB RAM
Areca ARC-1280/1280ML RAID in pass-thru mode (16x 2TB drives)
Motherboard has onboard Gigabit (1 port)
Intel Pro 82546GB dual-port Gigabit card

Cisco SG 300-28 Managed Gigabit switch

Here's what I've tried:
* All three gigabit ports on the FreeNAS box (all report 1Gb on their LEDs, in ifconfig, and on the switch, regardless of cable or switch port)
* Different ports on the switch
* Turned auto-negotiate off on the switch (it's set to 1000M Full Duplex)
* Different cables (all Cat5e or Cat6)
* Rebooted the FreeNAS system
* Rebooted the Cisco switch

I've also tried copying to the FreeNAS box from multiple machines, all with the same 10MB/s speed limit. All machines on the network are gigabit.

Again, the weird thing here is that it all seemed to happen after a drive failure and resilvering - we used to get very consistent transfer speeds over 100MB/second, so I'm not sure that it's necessarily the network at all. Could something else have caused performance to suddenly degrade? The current status of the pool is "HEALTHY" and all the drives appear to be functioning normally.

If there's additional information that might help to diagnose this, I'm here all day and can post it. Just let me know.

Thanks!
 

adrianwi

Guru
Joined
Oct 15, 2013
Messages
1,231
2GB RAM? Really? It's a miracle it works at all :D
 

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
Yeah, I know. Thing is, I set it up experimentally last year to see if it would work for us as a platform, and it did. I copied a lot of stuff to it to see how it would perform and it was on par with our old OpenFiler setup. We use it as a mid-way point for storing project files before we back up to LTO tape. So it's not used often, but it holds a lot of stuff, usually for several months before it's backed up.

The thing is, this behavior is entirely new. It was working like a champ before that drive swap. Other than the drive swap nothing has changed from before the failure when it was working really well.
 

fta

Contributor
Joined
Apr 6, 2015
Messages
148
You need to narrow down where the bottleneck is. iperf can be used to check the network. If that's ok, start looking at the freenas box and check your read/write speeds directly on the machine.
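
A minimal sketch of that iperf check, assuming iperf 2 is installed on both ends. The address and port are the ones that show up in the iperf run posted later in this thread (192.168.1.112 listening on port 20000); substitute your own. The snippet only prints the two commands so each can be pasted on the right machine.

```shell
# Assumed values, taken from the iperf run posted later in this thread.
server_ip="192.168.1.112"   # machine that will listen
port=20000

# Listener side: -s runs iperf as a server, -p picks the TCP port.
server_cmd="iperf -s -p ${port}"

# Sender side (here the FreeNAS box): 60 second test, report in MBytes/sec.
client_cmd="iperf -c ${server_ip} -p ${port} -t 60 -f M"

echo "run on ${server_ip}:     ${server_cmd}"
echo "run on the FreeNAS box:  ${client_cmd}"
```

Swapping which machine listens tests the other direction, which can matter when only writes to the NAS are slow.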
 

adrianwi

Guru
Joined
Oct 15, 2013
Messages
1,231
And put at least another 6GB RAM in it (and probably a bit more with 16x drives)
 

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
adrianwi said: "And put at least another 6GB RAM in it (and probably a bit more with 16x drives)"

I get that. But if it was working perfectly well before and now it's suddenly 10x slower, obviously it's not the RAM. I understand the system wants more, but if we do upgrades to this machine, it's probably not going to be to that motherboard, it'll probably be in the form of a new mobo/CPU entirely. So I'm not inclined to spend money on RAM for a motherboard we're ultimately not going to use.

In the meantime, *something* changed that caused the performance to tank, and I'd like to try to get the machine back to where it was.
 
friolator

Explorer
Joined
Jun 9, 2016
Messages
80
fta said: "You need to narrow down where the bottleneck is. iperf can be used to check the network. If that's ok, start looking at the freenas box and check your read/write speeds directly on the machine."

I'll give this a try this morning and get back to you.
 

CraigD

Patron
Joined
Mar 8, 2016
Messages
343
I am guessing, but I think some Cat5 cable is being used limiting your network speed to 100Mb/s
 

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
CraigD said: "I am guessing, but I think some Cat5 cable is being used limiting your network speed to 100Mb/s"

All the cable I've tested is Cat5e or Cat6. I'll try another Cat6 cable to see if that makes any difference. Would the NIC and ifconfig still report 1000M if the speed is being throttled to 100 though? Because all hardware and software I've checked so far seems to think this is a gigabit connection...
 

CraigD

Patron
Joined
Mar 8, 2016
Messages
343
My bad. Electrical interference, maybe?

I would not wait before doing a backup

Sorry for going down the wrong road

Have fun
 

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
Trying to rule out the network with iperf, but I'm getting weird results. I've used it before on other machines, and the numbers I get always look sane. But check this out:

[root@freenas ~]# iperf -c 192.168.1.112 -p 20000 -t 60 -f M
------------------------------------------------------------
Client connecting to 192.168.1.112, TCP port 20000
TCP window size: 0.03 MByte (default)
------------------------------------------------------------
[ 6] local 192.168.1.3 port 16466 connected with 192.168.1.112 port 20000
[ ID] Interval Transfer Bandwidth
[ 6] 0.0-60.0 sec 17592186044266 MBytes 293201585863 MBytes/sec

When I don't specify -f M in the command, I get a similarly ridiculous number.

Any ideas?
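
One observation on that number, offered as a guess rather than a diagnosis: with -f M the total is in units of 2^20 bytes, and 17592186044266 sits just under 2^44, which is where a 64-bit byte counter would wrap. A small negative value printed as unsigned would look exactly like this, so the figure probably says nothing about the real link speed. The arithmetic, in a POSIX shell (sh or bash, not the default csh):

```shell
# Figure copied from the iperf output above, in MBytes (-f M means 2^20 bytes).
reported_mbytes=17592186044266

# 2^64 bytes expressed in MBytes is 2^64 / 2^20 = 2^44.
wrap_mbytes=$(( 1 << 44 ))

# Distance below the wrap point, in MBytes.
shortfall=$(( wrap_mbytes - reported_mbytes ))
echo "reported total is ${shortfall} MBytes below 2^44"
```

A shortfall that small is consistent with a counter that went slightly negative and wrapped, not with a real transfer.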
 

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
Disk Speed test:

[root@freenas ~]# dd if=/dev/zero of=/mnt/Z2_21TB/testfile bs=4M count=10000
10000+0 records in
10000+0 records out
41943040000 bytes transferred in 57.621217 secs (727909652 bytes/sec)

So I don't think the issue is disk speed, if I'm getting over 700MB/s
 

adrianwi

Guru
Joined
Oct 15, 2013
Messages
1,231
I'm no FreeNAS expert, but I've read enough posts here from those who are to know that insufficient RAM can result in strange and hard-to-explain behavior. The fact that it was working and now isn't kind of fits that pretty well.
 

maglin

Patron
Joined
Jun 20, 2015
Messages
299
Did the resilver complete? What does zpool status display? Saying "it's not the RAM" when you are at 25% of the minimum is not a valid argument. I'll be surprised if you get much help until you at least have 8GB of RAM. I'm assuming any data on this box doesn't matter to you or the company. If it did, the $50 it costs for the RAM would be spent. Ironically, you will be paid more to troubleshoot it, and you will eventually add more RAM and say the people who made the minimum requirements actually knew what they were talking about.

And if you are on a gigabit link, your router and NIC will tell you on the link lights. But the biggest change is that resilver, which requires RAM to complete. Best of luck.


 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527
friolator said:
"Disk Speed test:

[root@freenas ~]# dd if=/dev/zero of=/mnt/Z2_21TB/testfile bs=4M count=10000
10000+0 records in
10000+0 records out
41943040000 bytes transferred in 57.621217 secs (727909652 bytes/sec)

So I don't think the issue is disk speed, if I'm getting over 700MB/s"

That's probably due to compression. Create a new dataset with compression off.
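
A sketch of that test, assuming the pool is the Z2_21TB from the dd command above and using a made-up dataset name, speedtest. With compression on, a stream of zeros from /dev/zero shrinks to almost nothing before it reaches the disks, so the 727909652 bytes/sec figure may not reflect what the drives can do. By default the snippet only prints the commands; set RUN=1 to execute them as root on the NAS, from a POSIX shell.

```shell
# Pool name comes from the dd command above; "speedtest" is a hypothetical name.
pool="Z2_21TB"
dataset="${pool}/speedtest"
testfile="/mnt/${dataset}/testfile"

# Print each command; also execute it when RUN=1.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# New dataset with compression disabled, then confirm the property took.
run zfs create -o compression=off "$dataset"
run zfs get compression "$dataset"

# Same write test as before, now landing on the uncompressed dataset.
run dd if=/dev/zero of="$testfile" bs=4M count=10000
```

When done, `zfs destroy Z2_21TB/speedtest` removes the test dataset and the file in it.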
 

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
The resilver completed. Took 24 hours, but it's done. The status is listed as "Healthy"

Once the resilver is complete, does the system require more RAM than before? I mean, in all respects other than the network speed, it is functioning normally. Internally we're getting pretty fast disk speeds (as indicated above, over 700MB/second).
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Since your network speed dropped to ~1/10th of what it was, I'd say it's about 90% likely that the negotiated link is dropping to 100Mbps during transfer. There could be a bunch of different things that cause this, from a bad network card to a bad switch port, a bad cable, electrical noise, etc. The easiest way to test this is to do a file transfer in and out of a RAM drive (which would actually be ill-advised on this system, since you only have 2GB).
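
For reference, the back-of-the-envelope numbers behind that guess, ignoring protocol overhead (so real transfers land a bit under these ceilings). This assumes a POSIX shell:

```shell
# Raw line rate in kB/s for a link speed given in Mbit/s (8 bits per byte).
ceiling_kBps() { echo $(( $1 * 1000 / 8 )); }

gigabit=$(ceiling_kBps 1000)   # 125000 kB/s = 125 MB/s
fast_eth=$(ceiling_kBps 100)   # 12500 kB/s  = 12.5 MB/s

echo "1000 Mbit/s ceiling: ${gigabit} kB/s"
echo " 100 Mbit/s ceiling: ${fast_eth} kB/s"
```

The old 100MB/s sits under the first ceiling and the current 10MB/s sits under the second, which is why a link falling back to 100Mbps is a natural suspect.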

Other than that, how recently before the move did you confirm that you were still transferring at 1Gb/s? Is it possible that you only noticed this problem after the move because you weren't looking before?
 

friolator

Explorer
Joined
Jun 9, 2016
Messages
80
Nick2253 said: "Since your network speed dropped to ~1/10th of what it was, I'd say it's about 90% likely that the negotiated link is dropping to 100Mbps during transfer. There could be a bunch of different things that cause this, from a bad network card to a bad switch port, a bad cable, electrical noise, etc. The easiest way to test this is to do a file transfer in and out of a RAM drive (which would actually be ill-advised on this system, since you only have 2GB)."

I've tried different cables, all three gigabit ports on the machine (one onboard, two on a PCI card), and different ports on the switch. The switch and the NICs are all showing gigabit speed on their status lights. ifconfig shows gigabit as well. On the switch, I've set the port I'm currently using to 1Gb (not auto).

Nick2253 said: "Other than that, how recently before the move did you confirm that you were still transferring at 1Gb/s? Is it possible that you only noticed this problem after the move because you weren't looking before?"

Probably a week or so before I got the alert that a drive had failed. I shut the alert off and let it go in degraded mode for about a week before we moved. It was shut down for at least a week after the move while we wired stuff up. First thing I did after getting it back online was to replace the bad drive.

I would have noticed if it was this slow, because it typically takes 3-4 hours to back up an LTO5 tape over our network, and I did three in one day. No way that would have been possible at these speeds.
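
Rough numbers to back that up, assuming a full tape at LTO-5's native capacity of 1.5 TB (a given tape may have held less) and ignoring any overhead. This assumes a POSIX shell:

```shell
# LTO-5 native (uncompressed) capacity in bytes: 1.5 TB.
tape_bytes=1500000000000

# Time to move one full tape's worth of data at a given rate in MB/s.
transfer_time() {
    secs=$(( tape_bytes / ($1 * 1000000) ))
    echo "$(( secs / 3600 ))h$(( secs % 3600 / 60 ))m"
}

echo "at 100 MB/s: $(transfer_time 100)"
echo "at  10 MB/s: $(transfer_time 10)"
```

About four hours at 100MB/s lines up with the 3-4 hours quoted above; at 10MB/s one tape would take well over a day, so three in a day would not fit.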
 

CraigD

Patron
Joined
Mar 8, 2016
Messages
343
Nice you still have your data

Good result!
 