I can only hit 40-45MB/s per host copying to FreeNAS

Status
Not open for further replies.

fullspeed

Contributor
Joined
Mar 6, 2015
Messages
147
Hey Guys,

I've been using FreeNAS at home for at least 5 years and it's been basically flawless. I am a sysadmin who is often put in charge of setting up storage, so I decided to spin up a couple of servers at work with the intention of getting rid of our dodgy and inefficient Windows servers.

I have two identical server/array setups, one at our primary site and one at our secondary site. The install and configuration went great, as I expected. I have snapshots/replication and all the shares set up and I'm super happy in that sense.

The only issue is the very poor network performance: CIFS/NFS/FTP etc. all top out at 40-45MB/s regardless of anything I do (raidz1/raidz2/raidz3/mirror/single disk/zil/no zil/tunables etc.).

Here is my setup (I have two of these)
--
OS - FreeNAS-9.3-STABLE-201503071634
Host - Dell R510 w/ H700 HW raid (for OS)
- 128GB ECC Ram
Network - Intel X520 10GBe (in failover config)
HBA - LSI 9300-16e
Array - SuperMicro SC847J x 2
Drives - 45 x 4TB drives (mix of WD REDs / Seagate NAS drives)
l2arc - 4 x 512GB SSD (Samsung 850 Pro)
ZIL - 2 x 200GB SSD (Intel S3700)

Zpool config ~ three 12 disk raidz3 @ 130TB Raw / 92TB Avail
--
I've connected them both to AD and they are using Windows permissions successfully. I haven't changed a whole lot: it's running SMB3 max, host lookups off, local master off, time server off (on CIFS). What is interesting is that raidz3, raidz2, raidz1, mirror, single disk, zil, no zil etc. have zero effect on performance; all configurations result in almost identical speeds. I've tried over a dozen sysctls (temporarily) and they all had basically no effect.

So here is the kicker: when I copy from different sources to FreeNAS, it breaks through this ceiling. For example:

Fileserver 1 - Initiate a copy, and it transfers at 40MB/s. Open a second transfer on the same server and now both files transfer at 20MB/s. Open two more and they are all at 10MB/s.
Fileserver 2 - While the first copy is still happening I transfer a file and it hits 40MB/s.
Fileserver 3 - While the first two copies are happening I start a transfer and it hits 40MB/s!

Basically, through one server/host I can only get a max of 40-45MB/s to FreeNAS regardless of any tuning or settings. This happens on both 10G and 1G network adapters, in failover or standalone configuration.
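To make the pattern above concrete, here is a toy model of it in Python. It is only a sketch of the observed behaviour, assuming a fixed per-host ceiling of 40 MB/s that is split evenly among that host's streams; the function names are made up for illustration and say nothing about what actually causes the cap.

```python
def per_stream_rate(host_cap_mb_s: float, streams: int) -> float:
    """Rate each stream gets if one host's cap is split evenly among its streams."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return host_cap_mb_s / streams


def aggregate_rate(host_cap_mb_s: float, hosts: int) -> float:
    """Total rate into the NAS if every source host is capped independently."""
    if hosts < 0:
        raise ValueError("hosts must not be negative")
    return host_cap_mb_s * hosts


if __name__ == "__main__":
    cap = 40.0  # MB/s, the per-host ceiling reported in this post (assumed constant)
    for n in (1, 2, 4):
        print(f"{n} stream(s) from one host: {per_stream_rate(cap, n):.0f} MB/s each")
    print(f"3 hosts, one stream each: {aggregate_rate(cap, 3):.0f} MB/s total")
```

The point of the model is that the limit follows the source host rather than the pool or the link, since the total keeps growing as hosts are added.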

Here is a picture of my network interface graph. You can see the big "steps" where I added a new source; it looks like each one is limited to about 300-400Mbit/s.

upload_2015-3-10_14-17-16.png
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
What are the client machines you are using to copy data to/from the FreeNAS box? Are they running an OS that is pre-Windows 7?

lagg is just a very bad idea, especially when you are having problems. KISS (keep it simple, stupid) and once you've verified things perform and are stable, only *then* should you do lagg. Until then, ditch the lagg. ;)

This reeks of something that isn't related to the FreeNAS box as the problem though. Some kind of screwed up networking or something along those lines.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Saw in your other thread that you have 15-20TB of SQL databases.. those always transfer slow.

Also one thing I forgot to mention, but 2TB of L2ARC with 128GB of RAM is just silly. You've exceeded the 5:1 L2ARC:ARC ratio by 4 fold! You'll actually create new problems if you have too much. If you want 2TB of L2ARC you are going to need to upgrade to at *least* 256GB of RAM or so.
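The arithmetic behind that ratio can be sketched as follows. This is only a rough check: it uses the 4 x 512GB L2ARC and 128GB RAM figures from the first post, takes the 5:1 figure as the rule of thumb stated above, and treats all RAM as ARC, which overstates ARC (so the real ratio would be higher). The function names are hypothetical.

```python
def l2arc_to_ram_ratio(l2arc_gb: float, ram_gb: float) -> float:
    """L2ARC size divided by RAM size (RAM used as a stand-in for ARC)."""
    return l2arc_gb / ram_gb


def max_l2arc_gb(ram_gb: float, rule_ratio: float = 5.0) -> float:
    """Largest L2ARC the rule of thumb would allow for a given amount of RAM."""
    return ram_gb * rule_ratio


def overshoot(l2arc_gb: float, ram_gb: float, rule_ratio: float = 5.0) -> float:
    """How many times over the rule-of-thumb ratio the configuration is."""
    return l2arc_to_ram_ratio(l2arc_gb, ram_gb) / rule_ratio


if __name__ == "__main__":
    l2arc_gb = 4 * 512  # four 512GB SSDs from the original post
    ram_gb = 128
    print(f"L2ARC:RAM ratio       {l2arc_to_ram_ratio(l2arc_gb, ram_gb):.1f}:1")
    print(f"5:1 limit at 128GB    {max_l2arc_gb(ram_gb):.0f} GB")
    print(f"times over the limit  {overshoot(l2arc_gb, ram_gb):.1f}")
```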
 

fullspeed

Contributor
Joined
Mar 6, 2015
Messages
147
What are the client machines you are using to copy data to/from the FreeNAS box? Are they running an OS that is pre-Windows 7?

lagg is just a very bad idea, especially when you are having problems. KISS (keep it simple, stupid) and once you've verified things perform and are stable, only *then* should you do lagg. Until then, ditch the lagg. ;)

This reeks of something that isn't related to the FreeNAS box as the problem though. Some kind of screwed up networking or something along those lines.

I totally agree with you regarding the KISS method! I have tested it with just a single 10 gig link and even switched to a 1gig copper just to be sure and they all suffer from the same issue.

The clients I have used to test are Windows 8 / 8.1 / 2012 / 2012 R2 / Ubuntu, and all of them cap out at the same speed. I initially shared your view that it was unrelated to FreeNAS, which is why I booted an Ubuntu live CD on the same hardware and broke 100MB/s over CIFS using a single disk. FreeNAS is doing 42MB/s to a single disk (and in every other configuration I have tried).

a single disk no ssd zil/l2arc is doing 42MB/s
a large raidz3 with zil/l2arc SSDs is doing 42MB/s

Edit:

Regarding your post above, everything transfers slowly, not just the database backups. If you think the L2ARC might have an impact, is there any way to limit it as a test?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Have you tried getting rid of SMB3? It's not the default because it's somewhat buggy still. :P
 

fullspeed

Contributor
Joined
Mar 6, 2015
Messages
147
Have you tried getting rid of SMB3? It's not the default because it's somewhat buggy still. :p

I've tried all of them, and NFS/FTP etc. are all slow :(
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Try some iperf tests between the Ubuntu server and the FreeNAS box to rule out the network config. If that comes back okay, then focus on the FreeNAS config/protocol side of things.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,449
What does the "top" command reveal? Are you maxing out any of the cores/threads? Make sure you display all the cores/threads.
Maybe you have a low end Xeon that cannot keep up.
Do you have replication or high compression enabled other than LZ4?
Are you throttling your network connection either from the network or from within FreeNAS?
In the BIOS, do you limit CPU performance in any way?
If you replicate one dataset or copy a very big file within the pool, what throughput can you achieve?
Is your pool encrypted?
Are you transferring data across ssh with encryption?
 

alpaca

Dabbler
Joined
Jul 24, 2014
Messages
24
What is the controller and backplane in use? I am assuming you're not using the h700 RAID controller as that could certainly be a problem. You have a lot of drives connected and the backplane/controller could be an issue. From recent build experience, I have seen a wonky dual-expander backplane cause lots of aggravation.
 

fullspeed

Contributor
Joined
Mar 6, 2015
Messages
147
What does the "top" command reveal? Are you maxing out any of the cores/threads? Make sure you display all the cores/threads.
Well, whether it's one or eight transfers, they all look roughly like this:
74421 root 1 94 0 331M 29588K CPU12 12 0:16 74.37% smbd


Maybe you have a low end Xeon that cannot keep up.
This is about the max I've seen (with 8-10 transfers, snapshotting, replication etc.):
upload_2015-3-11_9-43-30.png



Do you have replication or high compression enabled other than LZ4?
LZ4 is enabled, I've tried without replication and it has no effect on performance.

Are you throttling your network connection either from the network or from within FreeNAS?
On the network side, no. On the FreeNAS side I don't even know how I would do that. If you can tell me how to check, I will.

In the BIOS, do you limit CPU performance in any way?
No, that was one of the first things I checked.

If you replicate one dataset or copy a very big file within the pool, what throughput can you achieve?
Same limit, around 300Mbit/s or 40MB/s (looking at the network graph on the destination side). This is with one or five replication streams.

Is your pool encrypted?
No

Are you transferring data across ssh with encryption?
I'm snapshotting and replicating to another FreeNAS box via ssh. Turning this on or off has no effect on transfers.

 

fullspeed

Contributor
Joined
Mar 6, 2015
Messages
147
What is the controller and backplane in use? I am assuming you're not using the h700 RAID controller as that could certainly be a problem. You have a lot of drives connected and the backplane/controller could be an issue. From recent build experience, I have seen a wonky dual-expander backplane cause lots of aggravation.

LSI 9300-16e HBA Controller
SuperMicro SC847J Chassis/Backplane

I think I ruled out hardware when I broke over 100MB/s with Ubuntu on a single drive.
 

fullspeed

Contributor
Joined
Mar 6, 2015
Messages
147
Try some iperf tests between the Ubuntu server and the FreeNAS box to rule out the network config. If that comes back okay, then focus on the FreeNAS config/protocol side of things.

This is freenas1 -> freenas2, basically the same iperf as windows -> freenas1

Server listening on TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.0.84 port 5001 connected with 192.168.0.85 port 18176
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.59 GBytes 1.36 Gbits/sec
[ 5] local 192.168.0.84 port 5001 connected with 192.168.0.85 port 49576
[ 5] 0.0-10.0 sec 1.64 GBytes 1.41 Gbits/sec
[ 4] local 192.168.0.84 port 5001 connected with 192.168.0.85 port 32947
[ 4] 0.0-10.0 sec 1.56 GBytes 1.34 Gbits/sec
[ 5] local 192.168.0.84 port 5001 connected with 192.168.0.85 port 14053
[ 5] 0.0-10.0 sec 1.59 GBytes 1.36 Gbits/sec
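For anyone comparing runs like these, the result lines can be pulled out and summarised with a few lines of Python. This is a minimal sketch that only understands the exact line shape pasted above (iperf 2 style, Gbits/sec); the regular expression and function name are assumptions for illustration, and it is not clear from the paste whether the four results were sequential or concurrent, so both the mean and the sum are shown.

```python
import re

# The four result lines from the output above, copied verbatim.
IPERF_OUTPUT = """\
[ 4] 0.0-10.0 sec 1.59 GBytes 1.36 Gbits/sec
[ 5] 0.0-10.0 sec 1.64 GBytes 1.41 Gbits/sec
[ 4] 0.0-10.0 sec 1.56 GBytes 1.34 Gbits/sec
[ 5] 0.0-10.0 sec 1.59 GBytes 1.36 Gbits/sec
"""

# Matches "[ id] start-end sec <transfer> <unit> <rate> Gbits/sec".
RESULT_LINE = re.compile(
    r"^\[\s*\d+\]\s+[\d.]+-\s*[\d.]+ sec\s+[\d.]+ \w+\s+([\d.]+) Gbits/sec"
)


def gbits_per_result(text: str) -> list[float]:
    """Return the bandwidth (Gbit/s) of every result line found in text."""
    rates = []
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            rates.append(float(match.group(1)))
    return rates


if __name__ == "__main__":
    rates = gbits_per_result(IPERF_OUTPUT)
    mean = sum(rates) / len(rates)
    print(f"results found: {len(rates)}")
    print(f"mean per result: {mean:.2f} Gbit/s ({mean * 1000 / 8:.0f} MB/s)")
    print(f"sum if concurrent: {sum(rates):.2f} Gbit/s")
    print(f"mean as share of 10 Gbit/s: {mean / 10:.0%}")
```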
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
If that is through a 10Gb LAN connection, you have majorly misconfigured your network somewhere. That should be at least 7Gb (and that's being generous, it really should be over 9Gb/sec).

Oh, and you should be able to do that over a single connection, not 4 parallel connections like you are doing. ;)
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,449
I suspect you are core/thread limited.
Besides the 74% Samba usage, you have ZFS and other processes running that do not necessarily show under "top". So I think your cores are maxed out.
By the way, the CPU report graph only shows the combined CPU usage and will not indicate if one core really maxes out.
I would have liked the report to plot the individual threads. This is way too confusing as it is.

Back to Freenas throttling network throughput, I think it is only available during replication as part of the replication setup.

What is the Xeon part number? I did look up the Dell specs, but they claim it is using the latest Xeon processor (What a marketing crap). But from which Century?
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
I suspect you are core/thread limited.
Besides the 74% Samba usage, you have ZFS and other processes running that do not necessarily show under "top". So I think your cores are maxed out.

This is why I like "top -SH" so I can see kernel threads, and it displays per thread instead of per process.
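If you save the per-thread output to a file, a short script can flag anything that is close to saturating one core. The following is only a sketch: it assumes each line has a CPU column ending in "%" followed by the command name, as in the smbd line posted earlier in the thread, and the 70% threshold is an arbitrary choice for illustration. The function names are hypothetical.

```python
from typing import Optional


def parse_top_line(line: str) -> Optional[tuple[float, str]]:
    """Return (cpu_percent, command) from one top line, or None if it has no CPU column."""
    tokens = line.split()
    for i, token in enumerate(tokens):
        if token.endswith("%"):
            try:
                percent = float(token[:-1])
            except ValueError:
                continue  # something like "n/a%"; keep looking
            # Everything after the CPU column is the command, which may contain spaces.
            return percent, " ".join(tokens[i + 1:])
    return None


def busy_threads(lines: list[str], threshold: float = 70.0) -> list[tuple[float, str]]:
    """Entries using at least `threshold` percent of a single core."""
    hot = []
    for line in lines:
        parsed = parse_top_line(line)
        if parsed is not None and parsed[0] >= threshold:
            hot.append(parsed)
    return hot


if __name__ == "__main__":
    # The smbd line quoted earlier in this thread.
    sample = ["74421 root 1 94 0 331M 29588K CPU12 12 0:16 74.37% smbd"]
    for percent, command in busy_threads(sample):
        print(f"{command}: {percent:.2f}% of one core")
```

A single thread sitting near 100% while the overall CPU graph looks idle is the case the combined graph hides.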

What is the Xeon part number? I did look up the Dell specs, but they claim it is using the latest Xeon processor (What a marketing crap). But from which Century?

True, a slow clocked xeon is not going to help single thread speeds. I'm running a e5-1650 and use 85% of a core doing about 800 mbytes/sec over samba. But then it's a 3.5 ghz part. Whereas a 2603 or something that may only be running at 1.6 would be a different story.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,449
This is why I like "top -SH" so I can see kernel threads, and it displays per thread instead of per process.



True, a slow clocked xeon is not going to help single thread speeds. I'm running a e5-1650 and use 85% of a core doing about 800 mbytes/sec over samba. But then it's a 3.5 ghz part. Whereas a 2603 or something that may only be running at 1.6 would be a different story.

800 MBytes/sec, as in 800MB/s?
This is over 10Gbit network right?

Over my 1Gbit/s Ethernet, and Xeon E3-1241-V3 I am able to achieve the following:

118MB/s => 15% smbd
=> 13% intr(irq265: igb0:que)
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
800 MBytes/sec, as in 800MB/s?
This is over 10Gbit network right?

Over my 1Gbit/s Ethernet, and Xeon E3-1241-V3 I am able to achieve the following:

118MB/s => 15% smbd
=> 13% intr(irq265: igb0:que)

Yeah, 10gig Intel X540s. I don't remember what the load from the irq intr's was. Not terribly high though. Samba was definitely the closest to the single thread wall.
 

fullspeed

Contributor
Joined
Mar 6, 2015
Messages
147
I suspect you are core/thread limited.
Besides the 74% Samba usage, you have ZFS and other processes running that do not necessarily show under "top". So I think your cores are maxed out.
By the way, the CPU report graph only shows the combined CPU usage and will not indicate if one core really maxes out.
I would have liked the report to plot the individual threads. This is way too confusing as it is.

Back to Freenas throttling network throughput, I think it is only available during replication as part of the replication setup.

What is the Xeon part number? I did look up the Dell specs, but they claim it is using the latest Xeon processor (What a marketing crap). But from which Century?

http://ark.intel.com/products/37111/Intel-Xeon-Processor-X5570-8M-Cache-2_93-GHz-6_40-GTs-Intel-QPI

When I first installed FreeNAS and was testing all the disk configs, even with one disk and no replication/snapshotting I was still hitting that 40-45MB/s limit. I'd post my top output right now, but nothing is really going on: after transferring 38TB I lost 3 disks from one of my raidz3s, so it is going to be another 24 hrs for the resilver :(. I am not surprised, as when you dump 45 disks in and hammer them full throttle you are going to find some bad drives. Unfortunately three of them were in one group :/

Could you guys explain the CPU/kernel threads/interrupts further and how I would test this? Even if it's not the CPU, I should rule it out.

Also, I'll add that I spun up a live ISO of nas4free, strictly because it also has ZFS, and I was able to hit 82MB/s, although that seems to be its cap as well.
 

Ray Milyard

Patron
Joined
Aug 8, 2014
Messages
262
Before the last few FreeNAS updates I was getting 100k+ copying files from SMB shares to my Win 8.1 box. Now I get a max of 12mb's.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
12 MB/s or 12 Mb/s?

If it's 12 MB/s it's pretty much exactly the throughput of a 100 Mb/s link (12.5 MB/s) so I assume some part of the chain in your network is degraded to 100M because of poor quality cables for example (you need CAT5E or better for a gigabit network).
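The comparison is just a divide-by-eight, which can be sketched in Python. This is a rough check only: it uses raw link rates and ignores protocol overhead, so real-world ceilings are somewhat lower, and the function names and the list of link speeds are chosen for illustration.

```python
# Common Ethernet link speeds in Mbit/s.
LINK_SPEEDS_MBIT = (10, 100, 1000, 10000)


def link_ceiling_mb_s(link_mbit: float) -> float:
    """Raw ceiling of a link in MB/s (8 bits per byte, no protocol overhead)."""
    return link_mbit / 8


def smallest_link_that_fits(observed_mb_s: float):
    """Slowest link speed (Mbit/s) whose raw ceiling is at or above the observed rate."""
    for speed in LINK_SPEEDS_MBIT:
        if observed_mb_s <= link_ceiling_mb_s(speed):
            return speed
    return None  # faster than any link in the list


if __name__ == "__main__":
    for observed in (12, 42, 118):
        speed = smallest_link_that_fits(observed)
        print(f"{observed} MB/s fits under a {speed} Mbit/s link "
              f"(ceiling {link_ceiling_mb_s(speed):.1f} MB/s)")
```

A rate that sits just under one of these ceilings, like 12 MB/s against 12.5 MB/s, points at the link speed; a rate well below the ceiling, like 42 MB/s on gigabit, points somewhere else.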
 