First FreeNAS Build on Xen Server with HBA - are my performance test results expected?


bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
Hi – I’ve embarked on my first FreeNAS build for home use and am testing configurations before I go live with the setup.

Based on what I have done so far, I have questions about file copy speeds and NIC throughput. Searching the forums I can see there is a lot of info out there, but I’m having trouble determining whether the performance I am getting is expected or whether I have a problem, and if the latter, what I need to do about it. I'm not sure where my bottlenecks are; the Reporting pages don't show any that I can tell (CPU, memory and Ethernet all look good, though I'm not 100% sure about the HDDs).


Tests

  1. File copy from my client's SSD to FreeNAS peaks at 70MB/s. I have Gigabit Ethernet; shouldn't I be able to get about 110MB/s?
  2. File copy within the same dataset, via an SMB share from the Windows 10 client – 200-300MB/s
  3. File copy between 2 datasets within the same pool, via SMB shares on the Windows 10 client – back down to 70MB/s. I understand this should drop vs. 2), but by this much?

Hardware:

  • I am running FreeNAS as a VM on Xen Server 7.2
  • The Xen Server has an LSI 9211 HBA with the latest firmware (P20 in IT mode), passed through to the FN VM, which runs FreeNAS 11u4. For those interested, I could not get pass-through working until I upgraded from Xen Server 7.0 to 7.2.
  • The HBA has 6 WD Red 3TB HDDs configured as RaidZ2.
  • The machine has 32GB RAM, with 16GB allocated to the FreeNAS VM, and an E5-2670 Xeon with 4 cores assigned, on a Supermicro X9SRL-F.
  • I am using the onboard Ethernet of the X9SRL-F, which has 2 ports.
  • The Windows 10 client is running Pro on an i3-1230v3 Xeon.

My research tells me

  • Copying across datasets is slower because it physically rewrites the files to disk, but I do not know how much slower to expect.
  • If connecting to Windows clients, a Windows-type dataset and SMB shares should be used; avoid mixing Unix datasets with SMB shares, etc.
  • I think I read that RaidZ2 writes to 1 disk at a time and is slowed by the double parity writes. Is this my issue? I could consider other volume types.

Hope I’ve given enough info
 

jasonmicron

Dabbler
Joined
Oct 24, 2017
Messages
15
For test 1: That could be limited by the disk write speed in the NAS.
For test 2: If you didn't reboot the client after test 1, you're probably seeing cached speeds. Reboot and try this test again, I'd bet you'll see around 70MB/s.
For test 3: Again, this could be write speed limitations on your disks in the NAS.

A better test would be to do a dd write on the mounted NAS volume itself. SSH into the FreeNAS system and see what your speeds are when doing:
Code:
dd if=/dev/zero of=/<path>/<to>/<RAID>/testfile1 bs=1M count=1000

If that test completes too quickly, increase the count to 10000. This test will create a file called "testfile1" in your mounted RAID path. You should see it complete at the maximum write speed your RAID volume is capable of. This can help determine whether the issue is your network or your RAID array writes.
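
Once the write test is done, you can check reads the same way by reading that file back and discarding it (a sketch along the same lines; reboot first or use a file larger than RAM if you want to dodge cache effects):
Code:
dd if=/<path>/<to>/<RAID>/testfile1 of=/dev/null bs=1M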

Edit: Sorry if those paths above don't work for the DD test, I'm not too familiar with BSD but that's how you would do that in Linux.
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
Hi jason - thank you for the tip on the dd cmd. I ran it with count = 1000, 10000 and 100000, which I understand equates to a 1GB, 10GB and 100GB write to a dataset. If I've got this right, then 24 sec for a 100GB write is fast...

I repeated my original tests after a reboot of the FN VM, with results consistent with those posted above. In addition I tested 2 more things:
- A copy from FN to the SSD on the Win10 client does 110MB/s (so maxing out the Gigabit connection); it's ~70MB/s going the other way.
- Test 2 - After a reboot, copying a file within a dataset is about 300MB/s. Re: your comment on cached speeds: when I do a 2nd copy without a reboot it increases to 1.2GB/s.

There's a bottleneck somewhere, just not sure where at this point... I'll keep running tests and report back. I suspect, though it's only a guess, that it's something to do with Samba or the shares.

Code:
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/VOL_Z2/Data/testfile1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 0.198135 secs (5292222166 bytes/sec)
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/VOL_Z2/Data/testfile1 bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes transferred in 2.311500 secs (4536344561 bytes/sec)
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/VOL_Z2/Data/testfile1 bs=1M count=100000
100000+0 records in
100000+0 records out
104857600000 bytes transferred in 24.071685 secs (4356055583 bytes/sec)
[root@PROGFS ~]#
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
Further testing: I destroyed my pool and created 2 vols with my 6 WD Red HDDs.

1) 2 disks in a stripe (Raid 0)
2) 4 disks in a striped mirror (Raid10)

Below are the dd results. I also reran my original tests 1, 2 and 3 and got similar results to my RaidZ2 pool; in fact they were slightly worse at around 60MB/s (vs. 70MB/s with RaidZ2).

What is this telling me? Any thoughts on what I should test to confirm what my bottlenecks are?


Code:
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Stripe/ST1/testfile1 bs=1M count=100000
100000+0 records in															 
100000+0 records out															
104857600000 bytes transferred in 22.802012 secs (4598611724 bytes/sec)		 
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Stripe/ST1/testfile1 bs=1M count=100000
100000+0 records in															 
100000+0 records out															
104857600000 bytes transferred in 22.931702 secs (4572604311 bytes/sec)		 
[root@PROGFS ~]#			 
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
This is a 6-disk Raid0, which should be pretty fast, right?

Fastest so far off the dd test.

When I copy within a dataset via my Win10 client I get 1GB/s across a 100GB file copy.
When I copy the same file across 2 datasets within the same pool I get ~60MB/s.

dd results

Code:
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Vol/DS1/testfile1 bs=1M count=100000   
																				
100000+0 records in															 
100000+0 records out															
104857600000 bytes transferred in 21.121758 secs (4964435150 bytes/sec)


What should I test next?
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
And finally, a single disk. Dataset copies both within and across datasets give the same results as the previous tests, ~60MB/s.

The dd test gives the same results over a 100GB write. Can I conclude my RAID layout is not the issue?

Code:
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Vol/DS1/testfile3 bs=1M count=100000   
100000+0 records in															 
100000+0 records out															
104857600000 bytes transferred in 22.591603 secs (4641441280 bytes/sec)		 
[root@PROGFS ~]#
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
This is interesting. I’m back to a RaidZ2 pool of 6 disks…
LZ4 compression off

Code:

[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Pool/DS1/testfile1 bs=1M count=100000   
100000+0 records in 
100000+0 records out 
104857600000 bytes transferred in 247.795156 secs (423162428 bytes/sec) 
[root@PROGFS ~]# 
[root@PROGFS ~]# 


LZ4 compression on

Code:
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Pool/DS1/testfile3 bs=1M count=100000 
100000+0 records in 
100000+0 records out 
104857600000 bytes transferred in 24.815851 secs (4225428373 bytes/sec) 
[root@PROGFS ~]# 
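
For anyone repeating this: compression can be flipped per dataset from the shell instead of the GUI (a minimal sketch; Pool/DS1 is just the dataset from my tests, and these are standard ZFS commands rather than anything FreeNAS-specific):
Code:
zfs get compression Pool/DS1       # show the current setting
zfs set compression=off Pool/DS1   # turn it off for raw dd testing
zfs set compression=lz4 Pool/DS1   # turn it back on when done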
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
Still testing on my RaidZ2 pool...

So after a bit more reading, my understanding is that I should be running these tests without compression, so realistically my write speed is 423MB/s (based on the above).

Why does it drop to ~60MB/s when copying outside a dataset? I haven't been able to determine this conclusively; however:
- I understand copies within a dataset are server-side copies.
- I still don't get why it should drop so much. If I copy across 2 datasets with compression on or off, it's consistently ~60MB/s via the shares. Why is this the case?

Out of interest I set up FTP, having read that shares and/or protocols can have an impact.
- I was able to achieve ~90MB/s copying from Win10 to FN, a big improvement on the previous ~60MB/s rates. Going the other way maxes out my Gigabit connection at 110MB/s.

There's still a server-side bottleneck somewhere in the chain, but I'm not sure where.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
1. dd tests must be done with compression turned off.
2. Copying into the same share will use server-side copy and be much faster than your network.
3. Did you run an iperf test? In both directions (see the sketch below this list).
4. Your file size will impact your performance. Ideally you can do a dd test over SMB to get streaming performance.
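
A rough sketch of 3) with classic iperf 2 (the IP is just the server address already in this thread; a Windows build of iperf 2 on the client is assumed):
Code:
# on the receiving machine:
iperf -s
# on the sending machine (use -r afterwards, or swap the roles, to test the other direction):
iperf -c 172.21.8.70 -t 30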
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Copying into the same share will use server-side copy and be much faster than your network.

This will depend on the intersection of protocol, server and client side capabilities.
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
I ran some network tests with iperf. I'd never used it before, so it was good to learn something new. The network looks like it has an issue: it seems OK going from FN to the Win10 client but is slower the other way.

With respect to Win10 (client) to FN (server), I tested transfers to both 82574L NICs on the server, on different cables, and got similarly slow results. On the client, the NIC is a cheap onboard something (Realtek).

FN to Win10

Code:
iperf -c clientip 
------------------------------------------------------------ 
Client connecting to 172.21.8.51, TCP port 5001 
TCP window size: 32.8 KByte (default) 
------------------------------------------------------------ 
[  3] local 172.21.8.70 port 46974 connected with 172.21.8.51 port 5001 
[ ID] Interval  Transfer  Bandwidth 
[  3]  0.0-10.0 sec  1.07 GBytes  919 Mbits/sec 
[root@PROGFS ~]# 


Code:
iperf -c clientIP -w 64KB 
------------------------------------------------------------ 
Client connecting to 172.21.8.51, TCP port 5001 
TCP window size: 64.2 KByte (WARNING: requested 64.0 KByte) 
------------------------------------------------------------ 
[  3] local 172.21.8.70 port 34425 connected with 172.21.8.51 port 5001 
[ ID] Interval  Transfer  Bandwidth 
[  3]  0.0-10.0 sec  1.05 GBytes  903 Mbits/sec 
[root@PROGFS ~]# 


Win10 to FN

Code:

iperf -c serverNIC1
------------------------------------------------------------
Client connecting to 172.21.8.70, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[304] local 172.21.8.51 port 50024 connected with 172.21.8.70 port 5001
[ ID] Interval  Transfer  Bandwidth
[304]  0.0-10.0 sec  897 MBytes  752 Mbits/sec

iperf -c serverNIC2
------------------------------------------------------------
Client connecting to 172.21.8.101, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[304] local 172.21.8.51 port 50045 connected with 172.21.8.101 port 5001
[ ID] Interval  Transfer  Bandwidth
[304]  0.0-10.0 sec  905 MBytes  759 Mbits/sec
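
One more thing worth trying at this point (a sketch, not something I have run yet): parallel streams and a larger window, to see whether a single TCP stream is what caps the Win10 to FN direction. These are standard iperf 2 flags:
Code:
iperf -c 172.21.8.70 -w 256KB -P 4 -t 30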

 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
There is a problem with the network going from the Windows machine to the FreeNAS box. Do you have 2 NICs on the same subnet on a single machine? You should not be doing that; use only 1 NIC per subnet.
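
A quick way to see which interface the FreeNAS box actually uses to reach the client is from the FreeNAS shell (both are standard FreeBSD commands; the IP is your Windows client):
Code:
netstat -rn                # routing table; two interfaces on the same subnet is the red flag
route -n get 172.21.8.51   # shows which interface is chosen for traffic to the client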
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
Some more test results. I just tested dd over SMB.

I couldn't work out how to run dd from my Windows client, but luckily I have the same machine running Mint 18, so I did the test from the Linux client to FN. All hardware is the same.
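
For reference, the share was mounted on the Mint box with cifs before running dd against it (a sketch; the share name is assumed to match the mount point, the username is a placeholder, and the SMB version option may need adjusting):
Code:
sudo mkdir -p /mnt/Other
sudo mount -t cifs //172.21.8.70/Other /mnt/Other -o username=<user>,vers=3.0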

Throughput seems good over SMB.

Code:
PROGLM Other # dd if=/dev/zero of=/mnt/Other/testfile1 bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 9.15357 s, 115 MB/s
PROGLM Other # dd if=/dev/zero of=/mnt/Other/testfile2 bs=1M count=10000
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB, 9.8 GiB) copied, 89.5904 s, 117 MB/s
PROGLM Other # dd if=/dev/zero of=/mnt/Other/testfile3 bs=1M count=100000
100000+0 records in
100000+0 records out
104857600000 bytes (105 GB, 98 GiB) copied, 892.435 s, 117 MB/s


And I re-ran my iperf tests from the Linux client.

...1 NIC is good, the other is not... but the results are drastically different from the W10 results posted earlier.

Code:
iperf -c 172.21.8.70 -w 64KB
------------------------------------------------------------
Client connecting to 172.21.8.70, TCP port 5001
TCP window size:  128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
[  3] local 172.21.8.51 port 45560 connected with 172.21.8.70 port 5001
[ ID] Interval  Transfer  Bandwidth
[  3]  0.0-10.0 sec  1.07 GBytes  923 Mbits/sec


iperf -c 172.21.8.119 -w 64KB
------------------------------------------------------------
Client connecting to 172.21.8.119, TCP port 5001
TCP window size:  128 KByte (WARNING: requested 64.0 KByte)
------------------------------------------------------------
[  3] local 172.21.8.51 port 51420 connected with 172.21.8.119 port 5001
[ ID] Interval  Transfer  Bandwidth
[  3]  0.0-10.0 sec  112 MBytes  94.2 Mbits/sec


And from the server side back to the Linux client

Code:
[root@PROGFN ~]# iperf -c 172.21.8.51 -w 64KB   
------------------------------------------------------------   
Client connecting to 172.21.8.51, TCP port 5001   
TCP window size: 64.2 KByte (WARNING: requested 64.0 KByte)   
------------------------------------------------------------   
[  3] local 172.21.8.70 port 34461 connected with 172.21.8.51 port 5001   
[ ID] Interval  Transfer  Bandwidth   
[  3]  0.0-10.0 sec  1.02 GBytes  872 Mbits/sec   
[root@PROGFN ~]#   



Seeing the post from SweetAndLow, I then checked my NIC setup on FN (where there are 2 82574L ports). One is configured with a static IP, i.e. 172.21.8.70; the 2nd has a dynamic IP.

I then ran ifconfig

Code:
ifconfig   
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384   
  options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>   
  inet6 ::1 prefixlen 128   
  inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1   
  inet 127.0.0.1 netmask 0xff000000   
  nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>   
  groups: lo   
xn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500   
  options=503<RXCSUM,TXCSUM,TSO4,LRO>   
  ether ae:50:78:36:da:e4   
  inet 172.21.8.70 netmask 0xffffff00 broadcast 172.21.8.255   
  nd6 options=9<PERFORMNUD,IFDISABLED>   
  media: Ethernet manual   
  status: active   
xn1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500   
  options=503<RXCSUM,TXCSUM,TSO4,LRO>   
  ether f6:a0:b7:fd:76:6a   
  inet 172.21.8.119 netmask 0xffffff00 broadcast 172.21.8.255   
  nd6 options=9<PERFORMNUD,IFDISABLED>   
  media: Ethernet manual   
  status: active   


To be honest, I've always been confused by subnets, so I can't honestly say whether this is showing a problem. I tried reading up on it, but it's over my head at present (although I'm willing to learn).

At this point I'm going to see if there's a newer NIC driver for my Windows client, and think about options for the network setup on FN.

Cheers,
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
And since some of my earlier tests were flawed, I've rerun the local dd tests to determine max write performance.


dd tests redone as follows:
  • compression off
  • 3 file sizes: 1GB, 10GB, 100GB
Code:
[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Pool/DS1/testfile1 bs=1M count=1000 
1000+0 records in 
1000+0 records out 
1048576000 bytes transferred in 0.199553 secs (5254627139 bytes/sec)  

[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Pool/DS1/testfile2 bs=1M count=10000 
10000+0 records in 
10000+0 records out 
10485760000 bytes transferred in 21.245978 secs (493540940 bytes/sec)  

[root@PROGFS ~]# dd if=/dev/zero of=/mnt/Pool/DS1/testfile3 bs=1M count=100000 
100000+0 records in 
100000+0 records out 
104857600000 bytes transferred in 247.970698 secs (422862866 bytes/sec) 
[root@PROGFS ~]# 


This is quite good given the hardware I have, right?
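
As a rough sanity check on those numbers (assuming a ballpark 110-130MB/s of streaming throughput per WD Red, not anything measured): a 6-disk RaidZ2 writes each block across 4 data disks plus 2 parity disks, so sequential writes should top out somewhere around 4 x ~110MB/s ≈ 440MB/s, which lines up with the ~420MB/s I'm seeing on the 100GB file. The huge 1GB figure is presumably just the write landing in RAM before ZFS flushes it to disk.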
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Your local dd tests are exactly what they should be. That one NIC with bad performance looks like it's only 100Mbps. I still think you should have just one cable plugged into FreeNAS. Having 2 causes problems.
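
If you want to confirm the 100Mbps suspicion, check the negotiated link speed at both ends (a sketch; ethtool should be available in the XenServer dom0 shell and on the Mint client, and the interface names here are examples, not your actual ones):
Code:
# on the XenServer host, for the physical port backing the slow FreeNAS NIC:
ethtool eth1 | grep -E 'Speed|Duplex'
# on the Linux client:
ethtool eth0 | grep -E 'Speed|Duplex'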
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
I just upgraded the NIC drivers on the W10 client. No change in the iperf tests.

Your local dd tests are exactly what they should be. That one NIC with bad performance looks like it's only 100Mbps. I still think you should have just one cable plugged into FreeNAS. Having 2 causes problems.


I will test this scenario; to be honest, the 2 NICs on my FN setup don't really have a purpose at present... My only thought is that, given my server is virtualized, I wouldn't mind the ability to assign the 2nd NIC to another VM, say a proxy/firewall.

Is it possible to force FN to use only 1 of the 2 NICs it sees? That way I could allocate the 2nd to another VM.
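
My current thinking, since FreeNAS only sees whatever virtual NICs XenServer attaches to the VM, is that this would be handled on the Xen side rather than inside FN. Something like this with the xe CLI, as far as I understand it (the VM name and UUID are placeholders):
Code:
# list the virtual NICs attached to the FreeNAS VM
xe vif-list vm-name-label=<freenas-vm-name>
# detach/remove the second one, then create a VIF on that network for the other VM instead
xe vif-unplug uuid=<vif-uuid>
xe vif-destroy uuid=<vif-uuid>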
 

bassmann

Dabbler
Joined
Oct 18, 2017
Messages
17
Building upon previous:

1) I removed one of the NICs from the FN config and ran iperf to test Win10 to FN performance. No change vs. earlier tests.

2) I unplugged the 2nd Ethernet cable on the FN server, rebooted, and ran the test again. No change.

Code:
iperf -c 172.21.8.70
------------------------------------------------------------
Client connecting to 172.21.8.70, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[304] local 172.21.8.51 port 49819 connected with 172.21.8.70 port 5001
[ ID] Interval  Transfer  Bandwidth
[304]  0.0-10.0 sec  898 MBytes  752 Mbits/sec


3) I repeated 2) but on the 2nd NIC with a different cable, after a reboot. No change.

Code:
iperf -c 172.21.8.70
------------------------------------------------------------
Client connecting to 172.21.8.70, TCP port 5001
TCP window size: 64.0 KByte (default)
------------------------------------------------------------
[304] local 172.21.8.51 port 50216 connected with 172.21.8.70 port 5001
[ ID] Interval  Transfer  Bandwidth
[304]  0.0-10.0 sec  872 MBytes  730 Mbits/sec


I'm running out of ideas on what to test next. Under Windows, the results seem to point to the Realtek NIC on the client machine, although this is speculative, not conclusive.

To summarise the results so far:
- local write to disk is in line with the hardware
- write performance across the LAN is below where it should be
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Put one of the NICs in your Windows box...
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
I'd say the Realtek NIC in the client is just bad quality. I used to have one in a Windows box and hated it. I could never get over 80Mbps.
 