Max performance I can get?


Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
Hi,

I'm trying to use FreeNAS as an NFS server for our Citrix XenServer 7.1 cluster. Sadly, performance is not optimal, and I'm trying to find out why.

Hardware for FreeNAS

- SuperMicro 6027R-E1CR12N
- Disks:
  - 8 Seagate Enterprise HDDs (4 TB each)
  - 4 Intel S3500 480 GB SSDs: two for ZIL (SLOG), two for cache (L2ARC)
  - 2 Intel 80 GB disks in a mirror, for the FreeNAS installation
- Intel X550-T2 as the 10 Gbps network card
- NetGear 10 Gbps switch between the FreeNAS box and the 3 XenServer hosts

Setup:

Code:
[root@storix1] ~# zpool status
  pool: Stockage
 state: ONLINE
  scan: scrub repaired 0 in 11h20m with 0 errors on Sun Nov 26 11:21:00 2017
config:

	NAME											STATE	 READ WRITE CKSUM
	Stockage										ONLINE	   0	 0	 0
	 mirror-0									  ONLINE	   0	 0	 0
	   gptid/b593a60e-1563-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	   gptid/c5b6a903-1563-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	 mirror-1									  ONLINE	   0	 0	 0
	   gptid/d203bc54-1563-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	   gptid/de3f47dd-1563-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	 mirror-2									  ONLINE	   0	 0	 0
	   gptid/eec49d66-1563-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	   gptid/f9dd0243-1563-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	 mirror-3									  ONLINE	   0	 0	 0
	   gptid/0a3f2895-1564-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	   gptid/1ae06ada-1564-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	logs
	 mirror-4									  ONLINE	   0	 0	 0
	   gptid/1cbe95b1-1564-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	   gptid/1d0dae82-1564-11e7-952e-a0369fbd5878  ONLINE	   0	 0	 0
	cache
	 gptid/26381152-1564-11e7-952e-a0369fbd5878	ONLINE	   0	 0	 0
	 gptid/30e76545-1564-11e7-952e-a0369fbd5878	ONLINE	   0	 0	 0


I did some tests with bonnie++ on an Ubuntu box connected directly to the FreeNAS box, mounting the NFS share on the Ubuntu box, but at 1 Gbps instead of 10 Gbps. I also ran bonnie++ on a XenServer host connected to the switch at 10 Gbps.

Result:

Ubuntu box (NFS, 1 Gbps direct to the NAS):
Code:
# bonnie++ -d /mnt/tests/ -s 128G -n 0 -m TEST -f -b -u nobody
Using uid:65534, gid:65534.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
TEST		   128G		   112771   8 12947   2		   114593   8  62.2  17
Latency						 874ms	  251s			   282ms	4389ms


XenServer 7.1 (NFS, 10 Gbps, mtu 9000, NetGear 10 Gbps switch in between):
Code:
# bonnie++ -s 180G -d /run/sr-mount/2dd11327-af36-1d5c-40e6-6019c203c6ce -n 0 -m virtuix2 -f -b -u nobody
Using uid:99, gid:99.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
virtuix2	   180G		   188109  43 94173  33		   517105  60 204.0  24
Latency						 255ms	2482ms			 77923us	 942ms


bonnie++ on the FreeNAS box, in a jail:
Code:
root@Bonnie:/ # bonnie++ -s 200G -d /var/ -n 0 -m freenas -f -b -u root															
Using uid:0, gid:0.																												
Writing intelligently...done																										
Rewriting...done																													
Reading intelligently...done																										
start 'em...done...done...done...done...done...																					
Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-													
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--													
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP													
freenas		200G		   556268  85 414935  92		   1227872  95 332.5  12												
Latency						7226ms	 178ms			 53926us	 275ms 


When running the tests directly on the FreeNAS box, the ZIL devices are used at up to 227.9 MB/s. When doing it over NFS, they go up to 191.9 MB/s. So why does the bonnie++ test show such a big difference while the ZIL device usage is not that different?
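(One way to watch per-device throughput like this on the FreeNAS side is zpool iostat in verbose mode; a minimal example, with an arbitrary 10-second interval, where the log and cache devices show up as their own rows:)

Code:
# Per-vdev read/write bandwidth, including the ZIL (log) and L2ARC (cache)
# devices, refreshed every 10 seconds
zpool iostat -v Stockage 10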
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
The numbers you are getting with bonnie++ running inside a jail look adequate for your configuration.

Speeds over the 1 Gbps link also look acceptable, aside from the rewrite speed, which is too slow. Possibly you are hitting read-modify-write cycles, maybe because your NFS write size of 64 KB is smaller than the default 128 KB ZFS dataset record size. Please check both.
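(A quick way to check both: nfsstat shows the rsize/wsize actually negotiated on the client, and zfs get shows the record size on the server; the dataset name below is just an example:)

Code:
# On the NFS client (Linux): show the negotiated rsize/wsize for each mount
nfsstat -m

# On FreeNAS: show the record size of the shared dataset (name is an example)
zfs get recordsize Stockage/XenPool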

Write speeds in the case of Xen are indeed not good. I tend to explain that by the NFS sync write policy, which may be different from both local writes and writes via NFS from the Ubuntu box. You haven't specified which FreeNAS version you are using here. You may want to try the RC or a nightly build of the upcoming FreeNAS 11.1, which includes a few patches to address synchronous NFS write performance.
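(One way to confirm that sync writes are the bottleneck is to re-run the benchmark with sync temporarily disabled on the dataset. A diagnostic sketch only; the dataset name is an example, and disabling sync risks losing in-flight data on a power failure, so it should not be left that way in production:)

Code:
# Check the current sync policy
zfs get sync Stockage/XenPool

# Temporarily disable sync writes, re-run bonnie++ from the Xen host,
# then restore the default
zfs set sync=disabled Stockage/XenPool
# ... run the benchmark ...
zfs set sync=standard Stockage/XenPool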
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
I mounted the NFS share on the Ubuntu box with the same NFS options as the XenServer boxes.

Code:
rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,acdirmin=0,acdirmax=0,soft,proto=tcp,timeo=100,retrans=12,sec=sys,mountvers=3,mountport=924,mountproto=tcp,local_lock=none


FreeNAS 9.10.2-U6.
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
Planning to. I'm moving the VMs off the ZFS dataset so I can try different things. For the record, the server has an LSI 3108 card, but the disks for the ZFS pool are not using hardware RAID; they are configured as JBOD. The only RAID we use is for the FreeNAS installation.
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
Upgraded to FreeNAS-11.0-U4 (54848d13b). Quite a difference! For the test in the jail, I went from 513 MB/s to 802 MB/s on write, and from 1199 MB/s to 1480 MB/s on read.

One thing I do notice: when doing the test in the jail, the HDDs are busy at 24%. When doing the test over NFS, the drives are busy at 8%.
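(A convenient way to watch per-disk %busy live on the FreeNAS side is gstat; -p limits the output to physical disks:)

Code:
# Live per-disk busy% and throughput while a benchmark runs
gstat -p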

Mounting the NFS share with the default options, direct cable (no switch) connection between the XenServer host and the FreeNAS box:
Code:
mount -t nfs 172.16.1.10:/mnt/Stockage/XenPool /mnt/freenas/


Code:
# bonnie++ -s 180G -d /mnt/freenas -n 0 -m virtuix2 -f -b -u nobody
Using uid:99, gid:99.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
virtuix2	   180G		   319172  61 138289  47		   333453  43  3118 392
Latency						 178ms	1528ms			   232ms	1001ms

1.97,1.97,virtuix2,1,1513167197,180G,,,,319172,61,138289,47,,,333453,43,3118,392,,,,,,,,,,,,,,,,,,,178ms,1528ms,,232ms,1001ms,,,,,,


Mounting the NFS share via XenServer's SR, with the switch in the middle and the following options:
Code:
rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,acdirmin=0,acdirmax=0,soft,proto=tcp,timeo=100,retrans=12,sec=sys,mountvers=3,mountport=924,mountproto=tcp,local_lock=none


Code:
# bonnie++ -s 180G -d /run/sr-mount/52c40b4f-b33d-088e-f110-cd7f8a391cb0 -m virtuix2 -f -b -u nobody
Using uid:99, gid:99.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.

Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
virtuix2	   180G		   335900  63 155024  49		   548155  62  3086 360
Latency						 152ms	1449ms			 86319us	 994ms

Version  1.97	   ------Sequential Create------ --------Random Create--------
virtuix2			-Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
			  files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
				16  1786  42  3112  43  1599  37  1906  41  2299  42  1901  42
Latency			 11437us   13929us	 197ms   11114us	5326us	5022us

1.97,1.97,virtuix2,1,1513162409,180G,,,,335900,63,155024,49,,,548155,62,3086,360,16,,,,,1786,42,3112,43,1599,37,1906,41,2299,42,1901,42,,152ms,1449ms,,86319us,994ms,11437us,13929us,197ms,11114us,5326us,5022us
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
Inside a VM, I get:

Code:
bonnie++ -s 8G -d /tmp -m deskpro -f -b -u nobody

Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
deskpro		  8G		   91701  17 84658  18		   220788  25  3330 108
Latency					   91258us	 484ms			 77893us	 208ms
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
Looks like we will need to move to an NVMe SSD if we want to use NFS or iSCSI with sync writes. With some tuning of the TCP stack, I can get the SLOG drives busy at up to 94% (227 MB/s), and I have two VMs writing at 115 MB/s each. With iSCSI and sync=standard or sync=disabled, I reach 335 MB/s on each VM, with no activity on the SLOG.
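(For reference, the sync behaviour compared here is a per-dataset/zvol ZFS property; the zvol name below is an example:)

Code:
zfs set sync=standard Stockage/iscsi-vol   # honour client sync requests (default)
zfs set sync=disabled Stockage/iscsi-vol   # never sync: fast, but unsafe on power loss
zfs set sync=always Stockage/iscsi-vol     # force every write through the SLOG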
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
Are you getting SMART data passed through for your pool drives on the 2308? There's been argument back and forth over whether configuring a 2308 in JBOD mode is "proper" for FreeNAS. Considering the money you're throwing at this, it might be worth trying a second HBA, configured in proper IT mode, for your pool drives.
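(A quick way to check is to point smartctl at one of the pool disks from the FreeNAS shell; the device name below is an example, and some controllers need an explicit device type:)

Code:
# Basic identity/SMART probe of a pool disk behind the LSI card
smartctl -i /dev/da2
# If that fails, try forcing the SAT pass-through device type
smartctl -i -d sat /dev/da2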
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
I have SMART data for the two SLOG drives because they are connected to SATA ports on the motherboard (they are not connected to the 12-drive backplane). The L2ARC and pool drives are connected to the LSI RAID card, in JBOD mode.
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
Just found out that the motherboard does have SATA-3 ports, so I moved the two SLOG SSDs to those ports. I ran bonnie++ again, which gives the results below.

Settings: zfs set sync=standard on the Stockage pool, sharing over NFS.

sysctl changes on XenServer host:
Code:
ifconfig eth7 txqueuelen 300000                    # longer transmit queue on the 10 Gbps interface
sysctl -w net.ipv4.tcp_congestion_control=cubic    # CUBIC congestion control
sysctl -w net.core.rmem_max=16777216               # max socket receive buffer: 16 MB
sysctl -w net.core.wmem_max=16777216               # max socket send buffer: 16 MB
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"  # TCP receive buffer min/default/max
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"  # TCP send buffer min/default/max
sysctl -w net.core.netdev_max_backlog=300000       # larger input packet queue
sysctl -w net.ipv4.tcp_mtu_probing=1               # enable TCP path MTU probing
sysctl -w vm.dirty_background_ratio=5              # start background writeback earlier
sysctl -w vm.dirty_ratio=80                        # allow more dirty pages before blocking writers
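
(These ifconfig and sysctl changes are runtime-only; a sketch of making them survive a reboot on the XenServer host, assuming the usual /etc/sysctl.conf and rc.local mechanisms:)

Code:
# Persist the TCP/VM tuning across reboots
cat >> /etc/sysctl.conf <<'EOF'
net.ipv4.tcp_congestion_control = cubic
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_mtu_probing = 1
vm.dirty_background_ratio = 5
vm.dirty_ratio = 80
EOF

# The txqueuelen setting is per interface; it can go in /etc/rc.local, e.g.:
# ifconfig eth7 txqueuelen 300000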


While bonnie++ is writing from two VMs at the same time:

Code:
# zpool iostat -Td Stockage 10
			   capacity	 operations	bandwidth
pool		alloc   free   read  write   read  write
Stockage	 128G  14.4T	  0  3.20K	  0   136M
Stockage	 128G  14.4T	  0  4.72K	  0   189M
Stockage	 128G  14.4T	  0  4.54K	  0   184M
Stockage	 128G  14.4T	  0  4.74K	  0   238M
Stockage	 128G  14.4T	  0  4.44K	  0   226M


The SLOG devices were used at 77.8% busy, with between 177 MB/s and 226 MB/s of bandwidth (so no change from SATA-2 to SATA-3). The HDDs in the vdevs use at most 10 MB/s and are at most 20% busy.
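(The negotiated SATA link speed of the SLOG SSDs can be confirmed from the FreeNAS shell; the device name below is an example:)

Code:
# Prints e.g. "600.000MB/s transfers (SATA 3.x ...)" if the SSD linked at SATA-3
camcontrol identify ada0 | grep -i transfers
dmesg | grep ada0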

While bonnie++ is reading from two VMs at the same time:

Code:
# zpool iostat -Td Stockage 10
			   capacity	 operations	bandwidth
pool		alloc   free   read  write   read  write
Stockage	 129G  14.4T	  0  1.74K	  0  22.4M
Stockage	 129G  14.4T	  0  2.27K	  0  26.2M
Stockage	 129G  14.4T	  0	258	  0  3.43M


Results for the two VMs:

Code:
# sudo bonnie++ -s 64G -d /tmp -m vm1 -f -b -u nobody
Using uid:65534, gid:65534.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vm1			64G		   64615  11 76593  17		   157693  19  3026  93
Latency						 149ms	 588ms			   202ms   13653us
Version  1.97	   ------Sequential Create------ --------Random Create--------
vm1				-Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
			  files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
				 16   452   3 +++++ +++   482   2   471   3 +++++ +++   456   2
Latency			 45301us	1276us   39018us   95488us	  68us	 143ms
1.97,1.97,deskpro,1,1513357069,64G,,,,64615,11,76593,17,,,157693,19,3026,93,16,,,,,452,3,+++++,+++,482,2,471,3,+++++,+++,456,2,,149ms,588ms,,202ms,13653us,45301us,1276us,39018us,95488us,68us,143ms


Code:
# sudo bonnie++ -s 64G -d /tmp -m vm2 -f -b -u nobody
Using uid:65534, gid:65534.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97	   ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1	 -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine		Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vm2				64G		   62337  11 58416  14		   176264  21  4405 135
Latency						4407ms	1070ms			 93882us   21830us
Version  1.97	   ------Sequential Create------ --------Random Create--------
vm2				-Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
			  files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
				 16   577   4 +++++ +++   638   4   599   4 +++++ +++   719   3
Latency			   206ms	1091us	 205ms	 206ms	  56us	 205ms
1.97,1.97,deskpro,1,1513356669,64G,,,,62337,11,58416,14,,,176264,21,4405,135,16,,,,,577,4,+++++,+++,638,4,599,4,+++++,+++,719,3,,4407ms,1070ms,,93882us,21830us,206ms,1091us,205ms,206ms,56us,205ms
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
I have SMART data for the two SLOG drives because they are connected to SATA ports on the motherboard (they are not connected to the 12-drive backplane). The L2ARC and pool drives are connected to the LSI RAID card, in JBOD mode.
The lack of SMART data says ZFS isn't getting direct access to the drives, which may cause you performance issues. Further, no SMART means no testing and no monitoring... which means no proactive notifications of disk failure before the drive fails completely.

NVMe is the smart place for your SLOG. BTW, if you have power loss protection, you don't need a pair... nor do you need two L2ARC devices.

Finally, you're focused entirely on bandwidth. In reality, the bigger issue with a VM workload is IOPS. You've only got 4 vdevs running drives that are good for about 75 IOPS each. That's 300 IOPS. If you plan to do anything serious with the array... heavy database loads, SIEM, etc... you'll feel the pain.
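(A back-of-the-envelope version of that math, assuming roughly 75 IOPS per 7200 rpm drive:)

Code:
# Writes: a mirror vdev delivers roughly one drive's worth of write IOPS
4 vdevs x ~75 IOPS        = ~300 write IOPS
# Reads: both sides of each mirror can service reads independently
4 vdevs x 2 drives x ~75  = ~600 read IOPS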
 

Pascal Robert

Dabbler
Joined
Apr 20, 2015
Messages
14
Our VMs are currently on 3 hosts; two of them have RAID-5 (3 disks) and one has RAID-10 (4 disks), so that's about 380 IOPS total at 50% writes. The FreeNAS box should, if I calculated correctly, offer 360 IOPS total at 50% writes, which is why we went with this setup. In fact, we asked iXsystems for a quote, and they offered the same setup (two SSDs for L2ARC, two SSDs for SLOG, 8 HDDs for the vdevs). We do have databases in there (SQL Server and MySQL), and one mail server (Zimbra). If we move to NVMe for the L2ARC and SLOG, we could add two spindles to gain some IOPS.

And indeed, maybe we should move to an HBA and remove the RAID card.
 

tvsjr

Guru
Joined
Aug 29, 2015
Messages
959
It's been proven time and again, sometimes at the cost of a lost pool, that using a RAID controller instead of a straight-through HBA is a Bad Idea(tm).

Depending on how much storage you need, you could always go to SSD/NVMe storage. Then, you get all kinds of performance (and start battling other limitations, like the network interface).
 