Could I have done better? - VMware and NFS setup

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Hi All,

I have the following setup for my NAS box.

  • Tyan S7012 motherboard (LGA1366)
  • Intel(R) Core(TM) i3-8100 CPU @ 3.60GHz
  • 16GB memory
  • LSI 9211 HBAs in IT mode
  • 12x Crucial MX500 500GB 3D NAND SATA 2.5-inch internal SSDs
  • 1x OCZ Vertex 2 boot drive for FreeNAS
  • Intel 82599EB 10-Gigabit SFI/SFP+ network connection. This is the dual-port version in a LAG, so 20 Gbit to the UniFi switch
  • NFS shares out to each VMware host


My VMware hosts:
  • VMware ESXi 6.5 U2
  • 6-core / 12-thread processor
  • 64GB memory
  • Intel 82599EB 10-Gigabit SFI/SFP+ network connection - single-port connection
  • Local cache drive equal to or larger than the local RAM size


I did not break out dedicated cache (L2ARC) or log (SLOG) devices, and I wonder how much of an improvement I could gain by doing so?



I did a disk throughput test and got the following with a 1 GB sample:
Code:
-----------------------------------------------------------------------
CrystalDiskMark 6.0.2 x64 (C) 2007-2018 hiyohiyo
                          Crystal Dew World : https://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 1) :  1122.746 MB/s
  Sequential Write (Q= 32,T= 1) :    64.847 MB/s
  Random Read 4KiB (Q=  8,T= 8) :   305.169 MB/s [  74504.2 IOPS]
Random Write 4KiB (Q=  8,T= 8) :     2.379 MB/s [    580.8 IOPS]
  Random Read 4KiB (Q= 32,T= 1) :   309.609 MB/s [  75588.1 IOPS]
Random Write 4KiB (Q= 32,T= 1) :     2.340 MB/s [    571.3 IOPS]
  Random Read 4KiB (Q=  1,T= 1) :    30.896 MB/s [   7543.0 IOPS]
Random Write 4KiB (Q=  1,T= 1) :     1.122 MB/s [    273.9 IOPS]

  Test : 1024 MiB [C: 38.9% (15.4/39.7 GiB)] (x5)  [Interval=5 sec]
  Date : 2019/01/29 13:48:14
    OS : Windows Server 2012 R2 Server Standard (full installation) [6.3 Build 9600] (x64)



zpool status
Code:
root@freenas:~ # zpool status
  pool: RaidZ-SSD
state: ONLINE
  scan: scrub repaired 0 in 0 days 00:05:37 with 0 errors on Sun Jan 20 00:05:37 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        RaidZ-SSD                                       ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/9d87ed1c-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9db768be-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9de87a5e-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9e195c36-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9e563029-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9e8b976e-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            gptid/9ecceb86-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9f0802d3-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9f4216be-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9f7dc472-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9fb7456b-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9ff49624-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0 days 00:00:06 with 0 errors on Sat Jan 26 03:45:06 2019
config:

        NAME        STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          ada2p2    ONLINE       0     0     0

errors: No known data errors
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi,

As for your config, did you consider using iSCSI to host a VMware datastore directly on your FreeNAS? I would not say it would be clearly better, but it would be different.

As for the rest, you did not mention whether your RAM is ECC or not. I encourage you to use ECC...

There is one point I strongly advise you against: RAIDZ1. I do not consider that RAIDZ1 provides good enough protection for your data, even less so with a vdev of 6 drives. A single 12-drive RAIDZ3 vdev or 2x 6-drive RAIDZ2 vdevs would offer much better protection. RAID10 (striped mirrors) would also offer strong protection, but would reduce your usable space significantly.

A 12-drive RAIDZ3 would reduce the usable space by the equivalent of only one drive compared to what you have now. Considering the big improvement in protection, I really encourage you to go that way (or better).
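For reference, here is a minimal sketch of how the two suggested layouts could be built from the command line; the pool name and device names (da0..da11) are placeholders, and in practice the FreeNAS UI would build this for you:
Code:
# Hypothetical example - option 1: a single 12-drive RAIDZ3 vdev (3 drives of parity)
zpool create tank raidz3 da0 da1 da2 da3 da4 da5 da6 da7 da8 da9 da10 da11

# Hypothetical example - option 2: two 6-drive RAIDZ2 vdevs (2 drives of parity each)
zpool create tank \
    raidz2 da0 da1 da2 da3 da4 da5 \
    raidz2 da6 da7 da8 da9 da10 da11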
 

Hoeser

Dabbler
Joined
Sep 23, 2016
Messages
23
As Heracles said, ditch the RAIDZ1 and go with a stripe of 2x RAIDZ2 vdevs unless you really don't care about your data. ECC memory is also recommended. I had a FreeNAS box running similar consumer-grade hardware that lasted a few years, though it did have the odd random reboot every 3 or 4 months, which I attribute mostly to not having ECC memory.

As for your poor write performance, it is a direct result of sync writes: ESXi forces sync writes on NFS datastores. You can test this by temporarily disabling sync on the entire pool and re-running your tests. This will bump your write speeds way up, but it also leaves your writes without any power-loss protection.
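A minimal sketch of that test, assuming the pool name from the zpool status output above (turn sync back to standard when you are done):
Code:
# Temporarily disable sync writes on the whole pool
# (no power-loss protection for in-flight writes while this is set!)
zfs set sync=disabled RaidZ-SSD

# ... re-run the benchmark from a VM on the NFS datastore ...

# Restore the default behaviour (honour sync requests from ESXi/NFS)
zfs set sync=standard RaidZ-SSD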

That being said, I don't think you made the wrong call, but you will absolutely need an appropriate SLOG device to improve performance. I run a strictly NFS setup for ESXi and greatly prefer it to iSCSI; I've just personally experienced far fewer issues with NFS on FreeNAS as compared to iSCSI... with my workloads, anyway. I run an Optane 900P 280GB SLOG and my NFS speeds are the same with either sync=always or sync=disabled, removing the sync penalty pretty much entirely.

If you do switch over to iSCSI, you'll see an improvement in write speed (with your pool set to sync=standard) because iSCSI presents block storage, allowing ESXi, and in turn its guest VMs, the choice of doing either synchronous or asynchronous write operations.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
@Hoeser I agree with your NFS POV. I was talking to a few VMware guys and they also say NFS over iSCSI, for a few reasons (all of them different).

You only have a SLOG device? Does each of your ESXi hosts have a local cache/swap drive? I did forget to mention that: each of my hosts has a local cache drive.
 

Hoeser

Dabbler
Joined
Sep 23, 2016
Messages
23
My ESXi hosts are completely diskless. No caching.

I have only the Optane 900P SLOG, and even that is partitioned for 3 separate zpools. Partitioning a single device for multiple ZILs isn't necessarily recommended, but for my use case it's fine, as it's pretty rare that I have more than two pools that are particularly busy at any one time.
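Roughly, that partitioning looks like the sketch below; the device name (nvd0), labels, partition sizes, and pool names are all assumptions, and the FreeNAS UI is the safer way to do this:
Code:
# Carve three small partitions out of the single Optane, one SLOG per pool
gpart create -s gpt nvd0
gpart add -t freebsd-zfs -s 16G -l slog-pool1 nvd0
gpart add -t freebsd-zfs -s 16G -l slog-pool2 nvd0
gpart add -t freebsd-zfs -s 16G -l slog-pool3 nvd0

# Attach each partition to its pool as a dedicated log vdev
zpool add pool1 log nvd0p1
zpool add pool2 log nvd0p2
zpool add pool3 log nvd0p3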
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
I added a log device (I assume this is what y'all mean by a SLOG device). The drive is a 250 GB M.2 by WD. Link

Looks like read took a large performance hit, but write is higher.

Code:
root@freenas:~ # zpool status
  pool: RaidZ-SSD
state: ONLINE
  scan: scrub repaired 0 in 0 days 00:05:37 with 0 errors on Sun Jan 20 00:05:37 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        RaidZ-SSD                                       ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/9d87ed1c-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9db768be-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9de87a5e-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9e195c36-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9e563029-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9e8b976e-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            gptid/9ecceb86-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9f0802d3-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9f4216be-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9f7dc472-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9fb7456b-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
            gptid/9ff49624-85eb-11e8-b7d4-001b21a7c63c  ONLINE       0     0     0
        logs
          gptid/a1d09709-2419-11e9-9930-001b21a7c63c    ONLINE       0     0     0


Code:
-----------------------------------------------------------------------
CrystalDiskMark 6.0.2 x64 (C) 2007-2018 hiyohiyo
                          Crystal Dew World : https://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 1) :   682.693 MB/s
  Sequential Write (Q= 32,T= 1) :   142.498 MB/s
  Random Read 4KiB (Q=  8,T= 8) :   334.324 MB/s [  81622.1 IOPS]
Random Write 4KiB (Q=  8,T= 8) :    13.005 MB/s [   3175.0 IOPS]
  Random Read 4KiB (Q= 32,T= 1) :   265.587 MB/s [  64840.6 IOPS]
Random Write 4KiB (Q= 32,T= 1) :    13.306 MB/s [   3248.5 IOPS]
  Random Read 4KiB (Q=  1,T= 1) :     9.259 MB/s [   2260.5 IOPS]
Random Write 4KiB (Q=  1,T= 1) :     6.294 MB/s [   1536.6 IOPS]

  Test : 1024 MiB [C: 38.5% (15.3/39.7 GiB)] (x5)  [Interval=5 sec]
  Date : 2019/01/29 18:08:10
    OS : Windows Server 2012 R2 Server Standard (full installation) [6.3 Build 9600] (x64)
  
 

Hoeser

Dabbler
Joined
Sep 23, 2016
Messages
23
Read performance is not impacted by a log device.

That WD M.2 is a rather crap log device. You want something that has some form of power-loss protection, either by way of an array of capacitors or, in the case of the Optane 900P, the 3D XPoint memory, which simply doesn't have a volatile cache buffer. Also, low latency is king here. Using a conventional consumer SSD will not be effective. To see the effect of a truly good SLOG, simply put your pool on sync=disabled temporarily and run the test again.

Also, read up on SLOG devices in the forums, as there are lots of posts with device recommendations.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Read performance is not impacted by a log device.

That WD M.2 is a rather crap log device. You want something that has some form of power-loss protection, either by way of an array of capacitors or, in the case of the Optane 900P, the 3D XPoint memory, which simply doesn't have a volatile cache buffer. Also, low latency is king here. Using a conventional consumer SSD will not be effective. To see the effect of a truly good SLOG, simply put your pool on sync=disabled temporarily and run the test again.

Also, read up on SLOG devices in the forums, as there are lots of posts with device recommendations.


SLOG is the same as "log", yes? When I added the device to the pool, it only had the option of "log".
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
The 900P will hopefully arrive today. Is the process as simple as removing the old log drive and adding the new one?
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
NFS shares out to each VMware host
But why? You can share one NFS export with all of your hosts. You don't gain anything if it's all backed by the same pool unless you have different ZFS options on each dataset.
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
But why? You can share one NFS export with all of your hosts. You don't gain anything if it's all backed by the same pool unless you have different ZFS options on each dataset.
I am not following your concern.

I have a single FreeNAS box with a single IP, sharing a single NFS share. All of the VMware hosts use that single IP to access the share.
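For illustration, every host mounts that same export, something like the sketch below; the IP address, export path, and datastore name are placeholders:
Code:
# Run once on each ESXi host: mount the single FreeNAS NFS export as a datastore
esxcli storage nfs add --host=192.168.1.10 --share=/mnt/RaidZ-SSD/vmware --volume-name=freenas-nfs

# Confirm the datastore is mounted
esxcli storage nfs list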
 

Hoeser

Dabbler
Joined
Sep 23, 2016
Messages
23
I am not following your concern.

I don't think he understood your layout.

The rip and replace is simple; you can do it all in the UI: remove the existing log, then add the new one. Enable sync=always on the pool and ensure any sub-volumes or datasets are inheriting their sync settings.
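The CLI equivalent is roughly this (a sketch only; the gptid is the WD log vdev from the zpool status above, and the Optane device name nvd0 is an assumption):
Code:
# Remove the existing WD M.2 log vdev (log and cache vdevs can be removed safely)
zpool remove RaidZ-SSD gptid/a1d09709-2419-11e9-9930-001b21a7c63c

# Add the Optane 900P as the new log device
zpool add RaidZ-SSD log nvd0

# Force sync writes on the pool; child datasets inherit this unless overridden
zfs set sync=always RaidZ-SSD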
 

orddie

Contributor
Joined
Jun 11, 2016
Messages
104
Just wow!

Writes went from sub-150 to over 650 MB/s. When I narrow the testing down to a 50 MB sample, I hit over 1000 MB/s.


Any suggestions on getting my read numbers up?

1 GB test
Code:
-----------------------------------------------------------------------
CrystalDiskMark 6.0.2 x64 (C) 2007-2018 hiyohiyo
                          Crystal Dew World : https://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 1) :   679.758 MB/s
  Sequential Write (Q= 32,T= 1) :   678.974 MB/s
  Random Read 4KiB (Q=  8,T= 8) :   389.629 MB/s [  95124.3 IOPS]
 Random Write 4KiB (Q=  8,T= 8) :    56.367 MB/s [  13761.5 IOPS]
  Random Read 4KiB (Q= 32,T= 1) :   297.247 MB/s [  72570.1 IOPS]
 Random Write 4KiB (Q= 32,T= 1) :    86.169 MB/s [  21037.4 IOPS]
  Random Read 4KiB (Q=  1,T= 1) :     5.666 MB/s [   1383.3 IOPS]
 Random Write 4KiB (Q=  1,T= 1) :    23.558 MB/s [   5751.5 IOPS]

  Test : 1024 MiB [C: 39.3% (15.6/39.7 GiB)] (x5)  [Interval=5 sec]
  Date : 2019/02/01 18:37:55
    OS : Windows Server 2012 R2 Server Standard (full installation) [6.3 Build 9600] (x64)
  


50 MB test
Code:
-----------------------------------------------------------------------
CrystalDiskMark 6.0.2 x64 (C) 2007-2018 hiyohiyo
                          Crystal Dew World : https://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

   Sequential Read (Q= 32,T= 1) :   683.196 MB/s
  Sequential Write (Q= 32,T= 1) :  1096.497 MB/s
  Random Read 4KiB (Q=  8,T= 8) :   411.400 MB/s [ 100439.5 IOPS]
 Random Write 4KiB (Q=  8,T= 8) :   289.604 MB/s [  70704.1 IOPS]
  Random Read 4KiB (Q= 32,T= 1) :   303.907 MB/s [  74196.0 IOPS]
 Random Write 4KiB (Q= 32,T= 1) :   292.430 MB/s [  71394.0 IOPS]
  Random Read 4KiB (Q=  1,T= 1) :     9.731 MB/s [   2375.7 IOPS]
 Random Write 4KiB (Q=  1,T= 1) :    31.226 MB/s [   7623.5 IOPS]

  Test : 50 MiB [C: 39.1% (15.5/39.7 GiB)] (x5)  [Interval=5 sec]
  Date : 2019/02/01 18:51:38
    OS : Windows Server 2012 R2 Server Standard (full installation) [6.3 Build 9600] (x64)
  
 

Ixian

Patron
Joined
May 11, 2015
Messages
218
More memory for FreeNAS. There may be specific situations with ESXi hosting FreeNAS where an L2ARC helps, and you can carve that out of the same 900P you are using for the log device.

This is a fantastic thread about it that you should read; the log/cache discussion and examples start on page 4.

You should also be using ECC memory.
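If you go that route, carving an L2ARC partition out of the same 900P would look roughly like this (a sketch; the device name, partition numbers, and sizes are assumptions):
Code:
# Assuming the 900P already carries a small SLOG partition (nvd0p1),
# add a second partition and attach it to the pool as a cache (L2ARC) vdev
gpart add -t freebsd-zfs -s 200G -l l2arc nvd0
zpool add RaidZ-SSD cache nvd0p2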
 