
Sync writes, or: Why is my ESXi NFS so slow, and why is iSCSI faster?

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,258
Updated approach to my VMware NFS use of FreeNAS 9.2.1.7...
I finally understand more about the ZIL (SLOG), etc., so I bought:
1) Dell XPS 8700 - Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, with 32GB RAM.
2) 2 x 500 GB Samsung EVO SSDs - Samsung 850 EVO 500GB 2.5-Inch SATA III Internal SSD (MZ-75E500B/AM)
3) 1 x 120 GB Kingston for ZIL - Kingston Digital 120GB SSDNow V300 SATA 3 2.5 Solid State Drive (SV300S37A/120G)

Steps:
1) Used the web GUI to configure the 2 x 500GB SSDs as a mirrored zpool with lz4 (initial default compression 6.58x).
2) Used the web GUI to configure the 1 x 120GB Kingston as a ZIL (pool), and then used the GUI to detach it (leaving the formatting in place). *I understand the ZIL does not need nearly 120GB, but it was only $50 and I have no other use for that SSD.
3) Used the command line to attach the ZIL to the zpool.

# zpool status
  pool: aeraidz
 state: ONLINE
  scan: none requested
config:

	NAME                                            STATE     READ WRITE CKSUM
	aeraidz                                         ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/cd5048df-b7ec-11e6-8d53-6805ca4185e3  ONLINE       0     0     0
	    gptid/cd6475c8-b7ec-11e6-8d53-6805ca4185e3  ONLINE       0     0     0
	logs
	  ada1p2                                        ONLINE       0     0     0
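For reference, the command-line attach in step 3 would have been along these lines (ada1p2 being the log partition shown in the status output above; run as root):

```shell
# Add the Kingston SSD's partition to the existing pool as a dedicated log (SLOG)
zpool add aeraidz log ada1p2

# Verify that the device now appears under the "logs" section
zpool status aeraidz
```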

4) I then set up an NFS share and mounted it on my ESXi 5.5.0 hosts, with sync=default:
[root@aenas3 /]# zfs get sync
NAME                    PROPERTY  VALUE     SOURCE
aeraidz                 sync      standard  default
aeraidz/.system         sync      standard  default
aeraidz/.system/cores   sync      standard  default
aeraidz/.system/rrd     sync      standard  default
aeraidz/.system/samba4  sync      standard  default
aeraidz/.system/syslog  sync      standard  default


5) Then I did a migration of a 160GB (provisioned; 72GB used storage) VM and, woo hoo... I got 90 MB/s writes.
5a) FreeNAS network: the 1 Gb network was almost fully saturated

5b) ada1 = ZIL. ada2/ada3 = Mirror zpool


5c) Here's the VMware write performance in KBps of the migration


6) I did additional migrations and experimented with sync=disabled and sync=always, but neither materially changed write performance on other VMware migrations.
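For anyone repeating the experiment in step 6: the sync property is set per dataset (or on the pool's root dataset, where children inherit it). A sketch, using the pool name from above:

```shell
# Honor every sync request and commit it to stable storage (the SLOG absorbs the cost)
zfs set sync=always aeraidz

# Acknowledge sync writes immediately without committing them -- fast but risks
# losing the last few seconds of writes on power failure
zfs set sync=disabled aeraidz

# Back to the default: honor sync only when the client (e.g. ESXi over NFS) asks for it
zfs set sync=standard aeraidz
```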

Another data point - I have another FreeNAS box
- Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz with 8GB RAM.
- Single (striped zpool) Samsung 850 EVO 500GB SSD.
- NFS to VMware

With sync=disabled (very risky) I get 90 MB/s, BUT only 7 MB/s with sync=always. 7 MB/s is just too slow for practical VMware use.

CONCLUSION:
- It looks like the ZIL really works for the VMware / NFS case. Amazing that after a couple of years of fooling with this, a simple ZIL addition seems to have made the difference between usable and unusable performance for VMware on NFS.
- I think I have a fully 'sane' FreeNAS setup with sync=standard and reasonable VMware NFS performance. And with compression, that 500GB easily extends to 1TB+ of space for my VMs.

I'd be interested in comments - particularly if I have not actually achieved an OK/Safe (e.g. ZFS meta data preserved, sync=standard) solution for an adequate level of VMware NFS performance.
Just curious... why are you using this old version of FreeNAS?

Regarding your system... The Dell XPS 8700 is a desktop PC that doesn't support ECC RAM, which is important for data integrity.

The Kingston SSD isn't a very good choice for a SLOG device. A good SLOG device needs integral capacitor power backup, low latency, fast writes, and very high write endurance. A good entry-level SSD SLOG device is the Intel DC S3700/S3710. A better choice is the Intel DC P3700, an NVMe device... but these are pricey. You can get an Intel 750 (also an NVMe device) for less money, but it's not going to perform as well as the P3700.

To be honest, you may not be gaining much benefit from a ZIL SLOG device. Your pool is built using SSDs, which are pretty darned fast to begin with...

If you're just running a lab environment, you'll get the best performance by setting sync=disabled on your NFS-based VM datastore. I did this for over a year before adding an Intel DC S3700 SLOG device.

Here are some related threads:

https://forums.freenas.org/index.php?threads/some-insights-into-slog-zil-with-zfs-on-freenas.13633/

https://forums.freenas.org/index.ph...csi-vs-nfs-performance-testing-results.46553/

Good luck!
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,862

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
16,950
I can't find anywhere that shows the V300 having built-in power loss protection. If that is the case, just save your money and turn sync off and repurpose the V300. Or consider getting a proper SLOG device.
It doesn't. It's also infamous for the switcheroo Kingston pulled, replacing the original, reviewed fast-ish NAND with much slower chips after a few months of production, once the reviews had all come in.
 

Ken Almond

Junior Member
Joined
May 11, 2014
Messages
19
Just curious... why are you using this old version of FreeNAS?

Regarding your system... The Dell XPS 8700 is a desktop PC that doesn't support ECC RAM, which is important for data integrity.

The Kingston SSD isn't a very good choice for a SLOG device. A good SLOG device needs integral capacitor power backup, low latency, fast writes, and very high write endurance. A good entry-level SSD SLOG device is the Intel DC S3700/S3710. A better choice is the Intel DC P3700, an NVMe device... but these are pricey. You can get an Intel 750 (also an NVMe device) for less money, but it's not going to perform as well as the P3700.

To be honest, you may not be gaining much benefit from a ZIL SLOG device. Your pool is built using SSDs, which are pretty darned fast to begin with...

If you're just running a lab environment, you'll get the best performance by setting sync=disabled on your NFS-based VM datastore. I did this for over a year before adding an Intel DC S3700 SLOG device.

Here are some related threads:

https://forums.FreeNAS.org/index.php?threads/some-insights-into-slog-zil-with-zfs-on-FreeNAS.13633/

https://forums.FreeNAS.org/index.ph...csi-vs-nfs-performance-testing-results.46553/

Good luck!
>Dell XPS 8700 is a desktop PC that doesn't support ECC RAM
Agreed - but it was only $485 + $180 for 32GB RAM. So it's a cheap home computer, and *quiet*, and the whole project cost only about $1,000.


>why are you using this old version of FreeNAS?
I don't have a reason to update - the current version seems stable and OK for my purposes.


>To be honest, you may not be gaining much benefit from a ZIL SLOG device. Your pool is built using SSDs, which are pretty darned fast to begin with...

I have been running a single (stripe mode) SSD with sync=disabled for a while, with 100 MB/s write speed.
However, as soon as I set sync=standard or sync=always, performance drops to 10 MB/s. My previous Seagate 7200 RPM ZFS pool gave 5 MB/s. 5-10 MB/s is just too slow for VMware VMs. They will limp along, but if you do snapshots (for backup) etc., problems arise. At 100 MB/s write capability, things run smoothly.

As I read the FreeNAS blogs, running sync=disabled leaves the ZFS filesystem (and therefore the VMs) open to corruption - even metadata corruption of the ZFS filesystem that a scrub might not be able to repair.

With the ZIL, I can set sync=always or sync=standard AND I get 100 MB/s on write (the full network bandwidth of the 1 Gbit interface card). So now I have VMware mounting NFS FreeNAS storage and leveraging FreeNAS / ZFS capabilities pretty well. I have a UPS backup on the machine.

*I agree this is home system, but with 2 of these FreeNAS systems - I can use VMware High Availability (e.g. if one NFS mount goes down the VMs continue without interruption on the other NFS mount). Plus - I do daily backups using Avamar - so I restore if all else fails. But of course, if the house burns down - all will be lost :)

>The Kingston SSD isn't a very good choice for a SLOG device.
Yes, I get this, and understand that there is a small window (~2 seconds?) of vulnerability if the device fails - but it does appear fast enough and was very cheap ($48). I read that you can mirror the SLOG, but I didn't go that far.

>Good luck!
Thank you for the comment and I hope my reply provides more explanation.
 

Ken Almond

Junior Member
Joined
May 11, 2014
Messages
19
Update on adventures with using NFS for VMware over 1Gb Ethernet. The previous post was about deploying a ZIL, and voila - I was able to go from 10 MB/s to 90 MB/s writes. The 1 Gbit Ethernet was pretty well saturated, running at about 800 Mb/s (so the numbers match up). 90-100 MB/s is perfectly adequate for running VMs under VMware. For comparison: I use cheap Adaptec 3405, 5405, 6405, and 6805 RAID cards with cheap Seagate 3TB 7200 RPM drives, attached to the ESXi host, and get around 60 MB/s raw write performance.

However, to finish my ZIL effort... after reading more (and understanding more), it seems that best practice is to *mirror* the ZIL, even with the latest versions of FreeNAS, to avoid the ~2-second window of data loss if the ZIL fails.
So I bought an additional 120GB Kingston for the ZIL - Kingston Digital 120GB SSDNow V300 SATA 3 2.5 Solid State Drive (SV300S37A/120G) - because I couldn't find much that was smaller, and it was only $48.

I added the mirrored ZIL and performance did not change significantly. I had read a post suggesting that a mirrored ZIL causes performance problems. That could be, but in my common case of VMware / NFS, the 90 MB/s performance seemed unchanged.

As a side note - I did a lot of searching on 'how to mirror a ZIL', and it turns out to be very easy.
1) You set up your pool - e.g. 'aeraidz'
2) Then you find the ada0, ada1, ... device names of the 2 SSDs for your ZIL
3) Then you simply run this command (on FreeNAS 9.10.1): zpool add aeraidz log mirror ada0 ada1
where 'aeraidz' is the pool, and 'log mirror ada0 ada1' says to attach a log vdev built as a mirror of ada0 and ada1.
4) Here is the final result (notice logs, "mirror-1", made up of ada1 and ada2):
[root@aestray] ~# zpool status
  pool: aeraidz
 state: ONLINE
  scan: scrub repaired 0 in 0h29m with 0 errors on Wed Dec 14 05:55:22 2016
config:

	NAME                                            STATE     READ WRITE CKSUM
	aeraidz                                         ONLINE       0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    gptid/d6d7e89e-c1a8-11e6-b6ef-6805ca2137d9  ONLINE       0     0     0
	    gptid/d719b7fe-c1a8-11e6-b6ef-6805ca2137d9  ONLINE       0     0     0
	logs
	  mirror-1                                      ONLINE       0     0     0
	    ada1                                        ONLINE       0     0     0
	    ada2                                        ONLINE       0     0     0
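Worth noting: if a single log device is already in the pool, it can also be converted into a mirror after the fact with `zpool attach`, rather than removing and re-adding it. A sketch using the device names above:

```shell
# Attach ada2 alongside the existing log device ada1, turning the log into a mirror
zpool attach aeraidz ada1 ada2

# The "logs" section should now show a mirror vdev containing both devices
zpool status aeraidz
```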
 

LaserAllan

Member
Joined
Mar 25, 2017
Messages
33
This is an interesting topic,

I run 12 ESXi nodes, all of them writing to NFS shares on ZFS. So far I have a 12-drive RAIDZ2 pool, divided into two vdevs of 6 drives each.

I recently added an NVMe SLOG drive, an Intel P3700, and that really sped up my NFS writes a lot.
I can easily saturate gigabit Ethernet, so the next plan is to add 10G networking for the storage network.
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
13,526
This is an interesting topic,

I run 12 ESXi nodes, all of them writing to NFS shares on ZFS. So far I have a 12-drive RAIDZ2 pool, divided into two vdevs of 6 drives each.
Definitely not the way to do that.


I recently added an NVMe SLOG drive, an Intel P3700, and that really sped up my NFS writes a lot.
I can easily saturate gigabit Ethernet, so the next plan is to add 10G networking for the storage network.
 

LaserAllan

Member
Joined
Mar 25, 2017
Messages
33
Definitely not the way to do that.





I guess I should clarify: I do not run the VM storage on ZFS in this case; I use NFS to mount certain datasets into VMs....

A total rebuild would have to be done if I were to switch to all mirrors :)
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
13,526
I guess I should clarify: I do not run the VM storage on ZFS in this case; I use NFS to mount certain datasets into VMs....

A total rebuild would have to be done if I were to switch to all mirrors :)
Okay, well, then that's fine then. Note you might still have some speed limits because RAIDZ is somewhat slower than mirrors.
 

guemi

Junior Member
Joined
Apr 16, 2020
Messages
22
Does SMB make any difference here? I am using TrueNAS Core 12.0.

I have a Hyper-V hypervisor on which I want to mount VHDX files, and in turn put a VHDX inside that VHDX (VHDX-ception), because I want to run VSS backups from the hypervisor host.


The way I see it I have three options:


A) Make an NFS share, put VHDX files on it, mount those VHDX files on my hypervisor, and then create the actual virtual drive a VM will use inside that.

VSS run on the hypervisor will see the VHDX on the NFS drive as a regular drive and make a snapshot as if it were a local drive - all is good in the world. (I've actually tested this.)

B) Doing A, but using an SMB share instead of NFS (tested this too; works splendidly).


C) Making one iSCSI target per VM and putting the VHDX there. Again, VSS snapshots will work.


What I do not completely grasp yet is what will happen when my TrueNAS 12.0 loses network or power. (It's obviously redundant, but it's a question of when, not if, as always with tech.)

I am using 2x CORSAIR MP510B NVMes as cache for my HDD pool, so I am very much able to get good speeds (and 10 Gbps Base-T copper connectivity) to my NAS, but I am worried about potentially corrupt VHDXs when the power goes out.


Which one would be considered "best practice" between A, B and C?
 

jgreco

Resident Grinch
Moderator
Joined
May 29, 2011
Messages
13,526
iSCSI is undoubtedly the best at providing precise feedback to the hypervisor about the state of things. NFS should be fine as well, although with the normal "NFS freezes" when the datastore vanishes. I wouldn't trust SMB because it is a reverse engineered cluster-fsck and who knows what might be lurking on either the client or server side.

ZFS and sync writes are precisely what this thread is all about. If you want to avoid corruption, you need a system that reliably handles sync writes. This needs to be resilient at all levels. That starts with enabling sync writes to a SLOG device that has appropriate power loss or other similar protection in the case of a power failure, and continues up the stack.
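In concrete terms, "enabling sync writes" means forcing sync on the dataset backing the hypervisor and confirming the pool actually has a dedicated log device. A minimal check, with hypothetical pool/dataset names:

```shell
# Never let the client downgrade a sync request on the VM datastore
zfs set sync=always tank/vm-datastore

# Confirm the setting took effect, and look for a "logs" section in the pool layout
zfs get sync tank/vm-datastore
zpool status tank
```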
 

guemi

Junior Member
Joined
Apr 16, 2020
Messages
22
iSCSI is undoubtedly the best at providing precise feedback to the hypervisor about the state of things. NFS should be fine as well, although with the normal "NFS freezes" when the datastore vanishes. I wouldn't trust SMB because it is a reverse engineered cluster-fsck and who knows what might be lurking on either the client or server side.

ZFS and sync writes are precisely what this thread is all about. If you want to avoid corruption, you need a system that reliably handles sync writes. This needs to be resilient at all levels. That starts with enabling sync writes to a SLOG device that has appropriate power loss or other similar protection in the case of a power failure, and continues up the stack.
Excellent - then I am on the right track. I'll scrap SMB.

Thank you for your input!
 