Data integrity: NFS vs. iSCSI on ZFS with VMware

Status
Not open for further replies.

MichelZ

Dabbler
Joined
Apr 6, 2013
Messages
19
I have read several posts now about the NFS + ZFS sync problem with VMware.
They all talk about data safety and say not to turn off ZFS sync under any circumstances.
I get that...

But what about iSCSI?
Is VMware + iSCSI more error-prone than NFS when the storage host goes down unexpectedly?
Is it less safe to use iSCSI instead of NFS? Am I missing something fundamental here?

Thanks
Michel
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I would say they are about equally safe, in some situations.

iSCSI devices can suffer complete disconnects if latency gets too high. Obviously that isn't good for the file system or the machine using the device. iSCSI has strict latency limits, while NFS has far more lax requirements. This is why iSCSI problems are relatively severe and can cause file system and file corruption, while NFS just suffers from less-than-optimal performance.
 

MichelZ

Dabbler
Joined
Apr 6, 2013
Messages
19
Thanks again for your answer.

So you are saying the possibility of data loss in an NFS scenario is smaller than in an iSCSI scenario.
How can I prevent data loss/corruption with iSCSI when the FreeNAS system crashes while iSCSI clients are connected?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
If your FreeNAS system is crashing while iSCSI clients are connected, you have bigger fish to fry. It seems the answer to THAT problem would be "fix your NAS."

While iSCSI is more demanding, VMware is pretty good about being anal retentive with how it handles data on datastores. I'm not sure I'd be all that paranoid about data loss unless you've got a horribly broken system; I've seen VMware basically halt VM disk I/O due to iSCSI problems, and then recover neatly when the datastore recovers.

I put a lot of time and effort into figuring out how to keep a ZFS system responsive under heavy workloads in bug 1531. Good reading. I suspect that a properly provisioned FreeNAS system would make a sweet iSCSI target. The problem is, the definition of "properly provisioned" would probably be eye-meltingly shocking to mere mortals...
 

MichelZ

Dabbler
Joined
Apr 6, 2013
Messages
19
Well, "crashing" could also mean the system lost power...

So, what is the difference between iSCSI and NFS with sync=disabled, then?
Everyone recommends iSCSI over NFS with sync=disabled. What's the real difference between them?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Not much, really. As I said above, NFS doesn't have the same strict timing needs as iSCSI. Most people aren't going to go looking to disable sync with NFS. Additionally, many NFS writes aren't sync writes, but I believe every iSCSI write is a sync write. Disabling sync is by far the "best" way to increase iSCSI speed and also the "worst" way to decrease reliability.

If you read up on what the sync function does and how it works, you'll understand that disabling it for anything except testing purposes is not recommended by anyone who values their data. If your system is stable, you can try disabling it and see what happens: run your benchmarks and you'll be impressed with the performance increase. As long as no service crashes, the system doesn't crash, and the system always gets the opportunity to shut down properly, your data is safe. The problem is that if any of those happen, the results are universally very crappy.
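If you want to try it, the knob is a per-dataset property. A minimal sketch from the FreeNAS shell, assuming a pool named tank with a dataset named vmware (both placeholder names; yours will differ):

Code:
# check the current setting ("standard" honors client sync requests)
zfs get sync tank/vmware
# disable sync for TESTING ONLY - fast, but unsafe on crash or power loss
zfs set sync=disabled tank/vmware
# put it back when you're done benchmarking
zfs set sync=standard tank/vmware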
 

MichelZ

Dabbler
Joined
Apr 6, 2013
Messages
19
OK, iSCSI = all sync, NFS (VMware) = all sync. Agreed then?

So why do I get 80 write IOPS with NFS and 28,000 write IOPS with iSCSI?
Everyone blames NFS and ZFS sync, but you say iSCSI has the same sync requirements? I know I'm missing something, but I can't wrap my head around it... :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
MichelZ said:
OK, iSCSI = all sync, NFS (VMware) = all sync. Agreed then?

So why do I get 80 write IOPS with NFS and 28,000 write IOPS with iSCSI?
Everyone blames NFS and ZFS sync, but you say iSCSI has the same sync requirements? I know I'm missing something, but I can't wrap my head around it... :)
Not quite. With NFS writes, you have to remember that all of the writing is requested and performed by the VMware software itself, so whatever VMware is doing causes its own performance issues (or gains). iSCSI is simply a protocol that VMware is dishing up; VMware doesn't do any real processing. It's a dumb pipe that just passes the data along.

Also, keep in mind that NFS writes go to a system (in your case, FreeNAS) that maintains the files on its own file system (ZFS in your case). With iSCSI, the client keeps its own file system inside the iSCSI device, which itself lives on a remote file system (ZFS) on your remote machine (FreeNAS). The differences are pretty major, and depending on many factors, one will certainly beat the other in some situations.

Also, how the data is read (and written) matters. If you are writing one sector at a time with sync writes, expect terrible IOPS. If you are writing huge swaths of data at once (such as copying a single 10 GB file), your only "sync" is at the end. The same is true for reads: if you are making singular reads of one sector at a time, expect performance to tank. This is where read-ahead caches help; they save every single application from having to roll its own read-ahead cache (which is how it used to be back in the '80s and before). Performance depends on the load type. ZFS may be optimizing its writes into txgs for iSCSI and not for NFS; I don't know. The beast that is ZFS has internal workings I only barely understand.
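You can see the sync-per-write vs. sync-at-the-end difference for yourself from a Linux client writing into the mounted datastore (the oflag/conv flags below are GNU dd and aren't in FreeBSD's dd, and /mnt/nfs/test is just a placeholder path):

Code:
# one synchronous flush per 4k block - every write waits, IOPS tank
dd if=/dev/zero of=/mnt/nfs/test bs=4k count=1000 oflag=dsync
# buffered writes with a single fsync at the end - dramatically faster
dd if=/dev/zero of=/mnt/nfs/test bs=4k count=1000 conv=fsync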

Regarding the difference in performance, I'm not sure what the "expected" performance should be. For all I know, your higher IOPS with iSCSI could be because you accidentally turned off sync writes somehow. Your situation is a combination of many factors across two different machines with different OSes and different settings; it's not something I expect will be solved by a few forum posts. You definitely have to find the bottleneck. How to find it, or how much you can fix it, is a different story. There's a reason there are two-week courses on ESXi. :p

Personally, if you are getting 28k IOPS to the same zpool as your NFS writes, I'd say you have a VMware configuration issue and should look at that first. We can argue back and forth about who or what is to blame, but until you make a choice and try to prove or disprove it, I doubt you'll get much better advice from this forum. The VMware forums may be able to provide better info on how NFS works internally on their end.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
No, iSCSI (at least with file extents) is (at least by default) async.

You basically have a few choices here. VMware isn't being totally stupid: their interest is in maintaining the integrity of your VMs even in the event of faults and failures. So on the high end, storage systems actually implement battery-backed write caches and other strategies to be able to acknowledge sync write operations quickly. Low-end hard-drive-based NAS engines that lack a BBU will either simply disregard the sync requests (and go fast) or honor them (and go real slow). But either way, that isn't really VMware's fault. VMware has no clue WHAT its VMs are trying to write to disk, how important it might be, or how resilient it might be in the event of a crash, so asking for sync is a rational choice.

Pushing everything with sync writes is effectively a giant exercise in measuring the latency of your I/O subsystem. ZFS is big and piggy, but it has features designed to offset at least some of those downsides. If you don't use those features, ZFS very much resembles a low-end hard-drive-based NAS. Your choices in that case end up being: arrange to ignore sync and accept some possibility of data loss, or write sync and go real slow.

But with ZFS, you do have the option to use ZFS's mechanism for guaranteeing sync writes. This is the functional equivalent of a BBU write cache on a high-end NAS. It requires a SLOG device with a supercapacitor or similar power-loss write-completion protection, and then you can successfully accelerate NFS in sync mode.
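Something like this, assuming a pool named tank and a power-protected SSD labeled gpt/slog0 (both placeholders):

Code:
# attach a dedicated log (SLOG) device to the pool
zpool add tank log gpt/slog0
# confirm it shows up under the "logs" section of the pool layout
zpool status tank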

For iSCSI, you have to actually TELL ZFS that you want the writes to be sync (and you probably should), by setting sync=always.
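Assuming the iSCSI extent lives on a dataset or zvol at tank/iscsi (a placeholder name), that's a one-line property change:

Code:
# force every write to the extent through the ZIL (and thus the SLOG)
zfs set sync=always tank/iscsi
# verify
zfs get sync tank/iscsi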

So what's the difference between iSCSI with sync=standard and NFS with sync=disabled? One of them is not disabling sync within ZFS... some people fear sync=disabled could lead to filesystem loss or corruption within ZFS. You can do your own research on that one. So the NFS route may put the ZFS filesystem itself at risk, but the iSCSI route does not.

However, in both cases, you must be able to guarantee that data your VMware host writes is actually written to disk. You can do this through a SLOG device, or by taking the performance hit of writing to the pool with sync. If you choose not to make such a guarantee, then what happens is: some adverse event occurs, your VM thinks it has updated some blocks on disk, those blocks are sitting in your NAS's memory waiting to be written out, the NAS reboots and never writes them, and suddenly the VM's disk is inconsistent with what the VM thinks ought to be out there, and much hilarity ensues.
 

Pvaladez

Cadet
Joined
Sep 18, 2013
Messages
1
Just to clarify a bit on what jgreco said: choosing to use a SLOG (a separate device for the ZIL) means you should also be writing to the pool with sync. You can either force all writes to be synchronous by setting sync=always, or you can rely on your client (NFS, iSCSI, or whatever) to send synchronous write requests to your NAS. If you're not writing with sync, you will not benefit from a SLOG at all, because the ZIL is only used for synchronous writes. So if you agree that you must guarantee your data makes it to disk, the only option is to use synchronous writes and employ a SLOG to speed them up.

Also, if you decide to set sync=disabled to speed up performance and not worry about guaranteeing data gets written, to my knowledge that doesn't mean the ZFS filesystem is in any more danger of corruption than it would normally be. The ZIL is not used directly for ZFS filesystem integrity; the self-healing nature of checksums in ZFS already handles that. Rather, the ZIL is there to protect the integrity of the filesystem residing on top of the ZFS NAS and to make sure no data is lost while in transit to the disks.
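One rough way to confirm that sync writes are actually hitting the SLOG is to watch per-vdev activity while the clients are writing; assuming a pool named tank (a placeholder):

Code:
# print per-vdev stats every second; the device listed under "logs"
# should show write activity if the ZIL is offloaded to your SLOG
zpool iostat -v tank 1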

I recommend this blog on the subject:
http://nex7.blogspot.com/2013/04/zfs-intent-log.html
 