Asynchronous write dangers

Status
Not open for further replies.

dairyengguy

Dabbler
Joined
Jul 17, 2015
Messages
28
I was trying to find out how to size a SLOG, and I came across this post. I had been planning on pricing a SLOG so that I can enable sync and maintain performance on my test install. It raised a few questions for me.

Asynchronous Writes
Asynchronous writes have a history of being "unstable". You have been taught that you should avoid asynchronous writes, and if you decide to go down that path, you should prepare for corrupted data in the event of a failure. For most filesystems, there is good counsel there. However, with ZFS, it's nothing to be afraid of. Because of the architectural design of ZFS, all data is committed to disk in transaction groups. Further, the transactions are atomic, meaning you get it all, or you get none. You never get partial writes. This is true with asynchronous writes. So, your data is ALWAYS consistent on disk, even with asynchronous writes.

So, if that's the case, then what exactly is going on? Well, there actually resides a ZIL in RAM when you enable "sync=disabled" on your dataset. As is standard with the previous synchronous architectures, the data blocks of the application are sent to a ZIL located in RAM. As soon as the data is in the ZIL, RAM acknowledges the write, and then flushes the data to disk, as would be standard with synchronous data.

I know what you're thinking: "Now wait a minute! There are no acknowledgements with asynchronous writes!" Not always true. With ZFS, there is most certainly an acknowledgement, it's just one coming from very, very fast and extremely low-latency volatile storage. The ACK is near instantaneous. Should there be a crash or some other failure that causes RAM to lose power, and the write was not saved to non-volatile storage, then the write is lost. However, all this means is you lost new data, and you're stuck with old but consistent data. Remember, with ZFS, data is committed in atomic transactions.

From here: https://pthree.org/2013/04/19/zfs-administration-appendix-a-visualizing-the-zfs-intent-log/
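The "old but consistent" behaviour described in the quote can be pictured with a toy model. This is only a sketch and not real ZFS code: the class `ToyPool` and its methods are hypothetical names made up for illustration, and it assumes a transaction group can be modelled as a dictionary that is swapped in as a single step.

```python
class ToyPool:
    """Toy model of transaction-group commits; hypothetical, not real ZFS code."""

    def __init__(self):
        self.on_disk = {}   # last committed state, always consistent
        self.pending = {}   # async writes acknowledged but still only in RAM

    def write_async(self, name, data):
        # The caller gets its acknowledgement here, before anything hits disk.
        self.pending[name] = data

    def commit_txg(self):
        # All-or-nothing: build the new state, then swap it in as one step.
        new_state = dict(self.on_disk)
        new_state.update(self.pending)
        self.on_disk = new_state
        self.pending = {}

    def crash(self):
        # Power loss: whatever was only in RAM is gone.
        self.pending = {}


pool = ToyPool()
pool.write_async("a.txt", "v1")
pool.commit_txg()                  # "v1" is now safely on disk
pool.write_async("a.txt", "v2")    # acknowledged, not yet committed
pool.write_async("b.txt", "new")   # acknowledged, not yet committed
pool.crash()
print(pool.on_disk)                # {'a.txt': 'v1'}
```

After the simulated crash the pool holds the older `a.txt` and no `b.txt`: nothing is half-written, but both acknowledged writes are gone.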

I was wondering about this, as I have seen threads on this forum suggesting that we should avoid async due to potential corruption issues. However, that seems to contradict this blog post. Can anyone explain what I am missing here?
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I think there is a difference between consistent and uncorrupt. After a power loss during an async write, the filesystem might be consistent, but you might have files that are only half there (or gone completely). In other words, the filesystem looks fine, but an application trying to use that data might not find the file it was looking for.

http://constantin.glez.de/blog/2010/07/solaris-zfs-synchronous-writes-and-zil-explained
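To make the application side of this concrete, here is a minimal sketch of how a program usually asks for a write to be durable before it carries on. It assumes a POSIX system; `durable_write` is a hypothetical helper name, not part of any library. Note that with sync=disabled on the dataset, ZFS does not honor these synchronous requests, which is exactly the window where an application can believe a file is safe when it is not.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path so a crash leaves either the old or the new file.

    Hypothetical helper for illustration; assumes POSIX semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # ask the OS to push the data to stable storage
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory too, so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A plain `open(path, "wb").write(data)` without the `os.fsync` calls is the asynchronous case: the call returns once the data is buffered in memory, and a power loss before the next commit can leave the application without the file it thought it had written.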
 

dairyengguy

Dabbler
Joined
Jul 17, 2015
Messages
28
Agreed. I read that article earlier in my search for answers.

This again leads me to believe that these issues will only occur on a failure. Since I will be discarding any backup that happens on a failure (or have three more backups that I can use), I believe I can write asynchronously with an acceptable level of risk.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Seems reasonable
 