*Why* is SMB "async" and NFS "sync"?

Joined
Oct 22, 2019
Messages
3,641
I understand that by default SMB uses async writes, while NFS uses sync writes.

When I try to find the original or historical reason for this, I only see discussions about "this is this, and that is that".

But to quench my curiosity, I'm very interested to understand why writes are handled differently with SMB vs NFS by default.

If someone suggests "You can increase write performance for NFS shares by setting the dataset property to sync=disabled", you'll likely be met with "That breaks standard!" or "That puts your data at risk!"

Okay, well if using NFS with sync=disabled ("async writes") puts your data at risk, then how is it any more risky than SMB shares which by default use async writes, anyways?

Is it not the same level of "risk" for data loss? Aren't users that save data via SMB taking on the risk of "scary dangerous async writes"?

If someone loses 10 seconds worth of data from writes that never sync'd to the storage media, why does it matter if this was over SMB or NFS using sync=disabled?

This is the part I don't understand.

I'm fully aware of the risks involved, and how using sync writes assures you that the data has been successfully written to disk. But if async writes means "You don't really value your data", there should be a giant red disclaimer when creating a new SMB share:
SMB shares use async writes by default! Understand the risks involved. :eek:

Otherwise, the messaging and explanations are inconsistent. Maybe it's not such a big deal to use NFS shares with sync=disabled, since all of your writes prior to it were already async anyways (i.e., over SMB)?
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Sync writes only matter where the application doesn't handle failed writes.

If a use-case is things like office files (I guess for MS and SMB), then speed probably takes priority over reliability as you still have the application handling a locally stored copy on the client.

As you can see here: https://en.wikipedia.org/wiki/Network_File_System

NFS was originally designed to be stateless, operating over UDP, so needed to be safe by design, hence the sync writes in the beginning, but in NFSv3 async was introduced together with TCP to improve speed and flexibility.

I imagine the "default" is actually determined by the client (like ESX), so perhaps the protocol itself doesn't have one.

I'm not sure if that really answers your question, but ultimately it's about satisfying the client, which should be the side which understands if the "application" is accounting for data loss or not.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
In the case of VMware NFS storage the writes must be synchronous because VMware itself provides certain guarantees to the guest operating systems. When the guest OS says "flush to stable storage now" and the hypervisor answers "ok", the data must have been written or the guest OS' file system will trash its structures sooner or later.

Caching and consistency are managed in the guest OS which thinks it is writing to a disk drive.
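For readers who have not seen what a "flush to stable storage now" request looks like from the writing side, here is a minimal Python sketch. It only illustrates the general idea: the function names `write_durably` and `demo` are made up for this example, and it says nothing about how VMware or any particular guest OS implements its own flushes. The key call is `os.fsync`, which asks the OS not to return until it reports the file's data as committed. Over NFS, that request is what ends up making the server wait for stable storage.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Write data, then block until the OS reports it flushed to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than asked, so loop until done.
        while written < len(data):
            written += os.write(fd, data[written:])
        # The explicit "flush to stable storage now" request.
        # Without this call, the write may still be sitting in a cache
        # when the function returns.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


def demo() -> bytes:
    """Hypothetical usage: write a small file durably and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        write_durably(path, b"committed")
        with open(path, "rb") as f:
            return f.read()
```

An application that skips the `os.fsync` call is effectively saying it can tolerate losing the last few seconds of writes, which is the assumption a hypervisor cannot make on behalf of its guests.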
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
is NFS sync by default though? I don't think it is.
IIRC, ESX writes to NFS as sync every time, but fairly sure NFS can do both.
SMB is not capable of sync; it simply wasn't designed with that. the only way to get sync with SMB is to enable it in zfs.
 
Joined
Oct 22, 2019
Messages
3,641
In the case of VMware NFS storage the writes must be synchronous because VMware itself provides certain guarantees to the guest operating systems. When the guest OS says "flush to stable storage now" and the hypervisor answers "ok", the data must have been written or the guest OS' file system will trash its structures sooner or later.
SMB is not capable of sync; it simply wasn't designed with that. the only way to get sync with SMB is to enable it in zfs.
If a use-case is things like office files (I guess for MS and SMB), then speed probably takes priority over reliability as you still have the application handling a locally stored copy on the client.

NFS was originally designed to be stateless, operating over UDP, so needed to be safe by design, hence the sync writes in the beginning, but in NFSv3 async was introduced together with TCP to improve speed and flexibility.

Appreciate these responses, and will include a "composite" follow-up to the above highlighted portions.

If someone is/was using SMB shares to serve to clients over the LAN, but then switched to NFS instead, wouldn't the same "risks" apply if they changed the dataset's property to sync=disabled? It shouldn't matter what software they're using (file manager, office suite, video editor, etc). (No VMware or guest OSes involved.)

Because as it stands now, such a transition from SMB -> NFS takes a significant performance penalty if the dataset remains at sync=standard. To regain the same performance, one would have to change the dataset to sync=disabled.

If writes are not guaranteed to be flushed to disk storage with SMB (server responds back "All good here!"), then why does it matter if under the same use-case with NFS the server responds in the same way because the dataset is sync=disabled?

Why is someone discouraged from using NFS with sync=disabled, when in fact it's just as "risky" as someone using SMB with sync=standard?

(Not including the case that @Patrick M. Hausen brought up in regards to VMs, which is a good point he made.)



is NFS sync by default though? I don't think it is.
I believe for the client, it defaults to "async", but this only affects how/when "written" data is sent to the server.

Whereas on the server, it defaults to "sync", which affects how/when the "received" data is safely flushed to disk storage.

The client and server settings are independent of each other.

It is the server's side (where you have spinning HDDs) that can significantly impact write performance. Two ways to "violate" the NFS standard are to either explicitly configure the NFS server to use "async" or set the dataset's property to sync=disabled.

My head-scratching moment is what I wrote in bold text above. (It's not so much a technical question, but rather I'm curious about the inconsistency in regards to the "importance of data integrity and risk" in the same use-case scenario.)




EDIT: Here's an analogy.

You use "Bradley Mailing Service" to send letters and greeting cards to people around the country. "Bradley" does not provide any tracking or confirmation.

Later on, you switch to "Maria Mailing Service". This service provides tracking and confirmation, but because of all the safeguards and confirmations, letters take longer in transit. You decide to "opt out" of the tracking and confirmation benefits, because you want your letters to reach their recipients in the same length of time you've enjoyed with "Bradley Mailing Service".

Your friend warns you: "Don't do that! Don't opt out! You'll lose the benefits of tracking and confirmation!"

To which you respond: "But that's how it's always been with me before..."
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
I believe for the client, it defaults to "async", but this only affects how/when "written" data is sent to the server.

Whereas on the server, it defaults to "sync", which affects how/when the "received" data is safely flushed to disk storage.
I do not believe nfs is sync by default (at least in TrueNAS). sync is requested by the client by default, as it's assumed the client knows best if the data is sensitive or not (ESX, for example, assumes VMs are and always uses NFS with sync). this is the same for ZFS; datasets are async by default, only writing as sync when requested by the writing process (usually by NFS), or when manually set to sync.
Why is someone discouraged from using NFS with sync=disabled, when in fact it's just as "risky" as someone using SMB with sync=standard?
probably because if you disable sync, it's always disabled; no client could request a sync write.
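To make that point concrete, here is a small Python sketch of how the three values of the ZFS `sync` property (`standard`, `always`, `disabled`) interact with a writer's sync request. This is a simplified model, not ZFS code: the enum and the function name `ack_implies_stable` are invented for illustration, and the model ignores everything else that sits between the client and the pool (NFS server options, mount options, and so on).

```python
from enum import Enum


class SyncProperty(Enum):
    """The three documented values of the ZFS dataset property `sync`."""
    STANDARD = "standard"
    ALWAYS = "always"
    DISABLED = "disabled"


def ack_implies_stable(dataset_sync: SyncProperty, client_requested_sync: bool) -> bool:
    """Return True if an acknowledged write is guaranteed to be on stable storage.

    Simplified model (assumption): only the dataset property and the
    writer's own request are considered.
    """
    if dataset_sync is SyncProperty.ALWAYS:
        return True   # every write is treated as a sync write
    if dataset_sync is SyncProperty.DISABLED:
        return False  # sync requests are ignored, whoever makes them
    return client_requested_sync  # standard: honour what the writer asked for


if __name__ == "__main__":
    for prop in SyncProperty:
        for requested in (False, True):
            print(prop.value, requested, ack_implies_stable(prop, requested))
```

With `standard`, a writer that never asks for sync gets the same result as with `disabled`. The difference only shows up for the writers that do ask: under `disabled` their request is silently dropped.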
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
You decide to "opt out"
you can opt out by not enabling sync in the client, the mount point, the nfs server, or the zfs filesystem.
I don't see this analogy being accurate.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Why is someone discouraged from using NFS with sync=disabled, when in fact it's just as "risky" as someone using SMB with sync=standard?
Because most clients/applications aren't expecting SMB writes to be guaranteed, but the applications that require NFS/iSCSI do expect that. If you just lie to them to make things go faster, you're making their assumption (that all data written is really written) false, which is dangerous for things like critical filesystem operations, where your virtual disk becomes corrupt if some of those writes just disappear.
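The difference between "never expected a guarantee" and "was lied to" can be shown with a toy simulation. The Python sketch below is purely illustrative: `ToyServer`, `blocks_lost_after_crash`, and the block names are all made up, and a real server buffers and flushes in far more complicated ways. The model has a RAM buffer that is lost on a crash and a stable store that survives. The server always answers "ok", but only flushes before answering if it honours sync requests.

```python
class ToyServer:
    """Toy storage server: a volatile buffer plus a stable store."""

    def __init__(self, honor_sync: bool):
        self.honor_sync = honor_sync
        self.ram_buffer = []  # lost on crash
        self.stable = []      # survives a crash

    def flush(self) -> None:
        """Move everything buffered in RAM to stable storage."""
        self.stable.extend(self.ram_buffer)
        self.ram_buffer.clear()

    def write(self, block: str, sync_requested: bool) -> str:
        """Accept a write and acknowledge it."""
        self.ram_buffer.append(block)
        if sync_requested and self.honor_sync:
            self.flush()  # acknowledge only after the data is stable
        return "ok"

    def crash(self) -> None:
        """Power loss: whatever was only in RAM is gone."""
        self.ram_buffer.clear()


def blocks_lost_after_crash(honor_sync: bool, sync_requested: bool) -> list:
    """Return the acknowledged blocks that did not survive a crash."""
    server = ToyServer(honor_sync)
    acked = []
    for block in ["journal-1", "journal-2", "metadata"]:
        if server.write(block, sync_requested) == "ok":
            acked.append(block)
    server.crash()
    return [b for b in acked if b not in server.stable]
```

In this model, a client that requests sync from an honest server loses nothing, while a client that never requests sync loses the buffered blocks but also never assumed otherwise. The dangerous case is `blocks_lost_after_crash(honor_sync=False, sync_requested=True)`: the client asked for a guarantee, got "ok", and still lost every block.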
 