Another SLOG question - confirm my understanding

robv1306

Cadet
Joined
Jan 13, 2023
Messages
2
Hi All,

Apologies that this is yet another SLOG post but I've searched high and low and cannot find the specific answer I'm looking for. Having read up on SLOG for the last few days, I'm looking to have someone confirm my understanding.

I'm a home user running 3x 4TB spinning HDDs in RAIDZ1. My rig has only 16 GB of RAM. I'm running all synchronous writes since I don't have a UPS.

The specific question I have is whether having a SLOG would allow for fast writes regardless of how quickly the array can accept the data from the SLOG.

As I understand it, writing files to an array using a SLOG entails the data being written to the SLOG, then declared as complete before being written to the array and flushed off the SLOG. However, while I know that the SLOG is flushed every 5 seconds, I'm not sure what happens if the data that is written to the SLOG cannot be flushed to the array in the 5s.

For example, if I attempted to write a 5 GB file to my pool with an NVMe drive set up as a SLOG, my array would not be fast enough to accept all of that 5 GB onto the HDDs within the 5 seconds, but the NVMe drive would happily ingest all of it within that timeframe. The question is, would the pool still "write" the data (i.e. onto the SLOG) at the speed of the NVMe drive and then flush to the array over multiple flush cycles (i.e. numerous 5-second cycles), or would my SLOG be limited to accepting only what my HDD array can absorb within a 5-second window?

In other words: I have a RAIDZ1 array of spinning rust, and I'll be synchronously writing single large files to it. Will adding an NVMe drive (or mirrored drives) as a SLOG increase the write speed of my pool to that of the NVMe drive, or will it be in some way limited?

Btw, I am aware that it's advisable to have PLP on the SLOG(s) but since this is only for home use, I have assessed that so long as I use a DRAM-less drive as the SLOG, I am comfortable with the residual risk of data loss as a result of the lack of PLP.

Thanks in advance.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The specific question I have is whether having a SLOG would allow for fast writes regardless of how quickly the array can accept the data from the SLOG.
Simple answer is no. First of all, the SLOG is never read from in normal operation; the TXG still exists in memory and is read from there. Second, a SLOG can never enable faster writes than what can be achieved with async, so it's more a mitigator of poor performance than a performance booster.

As I understand it, writing files to an array using a SLOG entails the data being written to the SLOG, then declared as complete before being written to the array and flushed off the SLOG. However, while I know that the SLOG is flushed every 5 seconds, I'm not sure what happens if the data that is written to the SLOG cannot be flushed to the array in the 5s.
SLOG or no SLOG, writes are throttled down if the pool cannot keep up.

Will adding an NVMe drive (or mirrored drives) as a SLOG increase the write speed of my pool to that of the NVMe drive, or will it be in some way limited?
No.
Btw, I am aware that it's advisable to have PLP on the SLOG(s) but since this is only for home use, I have assessed that so long as I use a DRAM-less drive as the SLOG, I am comfortable with the residual risk of data loss as a result of the lack of PLP.
I disagree with your analysis.
  1. If the risk of the SLOG failing is acceptable, then the risk of losing a TXG or two is acceptable, in which case you might as well forget about SLOG and go async.
  2. DRAM-less drives are not, in fact, DRAM-less. SATA SSDs have to rely on their tiny internal SRAMs, and NVMe SSDs can request a chunk of memory from the host to use for their own caching purposes (e.g. to hold the Flash Translation Layer tables) through the magic of DMA. Both memories are rather susceptible to power loss.
 

robv1306

Cadet
Joined
Jan 13, 2023
Messages
2
Thanks for the reply Ericloewe.

Seems I've still got plenty to learn here.

One thing I might have to clarify is that when I talk about "whether having a SLOG would allow for fast writes", I am defining the write as the time taken for the data to be secured (including on the SLOG) and for the SMB transfer to report to Windows as "finished". How long it actually takes to get the data out of RAM/SLOG and onto the spinning disk array doesn't concern me at all, so long as the data is secure by the time Windows says the transfer is "complete". Does that change your answer at all regarding speed?

Maybe I can ask a slightly modified question. Given my current setup of 3 spinning disks in RAIDZ1, and given that I've got it set to do nothing but synchronous writes, would I see a write performance uplift when:
a) writing small files such as pdfs and word docs etc
b) writing larger files such as high bitrate video files

If the answer to either of the above is yes, is there an equation by which the write speed can be estimated in this situation?

Thanks again for your help. I really appreciate it.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I've got it set to do nothing but synchronous writes
Why though?
would I see a write performance uplift when:
a) writing small files such as pdfs and word docs etc
b) writing larger files such as high bitrate video files
Probably, if the SLOG is any good, especially in case a).
If the answer to either of the above is yes, is there an equation by which the write speed can be estimated in this situation?
No, but you can quickly get an upper bound on how fast things can be by disabling sync writes and testing your workload. A SLOG with sync writes will never be faster than that.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
So... You have to step back and ask what a synchronous write is for. It ties to the open(2) system call in Unix, which implements a flag called O_SYNC. The open call returns a file descriptor that you can then read/write, etc. It's how every POSIX-compliant operating system initiates access to a file on a filesystem. The O_SYNC flag tells the OS not to return from any kind of write call on that descriptor until it receives confirmation that the data was committed to the underlying storage.

The problem with O_SYNC popped up in the '80s when we introduced networking and network filesystems. You now had two OSes involved in the write process to a file: the client initiating the transaction, and the server (NAS) that was actually responsible for doing the work and confirming the write was committed to storage. One of the original network filesystems, NFS, got around the problem by simply decreeing in its RFC that all writes would be performed O_SYNC. In the bad old days of single-core monolithic kernels with primitive process scheduling, that meant the write call would drop into the kernel and not return for 50+ ms in many cases. And that was 50 ms per write call, all of 20 calls per second. Various mechanisms to accelerate this were tried: some vendors dropped O_SYNC for their network-attached filesystems, some implemented a lazy write.

Sun Microsystems, the inventor of both NFS and ZFS, produced a card called Prestoserve that contained battery-backed RAM and implemented a kernel shim to allow the kernel to send the acknowledgement back to the client before the data had been committed to disk. That concept later became the SLOG in ZFS. But really, all it implements is "Ok, I got it, you can move on." It frees the client from waiting, but it doesn't speed up the server's disks.
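To make that concrete, here's a rough C sketch (my own illustration, not anything from ZFS itself; the mount path and sizes below are made up) that times the same 64 MiB write with and without O_SYNC. The O_SYNC run can't return from each write() until the storage underneath confirms the data is safe, which is exactly the wait a SLOG is meant to shorten; the buffered run is roughly the async upper bound Ericloewe mentioned above.

```c
/* Sketch only: time a 64 MiB write with and without O_SYNC.
 * The path below is a placeholder; point it at a dataset you can test on. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define CHUNK (1 << 20)            /* 1 MiB per write() call */
#define TOTAL (64 * (size_t)CHUNK) /* 64 MiB total */

static double write_file(const char *path, int extra_flags)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | extra_flags, 0644);
    if (fd < 0) { perror("open"); exit(1); }

    char *buf = malloc(CHUNK);
    memset(buf, 0xAB, CHUNK);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t done = 0; done < TOTAL; done += CHUNK) {
        /* With O_SYNC this call blocks until the data is on stable storage. */
        if (write(fd, buf, CHUNK) != CHUNK) { perror("write"); exit(1); }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    const char *path = "/mnt/tank/test/osync-demo.bin"; /* hypothetical path */

    printf("buffered: %.2f s\n", write_file(path, 0));
    printf("O_SYNC:   %.2f s\n", write_file(path, O_SYNC));
    return 0;
}
```

Run it against a dataset with sync=standard, then again with a SLOG attached (or with sync=disabled), and the gap between the two numbers shows you where the time is actually going.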

But... I have to ask: Are you even using O_SYNC writes? NFS certainly does. Local jails running MySQL/MariaDB may benefit. SMB/CIFS not so much.
 