Removing cache drives from live pool

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
Then why bother with SLOG at all? It's slowing your writes down without providing a benefit. Just turn off sync writes and be done with it if "some lost data [...] isn't a concern or a problem."


I was under the impression that adding a fast, large write cache in front of slower spinning disks would improve speeds? The SLOG is a 1 TB M.2 SSD with about 2500 MB/s write speed.
As long as we don't fill the write cache drive, shouldn't writes to the pool be much faster, with the SLOG then feeding the spinning disks as fast as they can go?
 

c77dk

Patron
Joined
Nov 27, 2019
Messages
468
I was under the impression that adding a fast, large write cache in front of slower spinning disks would improve speeds? The SLOG is a 1 TB M.2 SSD with about 2500 MB/s write speed.
As long as we don't fill the write cache drive, shouldn't writes to the pool be much faster, with the SLOG then feeding the spinning disks as fast as they can go?
SLOG is just a way to secure your sync writes - normally this device will only see writes, and no reads. If you need to read from it, then something brown has hit a fan.
What you seem to want could be implemented with another pool and some replication, or maybe an sVDEV (but be sure to get redundancy right).
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I was under the impression that adding a fast, large write cache in front of slower spinning disks would improve speeds? The SLOG is a 1 TB M.2 SSD with about 2500 MB/s write speed.
As long as we don't fill the write cache drive, shouldn't writes to the pool be much faster, with the SLOG then feeding the spinning disks as fast as they can go?

Yes, I'm sure that adding a fast, large write cache in front of slower spinning disks would improve write speeds.

All this means is that when I suggested you go and read the "Some insights into SLOG" post, you didn't, because you still seem to think that the SLOG is some sort of write cache.

It is absolutely NO SORT of write cache at all.

Please do avail yourself of the resources that people point you at. Almost all of them are really good things to know about ZFS, even where they may not be immediately relevant to your specific situation.

Your system's main memory is your write cache. SLOG is a thing that guarantees written data still reaches the pool, even in the event of a power loss. Your fastest write mode is ALWAYS to disable sync writes; ZFS simply CANNOT write faster than the pool can accept flushed transaction groups.
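
For reference, sync behaviour is a per-dataset property in ZFS, which is also where you'd turn sync writes off. A minimal sketch from Python (the dataset name "tank/data" is just a placeholder; sync=disabled trades the last few seconds of acknowledged writes for speed):

# Rough sketch: inspect and change the per-dataset "sync" property.
# "tank/data" is a placeholder, not a pool from this thread.
import subprocess

def get_sync(dataset: str) -> str:
    # zfs get -H -o value sync <dataset>  ->  "standard", "always" or "disabled"
    out = subprocess.run(
        ["zfs", "get", "-H", "-o", "value", "sync", dataset],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

def disable_sync(dataset: str) -> None:
    # Fastest writes, but sync writes acknowledged in the last few seconds
    # can be lost if the box dies before the next transaction group commits.
    subprocess.run(["zfs", "set", "sync=disabled", dataset], check=True)

print(get_sync("tank/data"))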
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
As long as we don't fill the write cache drive
The problem there (other than incorrectly identifying the SLOG as a write cache) is that the amount of writes allowed to sit in that state is about five seconds' worth, so any kind of sustained IO basically throttles as soon as the pool can't absorb it. The only thing it helps with is flattening out short spikes in write IO.
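
If you want to see those limits on your own box, here's a rough sketch for Linux-based OpenZFS (assumption: the stock module parameter names; on FreeBSD/CORE the equivalent values are exposed under sysctl vfs.zfs.*):

# Read the knobs behind the "roughly 5 seconds of outstanding writes" behaviour.
from pathlib import Path

PARAMS = Path("/sys/module/zfs/parameters")

def read_param(name: str) -> int:
    return int((PARAMS / name).read_text().strip())

txg_timeout = read_param("zfs_txg_timeout")     # seconds between forced txg commits (default 5)
dirty_max = read_param("zfs_dirty_data_max")    # cap on dirty (uncommitted) data held in RAM, bytes

print(f"txg timeout: {txg_timeout}s, dirty data cap: {dirty_max / 2**30:.1f} GiB")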
 

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
All this means is that when I suggested you go and read the "Some insights into SLOG" post, you didn't, because you still seem to think that the SLOG is some sort of write cache.

It is absolutely NO SORT of write cache at all.

Please do avail yourself of the resources that people point you at. Almost all of them are really good things to know about ZFS, even where they may not be immediately relevant to your specific situation.

Your system's main memory is your write cache. SLOG is a thing that guarantees written data still reaches the pool, even in the event of a power loss. Your fastest write mode is ALWAYS to disable sync writes; ZFS simply CANNOT write faster than the pool can accept flushed transaction groups.


Then I suggest the official documentation that quite literally states that SLOG is a write cache be revisited:


All drives in an encrypted pool are encrypted, including L2ARC (read cache) and SLOG (write cache).


And the linked post also repeatedly mentions SLOG as a write cache. Probably because it is.


No idea why you're getting worked up, given that a SLOG most definitely is a write cache. Now, it may or may not speed up writes, but that doesn't make it any less of a write cache.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Then I suggest the official documentation that quite literally states that SLOG is a write cache be revisited:


Feel free to submit a bug report. I am not an iX employee and I don't manage their technical writing. You are making the suggestion to the wrong person.

And the linked post also repeatedly mentions SLOG as a write cache. Probably because it is.

It DOES NOT. I know, because I wrote the linked post. I specifically point you to the section that discusses the actual ZFS write cache: "The ZFS write cache. (No, it's not the ZIL.)"

No idea why you're getting worked up, given that a SLOG most definitely is a write cache. Now, it may or may not speed up writes, but that doesn't make it any less of a write cache.

In computer science, a write cache is a resource that data is written into and later read back out of, so it can be spooled off to disk.

The ZIL/SLOG is never read under normal operating conditions.

Its sole purpose is to act as an intent log, which is what the "I" and "L" in "ZIL" stand for, to guarantee that sync writes are actually written to something that can withstand a power loss, kernel crash, etc.

The ZIL/SLOG is *exclusively* written to, never read from, except at one time: it is read during pool import to make sure that the transactions in the log were actually written to the pool disks.

Therefore, it is most definitely not a write cache.
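
To make that distinction concrete, here's a toy sketch (made-up classes, nothing to do with actual ZFS code):

# A write cache is read back out in normal operation; an intent log is not.

class WriteBackCache:
    """Data is written here, then later READ BACK OUT of here and spooled to disk."""
    def __init__(self):
        self.buffer = []

    def write(self, block):
        self.buffer.append(block)

    def flush(self, disk):
        for block in self.buffer:   # the cache sits on the read path of every flush
            disk.append(block)
        self.buffer.clear()

class IntentLog:
    """Data is only logged here. The flush to disk comes from RAM, not from the log;
    the log is read exactly once, during crash recovery (pool import)."""
    def __init__(self):
        self.log = []

    def record(self, block):
        self.log.append(block)      # written and acknowledged, normally never read

    def replay_after_crash(self, disk):
        for block in self.log:      # the one time the log is ever read
            disk.append(block)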

Additionally, it does NOT generally speed up writes. ZIL/SLOG writes are guaranteed to be slower than async writes. ZIL/SLOG helps with synchronous writes because it gives them a fast, safe place to land, so they can be acknowledged before the transaction group is committed to the pool.

Please consider actually reading the "Some insights into SLOG" post this time. I put extensive time and effort into the things I post in an effort to educate people who are having trouble with the concepts, which are, admittedly, sometimes a bit difficult.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I thought this was an L2ARC thread, how are we talking about SLOG? ;)

Then I suggest the official documentation that quite literally states that SLOG is a write cache be revisited:


Both this and the newer page at https://www.truenas.com/docs/references/slog/ have some incorrect assumptions/interpretations of SLOG, specifically around sizing (it's more complex than "throughput over 5s") and the section on "SLOG for asynchronous writes", which is just wrong by definition: you can force all writes to be treated as sync, and therefore go to the ZIL/SLOG, but async writes will still happily go to RAM only, and they'll still be faster.

And the linked post also repeatedly mentions SLOG as a write cache. Probably because it is.

No idea why you're getting worked up, given that a SLOG most definitely is a write cache. Now, it may or may not speed up writes, but that doesn't make it any less of a write cache.

Much like the "RAIDZ/RAID0/stripe" terminology, it's important to understand that people aren't just being pedantic here, but trying to describe the underlying ZFS behaviour. Think of the ZIL/SLOG as the ZFS equivalent to "battery backup" for a RAID card. It's not in the primary data path, it's there to provide assurance for your most important data in case of failures. When a new write comes into your ZFS server, it goes into RAM. If it's marked as sync, it also goes to the ZIL or SLOG. But when it comes time to "batch up the transaction" for committing to the capacity/data vdevs, those reads come from RAM, not the SLOG.
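
In toy pseudocode (illustration only, not real ZFS code), that flow looks roughly like this:

# Every write lands in RAM; sync writes are additionally logged to the ZIL/SLOG;
# the pool is written from RAM when the transaction group commits.

def handle_write(block, is_sync, ram_dirty, slog):
    ram_dirty.append(block)          # all incoming writes go to RAM first
    if is_sync:
        slog.append(block)           # sync writes are ALSO logged to the ZIL/SLOG
    return "ack"                     # sync writes are only acked once safely logged

def commit_transaction_group(ram_dirty, data_vdevs, slog):
    data_vdevs.extend(ram_dirty)     # the data vdevs are written from RAM, not the SLOG
    ram_dirty.clear()
    slog.clear()                     # log entries for committed data are now stale

ram, slog, pool = [], [], []
handle_write("sync block", True, ram, slog)
handle_write("async block", False, ram, slog)
commit_transaction_group(ram, pool, slog)
print(pool)                          # both blocks reach the pool; the SLOG was never read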

It's not a scenario where you can say "well, I have a machine with 32GB of RAM, and a massive 512GB SSD - I want to be able to write 500GB at SSD-speeds and then have it spool to a bunch of very slow disks." Because the "write cache" lives in RAM, the maximum you'd ever be able to hold would be that 32GB (less system/service overhead) of outstanding dirty data, and in order to do that, you'd blow out your RAM-based read cache.

You can copy everything to a pool on said SSD, and then use a server-side rsync/snapshot/copy script to move it to the HDD-based storage, but that's a more manual undertaking. ZFS doesn't do "data tiering."
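
A rough sketch of what such a server-side script might look like (the pool/dataset names "fast/ingest" and "slow/archive" are placeholders, and this only covers the first full copy; repeated runs would use incremental sends):

import datetime
import subprocess

def tier_out(src: str = "fast/ingest", dst: str = "slow/archive") -> None:
    # Snapshot the SSD dataset, then replicate it to the HDD pool.
    # Assumes dst does not exist yet; later runs would use zfs send -i.
    snap = f"{src}@tier-{datetime.datetime.now():%Y%m%d-%H%M%S}"
    subprocess.run(["zfs", "snapshot", snap], check=True)
    send = subprocess.Popen(["zfs", "send", snap], stdout=subprocess.PIPE)
    subprocess.run(["zfs", "recv", dst], stdin=send.stdout, check=True)
    send.wait()

tier_out()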

Given your previous copy speeds (500-600MB/s on a four-drive RAIDZ1), I'll eat my hat if you aren't already running sync=standard, so this is as fast as that pool will get - and for a four-drive Z1 that's impressive for sequential speeds, as it implies 150-200MB/s of usable throughput per spindle, depending on how compressible your data is. If you want to go faster, I suggest more back-end vdevs, possibly in mirrors if any of that access trends towards "concurrent clients" or "random".
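
(The back-of-envelope math behind that per-spindle figure, assuming purely sequential IO and ignoring compression:)

# Four-drive RAIDZ1 stripes across three data disks plus one drive's worth of parity.
data_spindles = 4 - 1
for pool_mb_s in (500, 600):
    print(f"{pool_mb_s} MB/s pool ~= {pool_mb_s / data_spindles:.0f} MB/s per spindle")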
 

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
That seems curious. I would think that's an error/bug/problem, because that should be a nonobjectionable change. If you can duplicate that, it would be interesting to follow up on.


I managed to reproduce this again. SMB crashed outright and wouldn't respond at all.

Very interestingly, iSCSI stayed working this time - it had 2 separate disks connected to the same file server, which was fully functional. (Connected via Ethernet.)
ICMP also worked.

But this time SNMP, SSH, and the WebUI stopped responding, so I had to hard reboot the server.
 