Example:
Your system is running smoothly. Your slog is taking in sync writes for the ~5 seconds between ZFS transaction group commits.
Before the next transaction occurs, you have a loss of power or a kernel panic.
Due to a bug in the SSD firmware (insert any explanation you want for why your SSD decides to stop working), the SSD will no longer be detected by any system when you plug it in.
You power on the server and the zpool won't import automatically because the slog device is missing. :p
You'll have to do some extra work from the CLI to import the zpool, and you've lost the previous ~5 seconds of writes that network clients were told (via sync writes) had been committed to the zpool.
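That "extra work from the CLI" looks roughly like this (the pool and device names here are hypothetical placeholders):

```shell
# The pool won't import on boot because the log vdev is gone.
# -m tells ZFS to import the pool despite the missing log device;
# any sync writes still sitting in the slog are lost at this point.
zpool import -m tank

# Then drop the dead log device from the pool configuration.
zpool remove tank ada3
```

Note that `-m` is exactly the moment you break the promise made to your clients: the pool comes back, but the uncommitted sync writes do not.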
Believe it or not, there are a few things that are very likely to occur:
- Loss of power and/or kernel panic seem to always happen to servers at some time or other. Human error, programming error, etc.
- SSDs tend to stop working on power-on and not when you are already using them.
- A system that is used in a production environment will *always* have *some amount of data* in the slog device. (this means you ARE going to have uncommitted zpool data in the slog every second of every day)
So unless:
- You are going to argue that a power loss and kernel panic have nearly zero percent chance of happening (I've heard it all.... dual redundant UPSes, diesel generator on standby, etc.)
- You are going to argue that SSDs never fail.
- You are never going to have data on the SSD.
Then you probably want to consider a mirror.
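Going from a single slog to a mirrored one is a one-line difference on the command line (pool and device names are hypothetical):

```shell
# Single slog -- the setup this post is warning you about:
#   zpool add tank log nvme0n1

# Mirrored slog -- one SSD can die without losing in-flight sync writes:
zpool add tank log mirror nvme0n1 nvme1n1

# Already running a single slog? Attach a second device to convert it
# to a mirror in place:
zpool attach tank nvme0n1 nvme1n1
```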
For home users, a failure might mean a corrupted movie file or Plex database that you can recover from in a couple of hours, so it might not be worth the cost of a second expensive SSD.
Now imagine corrupting an 800GB SQL database, like the one I was helping someone with last week. They've paid LOTS of money to recover from a mistake they made, directly related to an unexpected loss of power. You can't just do a "quick recovery" or run some diagnostic tool "quickly" and fix the kind of problems you'll have. It can take hours (or days) to figure out what is going on and fix it.
So how much money did that slog device cost you? And how much *might* it cost if your slog device fails at a bad time (and there is no "good time" for a slog device to fail)?
Alternatively, consider an environment where you are literally relying on your slog device for good write performance to your zpool (VMs on ESXi over NFS). If you lose your slog device suddenly, the performance of the zpool will nosedive and you'll likely see many (most?) of your VMs get kicked offline due to timeouts accessing their virtual disk files. So what happens to all those VMs that were improperly shut down?
Suddenly the horror stories I've seen firsthand don't seem so far-fetched, and changing the quantity from 1 to 2 on your purchase order seems like a good deal, huh?
Hint: It is a good deal.