Maximizing SLOG SSD write endurance

Status
Not open for further replies.

DaveY

Contributor
Joined
Dec 1, 2014
Messages
141
I've been wading through tons of threads trying to piece together info on configuring an SSD for SLOG so I can maximize write endurance, but haven't been able to find a clear, definitive guide on the topic. In fact, there even seems to be conflicting info on how best to do this. Please note I'm NOT asking which brand/model/NAND technology has the highest write cycles. I'm asking how to spread the writes so they wear the SSD evenly, which increases the total writes I can perform on the drive regardless of its max write cycles. Can someone please help me with these questions?

1. Most posts claim "modern SSDs" have some sort of intelligent wear leveling across the drive. If so, why do we still need to underprovision/overprovision, especially if the SLOG will most likely not exceed 6.25GB even on a fast 10GbE network? If I have a 120GB SSD, that's plenty of space for the controller to take care of wear leveling across unused space, no? Why wouldn't I just leave it as one big partition?

2. Underprovisioning vs Overprovisioning? Is this just a semantics thing or is there a difference? If so, which method is better for write cycles?

3. Some posts claim SSD controllers will only distribute writes across free space WITHIN a partition, while some say they will distribute writes across ALL unused space, including space outside the partition. Which one is it, or does it depend on the SSD vendor? If it's the former, then wouldn't underprovisioning actually shorten the life of the drive, since an 8GB underprovisioned partition would constantly be written over?
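
For context on the 6.25GB figure in question 1: it appears to come from multiplying the link's line rate by the length of one ZFS transaction group interval. Here is a minimal sketch of that arithmetic, assuming a 5-second interval (the usual ZFS default) and ignoring protocol overhead. The function name is made up for illustration.

```python
def slog_upper_bound_gb(link_gbit_per_s: float, txg_seconds: float = 5.0) -> float:
    """Rough ceiling on data arriving over the link during one transaction group.

    Assumes the link runs at full line rate for the whole interval,
    which real sync-write workloads rarely manage.
    """
    gbytes_per_second = link_gbit_per_s / 8.0  # 8 bits per byte
    return gbytes_per_second * txg_seconds


print(slog_upper_bound_gb(10))  # 10GbE with a 5 s interval -> 6.25
```

A 1GbE link with the same assumptions comes out at well under 1GB, so either way the SLOG uses only a small slice of a 120GB SSD.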

I apologize if all this info is already out there, but the forum has grown to a size that makes reading through every old thread a full-day task. If the topic is already covered, can someone please provide a link?

Thanks
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Yes, there's quite a bit of discussion about ZIL SLOG devices here... some of it's even accurate! :D

My understanding -- quite possibly flawed! -- is that, yes, modern SSD controllers are coded to level wear across the drive. They're smart enough to distribute writes across all the blocks on the drive with a view towards equalizing the wear on them.

Also, the drive manufacturers set aside a portion of blocks beyond those delivered to the purchaser as usable space. These 'extra' blocks are for the controller to substitute for any bad blocks that appear during the drive's service life. So, for example, a 120GB drive might have an additional 1.2GB of space for bad block substitution. I don't know the exact size for any particular model of 120GB drive; this is just for purposes of illustration. So as many as 1.2GB of bad blocks can appear over time in the 120GB of usable space, and the controller can make up for them using the 'extra' blocks without having to admit to bad blocks to the user's operating system, or the SMART diagnostics, or whatever. The happy user continues to have 120GB of usable space on the drive.

The theory behind over-provisioning is that, if we make the partition visible to the operating system much smaller than the drive's putative size, we give the controller even more blocks to juggle and use for wear leveling and bad block substitution.
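
To put rough numbers on that theory, here is a small sketch that computes the ratio of spare flash to visible capacity. It reuses the purely illustrative 1.2GB factory reserve from the example above, which is not a real figure for any drive, and the function name is made up for this example.

```python
def spare_ratio(raw_flash_gb: float, visible_gb: float) -> float:
    """Spare flash available to the controller, as a fraction of visible capacity."""
    if not 0 < visible_gb <= raw_flash_gb:
        raise ValueError("visible capacity must be positive and no larger than the raw flash")
    return (raw_flash_gb - visible_gb) / visible_gb


# Illustrative only: 120GB usable plus the hypothetical 1.2GB factory reserve.
RAW_GB = 120 + 1.2

print(f"all 120GB visible: {spare_ratio(RAW_GB, 120):.1%} spare")
print(f"only 8GB visible:  {spare_ratio(RAW_GB, 8):.1%} spare")
```

With these made-up numbers the spare area goes from about 1% of the visible capacity to roughly 14 times the visible capacity. Presumably the hidden area only counts as spare if it holds no live data, for example after the drive has been erased or trimmed.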

...or something like that. I could be completely mistaken, and if I am, then hopefully a drive engineer will pop in and enlighten us.

My hunch is that the most important decision to make when selecting a ZIL SLOG is choosing the right SSD, one that meets the criteria of high endurance, low latency, power-loss protection, etc. Overprovisioning may not make any real difference. For example, what if the drive's endurance is so high that it can't possibly wear out before the service life of the server in which it's installed is over? In such a case, we have nothing to gain (or lose!) by over-provisioning the SSD.
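
That "can't possibly wear out" question can be checked with back-of-the-envelope arithmetic. The sketch below assumes the drive's endurance is quoted as terabytes written (TBW) and ignores write amplification; the 150 TBW rating and the 50GB/day of sync writes are hypothetical numbers, not taken from any datasheet.

```python
def years_to_rated_endurance(rated_tbw: float, slog_writes_gb_per_day: float) -> float:
    """Years until the rated write endurance is used up at a steady daily write volume."""
    if slog_writes_gb_per_day <= 0:
        raise ValueError("daily write volume must be positive")
    rated_gb = rated_tbw * 1000  # decimal units: 1 TB = 1000 GB
    return rated_gb / slog_writes_gb_per_day / 365


# Hypothetical drive and workload, for illustration only.
print(round(years_to_rated_endurance(rated_tbw=150, slog_writes_gb_per_day=50), 1))  # 8.2
```

If the result comfortably exceeds the planned life of the server, over-provisioning is unlikely to matter much either way.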

Regardless, I always over-provision my ZIL SLOG devices to a size of 8GB, following these instructions at Thomas-Krenn: "SSD Over-provisioning using hdparm"
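
The linked instructions are not reproduced here, but the core of the hdparm approach is to tell the drive to expose fewer sectors. Here is a minimal sketch of the sector arithmetic, assuming 512-byte logical sectors and reading "8GB" as 8 GiB. The flags follow the hdparm man page as best understood here, /dev/sdX is a placeholder, and the script only prints the command rather than running it.

```python
def visible_sectors(size_gib: int, sector_bytes: int = 512) -> int:
    """Number of logical sectors needed to expose size_gib of capacity."""
    return size_gib * 1024**3 // sector_bytes


def hdparm_command(device: str, size_gib: int) -> str:
    """Build (but do not run) an hdparm call that limits the visible sector count.

    Assumption: -N sets the maximum visible sector count and the 'p' prefix
    makes the change persistent. Check the man page before running anything.
    """
    return f"hdparm -Np{visible_sectors(size_gib)} --yes-i-know-what-i-am-doing {device}"


print(hdparm_command("/dev/sdX", 8))
# hdparm -Np16777216 --yes-i-know-what-i-am-doing /dev/sdX
```

hdparm is a Linux tool, so on a FreeBSD-based system this step would typically be done from a Linux live environment before the drive is installed.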
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
My understanding of the topic matches @Spearfoot's.
I'd like to add a note, based on what I've picked up (which may need to be straightened out): the SSD needs a clean ("flush") partitioning to take advantage of wear leveling. If old partitions linger, they might interfere even if the system is not currently using them. Therefore, a thorough format and repartitioning session is key.
Hence, the link to Thomas-Krenn.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
"3. Some posts claim SSD controllers will only distribute writes across free space WITHIN a partition"
That'd be really silly. Why would anyone do that? It's extra effort for the software developer, extra effort for the software and extra effort for the hardware and you only lose.

SSDs have a table that maps their internal sectors to LBAs and that's basically it. If an LBA gets TRIMmed, the SSD takes note of the fact and doesn't bother preserving the data in question, somehow marking it as free.
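
The mapping-table idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions (one LBA per physical block, every write costs one program/erase cycle, the controller always picks the least-worn free block); real controllers are far more involved, and the class and method names are invented for this example.

```python
class ToyFTL:
    """Toy flash translation layer: a table mapping LBAs to physical blocks."""

    def __init__(self, physical_blocks: int) -> None:
        self.wear = [0] * physical_blocks        # program/erase count per block
        self.free = set(range(physical_blocks))  # blocks holding no live data
        self.lba_to_phys: dict[int, int] = {}    # the mapping table

    def write(self, lba: int) -> None:
        # Place the new data on the least-worn free block, wherever it is.
        target = min(self.free, key=lambda b: (self.wear[b], b))
        self.free.remove(target)
        self.wear[target] += 1
        old = self.lba_to_phys.get(lba)
        self.lba_to_phys[lba] = target
        if old is not None:
            self.free.add(old)  # the stale copy becomes reusable

    def trim(self, lba: int) -> None:
        # TRIM: the data no longer matters, so the block is free again.
        old = self.lba_to_phys.pop(lba, None)
        if old is not None:
            self.free.add(old)


ftl = ToyFTL(physical_blocks=100)
for i in range(10_000):
    ftl.write(i % 8)  # the OS only ever touches LBAs 0..7

print(min(ftl.wear), max(ftl.wear))
```

Even though only eight LBAs are ever written, the minimum and maximum wear counts end up essentially equal, because the mapping table lets each rewrite land on a different physical block. Nothing in the model knows or cares where a partition begins or ends.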

"Most posts claim "modern SSDs" have some sort of intelligent write leveling across the drive."
"Modern" being "every single SSD worthy of the name".

"If so, why do we still need to underprovision/overprovision"
You probably don't. You certainly do not if:
  • The OS doesn't write all over the partition, only accessing the same handful of LBAs AND
  • The drive doesn't worry about partitions (why the hell should it?)
 