SSD / Optane overprovisioning

StorageCurious

Explorer
Joined
Sep 28, 2022
Messages
60
I've searched far and wide on this topic, and likely just haven't used the right search terms; I haven't found quite what I'm looking for.

I know TrueNAS supports overprovisioning at pool creation (and, as I understand it, also when adding a SLOG to an existing pool).

My (surely poor) understanding of overprovisioning is that an SSD needs free space to record data in a way that spreads writes over all the cells, giving the drive a longer lifespan.

I thought (again, probably wrongly) that overprovisioning, say, 50GB meant that 50GB would be kept aside at all times for this purpose. So on my personal PC, I understand this to mean I lock myself out of 50GB to make my Samsung SSD last longer.

When it comes to SLOG drives (let's use 100GB as an example, because that is what I have), I know the actual data written to the drive before it's flushed will be minimal compared to the overall drive size. Why do I need to set aside space in that case, since I know the drive will never fill up to any significant degree?

What is wrong in my understanding of overprovisioning?

There is this thread: https://www.truenas.com/community/threads/maximizing-slog-ssd-write-endurance.54830/ but it only leads me to ask: if overprovisioning doesn't matter in this case, why is it a TrueNAS setting?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
overprovisioning is needing 50GB but getting a 500GB drive. that's it. you overprovisioned by 450GB.
generally, if you don't already know you need a SLOG drive and have the data to prove it, you probably shouldn't be trying to add a SLOG drive.
for one thing, default writes are async, and no async write will go to the SLOG. at all.
the idea of overprovisioning is that larger drives have more total endurance, so you can write more to the drive before its life ends.
a SLOG drive is 99.99% writes. it will eat a cheap drive's endurance in months, if not weeks. you need high-endurance drives with battery or flash-backed cache. if your SLOG loses power or dies, your data can go with it, corrupting the pool.

if you are just interested in learning, then there is no problem.
people try to use consumer SSDs for a SLOG they don't need and find out the hard way, rather than researching it first.
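to put rough numbers on the endurance point, here's a back-of-the-envelope sketch in Python - the TBW ratings and write rate below are made-up illustrations, not specs for any real drive:

```python
# Rough SLOG endurance math. All figures here are hypothetical
# examples, not ratings for any particular drive.

def years_until_worn_out(tbw_rating_tb: float, writes_gb_per_day: float) -> float:
    """Years until the rated endurance (TBW) is consumed at a steady write rate."""
    return tbw_rating_tb * 1000 / writes_gb_per_day / 365

SYNC_WRITES_GB_PER_DAY = 500   # assumed steady sync-write load hitting the SLOG

cheap_consumer_tbw = 150       # assumed: a typical cheap consumer SSD rating
write_intensive_tbw = 3_600    # assumed: a high-endurance datacenter drive

print(f"cheap consumer drive:  {years_until_worn_out(cheap_consumer_tbw, SYNC_WRITES_GB_PER_DAY):.1f} years")
print(f"write-intensive drive: {years_until_worn_out(write_intensive_tbw, SYNC_WRITES_GB_PER_DAY):.1f} years")
```

with those assumed numbers the cheap drive is dead in under a year while the write-intensive one lasts decades - which is the "months, if not weeks" problem in practice.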
 

StorageCurious

Explorer
Joined
Sep 28, 2022
Messages
60
Thanks @artlessknave, that seems to match my understanding of overprovisioning. So what does the overprovisioning option add?
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
IIRC, over-provisioning outside the drive basically does what a well-implemented wear-leveling system should be doing in hardware on the drive itself - i.e. ensure that the whole SSD is being written to so that all areas of the drive are worn down equally. In turn, that will maximize the life of the drive.

My rig has a P4801X Optane SLOG and I never even attempted to over-provision that drive. I just put it in, declared it a SLOG, and that was that. But that was also because Intel intended Optane for SLOG-like applications, so there was a reasonable chance it would work out of the box.
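If it helps intuition, here's a toy Python sketch of why that head-room matters - an idealized leveler spreading writes evenly over every block it can use. This is a cartoon of the firmware logic, not a real FTL:

```python
# Toy model of ideal wear-leveling: total writes get spread evenly over
# every erase block the controller can use, so spare (overprovisioned)
# blocks directly reduce per-block wear. Real FTLs are far messier, but
# the proportionality is the point.

def cycles_per_block(total_writes: int, visible_blocks: int, spare_blocks: int) -> float:
    return total_writes / (visible_blocks + spare_blocks)

WRITES = 1_000_000  # arbitrary lifetime block-write count

print("no spare area :", cycles_per_block(WRITES, 1000, 0))    # 1000.0 cycles/block
print("7% spare area :", cycles_per_block(WRITES, 1000, 70))   # ~935 cycles/block
print("28% spare area:", cycles_per_block(WRITES, 1000, 280))  # ~781 cycles/block
```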
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Some (most) manufacturers also overprovision out of the box (e.g. Kingston, with 7% or 28% depending on the product).
 

StorageCurious

Explorer
Joined
Sep 28, 2022
Messages
60
@Constantin I have the same - and the same assumptions. Thanks
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Overprovisioning happens at multiple levels.

First off, there can be more physical NAND on the drive than there is for the firmware to use as capacity. This could be as small as the difference between binary and base-10 - for example, 256GiB (binary, "gibibytes") of NAND on a device sold as 256GB (base-10, "gigabytes") uses only 238GiB of the physical NAND. The other ~7% can be used for spare cells and wear-leveling in the firmware - this is usually the "minimum overprovisioning" that's done. If that same drive is sold as a 240GB model, that's ~223GiB and a 14% overprovisioning level. Sell it as a 200GB (~186GiB) "write-intensive" model and it's up to 37%.

Some devices deliberately include far more physical NAND for added performance or consistency - I have some 200GB write-intensive HGST drives that contain 352GiB of physical NAND, for a massive 76% overprovisioning out of the box.
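Those percentages fall straight out of the GiB/GB mismatch - here's the arithmetic in Python if you want to play with other capacities:

```python
# Overprovisioning from the GiB (2**30) vs GB (10**9) mismatch:
# physical NAND is counted in GiB, the advertised capacity in GB.

GIB, GB = 2**30, 10**9

def overprovision_pct(nand_gib: float, sold_gb: float) -> float:
    usable_gib = sold_gb * GB / GIB
    return (nand_gib - usable_gib) / usable_gib * 100

for sold_gb in (256, 240, 200):
    print(f"{sold_gb}GB model on 256GiB of NAND: {overprovision_pct(256, sold_gb):.1f}% OP")
# 256GB -> ~7.4%, 240GB -> ~14.5%, 200GB -> ~37.4%
```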

At the next level, the firmware will use any "known empty" space for its own wear-leveling purposes. TRIM is needed in order to inform the drive firmware that "Block X can be freed up, and you can use it for your internal garbage collection and page/block housekeeping" - but if TRIM isn't present, or can potentially get overwhelmed by other operations, we can enforce this free space using the Host Protected Area (HPA) - this tells the firmware to further limit the size of the disk as presented to the host OS. A secure erase followed by HPA will "guarantee" that the space is available for writing.
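As a sketch of what setting an HPA looks like in practice on the Linux side, via hdparm - the device path and sector count below are placeholders, and be very sure of the target disk, since this hides capacity from the OS:

```python
import subprocess

# Sketch: cap the advertised size of a SATA SSD with a Host Protected
# Area via hdparm (Linux). DEVICE and the sector count are placeholders.
# Do a secure erase first so the hidden cells are genuinely free.

DEVICE = "/dev/sdX"               # placeholder - point at the real device
VISIBLE_SECTORS = 33_554_432      # expose 16GiB of 512-byte sectors

# 'hdparm -N' reads/sets the drive's max visible sector count; the 'p'
# prefix makes the new limit persist across power cycles.
subprocess.run(["hdparm", "-N", f"p{VISIBLE_SECTORS}", DEVICE], check=True)
```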

Finally, there's the "in guest" option of "keep more free space", which is what usually happens for most desktop/workstation workflows. Many modern SSDs, and some older ones, with multi-level cells (MLC, TLC, QLC, etc.) "cheat" by using empty NAND to write in a faster, SLC-like manner. [1] If TRIM is present and working exactly as designed, this is roughly equal to the "firmware-level" HPA sizing - but TRIM prior to SATA 3.1 was a non-queued command, meaning that TRIMs would cause a drive to have to "pause" and flush its pending actions. There are also some strange issues with certain drive manufacturers, where queued TRIM support is advertised but poorly supported. Even where TRIM is present, the HPA method gives you the "extra safety net", so to speak. There's some further fun regarding metaslab sizing that I could go into as well.
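You can check whether the OS actually sees TRIM/discard support on a given disk - a quick peek at the Linux sysfs attributes from Python (the device name is a placeholder):

```python
from pathlib import Path

# Quick check of TRIM/discard support as exposed by the Linux kernel.
# A discard_max_bytes of 0 means no TRIM is available on this device.

dev = "sdX"  # placeholder device name
queue = Path(f"/sys/block/{dev}/queue")
for attr in ("discard_max_bytes", "discard_granularity"):
    print(attr, "=", (queue / attr).read_text().strip())
```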

Optane devices don't support the HPA, but you can manually make smaller partitions. Intel's official line is that the wear-leveling logic combined with the bit-addressable nature of those devices means overprovisioning "isn't necessary" but the metaslab sizing is worth considering.
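For the Optane case, the "smaller partition" route is just ordinary partitioning - a hedged sketch using sgdisk on Linux, with the device path and size as placeholders (a SLOG only ever holds a few seconds of in-flight sync writes, so a small slice is plenty):

```python
import subprocess

# Sketch: leave most of an Optane device unpartitioned and give ZFS a
# small slice for the SLOG. DEVICE and the 16GiB size are placeholders.

DEVICE = "/dev/nvme0n1"  # placeholder - the Optane device

subprocess.run(["sgdisk", "--zap-all", DEVICE], check=True)        # wipe old tables
subprocess.run(["sgdisk", "--new=1:0:+16G", DEVICE], check=True)   # one 16GiB partition
# The rest of the device stays unallocated; attach the partition (not
# the whole disk) as the log vdev.
```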

[1] I should expand this with details on the actual write process into NAND cells at some point. Someone bug me about it later.
 