ZFS RAID size and reliability calculator (2017-05-17)


Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Thanks ;)

Note that the block overhead is still experimental; I'm currently sorting it out in the other topic (link in the first post).

For the number of RAID groups (I assume you mean something like 3 striped RAID-Z2 vdevs of 6 drives each, so 18 drives total?), it's not that useful because you can run the app for one RAID and then just multiply the space values by the number of RAIDs you have. Plus, it already supports this for the mirrored RAID types :) I might (maybe, perhaps) add it in the future, but I don't think it's worthwhile.

Very good idea to add the percentage values, thanks. I'll remake the UI as more of a table to make things clearer (I can't just put the % in parentheses, for example, because the TB value is already there, so it would be ugly and hard to read).

I thought about the fault tolerance, but the formula is complex IIRC (and I want to finish the overhead work first), so I've put it on the future features list :)

"what you have is simple and to the point" Yeah, pretty much exactly the goal of this app: simple, clear, light, do only a short list of things but do it well, ... ;)
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Well, I've made several changes; I can almost call this version a v2.0 :D

I remade the UI to use a table. It should be far clearer and less cluttered with parentheses and units everywhere ;)

I also added the percentages (most of them are constant; that's normal).

BTW, everything should follow the zoom level now (I used some fixed pixel sizes before). I think it may be useful on mobile devices, as I've seen that several users run the app on them.

I hope you'll like the changes, as they took me a relatively long time to make ;)
 

SwampRabbit

Explorer
Joined
Apr 25, 2014
Messages
61
Nice improvements, it's much easier on the eyes now, for sure.

I did a quick test of it against the new configuration of my home test lab:
fresh install of 9.3, 12 x 4TB RAID-Z2, lz4 compression, ratio 8.24%.

FreeNAS: Total RAID Space 43.5 TiB, Usable Data Space 28.1 TiB

Your app: Total RAID Space 43.66 TiB, Usable Data Space 26.31 TiB

I assume the difference is the overhead calculations being taken into account.
Your app is nice because a user can set quotas to keep from going past the
recommended free space, now that they know what it is.
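
For reference, the 43.66 TiB "Total RAID Space" is just the raw drive capacity converted from decimal TB to binary TiB. Here is a minimal sketch of that conversion and of the parity-only upper bound (the variable names are made up; both "usable" figures above also subtract overheads this sketch doesn't model):

```python
drives, drive_tb, parity = 12, 4, 2   # 12 x 4 TB RAID-Z2

# Decimal TB from the drive label -> binary TiB as reported by ZFS tools.
raw_tib = drives * drive_tb * 1e12 / 2**40               # ~43.66 TiB

# Upper bound after parity only; real usable space is lower because of
# metadata, reservations, and allocation overhead.
parity_only_tib = raw_tib * (drives - parity) / drives   # ~36.4 TiB

print(f"raw: {raw_tib:.2f} TiB, parity-only bound: {parity_only_tib:.2f} TiB")
```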
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Probably the overheads, yeah. Where do the FreeNAS numbers come from?

I think I'll disable the block overhead temporarily, because I don't have the exact formula yet and it may do more harm than good.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Well, I've made several changes again; I can almost call this version a v3.0 :p

I added the reliability stats. And by doing this I realized that the STH calculator is wrong for the RAID-Z3 stats: the UBER value is used in the formula to calculate them... yeah, on a file system that corrects any drive corruption... :rolleyes::D

I also disabled the blocks overhead temporarily and changed the height of the inputs to match the select's height ;)
 

homerjr43

Dabbler
Joined
Mar 24, 2015
Messages
16
Any advice would be appreciated. I have a RAID-Z2 with 12 x 2TB drives. Based on the calculator, I should have 17.9 TiB of usable space, but FreeNAS and Windows only show 16 TiB. I can't figure out where the extra 1.9 TiB went. I do not have any snapshots stored.

Thanks
Could this simply have to do with ashift 9 vs. 12?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
The app is still under development and the overhead calculations are incomplete, so that's probably where the difference comes from.

But it's a pretty big difference. I wonder if you have some jails, because they use some space too.
 

homerjr43

Dabbler
Joined
Mar 24, 2015
Messages
16
I have zero jails/plugins... Reading around, it appears that 12 disks is not optimal, but losing an extra 1.9 TiB above and beyond parity and metadata seems like too much.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Do you use compression?
 

homerjr43

Dabbler
Joined
Mar 24, 2015
Messages
16
Looks like this may be my problem: http://lists.freebsd.org/pipermail/freebsd-fs/2013-May/017337.html

Free space calculation is done with the assumption of 128k block size.
Each block is completely independent so sector aligned and no parity
shared between blocks. This creates overhead unless the number of disks
minus raidz level is a power of two. Above that is allocation overhead
where each block (together with parity) is padded to occupy the multiple
of raidz level plus 1 (sectors). Zero overhead from both happens at
raidz1 with 2, 3, 5, 9 and 17 disks and raidz2 with 3, 6 or 18 disks.

On 2013-05-29 15:18, Hans Henrik Happe wrote:
> Hi,
>
> I've a system with 3TB WD NAS disks for ZFS. I created a 4k aligned 10-disk RAIDZ2. I noticed the overhead was ~1.4TB for the file system (3*8*10^12/2^40 - <free space>). Then I tried with different number of disks:
>
> 6: 0.2602885812520981
> 7: 1.1858475767076015
> 8: 1.149622268974781
> 9: 0.7288754601031542
> 10: 1.3953630803152919
> 11: 2.061850775964558
> 12: 2.915792594663799
> 13: 1.5491229379549623
> 14: 2.056995471008122
> 15: 2.5648680040612817
> 16: 3.0727405650541186
> 17: 3.5806130981072783
> 18: 0.7912140190601349
>
> the other good configs (6 and 18) is okay, but it seem strange that 10 has higher space overhead than 18.
High overhead with 10 disks is because of allocation overhead.
128k / 4k = 32 sectors,
32 sectors / 8 data disks = 4 sectors per disk,
4 sectors per disk * (8 data disks + 2 parity disks) = 40 sectors.
40 is not a multiple of 3 so 2 sector padding is added. (5% overhead)

> I then tried with RAIDZ:
>
> 5: 0.19996666628867388
> 9: 0.39560710452497005
> 17: 0.8849408514797688
>
> This seems correct. Then RAIDZ2 again but with ashift=9:
>
> 6: 0.2602881230413914
> 10: 0.4523236369714141
> 18: 0.7912133224308491
>
> This also seems correct. The 6 and 18 results are basically the same for ashift 9 and 12.
>
> Is there an explanation to this?
>
> I'm running FreeBSD 9.1-RELEASE-p3.
>
> Cheers,
> Hans Henrik Happe

Assuming the overhead is almost 3 TiB for 3TB disks, it would make sense that my overhead is almost 2 TiB for 2TB drives. You may want to use these numbers if you decide to update your calculator.
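
For what it's worth, the padding rule described in that post fits in a few lines of code. This is only a sketch of the simplified model from the mailing list (128 KiB records, ashift=12, incompressible data), not the app's formula or ZFS's real allocator:

```python
import math

def raidz_extra_overhead(ndisks, parity, ashift=12, record_bytes=128 * 1024):
    """Allocation overhead beyond the pure parity cost, per the model quoted above."""
    data = record_bytes // (1 << ashift)            # 128 KiB / 4 KiB = 32 sectors
    rows = math.ceil(data / (ndisks - parity))      # stripe rows needed
    alloc = data + rows * parity                    # data + parity sectors
    alloc = math.ceil(alloc / (parity + 1)) * (parity + 1)   # pad to multiple of p+1
    ideal = data * ndisks / (ndisks - parity)       # what parity alone would cost
    return alloc / ideal - 1

print(raidz_extra_overhead(10, 2))  # ~0.05  -> the 5% worked out in the quote
print(raidz_extra_overhead(12, 2))  # ~0.094 -> in the ballpark of the ~1.9 TiB loss above
```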
 

homerjr43

Dabbler
Joined
Mar 24, 2015
Messages
16
Based on these numbers, adding the 12th disk only netted me 1.83 TiB, but adding a 13th would add 4.05 TiB. So now the question is: do I have room in my 12-bay Supermicro case to add one more 3.5" hard drive???
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I've already read this page (in fact, I've read pretty much everything related to overhead, space, blocks, ... in ZFS before developing the app), but it's valid only if you don't use compression.

But now I realize that if compression is enabled but not actually used because the data is already compressed, then it's very much as if compression were disabled, so the power-of-two rule still applies...

I'll sort all of this out, but not now, because I leave on vacation in less than 12 hours :P
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I've already read this page (in fact, I've read pretty much everything related to overhead, space, blocks, ... in ZFS before developing the app), but it's valid only if you don't use compression.

But now I realize that if compression is enabled but not actually used because the data is already compressed, then it's very much as if compression were disabled, so the power-of-two rule still applies...

I'll sort all of this out, but not now, because I leave on vacation in less than 12 hours :p

That's not entirely correct. If any of your files compress at all, even if it's just the beginning/end of the files (which is very common), then this can quickly go out the window because the alignment no longer holds.

Really, when it comes down to the overhead, things like *if* your data compresses, your block size, etc. play major roles. It's not easy to just say "I've got 15TB raw, 10TB formatted, and I'm expecting 1TB of overhead". On the same system with the same data I can make that system store 8TB+ of data, or I can crush the zpool with just 4TB of data. It all depends on block sizes, *if* any data compresses, etc.

Even then, some of the OpenZFS guys have expressed the opinion that the whole "power of 2" rule everyone lived by may not have been all it was cracked up to be, or as useful as people think. The one thing everyone does seem to agree on is that overly wide zpools can create problems and should be avoided.

Overhead is not something that can be easily calculated. I've seen people with 8TB of formatted capacity who couldn't even store 4TB of data. For this reason I mostly ignore the overhead questions on the forum, because it's impossible for someone to throw out a number and be close to the target. Your guess has no better chance of being accurate than mine. ;)
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yep, I realized that a few weeks ago, and then I wanted to know if it's possible to calculate some average values that work for 90% of servers. I still need to talk with the ZFS devs on IRC to see whether it's possible.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yep, I realized that a few weeks ago, and then I wanted to know if it's possible to calculate some average values that work for 90% of servers. I still need to talk with the ZFS devs on IRC to see whether it's possible.

It's not. I asked them about this a year or so ago. Unless you have specific data, it's all a guess. :p
 

SirMaster

Patron
Joined
Mar 19, 2014
Messages
241
Yes, if the user changes the block size then everything changes.

I think it's pretty futile to try to make a RAIDZ capacity calculator because of how complicated it ultimately is.

Here is an example of the weirdness of ZFS:

If I create for example a 12-disk RAIDZ2 out of 3TB disks, with ashift=12, ZFS will tell me that my new empty pool capacity is 24.37TiB.

If instead I create the pool with ashift=9, then ZFS will tell me that my new empty pool capacity is 26.76TiB.

This is a difference of 2.39TiB, or a difference of about 9% less "capacity" using ashift=12.

Now is when the weirdness starts. Let's just look at the behavior for ashift=12, as that's what the vast majority of people will be using these days.

If I create a default dataset with 128KiB recordsize and write 1GiB of incompressible data to it, ZFS will report that I have used 1GiB of space. Everything so far makes sense here. If I enable compression and then write the 1GiB of incompressible data then same thing, ZFS says I have used 1GiB.

Now, if instead I create a dataset with a 1MiB recordsize and write a 1GiB file to the dataset with compression off, ZFS will again say that I have used 1GiB of data.

Now for the weird part. If I turn on compression and write 1GiB of incompressible data to a 1MiB recordsize dataset, ZFS will report that I am only using 0.91GiB of space!

This doesn't make sense, but it's just how the ZFS data accounting code handles the relationship between ashift, block padding, and different block sizes.

What is essentially happening is that I "lost" 9% of my capacity by using ashift=12. However, using 1MiB blocks with compression enabled makes "all" my files stored on that 1MiB dataset "appear" 9% smaller. Large blocks essentially let you reclaim all the padding space you appeared to lose going from ashift=9 to ashift=12. The padding space gets compressed away when compression is enabled, which is how this works. If you look at the compressratio property of that 1MiB dataset with the 1GiB of incompressible data, it will actually report a 1.09x compression ratio!

Personally I think this is a bug (the compressratio number), but all the platforms that run the OpenZFS codebase that I have tested exhibit this behavior including Illumos, BSD, and Linux.

Matt Ahrens' conclusion from his stripe-size blog post is good advice: use RAID-Z, not too wide, and enable compression (even for incompressible data), as it compresses away a lot of the padding space. Also, if you want to maximize your storage capacity, use 1MiB recordsize datasets, which FreeNAS 9.3 supports in the GUI.
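
To put rough numbers on the padding effect described above, here is a sketch using the same simplified stripe-and-padding model discussed earlier in the thread (assuming incompressible data; this is not ZFS's actual space-accounting code):

```python
import math

def alloc_sectors(record_bytes, ndisks, parity, ashift):
    """Sectors a RAID-Z vdev allocates for one incompressible record (rough model)."""
    data = math.ceil(record_bytes / (1 << ashift))
    rows = math.ceil(data / (ndisks - parity))
    total = data + rows * parity                              # data + parity
    return math.ceil(total / (parity + 1)) * (parity + 1), data   # pad to p+1

for label, ashift, rec in [("ashift=12, 128 KiB", 12, 128 << 10),
                           ("ashift=12,   1 MiB", 12, 1 << 20),
                           ("ashift=9,  128 KiB", 9, 128 << 10)]:
    alloc, data = alloc_sectors(rec, ndisks=12, parity=2, ashift=ashift)
    ideal = data * 12 / 10                    # pure parity cost, no padding waste
    print(f"{label}: ~{(alloc / ideal - 1) * 100:.1f}% padding overhead")
```

Under this model, the 12-disk RAID-Z2 loses roughly 9% to padding with 128 KiB records at ashift=12, and almost nothing with 1 MiB records or at ashift=9, which lines up with the ~9% capacity gap described above.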
 

LongHair4277

Cadet
Joined
May 2, 2015
Messages
5
Excellent app... I was wondering how you go about using sub-1TB drives, e.g. 750GB? I tried entering "0.75" and ".75" and got nothing. It's like the xx TB field is set to be an integer rather than a real number.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Thanks ;)

Yep, integer only. But an easy hack is to enter 750 and just read TB as GB :)
 