Introduction to ZFS

Introduction to ZFS (Rev 1d)

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Ericloewe submitted a new resource:

Introduction to ZFS - A short introduction to ZFS, oriented towards FreeNAS users

This is a short introduction to ZFS. It is really only intended to convey the bare minimum knowledge needed to start diving into ZFS and is in no way meant to cut Michael W. Lucas' and Allan Jude's book income.

It is a bit of a spiritual successor to Cyberjock's presentation, but streamlined and focused on ZFS, leaving other topics to other documents.

To download the...

Read more about this resource...
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Depending on feedback, I might add some illustrations, particularly around vdevs.

For now, expect typos, weird sentences and odd phrasings, but the content is hopefully okay.

I'm very open to suggestions about what topics would merit also being discussed (or more discussion, if already discussed), as I'm targeting a maximum of 15 pages, which leaves three whole pages plus three or so pages' worth of white space.
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi Eric,

I noticed that you did not mention dedup in your text. Because a lot of storage providers present dedup as a wonderful feature, and people look for every way they can to save storage, many turn on dedup when it hurts more than it helps.

I would suggest you add a quick warning about it, maybe using something like this post by honeybadger.

Hope this helps you improve your already nice document,
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Yeah, that goes to show how little I think about dedup. Excellent point.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
I thought I felt my ears burning.

Yes, the current implementation of dedup in ZFS is an ugly mess. Matt Ahrens wrote a great little presentation about how to make it not suck by switching to a log-based method rather than hash tables, and having it automatically disable itself when under memory pressure - http://open-zfs.org/w/images/8/8d/ZFS_dedup.pdf - but at the moment it's a very effective method of shooting yourself in the foot.

Most potential dedup situations are better handled via snapshots, clones, and compression; or just applying the dedup to the data before it gets to ZFS.
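To put rough numbers on that memory cost, here's a back-of-the-envelope sketch; the ~320 bytes per DDT entry and the 128 KiB average block size are ballpark assumptions, not exact figures:

```shell
# Rough estimate of dedup table (DDT) RAM needed for 1 TiB of unique data.
# Assumptions: ~320 bytes of core memory per DDT entry, 128 KiB average block.
pool_bytes=$((1024 * 1024 * 1024 * 1024))  # 1 TiB of unique data
block_bytes=$((128 * 1024))                # 128 KiB average block size
entry_bytes=320                            # rough RAM per DDT entry
blocks=$((pool_bytes / block_bytes))
ddt_bytes=$((blocks * entry_bytes))
echo "~$((ddt_bytes / 1024 / 1024)) MiB of RAM just for the DDT"
```

With a smaller average block size the number balloons quickly, which is where the usual "several GB of RAM per TB of deduped data" rule of thumb comes from.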
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Yeah, nobody's picked up the idea yet, which is unfortunate.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
It might be worth mentioning:
1. (In the context of there being no fsck for ZFS) one needs a backup
2. (In the context of the purpose of datasets) block size can differ between datasets
3. (In the context of snapshots) a dataset deletion can be undone only under some operating systems, and not under FreeNAS
4. Replicating a dataset "deletion" under FreeNAS causes the receiving FreeNAS to delete the dataset as well, which cannot be undone - the data is lost from both the backup and the main copy
5. (Feature flags) I'd add that a pool upgrade enables new flags, and warn that the upgraded pool may no longer be usable with older versions of operating systems
6. (Deduplication) make the rule of thumb more precise: does it refer to raw storage or usable storage?
7. (Deduplication) add this:
potential dedup situations are better handled via snapshots, clones, and compression; or just applying the dedup to the data before it gets to ZFS
Because I kept considering deduplication until I learned the ways to avoid it. Bare warnings were no use ;)

Sent from my phone
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
Nice! A link to an explanation of how to set up and use snapshots would be great. What does “recursive” mean? How do you get a file from an old snapshot without needing to clone the dataset first? As in, the SMB link to snapshots, which is great.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,458
What does “recursive” mean?
If there's confusion about this, I guess it should be explained, but "recursive" would seem to have a pretty obvious meaning: if checked, the snapshot of dataset X will also include all of X's child datasets. If not, it won't.
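For anyone who still finds it fuzzy, here's a toy sketch of which datasets a recursive snapshot covers; the dataset names are made up, and this only mimics the selection logic, not ZFS itself:

```shell
# Toy model of `zfs snapshot -r tank/data@nightly`: every dataset at or
# below tank/data gets the same @nightly snapshot; siblings are untouched.
datasets="tank tank/data tank/data/photos tank/data/docs tank/media"
for ds in $datasets; do
  case "$ds" in
    tank/data|tank/data/*) echo "${ds}@nightly" ;;  # covered by -r
  esac
done
```

Only tank/data and its two children get the snapshot; tank itself and tank/media are left alone.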
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,458
On page 9, immediately under the heading for ZFS replication, the zfs send command isn't consistently in the "command" font ("zfs" is, "send" isn't). It's otherwise looking good so far.
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
That wasn’t obvious to me at all. I was wondering how it might apply to multiple snapshots over time and didn’t get anywhere with that.
Please do include that explanation.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
Yeah, nobody's picked up the idea yet, which is unfortunate.
I don't think there's as much demand/payoff for it versus things like vdev removal or persistent L2ARC, so it doesn't have the focus.

Back on topic though, I've probably written enough about dedup here on the forums that I could boil it down to a quick summary with appropriate warnings. Really though the shortest answer is "you probably don't need it, even if you think you do."
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Nice! A link to an explanation of how to set up and use snapshots would be great. What does “recursive” mean? How do you get a file from an old snapshot without needing to clone the dataset first? As in, the SMB link to snapshots, which is great.
I want to avoid instructions, since they can change a lot and that's what the manual is for.

That just made me realize that the manual should probably be mentioned with a link.
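For reference, the mechanism behind the "get a file from an old snapshot without cloning" question is the hidden .zfs/snapshot directory that every mounted dataset exposes. A toy sketch with made-up paths - on a real system the directory sits at the dataset's mountpoint, e.g. /mnt/tank/data/.zfs/snapshot/<snapname>/:

```shell
# Fake directory layout standing in for a dataset's hidden snapshot dir.
mkdir -p /tmp/zfsdemo/.zfs/snapshot/nightly
echo "old version" > /tmp/zfsdemo/.zfs/snapshot/nightly/notes.txt
# Restoring one file from a snapshot is a plain copy - no clone needed:
cp /tmp/zfsdemo/.zfs/snapshot/nightly/notes.txt /tmp/zfsdemo/notes.txt
cat /tmp/zfsdemo/notes.txt   # prints "old version"
```

The snapshot contents are read-only, so this is safe to do at any time; SMB's "Previous Versions" integration is built on the same directory.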
 
Joined
Jan 4, 2014
Messages
1,644
Ericloewe submitted a new resource:

Introduction to ZFS - A short introduction to ZFS, oriented towards FreeNAS users

OMG! OMG! OMG! Thank you! Thank you! Thank you! As I read through the various sections, I'll feed back my thoughts.

Structure of ZFS
'Generally, 10 disks is a good rule of thumb for the maximum width of a vdev' followed by 'Not horrible: ... RAIDZ2, 14-wide'
'RAIDZ1 is often considered too unreliable' followed by 'Not horrible: ... RAIDZ1, five-wide' (P 5-6)

I found this confusing and, for me, created an internal dilemma that I wasn't able to satisfactorily resolve. I can see this being contentious with one person saying 'Maximum width 10-disks' another saying, 'but the good book says 14-disks is not horrible'. In one sense you've qualified this by indicating that performance drops for really wide vdevs, but we've all come across individuals who push those boundaries. Is a 30-wide vdev still not horrible? It makes it difficult to assist someone who works outside of recommended boundaries. Someone who has a 14-wide vdev and then complains about performance issues, in my mind, loses some credibility because they are working outside of the recommended 10-disk maximum. I think some qualification is required here. 'not horrible' is a bit vague.

Another gray area for me is, for example, 'RAIDZ2 10-wide'... Is that inclusive of the 2 disks that are not available for data? Or can I use 12 disks and call it 10-wide, because I lose 2 disks anyway? I think some qualification here would be useful.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
In a way, that's just it - it's a grey area. It might work for you, it might not. @Allan Jude, for instance, runs a bunch of 12-wide RAIDZ2 vdevs because that's what fits neatly in his servers - but he tends to store video, so IOPS don't really matter that much. If someone else was doing something with a bunch of smaller files, they might feel that 12-wide gets too slow.

Another gray area for me is, for example, 'RAIDZ2 10-wide'... Is that inclusive of the 2 disks that are not available for data? Or can I use 12 disks and call it 10-wide, because I lose 2 disks anyway? I think some qualification here would be useful.
I can clear that up - it's total width.
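Since "N-wide" means total width, the raw data capacity falls out directly; a quick sketch with a made-up disk size (real usable space will be lower once padding, metadata and free-space headroom are accounted for):

```shell
# "10-wide RAIDZ2" = 10 disks total, 2 of which hold parity.
width=10       # total disks in the vdev, parity included
parity=2       # RAIDZ2
disk_tb=4      # hypothetical 4 TB disks
usable_tb=$(( (width - parity) * disk_tb ))
echo "${usable_tb} TB of raw data capacity before overhead"   # 32 TB
```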
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
If someone else was doing something with a bunch of smaller files, they might feel that 12-wide gets too slow.
And if they're running virtual machines on RAIDZ-anything, a rabid Badger eats all of their IOPS.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,458
Please do include that explanation.
Seems it's in the manual. IMO (which doesn't mean much, as this isn't my document), describing the FreeNAS buttons and switches is kind of outside the scope.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
IMO (which doesn't mean much, as this isn't my document), describing the FreeNAS buttons and switches is kind of outside the scope.
That's my opinion, too - there's no point in writing a parallel manual for FreeNAS. This is probably meant to be read before anyone gets as far as downloading FreeNAS, much less opening the manual. When the new user finally has FreeNAS set up, they can then go "okay, so how do I configure snapshots? They're awesome and I want them", look at the manual, and see that section so-and-so tells them what they need to know. If it doesn't, that's something that needs to be fixed in the manual. This also has the advantage of keeping things fairly agnostic - even if it's focused on FreeNAS, it's just as valid for vanilla FreeBSD, Linux, macOS, Windows or whatever.

I'm also not looking to replace the ZFS books, since they do a better job than I can reasonably do. The key part of the title is "introduction", leaving the details to other documents.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
It might be worth mentioning:
1. (In the context of there being no fsck for ZFS) one needs a backup
2. (In the context of the purpose of datasets) block size can differ between datasets
3. (In the context of snapshots) a dataset deletion can be undone only under some operating systems, and not under FreeNAS
4. Replicating a dataset "deletion" under FreeNAS causes the receiving FreeNAS to delete the dataset as well, which cannot be undone - the data is lost from both the backup and the main copy
5. (Feature flags) I'd add that a pool upgrade enables new flags, and warn that the upgraded pool may no longer be usable with older versions of operating systems
6. (Deduplication) make the rule of thumb more precise: does it refer to raw storage or usable storage?
7. (Deduplication) add this:
situations are better handled via snapshots, clones, and compression; or just applying the dedup to the data before it gets to ZFS.
Because I kept considering deduplication until I learned the ways to avoid it. Bare warnings were no use ;)

Sent from my phone
Maybe also
8. Mentioning that there is no RAID 01 equivalent in the ZFS world.

Sent from my phone
 