Slideshow explaining VDev, zpool, ZIL and L2ARC for noobs!

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
I guess that depends on how you define "bug." It's probably behaving as intended, but "as intended" is, well, not very good behavior. The pool configuration @kivalo is considering is not optimal or recommended, but that doesn't mean it's always a bad idea, and to prevent such configurations entirely with no possibility of bypassing the warnings doesn't seem like a very good idea--particularly when every previous version of FreeNAS has allowed for this (after appropriate warnings).

I'll quibble about "prevent such configurations entirely" - "zpool create poolname raidz2 dev1 ... devN" is still a thing. Anyone who runs that from the CLI clearly meant to do it, whatever the (sub-optimal) outcome.

Without knowing how many people, warnings notwithstanding, inadvertently created a vdev configuration they didn't mean to create, it's hard to judge whether the pro of this change - keeping GUI users from shooting themselves in the foot - outweighs the con: needing to drop to the CLI to create mixed-disk-size vdevs.
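For reference, a minimal sketch of that CLI route. Device names are made up, and zpool may insist on -f if the disk sizes differ enough; either way, each raidz member only contributes as much space as the smallest disk in the vdev:

# create a RAIDZ2 vdev from six disks of mixed sizes (placeholder device names)
zpool create poolname raidz2 da0 da1 da2 da3 da4 da5
# confirm the resulting layout
zpool status poolname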
 

naven

Cadet
Joined
May 4, 2020
Messages
1
Good afternoon!

I installed FreeNAS... I like everything so far, but I am not familiar with *nix systems or with the ZFS file system, and I would like to ask a few questions about the reliability of such storage. Say I have a pool consisting of one hard drive. Can I insert this disk into a new computer with FreeNAS installed from scratch and access the data? With all these concepts, snapshots, pools, etc., I can't tell whether the disk will just work the way a disk with NTFS or FAT would in a new computer.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,464
Can I insert this disk into a new computer with FreeNAS installed from scratch and access the data?
This seems an odd place to be asking the question, but yes--just import that pool into the new installation.
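From the command line the whole procedure is roughly the two lines below (the pool name is whatever you called it); on FreeNAS itself, the GUI's pool import does the same thing and keeps the middleware aware of the pool, so prefer that:

zpool import            # list pools found on the attached disks
zpool import poolname   # import one by name (or by its numeric ID)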
 

rreboto

Cadet
Joined
Dec 8, 2020
Messages
4
Everyone that has used non-ECC RAM and the RAM went bad suffered a loss of their entire pool. Unfortunately, because of how everything works, it'll trash your original copy of the data, which will trash the backups too.

I'd appreciate clarification on what you are emphasizing here:

Bad RAM will destroy your original copy and your backup!

Does this apply to snapshots as well? It has been my understanding thus far that snapshots are read-only, and if you have at least one snapshot from the time before RAM went bad, it would be possible to restore good data from a snapshot. So, if the backup server is keeping snapshots, there is still a chance that the data is in a safe, recoverable state.

Thanks!

PS: There's a lot of great information in the deck; thanks for putting it together!
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,464
Bad RAM will destroy your original copy and your backup!
Well, if bad data is written, and that bad data is then backed up, the backup is going to be bad as well--which is why you'd want to keep a history of backups.
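As a rough sketch of what "a history of backups" can look like with ZFS, assuming hypothetical datasets tank/data on the source and backup/data on a backup machine that already holds the earlier snapshots:

# take dated snapshots on the source
zfs snapshot tank/data@2020-12-01
# send only the changes since the previous snapshot to the backup box
zfs send -i tank/data@2020-11-01 tank/data@2020-12-01 | ssh backuphost zfs recv backup/data
# the backup box keeps the older snapshots around as the "history"
ssh backuphost zfs list -t snapshot -r backup/data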
 

Fox

Explorer
Joined
Mar 22, 2014
Messages
66
I've had bad RAM on a client (non-FreeNAS) machine corrupt files as they were copied to and from the NAS. In that case, FreeNAS snapshots would work to correct it if you catch the issue before the snapshots roll off the server. I was lucky I caught it early due to other system errors.

The next client computer I buy/build will only have ECC memory. Never again.
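For anyone wondering what that recovery looks like in practice, a minimal sketch with made-up dataset, snapshot, and file names:

# pull a known-good copy of a file out of an older snapshot
cp /mnt/tank/data/.zfs/snapshot/auto-20201101/report.xlsx /mnt/tank/data/
# or roll the whole dataset back to that snapshot
# (-r destroys any snapshots taken after it, so use with care)
zfs rollback -r tank/data@auto-20201101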
 

rreboto

Cadet
Joined
Dec 8, 2020
Messages
4
I've had bad RAM on a client (non-FreeNAS) machine corrupt files as they were copied to and from the NAS. In that case, FreeNAS snapshots would work to correct it if you catch the issue before the snapshots roll off the server. I was lucky I caught it early due to other system errors.

The next client computer I buy/build will only have ECC memory. Never again.

Thanks @Fox. Sounds like you are confirming that snapshots prior to RAM going bad will still be good. And, yes, understood: once the good snaps roll off, you're in a bad spot.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Thanks @Fox. Sounds like you are confirming that snapshots prior to RAM going bad will still be good. And, yes, understood: once the good snaps roll off, you're in a bad spot.

Well, it's also possible, if the memory corruption is an ongoing thing, for the snapshot to be good on disk but corrupted when read into memory to be handed off to you. There's been a lot of handwringing about the value of ECC over the years, and a lot of authoritative people saying this and that stupid thing. The simple fact of the matter is that if you shovel data around on a system with potentially bad RAM, you can potentially get garbage. This isn't a ZFS thing, it's basic computer science and the reason servers are generally designed with ECC.
 

Fox

Explorer
Joined
Mar 22, 2014
Messages
66
Thanks @Fox. Sounds like you are confirming that snapshots prior to RAM going bad will still be good. And, yes, understood: once the good snaps roll off, you're in a bad spot.

My problem was on the client only. I would copy stuff to the NAS and the files would not compare. With bad RAM on the NAS, I shudder to think what could happen. I can think of a few scenarios where things go sideways, like a scrub, where it thinks there is an error on the disk and then tries to correct it. I am not sure whether that repair would happen at a low level, like fixing a bad block on the disk, or at a higher level, like a copy-on-write. If it did happen at a low level, I could picture the NAS "fixing" (corrupting) what it thinks is a bad block, and that block could contain part of a snapshot.
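For reference, kicking off and watching a scrub looks like this, assuming a pool named tank:

zpool scrub tank      # start a scrub of the whole pool
zpool status -v tank  # shows scan progress, anything repaired, and per-device checksum errors

I'll leave what it would actually do with bad RAM to people who know the internals better.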

But I am not an expert on ZFS... Bad memory is very rare, but in my experience it does happen. I've had it with a DDR memory stick a few years ago, which the manufacturer replaced for free under its lifetime warranty. I also had it happen many, many years ago with motherboard-mounted CPU cache, back in the day when the cache was not inside the CPU the way L3 is now. In that case I was able to isolate the actual chip, which was luckily socketed rather than soldered, went to Fry's Electronics, bought a replacement, and pushed it into the slot to fix the problem. Both times I experienced a very flaky system: random blue screens, reboots, corrupted files, that sort of thing. It's not fun.
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
Both times I experienced a very flaky system: random blue screens, reboots, corrupted files, that sort of thing. It's not fun.

I had bad memory maybe twice, max thrice in the last 20 years - and it was a complete clusterfsck of a week+ of troubleshooting to narrow it down to memory, each time. Never. Again. Now that AsRock/Ryzen is a thing, I'm building ECC desktops only.
 

NASbox

Guru
Joined
May 8, 2012
Messages
644
I had bad memory maybe twice, max thrice in the last 20 years - and it was a complete clusterfsck of a week+ of troubleshooting to narrow it down to memory, each time. Never. Again. Now that AsRock/Ryzen is a thing, I'm building ECC desktops only.
I'm curious:
What kind of a premium do you have to pay for an ECC build vs. non-ECC?

Also, what is the quality of ASRock boards? I was under the impression that they used Chinese caps rather than Japanese caps, which can make quite a difference to motherboard life. (Or am I mistaken about this?)

If you are turning over the hardware every 2 years, then maybe it doesn't matter, but I tend to keep my hardware for quite a long time.
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
I'm curious:
What kind of a premium do you have to pay for an ECC build vs. non-ECC?
Also, what is the quality of ASRock boards? I was under the impression that they used Chinese caps rather than Japanese caps, which can make quite a difference to motherboard life. (Or am I mistaken about this?)


ECC is dirt cheap, maybe 10-15 USD more per stick.

Er, caps. Hadn't thought about that. Let me see what the Internet says! Okay, from a quick sampling, every ASRock board I looked at claims “100% Japan made high quality conductive polymer capacitors”.

I’m with you on longevity. We replaced the husband’s PC after 10 years of service. I’m trying to make it to 10 years as well, which is 2022. Just have to have the discipline and not jump on that AM5 when it’s released in 2021 :).
 

NASbox

Guru
Joined
May 8, 2012
Messages
644
ECC is dirt cheap, maybe 10-15 USD more per stick.

Er, caps. Hadn't thought about that. Let me see what the Internet says! Okay, from a quick sampling, every ASRock board I looked at claims “100% Japan made high quality conductive polymer capacitors”.

I’m with you on longevity. We replaced the husband’s PC after 10 years of service. I’m trying to make it to 10 years as well, which is 2022. Just have to have the discipline and not jump on that AM5 when it’s released in 2021 :).

That's good to know... I was under the impression that ASRock was a cheap "off" brand owned by Asus: they used some of the engineering from the name-brand Asus boards, but cut corners and used lower-cost/quality parts. If my impression is wrong, I'd love to hear feedback from the community, as I would likely consider buying one at some point.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,176
The similar names are just a coincidence.
 

NASbox

Guru
Joined
May 8, 2012
Messages
644

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
ASRock was spun out from Asus in 2002. It's now a competitor in the motherboard space. You can read a bit about their history here: ASRock - Wikipedia

Pegatron, who own ASRock, were also spun out from Asus.

ASRock was originally created to make motherboards for the value OEM market, but started moving "upstream" in 2007 and have a good reputation in the DIY/enthusiast market now.
 

trek102

Dabbler
Joined
May 4, 2014
Messages
46
Cyberjock, thank you for writing this ZFS guide. This is amazing and extremely helpful!
However, it makes me want to NOT use ZFS!! There seem to be more risks associated with it than benefits.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
However, it makes me want to NOT use ZFS!! There seem to be more risks associated with it than benefits.

I'm curious why this is your perception, because it's definitely not the case. Given identical hardware, a filesystem like ZFS that checksums your data is safer than one that doesn't.

I certainly won't say that ZFS isn't complicated, but generally any "risk" is more of a "performance loss" than a "data loss" issue, unless you commit a major no-no (like using RAID controllers with write-back cache) or try to implement a solution without understanding it ("sync writes make my database slow, so I'll just disable them").
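To be concrete, that last foot-gun is literally a one-liner, which is why checking before changing matters. The dataset name below is hypothetical:

zfs get sync tank/db           # "standard" by default: honor the application's sync requests
zfs set sync=disabled tank/db  # faster, but acknowledged writes can vanish on power loss
zfs set sync=standard tank/db  # put it back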
 

trek102

Dabbler
Joined
May 4, 2014
Messages
46
@HoneyBadger
Very easy - in my understanding ZFS is great in theory (and it's a storage geek's wet dream), but in practice it has more downsides than alternative solutions. Don't get me wrong, I love the concepts of ZFS, and for a home lab it's fantastic, but not for a production system, for the following reasons:

Here is my experience: do we need all the fancy features of self-healing, etc.? In my 20 years of data and storage experience I never had anything like bit-rot happen to me. Also, I had 5 total power outages on enterprise servers and no data ever got lost or damaged (all under Linux MDADM RAID or LSI hardware RAID and ext4). Also, in all of 20 years, I had 3 hard disks physically die on me, so RAID5 worked pretty well. Some of our servers had ECC memory and some didn't. Again, in 20 years, nothing ever happened that was caused by memory.

So in my view ZFS wants to cover some obscure tail risks of highly unlikely data damage and for that you have to live with additional complexity and the fact that there are no repair tools. Once something goes wrong, all is gone. How can this be a sensible risk management strategy?
And here are the specific downside points I see in ZFS:

- first of all, and this hasn't even been mentioned here: zpools are not (!) compatible across different server environments. If you run ZFS on Linux you cannot import your pool into TrueNAS or vice versa. I am not sure what causes this restriction, but it's a no-go for any serious filesystem. The filesystem is an abstraction layer that should be independent of the rest of the environment. E.g. you can mount ext4 filesystems anywhere, and you can import a Linux MDADM RAID array into any Linux machine. A major factor for data recovery purposes.

- ZFS comes with a bunch of complexity but no real new features (vdevs already existed in Linux; it's just called LVM); why even use more than 1 vdev in a zpool if a problem in any vdev kills the entire zpool? Obviously it's more secure to have a separate zpool for each vdev. So just even more complexity for no apparent benefit.

- no recovery tools (!!??) -> this alone would disqualify any filesystem from serious usage apart from home-lab (which I do appreciate)

- "force-mounting can cause permanent damage" and "plenty of users have lost everything because of wrong commands" -> I dont think MDADM and LVM are that fragile. Its a key concept of a filesystem to not allow damaging actions.

- Scrubs (a key feature of ZFS self-healing) can cause more damage than people know. "Scrubs can completely destroy a zpool that is healthy because of bad RAM." Again, a filesystem feature should not allow damaging its own integrity (even if people use non-ECC).

- cache management is overly complicated and apparently doesn't improve anything; reading about it just makes you want to hug your RAID controller with hardware cache and backup battery :) or even normal cache management provided by standard Linux.

- snapshots existed previously in Linux storage (via LVM) - so again nothing new. And if you rsync your snapshot to another location, you know it is safe and works because rsync and LVM have been around for 100s of years. With ZFS Send, you would never know if a restore is really working.

- You cannot add more HDs to a Vdev. In Linux LVM this is not a problem.

- deduplication and compression are nice in theory, but nobody uses them in practice because storage space has become cheap and they are "discouraged when performance is important". "even the designers of ZFS recommend not using deduplication"

- I am not sure if ZFS encryption has any advantages over LUKS so no benefit here again

- iSCSI is not working/not recommended. This is a key feature of a storage solution and any Linux distro can do it (even in the Kernel)

- ZFS is apparently not usable as an ESXi datastore -> this would be the perfect use case in an enterprise world, but again, too complex with ZFS and not advised.
So I love playing around with it, but I am not sure I would use ZFS on a server where the data matters.
 
Last edited:

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
@HoneyBadger
Very easy - in my understanding ZFS is great in theory (and it's a storage geek's wet dream), but in practice it has more downsides than alternative solutions. Don't get me wrong, I love the concepts of ZFS, and for a home lab it's fantastic, but not for a production system, for the following reasons:

Here is my experience: do we need all the fancy features of self-healing, etc.? In my 20 years of data and storage experience I never had anything like bit-rot happen to me. Also, I had 5 total power outages on enterprise servers and no data ever got lost or damaged (all under Linux MDADM RAID or LSI hardware RAID and ext4). Also, in all of 20 years, I had 3 hard disks physically die on me, so RAID5 worked pretty well. Some of our servers had ECC memory and some didn't. Again, in 20 years, nothing ever happened that was caused by memory.

Really? You've never had fsck do anything but preen, never found any contents in a lost+found directory, never found a mysteriously truncated file?

So in my view ZFS wants to cover some obscure tail risks of highly unlikely data damage and for that you have to live with additional complexity and the fact that there are no repair tools. Once something goes wrong, all is gone. How can this be a sensible risk management strategy?

It's a complexity issue. The design expects to keep bad things from happening to begin with. A ZFS system can be storing a petabyte or more of information, and it would take both a huge amount of memory and a huge amount of time to do a full "fsck-like" pass on ZFS. Your Linux MDADM, LSI RAID, and ext4 do not have the ability to manage petabyte-sized filesystems.

And here are the specific downside points I see in ZFS:

- first of all, and this hasn't even been mentioned here: zpools are not (!) compatible across different server environments. If you run ZFS on Linux you cannot import your pool into TrueNAS or vice versa. I am not sure what causes this restriction, but it's a no-go for any serious filesystem. The filesystem is an abstraction layer that should be independent of the rest of the environment. E.g. you can mount ext4 filesystems anywhere, and you can import a Linux MDADM RAID array into any Linux machine. A major factor for data recovery purposes.

Then ext3, ext4, and ffs all need to be labeled as no-gos too, because they also are not portable across "different server environments."

- ZFS comes with a bunch of complexity but no real new features (vdevs already existed in Linux; it's just called LVM); why even use more than 1 vdev in a zpool if a problem in any vdev kills the entire zpool? Obviously it's more secure to have a separate zpool for each vdev. So just even more complexity for no apparent benefit.

How do you build a several petabyte sized filesystem with LVM?

- no recovery tools (!!??) -> this alone would disqualify any filesystem from serious usage apart from home-lab (which I do appreciate)

Already discussed.

- "force-mounting can cause permanent damage" and "plenty of users have lost everything because of wrong commands" -> I dont think MDADM and LVM are that fragile. Its a key concept of a filesystem to not allow damaging actions.

Really? "force" mounting shouldn't be allowed? There's a reason it is called "force" mounting, it bypasses the safety checks. And I betcha I can dd /dev/zero onto a bunch of MDADM or LVM disks and destroy them, so apparently they're not very good at preventing damaging actions.

- Scrubs (a key feature of ZFS self-healing) can cause more damage than people know. "Scrubs can completely destroy a zpool that is healthy because of bad RAM." Again, a filesystem feature should not allow damaging its own integrity (even if people use non-ECC).

This is why ECC memory is pushed so heavily. But other filesystems can also be damaged by their own repair tools if bad memory is present, so this is a specious argument.

- cache management is overly complicated and apparently doesn't improve anything; reading about it just makes you want to hug your RAID controller with hardware cache and backup battery :) or even normal cache management provided by standard Linux.

This is just an insane statement. The power of ZFS is that you can easily have a terabyte of RAM and terabytes of L2ARC fronting a petabyte of hard disk storage, and it will be insanely fast; your RAID controller with hardware cache and backup battery can't even begin to do that.
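Attaching L2ARC is a one-liner, for what it's worth; pool and device names below are placeholders:

# add an SSD/NVMe device to an existing pool as L2ARC
zpool add tank cache nvd0
# and it can be removed again without harming the pool if it doesn't pay off
zpool remove tank nvd0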

- snapshots existed previously in Linux storage (via LVM) - so again nothing new. And if you rsync your snapshot to another location, you know it is safe and works because rsync and LVM have been around for 100s of years. With ZFS Send, you would never know if a restore is really working.

Well, that's just an opinion, and one not really backed up by any facts.

- You cannot add more HDs to a Vdev. In Linux LVM this is not a problem.

That's not really an issue, though. ZFS is just designed differently.
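The ZFS way to grow is either to add another whole vdev or to replace every disk in an existing vdev with bigger ones. Roughly, with placeholder device names:

# grow the pool by adding a second complete RAIDZ2 vdev
zpool add tank raidz2 da8 da9 da10 da11 da12 da13
# or swap a vdev's disks for larger ones, one at a time;
# capacity grows once every disk in that vdev has been replaced
zpool set autoexpand=on tank
zpool replace tank da0 da14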

- deduplication and compression are nice in theory, but nobody uses them in practice because storage space has become cheap and they are "discouraged when performance is important". "even the designers of ZFS recommend not using deduplication"

But virtually EVERYONE uses compression, so you don't really have your facts straight.
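Turning it on and seeing what it buys you is trivial; the dataset name is hypothetical:

zfs set compression=lz4 tank/data
zfs get compression,compressratio tank/data   # compressratio shows the actual savings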

- I am not sure if ZFS encryption has any advantages over LUKS so no benefit here again

But no downside either.

- iSCSI is not working/not recommended. This is a key feature of a storage solution and any Linux distro can do it (even in the Kernel)

- ZFS is apparently not usable as an ESXi datastore -> this would be the perfect use case in an enterprise world, but again, too complex with ZFS and not advised.
So I love playing around with it, but I am not sure I would use ZFS on a server where the data matters.

iSCSI works fine and is absolutely recommended IF you are willing to play by ZFS's rules. I've said many times that ZFS, being a CoW filesystem, needs lots of resources to make iSCSI work well, but properly resourced, it will make HDD storage perform almost as well as SSD.

And lots of people use it as an ESXi datastore in the real world, so I don't know if this is just you making uninformed statements or what. Lots of enterprises use ZFS for their most challenging storage needs, where performance on huge amounts of storage is a key consideration, because when you give ZFS a bunch of resources, it will give you storage that is much faster than standard Linux or BSD filesystems.
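For the record, the usual recipe for an iSCSI/ESXi datastore on ZFS is a zvol backing the LUN, kept well under pool capacity so the CoW layer has room to work. Names and sizes below are purely illustrative:

# a sparse zvol to export over iSCSI as the ESXi datastore
zfs create -s -V 500G -o volblocksize=16K tank/esxi-lun0
zfs get volsize,used,referenced tank/esxi-lun0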
 