Two Datasets vs Two Pools and Minimizing Chance of Corruption

HarryMuscle

Contributor
Joined
Nov 15, 2021
Messages
161
Is it possible for a "misbehaving" dataset to cause corruption in other datasets in the same pool, or to corrupt the whole pool and thus make other datasets inaccessible? In other words, if the goal is to minimize the chance of data corruption, would it be safer to create two separate pools instead of using two datasets in the same pool? Or is the risk so minimal it's not worth worrying about?
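For context, the two layouts being compared might look like this (pool and device names are hypothetical):

```shell
# Option A: one pool, two datasets sharing the same vdevs and pool metadata
zpool create tank mirror /dev/sda /dev/sdb
zfs create tank/data1
zfs create tank/data2

# Option B: two pools, each with fully independent on-disk metadata
zpool create tank1 mirror /dev/sda /dev/sdb
zpool create tank2 mirror /dev/sdc /dev/sdd
```

Option B gives each half of the data its own metadata, at the cost of splitting the available disks and managing two pools.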

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
All blocks are equal in the eyes of ZFS, so one dataset is as protected as the one next to it...

That said, metadata corruption can happen (although by default it's already stored more than once in every pool, so you need to have a badly failing disk or disks in order for that to be the case) and may make it harder to get to data that's in a child dataset of one that has corruption (should simply be a case of re-mounting the child, but there can be complications to it depending on circumstances).
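If a parent dataset does develop corruption, the re-mounting sretalla describes would look roughly like this (pool and dataset names are hypothetical):

```shell
zpool status -v tank          # lists any files or objects with permanent errors
zpool scrub tank              # re-verify all checksums once the hardware is sorted
zfs mount tank/parent/child   # re-mount a child dataset directly
zfs mount -a                  # or re-mount everything that isn't currently mounted
```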

2 separate pools will have 2 entirely separate sets of metadata, so you could argue it either way... more metadata is more ground for lightning to strike upon... but if you're just splitting the same total number of disks between the two pools, you aren't really going to gain anything.

To protect against metadata corruption, you'd be much better off having a 3 or even 4-way mirror special/metadata VDEV in your pool and benefit from the performance boost as well as the "additional safety"... that wouldn't need to be very large and can probably be very cheap SSDs.
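A sketch of adding such a special vdev to an existing pool (pool and SSD device names are hypothetical; note that a special vdev becomes essential to the pool, so its redundancy matters):

```shell
# 3-way mirrored special vdev: metadata is written to the SSDs
zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1

# optionally, also route small file blocks to the SSDs
zfs set special_small_blocks=32K tank
```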

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
To protect against metadata corruption, you'd be much better off having a 3 or even 4-way mirror special/metadata VDEV in your pool and benefit from the performance boost as well as the "additional safety"... that wouldn't need to be very large and can probably be very cheap SSDs.
I am sorry to resuscitate such an old thread, but I am perplexed by the number of mirrors needed, since metadata is already stored 2 or 3 times on a single disk.
Wouldn't a standard 2-way mirror be enough, maybe with a shared hot spare?
A 4-way mirror sounds a bit overkill to me, even for an enterprise-level solution (I am not in the field though, so this could very well be standard and I would be puzzled over nothing).
Edit: also, what are the performance implications of this? And are there any issues with SSD endurance?

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
The starting point for the discussion is replicating the safety you have in your pool data VDEVs... for a pool of mirrors, that would possibly be a simple mirror for the metadata VDEV too... but that then puts you in an interestingly risky situation when one of the disks holding the metadata fails... now your pool only has to lose one more disk to be unrecoverable (which would be true of any of the mirrored VDEVs in the pool too).

So the question is what would you do if you want to be able to lose any 2 disks and still have redundancy... that means a 4-way mirror for metadata (and maybe RAIDZ3 in your data VDEVs depending on the use-case).

Anyway, as you say, the likelihood of losing metadata on any one disk through corruption is reduced, since ZFS already stores ditto copies of all metadata (copies=3 if I remember correctly... at least 2 if not).

But that doesn't really help if you're losing entire disks.
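Putting that two-disk-failure target together, a new pool sketched along those lines might look like this (all pool and device names are hypothetical):

```shell
# raidz3: lose any 2 disks and parity remains; 4-way special mirror likewise
zpool create tank \
  raidz3 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf \
  special mirror /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1

# the ditto-copy behaviour mentioned above is controlled by this property
zfs get redundant_metadata tank   # "all" by default
```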