What's the stigma behind using RAIDZ1?

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Hey all!

I'm actively working on increasing the storage on my TrueNAS deployment from 2x mirrored 14TB Exos drives to 3x in RAIDZ1, but while researching this I've seen A LOT of warnings and chastisement for using RAIDZ1 (especially with drives that large). I've read that the resilver process increases the chance of another failure, and thus of losing everything. However, for someone like me who can really only get drives once I've saved up, it's kinda hard NOT to want to do RAIDZ1.

Is it really as bad as I've seen in other posts? I've also seen claims that the danger is overhyped, but those posts are few.

I do have a 2nd machine as my backup server that only turns on once a week to pull the data from the main one and then shuts off after a few hours, as well as an offline backup of my stuff. As my storage increases, though, those both get a little trickier to do.

I'd like to hear whatever you have to say about it!

Thanks!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
1. Do you like your data? (how much would you pay to have it not go away forever?)
2. Do you have a backup of it? (sort-of answers question 1)
3. How much of a gambler are you?

With one disk failed in RAIDZ1, and during the entire resilver process, you are completely without protection against corruption of your data (you will still know about it if it happens thanks to ZFS checksums, but can do nothing to correct it), in addition to being a single disk failure away from it all being gone forever (unless your backup is up to date and in good working order... how often do you test your backups?). Even if the chance of a second failed disk were 1 in 1000, would you roll that die and live with the outcome? What if it were 1 in 100? There are many factors which drive the odds... age of the drives is one... unfortunately all drives currently in an array will age at the same rate, so all will be "old" if you make it to 5 years without a lost disk... what then? What if that makes it a 1 in 10 chance?
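To put rough numbers on "what are the odds", here is a minimal sketch. It assumes each surviving disk fails independently at a constant rate, which is optimistic for same-age drives as noted above. The 2% annualized failure rate and the 24-hour resilver are hypothetical inputs, not measurements.

```python
import math

HOURS_PER_YEAR = 24 * 365.25

def p_additional_failure(surviving_disks, afr, resilver_hours):
    """Chance that at least one surviving disk fails during the resilver window.

    Assumes independent disks with a constant failure rate (an assumption;
    drives of the same age and batch may well fail in a correlated way).
    """
    # Convert the annualized failure rate into a constant hazard rate per year.
    hazard_per_year = -math.log1p(-afr)
    window_years = resilver_hours / HOURS_PER_YEAR
    # Probability that one given disk fails inside the window.
    p_one = -math.expm1(-hazard_per_year * window_years)
    # Probability that at least one of the surviving disks fails.
    return 1.0 - (1.0 - p_one) ** surviving_disks

# Hypothetical inputs: 3-disk RAIDZ1 with one disk already dead,
# 2% AFR per disk, 24-hour resilver.
p = p_additional_failure(surviving_disks=2, afr=0.02, resilver_hours=24)
print(f"about 1 in {round(1 / p):,}")
```

With these made-up inputs the result lands on the order of 1 in 9,000; a higher failure rate for aging drives or a longer resilver moves it quickly toward the worse odds described above.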

I suppose with those questions and the above statement, you should have enough to go on to make your decision.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You already hit the nail on the head and I think you already know the answer. @sretalla is correct: if your data is important to you, then build a safer pool. As you have stated, you back up once a week. Maybe that is on Saturday. What if your drive fails on Friday? If you are making differential backups and have the time to update your backups, you would be fine, but if another drive fails (yes it does happen, even if it's just a section of drive that can't be read) then your data could be gone as well.

You have read about it, we are echoing what you read. It is not recommended to use RAIDZ1 with very large hard drives.

But with all that said, it holds the same risk as a pair of mirrored drives. You can lose one drive with either configuration and you still have access to your data; lose the second drive and all that data is gone. So what I'm saying is, it's completely up to you.

So your plan is to back up all your data, destroy the mirror, create a RAIDZ1 with 3 drives, then restore all your data? Why did I lay that out? Because some folks will add a single drive to a pool and make it a stripe, thus the new drive you added is now a single point of failure.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
There is quite a bit out there, and here on this forum, that exaggerates the risk of RAID5/RAIDZ1--I believe the date given for "RAID5 is dead" was 2009, 14 years ago now. There are lots of posts suggesting that it's more likely than not that an additional disk will fail while resilvering, and there's quite a bit of information suggesting that the resilvering operation itself is in some way uniquely stressful for your pool. Both of these claims are bunk.

But with that said, the risk does exist, and it's significantly greater than with RAIDZ2 (or better yet, RAIDZ3). As a result, we generally recommend against RAIDZ1 with disks larger than 1-2 TB, though that can vary a bit depending on your risk tolerance and the resilience of your backup strategy.

We tend to be conservative here--we care about our data, and we tend to assume that people are using TrueNAS and ZFS because they also care about their data. We typically make our recommendations on those assumptions. If your data isn't that important--say it's a media pool of stuff you can easily re-download or re-rip--you might not need to take such precautions. But don't then put irreplaceable data on such a pool.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Thanks for the replies!

You have read about it, we are echoing what you read. It is not recommended to use RAIDZ1 with very large hard drives.

But with all that said, it holds the same risk as a pair of mirrored drives. You can lose one drive with either configuration and you still have access to your data; lose the second drive and all that data is gone. So what I'm saying is, it's completely up to you.

If mirrored and RAIDZ1 are the same "riskiness", why isn't mirroring frowned upon? Wouldn't the extra space provided with not much more hardware be a good thing? Unless it's that extra data that becomes the issue when you talk about rebuilding or restoring.

So your plan is to back up all your data, destroy the mirror, create a RAIDZ1 with 3 drives, then restore all your data? Why did I lay that out? Because some folks will add a single drive to a pool and make it a stripe, thus the new drive you added is now a single point of failure.

You are correct. Backup, kill, rebuild as RAIDZn (Whatever I feel I can afford/justify with the wifey :cool:). No random striping of a single disk. I at least know better than that!

But with that said, the risk does exist, and it's significantly greater than with RAIDZ2 (or better yet, RAIDZ3). As a result, we generally recommend against RAIDZ1 with disks larger than 1-2 TB, though that can vary a bit depending on your risk tolerance and the resilience of your backup strategy.

I think my question from above could go here too: is there a difference from what I've been doing with a 2 disk mirror vs RAIDZ1? I'm certainly going to be looking for spare change in the couch for a 4th disk if I can, but then I have to consider potential future expansion and how that will need to work, since I might have to do this all again... Unless the "coming" change to ZFS happens and I can expand a pool easily, I don't know how I'd do that just yet. I don't like "failing to plan" 'cause we all know the rest of that saying...

Would a RAIDZ2 or striped mirror be a better choice? With Z2, any 2 drives could fail and you'd be fine but striped mirror only 1 of each mirror can die, I believe. The striped mirror has the advantage of just adding another mirror to increase the pool, correct? How have you guys handled expanding your storage before?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
is there a difference from what I've been doing with a 2 disk mirror vs RAIDZ1
Yes. In the event of a disk failure, the mirror will resilver considerably faster, and there's only one disk whose failure during resilvering would cause failure of the entire pool.
With Z2, any 2 drives could fail and you'd be fine but striped mirror only 1 of each mirror can die, I believe.
Correct.
The striped mirror has the advantage of just adding another mirror to increase the pool, correct?
Also correct.
How have you guys handled expanding your storage before?
I started with three disks in RAIDZ1, then added another three. Then realized that that didn't provide very good data protection and created a new pool of six disks in RAIDZ2, migrating all the data to that, putting the old six disks into a second RAIDZ2 vdev in the same pool. Then added six disks at a time in RAIDZ2.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
the mirror will resilver considerably faster
In your opinion, would a striped mirror or RAIDZ2 be better for 4 drives?

What do others think between the two?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
In your opinion, would a striped mirror or RAIDZ2 be better for 4 drives?
There is so much information on this topic in this forum, it's crazy. There have been dozens of people who have asked the same types of questions. You can do an internet search for something like "truenas vdev pool mirror raidz" for example and you might find a few things to read. Here is a link that I found useful a while back. https://www.truenas.com/community/t...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
In your opinion, would a striped mirror or RAIDZ2 be better for 4 drives?

What do others think between the two?
Depends on what you need. iSCSI wants more vDevs for higher IOPS, so 2 x 2-way Mirrors. But for simple archive / backup, where reliability is more important and speed less so, then 4 disks in RAID-Z2.

You have hit upon the age-old choice: reliability, speed & low cost. You can have 2 but not all 3.


One comment about risks of RAID-Z1. If you have 4 disks in a RAID-Z1, and lose a disk completely, you have to rely on 3 disks to supply correct blocks to rebuild the failed disk's data.

But, for a 2 way Mirror vDev, you only have to count on 1 disk for rebuild. Thus, some think 2 way Mirrors are safer than RAID-Z1 for disk replacement. With 22TB disks, (and even larger expected), I am starting to doubt that will be a good decision.
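The "how many disks do I have to count on" point can be sketched with the classic spec-sheet calculation. This is an illustration under loud assumptions: the unrecoverable read error rate of 1 per 10^15 bits is a hypothetical datasheet-style figure, every disk is assumed completely full, and errors are treated as independent. Real drives often do better than the datasheet, and ZFS only reads allocated blocks, so treat the results as a pessimistic bound rather than a prediction.

```python
import math

def p_read_error(tb_to_read, ure_per_bit=1e-15):
    """Chance of at least one unrecoverable read error while reading tb_to_read.

    ure_per_bit is an assumed datasheet-style rate, not a measured value.
    """
    bits = tb_to_read * 1e12 * 8
    # 1 - (1 - rate)^bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-ure_per_bit))

DISK_TB = 14  # drive size from the original question

mirror_rebuild = p_read_error(1 * DISK_TB)    # 2-way mirror: read 1 surviving disk
raidz1_3_rebuild = p_read_error(2 * DISK_TB)  # 3-disk RAID-Z1: read 2 surviving disks
raidz1_4_rebuild = p_read_error(3 * DISK_TB)  # 4-disk RAID-Z1: read 3 surviving disks

for name, p in [("2-way mirror", mirror_rebuild),
                ("3-disk RAID-Z1", raidz1_3_rebuild),
                ("4-disk RAID-Z1", raidz1_4_rebuild)]:
    print(f"{name}: {p:.1%}")
```

The absolute numbers depend entirely on the assumed error rate; the part that carries over is the ordering, since every extra disk that has to be read in full raises the exposure.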

ZFS does support "replace in place", which can mitigate RAID-Z1 & 2 way Mirror disk replacement if the failed disk has not completely failed. This means you install a new disk into your TrueNAS server, then replace the failing disk with the newly installed disk. ZFS will basically use the failing disk for good data when it can, and the pool's redundancy when needed. This looks like ZFS is mirroring the failing disk with the new disk. But once the replacement is done, the failing disk is "detached" and no longer part of the pool.
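At the command level, this kind of replacement corresponds to `zpool replace <pool> <old-device> <new-device>` issued while the old device is still attached. Below is a minimal sketch that only builds (and optionally runs) that command. The pool name `tank` and the device names `da1` and `da4` are hypothetical, and on TrueNAS the replacement is normally started from the web UI rather than the shell.

```python
import subprocess

def zpool_replace_cmd(pool, failing_dev, new_dev):
    """Build the argv for replacing failing_dev with new_dev in pool."""
    return ["zpool", "replace", pool, failing_dev, new_dev]

def replace_in_place(pool, failing_dev, new_dev, dry_run=True):
    """Return the command line; execute it only when dry_run is False."""
    cmd = zpool_replace_cmd(pool, failing_dev, new_dev)
    if not dry_run:
        # Requires root and a real pool; raises CalledProcessError on failure.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)

# Hypothetical names, dry run only: nothing is executed here.
print(replace_in_place("tank", "da1", "da4"))
```

The dry-run default is deliberate: it lets you check the pool and device names before anything touches the pool.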

This "replace in place" is in some ways a groundbreaking feature. Long ago I had a hardware disk array with a 4 disk RAID-5, but 2 disks were failing. No data loss because the blocks failing were in different places. So one failing disk was the other's redundancy. Without "replace in place" I had to back up the LUN, replace both disks, rebuild the RAID-5, (which is something ZFS does not need to do to every block of every disk in a pool), then restore the data.

(The cause of that problem was incomplete monitoring of external hardware RAID arrays... later firmware allowed adding hot spares live, which was basically the "replace in place" feature.)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
why isn't mirroring frowned upon?

Because you can do three- or four-way mirroring, of course, which gets you dual and triple parity equivalent protection for those of us who require it.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46

Thanks for that! I'll have a more thorough look through when I have time. I opened it while at work and quickly realized that I won't be able to go through that just yet.

One comment about risks of RAID-Z1. If you have 4 disks in a RAID-Z1, and lose a disk completely, you have to rely on 3 disks to supply correct blocks to rebuild the failed disk's data.

I would only be using it with 3 disks if I did Z1. I'm looking into Z2 or striped mirrors with 4 disks because I do value what I put on there, which is why I have 2 different backups. But just because I have the backups doesn't mean I necessarily need to risk things if I can afford not to. On the other hand, $400 for 2 new disks is hard at the moment, which is why I'm here looking for information from you guys.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
There is quite a bit out there, and here on this forum, that exaggerates the risk of RAID5/RAIDZ1--I believe the date given for "RAID5 is dead" was 2009, 14 years ago now. There are lots of posts suggesting that it's more likely than not that an additional disk will fail while resilvering, and there's quite a bit of information suggesting that the resilvering operation itself is in some way uniquely stressful for your pool. Both of these claims are bunk.

I'm going to take exception to this, because it is a known issue that ZFS arrays that are resilvering usually also increase in temperature, sometimes by more than 10°C. For many pools, this is the only time they experience thermal stresses, so being unnecessarily dismissive of a real threat is not helpful. Resilvering represents one of those operations where the potential for something to go wrong certainly exists, so at a minimum you should consider whether it would be prudent to take additional steps to mitigate the risk. This might be as simple as blowing a fan on your NAS during a resilver.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
This might be as simple as blowing a fan on your NAS during a resilver.

In my case there would be 2 high power fans right in front of the drives so that would be ok.

I know it depends on what CPU is doing it, but how much processing power does it actually take to do the resilver? Is it a matter of the CPU runs at 100% until it's done or does TrueNAS sorta just figure it out to allow other things to keep working? I ask because the fans would be tied to the CPU temps.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Because you can do three- or four-way mirroring, of course, which gets you dual and triple parity equivalent protection for those of us who require it.

I guess I wasn't specific because in my head I was thinking about 2 disk mirrors when I originally asked that, I just didn't say it. I am starting to understand the benefit of multi-mirrored pools, at least to an extent. It doesn't seem practical to exponentially grow your array with mirrors.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
If mirrored and RAIDZ1 are the same "riskiness", why isn't mirroring frowned upon?
Not even close. Mirrors have a way better risk profile as the number of drives goes up.
Take, for example, a 6-drive RAIDZ1 vs 6-drive striped mirrors. In a 6-drive RAIDZ1, you will lose your pool as soon as a second drive goes bad; 6-drive striped mirrors can suffer up to 3 drive failures so long as no two of the failed drives are in the same vdev. Not to mention that your RAIDZ1 pool suffers 5 times more I/O load while it's resilvering that 1 failed drive, because it has to read a block from each of the surviving drives for each block resilvered vs only 1 in striped mirrors. Hence, the resilvering time in striped mirrors is typically much faster than in RAIDZ1, and that's assuming you don't use the pool at all while resilvering.
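The 6-drive comparison can be checked by brute force. The sketch below enumerates every combination of failed disks and counts how many leave the pool alive; the disk numbering and the layout descriptions are just an illustrative encoding, where each vdev is a tuple of disk IDs plus the number of failures it tolerates.

```python
from itertools import combinations

def pool_survives(vdevs, failed):
    """A pool survives only if every vdev stays within its failure tolerance."""
    return all(len(failed & set(disks)) <= tolerance for disks, tolerance in vdevs)

def survivable_combos(vdevs, n_failures):
    """Return (survivable, total) over all ways n_failures disks can fail."""
    all_disks = [d for disks, _ in vdevs for d in disks]
    combos = [set(c) for c in combinations(all_disks, n_failures)]
    return sum(pool_survives(vdevs, c) for c in combos), len(combos)

# One 6-wide RAIDZ1 vdev: tolerates 1 failure in total.
raidz1_6wide = [((0, 1, 2, 3, 4, 5), 1)]
# Three 2-way mirror vdevs striped together: each tolerates 1 failure.
mirrors_3x2 = [((0, 1), 1), ((2, 3), 1), ((4, 5), 1)]

for n in (1, 2, 3):
    print(n, "failed:",
          "RAIDZ1", survivable_combos(raidz1_6wide, n),
          "mirrors", survivable_combos(mirrors_3x2, n))
```

Under this model the striped mirrors survive 12 of the 15 possible two-disk failures and 8 of the 20 possible three-disk failures, while the RAIDZ1 survives none of either. Note that the mirrors are not guaranteed to survive a second failure either: the remaining 3 of 15 cases hit both halves of one mirror.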

Upgrading/expansion is also relatively simple with striped 2-way mirrors because you only have to upgrade 2 drives at a time, as opposed to 6 at a time in the case of our 6-drive RAIDZ1.

That being said, the one big downside of mirrors is obviously the 50% space efficiency vs the much better 83% space efficiency of the 6-drive RAIDZ1, but in return you get a better upgrade/expansion path, way better performance overall whether degraded or not, way faster resilvering, and also a better risk profile.
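The space-efficiency figures are simple arithmetic, sketched below. This counts raw usable capacity only and deliberately ignores ZFS overheads such as metadata, padding, and reserved space, so real numbers will come out somewhat lower. The 14 TB disk size is taken from the original question.

```python
def raidz_usable_tb(disks, parity, disk_tb):
    """Raw usable capacity of one RAIDZ vdev (overheads ignored)."""
    return (disks - parity) * disk_tb

def striped_mirror_usable_tb(mirror_vdevs, disk_tb):
    """Raw usable capacity of striped mirrors: one disk's worth per mirror vdev."""
    return mirror_vdevs * disk_tb

def efficiency(usable_tb, total_disks, disk_tb):
    """Fraction of purchased raw capacity that ends up usable."""
    return usable_tb / (total_disks * disk_tb)

DISK_TB = 14

# The 6-drive example: RAIDZ1 vs three striped 2-way mirrors.
print(round(efficiency(raidz_usable_tb(6, 1, DISK_TB), 6, DISK_TB), 3))
print(round(efficiency(striped_mirror_usable_tb(3, DISK_TB), 6, DISK_TB), 3))

# The layouts discussed earlier in the thread, in raw usable TB.
layouts = {
    "3-disk RAIDZ1": raidz_usable_tb(3, 1, DISK_TB),
    "4-disk RAIDZ2": raidz_usable_tb(4, 2, DISK_TB),
    "2 x 2-way mirrors": striped_mirror_usable_tb(2, DISK_TB),
}
print(layouts)
```

One thing this makes visible: with 14 TB disks, the 3-disk RAIDZ1, the 4-disk RAIDZ2, and the 4-disk striped mirror all give the same 28 TB of raw usable space, so the fourth disk buys redundancy or performance rather than capacity.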
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Take, for example, a 6-drive RAIDZ1 vs 6-drive striped mirrors. In a 6-drive RAIDZ1, you will lose your pool as soon as a second drive goes bad; 6-drive striped mirrors can suffer up to 3 drive failures so long as no two of the failed drives are in the same vdev.

While I get where you are going with that, I mentioned above that I was referring to 2 disk mirrors and 3 disk RAIDZ1 for their "riskiness". Thanks for the example and your comment about the upgradability, though.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
While I get where you are going with that, I mentioned above that I was referring to 2 disk mirrors and 3 disk RAIDZ1 for their "riskiness".
Yeah I got that from some of the other posts, but I figured I'd give you more context on other larger scenarios as most people will eventually want to upgrade their array and it's usually not a question of if, but a question of when. Worth noting too, that once created, a RAIDZ vdev is immutable (at least for now). There is work on changing that, but I wouldn't hold my breath waiting on that to come online because it still needs extensive testing for production use.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
Worth noting too, that once created, a RAIDZ vdev is immutable (at least for now). There is work on changing that, but I wouldn't hold my breath waiting on that to come online because it still needs extensive testing for production use.

Yea, I can't hold my breath for that long...
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I know it depends on what CPU is doing it, but how much processing power does it actually take to do the resilver? Is it a matter of the CPU runs at 100% until it's done or does TrueNAS sorta just figure it out to allow other things to keep working? I ask because the fans would be tied to the CPU temps.
Yes, the CPU does impact how fast resilvering occurs, but only up to a point; after that the bottleneck is the drive throughput (which could be the drive, the interface, and/or the pool configuration). So, for example, an Atom CPU might resilver slowly, while a Xeon CPU would be limited by the drive throughput. I honestly do not know if a study has been done on resilvering speeds using different CPUs, so which CPU is the perfect balance, eh? I do think your current CPU is sufficient in this respect and your limitation would likely be drive throughput. I hope this makes sense.
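The "whichever is slower wins" idea can be sketched as a best-case estimate. All throughput figures below are hypothetical, and the model assumes a perfectly sequential resilver of the given amount of data on an otherwise idle pool; fragmentation and normal pool activity will make a real resilver slower.

```python
def resilver_hours(data_tb, disk_mb_per_s, cpu_mb_per_s=float("inf")):
    """Best-case resilver time: the data to rewrite divided by the slower
    of the two limits (drive throughput vs what the CPU can process)."""
    effective_mb_per_s = min(disk_mb_per_s, cpu_mb_per_s)
    seconds = data_tb * 1e12 / (effective_mb_per_s * 1e6)
    return seconds / 3600

# Hypothetical: 14 TB of data to rewrite, drive sustaining 200 MB/s.
fast_cpu = resilver_hours(14, disk_mb_per_s=200)                    # drive-limited
slow_cpu = resilver_hours(14, disk_mb_per_s=200, cpu_mb_per_s=100)  # CPU-limited
print(f"{fast_cpu:.1f} h vs {slow_cpu:.1f} h")
```

Once the CPU can keep up with the drives, a faster CPU changes nothing in this model; only the amount of data and the drive throughput matter.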
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
I do think your current CPU is sufficient in this respect and your limitation would likely be drive throughput. I hope this makes sense.

Good to know, thanks!

So, I just thought of a related question. If I have the backups, is it better to let the system do the resilver (regardless of the RAIDZ level used) or just re-copy from my backups?
 