What's the stigma behind using RAIDZ1?

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
So, I just thought of a related question. If I have the backups, is it better to let the system do the resilver (regardless of the RAIDZ level used) or just re-copy from my backups?
If you do have backups of your data, I would still replace the suspect drive and let the system resilver. Resilvering should be significantly faster than copying all the data back on the drives. If resilvering fails and your data is lost, that is where the backup would be used.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I'm going to take exception to this, because it is a known issue that ZFS arrays that are resilvering usually also increase in temperature, sometimes by more than 10°C.
I'm certainly willing to defer to your experience, but is this uniquely the case for a resilver vs. a scrub? Because my understanding (which is certainly prone to error) is that these processes impose essentially the same workload on the pool.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm certainly willing to defer to your experience, but is this uniquely the case for a resilver vs. a scrub? Because my understanding (which is certainly prone to error) is that these processes impose essentially the same workload on the pool.

I didn't feel the need to present two obvious examples, but yes.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Reading the following resource might help you.

A 2-way mirror VDEV is safer than a 3 disk RAIDZ1 VDEV.
Two 2-way mirror VDEVs are safer than a single 4 disk RAIDZ1 VDEV.
Two 2-way mirror VDEVs are not safer than a single 4 disk RAIDZ2 VDEV.
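These three claims can be sanity-checked with a first-order model that counts only independent whole-drive failures (the function names and the 3% per-drive failure probability below are mine, purely for illustration; real-world risk also includes UREs during rebuild, which this sketch ignores):

```python
from math import comb

def p_vdev_loss(n, parity, p):
    """Probability an n-drive vdev with `parity` drives of redundancy
    loses data, i.e. more than `parity` drives fail, assuming
    independent per-drive failure probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(parity + 1, n + 1))

def p_pool_loss(vdevs, p):
    """A pool dies if any of its vdevs dies; vdevs = [(n, parity), ...]."""
    alive = 1.0
    for n, parity in vdevs:
        alive *= 1 - p_vdev_loss(n, parity, p)
    return 1 - alive

p = 0.03  # assumed per-drive failure probability over the window of interest

mirror2     = p_pool_loss([(2, 1)], p)           # 2-way mirror
raidz1_3    = p_pool_loss([(3, 1)], p)           # 3-disk RAIDZ1
two_mirrors = p_pool_loss([(2, 1), (2, 1)], p)   # two striped 2-way mirrors
raidz1_4    = p_pool_loss([(4, 1)], p)           # 4-disk RAIDZ1
raidz2_4    = p_pool_loss([(4, 2)], p)           # 4-disk RAIDZ2

assert mirror2 < raidz1_3        # 2-way mirror safer than 3-disk RAIDZ1
assert two_mirrors < raidz1_4    # two mirrors safer than 4-disk RAIDZ1
assert raidz2_4 < two_mirrors    # but 4-disk RAIDZ2 beats both
```

The ordering holds for any small p: losing a mirror needs two specific disks to die, while RAIDZ1 dies when any two of its members do.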

It's the eternal struggle between cost, space and resiliency: you can only have two.

I also disagree with @danb35's opinion regarding RAIDZ1.

You also have to look at disk capacity and URE rates, but in the end, risk acceptance is a personal thing.
If you want to dig a bit deeper you can read this post and follow up with this exchange.
 
Last edited:

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
It's the eternal struggle between cost, space and resiliency: you can only have two.
Don't forget performance in that trade-off.
I also disagree with @danb35's opinion regarding RAIDZ1.
It'd probably be helpful to identify what part of my opinion you disagree with.
I didn't feel the need to present two obvious examples, but yes.
So scrubs and resilvers do present comparable workloads to the (remaining, in the case of resilvers) disks in the pool? Because if that's the case, I don't think you're disagreeing with me. I'm not saying that a resilver is stress-free, but rather that it's no (or not significantly) more stressful to those disks than scrubs, which are a routine maintenance operation--and that's in response to the claim I've often seen made that resilvers are so uniquely stressful that they'll kill at least one of the remaining disks, destroying your pool.

In case I wasn't clear, I'm not ordinarily recommending RAIDZ1, I don't think it's a good choice for most use cases, and I'm not denying its risks--but I do think those risks are often exaggerated.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Yeah, I do agree that the risks are exaggerated as well, especially for small use cases like 4x4TB drives for example. The general sentiment seems to be more like "No RAIDZ1 ever". It's definitely not as preferred as RAIDZ2 and above, but there are small instances where it's reasonable.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Don't forget performance in that trade-off.
Totally right.

It'd probably be helpful to identify what part of my opinion you disagree with.
My bad, I believed I had addressed this.
There is quite a bit out there, and here on this forum, that exaggerates the risk of RAID5/RAIDZ1--I believe the date given for "RAID5 is dead" was 2009, 14 years ago now. There are lots of posts suggesting that it's more likely than not that an additional disk will fail while resilvering, and there's quite a bit of information suggesting that the resilvering operation itself is in some way uniquely stressful for your pool. Both of these claims are bunk.
I don't believe such concerns to be exaggerated. RAID5/Z1 being dead is a thing.
There is a bit of simplification in most posts regarding this argument because it concerns UREs and probability, as well as data that's difficult to obtain.

For example, I'd totally go with 4x 12TB drives with a URE rate of 1e-15 in a RAIDZ1; I wouldn't go beyond 4TB drives if we're talking about 1e-14.

Also, resilvering is a stressful operation, and while a scrub is similar, the two have a few differences; they are not the same thing (one reason why SMR drives are not recommended).

Making new users understand the dangers of RAIDZ1 and, more importantly, URE rates in correlation with drive sizes is vital, especially for those with a "YouTube" or "gaming hardware" background (although I feel there is a lack of awareness among experienced users as well).

It is also a matter of a difference in mentality: experienced users understand the importance of planning ahead and try to pass that on to new, inexperienced users.

I plan on writing a resource about such aspects; while it's true that risk acceptance is subjective, we all agree that data loss is painful.
 
Last edited:

Volts

Patron
Joined
May 3, 2021
Messages
210
I'm not saying that a resilver is stress-free, but rather that it's no (or not significantly) more stressful to those disks than scrubs, which are a routine maintenance operation--and that's in response to the claim I've often seen made that resilvers are so uniquely stressful that they'll kill at least one of the remaining disks, destroying your pool.

I have the same question. Loss of redundancy is itself scary. But resilver should be the same work for the good devices.

Or less - resilver can cheat for mirrors, right? And I think it only reads necessary blocks, not all of them.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
I have the same question. Loss of redundancy is itself scary. But resilver should be the same work for the good devices.

Or less - resilver can cheat for mirrors, right? And I think it only reads necessary blocks, not all of them.
I'm not sure about resilvers vs scrubs, but resilvers can be orders of magnitude faster and generate much less I/O demand/load in mirrors than in RAIDZ, since you're only loading 1 other drive in the vdev vs however many remaining drives in the RAIDZ vdev.

Compare 4-drive mirrors vs a 4-drive RAIDZ1. A resilver in the mirrors loads just 1 other drive, versus 3 other drives in the RAIDZ, because for each block resilvered, a block also has to be read from each of the remaining drives. Now scale that up to 6 drives. The mirrors stay at 1, while the RAIDZ now loads 5 drives instead of 3. That is now 5 times the I/O demand. Add more drives and you can see how the I/O load of a RAIDZ resilver scales linearly every time you add another drive to the vdev.
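That scaling can be sketched in a few lines, under the simplified assumption that a mirror resilver copies each block from its one surviving side while a RAIDZ1 resilver reads every surviving member of the vdev (the function name is mine):

```python
def resilver_reads(layout, width, blocks_to_rebuild):
    """Total blocks read from surviving drives to rebuild one failed drive.

    Simplified model: a mirror copies each block from its one surviving
    side; RAIDZ1 reconstructs each block from its width-1 survivors.
    """
    per_block = 1 if layout == "mirror" else width - 1
    return per_block * blocks_to_rebuild

# Rebuilding 1000 blocks:
# mirror pair -> 1000 reads; 4-wide RAIDZ1 -> 3000; 6-wide -> 5000.
for layout, width in [("mirror", 2), ("raidz1", 4), ("raidz1", 6)]:
    print(layout, width, resilver_reads(layout, width, 1000))
```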
 
Last edited:

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm not saying that a resilver is stress-free, but rather that it's no (or not significantly) more stressful to those disks than scrubs, which are a routine maintenance operation--and that's in response to the claim I've often seen made that resilvers are so uniquely stressful that they'll kill at least one of the remaining disks, destroying your pool.

There *may* be some differences, and significant ones at that. A resilver and a scrub are very similar in that they walk the pool in a virtually identical manner. We agree, I assume, and this seems to be your point. However, when resilvering, or even when just repairing checksum errors during normal read operations, you are also doing an additional write operation and some other stuff.

For a single disk, writing a single sector shouldn't be terribly hard. Again, I'm going to assume we can see eye to eye on that.

However, for a single SMR disk, writing a single sector involves rewriting the entire shingle, and we already know that this can get very hard on pools, even outside of a scrub operation, if more than a small number of rewrites are involved. This is what led to the original kerfuffle about SMR disks: people had pools that were failing to resilver, even if they had RAIDZ2 or RAIDZ3 protection.

Worse, for even a CMR disk, the sustained write activity increases stress particularly on the target (drive being replaced), increasing temperatures. It is not just a function of reading the existing data sectors and verifying the parity sectors. It is reading the block's sectors, back-calculating the missing data or parity sectors, and then writing that out to the replaced disk. This is more work than just reading all the disks. Reading is relatively trivial and some of it is mitigated by drive and host caching. Writing semirandom sectors to rebuild ZFS blocks typically requires a seek for each ZFS block, which may be harder on the drive being written to. More work equals more heat.
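The back-calculation step can be illustrated with single-parity XOR reconstruction, a simplified stand-in for RAIDZ1's actual variable-stripe on-disk layout (the sector contents below are made up for the example):

```python
def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# Data sectors striped across three drives, parity on a fourth.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\x0f\x0e\x0d"]
parity = xor_blocks(data)

# Drive 1 dies: read the survivors plus parity, XOR them together,
# and write the reconstructed sector out to the replacement drive.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

Every reconstructed sector thus costs a read on each survivor plus a write (and usually a seek) on the target, which is the extra work a scrub never does.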

Finally, resilvers on mirrors are somewhat easier than RAIDZ because you might only be involving two or three disks (meaning only two or three disks are running warm). RAIDZ, on the other hand, involves each disk in the vdev, and because the process is slower due to the nature of RAIDZ, all the component drives run busier, for longer, get warmer, and it just isn't really a great thing for them.

Think about it this way: You are already running in a degraded mode when a resilver is going on, so the risk of further degradation is more serious than when you're just doing a scrub and reading all the data. Scrubbing a RAIDZ2 that has no errors is pretty safe. However, yank one of those disks out, simulating a disk failure, and you suddenly have something that resembles a RAIDZ1 in terms of redundancy. Now take a hammer and start tapping on one of the other drives to represent an already marginal drive. How certain are you that the pool will survive? 90%? 95%? Fine. You have every right to decide on whatever level of resiliency floats yer boat. But it's important to understand that while your system is degraded, it is DEGRADED in multiple ways -- slower performance, less resiliency. It's easy to think "oh but it's RAIDZ2" and pretend this isn't a thing. But fate has this tendency to pick on the unprepared. If you go full on RAIDZ3 with a warm spare, you'll probably never see a disk fail. That's no fun for fate. It's the goober who decides to rely on RAIDZ1, where a disk fails, and then just one more bad thing happens during resilver, and the pool consistency is now in an indeterminate state. Maybe it's okay for those blocks to come back as all zeroes. Maybe not.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I plan on writing a resource about such aspects; while its true that risk acceptance is subjective we all agree that data loss is painful.

If you do, feel free to borrow any concepts I touched on. This would be an excellent resource on a tough topic to cover.
 

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
For example, I'd totally go with 4x 12TB drives with URE of 1e-15 in a RAIDZ1; I wouldn't go beyond 4TB ones if we talk about 1e-14.

With my 14TB Seagate Exos drives being 10^15 according to the spec sheet, how (if at all) does that affect any recommendations? I would assume that other high-capacity drives like mine are the same, though I haven't checked.

Also, how many others feel the same about that statement, out of curiosity?
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Also, how many others feel the same about that statement, out of curiosity?
I'd probably run 4x8TB in RAIDZ1, but I wouldn't go any higher than that.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
With my 14TB Seagate Exos drives being 10^15 according to the spec sheet, how (if at all) does that affect any recommendations?
Running the numbers gives me an 8.6% chance of experiencing a URE while reading an 80% full drive.
Assuming (BIG assumption) a 3% drive failure rate, with a 4-wide RAIDZ1 VDEV you have a 20% chance of data loss.
If we are talking about 1e-14 drives, using the same parameters, you have a 70% chance of data loss.
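A sketch of the single-drive arithmetic, using the Poisson approximation for rare per-bit events (the function name is mine; the quoted 20%/70% data-loss figures additionally fold in the assumed 3% drive failure rate, which requires extra modelling assumptions not shown here):

```python
import math

def p_ure(capacity_tb, fill, ure_rate):
    """Probability of at least one URE while reading a drive's data.

    capacity_tb: drive size in TB (1 TB = 1e12 bytes)
    fill: fraction of the drive holding data that must be read
    ure_rate: unrecoverable read errors per bit read (e.g. 1e-15)
    """
    bits_read = capacity_tb * 1e12 * 8 * fill
    # Errors per bit are tiny, so 1 - (1 - r)**n ~= 1 - exp(-r * n).
    return 1 - math.exp(-ure_rate * bits_read)

# 14 TB drive, 80% full, 1e-15 URE rate -> roughly 8.6%
single = p_ure(14, 0.8, 1e-15)

# Resilvering a 4-wide RAIDZ1 must read all 3 surviving drives,
# so the chance of hitting at least one URE during the rebuild is:
resilver = 1 - (1 - single) ** 3

print(f"single drive: {single:.1%}, resilver across 3 survivors: {resilver:.1%}")
```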

I would assume that other high-capacity drives like mine are the same, though I haven't checked.


Also, how many other's feel the same about that statement out of curiosity?
I did too, until the Western Digital attacked. To me it is total nonsense (bordering on criminal) to sell up to 22TB NAS PRO drives with a URE rate of 1e-14: we are talking about over a 75% chance of a URE while at 80% capacity.
 
Last edited:

jace92

Dabbler
Joined
Dec 14, 2021
Messages
46
To me it is total nonsense (bordering on criminal) to sell up to 22TB NAS PRO drives with a URE rate of 1e-14

Now that I understand what that means and to look for it... ew... Not a fan of what that could mean.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
22TB NAS PRO drives with a URE rate of 1e-14: we are talking about over a 75% chance of a URE while at 80% capacity.
...which of course presumes that the specs are the best they're going to do. Meanwhile, history suggests this is not the case--if it were, we'd regularly be seeing data errors (even corrected data errors would show up as a degraded pool) with such disks, and we just aren't. Which goes back to my belief that the risks of RAIDZ1, while present, are exaggerated.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
...which of course presumes that the specs are the best they're going to do.
That value is the declared (minimum) standard. Being the only data acquired in a scientific way, it's the only figure usable for such calculations.

There is an immense lack of independent, scientific (which means reliable) data regarding HDDs. This hurts us consumers badly.

Which goes back to my belief that the risks of RAIDZ1, while present, are exaggerated.
I agree, but it's hard to quantify without proper numbers.
 
Last edited:

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Just wanted to give everyone a heads-up since a revised draft of said resource is out, awaiting peer review and any kind of contribution.

To explain a few concepts it relies heavily on some of the comments in this thread, and goes into detail explaining drive failure rates, UREs, how both merge into a risk assessment, and what the implications are, all the while quoting numerous sources and articles. It also briefly (for now) touches on resilvering and its effects on mirror vs RAIDZ layouts.
 