Increasing capacity HDD and Resilver

vincentj

Dabbler
Joined
Nov 16, 2022
Messages
42
Hi. I currently have 4 HDDs of 2 TB each (pool with raidz-1). I was wondering if you could expand these 4 HDDs with a larger capacity in the future. I've also heard of resilvering. I read that it "copies data", but in what sense? What is a resilver actually for? Thank you.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I was wondering if you could expand these 4 HDDs with a larger capacity in the future.
Of course, the manual explains how to do it. The disk replacement process automatically resilvers (rebuilds the array); that isn't something you need to do manually.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
(pool with raidz-1)


make sure you have backups, though, as a re-silvering raidz1 has a good chance of killing another drive before it completes. with 2TB drives you are at the limit of what is generally considered reasonably safe for raidz1. upgrading to larger drives will take you past that.

backups are a good idea anyway, but especially with raidz1 on large HDDs
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Here is a more in-depth analysis of what @artlessknave wrote.

RAID5 is the hardware analogue of RAIDZ1.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Now, here's the thing: that article's obviously been around for a long time. And despite our advice, lots of folks are still using single-parity RAID--and their pools aren't just falling over and dying. Is double-parity safer? Of course it is. But "re-silvering raidz1 has a good chance of killing another drive" is just FUD.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
part of why I say that is because in the last week or so I told a user their pool was gone. it was raidz1.
1 drive died and a 2nd drive died before they even got a replacement. poof.
it does happen. that particular article seems to be more about read errors, which definitely occur.
while it's unlikely overall, people who have no preparation for it, and thought that RAID = backup, are usually the ones who suffer.
it's one thing to know about it and that there is a risk, it's entirely another to be blindsided with it.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
I'm also an actual backup admin in the enterprise space, so you can imagine I get....twitchy when there are no backups.
 

vincentj

Dabbler
Joined
Nov 16, 2022
Messages
42
Unfortunately (at least for the moment) I can't afford another HDD for backup. But exactly, what is resilver? My aim is to change the 4x2 TB HDDs to 4x4 TB HDDs. Also, I've read that resize is not "feasible" in zfs even though it is technically possible. In any case, I would like to physically replace the drives I currently have with higher-capacity HDDs.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
But "re-silvering raidz1 has a good chance of killing another drive" is just FUD.

Sort of. But relying on resilvering to expand your RAIDZ1 array does put you in the position of removing the pool's redundancy while this long operation takes place, and people have been known to make inadvertent mistakes like messing with the wrong drive. This, plus the usual warnings that ZFS has no way to correct certain types of errors once introduced into the pool, should be enough to give one pause before blindly charging forward.

But exactly, what is resilver?

It's the process of rebuilding the pool after a failure. In your case, to increase the pool size, you yank one drive, put in the new drive, wait maybe 12 hours while it does its stuff, then you proceed to the next and the third and the fourth drive. During this time, you have no redundancy, so anything bad that happens, you now permanently own the bad thing. This is why we advocate RAIDZ2.
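
As a rough sketch of that one-disk-at-a-time sequence, the helper below only builds the list of commands in order; it does not run anything. The pool name `tank` and the `old1`/`new1` device labels are hypothetical placeholders, and this assumes a plain command-line `zpool` workflow. On an appliance, follow the procedure in the manual instead.

```python
def replacement_plan(pool, old_disks, new_disks):
    """Return, in order, the commands for a one-disk-at-a-time upgrade.

    This is a dry-run planner: it only produces strings for a human to
    review. Pool and device names are whatever the caller passes in.
    """
    if len(old_disks) != len(new_disks):
        raise ValueError("need exactly one new disk per old disk")

    # Allow the vdev to grow once every member has been replaced.
    plan = [f"zpool set autoexpand=on {pool}"]

    for old, new in zip(old_disks, new_disks):
        # Swap a single disk; ZFS starts the resilver automatically.
        plan.append(f"zpool replace {pool} {old} {new}")
        # Check progress and do not touch the next disk until it is done.
        plan.append(f"zpool status {pool}")
    return plan


if __name__ == "__main__":
    olds = ["old1", "old2", "old3", "old4"]  # hypothetical labels
    news = ["new1", "new2", "new3", "new4"]  # hypothetical labels
    for command in replacement_plan("tank", olds, news):
        print(command)
```

The point of the structure is the strict alternation: one replace, then a wait on the status output, before the next disk is touched.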
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
re-silver is the zfs term for rebuilding the data on either a new drive or a drive that went missing, bringing it up to date with the pool. in order to do what you intend, you will need to re-silver 4 new drives into the pool, one by one.
not "feasible" in zfs even though it is technically possible
you cannot resize a raidz vdev. it's not "infeasible", it's actually impossible at this time. the number of drives and raidz level are set at vdev creation. you can achieve more storage in the vdev by replacing all the drives with larger ones, which is what you are trying to do.

while each drive is re-silvering, your pool will have absolutely zero redundancy. if anything else goes wrong, the pool will likely not survive.
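
To make the capacity side concrete, here is a back-of-the-envelope sketch. It assumes usable space is roughly (number of disks minus parity) times the smallest disk, and ignores metadata, padding and TB/TiB differences, so real numbers will come out lower. The function name is made up for illustration.

```python
def raidz_usable_tb(disk_sizes_tb, parity=1):
    """Rough usable capacity of a single raidz vdev, in TB.

    Every member is treated as being as large as the smallest disk,
    which is why the extra space only appears after the last swap.
    """
    if len(disk_sizes_tb) <= parity:
        raise ValueError("a raidz vdev needs more disks than parity")
    return (len(disk_sizes_tb) - parity) * min(disk_sizes_tb)


if __name__ == "__main__":
    stages = [
        [2, 2, 2, 2],  # today: 4x2 TB
        [4, 2, 2, 2],  # after the first swap
        [4, 4, 4, 2],  # after the third swap
        [4, 4, 4, 4],  # after the last swap
    ]
    for sizes in stages:
        print(sizes, "->", raidz_usable_tb(sizes), "TB usable (approx.)")
```

Under these assumptions the vdev stays at about 6 TB through the first three swaps and only reaches about 12 TB once the fourth drive has been replaced.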
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
But relying on resilvering to expand your RAIDZ1 array does put you in the position of removing the pool's redundancy while this long operation takes place
...unless you have a spare SATA port, in which case you can resilver in the new disk without taking the old one offline. I've heard this slows down the resilvering, but it preserves what redundancy you have. I'm not sure if that preservation of redundancy would make it worth using something like a USB drive dock in case you don't have a spare SATA port, but I guess it's something that could be considered.

I'm no fan of RAIDZ1, any more than anyone else here--but I think there's a tendency to go overboard in recommending against it, with frequent statements that it's a near-guarantee that the pool will die on resilver. And that's neither true nor helpful.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Three thoughts:

1. After replacing all the drives in a RAIDZ1 one by one, and letting each resilver complete, you can in fact expand the pool capacity to use the additional space provided by the new drives (provided you use larger drives...). These replacement drives need not occupy the same SATA or SAS port as the drive being replaced. As long as the boot pool can actually initiate the boot and the AHCI / SAS ports have driver support, ZFS just finds the roaming drives and stitches everything back together. (Just remember: SATA controllers don't talk to SAS drives, but all SAS controllers talk to SATA drives.)

2. The reason that 2007 article's predicted 2009 apocalypse hasn't panned out is that it was written to a specific statistical benchmark, and that benchmark moved. The uncorrectable read error rate of 1 in 10^14 bits has for many manufacturers become 1 in 10^16 due to part commonality. The SATA spec requires 1 in 10^14, but the SAS spec required 1 in 10^16 even back in 2013. It's subtle, but significant. And remember SATA is a dead end; nobody is working on it anymore with a goal of advancing the state of the art. SAS is still being developed, so the 1 in 10^16 is the benchmark. (Actually I need to check and see what SAS4 requires...)
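
To see how much that benchmark shift matters, here is a small sketch of the simplified model such articles tend to use. It assumes every bit read fails independently at the quoted rate and that the whole of each surviving disk is read; real drives and real resilvers (which only read allocated data) will behave differently, so treat the output as an illustration of the arithmetic, not a prediction. The function name is made up for illustration.

```python
import math


def p_at_least_one_ure(bytes_read, ure_per_bit):
    """Probability of at least one unrecoverable read error.

    Computes 1 - (1 - rate) ** bits, using log1p/expm1 so the result
    stays accurate for very small rates.
    """
    bits = bytes_read * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    # Resilvering one disk of a 4-disk RAIDZ1 reads the other three.
    for disk_tb in (2, 4):
        bytes_read = 3 * disk_tb * 1e12
        for rate in (1e-14, 1e-16):
            p = p_at_least_one_ure(bytes_read, rate)
            print(f"3 x {disk_tb} TB read, rate {rate:g}: {p:.2%}")
```

With three full 2 TB disks read, this model gives roughly 38% at 1 in 10^14 and about half a percent at 1 in 10^16, which is why the quoted rate matters so much to the conclusion.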

3. The resilver workload killing another drive is a good reason to keep a mix of drive models and ages in your pool. If they're all the same model, and all have the same number of hours accrued in the same environment, then this is a very real risk. But the ZFS scrub should be every bit as stressful as a resilver.
 