Increasing capacity HDD and Resilver

gianfelicevincenzo · Jan 12, 2023

Hi. I currently have 4 HDDs of 2 TB each (pool with raidz-1). I was wondering if you could expand these 4 HDDs with a larger capacity in the future. I've also heard of the resilver. I read that it "copies data"... but in what sense? What is this resilver actually for? Thank you

danb35 · Jan 12, 2023

vincentj said:
I was wondering if you could expand these 4 HDDs with a larger capacity in the future.

Of course, the manual explains how to do it. The disk replacement process automatically resilvers (rebuilds the array); that isn't something you need to do manually.

artlessknave · Jan 12, 2023

vincentj said:
(pool with raidz-1)

make sure you have backups, though, as a re-silvering raidz1 has a good chance of killing another drive before it completes. at 2TB per drive you are at the limit of what is generally considered reasonably safe for raidz1. upgrading will take you past that.

backups are a good idea anyway, but especially with raidz1 on large HDDs

Davvo · Jan 12, 2023

Here is a more in-depth analysis of what @artlessknave wrote.

Why RAID 5 stops working in 2009

The storage version of Y2k? No, it's a function of capacity growth and RAID 5's limitations.

www.zdnet.com

RAID5 is the hardware analogue of RAIDZ1.

danb35 · Jan 12, 2023

Now, here's the thing: that article's obviously been around for a long time. And despite our advice, lots of folks are still using single-parity RAID--and their pools aren't just falling over and dying. Is double-parity safer? Of course it is. But "re-silvering raidz1 has a good chance of killing another drive" is just FUD.

artlessknave · Jan 12, 2023

part of why I say that is because in the last week or so I told a user their pool was gone. it was raidz1.
1 drive died and a 2nd drive died before they even got a replacement. poof.
it does happen. that particular article seems to be more about read errors, which definitely occur.
while it's unlikely overall, people who have no preparation for it, and thought that RAID = backup, are usually the ones who suffer.
it's one thing to know about it and that there is a risk, it's entirely another to be blindsided with it.

artlessknave · Jan 12, 2023

I'm also an actual backup admin in the enterprise space, so you can imagine I get....twitchy when there are no backups.

gianfelicevincenzo · Jan 13, 2023

Unfortunately (at least for the moment) I can't afford another HDD for backup. But exactly, what is resilver? My aim is to change the 4x2TB HDDs to 4x4TB HDDs. Also I've read that resize is not "feasible" in zfs even though it is technically possible. In any case I would like to physically replace the ones I currently have with higher-capacity HDDs.

jgreco · Jan 13, 2023

danb35 said:
But "re-silvering raidz1 has a good chance of killing another drive" is just FUD.

Sort of. But relying on resilvering to expand your RAIDZ1 array does put you in the position of removing the pool's redundancy while this long operation takes place, and people have been known to make inadvertent mistakes like messing with the wrong drive. This, plus the usual warnings that ZFS has no way to correct certain types of errors once introduced into the pool, should be enough to give one pause before blindly charging forward.

vincentj said:
But exactly, what is resilver?

It's the process of rebuilding the pool after a failure. In your case, to increase the pool size, you yank one drive, put in the new drive, wait maybe 12 hours while it does its stuff, then you proceed to the next and the third and the fourth drive. During this time, you have no redundancy, so anything bad that happens, you now permanently own the bad thing. This is why we advocate RAIDZ2.
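A minimal sketch of that one-disk-at-a-time sequence, assuming a pool named `tank` and placeholder device names (both hypothetical, nothing here comes from the poster's system). It only builds the `zpool` command lines so they can be reviewed; it does not run anything, and on TrueNAS the documented disk replacement procedure is the route to follow.

```python
def replacement_plan(pool, swaps):
    """Return the zpool commands for swapping disks one at a time.

    pool  -- pool name (hypothetical, e.g. "tank")
    swaps -- list of (old_device, new_device) pairs, in replacement order
    """
    # Let the vdev grow once every member has been replaced with a larger disk.
    plan = [f"zpool set autoexpand=on {pool}"]
    for old, new in swaps:
        # Start the resilver onto the new disk.
        plan.append(f"zpool replace {pool} {old} {new}")
        # Re-check this until the resilver is reported complete,
        # and only then move on to the next disk.
        plan.append(f"zpool status {pool}")
    return plan


if __name__ == "__main__":
    # Device names are placeholders, not taken from the thread.
    for cmd in replacement_plan("tank", [("ada0", "ada4"), ("ada1", "ada5")]):
        print(cmd)
```

The point of listing the steps per disk is the wait in the middle: the next `zpool replace` should not be issued until `zpool status` shows the previous resilver has finished.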

artlessknave · Jan 13, 2023

re-silver is the zfs term for rebuilding the data on either a new drive or a drive that went missing, bringing it up to date with the pool. in order to do what you intend, you will need to re-silver 4 new drives into the pool, one by one.

vincentj said:
not "feasible" in zfs even though it is technically possible

you cannot resize a raidz vdev. it's not "infeasible", it's actually impossible at this time. the number of drives and raidz level are set at vdev creation. you can achieve more storage in the vdev by replacing all the drives with larger ones, which is what you are trying to do.
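as a rough illustration of why all the drives have to be replaced before you see any extra space, here is a back-of-the-envelope sketch. it assumes the usual rule of thumb that a raidz vdev treats every member as if it were the size of the smallest one, and it ignores ZFS metadata and padding overhead, so real numbers will be somewhat lower. the function name is made up for the example.

```python
def raidz_usable_tb(drive_sizes_tb, parity=1):
    """Rough usable capacity of one raidz vdev, ignoring ZFS overhead."""
    if len(drive_sizes_tb) <= parity:
        raise ValueError("need more drives than parity disks")
    # Every member counts as if it were the size of the smallest drive.
    return (len(drive_sizes_tb) - parity) * min(drive_sizes_tb)


print(raidz_usable_tb([2, 2, 2, 2]))  # 6
print(raidz_usable_tb([4, 4, 4, 2]))  # 6
print(raidz_usable_tb([4, 4, 4, 4]))  # 12
```

with three of the four drives swapped the estimate is still 6 TB. the extra space only shows up once the last 2TB drive is gone.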

while each drive is re-silvering, your pool will have absolutely zero redundancy. if anything else goes wrong, the pool will likely not survive.

danb35 · Jan 14, 2023

jgreco said:
But relying on resilvering to expand your RAIDZ1 array does put you in the position of removing the pool's redundancy while this long operation takes place

...unless you have a spare SATA port, in which case you can resilver in the new disk without taking the old one offline. I've heard this slows down the resilvering, but it preserves what redundancy you have. I'm not sure if that preservation of redundancy would make it worth using something like a USB drive dock in case you don't have a spare SATA port, but I guess it's something that could be considered.

I'm no fan of RAIDZ1, any more than anyone else here--but I think there's a tendency to go overboard in recommending against it, with frequent statements that it's a near-guarantee that the pool will die on resilver. And that's neither true nor helpful.

rvassar · Jan 14, 2023

Three thoughts:

1. After replacing all the drives in a RAIDz1, and completing the resilvering process one by one... You can in fact expand the pool capacity to use the additional space provided by the new drives. (provided you use larger drives...) These replacement drives need not occupy the same SATA or SAS port as the drive being replaced. As long as the boot pool can actually initiate the boot and the AHCI / SAS ports have driver support, ZFS just finds the roaming drives and stitches everything back together. (Just remember: SATA controllers don't talk to SAS drives, all SAS controllers talk to SATA drives)

2. The reason that 2007 article's predicted 2009 apocalypse hasn't panned out is that it was written to a specific statistical benchmark, and that benchmark moved. The uncorrectable read error rate of 1 in 10^14 bits has for many manufacturers become 1 in 10^16 due to part commonality. The SATA spec requires 1 in 10^14, but the SAS spec required 1 in 10^16 even back in 2013. It's subtle, but significant. And remember SATA is a dead end; nobody is working on it anymore with a goal of advancing the state of the art. SAS is still being developed, so 1 in 10^16 is the benchmark. (Actually I need to check and see what SAS4 requires...)

3. The resilver workload killing another drive is a good reason to keep a mix of drive models and ages in your pool. If they're all the same model, and all have the same number of hours accrued in the same environment, then this is a very real risk. But the ZFS scrub should be every bit as stressful as a resilver.
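To put numbers on point 2, here is a hedged sketch of the kind of arithmetic behind that style of prediction. It assumes independent bit errors occurring exactly at the quoted spec rate, and that resilvering one disk of a 4-wide RAIDZ1 reads the other three disks in full. Both assumptions are pessimistic (ZFS only reads allocated blocks, and real drives often do better than the spec figure), and the function name is made up for the example.

```python
import math


def p_unrecoverable_read(bytes_read, bits_per_error):
    """Chance of at least one unrecoverable read error over bytes_read,
    assuming independent errors at a rate of 1 per bits_per_error bits."""
    bits = bytes_read * 8
    # 1 - (1 - 1/r)^bits, computed stably for very small per-bit rates.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


TB = 10**12
# Resilvering one disk of a 4-wide raidz1 reads the other three disks.
for size_tb in (2, 4):
    read = 3 * size_tb * TB
    print(size_tb, "TB drives:",
          round(p_unrecoverable_read(read, 1e14), 3),
          round(p_unrecoverable_read(read, 1e16), 4))
```

Under those assumptions the 1 in 10^14 figure gives roughly a 38% chance for 2 TB drives and roughly 62% for 4 TB drives, while 1 in 10^16 keeps both under 1%. That two-orders-of-magnitude shift is the whole difference between the article's prediction and what people actually see.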
