ELI5: Why can't we add/remove disks to raidz vdevs?

Toydoll

Dabbler
Joined
Sep 17, 2015
Messages
33
Hi

TrueNAS and ZFS can do some pretty darn advanced stuff, and yet it's not possible to add disks to, or remove disks from, a vdev that uses raidz. How come?

The more I think about it, the less I understand it. I'm not stupid (in that way); if it were an easy fix it would have been done ages ago. But why is it so hard? Could someone explain like I'm five (not literally, but almost)?

In my uneducated mind it sounds like adding a disk would be a two-step process:
1. Add the disk to the vdev.
2. Ask TrueNAS to spread the data so it incorporates the new disk.

Removing a disk would require some conditions to be fulfilled but if they were it wouldn't be that hard either. A simplified example:
I have six 10 TB disks in a raidz2, which totals 40 TB of usable space. However, I only have 20 TB used on the disks. Then in my mind I would be able to:
1. Let TrueNAS know that I plan to remove two drives but still keep the vdev and raidz2.
2. TrueNAS then moves those 20 TB to the four drives that will still be in use.
3. After some time the data has been moved and I now have two empty drives that are no longer in the vdev.
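
The capacity arithmetic in this example can be sketched in a few lines of Python. This is only an idealized estimate, assuming usable space is simply (disks - parity) x disk size; the function name `raidz_usable_tb` is made up for illustration, and a real pool loses additional space to metadata, padding, and reserved space.

```python
def raidz_usable_tb(disks: int, disk_tb: float, parity: int) -> float:
    """Idealized usable capacity of one raidz vdev (assumption: no overhead)."""
    if disks <= parity:
        raise ValueError("a raidz vdev needs more disks than parity devices")
    return (disks - parity) * disk_tb


used_tb = 20

current = raidz_usable_tb(disks=6, disk_tb=10, parity=2)  # 40 under this model
shrunk = raidz_usable_tb(disks=4, disk_tb=10, parity=2)   # 20 under this model

# The removal could only be considered if the used data fits the smaller vdev.
fits = used_tb <= shrunk
print(current, shrunk, fits)
```

Under this idealized model the 20 TB fits on four drives only with nothing to spare, so in practice the condition for removal would be tighter than the example suggests.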

Again, I do understand that it's not as easy as it sounds, but why not? I found some info about it when I searched, but it was way above my current knowledge. I would like a simplified version if there is one.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Before I longpost on this - did you find Matt Ahrens' slide deck and video from the 2017 OpenZFS developer summit?



This breaks it down fairly well, including why they have to do so much testing even for adding new vdev members. There's a risk of data loss if you're too hasty with the reflow of data; and even without that, checksums aren't verified during these operations.

(Plus it will probably fragment the hell out of your pool.)
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
The work has progressed on this and there's even a working pre-release version (of course not to be used on any pool you care about losing).

I don't know how long it will take for that version to be sufficiently tested, or whether it's even a good thing to do in the end (see the final comment from @HoneyBadger), but the function will come one day.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I think the simple answer here is that when Sun Microsystems sold you a server with a shelf of disks, removal wasn't a thing because there was no "slider" that would narrow the width of the shelf of disks, and when you added a shelf, that's a bunch of disks and you'd be adding it as a vdev.

One of the things that happens when designing complicated systems is that sometimes you do not allow for every possible thing that could ever possibly be wanted at some point way far down the road.

Unfortunately, ZFS has a bunch of complicated features that involve laying down data in certain ways while still being efficient about it. Generalizing that is not impossible, but it makes the thing significantly harder to write. Too many requirements can kill a project.
 