GELI: replacing a failed disk with one of a bigger size.

Status
Not open for further replies.

panz

Guru
Joined
May 24, 2013
Messages
556
So, I'm still studying GELI and its behavior in the event of a disk failure.

Situation:
1) You have, say, a 6-disk RAIDZ2 pool, encrypted.
2) Something happens to one disk, including GELI metadata corruption (see https://bugs.freenas.org/issues/2375#ticket).
3) You replace the failed disk with a larger one.

Now you have to resilver the pool, but before that you have to supply the correct GELI metadata (remember: you lost it, for whatever reason, in step 2). geli restore will not work, because it can't find the metadata area on the disk (the size is now different).

At this point, I'm lost :) (geli resize won't work either, because the disk is not in the pool).
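
Some background that may help here: GELI keeps its metadata in the last sector of the provider, so where that sector sits depends on the provider's size. A minimal sketch of the arithmetic, assuming two made-up partition sizes and a 512-byte sector (the function name and the numbers are illustrative only):

```shell
#!/bin/sh
# Byte offset of the last sector of a provider, which is where GELI
# stores its metadata. Arguments: provider size in bytes, sector size.
geli_meta_offset() {
    echo $(( $1 - $2 ))
}

# Hypothetical sizes: an "old" 2 TB partition and a "new" 3 TB one.
old_size=2000000000000
new_size=3000000000000
sector=512   # assumed; the real sector size depends on the provider

echo "old metadata offset: $(geli_meta_offset "$old_size" "$sector")"
echo "new metadata offset: $(geli_meta_offset "$new_size" "$sector")"
```

The two offsets differ, which is all this shows: metadata written for one provider size sits at a different place on a larger provider. It says nothing about whether the old disk's metadata is needed on a replacement at all.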
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
Situation:
1) You have, say, a 6-disk RAIDZ2 pool, encrypted.
2) Something happens to one disk, including GELI metadata corruption (see https://bugs.freenas.org/issues/2375#ticket).
3) You replace the failed disk with a larger one.

Now you have to resilver the pool, but before that you have to supply the correct GELI metadata (remember: you lost it, for whatever reason, in step 2).
This doesn't make any sense. Why do you care about the geli metadata on the failed disk? It absolutely doesn't matter -- you are going to replace that disk anyway. Every disk has its own independent geli metadata.
You can do the replacement in the GUI, there is no need for any CLI magic, no need to "input the correct GELI metadata", just follow the manual and it works: Replacing a Failed Drive in an Encrypted Pool
 

panz

Guru
Joined
May 24, 2013
Messages
556
If you simulate a metadata "accident" by running <geli clear /dev/daX> on the vdevs, you're still safe as long as you backed up your geli metadata (<geli backup _some_provider_here>, a feature FreeNAS still lacks).

But if a disk gets replaced and the resilver is started (this may happen if you're away and someone else swaps the failed disk for a bigger one), the resilver still works, but the pool is inaccessible because you can't restore the metadata onto the bigger disk.
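
The drill described above (back up, clear, restore) can be sketched as follows. This is a dry run that only prints the commands; the provider and the backup path are placeholders, and the real thing belongs in a throwaway VM, never on a disk holding data you care about. The restore here targets the same provider the backup came from, so the size question does not arise.

```shell
#!/bin/sh
# Print (do not run) the backup / clear / restore drill for one provider.
# Both arguments are placeholders chosen for illustration.
geli_drill_cmds() {
    prov=$1      # provider to test on, e.g. a scratch partition in a VM
    backup=$2    # file that will hold the copy of the metadata
    echo "geli backup $prov $backup"    # save the metadata to a file
    echo "geli clear $prov"             # wipe the metadata (the "accident")
    echo "geli restore $backup $prov"   # write the saved metadata back
}

geli_drill_cmds /dev/da1p1 /root/da1p1.eli.meta
```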
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
If you simulate a metadata "accident" by running <geli clear /dev/daX> on the vdevs, you're still safe as long as you backed up your geli metadata (<geli backup _some_provider_here>, a feature FreeNAS still lacks).
I know, I also created a bug report about that: https://bugs.freenas.org/issues/3206
But if a disk gets replaced and the resilver is started (this may happen if you're away and someone else swaps the failed disk for a bigger one), the resilver still works, but the pool is inaccessible because you can't restore the metadata onto the bigger disk.
How does that somebody else "replace some disk and commit the resilvering"?
If you do it via the GUI and follow the manual, it works properly. The new device will be partitioned, initialized with geli, and the encrypted partition will be replaced into the pool. It really does not matter if the geli metadata on the old drive was corrupted -- you are replacing that drive. Try it in a VM.
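
Roughly, the steps listed above (partition, geli init, replace into the pool) correspond to commands like the ones below. This is a hand-written approximation, not what the FreeNAS GUI actually runs: the pool name, device names, key path, and single-partition layout are all assumptions, and anything FreeNAS adds on top is left out. The function only prints the commands.

```shell
#!/bin/sh
# Print an approximate manual equivalent of a GUI disk replacement.
# Arguments (all hypothetical): pool, old pool member, new disk, key file.
replace_cmds() {
    pool=$1; old=$2; disk=$3; key=$4
    part="${disk}p1"                              # first partition on the new disk
    echo "gpart create -s gpt $disk"              # new GPT partition table
    echo "gpart add -t freebsd-zfs $disk"         # one ZFS partition
    echo "geli init -s 4096 -P -K $key $part"     # fresh metadata, key file only
    echo "geli attach -p -k $key $part"           # creates ${part}.eli
    echo "zpool replace $pool $old $part.eli"     # resilver onto the new provider
}

replace_cmds tank da3p1.eli da6 /path/to/pool.key
```

The point to notice is the geli init line: the new partition gets its own freshly written metadata, so nothing from the failed disk's metadata is reused.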
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I think someone got bored and is thinking so far outside the box he's lost track of where the box is....(looks at panz)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
You should be able to do disk replacements per the manual. I wrote the section on replacing disks in encrypted pools. You shouldn't have to do anything from the CLI to make this all work, aside from the initial step of checking that the pool's autoexpand property is set.
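
The autoexpand check mentioned above comes down to one pool property. A small sketch, assuming a pool named tank (a placeholder); it prints the commands rather than running them:

```shell
#!/bin/sh
# Print the commands to inspect and enable autoexpand on a pool.
autoexpand_cmds() {
    pool=$1
    echo "zpool get autoexpand $pool"      # show the current value
    echo "zpool set autoexpand=on $pool"   # enable it if it is off
}

autoexpand_cmds tank
```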
 

panz

Guru
Joined
May 24, 2013
Messages
556
You should be able to do disk replacements per the manual. I wrote the section on replacing disks in encrypted pools. You shouldn't have to do anything from the CLI to make this all work, aside from the initial step of checking that the pool's autoexpand property is set.

Thank you, I noticed that well-written section. I'm going to do only one thing from the CLI (I'm the paranoid type, you know...): geli backup.

After some testing I'm going to build my definitive machine (I'll ask for advice next month) and let the hardware in my signature become a Windows workstation :)
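
A geli backup pass over every encrypted member of a pool might look like the sketch below. The destination directory, provider names, and file suffix are assumptions. By default the script only prints the commands; setting GELI=geli in the environment would make it run them for real.

```shell
#!/bin/sh
# GELI defaults to "echo geli", so commands are printed unless overridden.
GELI=${GELI:-"echo geli"}

# Back up the metadata of each provider into the given directory.
# Usage: backup_all DEST_DIR PROVIDER...
backup_all() {
    dest=$1; shift
    for prov in "$@"; do
        # One backup file per provider, named after the device node.
        $GELI backup "$prov" "$dest/$(basename "$prov").eli.meta"
    done
}

backup_all /root/geli-backups /dev/da0p1 /dev/da1p1
```

The backup files are only useful if they live somewhere other than the encrypted pool they describe.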
 