Transplanting damaged drives from encrypted pool, can anything be salvaged?

Status
Not open for further replies.

threatbutt

Cadet
Joined
Aug 20, 2018
Messages
5
Hi all,

My NAS endured something of a catastrophe during a hasty relocation a year ago. My backups were lost/stolen and the machine itself was heavily damaged. I gave it a year but the backups haven't surfaced so I'm having to write them off and address the pile of junk I have left.

Before disposing of the machine I pulled 4 drives. Two of them had their SATA data connectors snapped off. I restored one to a state in which I could clone it with dd but the other is clicking at startup so I'm putting it aside for now.
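For anyone in a similar spot, here is a minimal sketch of cloning a flaky disk block by block, roughly in the spirit of `dd conv=noerror,sync`. The post only says `dd` was used, so the function name, block size, and zero-padding behaviour are illustrative assumptions, not what was actually run. It also assumes the source's size can be found by seeking to its end, which may not hold for every device node.

```python
import os


def clone_with_padding(src_path, dst_path, block_size=64 * 1024):
    """Copy src to dst block by block; unreadable blocks are written as zeros.

    Returns the byte offsets of blocks that raised a read error.
    Hypothetical helper for illustration only.
    """
    bad_offsets = []
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Assumption: seeking to the end reports the size of the source.
        total = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < total:
            want = min(block_size, total - offset)
            try:
                src.seek(offset)
                chunk = src.read(want)
            except OSError:
                # Read failed: record the offset and substitute zeros.
                chunk = b""
                bad_offsets.append(offset)
            # Pad short or failed reads so offsets in the image stay aligned.
            dst.write(chunk.ljust(want, b"\0"))
            offset += want
    return bad_offsets
```

Keeping the offsets aligned matters more than the lost blocks themselves, since a redundant pool may be able to repair a zeroed region but cannot cope with data shifted out of place.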

I bought a used Dell R710+H200 to try to salvage what I can. I still had the USB key with my original FreeNAS install on it. The BIOS and GUI recognize the two original disks and the single cloned disk. I'd like to get to a state where I have images of all three disks to toss up to S3 and back up locally, but I'm trying to come up with a recovery plan before I spin them up again. If I can get the array up again, it would be much faster to recover select files than to clone/image the disks.

The first of my problems is that the volume I built is encrypted. I don't remember the passphrase exactly but could likely brute-force it within a few dozen tries, but have not figured out a safe way to do this offline yet. I have geli_recovery.key. It fails to apply because all the gptids are now different. For what it's worth, I get an error that "4 drives failed to decrypt" even though there are only 3 present.

The second of my problems is that I don't remember how I designed my pool. I think it was RAIDZ, but don't remember whether it had 1 or 2 parity drives (RAIDZ1 or RAIDZ2)-- I might have started with 5, lost one before the move, now lost another after it, but who knows. Could also have started with 4 and now lost 1. It's anybody's guess. I don't know if the GELI error above reveals anything related to this. Either way, with everything being encrypted and locked, `zpool status` is telling me nothing.

My third problem is that in trying to re-import my volume, the disks available only number 2. There's da0p2 and da1p2 I believe. I'm thinking there should also be a da2p2 partition, likely on that cloned drive, that failed to duplicate? Unless I'm misinterpreting something. FWIW the original volume is still available to be detached, but I hesitated to do this because I didn't want to screw things up even more in my desperation.

I would appreciate any help that might get me some closure from this mess.

Thanks!
 

dlavigne

Guest
That does sound like a mess... Did you have any luck with it?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
> It fails to apply because all the gptids are now different.
That doesn't make sense. Not only would they not change if you have the original disks or cloned them, they're irrelevant for decryption.
 

threatbutt

Cadet
Joined
Aug 20, 2018
Messages
5
> That does sound like a mess... Did you have any luck with it?

Not yet. I imaged the three physical drives and the original FreeNAS install drive, so I'm able to at least try to reconstruct the source environment virtually.

The da*p2 business is hosing my efforts in the GUI; I don't trust it, especially given that all the CLI tools can see the disk just fine. I'm currently reading up on how GELI and ZFS work so I can try to decrypt and mount everything manually.

> That doesn't make sense. Not only would they not change if you have the original disks or cloned them, they're irrelevant for decryption.

Hmm, ok. I had gotten that impression based on some fleeting error message, but good to know, thanks.
 

threatbutt

Cadet
Joined
Aug 20, 2018
Messages
5
Well, I started with a smashed chassis containing 4 drives, 2 of which were damaged, all of which were GELI-encrypted with one of three different keys, using a passphrase I couldn't fully remember in a configuration completely forgotten.

I could not find any way to divine the zpool configuration while the volume was locked, so I set about tackling that problem first. I knew most/all of the terms in the passphrase give or take a few characters, so I wrote a script to brute-force every combination of candidate passphrase and GELI/recovery key against one of the encrypted partitions. It basically ran `geli attach -k /path/to/oneofthe/geli.key -j /path/to/tmpfile/containing/generated/passphrase /dev/ada1p2` over and over until something other than "GELI: invalid whatever" was returned. It is a slow process and not suited to brute-forcing completely unknown passwords, but since in this case it's basically a dictionary attack with known parameters, I figured it should solve itself long before the heat death of the universe.
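A minimal sketch of what such a script might look like, for anyone attempting the same thing. This is not the script that was actually used: the function names, the way terms are combined, and the use of the exit status (rather than matching the error text) are all assumptions. The only part taken from the post is the `geli attach -k <keyfile> -j <passfile> <provider>` invocation.

```python
import itertools
import subprocess
import tempfile


def candidate_passphrases(terms, separators=("",), max_terms=None):
    """Yield each ordering of each non-empty subset of terms, per separator.

    Hypothetical generator; a real run would also need the "modifiers"
    (case changes, digit swaps, etc.) that are not modelled here.
    """
    seen = set()
    limit = max_terms or len(terms)
    for n in range(1, limit + 1):
        for combo in itertools.permutations(terms, n):
            for sep in separators:
                phrase = sep.join(combo)
                if phrase not in seen:  # single terms repeat across separators
                    seen.add(phrase)
                    yield phrase


def try_geli_attach(keyfile, passphrase, provider):
    """Run one attach attempt; assumes geli exits 0 only on success."""
    with tempfile.NamedTemporaryFile("w", suffix=".pass") as tmp:
        tmp.write(passphrase)
        tmp.flush()
        result = subprocess.run(
            ["geli", "attach", "-k", keyfile, "-j", tmp.name, provider],
            capture_output=True,
            text=True,
        )
    return result.returncode == 0


def search(keyfiles, terms, provider, attach=try_geli_attach, **kwargs):
    """Try every keyfile/passphrase pair; return the first that attaches."""
    for keyfile in keyfiles:
        for phrase in candidate_passphrases(terms, **kwargs):
            if attach(keyfile, phrase, provider):
                return keyfile, phrase
    return None
```

The `attach` parameter is injectable so the candidate generation can be checked against a stand-in before it is pointed at a real provider, ideally a cloned image rather than an original disk.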

Over a week later (serial GELI password cracking is not for the impatient ;) ) I got a match. The key I recovered from the FreeNAS /data/ directory worked, plus one of the combinations of passphrase terms and modifiers.

I tried plugging the passphrase in to the original FreeNAS unlock panel but it wouldn't decrypt; there were still errors about gptids mismatching or something, so I said to hell with it. I'm running cloned versions of everything in a virtualized environment with completely different hardware; I won't pretend like I expect everything to play nicely in its original configuration. Let's start over. Detached volume.

Went to go reimport the volume. Still only had the option of those two partitions, but I did a Hail Mary, selected those two partitions, fed FreeNAS the original GELI key and the cracked passphrase, and waited.

It unlocked a RAIDZ2-0 volume in a degraded but functional state.

(I remember heavily weighing the pros and cons of RAIDZ1 when I first set it up, but didn't remember whether or not I had actually gone with it, hence my initial concern. Hooray for erring on the side of being risk-averse...)
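The arithmetic behind that relief, as a minimal sketch: a RAIDZ vdev with `parity` parity disks stays usable as long as no more than `parity` members are missing. The function name and the state labels are illustrative, and the 5-disk width used in the examples is only the guess from the first post, not a confirmed figure.

```python
def raidz_state(parity, total_disks, present_disks):
    """Classify a single RAIDZ vdev by how many member disks are missing.

    Hypothetical helper: the returned labels are descriptive strings,
    not literal `zpool status` output.
    """
    if parity not in (1, 2, 3):
        raise ValueError("RAIDZ parity level must be 1, 2, or 3")
    missing = total_disks - present_disks
    if missing < 0:
        raise ValueError("more disks present than the vdev contains")
    if missing == 0:
        return "healthy"
    if missing <= parity:
        return "degraded"  # still importable, redundancy reduced or gone
    return "unavailable"   # too many members missing to reconstruct data


# Assuming a 5-wide vdev with 3 surviving members:
print(raidz_state(1, 5, 3))  # RAIDZ1 could not have survived this
print(raidz_state(2, 5, 3))  # RAIDZ2 comes up degraded
```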

I'm currently extracting everything of value off of the volume and will verify contents later, though I did a quick sampling and things appear intact. I should probably scrub and reverify? Given the sensitivity of GELI to total data loss (it appears to me that opportunity for this to happen occurs with every disk replacement if you don't cycle and keep track of all the GELI keys every time?), I don't think I will be messing around with GELI again. Painful lesson learned but at least I didn't have to spend $20K sending disks off to a recovery service only to find I don't know the passphrase at all, don't have the right key, or that the array itself was unrecoverable because I ended up losing 3/5 disks from what turned out to be a RAIDZ1.

Time for a drink. Here's to better documenting my own systems, and to magnetic tape backups-- no worries about drive failures, broken actuators or head crashes, only ripping, tearing and melting :)
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
> Given the sensitivity of GELI to total data loss
That's the whole point of encryption.
> (it appears to me that opportunity for this to happen occurs with every disk replacement if you don't cycle and keep track of all the GELI keys every time?)
FreeNAS streamlines this somewhat by rekeying all disks at once, to make sure you only need one set of keys for all disks. But yes. That is the price of full-disk encryption.
 

threatbutt

Cadet
Joined
Aug 20, 2018
Messages
5
> That's the whole point of encryption.

Right, though I did not fully understand the implications of what I was building at the time. It just seemed like an easy solution to not have to worry about how to DBAN or degauss dead drives before disposal.

In any case, it did its job. God knows where my backups currently are but at least those will have kept their integrity.
 