Confirm that drive in encrypted volume has really been offlined before replacing?

Status
Not open for further replies.

soulburn

Contributor
Joined
Jul 6, 2014
Messages
100
Today I ran into a problem and found the solution from this post: Can't Re-Key Encryption After Replace/Re-Silver.

Quoting the OP from the above thread:
My pool is set up as follows:

RAID-Z2-0
(6)3TB drives
RAID-Z2-1
(5)3TB Drives
(1)4TB Drive(This is the new disk that replaced a 3TB and just finished re-silvering)
RAID-Z2
(6)4TB Drives

After re-silvering completes, I try to follow the steps from the FreeNAS manual here:
http://doc.freenas.org/9.3/freenas_storage.html#replacing-an-encrypted-drive

However when attempting to re-key the pool I get the following error:
"Error: Unable to set key: [MiddlewareError: Unable to set passphrase on gptid/d1a6e94d-9f9f-11e4-b99f-000c2934e0ad: geli: Cannot open gptid/d1a6e94d-9f9f-11e4-b99f-000c2934e0ad: No such file or directory. ]"

Part of the blame lies with me, I suspect, because I removed the old 3TB drive without offlining it first; I figured it would just simulate an HDD dying. I've replaced about three other hard drives that have failed (this one didn't fail, but is being replaced one at a time to extend the vdev) and never had this problem, despite not being able to mark them offline.

Also, how the hell do you lose access to your whole pool if you reboot before re-keying? (That's the warning according to the FreeNAS manual.)
Any help is greatly appreciated.

Found this Thread:
https://forums.freenas.org/index.ph...ive-unable-to-set-key-geli-cannot-open.14554/

Seems he found a way to remove the info for the old drive, but I can't make heads or tails of the commands he was using.

[root@freenas] /data# sqlite3 /data/freenas-v1.db "select * from storage_disk;"

For me, trying that gets a y/n/e/a prompt, and replying yes gives permission denied.

Edit #2: Finally got it figured out.

First, use:
Code:
sqlite3 /data/freenas-v1.db "select * from storage_encrypteddisk;"

Then identify the entry giving the error and use
Code:
sqlite3 /data/freenas-v1.db "delete from storage_encrypteddisk where id=3;"


replacing the '3' in "where id=3" with the id of the offending entry.
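The two-step cleanup above can be sketched end-to-end against a scratch database, so nothing on a live system is at risk. The table and column names here (`storage_encrypteddisk` with `id` and `encrypted_provider`) are assumed from the commands in this thread and may not match the real FreeNAS schema exactly; the second gptid is the one from the error message, the first is made up.

```shell
#!/bin/sh
# Sketch of the Edit #2 cleanup against a throwaway copy of the config DB.
DB=$(mktemp)
sqlite3 "$DB" "CREATE TABLE storage_encrypteddisk (id INTEGER PRIMARY KEY, encrypted_provider TEXT);"
sqlite3 "$DB" "INSERT INTO storage_encrypteddisk VALUES
  (1, 'gptid/aaaa1111-0000-0000-0000-000000000001'),
  (3, 'gptid/d1a6e94d-9f9f-11e4-b99f-000c2934e0ad');"

# Step 1: list the rows and spot the gptid that appears in the geli error
sqlite3 "$DB" "SELECT * FROM storage_encrypteddisk;"

# Step 2: delete the stale row by its id (3 in this example)
sqlite3 "$DB" "DELETE FROM storage_encrypteddisk WHERE id=3;"

# Confirm the stale row is gone (prints 0)
sqlite3 "$DB" "SELECT count(*) FROM storage_encrypteddisk WHERE id=3;"
rm -f "$DB"
```

On an actual system you would run the same SELECT/DELETE pair against /data/freenas-v1.db, ideally after backing it up.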

The difference between my particular scenario and the aforementioned post is that, according to the GUI, the drive was offlined by FreeNAS automatically, and yet the problem persisted.

Let me explain further.
  • I have a bunch of WD RED NAS 6TB drives connected to LSI HBAs on P20 firmware that are fully supported by FreeNAS.
  • I am running the latest FreeNAS stable build at the time of this post. (9.10-STABLE-201606270534)
  • One WD RED NAS 6TB drive fails on a RAID Z3 volume.
  • Failed drive was automatically offlined by FreeNAS (there was no option in the GUI to offline it). This is expected behavior and I've replaced other failed drives like this in the past. When there is no offline option, FreeNAS has already done this for you, or so it seems. No big deal.
  • I pop a replacement drive in and run through the re-silver. All is well.
  • The re-silver completes. I attempt to re-key the volume and get the "Error: Unable to set key:" errors listed in the above referenced thread.
  • Solution given in edit #2 of above referenced thread allows me to remove the bad drive from the freenas-v1.db file, re-key the volume, and finally create and download a new geli.key and geli_recovery.key.
    • Code:
      sqlite3 /data/freenas-v1.db "select * from storage_encrypteddisk;"
      sqlite3 /data/freenas-v1.db "delete from storage_encrypteddisk where id=x;"
  • I reboot and test both my passphrase and recovery key and everything works as expected.
Is there any way I can check in the future to make sure the drive has really been offlined before I replace the member disk? In my particular scenario the drive died and FreeNAS said it had offlined it (since the GUI option to offline it was gone), but in actuality it was not offlined and I had to do it manually. I just want to avoid this in the future however I can. Thanks for any feedback.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
If you run "zpool status -v" you should be able to see the status of every drive in your pool(s). I don't know if that will be any different than what the FreeNAS GUI reports, but it would at least give you something more directly from the source.
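One way to act on that suggestion is to pull the single member disk's state out of the `zpool status -v` output before touching the hardware. The sketch below parses a pasted sample of that output (the pool name and the first gptid are made up; the second gptid is the one from this thread) so the parsing step is clear; on a live system you would pipe the real `zpool status -v <pool>` into the same awk filter.

```shell
#!/bin/sh
# Sketch: confirm a member disk really shows OFFLINE before pulling it.
# status_sample stands in for live `zpool status -v tank` output.
status_sample='  pool: tank
 state: DEGRADED
config:
        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz3-0                                      DEGRADED     0     0     0
            gptid/aaaa1111-0000-0000-0000-000000000001  ONLINE       0     0     0
            gptid/d1a6e94d-9f9f-11e4-b99f-000c2934e0ad  OFFLINE      0     0     0'

disk='gptid/d1a6e94d-9f9f-11e4-b99f-000c2934e0ad'
# Print the STATE column for the matching device line (here: OFFLINE)
echo "$status_sample" | awk -v d="$disk" '$1 == d {print $2}'
```

If that prints anything other than OFFLINE (or prints nothing at all), the disk is not safely offlined yet.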
 

soulburn

Contributor
Joined
Jul 6, 2014
Messages
100
If you run "zpool status -v" you should be able to see the status of every drive in your pool(s). I don't know if that will be any different than what the FreeNAS GUI reports, but it would at least give you something more directly from the source.
Thanks. I'll give that a try. You know, the more I think about it the more I think I already have an answer from the original post. If you run:
Code:
sqlite3 /data/freenas-v1.db "select * from storage_encrypteddisk;" 
it will give you the info on the encrypted disks in the array. If the GUI reports a disk offline but the disk still shows up after that command, it hasn't really been offlined. I wonder if that's bug-report worthy or if it was just an isolated incident? I'll report back if it ever happens again. Thanks.
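That cross-check can be sketched as a yes/no test: grep the SELECT output for the gptid the GUI claims is gone. As before, this runs against a scratch database rather than /data/freenas-v1.db, and the table/column names are assumed from the commands in this thread.

```shell
#!/bin/sh
# Sketch: is a supposedly-offlined disk still registered in the config DB?
DB=$(mktemp)
sqlite3 "$DB" "CREATE TABLE storage_encrypteddisk (id INTEGER PRIMARY KEY, encrypted_provider TEXT);"
sqlite3 "$DB" "INSERT INTO storage_encrypteddisk VALUES (3, 'gptid/d1a6e94d-9f9f-11e4-b99f-000c2934e0ad');"

if sqlite3 "$DB" "SELECT encrypted_provider FROM storage_encrypteddisk;" \
     | grep -q 'd1a6e94d'; then
    echo "disk still registered"   # GUI said offline, but the row remains
else
    echo "disk gone"
fi
rm -f "$DB"
```

"disk still registered" here is the symptom from this thread: the GUI reported the drive offline while its row was still in the database.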
 