FreeNAS-11.3-U5 - Procedure for replacing encrypted pool drive is NOT properly documented

jasn

Dabbler
Joined
Dec 19, 2014
Messages
32
I have one geli-encrypted pool configured on my 11.3-U5 Mini XL+. Since the system boots and unlocks my volume automatically, I understand this to mean that NO passphrase is enabled. I then discovered this thread pointing to the FreeNAS 11.3 Release Notes, which state under Known Impacts:
The system no longer allows moving the system dataset to an encrypted pool containing a passphrase. Since Directory Services and some SMB state information is stored in the system dataset, these services will not function correctly if the system dataset is locked or otherwise unavailable. It is recommended to move the system dataset to a non-encrypted pool or an encrypted pool not containing a passphrase.
So if I want to keep the system dataset on this pool, I can't encrypt the pool with a passphrase. However, now that I had a drive that needed replacing, I discovered that the official FreeNAS documentation is incorrect or incomplete. According to the 11.3-U5 documentation for replacing a drive:
Warning

Encrypted pools must have a valid passphrase to replace a failed disk. Set a passphrase and back up the encryption key using the pool Encryption Operations before attempting to replace the failed drive.
Additionally, the TrueNAS documentation on geli encryption didn't explain things much better.

Having lost an entire encrypted pool to a previous FreeNAS upgrade procedure, I was nervous about this inconsistent documentation and about the lack of a properly documented process for replacing a drive in an encrypted pool WITHOUT a passphrase. When I performed the requisite internet searches, I found the following bug reports, which seemed to indicate that folks realized the FreeNAS documentation should be updated to better document encrypted pool drive replacements:

Update instructions on how to replace disk in encrypted pool
Improve Guide section on replacing disk in encrypted pool

as well as these previous forum posts that the bug reports linked to, describing procedures for replacing a drive in an encrypted pool:

Replaced drive in encrypted pool, howto verify keys before reboot?
Freenas 11.2-U5 Replacing encrypted disk in Pool

However, none of them documented a procedure for replacing a drive in an encrypted pool without a passphrase. I did finally discover the following thread, where another user had noticed the same inconsistency and reported simply going ahead and replacing the drive in their 11.3-U5 encrypted pool without using a passphrase.

FreeNAS-11.3-U5 Replacing drive from encrypted pool

In an effort to help anyone with similar questions, here are the steps I performed for my drive replacement. My disclaimer is that without any official confirmation, I don't know if these are the correct steps. These are just the steps that I used. I performed all steps using the 11.3-U5 GUI.
  1. Powered down my system to identify the problem drive by serial number, marked it, then powered up my server.
  2. With my server up, I selected my pool, clicked the right "lock" icon, and selected the Reset Keys option (documentation link) to reset the encryption for the pool and download my encryption key. I did NOT enter a passphrase.
  3. Selected my pool and the lock icon, then chose the Recovery Key option (documentation link) to download a recovery key as well.
  4. Selected my pool, clicked on the "gear" icon, and selected Status to view all drives in my pool.
  5. Selected the drive to be replaced, and put the drive offline.
  6. Because they're hot swappable, I then physically removed the offlined drive, and replaced it with the new drive.
  7. In the same Pool status menu, I then selected the offlined drive entry, and chose the Replace drive option.
  8. After a minute or two, the new drive was recognized by the system, and I was able to select the new drive as the replacement drive in the pool.
  9. I then received a popup message stating that the drive was being formatted, showing a percentage completion, which then got stuck at 30%.
  10. I then received a failure/error popup message stating: "Error: [ENOENT] options.disk: Disk not found. [ENOENT] options.label: Label".
At this point I was very nervous that something had gone wrong. However, after using the refresh button on the page, I eventually saw that the replacement drive was indeed online, and that the resilver process had begun. No encryption keys were requested to replace the drive. However, once the resilver process is completed, I will again use the Reset Keys option, (Step 2), to make sure that everything is correct, as forum user "daquirm3" documented in their 11.3-U5 forum thread. It's just extremely unnerving to be doing this without official documentation.
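For anyone who would rather confirm the resilver from a shell than keep refreshing the GUI page, here is a minimal sketch in Python. It is not an official procedure. It assumes Python 3.7 or later, that `zpool status <pool>` is available on the system, and that its output follows the usual layout with a `scan:` line and a `config:` table. The pool name `tank`, the function names, and the sample text are all made up for illustration.

```python
import subprocess


def run_zpool_status(pool):
    """Return the text printed by `zpool status <pool>` (needs ZFS installed)."""
    result = subprocess.run(
        ["zpool", "status", pool],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def summarize_zpool_status(status_text):
    """Return (scan_line, not_online) parsed from `zpool status` text.

    scan_line is the first line of the "scan:" field, or None if absent.
    not_online lists (name, state) for every config row whose state
    column is anything other than ONLINE.
    """
    scan_line = None
    not_online = []
    in_config = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("scan:"):
            scan_line = line[len("scan:"):].strip()
        elif line.startswith("config:"):
            in_config = True
        elif line.startswith("errors:"):
            in_config = False
        elif in_config and line and not line.startswith("NAME"):
            fields = line.split()
            # Rows look like: NAME STATE READ WRITE CKSUM [notes]
            if len(fields) >= 2 and fields[1] != "ONLINE":
                not_online.append((fields[0], fields[1]))
    return scan_line, not_online


# Hypothetical sample output, NOT copied from my system.
SAMPLE = """\
  pool: tank
 state: DEGRADED
  scan: resilver in progress since Sun Nov  1 10:00:00 2020
config:

\tNAME                      STATE     READ WRITE CKSUM
\ttank                      DEGRADED     0     0     0
\t  raidz2-0                DEGRADED     0     0     0
\t    replacing-0           DEGRADED     0     0     0
\t      1234567890          OFFLINE      0     0     0  was /dev/gptid/old-disk.eli
\t      gptid/new-disk.eli  ONLINE       0     0     0  (resilvering)
\t    gptid/disk-b.eli      ONLINE       0     0     0

errors: No known data errors
"""
```

On a live system, `summarize_zpool_status(run_zpool_status("tank"))` would be the call to make, with your own pool name in place of `tank`. A scan line that begins with "resilver in progress" matches what I eventually saw in the GUI after refreshing.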

Lastly, in addition to checking these forums and the FreeNAS subreddit, I filed a help desk ticket with iXsystems (this is my third FreeNAS Mini and probably my last), hoping they could fill in the documentation gap. Instead, I was sent the following canned response:
The FreeNAS Mini products include 1 Year standard hardware warranty (unless additional warranty extensions are purchased). Direct support for FreeNAS is not included. However, most questions can be answered in the FreeNAS forums (truenas.com/community) or by visiting IRC (#freenas on freenode).
Except that THIS time, the question wasn't cleanly answered. It SHOULD be answered in the official documentation anyway. In closing, I'd like to strongly agree with the user who posted the Jira bug report. They opened the bug report with the following:
Replacing a bad disk is one of the most important things that a user can do with his or her FreeNAS system, and right now, if the pool is encrypted, the documentation is very confusing.
[snip]
There have been multiple forum posts where people try to make sure they understand, before they go through the process. The rest of the FreeNAS documentation is super clear, but this is one place that many people stumble.
Someone should really clear up this documentation confusion.

Good Luck..
 

jasn

Dabbler
Joined
Dec 19, 2014
Messages
32
After the resilver process had completed, attempting to use the Reset Keys encryption option failed with an error message stating that the system was unable to set a passphrase. (This happened whether I chose to reset the encryption key with or without a passphrase.) The error popup was the following:
[EFAULT] Unable to set key: [EFAULT] Unable to set passphrase on gptid/7c74f38b-1eeb-11ea-b9af-ac1f6b412b68: geli: Cannot open gptid/7c74f38b-1eeb-11ea-b9af-ac1f6b412b68: No such file or directory.
I then removed the write cache (ZIL) and read cache (L2ARC) drives from my pool and again attempted to reset the pool encryption using the Reset Keys option. This time the process completed, allowing me to download the new pool encryption key, after which I used the Recovery Key option to download the new recovery key as well.
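The error above names a gptid provider that geli could not open. As a rough aid for anyone hitting the same popup, the sketch below pulls the gptid names out of such a message and reports which of them have no device node. It assumes the FreeBSD convention that gptid labels show up under /dev/gptid. The function names and the `dev_root` parameter are my own, for illustration only. I do not know which device that gptid belonged to on my system, so this only tells you whether a node exists, not why it is missing.

```python
import os
import re

# A gptid label is "gptid/" followed by a 36-character UUID.
GPTID_RE = re.compile(r"gptid/[0-9a-fA-F-]{36}")


def gptids_in_message(message):
    """Return the unique gptid providers named in a message, in order seen."""
    seen = []
    for name in GPTID_RE.findall(message):
        if name not in seen:
            seen.append(name)
    return seen


def missing_providers(providers, dev_root="/dev"):
    """Return the providers that have no entry under dev_root."""
    return [
        p for p in providers
        if not os.path.exists(os.path.join(dev_root, p))
    ]
```

Pasting the popup text into `gptids_in_message()` and passing the result to `missing_providers()` returns the providers that are absent under /dev. An empty list means every named provider has a device node.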

Finally, upon system reboot the storage pool was automatically unlocked, with my previously failing drive replaced in the encrypted pool, without using a passphrase.
 