Encryption Issue - No access and can't reset

robinmorgan · Jan 15, 2021

Upgraded to TrueNAS from 11.2. Drive encryption has gone haywire. If I try to extend the pool I get a passphrase request. I entered my normal passphrase and it didn't work, so I tried to reset the keys. I'm in a jam...

Error Message:

Code:

Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 137, in call_method
    result = await self.middleware._call(message['method'], serviceobj, methodobj, params, app=self,
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1195, in _call
    return await methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.8/site-packages/middlewared/schema.py", line 973, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/pool_/encryption_freebsd.py", line 129, in rekey
    await self.middleware.call('disk.geli_rekey', pool)
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1238, in call
    return await self._call(
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1206, in _call
    return await self.run_in_executor(prepared_call.executor, methodobj, *prepared_call.args)
  File "/usr/local/lib/python3.8/site-packages/middlewared/main.py", line 1110, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.8/site-packages/middlewared/utils/io_thread_pool_executor.py", line 25, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.8/site-packages/middlewared/plugins/disk_/encryption_freebsd.py", line 159, in geli_rekey
    raise CallError(f'Unable to set key: {error}')
middlewared.service_exception.CallError: [EFAULT] Unable to set key: [EFAULT] Unable to set passphrase on gptid/4ef2e885-f835-11ea-821b-2c44fd842e10: geli: Cannot read passphrase: Inappropriate ioctl for device.

Samuel Tai · Jan 15, 2021

You'll have to reboot back into 11.2, if you've not upgraded your ZFS version yet, to expand your pool. TrueNAS 12 no longer allows GELI operations, only native ZFS encryption.

robinmorgan · Jan 15, 2021

I'd prefer not to reboot back into 11.2... how can I upgrade my ZFS? Would you recommend I roll back to 11.2 permanently? We are still suffering from half-speed reads and writes with a SCSI issue. I'm losing faith.

Samuel Tai · Jan 15, 2021

If this is production critical, roll back to 11.2 to expand your pool, and then upgrade to 11.3U5, which is still stable. Note the caveats with the 11.3 upgrade before you pull the trigger, though.

robinmorgan · Jan 15, 2021

Thank you so much for the help. Can I assume they are working on the issue?

Samuel Tai · Jan 15, 2021

GELI is in the process of being deprecated. There is some discussion of adding tools for migration to native ZFS encryption, but for now the only method is ZFS send/receive from an unlocked GELI pool to a new ZFS native-encrypted pool.
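As a rough illustration of the send/receive route mentioned above, here is a minimal dry-run sketch that only prints the commands involved and executes nothing. The destination dataset `newpool/tank1` and the snapshot name `migrate` are hypothetical placeholders, and the sketch assumes the new pool was already created with native ZFS encryption. The `-x encryption` option on the receive side is an assumption about how to let the received datasets inherit encryption from the destination; check it against the `zfs-receive` man page for your release before running anything.

```python
def migration_commands(src_pool, dest_dataset, snap="migrate"):
    """Build (but do not run) the commands for a send/receive migration.

    src_pool     -- the unlocked GELI-backed pool to copy from
    dest_dataset -- hypothetical dataset on the new natively encrypted pool
    snap         -- hypothetical snapshot name used for the transfer
    """
    snapshot = f"{src_pool}@{snap}"
    return [
        # Recursive snapshot of every dataset in the source pool.
        f"zfs snapshot -r {snapshot}",
        # Replication stream piped into the destination; "-x encryption"
        # is assumed here so the stream does not override the destination's
        # encryption setting.
        f"zfs send -R {snapshot} | zfs receive -x encryption {dest_dataset}",
    ]


# Dry run: print the commands for review instead of executing them.
for cmd in migration_commands("tank1", "newpool/tank1"):
    print(cmd)
```

Printing rather than executing is deliberate: on a pool of this size the commands deserve a review (and ideally a test on a small dataset) before anyone runs them.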

robinmorgan · Jan 15, 2021

Thank you again. I have taken your advice and have rolled back. I am now getting a failed decrypt.

[EFAULT] Pool could not be imported: 30 devices failed to decrypt.

Code:

Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 97, in main_worker
    res = loop.run_until_complete(coro)
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 53, in _run
    return await self._call(name, serviceobj, methodobj, params=args, job=job)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 45, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.7/site-packages/middlewared/worker.py", line 45, in _call
    return methodobj(*params)
  File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 965, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/zfs.py", line 390, in import_pool
    'Failed to mount datasets after importing "%s" pool: %s', name_or_guid, str(e), exc_info=True
  File "libzfs.pyx", line 369, in libzfs.ZFS.__exit__
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/zfs.py", line 380, in import_pool
    raise CallError(f'Pool {name_or_guid} not found.', errno.ENOENT)
middlewared.service_exception.CallError: [ENOENT] Pool 5541826905007370100 not found.
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/pool.py", line 1661, in unlock
    'cachefile': ZPOOL_CACHE_FILE,
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1141, in call
    app=app, pipes=pipes, job_on_progress_cb=job_on_progress_cb, io_thread=True,
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1081, in _call
    return await self._call_worker(name, *args)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1101, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1036, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/main.py", line 1010, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
middlewared.service_exception.CallError: [ENOENT] Pool 5541826905007370100 not found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 349, in run
    await self.future
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 386, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 961, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/pool.py", line 1673, in unlock
    raise CallError(msg)
middlewared.service_exception.CallError: [EFAULT] Pool could not be imported: 30 devices failed to decrypt.

Samuel Tai · Jan 15, 2021

Ugh. Do you have a recovery key available to unlock? If not, then you're out of luck.

robinmorgan · Jan 15, 2021

Yes I do!! I have all of them catalogued. I get the same error whilst using the key also!

Samuel Tai · Jan 15, 2021

Did you upgrade your pool in 12? If so, then there's no way to go back. Your only option now is to destroy and rebuild your pool from backup.

robinmorgan · Jan 15, 2021

Right, ok. What a mess.

robinmorgan · Jan 15, 2021

Samuel Tai said:
Did you upgrade your pool in 12? If so, then there's no way to go back. Your only option now is to destroy and rebuild your pool from backup.

So I'm back on 12-1U, my pool is unlocked and working as it was. Can I confirm something? I am unable to expand the pool, yet I am able to access it. I did reset (removed) encryption before upgrading to 12. I'm confused about why I have this issue.

Samuel Tai · Jan 15, 2021

How did you remove encryption before upgrading? Did you use @Patrick M. Hausen's procedure? Or did you just reset GELI? A reset only generates new keys, and doesn't remove encryption. When you run zpool status do your pool members have a .eli suffix?
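For anyone wanting to run the `.eli` check across a long member list, a small sketch along these lines could help. It assumes the `zpool status` output has been saved as text; the function name `geli_members` is made up for illustration, and the two sample rows are copied from the output posted later in this thread.

```python
def geli_members(zpool_status_text):
    """Return device names from `zpool status` output that end in .eli."""
    members = []
    for line in zpool_status_text.splitlines():
        fields = line.split()
        # Device rows have the form: NAME STATE READ WRITE CKSUM
        if len(fields) >= 2 and fields[0].endswith(".eli"):
            members.append(fields[0])
    return members


sample = """\
  pool: tank1
 state: ONLINE
config:

        NAME                                                STATE     READ WRITE CKSUM
        tank1                                               ONLINE       0     0     0
          raidz2-0                                          ONLINE       0     0     0
            gptid/5bb8c482-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5c963e5b-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
"""

found = geli_members(sample)
print(len(found), "GELI-backed member(s) found")
for name in found:
    print(" ", name)
```

Any member listed here is still sitting on a GELI provider, which means the pool is still GELI-encrypted regardless of whether it unlocks automatically at boot.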

robinmorgan · Jan 15, 2021

I believe I used the GELI... I do get the .eli suffix.

Code:

zpool status
  pool: freenas-boot
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 00:00:32 with 0 errors on Thu Jan 14 03:45:36 2021
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2       ONLINE       0     0     0

errors: No known data errors

  pool: tank1
 state: ONLINE
  scan: scrub repaired 0B in 1 days 02:46:40 with 0 errors on Sun Jan  3 02:46:43 2021
config:

        NAME                                                STATE     READ WRITE CKSUM
        tank1                                               ONLINE       0     0     0
          raidz2-0                                          ONLINE       0     0     0
            gptid/5bb8c482-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5c963e5b-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5c284a9c-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5bf57807-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5d388caa-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5e4cee32-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5e6af86b-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
          raidz2-1                                          ONLINE       0     0     0
            gptid/5c74676d-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5c2b09d7-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5ce451f5-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5dfd9c18-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5e9f7cd8-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5f5d5bf8-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
            gptid/5f2eae67-c217-11ea-ae35-2c44fd842e10.eli  ONLINE       0     0     0
          raidz2-3                                          ONLINE       0     0     0
            gptid/171fe6c0-0cea-11eb-b4f4-a0369f3909fe.eli  ONLINE       0     0     0
            gptid/16e1a84a-0cea-11eb-b4f4-a0369f3909fe.eli  ONLINE       0     0     0
            gptid/16c27717-0cea-11eb-b4f4-a0369f3909fe.eli  ONLINE       0     0     0
            gptid/19637f86-0cea-11eb-b4f4-a0369f3909fe.eli  ONLINE       0     0     0
            gptid/1a1fa8ed-0cea-11eb-b4f4-a0369f3909fe.eli  ONLINE       0     0     0
            gptid/19d169e0-0cea-11eb-b4f4-a0369f3909fe.eli  ONLINE       0     0     0
            gptid/1a03f3f5-0cea-11eb-b4f4-a0369f3909fe.eli  ONLINE       0     0     0
        logs
          gptid/d0ddf781-de10-11ea-9899-2c44fd842e10.eli    ONLINE       0     0     0

errors: No known data errors

Samuel Tai · Jan 15, 2021

Did your paste get cut off? The last gptid was truncated.

robinmorgan · Jan 15, 2021

Sorry yes, now updated

Samuel Tai · Jan 15, 2021

OK, the reset explains why the recovery key didn't work: the reset creates new keys, which invalidates the old keys and passphrases. I think the safest option for you is to destroy your pool, reboot into 11.2, reload your pool from backup, making sure it's unencrypted, and then upgrade to 11.3U5.

robinmorgan · Jan 17, 2021

Hello again, I’m still a little lost. My pool has around 80TB of data, so deleting everything and pulling it back from backup is going to be time-consuming and expensive. I felt I was meticulous in the upgrade and followed all the official literature. Now here I am with apparently only one option: “reload from backup”. Have I missed something?

Samuel Tai · Jan 17, 2021

Yes, you mistook a GELI reset for a removal of encryption. It’s actually a deletion of all the old keys and generation of new keys, which invalidates any prior GELI passphrases and recovery keys, so the very next thing you should have done was to re-establish your passphrase and set a new recovery key. Unfortunately, as you didn’t do that, your only recourse is to rebuild.

robinmorgan · Jan 17, 2021

Ok, thank you again. To confirm, the encrypted pool is always open: I can restart the server and the pool is unlocked. As confirmed, I am unable to add a new passphrase or extend the pool.

Just for my complete understanding; what should I have done before upgrading? Is there a step by step?

I am assuming I should have re-established my passphrase after upgrading the OS but before upgrading ZFS.
