Help moving pool to new disks

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
I have my original GELI encrypted pool, and the disks are getting old and starting to die on me. With the new ZFS encryption, it seems like a good time to migrate both to bigger disks and to the new encryption. The new pool is currently in a second TrueNAS server that I spun up temporarily on the same local network. I'm struggling to get the data moved over properly. I figured out how to do a ZFS replication, but everything seems to be locked, and I don't see a way to unlock it. The system shows this icon by everything.
[Screenshot: the icon displayed beside each dataset]

I found this post, but it seems to be for unencrypted pools, and there is this in the documentation about migrating from GELI. I think I'm missing a piece of the puzzle on how to combine everything. Could someone please point me in the right direction?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Because of the two different encryption schemes, I bet you are going to need to fully configure the new NAS, get storage working, and then use something like rsync to copy the data over.
I don't think something like zfs send and receive is going to work properly, but I don't have much experience with encrypted pools. I only set it up once for testing and that was with the old method under FreeNAS 9.10. I have not tried the new ZFS encryption yet. I imagine I will be giving it a test run soon though.
 

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
It looks like the documentation covers how to move between encryption schemes, but the ZFS replication push is for going between internal pools. I guess I just need to figure out how to do an external one via the command line. I would prefer to stick with ZFS replication to maintain the ACLs, but it's not the end of the world if I need to redo them because of using rsync.
 
Joined
Oct 22, 2019
Messages
3,641
It is possible; I have done it more than once in test environments, and with success on my actual TrueNAS server. This was all done locally, not over a network connection. The fundamentals should be the same, however, with the only difference being the "remote" aspect of send and recv.

You can read about my examples in here: TrueNAS 12.0 GELI to Full Drive Encryption Migration

You can see why I was confused by this new native ZFS encryption when first playing around with TrueNAS Core 12.0: Confused by native ZFS encryption, some observations, many questions
 

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
@winnielinnie reading your write ups helped me realize that I had actually done everything correctly. The thing that had confused me was that icon that I posted above. If they're going to designate things with icons, they should probably put what they mean in the documentation. At least I didn't see it.

My zfs replication to a new ZFS-encrypted pool gave me what I wanted, as I was just looking for encryption at a pool level. It took me a couple of minutes to figure out that I had to set each child item to readonly=off via the CLI to get the pool fully operational after import. Everything seems to be up and functioning. Subjectively, the new pool feels significantly faster than the old one, though I haven't done any benchmarks on it.
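In case it helps anyone else, the commands I ran looked roughly like this (dataset names are just placeholders for my own):
Code:
# Replicated datasets arrive read-only; clear the flag on each child
zfs set readonly=off newpool/mydata
zfs set readonly=off newpool/mydata/child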
 
Joined
Oct 22, 2019
Messages
3,641
@winnielinnie reading your write ups helped me realize that I had actually done everything correctly. The thing that had confused me was that icon that I posted above. If they're going to designate things with icons, they should probably put what they mean in the documentation. At least I didn't see it.

The documentation is currently subpar and not on the level of what FreeNAS 11.3 gave us. I would argue that the documentation should be reviewed and edited by real end-users. The icon which you pasted above signifies a non-encrypted dataset nested underneath an encrypted parent dataset. Any data saved to that specific dataset is in the plain, i.e., not encrypted.

Double-check and triple-check your work! Hope it's all sorted out for you now.

UPDATE: If you're still seeing that icon, it means the dataset is not using encryption. ***A dataset must be set to use encryption during its creation, or else it cannot later be encrypted.

Once a dataset is created with encryption, the following can be changed later, as many times as you wish:
  • Inherit (or don't inherit) parent's / encryptionroot's encryption scheme (adopts same keyfile or passphrase and iterations)
  • Keyfile or manually generated 64-character random key string
  • Passphrase
  • Iterations
However, the following cannot be changed later:
  • Master Key
  • Key Width (i.e., 128, 192, 256)
  • Cipher
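You can check all of these properties for any dataset from the CLI; for example (dataset name is just a placeholder):
Code:
# Shows the cipher/width (encryption), key format, iterations, and which
# dataset acts as the encryption root
zfs get encryption,keyformat,keylocation,pbkdf2iters,encryptionroot pool/dataset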

*** A dataset is either manually created, or automatically created during a send / recv. The recv option "-x encryption" tells the destination to inherit the parent dataset's encryption properties when the dataset is created. This is how you can send non-encrypted data from an old (11.3 or earlier) dataset and have it encrypted in the new dataset, so long as the parent dataset is encrypted. (This is also why I create "pseudo-roots", such as "zdataroot", to serve as more manageable top-level datasets.)

Remember, when using 11.3 and prior, using GELI (which encrypts the partitions at a lower level) means that the ZFS datasets themselves are not encrypted. For all ZFS knows, you're using non-encrypted data, regardless of the lower-level GELI layer. The way GELI works is closer to standard block device encryption: the block device (full disk, partition, etc.) is encrypted, and anything that resides on top of it (ext4, NTFS, ZFS) is encrypted along with it; yet the encryption is not done at the file system level (ext4, NTFS, ZFS), but rather at the block device level.
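As a rough sketch of the above (pool and dataset names are placeholders), migrating a non-encrypted legacy dataset into an encrypted pseudo-root might look like:
Code:
# -x encryption drops the source's encryption property (off), so the new
# dataset inherits the encrypted parent's properties instead
zfs snapshot -r geli_pool/mydata@migrate
zfs send -Rv geli_pool/mydata@migrate | zfs recv -x encryption newpool/zdataroot/mydata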
 
Last edited:

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
@winnielinnie I think I'm having trouble wrapping my head around the encryption change. I understand GELI, I think. It was encrypting the pool at the disk level, and anything in that pool was thus encrypted when the pool or disk was detached. The new ZFS encryption is at a dataset level and may or may not inherit from the parent?

I'm going to have to rethink how I'm doing this. Now my plan is to empty out my smaller SSD pool, blow it away, and rebuild it with the new pool type. Then I can do a local recv with inherited encryption, rename the original, and do another recv to get it back into the original pool. Once I've double- and triple-checked that everything is there, I can delete the renamed, unencrypted original dataset. My goal is just to not need to worry about what's on the disk if I need to send it in for warranty replacement or when I retire it.
 
Joined
Oct 22, 2019
Messages
3,641
I understand GELI, I think. It was encrypting the pool at the disk level, and anything in that pool was thus encrypted when the pool or disk was detached.

With GELI (and thus 11.3 and earlier), what was really happening when you chose to create a pool with encryption was that partitions were being created and encrypted for each physical device in the vdev (ada0p2, ada1p2, ada2p2, etc.). None of the encryption is done by ZFS itself. These encrypted block devices are decrypted with a keyfile or passphrase upon re-importing, which makes them available for the zpool import; and naturally there is then transparent access to the ZFS metadata and data. Without first decrypting these partitions, there would be no zpool to import, since the metadata is not yet available (just a bunch of "random" garbage).

With the new native ZFS encryption, none of that applies anymore. The block devices (e.g., physical partitions) are left as-is and used to build vdevs, which are used to create a pool, and any encryption is then left to the end-user on a per-dataset level. Importing the pool does not first require any block devices to be decrypted: you can import a zpool with native ZFS encryption on its datasets and simply "skip" the prompt when it asks if you want to unlock the datasets upon import. This means you'll have successfully imported a zpool with a bunch of locked (inaccessible) datasets. With GELI, the step that requires decrypting the block devices first is mandatory: you cannot skip it when trying to import a zpool.
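To illustrate the GELI side (device names and the keyfile path are only examples), the manual flow is roughly:
Code:
# Each member partition must be attached (decrypted) first; only then
# does the pool metadata become visible for import
geli attach -p -k /data/geli/pool.key ada0p2
geli attach -p -k /data/geli/pool.key ada1p2
zpool import tank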


The new ZFS encryption is at a dataset level and may or may not inherit from the parent?
Pretty much, yes. The way it is presented in TrueNAS 12 gives you the impression that you are creating an "encrypted pool" when you first create a pool. This is not true. It is simply giving you the option to encrypt the very top-level root dataset (which shares the same name as the pool). This does not dictate what you can and cannot do with any other child or parent datasets later on. However, if you leave the default options when creating a new dataset, it will by default "inherit" its parent's encryption options: when the new dataset is created, it will generate a new Master Key for itself, inherit the Cipher of the parent (e.g., AES-GCM), inherit the Key Width of the parent (e.g., 256-bit), inherit the Keyfile or Passphrase of the parent, and inherit the Iterations of the parent.

The following CANNOT be changed later, at any time:
  • Master Key
  • Cipher
  • Key Width

The following CAN be changed after the dataset's initial creation, without affecting the saved data, files, and folders, simply by unchecking the "Inherit" box for the dataset's encryption properties:
  • Keyfile
  • Passphrase
  • Iterations
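The changeable items above are handled by zfs change-key, which re-wraps the Master Key without rewriting any data. A minimal example (dataset name is a placeholder):
Code:
# Switch the user key to a passphrase; the underlying Master Key is untouched
zfs change-key -o keyformat=passphrase pool/dataset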

TrueNAS 12 hides some of the underlying functionality from the end-user for the sake of behaving as an appliance, rather than an operating system that is configured and tweaked as you would a distro such as Ubuntu Linux or FreeBSD. This is why certain things seem counter-intuitive. Everything is supposed to be done in the GUI, not via the terminal.

While it's possible to create your own crazy zpool and dataset hierarchies that mix and nest different types of encryption (and non-encryption) all over the place, it's not recommended and strays away from a smoother workflow. You could technically "create" a new zpool with an encrypted top-level root dataset, and then create a non-encrypted dataset immediately underneath it. The state of the root dataset (locked vs unlocked) does nothing to protect the data in the nested non-encrypted dataset. The GUI may give the impression it does, because the parent is "locked" and intuitively we assume that whatever is nested underneath a locked parent is inaccessible. (This is not the case: no matter how deep a child dataset is nested, if it is not encrypted, its files and folders will always be accessible in the plain.)
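You can demonstrate that last point yourself with something like this (names are hypothetical):
Code:
# Explicitly create a non-encrypted child beneath an encrypted parent;
# locking tank/secure later does nothing to hide plainchild's files
zfs create -o encryption=off tank/secure/plainchild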



My goal is just to not need to worry about what's on the disk if I need to send it in for warranty replacement or when I retire it.
There is a caveat concerning native ZFS encryption vs. traditional block device encryption: what is exposed to the public. Here is my own summary comparing three common options:


Native ZFS, at rest (or powered off):
  • Inaccessible
    • File Data
    • File Names and Properties
    • Sizes of individual files (unsure about this one)
    • Directory listings and structures
    • Logs (if logs are saved here)
  • Accessible
    • Encryption Options and Hashes
    • ZFS Metadata and Options
    • Number of Files and Blocks (via pointers and inodes) [1]
    • Dataset Names
    • Snapshot Names
    • Free Space
    • Used Space


LUKS, at rest (or powered off):
  • Inaccessible
    • File Data
    • File Names and Properties
    • Directory listings and structures
    • Sizes of individual files and folders
    • Number of files and folders
    • Logs (if logs are saved here)
    • Free Space (can be inferred if underlying devices were not filled with random data prior to creation)
    • Used Space (can be inferred if underlying devices were not filled with random data prior to creation)
    • File System and metadata within
  • Accessible
    • Encryption Options and Hashes


VeraCrypt container, at rest (or powered off):
  • Inaccessible
    • File Data
    • File Names and Properties
    • Directory listings and structures
    • Sizes of individual files and folders
    • Number of files and folders
    • Logs (if logs are saved here)
    • Free Space (container filled with random data upon creation)
    • Used Space (container filled with random data upon creation)
    • File System and metadata within
    • Encryption Options and Hashes
  • Accessible
    • Apparently nothing, just a suspicious mess of random data [2]

[1] Seems like native ZFS encryption is "leakier" than I first suspected, but it makes sense considering it requires certain information and metadata in order to send and recv, create snapshots, scrub, etc, without requiring encrypted datasets to be unlocked.

[2] Until decrypted, a VeraCrypt partition/device appears to consist of nothing more than random data (it does not contain any kind of "signature"). Therefore, it should be impossible to prove that a partition or a device is a VeraCrypt volume or that it has been encrypted.
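If you're curious, you can see for yourself what a locked, natively-encrypted pool still exposes with ordinary commands (pool name is a placeholder):
Code:
# Even with the key unloaded (keystatus: unavailable), dataset names,
# space usage, and snapshot names remain visible
zfs list -r -o name,used,avail,keystatus,encryptionroot tank
zfs list -r -t snapshot tank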
 
Last edited:

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
@winnielinnie Thank you for your very detailed responses. They're really helpful.

I was able to move a dataset from pool A to pool B using
Code:
# zfs send -Rv zpool-A/mydata@migration | zfs recv -x encryption zpool-B/mydata
However, when I try to send it back, it doesn't like the -R. The system gives me a message that I need to use a raw send, -w. The problem I see with that is that the dataset would maintain the encryption keys from the current pool; at least, that's how I'm understanding the documentation. That is great if I'm backing up to the cloud or something like that, but it seems like it could get really confusing to mix keys within a pool long term. I've found other commands that will send a single snapshot, but I'd prefer to have all my snapshots if possible. How do I do that once the dataset gets the new encryption?
 
Joined
Oct 22, 2019
Messages
3,641
I was able to move a dataset from pool A to pool B using
Code:
# zfs send -Rv zpool-A/mydata@migration | zfs recv -x encryption zpool-B/mydata
However, when I try to send it back, it doesn't like the -R.

What do you mean, "send it back"? Is this something you're going to regularly send and recv back and forth between poolA and poolB? I thought poolA was the old pool using legacy (GELI) encryption, prior to TrueNAS 12.0? I don't believe you can send a ZFS-encrypted dataset to an old pool that is using GELI encryption underneath. TrueNAS 12.0 won't even allow you to create ZFS-encrypted datasets on an existing "legacy" pool using GELI.

The system gives me a message that I need to use a raw send, -w. The problem I see with that is that the dataset would maintain the encryption keys from the current pool.
Those specific datasets will still use the same encryption options and keys/passphrases as at their initial creation when sending them elsewhere using the -w flag. They will show a distinct padlock icon, as they don't inherit encryption properties from the destination pool to which they are being sent. Their children, however, shouldn't show any padlock icons once the send/recv is complete, as the children are (by default) inheriting their parents' encryption properties.

I've found other commands that will send a single snapshot, but I'd prefer to have all my snapshots if possible. How do I do that once the dataset gets the new encryption?

-R is a recursive send that should include the specified snapshot, plus any previously created snapshots.*

-I (or -i) are incremental options that will only send the changes between two specific snapshots (earlier compared to later). The destination needs to have the "earlier" snapshot in order for this to work. Much of this is configured through the GUI if you're using Replication Tasks; I just have issues trying to use this feature for "on-demand, non-scheduled" replications/backups.

* Let's say you have a dataset named poolA/mydata/media and have some snapshots, created in this order:
  • poolA/mydata/media@milestone
  • poolA/mydata/media@auto-20200831
  • poolA/mydata/media@auto-20200906
  • poolA/mydata/media@auto-20200913
  • poolA/mydata/media@important-snap
Then you create a snapshot named @migrate-to-new-pool, and this is the snapshot name you use to send to the new pool.

If you use -R with the send command, it will send everything up until the point in time when you created @migrate-to-new-pool, including the previous snapshots, and including the children datasets beneath poolA/mydata/media that also contain the snapshot @migrate-to-new-pool.
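Side by side, the two styles look roughly like this (pool, dataset, and snapshot names are placeholders from the example above):
Code:
# Full recursive send: all snapshots up to @migrate-to-new-pool, children included
zfs send -R poolA/mydata/media@migrate-to-new-pool | zfs recv poolB/mydata/media

# Incremental send: only the delta between the two snapshots; the destination
# must already contain @auto-20200913
zfs send -I poolA/mydata/media@auto-20200913 poolA/mydata/media@migrate-to-new-pool | zfs recv poolB/mydata/media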
 
Last edited:

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
I should clarify where I'm at in the process. I moved everything from the old GELI hdd pool to the new ZFS-encrypted hdd pool. The data is now sitting in the new pool unencrypted. I have an ssd pool that I rebuilt to also use the new ZFS encryption. The GELI pool is retired and has been pulled from the system. So I have two pools in the system, both with the new ZFS encryption. The datasets on the hdd pool are the ones that are currently not encrypted, and I want them to be.

So far I've tried moving one dataset with all its snapshots from the hdd pool to the ssd pool. It is now encrypted using the ssd key. Now I want to move it back with all its snapshots and have it be encrypted using the hdd key. The problem I'm running into is that when the dataset is encrypted, -R seems to require using -w, which maintains the encryption using the ssd key. Without -R, I only seem to be able to move one snapshot instead of all of them.

This method seems to accomplish what I'm looking for. Send the dataset from the hdd pool to the ssd pool, leaving it unencrypted:
Code:
zfs send -Rv hdd/dataset@migration | zfs recv ssd/dataset
Then I can send it back to the hdd pool and inherit the proper encryption:
Code:
zfs send -Rv ssd/dataset@migration | zfs recv -x encryption hdd/dataset
If anyone is trying this, don't forget to make a new snapshot to reference (e.g., @migration) before each send.
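To double-check that the returned dataset really did pick up the hdd pool's encryption, this seems to do the trick:
Code:
# encryptionroot should now be hdd, and keystatus should be available
zfs get -r encryption,encryptionroot,keystatus hdd/dataset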
 
Joined
Oct 22, 2019
Messages
3,641
So far I've tried moving one dataset with all its snapshots from the hdd pool to the ssd pool. It is now encrypted using the ssd key. Now I want to move it back with all its snapshots and have it be encrypted using the hdd key.
Which "key" are you referring to here? The key generated for the top-level root dataset (i.e, same name as the pool, "ssd" vs "hdd"), or for the particular child dataset(s) within (i.e, "dataset")?

This method seems to accomplish what I'm looking for. Send the dataset from the hdd pool to the ssd pool, leaving it unencrypted:
Code:
zfs send -Rv hdd/dataset@migration | zfs recv ssd/dataset
Then I can send it back to the hdd pool and inherit the proper encryption:
Code:
zfs send -Rv ssd/dataset@migration | zfs recv -x encryption hdd/dataset
If anyone is trying this, don't forget to make a new snapshot to reference (e.g., @migration) before each send.
Yes, that will work, and it will inherit the same encryption properties (and key/passphrase) as the parent or root dataset, in this case that of "hdd". However, if you try to send it back again to the "ssd" pool, you're going to bump into the same issue once more: you either send it without -R (if you want it to inherit the encryption properties/key of "ssd"), or with -R, but as a raw stream using the same encryption properties/key as it has on "hdd".

What really adds to the counter-intuitive concepts is that the "pool" name is the same as the root dataset's. Honestly, I think ZFS would have done better without a root dataset, simply treating the "pool" as a container (accepting zpool commands). Any top-level dataset in the pool would then be its own separate root dataset, with parents and children underneath (accepting zfs commands). That's just my opinion.
 

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
Which "key" are you referring to here? The key generated for the top-level root dataset (i.e, same name as the pool, "ssd" vs "hdd"), or for the particular child dataset(s) within (i.e, "dataset")?
I'm assuming that the encryption uses the .json from the root dataset.

Now that I'm thinking about it: if you use -Rw on the send but -x encryption on the recv, which root key will it use? Is there a command or way to tell?
 

mouseskowitz

Dabbler
Joined
Jul 25, 2017
Messages
36
New problem. How do you unlock a dataset after doing a zfs send -Rw? I tried unlocking it with the .json from the original root dataset and it doesn't work.
 
Joined
Oct 22, 2019
Messages
3,641
New problem. How do you unlock a dataset after doing a zfs send -Rw? I tried unlocking it with the .json from the original root dataset and it doesn't work.

That's because the .json for the pool / root dataset is referencing a different name. The difference is the pool portion of the path:
  • hdd/data
  • ssd/data
This means you need to open up the .json with a text editor and change the old reference(s) to the new reference(s), so that the entire root-parent-child path matches the new pool. Since you're not using a passphrase, this feels counter-intuitive. What happened when you sent it with -w is that it copied everything over exactly as-is, including the dataset's own 64-character key string (technically its hash) that it initially inherited from its original root/parent.

An example .json file for dataset_hdd_keys.json might look something like this:
Code:
{"hdd/data": "a284d92b66026dbef17bc5db68025e9579e8fa7ec00708668906bc75be666ff5"}

Notice that this particular line is referencing hdd/data?

If you later send this as a raw stream (-w) to the pool ssd, where it now resides as ssd/data, the .json needs to be copied, renamed, and edited to reflect the new pool's layout. So you can make a copy named dataset_ssd_keys.json, which looks like this (notice the same exact 64-character key, now for the dataset ssd/data):
Code:
{"ssd/data": "a284d92b66026dbef17bc5db68025e9579e8fa7ec00708668906bc75be666ff5"}



However, in most cases the "encryptionroot" at the higher level is what is really being triggered to unlock itself (and its inheriting children) with a .json keyfile. So in reality, the changes most likely should appear as follows, since the root dataset for ssd was created with a different random key, and the specific dataset ssd/data is not inheriting from the root dataset ssd.

For hdd pool:
Code:
{"hdd": "a284d92b66026dbef17bc5db68025e9579e8fa7ec00708668906bc75be666ff5"}


For ssd pool:
Code:
{"ssd/data": "a284d92b66026dbef17bc5db68025e9579e8fa7ec00708668906bc75be666ff5"}


EDIT: I might have reversed the order. I lost track between hdd vs ssd, but the same principles apply. If it's the reverse order, then simply flip ssd and hdd in the examples.
 
Last edited: