Unencrypted dataset size is smaller than the same data encrypted

grigory

Dabbler
Joined
Dec 22, 2022
Messages
32
Hi there, I had an encrypted dataset and moved the files to an unencrypted one. The total data size is about 30% less when unencrypted. Is this normal? Should I be worried about something not getting moved? Everything else indicates all the data is there (spot checking, file count, no errors during the move).

Thanks!
 
Joined
Oct 22, 2019
Messages
3,641
There's not enough information to go by.

Can you expand?

How did you copy the data? What is your pool layout? Is this data moving within the same pool or across pools? How are snapshots involved? Anything else you can share that might help?

There's a good deal of useful information that can be had by using the zfs and zpool commands to grab dataset and pool properties.
 

grigory

Dabbler
Joined
Dec 22, 2022
Messages
32
Sure, thanks for the response.
These datasets are in the same pool, so the data was moved within one pool. (I have two pools, one for apps and one for data, but both of these datasets are in the data pool.)
I moved the data using the mv command from the system shell.
There are 28 snapshots on the encrypted dataset.
There are 0 snapshots on the unencrypted dataset.
I am not sure what else might be useful here, thanks again.
 
Joined
Oct 22, 2019
Messages
3,641
I moved the data using the mv command from the system shell.
Why "mv" and not "cp" (or even better "rsync"? or even yet better "zfs send/recv?") If something goes wrong with "mv" or it's interrupted, you can be left in a state of limbo.

There are 28 snapshots on the encrypted dataset.
There are 0 snapshots on the unencrypted dataset.
This is likely why you're seeing a difference in used space.
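
If you want to confirm this, you can check how much of each dataset's used space is held by snapshots (the pool name here is an example; substitute your own):
Code:
# "usedbysnapshots" shows space retained only by snapshots
zfs list -r -o name,used,usedbydataset,usedbysnapshots mypool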
 

grigory

Dabbler
Joined
Dec 22, 2022
Messages
32
Thanks. I actually made a typo and just updated my last post: it had 28 snapshots, so your explanation seems even more likely.
IIRC, I did some reading and mv was recommended to preserve file properties (creation dates, etc.), though I'm not sure if zfs send/recv would do this. I also wanted to be sure that nothing would end up encrypted in the new dataset. Thinking back, I agree rsync would probably have been better.
Thank you.
 
Joined
Oct 22, 2019
Messages
3,641
mv was recommended to preserve file properties (create dates etc), although I am not sure if zfs send/recv would do this.
"mv's" advantage over "cp" or "rsync" is irrelevant when traversing beyond filesystems. (Each dataset is its own filesystem.)

Regardless, "cp -a" ("archive mode") will preserve all properties, as will "rsync -a".

"zfs send/recv" replicates the filesystem itself (which means you're getting an exact recreation of the filesystem, including timestamps and metadata.)


I also wanted to be sure that nothing would have been encrypted in the new dataset.
You can specify what properties to exclude when replicating a dataset for the first time. However...


...careful when mixing encrypted and unencrypted datasets in the same pool. Is your top-level root dataset encrypted? If so, it's highly discouraged to use any unencrypted datasets in that pool. (It can even lead to unpredictable behavior.)
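
You can check the encryption status of every dataset in the pool (substitute your pool name for "mypool"):
Code:
# Lists the encryption property for the root dataset and all children
zfs get -r -t filesystem encryption mypool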
 

grigory

Dabbler
Joined
Dec 22, 2022
Messages
32
Ok, this is a good reminder; thanks for these details. (I'm sure I will refer back to this post.) The pool is not encrypted, only the dataset.
In light of these details, what is the recommended way to migrate from an encrypted dataset to an unencrypted one?

Thanks again
 
Joined
Oct 22, 2019
Messages
3,641
what is the recommended way to migrate from an encrypted dataset to an unencrypted one?
You have to be extra careful using the command-line.

If this encrypted dataset has no children, you can replicate it into a new unencrypted dataset by first creating a new snapshot, and then sending that snapshot while overriding the encryption property on the receiving side.


First switch to the root user if you are not already root (not sure how SCALE handles this):
Code:
sudo su


Create the migration snapshot:
Code:
zfs snap mypool/cryptdata@migrate-`date +%Y-%m-%d_%H-%M`


Send the snapshot to a new dataset (which does not exist yet). The new dataset will be created by this replication.
Code:
zfs send -v -L -e -c mypool/cryptdata@migrate-2023-07-03_14-38 | zfs recv -v -o encryption=off mypool/plaindata


You will now have a new unencrypted dataset named "plaindata" which will be as up-to-date as the moment you created the new "migrate" snapshot.

A quick breakdown:
  • mypool = the name of your pool (also the name of your root dataset)
  • cryptdata = the name of the currently encrypted dataset
  • plaindata = the name of the new unencrypted dataset (which doesn't exist until you run the send/recv)
  • @migrate-`date +%Y-%m-%d_%H-%M` = the new snapshot name which will be timestamped (with the "zfs snap" command)
  • @migrate-2023-07-03_14-38 = an example of how the actual snapshot name will appear
  • -v = be verbose
  • -L = large blocks support
  • -e = more efficient embedded stream
  • -c = more efficient compressed stream (don't "decompress" records that are already compressed)
  • -o = override a property on the receiving side (in this case, disable encryption)

If everything looks good, you can now destroy the old dataset.

However, keep in mind that you will not have any snapshots sent over, except for the most recent "migrate" snapshot.

The above method will not send any "nested children" living underneath your source dataset.

The above method requires that you (temporarily) have enough free space for both datasets, since nothing is deleted automatically.
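
Before destroying the old dataset, it's worth a sanity check that the copy is complete. A minimal sketch, assuming the same example names and the default SCALE mountpoints under /mnt:
Code:
# Compare space usage of both datasets
zfs list -o name,used,usedbydataset mypool/cryptdata mypool/plaindata

# Dry-run rsync with checksums; it itemizes any differences,
# so no output means the two trees match
rsync -n -a -c -i --delete /mnt/mypool/cryptdata/ /mnt/mypool/plaindata/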



EDIT: I should add that it's recommended to do the above in a new "tmux" session, so that you can close the window without interrupting the send/recv process. It's also advisable to use a "resume token" in case the process is interrupted or aborts.
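
A sketch of what that looks like, reusing the example names above (the resume token value is a placeholder you'd copy from the actual output):
Code:
# Start a named tmux session first
tmux new -s migrate

# "-s" on the receiving side saves a resumable state if the stream aborts
zfs send -v -L -e -c mypool/cryptdata@migrate-2023-07-03_14-38 | zfs recv -s -v -o encryption=off mypool/plaindata

# If it's interrupted, fetch the resume token from the target dataset...
zfs get -H -o value receive_resume_token mypool/plaindata

# ...and restart the stream from where it left off
zfs send -t <token-from-above> | zfs recv -s -v mypool/plaindata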
 

grigory

Dabbler
Joined
Dec 22, 2022
Messages
32
Thanks for this; this approach seems more comprehensive. I appreciate you walking through the steps.
ZFS is new to me and I'm still learning. It always feels safer to do the familiar thing, but I will definitely use this approach next time.
 
Joined
Oct 22, 2019
Messages
3,641
It goes without saying, nothing is a substitute for having up-to-date backups.

You don't want to find yourself in a situation where you've lost everything because of a typo, fat-fingering the wrong syntax in a command, or accidentally deleting the wrong target.
 