Scripting transition from 128kB to 1MB dataset recordsize

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
Hi! I'd like to change the recordsize to 1MB on many of my datasets that hold large files like pictures and movies, but I don't have a "run this script to do it" method.

What I'd like is something that:
  1. Creates the new dataset.
  2. Copies everything over.
  3. Keeps re-running the copy command until no changes are being made.
  4. Deletes the old dataset and renames the new one.
Does anyone have a script that does this reliably without changing permissions or file creation times?
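Roughly, I'm picturing something like this (untested sketch; POOL/DATASET and the /mnt paths are just placeholders), though I'm not sure rsync keeps the creation times:
Code:
# Untested sketch -- POOL/DATASET and the mount paths are placeholders.
zfs create -o recordsize=1M POOL/DATASET.new      # 1. Create the new dataset at 1M.
until [ -z "$(rsync -aHAX --delete -i /mnt/POOL/DATASET/ /mnt/POOL/DATASET.new/)" ]; do
  sleep 1                                         # 2+3. Keep copying until a pass reports no changes.
done
zfs destroy -r POOL/DATASET                       # 4. Delete the old dataset...
zfs rename POOL/DATASET.new POOL/DATASET          #    ...and rename the new one into its place.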
 

probain

Patron
Joined
Feb 25, 2023
Messages
211
One method would be to zfs send | recv to a new dataset. After that's done, you could remove the old dataset, then rename the "new" dataset. This obviously breaks previous snapshots for the original dataset(s), and if you share them directly, those shares will break too.
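Very roughly, with made-up dataset names:
Code:
zfs snapshot POOL/data@move
zfs send POOL/data@move | zfs recv -o recordsize=1M POOL/data_new   # new dataset gets recordsize=1M set on it
zfs destroy -r POOL/data                                            # old dataset and its snapshots go away
zfs rename POOL/data_new POOL/data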

An alternative method would be to zfs send | recv to a temporary dataset, remove the old content (and possibly snapshots, if space is needed), change the original dataset's recordsize, then zfs send | recv back into the original dataset. Newer ZFS versions go up to 16M recordsize :wink:. This preserves the shares. (I used this method a month ago.)
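Roughly like this (made-up names; here I'm sketching the copy-back step with plain cp rather than send | recv, just to keep it simple):
Code:
zfs snapshot -r POOL/data@hop
zfs send -R POOL/data@hop | zfs recv POOL/tmp_data   # park a copy in a temporary dataset
rm -rf /mnt/POOL/data/*                              # clear the old content (drop old snapshots too if space is tight)
zfs set recordsize=1M POOL/data                      # change the recordsize on the original dataset
cp -a /mnt/POOL/tmp_data/. /mnt/POOL/data/           # bring the data back so it's rewritten at 1M
zfs destroy -r POOL/tmp_data                         # clean up the temporary copy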

A third method is looking for some sort of rebalancing script (which is what you're asking for). I did a quick google and found MarkusRessel: zfs-inplace-rebalancing. Obviously I haven't tried this script and I haven't read it, so I don't vouch for its validity or anything of the sort. I'm only giving pointers to examples :smile:
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
See my guide: Bluefin Recommended Settings and Optimizations
Search the page for Datasets Structure Organization; it's exactly what you want to do. Create a new dataset with the correct recordsize, then copy the files recursively from the terminal. Lastly, delete the old dataset and rename the new dataset to the old name, as instructed in the guide.
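In short, something like this (dataset names are just examples):
Code:
zfs create -o recordsize=1M POOL/media-new          # new dataset with the correct recordsize
/bin/cp -a /mnt/POOL/media/. /mnt/POOL/media-new/   # recursive copy, preserving permissions, ownership, and timestamps
zfs destroy -r POOL/media                           # delete the old dataset
zfs rename POOL/media-new POOL/media                # rename the new dataset to the old name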
 
Last edited:

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
I'll check out that guide.

A rebalance script only rewrites files in the existing dataset. I'm assuming once I change this "Record Size" value, all new files are going to use the new size?

[Attachment: screenshot of the dataset's Record Size setting]


If so, would I have to do the same for datasets in other zpools where I send that same data?

Also, is 16M safe? I mean, these multi-gigabyte video files are all in a single dataset. It makes sense to me, but I'm not sure what drawbacks to be aware of.

In the case of smaller-file datasets, I can leave those at 128K since that's the default. That is, unless you suggest I change those as well. I was reading a lot about the record size in the past and was told 128K is a good default for everything, but if I can gain something from changing it, I think I should.

I have all my data already organized nicely into separate datasets, each with their own similar file types. I don't have any VMs or databases, and when people were talking about changing the record size, they were typically running databases and VMs. I'm not. I only store regular file data. Any working files are typically on each PC. The NAS is a backup + large-volume data storage container.
 

asap2go

Patron
Joined
Jun 11, 2023
Messages
228
I'll check out that guide.

A rebalance script only rewrites files in the existing dataset. I'm assuming once I change this "Record Size" value, all new files are going to use the new size?
Yes. When you change the value, new data will be stored using the new recordsize.
Also, is 16M safe? I mean, these multi-gigabyte video files are all in a single dataset. It makes sense to me, but I'm not sure what drawbacks to be aware of.
For a dataset containing only files bigger than 16MB, there is nothing wrong with using that setting, although I don't know how big the benefit is.
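For reference, setting and checking it from a shell is just (dataset name is an example):
Code:
zfs set recordsize=1M POOL/media   # only affects blocks written from now on
zfs get recordsize POOL/media      # confirm the current value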
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
Looks like we need to modify that zfs-inplace-rebalancing script to use `cp --reflink=never` on OpenZFS 2.2, because block cloning means copies within the same pool may not actually rewrite the data.

I'm also uncertain if it messes with file creation times.
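If we do patch it, I'd guess the per-file step ends up looking something like this (hypothetical and untested, `$file` is a placeholder variable):
Code:
# Hypothetical per-file rewrite: force a real copy (no block cloning), then swap it into place.
# cp -a keeps mode/ownership/mtime, but the creation (birth) time will be the copy's, not the original's.
cp --reflink=never -a -- "$file" "$file.rebalance_tmp"
mv -- "$file.rebalance_tmp" "$file"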

I think transferring the data out to another pool and back seems like the safest and fastest approach. One of my datasets is ~30 TiB, and that has me worried, but I have enough space to make the transition as well as a bunch of large drives on tap. Since this is a mirrored pool, it's a simple `zpool add`.
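Something like this with whatever disks I have spare (device names made up):
Code:
# Grow the mirrored pool by adding another mirror vdev (device names are placeholders).
zpool add POOL mirror /dev/sdx /dev/sdy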

For the script, I'm thinking something like this:
Code:
zfs snap -r POOL/DATASET@backupTransfer # Creates a recursive snapshot for transferring this dataset.
zfs send -RL POOL/DATASET@backupTransfer | pv -Wbraft | zfs recv -Fuv -o recordsize=1M POOL/DATASET.new # Copies all data and snapshots to a new dataset.
zfs destroy -rv POOL/DATASET@backupTransfer # Destroys the temporary snapshot on the original dataset.
zfs destroy -rv POOL/DATASET.new@backupTransfer # Destroys the temporary snapshot on the new dataset.
zfs destroy -rv POOL/DATASET@% # Destroys all remaining snapshots in the original dataset ('%' is ZFS's all-snapshots range).
zfs destroy -v POOL/DATASET # Destroys the original dataset.
zfs rename POOL/DATASET.new POOL/DATASET # Subs in the new dataset in place of the old one.

This is a combination of both your suggestions. Is this safe?

With `zfs send -R`, I don't have to get rid of any snapshots either; it'll transfer all of them!

Is there an easy way to restart from where I left off if I have to restart the server during one of these? `-I` enables restarting, but I don't know how to capture the last snapshot. Is it in the logs?
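Or is resumable receive the better way to handle that? From what I've read (untested), it'd be something like:
Code:
# Start the receive with -s so an interruption leaves a resume token behind.
zfs send -RL POOL/DATASET@backupTransfer | zfs recv -s -Fuv -o recordsize=1M POOL/DATASET.new
# After a reboot, read the token off the partially received dataset and resume the stream.
token=$(zfs get -H -o value receive_resume_token POOL/DATASET.new)
zfs send -t "$token" | zfs recv -s -v POOL/DATASET.new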

When I do `zfs recv -o recordsize=16M`, does that create the new dataset at 16M, or does it only apply to the data I'm sending, so I'd have to set the dataset to 16M separately?

To make this a bit safer, I think it'd be good to run the first 4 steps (where it takes a snapshot and sends) twice. That way, if any changes come in right as the dataset is being deleted, it ensures they get pulled over.
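So the second pass would probably be an incremental on top of the first snapshot, something like (untested, and it needs @backupTransfer to still exist on both sides):
Code:
zfs snap -r POOL/DATASET@backupTransfer2 # Second snapshot right before the cutover.
zfs send -RLI @backupTransfer POOL/DATASET@backupTransfer2 | zfs recv -Fuv POOL/DATASET.new # Sends only what changed since the first snapshot.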
 
Last edited:

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
Why do you want to do all these complicated steps? I told you exactly what to do.
Code:
/bin/cp -a /olddir/* /newdir/

Once the data is copied into the new dataset, destroy the old dataset and rename the new dataset.
So many things can go wrong when copying files without duplicating the dataset. Not to mention I lose all my snapshots.

I've had data corruption when copying files before. I don't wanna have that happen ever again.
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
First thing I did was convert all my datasets to 1M:
Code:
# Set recordsize=1M on every filesystem in the pool (pool name here is "Bunnies").
for dataset in $(zfs list -H -r -t filesystem -o name Bunnies); do
  zfs set recordsize=1M "$dataset"
done

Surprisingly, when doing a zfs send/recv through TrueNAS, the data was already coming in at 1M on the other dataset.

Once that's done, I can copy that data back, or I can overwrite the entire zpool and start over (which is what I'm going to do for redundancy and capacity reasons), but only because I had 2 other zpools available; otherwise, I would've done this in the same pool.

You can also change all datasets to 1M recordsize now so everything new is also at 1M.
 
Last edited: