Server migration query

Popolou

Dabbler
Joined
Nov 8, 2011
Messages
26
Just a quickie in case anyone has a better solution.

Will be migrating to new hardware, so will be doing a zfs send | receive. However, I will also be changing all the recordsizes to make more efficient use of the hardware for our workflows. So am I left with nuking it all afterwards with an rm -r, only to then cp -a it back to make the new recordsize 'take'?
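For reference, the sequence I had in mind was roughly this (pool names and paths are made up, and 1M is just the recordsize I'm after):

# replicate over to the new hardware
zfs snapshot -r oldpool/data@migrate
zfs send -R oldpool/data@migrate | ssh newbox zfs receive -u newpool/data

# then on the new box: change the policy and rewrite everything so it 'takes'
zfs set recordsize=1M newpool/data
cp -a /mnt/newpool/data /mnt/newpool/scratch
rm -r /mnt/newpool/data/*
cp -a /mnt/newpool/scratch/. /mnt/newpool/data/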

Cheers
Pops
 
Joined
Oct 22, 2019
Messages
3,641
So am I left with nuking it all afterwards with an rm -r, only to then cp -a it back to make the new recordsize 'take'?
That's redundant work.

If you want your files to be constructed with a different recordsize, and you're already migrating your data anyway, don't use zfs send/recv. Create your new pool and dataset(s) with your preferred recordsize, then use rsync to migrate the data to the new server; the copied files will be written with the preferred recordsize.
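A minimal sketch of that, with made-up names and 1M as the example recordsize (adjust the rsync flags to whatever metadata you need to preserve):

# on the new server: create the dataset with the recordsize you actually want
zfs create -o recordsize=1M newpool/data

# pull the files across; they get written as fresh records at the new size
rsync -aHAX --info=progress2 oldserver:/mnt/oldpool/data/ /mnt/newpool/data/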
 

Popolou

Dabbler
Joined
Nov 8, 2011
Messages
26
Thanks, I did consider it but went for the speed option. On reflection, however, I can clearly do it in one step rather than two, so rsync does win after all :smile:
 
Joined
Oct 22, 2019
Messages
3,641
But what purpose is served by using "zfs send" to migrate data over to a new pool... to then immediately destroy all of the data?

It's like calling an Uber to pick you up, telling them to drive around town in a full circle, only to be dropped off back at your house. Then, once you are home, you say to yourself, "Okay, now I can call an Uber to pick me up and take me to the store."
 

Popolou

Dabbler
Joined
Nov 8, 2011
Messages
26
:grin:

Sending it via ZFS would have been the cleaner and quicker route, without, I suppose, having to do a diff to check for discrepancies. But I also believed (perhaps mistakenly) that snapshots would not move across with rsync; I suspect from your suggestion that this is not the case? I am looking into it now.
 
Joined
Oct 22, 2019
Messages
3,641
I think you're misunderstanding the concepts of records, recordsize, and snapshots.

Once written, a record is immutable. It's never modified in place, nor will it ever change in size. Snapshots deal in records, not files.

All the snapshots from when your datasets had the old recordsize point to the existing (now "destroyed") records. When you outright destroy the files those records belong to, the snapshots still point to the old records written with your undesired recordsize.

So now you've got two sets of files: the "destroyed" ones that only exist in the snapshots, and the new ones which you created via "cp".

You essentially just used up twice the capacity for the same set of files.
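You can see this for yourself on the dataset in question (name made up here); usedbysnapshots is the space pinned purely by old records that the live filesystem no longer references:

zfs list -o space tank/data
zfs list -t snapshot -r tank/data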

EDIT: In other words, you can't have it both ways.
  • Either you abandon your former snapshots, so that you can use a fresh set of files with a new recordsize, or...
  • You keep your existing snapshots and files (which were written with the "unwanted" recordsize), and live with the fact that the new recordsize policy will only affect newly created files. (The already-existing files keep their records as they are, from when the previous recordsize policy was in effect.)
 
Last edited:

Popolou

Dabbler
Joined
Nov 8, 2011
Messages
26
Well observed. I completely overlooked the fact that the snaps reference records, and yes, by 'replacing' the files you break the chain and double up on the resultant storage.

However, is there not a way I can have both options if I accept a compromise? In our case, the overwhelming majority of the data is static, with only a few datasets consuming notable snapshot space. I suppose for those datasets which exhibit no churn (i.e. no sizeable snapshots) I can blow them away and recreate them afresh; they will then gain the benefit of the adjusted recordsize. Those which cannot, because of long-standing and large snaps, will remain and, as you say, the new recordsize will apply to subsequent files, so the old ones simply expire over time.
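Per dataset, I imagine that would look something like this (names and the 1M value are only examples, and the destroy obviously drops that dataset's snapshots for good):

# static dataset with no churn: rebuild it under the new recordsize
zfs create -o recordsize=1M tank/archive_new
rsync -aHAX /mnt/tank/archive/ /mnt/tank/archive_new/
zfs destroy -r tank/archive
zfs rename tank/archive_new tank/archive

# busy dataset whose snapshots must stay: only change the policy,
# so just the newly written files pick up the new recordsize
zfs set recordsize=1M tank/projects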
 
Joined
Oct 22, 2019
Messages
3,641
However, is there not a way I can have both options if I accept a compromise?
You can't have both at once. It's "either-or", though you can apply that choice with some granularity at the per-dataset level, decision by decision. (Which you noted in your latest post.)

The fact is that snapshots are tethered to immutable records. There's no way around this.

If you want to destroy all snapshots for certain datasets, while keeping the snapshots for others, that's your call. Whatever works best for you. But remember that there's no option which lets you keep the best of both worlds for a particular dataset. It's "either-or".
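For the datasets where you do decide to drop the history, a cautious pattern (dataset name made up) is to list the snapshots first and only pipe that list into destroy once you're happy with it:

# see exactly which snapshots would go
zfs list -H -o name -t snapshot -r tank/static

# remove them; the live files in the dataset are untouched
zfs list -H -o name -t snapshot -r tank/static | xargs -n1 zfs destroy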
 