SOLVED Source Replication disables/invalidates/removes/overwrites destination Snapshot schedules?

probain

Patron
Joined
Feb 25, 2023
Messages
211
I'm replicating a dataset from [nvme-mirror]-pool (source) to my [hdd-pool]. But in doing so, I'm having a bit of trouble with managing retentions and secondary snapshot-schedules on the target pool. Basically, my recursive snapshot schedules for the target dataset are completely overwritten and/or removed by the replication.

I've experimented with all of the options for retention policies, and with trying to stagger the snapshot schedules.

My goal is to have the replication send one dataset per schedule to a target dataset, and then have the snapshot schedules of the target pool take over for retention.

Hoping that you all will enlighten me where I'm getting things confused.
In here I would also expect to see snapshots "auto-hourly-hdd-2023-XX-YY" alongside the replicated ones
1698578363195.png

Replication task settings

1698578505656.png

Edit: Better wording, and hopefully a bit clearer about what I mean
 
Last edited:

probain

Patron
Joined
Feb 25, 2023
Messages
211
Further testing. And it indeed seems like the replication completely overwrites/removes all of the snapshots at the target. Which in one way is not entirely unexpected.
So what would the better solution be then? I would like to back up one dataset from one pool to another, then let the target pool maintain its own snapshots of the data therein.

Reason:
The source pool is a mirrored NVME-pool. So space is at a premium. The HDD-pool is for long term storage.

Please and thank you for insights
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Set the Retention policy for the destination to "Custom" instead of "Same as source" and you can configure e.g. 6 months at the destination with 2 weeks at the source.
 

probain

Patron
Joined
Feb 25, 2023
Messages
211
Set the Retention policy for the destination to "Custom" instead of "Same as source" and you can configure e.g. 6 months at the destination with 2 weeks at the source.
Thanks for your input.
I've experimented with that. And keeping hourly snapshots for weeks and/or months is a bit overkill.

I was hoping that there was a way to have the target "take over" the responsibility of managing snapshots afterwards. But the replication task really overwrites all of those.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
These two tasks keep hourly snapshots on the source for two weeks, replicate only one snapshot every 24 hours to the destination, and keep them there for 6 months. That's about all the granularity I think I could ever need.

Bildschirmfoto 2023-10-30 um 13.20.52.png Bildschirmfoto 2023-10-30 um 13.18.56.png
 

probain

Patron
Joined
Feb 25, 2023
Messages
211
I'm getting the impression that what I'm looking for might be impossible then? That the source completely dominates, removing even the snapshots that the target takes on its own in the process.

This comes as a bit of a surprise, since I was expecting them to work in parallel with each other.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
The target is just an SSH/RPC destination. All properties of the replication are controlled on the source side. The target does not even need to be TrueNAS. Anything with SSH and ZFS will probably do.

You can of course configure a PULL replication. Then everything is controlled on the target side. That requires a trusted channel from the target to the source, though, and might be undesirable from a network/zone design point of view.

What is missing from my example tasks? There is no dynamic expiration, anyway, neither push nor pull. The source does not know the capacity of the destination. But you do. So start with one month retention; if the target pool is only 10% filled after a month, jack it up to 3 or 6 months. You monitor your capacity, anyway, don't you? :wink:
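If you want to script that check, here is a minimal sketch; the pool name "hdd" is just a placeholder for your destination pool, and the thresholds are arbitrary.

Code:
import subprocess

POOL = "hdd"  # placeholder name for the destination pool

# "capacity" is the pool's used space in percent; -H drops the header line
out = subprocess.run(
    ["zpool", "list", "-H", "-o", "capacity", POOL],
    capture_output=True, text=True, check=True,
)
capacity = int(out.stdout.strip().rstrip("%"))

print(f"{POOL}: {capacity}% full")
if capacity < 50:
    print("Plenty of headroom -- destination retention could probably be raised.")
elif capacity > 80:
    print("Getting full -- consider shortening the destination retention.")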
 

probain

Patron
Joined
Feb 25, 2023
Messages
211
I just realized that I haven't mentioned that these two pools live inside the same server. But that is kind of beside the point anyway if it's only a transport method, other than that it means I can't set up a PULL replication (good suggestion, though).

I'm just surprised that the replication task deletes the snapshots that are taken on the target separately. Even manual ones. What I expected was for the snapshots to exist in parallel with each other. And I'm equally surprised that this seems to be accepted behaviour, working as intended.

Is there any way to use replication and then have the target manage the snapshots it takes of this dataset? I kind of need to have the hourly replication, but keeping these for months would result in many thousands of snapshots. And unless my other two suggestions for strengthening the snapshots dashboard [shameless plug] go through, that would become a real hassle to manage o_O

And yes. Of course I monitor the capacity. Which is also how I noticed this unexpected behaviour.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Any one dataset should be managed by one snapshot and replication task alone, IMHO. Anything else is calling for disaster - see @winnielinnie's link here - probably because nobody ever thought of that use case.

I do

- hourly snapshots of the datasets on my SSDs with 2 weeks retention
- replicate them to the spinning disk pool on the same system, still all hourly snapshots, 4 weeks retention
- replicate them to an offsite system, only one snapshot per 24 hours, 6 months retention

All of that works perfectly well and reliably. Why I would take another set of snapshots on the destination of something that is initiated and controlled by the source, I do not quite understand. Why does your target take extra snapshots locally? It can get all the snapshots you want to keep as replicated ones from the source. They are read-only on the target, anyway - you cannot modify them or replication will break.
 

probain

Patron
Joined
Feb 25, 2023
Messages
211
Thank you for your insights and comments. I'm going to have to think about what my options are. And that also includes re-evaluating some preconceived notions and assumptions.

Once again. Thanks!
 
Joined
Oct 22, 2019
Messages
3,641
Any one dataset should be managed by one snapshot and replication task alone

Which is why "staged" snapshots would be a great thing for TrueNAS. (It's available in other software.) :frown:

snapshot-pyramid-png.47129

Theoretical illustration of "staged" snapshots, handled by a single task



Unfortunately, zettarepl is too entrenched in its "own way" of handling pruning (i.e., parseable names), and I don't believe iXsystems is willing to re-code an entirely new staging/pruning system from scratch.

Yes, I'm aware that this isn't the place to make feature requests, but I already know it will be shot down if I create a ticket for it. (It's a major feature that would obsolete their current implementation.)

Not trying to go off-topic. Just wanted to share my opinion.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Thank you for your insights and comments. I'm going to have to think about what my options are. And that also includes re-evaluating some preconceived notions and assumptions.
Just to elaborate a bit on my last remark ...

If you have replicated, say, a hundred snapshots to the target system or pool, and then take another snapshot of that target pool locally ... that one will be 100% identical to the last snapshot you replicated. The target is read-only. It is whatever is in the youngest snapshot.
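You can see this for yourself: a snapshot taken locally on the replicated target references no new data, so its "written" value is 0. A minimal sketch, with the dataset name as a placeholder:

Code:
import subprocess

DATASET = "hdd/backup/vms"  # placeholder name for the replication target

# List the target's snapshots, oldest first, with the "written" property:
# the amount of new data each snapshot references compared to the previous one.
out = subprocess.run(
    ["zfs", "list", "-H", "-p", "-t", "snapshot",
     "-o", "name,written", "-s", "creation", DATASET],
    capture_output=True, text=True, check=True,
)

for line in out.stdout.splitlines():
    name, written = line.split("\t")
    # A snapshot taken locally right after a replication run shows written=0,
    # i.e. it is identical to the youngest replicated snapshot.
    print(f"{name}  written={written} bytes")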

That's if you stick with what zettarepl offers and makes easy to configure.

See my next post for my thoughts on @winnielinnie's ideas.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Which is why "staged" snapshots would be a great thing for TrueNAS. (It's available in other software.)

Sure, that would be nice, but that is simply not what zettarepl does. Although it can be achieved somewhat if you pick a different destination for each of your retention policies. I think you can quite easily use the "replicate only snapshots" field to achieve that.

Local snapshots hourly, 4 weeks retention
Use a "replicate only" expression that matches once per day, replicate only these with 6 months retention
Use a "replicate only" expression that matches once per week, replicate only these with 2 years retention
Use a "replicate only" expression that matches once per month, replicate only these with 10 years retention

Use a different target dataset for each. You will need roughly 3 times the space, granted.

And now you really got me thinking :wink:

With all the reliance on naming schemes instead of ZFS properties ... why not use that to our advantage?

Create an hourly, daily, weekly, monthly snapshot task and put "hourly, daily, weekly, monthly" in the naming scheme.
Then you should be able to stay within a single dataset. The extra snapshots on the source do not take space.

If e.g. at 12 AM on the 1st of a month which also happens to be a Sunday you take

@auto-hourly-2023-10-30-00
@auto-daily-2023-10-30
@auto-weekly-2023-44-Sun # week number and day
@auto-monthly-2023-10

They won't take up four times the space of a single one. Since they are all identical, I think I have valid reasons to assume the space will be taken only once. Then run 4 replication tasks into the same target dataset. All different "namespaces", no risk of accidental deletion.
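To illustrate why the four "namespaces" cannot step on each other: each replication task only ever matches (and therefore only ever prunes) snapshots of its own naming schema. A toy sketch, with made-up snapshot names and the retentions from my list above:

Code:
import re

# Made-up snapshot names on one source dataset, four naming schemas
snapshots = [
    "auto-hourly-2023-10-30-00",
    "auto-hourly-2023-10-30-01",
    "auto-daily-2023-10-30",
    "auto-weekly-2023-44-Sun",
    "auto-monthly-2023-10",
]

# One pattern per replication task; each task only sees (and prunes)
# snapshots of its own schema, so the namespaces cannot collide.
tasks = {
    "hourly  (4 weeks)":  re.compile(r"^auto-hourly-"),
    "daily   (6 months)": re.compile(r"^auto-daily-"),
    "weekly  (2 years)":  re.compile(r"^auto-weekly-"),
    "monthly (10 years)": re.compile(r"^auto-monthly-"),
}

for task, pattern in tasks.items():
    print(task, "->", [s for s in snapshots if pattern.match(s)])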
 

probain

Patron
Joined
Feb 25, 2023
Messages
211
Sure, that would be nice, but that is simply not what zettarepl does. Although it can be achieved somewhat if you pick a different destination for each of your retention policies. I think you can quite easily use the "replicate only snapshots" field to achieve that.

[ ... snipped ... ]

Create an hourly, daily, weekly, monthly snapshot task and put "hourly, daily, weekly, monthly" in the naming scheme.
Then you should be able to stay within a single dataset. The extra snapshots on the source do not take space.

[ ... snipped ... ]

Then run 4 replication tasks into the same target dataset. All different "namespaces", no risk of accidental deletion.
This is basically what I'm doing in my pools. And this is also why I was surprised that my replications overwrote those, even when the naming scheme was different.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Which of Patrick's two methods is "basically what you are doing"?

probain

Patron
Joined
Feb 25, 2023
Messages
211
Which of Patrick's two methods is "basically what you are doing"?
Ah, sorry...

I'm doing something similar to what's listed below.
Create an hourly, daily, weekly, monthly snapshot task and put "hourly, daily, weekly, monthly" in the naming scheme.
Then you should be able to stay within a single dataset. The extra snapshots on the source do not take space.
My snapshots are basically like this, with naming scheme : retention. The same goes for the pool 'nvme' too, with 'nvme' instead of 'hdd'.
Code:
auto-hourly-hdd-%Y-%m-%d_%H-%M : 24h
auto-daily-hdd-%Y-%m-%d_%H-%M : 7 days
auto-weekly-hdd-%Y-%m-%d_%H-%M : 4 weeks
auto-monthly-hdd-%Y-%m-%d_%H-%M : 1 month
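(For what it's worth, those %-patterns are plain strftime formats, which is also what makes the names machine-parseable for retention. A quick sketch with a made-up timestamp:)

Code:
from datetime import datetime

scheme = "auto-hourly-hdd-%Y-%m-%d_%H-%M"

# Expanding the scheme into an actual snapshot name...
taken_at = datetime(2023, 10, 30, 13, 0)   # example timestamp
name = taken_at.strftime(scheme)            # auto-hourly-hdd-2023-10-30_13-00
print(name)

# ...and parsing it back, which is how an age-based retention (e.g. 24h) can work
age_hours = (datetime(2023, 10, 31, 14, 0)
             - datetime.strptime(name, scheme)).total_seconds() / 3600
print(f"{age_hours:.0f} hours old")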


But I've now already given in, rethought how I use the replications, and will just have to live with the many, many thousands of snapshots in the target dataset.
 
Joined
Oct 22, 2019
Messages
3,641
Create an hourly, daily, weekly, monthly snapshot task and put "hourly, daily, weekly, monthly" in the naming scheme.
Then you should be able to stay within a single dataset. The extra snapshots on the source do not take space.

[ ... snipped some stuff about how great Winnie is since it takes up too much text ... ]

They won't take up four times the space of a single one. Since they are all identical, I think I have valid reasons to assume the space will be taken only once. Then run 4 replication tasks into the same target dataset. All different "namespaces", no risk of accidental deletion.
The "issue" I have with that is it's a "workaround" for something that should be designed and implemented by the software team itself.

(It can work, and I've set up something similar for some friends that use TrueNAS.)

But guess what? Now you're creating 4 tasks for a single dataset. That means you'd have 16 tasks if done with four datasets, etc.

"Smart staging" of backups and snapshots from a single schedule is not a novel concept. Even file-based backup softwares includes this, such as BorgBackup and BackInTime.

I think what happened is that zettarepl was designed to leverage "parseable" names (since it's easier to code, and is faster than inspecting metadata or keeping track with a separate database); thus, many years later, it has just remained this way for legacy reasons.



Now that I think about it, it might still be possible to do using "parseable" names. Hmmmmmm....
  1. You create a single snapshot task that takes a snapshot every hour
  2. There would be a "feature" in the snapshot task GUI to use a "preset" or "custom" staging/pruning schedule
  3. It would rename existing snapshots with -weekly, -monthly, -yearly, etc., to make sure that at least ___ of them exist
  4. The remaining ones would not be renamed, but rather destroyed
  5. The ones that were renamed would have their own retention policy, based on the "preset" or "custom" staging/pruning options you configured
It sounds convoluted, but at least it can theoretically work with zettarepl's "parseable names".
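A very rough sketch of what steps 3-5 could look like (dry run only, it just prints the zfs commands it would issue; the dataset and naming schema are made up, and this is emphatically not how zettarepl works today):

Code:
import subprocess
from datetime import datetime

DATASET = "nvme/vms"                   # hypothetical source dataset
SCHEMA = "auto-hourly-%Y-%m-%d_%H-%M"  # hypothetical hourly naming schema

def list_snapshot_names(dataset):
    # Names of all snapshots of the dataset, oldest first
    out = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name",
         "-s", "creation", dataset],
        capture_output=True, text=True, check=True,
    )
    return [line.split("@", 1)[1] for line in out.stdout.splitlines()]

def parse(name):
    # Only snapshots matching the hourly schema belong to this task
    try:
        return datetime.strptime(name, SCHEMA)
    except ValueError:
        return None

seen_days = set()
for snap in list_snapshot_names(DATASET):
    taken = parse(snap)
    if taken is None:
        continue
    day = taken.date()
    if day not in seen_days:
        # Step 3: promote the first hourly snapshot of each day to a daily one
        seen_days.add(day)
        print(f"zfs rename {DATASET}@{snap} {DATASET}@auto-daily-{day:%Y-%m-%d}")
    # Steps 4/5 would go on to promote weeklies/monthlies, destroy hourlies
    # that have aged out, and apply a separate retention to the renamed ones.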
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
But guess what? Now you're creating 4 tasks for a single dataset.
How many of these would you have? I have one VM dataset, iocage/jails of course, and then one share dataset. All with children, of course, but I just treat these three recursively and call it a day.

You are right, though :smile:
 