Replication between identical datasets with no common snapshot

Patrick_3000 · Contributor · Joined Apr 28, 2021 · Messages 167
I have a dataset on a SCALE server with eight child datasets. Let's call it Dataset1. It's approximately 5 TB. I have another dataset which is identical, including the child datasets, on another SCALE server. Let's call it Dataset2.

I use rsync over ssh to keep Dataset2 identical to Dataset1. Unfortunately, it requires eight rsync tasks to sync all the child datasets. I'd prefer to use replication, but I've had some problems with it in the past, and I'm wondering if there is a way to resolve those problems and start using replication again.

The problem I had with replication was that the first SCALE server (the source) had to be taken offline for a week. During that time, all the snapshots on the second SCALE server (the destination) passed their lifetimes and were automatically destroyed. In retrospect, I should have set longer snapshot lifetimes. In any case, after the first SCALE server came back online, there were no snapshots left on the second SCALE server, so, at least as I understood it, I would have had to run the replication task from scratch and transfer the entire 5 TB dataset, which would have meant a lot of I/O on the hard drives. So I abandoned replication and switched to rsync.

So here is the main thing I'm wondering: does anyone know if there is a way to resume replication without transferring the entire 5 TB of data? The datasets are identical anyway. Maybe there is a way, through the command line, to transfer just the snapshots so that there is a common snapshot between the datasets?

I also realize that I could just replicate from scratch and transfer the entire 5 TB, but the problem is that if I have to do that this time, it tells me that I might have to do it in the future if something else goes wrong, in which case I'm not sure that replication is the best way to sync the datasets, and maybe I'm better off staying with rsync.
 
Joined Oct 22, 2019 · Messages 3,641
The datasets are identical anyway
Only "identical" in the eyes of a file-based view. As far as ZFS is concerned, the two datasets are unrelated.


Maybe there is a way through the command line to just transfer snapshots so there is a common snapshot between the datasets?
You can't. You've already created and populated two separate datasets. The backup destination should only grow and populate from snapshots of the source, starting with the very first full replication. There's no way to mix and match snapshot replication with rsync between the source and destination targets.
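To see why, here's roughly what raw zfs send/receive does in this situation. (Pool, dataset, and snapshot names here are made up, and these commands need real pools on both ends to run; the error text is approximate.)

```shell
# An incremental send between two source snapshots. ZFS refuses to
# receive it unless the destination already holds a snapshot with the
# same GUID as the incremental source -- a file-identical rsync copy
# does not count:
zfs send -i tank/Dataset1@old tank/Dataset1@new | \
  ssh backup zfs receive backup/Dataset2
# receive fails with something like:
#   cannot receive incremental stream: most recent snapshot of
#   backup/Dataset2 does not match incremental source

# A full (non-incremental) send into a dataset that already contains
# data requires -F, which rolls back and overwrites the destination:
zfs send tank/Dataset1@new | ssh backup zfs receive -F backup/Dataset2
```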


I also realize that I could just replicate from scratch and transfer the entire 5 TB, but the problem is that if I have to do that this time, it tells me that I might have to do it in the future again if something else goes wrong
Only if you allow a "drift" between common snapshots, either with accidental deletions, or snapshot expirations and auto pruning. Since you're taking your server(s) offline, and there is a pruning task that runs on the destination server, it's bound to happen again. :frown:

You'll have to change the workflow and/or use longer expirations and/or make use of the "hold" and "bookmark" features. (The last point can be handled with a script.)
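For reference, the hold and bookmark commands look roughly like this. (Pool and snapshot names are made up; this needs a real pool to run.)

```shell
# A hold prevents a snapshot from being destroyed (including by
# auto-pruning) until every hold tag on it is released:
zfs hold keep-for-repl tank/Dataset1@auto-2024-05-01_00-00
zfs holds tank/Dataset1@auto-2024-05-01_00-00      # list active holds
zfs release keep-for-repl tank/Dataset1@auto-2024-05-01_00-00

# A bookmark records just the snapshot's identity. It survives the
# snapshot's destruction and can serve as the *source* side of an
# incremental send -- but the destination must still hold the real
# snapshot the bookmark refers to:
zfs bookmark tank/Dataset1@auto-2024-05-01_00-00 \
             tank/Dataset1#auto-2024-05-01_00-00
zfs send -i tank/Dataset1#auto-2024-05-01_00-00 \
            tank/Dataset1@auto-2024-05-02_00-00 | \
  ssh backup zfs receive backup/Dataset2
```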
 

Patrick_3000
Only "identical" in the eyes of a file-based view. As far as ZFS is concerned, the two datasets are unrelated.



You can't. You've already created and populated two separate datasets. The backup destination should only grow and populate from snapshots of the source, starting with the very first full replication. There's no way to mix and match snapshot replication with rsync between the source and destination targets.



Only if you allow a "drift" between common snapshots, either with accidental deletions, or snapshot expirations and auto pruning. Since you're taking your server(s) offline, and there is a pruning task that runs on the destination server, it's bound to happen again. :frown:

You'll have to change the workflow and/or use longer expirations and/or make use of the "hold" and "bookmark" features. (The last point can be handled with a script.)
Thanks for the explanation. Replication would certainly have advantages over rsync: it can cover all the child datasets in a single task, and it's faster.

Assuming that I switch to replication, which I don't understand as well as I understand rsync, would I be able to replicate the primary dataset to the secondary dataset using two different snapshot tasks? I'm thinking hourly snapshots that expire after one day, and daily snapshots that expire after one month. That would ensure the backup dataset is kept current and also that snapshots last long enough to survive any necessary downtime on the primary server. (I wouldn't want to do hourly snapshots that expire after one month because it's awkward to accumulate that many snapshots.)
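The snapshot-count arithmetic behind that choice (assuming a 30-day month):

```shell
# Proposed scheme: hourly snapshots kept 1 day, daily snapshots kept 1 month.
hourly=$((24 * 1))
daily=$((30 * 1))
echo $((hourly + daily))   # 54 snapshots per dataset at steady state

# The alternative ruled out above: hourly snapshots kept a full month.
echo $((24 * 30))          # 720 snapshots per dataset
```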
 
Joined Oct 22, 2019 · Messages 3,641
would I be able to replicate the primary dataset to the secondary dataset using two different snapshot tasks? I'm thinking hourly snapshots that expire after one day, and daily snapshots that expire after one month.
Are you implying you'll only configure a replication task to send the latest "daily" snapshots to the backup? Or rather, that you'll configure two replication tasks, in which "hourly" and "daily" snapshots are sent over?
 

Patrick_3000
Are you implying you'll only configure a replication task to send the latest "daily" snapshots to the backup? Or rather, that you'll configure two replication tasks, in which "hourly" and "daily" snapshots are sent over?
The latter. That is, I'll configure two replication tasks, in which "hourly" and "daily" snapshots are sent over.
 
Joined Oct 22, 2019 · Messages 3,641
The latter. That is, I'll configure two replication tasks, in which "hourly" and "daily" snapshots are sent over.
I can only imagine that the "hourly" replication task will fail if a common snapshot is missing.

The Replication Tasks, as designed for TrueNAS, are very narrow in scope. There's no "fallback to another replication task" built in.

So then you'll be relying solely on your "daily" replication task, which undermines the point of setting up hourly replications in the first place.
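If you want to check ahead of time whether any common snapshot still exists, you can compare the snapshot names on both sides. A rough sketch, with sample lists standing in for real output (on a live system each list would come from something like `zfs list -H -t snapshot -o name -s creation pool/dataset`; the names here are made up):

```shell
#!/usr/bin/env bash
# Sample snapshot names in place of real zfs output:
src_snaps='auto-2024-05-01_00-00
auto-2024-05-02_00-00
auto-2024-05-03_00-00'
dst_snaps='auto-2024-05-01_00-00
auto-2024-05-02_00-00'

# With a sortable naming schema, lexical order is chronological order,
# so the last line common to both lists is the newest shared snapshot:
newest_common=$(comm -12 <(sort <<<"$src_snaps") <(sort <<<"$dst_snaps") | tail -n 1)
echo "$newest_common"   # auto-2024-05-02_00-00
```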
 

Patrick_3000
I can only imagine that the "hourly" replication task will fail if a common snapshot is missing.

The Replication Tasks, as designed for TrueNAS, are very narrow in scope. There's no "fallback to another replication task" built in.

So then you'll be relying solely on your "daily" replication task, which undermines the point of setting up hourly replications in the first place.
It seems to me that, at that point, I could edit the daily snapshot task, change the schedule to hourly, delete the old hourly snapshot task, and create a new daily snapshot task, unless I'm missing something.
 
Joined Oct 22, 2019 · Messages 3,641
zettarepl (the underlying iXsystems software) works by parsing the actual snapshot names. It doesn't rely on any separate database or table.

Your daily snapshots are already named with "-daily" or similar, which is why this won't work:
I could edit the daily snapshot task, change the schedule to hourly


If you think "I just won't use any unique identifier for my snapshot names", then you open up an even scarier can of worms, which I explain in another post.
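To illustrate the name-based bookkeeping: snapshot tasks use a strftime-style naming schema (the default is along the lines of `auto-%Y-%m-%d_%H-%M`), and zettarepl recovers each snapshot's timestamp by parsing the name against that schema. A quick sketch with GNU date (the schema and names are illustrative):

```shell
schema='auto-%Y-%m-%d_%H-%M'

# Generate a snapshot name for a fixed instant (the epoch, UTC, so the
# demo is repeatable):
name=$(date -u -d @0 +"$schema")
echo "$name"    # auto-1970-01-01_00-00

# Recover the timestamp from the name alone, the way a name-parsing
# tool has to (no database involved):
stamp=$(echo "$name" | sed 's/^auto-//; s/_/ /; s/\(..\)-\(..\)$/\1:\2/')
date -u -d "$stamp" +%s    # 0
```

Rename the schema (or the task's identifier) and the parse no longer matches the old snapshots, which is exactly why rescheduling a "-daily" task as hourly doesn't inherit anything.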
 

Patrick_3000
zettarepl (the underlying iXsystems software) works by parsing the actual snapshot names. It doesn't rely on any separate database or table.

Your daily snapshots are already named with "-daily" or similar, which is why this won't work:



If you think "I just won't use any unique identifier for my snapshot names", then you open up an even scarier can of worms, which I explain in another post.
It seems to me that I can give the hourly snapshots names starting with "Snap_A" and give the daily snapshots names starting with "Snap_B."

Then, if I lose all the hourly snapshots, I can edit the schedule of the daily snapshot task (the one whose names start with "Snap_B") and reschedule it to run hourly. Then I can delete the old hourly task, which will fail anyway, and create a new hourly snapshot task with names starting with "Snap_C."
 