Colliding Periodic Snapshots

Status
Not open for further replies.

dpearcefl

Contributor
Joined
Aug 4, 2015
Messages
145
Learning FreeNAS with two boxes. This morning I recreated a replication task (every ten minutes) to another FreeNAS box. To test a slow connection, I limited the bandwidth. I misread the screen and set a really, really slow replication. It started and took over 40 minutes. I changed the task so it was not limited by bandwidth.

Before I recognized the problem, the same task started again before the first one had finished. Before all was said and done, I had a mess which I cleaned up by destroying the offending snapshots.

This got me thinking: if I want to set up multiple snapshot schedules (keep dailies for 7 days, weeklies for 5 weeks, monthlies for 12 months, etc.) for the same volume/dataset but with differing expirations, will each schedule know to expire only its own snapshots and leave the others alone?
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
I asked that question, some time ago, and I think the answer is that the lifetime of the snapshot is specified in its name, and a process independent of any of the individual snapshot tasks will spot when it has expired and delete it. Also, the replication task will notice when one of the snapshots it has transferred is destroyed and delete it from the replication receiving system. I don't actually understand the mechanisms, but they do seem to work just like this in practice.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I asked that question, some time ago, and I think the answer is that the lifetime of the snapshot is specified in its name, and a process independent of any of the individual snapshot tasks will spot when it has expired and delete it. Also, the replication task will notice when one of the snapshots it has transferred is destroyed and delete it from the replication receiving system. I don't actually understand the mechanisms, but they do seem to work just like this in practice.
I'm fairly certain that it's the snapshot task itself that cleans up after itself by deleting the snapshots that have expired, on a per task basis.
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
I'm fairly certain that it's the snapshot task itself that cleans up after itself by deleting the snapshots that have expired, on a per task basis.
Sorry, I must have misunderstood. I did know that you have to have a snapshot task running to delete snapshots; I just didn't realise that every snapshot task with extant snapshots has to be running in order to delete its own snapshots.
 

cbarber

Dabbler
Joined
Sep 23, 2017
Messages
17
Found this thread looking for pretty much this same situation, just generally thinking about how to do multiple snapshot schedules on a single dataset to achieve the usual daily/weekly/monthly/whateverly retention schemes.

Intuitively, it seems like just creating those multiple snapshot tasks, each with its own period and expiration, is the way to go. Is this the recommended approach?

But "you have to have a snapshot task running to delete snapshots" means what? Clearly the snapshot task will not delete its own expired snaps if it is not running, but the sentence taken at face value might imply that I can't manually delete snapshots with that task running? That can't be right, can it?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Found this thread looking for pretty much this same situation, just generally thinking about how to do multiple snapshot schedules on a single dataset to achieve the usual daily/weekly/monthly/whateverly retention schemes.

Intuitively, it seems like just creating those multiple snapshot tasks, each with its own period and expiration, is the way to go. Is this the recommended approach?

A separate task for each frequency and lifetime creates no problems, and I agree it is the obvious way to do it. You are then left with a sensible number of snapshots; the system copes badly with thousands!
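As a rough check on what "a sensible number" means, the steady-state count per task is just the lifetime divided by the interval. This is only a back-of-the-envelope sketch using the daily/weekly/monthly example from the first post; the function name and the 30-day month are my own assumptions, not anything FreeNAS defines.

```python
from datetime import timedelta

def retained(interval, lifetime):
    """Approximate number of snapshots one task keeps once it reaches steady state."""
    # Dividing one timedelta by another with // gives a plain integer.
    return lifetime // interval

# Schedules from the original question; a month is approximated as 30 days.
schedules = {
    "daily": (timedelta(days=1), timedelta(days=7)),
    "weekly": (timedelta(weeks=1), timedelta(weeks=5)),
    "monthly": (timedelta(days=30), timedelta(days=360)),
}

counts = {name: retained(interval, lifetime)
          for name, (interval, lifetime) in schedules.items()}
total = sum(counts.values())

for name, count in counts.items():
    print(name, count)
print("total", total)
```

For comparison, a single ten-minute task kept for a year would work out to tens of thousands of snapshots, which is the situation the tiered tasks avoid.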

Interestingly, the deletion process will occur for all the stale snapshots of a given dataset even if only one of the tasks is running.

Only one replication task is needed for each dataset, even if a number of snapshot tasks are running.

(Your deletion question has already been answered)

With big datasets it makes sense to me to start with an *infrequent* snapshot task and replicate that, so that only small differential snapshots are being replicated when you start doing the more frequent ones. You can add or modify a snapshot task for a dataset without revising the replication task for the same dataset.
 

cbarber

Dabbler
Joined
Sep 23, 2017
Messages
17
A separate task for each frequency and lifetime creates no problems, and I agree it is the obvious way to do it. You are then left with a sensible number of snapshots; the system copes badly with thousands!

Interestingly, the deletion process will occur for all the stale snapshots of a given dataset even if only one of the tasks is running.

Only one replication task is needed for each dataset, even if a number of snapshot tasks are running.

(Your deletion question has already been answered)
Thanks for the sanity check! Interesting thing about the deletion process. So this must mean that the "staleness" is defined by the context of the task the snapshot was made by, but is knowable by all tasks, right? Because there could be several snapshots on the same dataset of the same age, but only one (or generally not all of them) is expired. Something about the snapshot itself must explicitly say when it expires? I think this was alluded to in an earlier comment but there seemed to be some uncertainty around this point.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
You are then left with a sensible number of snapshots; the system copes badly with thousands!
Actually, even listing thousands of snapshots isn't painful, and that's more or less the only operation on snapshots that grows with the number of snapshots.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Colliding snapshots are a thing though...

I haven't tracked it down fully yet, but I think if you have a recursive snapshot with the same expiry as a child snapshot, then it will generate the same snapshot name, and then you get a collision.
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Something about the snapshot itself must explicitly say when it expires?

I must admit I felt a little as though I had missed the obvious when someone pointed out that it is all encoded in the name - time of generation and lifetime - so the name does explicitly specify expiry.
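To make that concrete, here is a minimal sketch of how expiry could be read back out of a name. It assumes the periodic snapshots are named along the lines of auto-YYYYMMDD.HHMM-2w (creation time, then a count and a unit for the lifetime); treat that pattern, the unit letters, and the function names as my assumptions, and note that this is not the actual FreeNAS code.

```python
import re
from datetime import datetime, timedelta

# Assumed name pattern: auto-YYYYMMDD.HHMM-<count><unit>, e.g. auto-20171004.1200-2w
NAME_RE = re.compile(r"^auto-(\d{8}\.\d{4})-(\d+)([hdwmy])$")

# Unit lengths; months and years are simplified to 30 and 365 days.
UNITS = {
    "h": timedelta(hours=1),
    "d": timedelta(days=1),
    "w": timedelta(weeks=1),
    "m": timedelta(days=30),
    "y": timedelta(days=365),
}

def expiry_from_name(snapshot):
    """Return the expiry time of 'dataset@name', or None if the name does not match."""
    name = snapshot.split("@", 1)[-1]
    match = NAME_RE.match(name)
    if match is None:
        return None  # manual or foreign snapshot: never treated as expired here
    created = datetime.strptime(match.group(1), "%Y%m%d.%H%M")
    return created + int(match.group(2)) * UNITS[match.group(3)]

def expired_snapshots(snapshots, now):
    """Pick out stale snapshots purely from their names, whichever task made them."""
    stale = []
    for snapshot in snapshots:
        expiry = expiry_from_name(snapshot)
        if expiry is not None and expiry <= now:
            stale.append(snapshot)
    return stale

snaps = [
    "tank/data@auto-20171001.0000-1w",  # same age as the next one, shorter lifetime
    "tank/data@auto-20171001.0000-1y",
    "tank/data@manual-backup",
]
print(expired_snapshots(snaps, datetime(2017, 10, 9)))
```

Two snapshots of the same age are told apart only by the lifetime suffix, which would also explain why one task can clear out stale snapshots that another task created.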
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
Colliding snapshots are a thing though...

I haven't tracked it down fully yet, but I think if you have a recursive snapshot with the same expiry as a child snapshot, then it will generate the same snapshot name, and then you get a collision.

It seems clear that one could do this, but it is not obvious why you would want to. If there is a use case for recursive and non-recursive snapshots on the same dataset at all, then perhaps you could arrange either different lifetimes or different snapshot task times so that they cannot collide.
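A small sketch of why that collision would happen and why a different lifetime avoids it. It assumes, as above, that a snapshot name is built only from the firing time and the lifetime (auto-YYYYMMDD.HHMM-2w style); the task tuples and helper names are hypothetical, and Stux's diagnosis is itself unconfirmed.

```python
from datetime import datetime

def snapshot_names(dataset, children, recursive, lifetime, when):
    """Names one task would create at a given minute (assumed naming scheme)."""
    label = "auto-%s-%s" % (when.strftime("%Y%m%d.%H%M"), lifetime)
    targets = [dataset] + (list(children) if recursive else [])
    return ["%s@%s" % (target, label) for target in targets]

def collisions(tasks, when):
    """Return names that more than one task would try to create at the same time."""
    seen = set()
    duplicates = []
    for dataset, children, recursive, lifetime in tasks:
        for name in snapshot_names(dataset, children, recursive, lifetime, when):
            if name in seen:
                duplicates.append(name)
            seen.add(name)
    return duplicates

when = datetime(2017, 10, 4, 12, 0)

# Recursive task on the parent plus a separate task on the child, same lifetime.
same_lifetime = [
    ("tank", ["tank/child"], True, "2w"),
    ("tank/child", [], False, "2w"),
]
# Same pair, but the child task uses a different lifetime.
different_lifetime = [
    ("tank", ["tank/child"], True, "2w"),
    ("tank/child", [], False, "1w"),
]

print(collisions(same_lifetime, when))
print(collisions(different_lifetime, when))
```

Staggering the task start times would have the same effect as changing the lifetime, since the minute of creation is also part of the name.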
 