ZFS Rollup - A script for pruning snapshots, similar to Apple's TimeMachine

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
I have something that seems to do what I want (including working down the tree). I'll change that so it simply deletes one snapshot from each dataset on every iteration. However, family life is keeping me busy during these holidays...
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
Families tend to get in the way. And they have zero appreciation for things like, "Hold on, I need a couple hours to work on this ZFS snapshot script." You'd think they wanted to spend time together or something. Sheesh!

Happy holidays!
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
This is a first version of the script (attached: clearempty.zip). It needs some more polishing. Please note that I'm not a programmer, so run at your own risk.

It seems to work fine on my system, but I haven't tested it thoroughly. I welcome any comments, although I can't promise that I can actually improve anything... Let me know your thoughts.
 

Stephens

Patron
Joined
Jun 19, 2012
Messages
496
Please note that I'm not a programmer, so run at your own risk.

Don't worry. Even if you were a programmer, the advice would be the same.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
Thanks for posting this; here are my thoughts so far:

Thanks for the "type" attribute. I'm looking into how I can use this as it's cleaner than just looking for the presence of '@'.
What's the TODO regarding fixing the recursive option? I'll admit that checking for "dataset+'@'" isn't the prettiest, but is there something that's broken about it?
Is there more info on 'freenas:state' somewhere? When I search for this I'm only finding tickets and references to replication.

For not being a programmer you're using some nifty tricks like lambda and assigning the result of an inline loop.

And there are some design choices that I'd make differently (i.e. using the creation time instead of the time portion of the snapshot name), but I think those are all cosmetic.

So, thanks again for posting. Do you have any objection to me adding this to my rollup repository? (attributions would of course remain)
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
Thanks for your comments! I think the 'type' attribute is probably the most authoritative way of distinguishing snapshots from datasets. But I haven't found any good source on what is allowed in names of volumes/datasets, so maybe checking for the '@' is just as good.
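
For what it's worth, this is roughly the kind of check I had in mind. Just a sketch (the helper names and the subprocess call are illustrative, not the actual code from clearempty.py):
Code:
import subprocess

# Sketch only: query the 'type' property of a name.
# 'zfs get -H -o value type <name>' prints filesystem, volume or snapshot.
def zfs_type(name):
    out = subprocess.check_output(
        ["zfs", "get", "-H", "-o", "value", "type", name])
    return out.decode().strip()

def is_snapshot(name):
    return zfs_type(name) == "snapshot"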

I just noticed that 'freenas:state' was set sometimes and decided that I should not interfere with that. I assume it's some housekeeping done by one of the FreeNAS scripts, but I haven't found documentation on how all the parts of the FreeNAS system work together.

You are correct about the time. However, I didn't want to look into how the 'creation' time should be parsed. But I agree that would be the better way.
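
If I were to redo it, I'd probably just ask zfs for a parsable creation time instead of parsing the date string. Something like this sketch (untested; the function name is just illustrative):
Code:
import subprocess

# Sketch only: with '-p' zfs prints the creation property as seconds
# since the epoch, so no date-string parsing is needed.
def creation_time(snapshot):
    out = subprocess.check_output(
        ["zfs", "get", "-Hp", "-o", "value", "creation", snapshot])
    return int(out.strip())

# e.g. sort candidate snapshots oldest-first by real creation time:
# names.sort(key=creation_time)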

The TODO should be removed: I added that functionality (copied two lines from your script) and simply forgot to remove the TODO line.

While I'm not a programmer, I know how to program a computer. I do algorithm development (mostly Matlab) on a regular basis, but that's just for proof of principle. Compared to a 'real' programmer, I lack skills regarding documentation, VCS, error checking, coding style and many more of those things that matter in real-life applications.

Anyway, I have no objections to you adding the script to your repository at all - see my comment on VCS above!
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
I just checked out your changes to the clearempty.py script. I noticed you removed the freenas:state property check (commits f3644e7 and 5f4a99b). Did you find some information on what that property does or how it's used? I'd really hate to interfere with other scripts on my FreeNAS system.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
Ah, sorry, I actually meant to ask you more about that. It was partly intentional and partly oversight. Basically, I haven't seen it used anywhere, so it wasn't a priority for me (hence it not showing up in rollup.py) and my initial search for information didn't turn anything up so I forgot about looking further.

I could see it being used for replication (marking a snapshot as in use), or possibly to identify snapshots that have been browsed as a filesystem, but those are just guesses. I'll look into it further (thanks for the reminder) and I'll start a thread if I can't find anything on my own.

If it's an important field I'll absolutely put in code to handle that.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
So far I've found three states for 'freenas:state', all related to replication.

empty = already replicated
LATEST = latest to be replicated
NEW = not yet replicated

This ticket suggests that only LATEST should be retained. An already replicated snapshot doesn't need any special handling and a NEW state doesn't either if it's reached the expiration stage, but LATEST needs to be retained in order to properly get the changes to the remote side during the next replication.
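
In script terms the rule would be something like this sketch (not the exact code in the repository; the helper and its arguments are just illustrative):
Code:
# Sketch only: never treat a LATEST snapshot as deletable;
# '-' (already replicated) and expired NEW snapshots can be pruned.
def deletable(candidates, states):
    # candidates: snapshot names; states: name -> freenas:state value
    return [snap for snap in candidates
            if states.get(snap, "-") != "LATEST"]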

I could see issues if a snapshot is deleted while replication is occurring, but I won't be able to test this until later today.
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
I don't have any information on freenas:state - I just noticed that it was there and decided that my safest bet was to only touch snapshots that had an empty state.

I checked on my two systems. My main system has one LATEST snapshot, which indeed seems to be the newest one. It's only set on the top-level dataset, not on any of the other datasets (although they do have their own periodic snapshots set). I do not have any NEW snapshots, at least not at the moment. But that's likely because I now replicate continuously, although I plan to eventually only replicate during the evening. My replication system shows all empty states.

Let me know what you find out, so I can decide whether to run the new version of the script. I guess keeping LATEST around doesn't do much harm anyway. Worst case, it'll only be deleted on the next run, when it's no longer the latest.
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
I noticed that when using the cleanup.py version from your repository, some empty snapshots are not removed. It seems like the order of deleting snapshots from the list of candidates for removal is wrong: it now first gets rid of non-empty snapshots, manual snapshots, and snapshots with a set freenas:state, and then deletes the most recent snapshot. I think the order should be:
  1. Delete manual snapshots
  2. Delete the latest snapshot
  3. Delete non-empty snapshots and snapshots with a non-empty freenas:state.
After all, it's fine to remove the last empty snapshot if there is a newer non-empty one, or one with freenas:state set. Let me know what you think; I can make the required changes myself, but I just want to make sure my logic isn't flawed.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
I keep going back and forth between thinking you're right and thinking this isn't actually an issue. I had to break out some paper and diagram a few example cases before I was convinced that there actually needs to be a change. As you say, if the second most recent snapshot is empty and the most recent is not, the current code will not consider either. With your updated code, the second most recent snapshot would still be considered.

Thanks for noticing this.

How does this look? I haven't tested it.
Code:
        latest = None
        # snapshots[dataset] holds the deletion candidates for this dataset;
        # walk them newest-first by creation time
        for snapshot in sorted(snapshots[dataset], key=lambda snapshot: snapshots[dataset][snapshot]['creation'], reverse=True):
            # Non-automatic snapshots and anything that isn't of type
            # "snapshot" are never touched; drop them from the candidates
            if not snapshot.startswith("auto-") \
                or snapshots[dataset][snapshot]['type'] != "snapshot":
                del snapshots[dataset][snapshot]
                continue

            # The most recent automatic snapshot is always retained;
            # drop it from the candidates
            if not latest:
                latest = snapshot
                del snapshots[dataset][snapshot]
                continue

            # Non-empty snapshots, snapshots with a freenas:state set, and
            # snapshots already recorded in 'deleted' are not candidates either
            if snapshots[dataset][snapshot]['used'] != '0' \
                or snapshots[dataset][snapshot]['freenas:state'] != '-' \
                or snapshot in deleted[dataset].keys():
                del snapshots[dataset][snapshot]
                continue

        # Stop if no snapshots are in the list
        if not snapshots[dataset]:
            del snapshots[dataset]
            continue


I also kept the loop together by sorting the hash by creation date and then keeping track of 'latest'. This also means that before, we were looping over N, checking the list size, then sorting N-D, and checking the size. Correcting that algorithm would have meant looping over N, checking the list size, sorting N-D1, checking the list size, looping over N-D1-1, and checking the list size again. Now it sorts N, loops over N, and checks the list size. I'll leave the Big O comparison as an exercise for the reader, but while the incorrect algorithm was probably technically faster, the correct method shouldn't be significantly slower.

Pfft, how many snapshots do people have anyway? In my experience it's deleting snapshots that actually takes time, and once the scripts are running periodically the delay isn't noticeable because they're only deleting one or two snapshots anyway. Heck, they're probably run by cron, so it's not noticeable at all. Even if they're only run once a day or week, odds are they're going to be run during downtime when they won't impact anyone anyway. The takeaway here is that simple bug reports lead to far too much thought.

Unfortunately, I won't be able to commit a fix for at least 12 hours. I think I'll be able to review a pull request much sooner though, so if you have your own patch, feel free.
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
The time that the script needs to run probably isn't that important if you run it often. Currently, I don't clean up empty snapshots automatically, so if I don't do it for some time, I have a few thousand snapshots. Running the script then takes 10-20 minutes (very rough estimate off the top of my head). After that, I have 100-200 snapshots left. I plan on letting it run in the evening, after office hours, as a cron job.

I think the code is fine, but I haven't tested it either. I can't try it before next Friday, but I'd be more than happy to run it then.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
I pushed some changes to both scripts to keep the LATEST and most recent NEW and to alter the consideration behavior that you reported. I'd appreciate you looking at it and testing.
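
Roughly, the new rule amounts to this (a simplified sketch of the intent, not the committed code; names are illustrative):
Code:
# Sketch only: protect the LATEST snapshot and the most recent NEW
# snapshot; everything else stays in the deletion-candidate list.
def protected(names, states, creation):
    # names: snapshot names; states: name -> freenas:state value;
    # creation: name -> creation time (epoch seconds)
    keep = set(s for s in names if states.get(s, "-") == "LATEST")
    new = [s for s in names if states.get(s, "-") == "NEW"]
    if new:
        keep.add(max(new, key=lambda s: creation[s]))
    return keep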

Thanks again for reporting the issue.
 

kavermeer

Explorer
Joined
Oct 10, 2012
Messages
59
I'm not entirely sure about the handling of the NEW snapshots, because I don't know how this is handled by the freenas system. Does it actually happen that the latest snapshot is not NEW, but an earlier one is?

Anyway, the cleanup script seems to work fine (it removed 1000 snapshots in 5 minutes). I'm not using the rollup script at the moment, so I haven't tested that one.
 

noprobs

Explorer
Joined
Aug 12, 2012
Messages
53
From my analysis on a system with replication: a snapshot is created on the primary server with a state of NEW and then changes to LATEST after replication. When the next snapshot is taken, that one becomes the LATEST and all earlier ones have a status of '-'. Note that if replication is not successful, snapshots remain in a state of NEW.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
I'm not sure how this works either, to be honest. My interpretation was that you could end up with multiple snapshots marked 'NEW' or '-', but only one would ever be 'LATEST'. That's the assumption the rollup and clearempty scripts make when handling these. If someone who is actually using replication can correct any of this, I'll incorporate any relevant changes.

In other words, before replication runs for the first time you could have:
tank@auto-20130401-0100-4m NEW
tank@auto-20130401-0200-4m NEW
tank@auto-20130401-0300-4m NEW
tank@auto-20130401-0400-4m NEW

After replication you'd have:
tank@auto-20130401-0100-4m -
tank@auto-20130401-0200-4m -
tank@auto-20130401-0300-4m -
tank@auto-20130401-0400-4m LATEST

If you then make several snapshots without replication being able to run (network is down), you could end up with:
tank@auto-20130401-0100-4m -
tank@auto-20130401-0200-4m -
tank@auto-20130401-0300-4m -
tank@auto-20130401-0400-4m LATEST
tank@auto-20130401-0500-4m NEW
tank@auto-20130401-0600-4m NEW
tank@auto-20130401-0700-4m NEW

If nothing has changed since 0200 and you run the clearempty script, you would end up with:
tank@auto-20130401-0100-4m -
tank@auto-20130401-0200-4m -
tank@auto-20130401-0400-4m LATEST
tank@auto-20130401-0700-4m NEW

Then finally, the network comes back up and replication completes:
tank@auto-20130401-0100-4m -
tank@auto-20130401-0200-4m -
tank@auto-20130401-0400-4m -
tank@auto-20130401-0700-4m LATEST

Running clearempty at this point would result in:
tank@auto-20130401-0100-4m -
tank@auto-20130401-0200-4m -
tank@auto-20130401-0700-4m LATEST

Thanks for any corrections to my assumptions.
 

noprobs

Explorer
Joined
Aug 12, 2012
Messages
53
fracai,

Your understanding is almost correct. If replication fails, then freenas:state will not move off NEW, so you will only have '-' and 'NEW'; see the example below.

SlowVol/Installs@auto-20130404.0131-1w -
SlowVol/Installs@auto-20130406.0131-1w -
SlowVol/Installs@auto-20130406.0731-1w -
SlowVol/Installs@auto-20130407.1331-1w -
SlowVol/Installs@auto-20130408.1331-1w -
SlowVol/Installs@auto-20130408.1931-2d NEW
SlowVol/Installs@auto-20130409.0131-2d NEW
SlowVol/Installs@auto-20130409.0731-2d NEW
SlowVol/Installs@auto-20130409.1331-2d NEW
SlowVol/Installs@auto-20130409.1931-2d NEW
SlowVol/Installs@auto-20130410.0131-2d NEW
SlowVol/Installs@auto-20130410.0731-1w NEW
SlowVol/Installs@auto-20130410.0806-2d NEW
SlowVol/Installs@auto-20130410.1406-2d NEW
SlowVol/Installs@auto-20130410.2006-2d NEW

Note: the change from 2d to 1w is me tweaking my retention script.

On the replication server the status is always '-', though I have raised an enhancement ticket to set 'LATEST'; I am testing a tweak to autorepl.py.


FYI, I run a modified autosnap.py script to clean up expired snapshots on the replication server.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
Right, but wouldn't the source machine still have a 'LATEST' on the last snapshot to be replicated?
 

noprobs

Explorer
Joined
Aug 12, 2012
Messages
53
I would have thought so, but it appears not.

Autosnap.py just sets the freenas:state value to 'NEW' if there is an associated replication. If replication is failing, then it appears autorepl.py exits before the value of freenas:state is set to LATEST.

This is my status on primary server with replication failing (as pull server is powered off)

SlowVol/Installs@auto-20130406.0131-1w -
SlowVol/Installs@auto-20130406.0731-1w -
SlowVol/Installs@auto-20130407.1331-1w -
SlowVol/Installs@auto-20130408.1331-1w -
SlowVol/Installs@auto-20130408.1931-2d NEW
SlowVol/Installs@auto-20130409.0131-2d NEW
SlowVol/Installs@auto-20130409.0731-2d NEW
SlowVol/Installs@auto-20130409.1331-2d NEW
SlowVol/Installs@auto-20130409.1931-2d NEW
SlowVol/Installs@auto-20130410.0131-2d NEW
SlowVol/Installs@auto-20130410.0731-1w NEW
SlowVol/Installs@auto-20130410.0806-2d NEW
SlowVol/Installs@auto-20130410.1406-2d NEW
SlowVol/Installs@auto-20130410.2006-2d NEW
SlowVol/Installs@auto-20130411.0206-1w NEW
SlowVol/Installs@auto-20130411.0806-2d NEW
SlowVol/Installs@auto-20130411.1406-2d NEW
SlowVol/Installs@auto-20130411.2006-2d NEW

And this is status after replication has caught up


SlowVol/Installs@auto-20130406.0131-1w -
SlowVol/Installs@auto-20130406.0731-1w -
SlowVol/Installs@auto-20130407.1331-1w -
SlowVol/Installs@auto-20130408.1331-1w -
SlowVol/Installs@auto-20130410.0131-2d -
SlowVol/Installs@auto-20130410.0731-1w -
SlowVol/Installs@auto-20130410.0806-2d -
SlowVol/Installs@auto-20130410.1406-2d -
SlowVol/Installs@auto-20130410.2006-2d -
SlowVol/Installs@auto-20130411.0206-1w -
SlowVol/Installs@auto-20130411.0806-2d -
SlowVol/Installs@auto-20130411.1406-2d -
SlowVol/Installs@auto-20130411.2006-2d LATEST


Jon
 