How to get rid of 12000 snapshots?

indivision

Guru
Joined
Jan 4, 2013
Messages
806
I received a notification saying that I have over the recommended number of snapshots (12000+!!!).

I'm not quite sure how or why I would have this many as I don't have any snapshot tasks running at all.

Investigating the Storage/Snapshots section, it looks like they are all related to ix-applications.

The GUI allows me to see 100 snapshots at a time and bulk delete 100 at a time. But even when I do this, it fails to delete half of the snapshots because they have dependent clones. It would take a very long time to go through 12000 and delete this way. So, I am looking for a better way.

How can I safely delete all (or every one that I can) of these snapshots?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
In a root shell, run:
zfs list -t snap | awk '/<pattern>/ { printf "zfs destroy %s\n", $1 }'

Examine the output and adjust <pattern> until you see the destroy statements you want. Then append to the command:
zfs list -t snap | awk '/<pattern>/ { printf "zfs destroy %s\n", $1 }' | sh
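
For example, assuming every snapshot you want gone has ix-applications somewhere in its name (verify against the listing first), the preview step might look like:
zfs list -t snap | awk '/ix-applications/ { printf "zfs destroy %s\n", $1 }'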

HTH,
Patrick
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Thank you!

What would the pattern need to be if I wanted to delete them all except for the very minimum needed to continue running the latest versions of the applications?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
No idea, I do not use SCALE all that much. You can do a zfs list -t snap and have a close look to see if you spot something. If not, someone else will have to step in now. "<pattern>" is a regular expression, if needed, btw.
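
A sketch of a command to inspect them sorted by age, assuming a pool named tank and that the snapshots live under ix-applications (adjust the path to your system):
zfs list -t snap -o name,creation,used -s creation -r tank/ix-applications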
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
See here:
Basically, it's a built-in bug, and it is very easy to damage one's system by trying to forcefully clean up these automatic snapshots. I've solved the issue by decommissioning my SCALE testbed.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
If you still want to destroy a lot of snapshots after reading the advice above, you can use the zfs destroy command with the oldest and newest snapshots you want gone, separated by %, and all snapshots in between will also be destroyed... like this:

zfs destroy pool/dataset@oldest-snapshot%newest-snapshot
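
If in doubt, zfs destroy accepts -n (dry run) and -v (verbose), so you can preview what the range would destroy before committing:
zfs destroy -nv pool/dataset@oldest-snapshot%newest-snapshot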
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Well. The good news is that I was able to clear the snapshots down to a sane number. And it has made a huge difference in performance: not just in the apps, but in overall server performance, boot time, page changes, etc.

The bad news is that this broke all apps. Not really the end of the world for me to re-install those. I had set up external data for most. But, it's not even letting me install new apps currently. Any ideas on how to restore that functionality? Maybe moving the app pool?

There definitely needs to be something in place that mitigates this build-up over time. Or, at the very least, something more up-front visually (a snapshot meter at the top of the apps section?) and a way to prune/address it via the GUI. Otherwise, it effectively puts a relatively short lifespan on an install.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
I have 6 or so apps.
The number of snapshots seems to have settled at about 526 over the last few days, and I turned off anything that was (I think) trying to control them a week ago.
It's something I am keeping an eye on.

The latest SCALE seems to have improved things.
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
I have 6 or so apps.
The number of snapshots seems to have settled at about 526 over the last few days, and I turned off anything that was (I think) trying to control them a week ago.
It's something I am keeping an eye on.

The latest SCALE seems to have improved things.

That seems like a reasonable number if it somehow stays there. But it sounds like it is currently designed to just keep adding new snaps forever (on every app update?)... If someone proposed doing that as a regular snapshot task schedule, I think most people would recommend quantity limits, etc.

I had been feeling like something was wrong with my system for a while due to performance. At least I finally found the culprit!
 

ian351c

Patron
Joined
Oct 20, 2011
Messages
219
@indivision I am in a similar situation to you, having done pretty much the same things (set up backups of ix-applications, ended up with 1000s of snapshots, suffered a severe performance penalty, disabled the backups, deleted the 1000s of snapshots, and then Apps stopped working). Every reboot I see this error:
Code:
File "/usr/lib/python3/dist-packages/middlewared/plugins/service_/services/kubernetes.py", line 28, in mount_kubelet_dataset
subprocess.CalledProcessError: Command '('mount', '-t', 'zfs', 'apps/ix-applications/k3s/kubelet', '/var/lib/kubelet')' returned non-zero exit status
future: <Task finished name='Task-3866' coro=<Middleware.call() done, defined at /usr/lib/python3/dist-packages/middlewared/main.py:1312> exception=CalledProcessError(1, ('mount', '-t', 'zfs', 'apps/ix-applications/k3s/kubelet', '/var/lib/kubelet'), b'', b"filesystem 'apps/ix-applications/k3s/kubelet' cannot be mounted using 'mount'.\nUse 'zfs set mountpoint=legacy' or 'zfs mount apps/ix-applications/k3s/kubelet'.\nSee zfs(8) for more information.\n")>
    await self.mount_kubelet_dataset()
  File "/usr/lib/python3/dist-packages/middlewared/plugins/service_/services/kubernetes.py", line 28, in mount_kubelet_dataset
subprocess.CalledProcessError: Command '('mount', '-t', 'zfs', 'apps/ix-applications/k3s/kubelet', '/var/lib/kubelet')' returned non-zero exit status


And after the boot completes, the Apps screen in the GUI says something like "Apps aren't running". I use the following incantation at the shell to make Apps work:
Code:
# restore the legacy mountpoint on the kubelet dataset and mount it by hand
zfs set mountpoint=legacy apps/ix-applications/k3s/kubelet
mount -t zfs apps/ix-applications/k3s/kubelet /var/lib/kubelet
# set the legacy mountpoint on all of the app PVC datasets as well
zfs set mountpoint=legacy $(zfs list -o name | grep ix-applications | grep pvc)
# restart kubernetes via the middleware so it picks up the mounts
midclt call service.stop kubernetes
midclt call service.start kubernetes


I have a case open with iX to resolve this. Feel free to follow the case if it looks like you have the same issue.
 

truecharts

Guru
Joined
Aug 19, 2021
Messages
788
Some notes on this issue:
- Never directly make snapshots of the ix-applications dataset outside of the SCALE Apps Backup API (for which we have guides available)
- Never EVER delete any snapshot of the ix-applications dataset
- Run daily docker prune commands to get rid of old docker containers and their snapshots.

Combined, these three steps should keep the number of snapshots manageable, without the need to overzealously decommission things.
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
@indivision I am in a similar situation to you, having done pretty much the same things (set up backups of ix-applications, ended up with 1000s of snapshots, suffered a severe performance penalty, disabled the backups, deleted the 1000s of snapshots, and then Apps stopped working).
I had never made backups of ix-applications, so I hadn't even thought to look at the snapshots section (I don't have any regular snapshots scheduled).

It was my fault for breaking the apps by deleting snapshots. I don't have a lot of experience working with snapshots, so I had assumed that deleting them was more like a prune: I thought the data within older, deleted snapshots would be migrated into the newer ones before destruction.

The GUI doesn't provide clarification or a means to clean up in a safe way. Even with 100 snapshots per screen, when you have 10,000+ in there, the snapshot section of the GUI loses its usefulness.
I have a case open with iX to resolve this. Feel free to follow the case if it looks like you have the same issue.
Can you please share the link to the case? I would like to follow.
Some notes on this issue:
- Never directly make snapshots of the ix-applications dataset outside of the SCALE Apps Backup API (for which we have guides available)
- Never EVER delete any snapshot of the ix-applications dataset
- Run daily docker prune commands to get rid of old docker containers and their snapshots.

Combined, these three steps should keep the number of snapshots manageable, without the need to overzealously decommission things.
Thank you. Lesson learned.

I think it would make a lot of sense to either automatically run the prune commands quietly by default or to add an interface for applying prune or scheduling it in the GUI.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Run daily docker prune commands to get rid of old docker containers and their snapshots.
And how would one do that? Just docker prune as a cron job?

Thanks,
Patrick
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
I guess
docker system prune --all --force --volumes
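
Note that --volumes also removes any volumes not attached to a container, which may not be what you want if an app keeps data in one. A sketch of a nightly cron entry (the schedule and docker path are just examples; adjust for your system):
Code:
# prune unused containers, images and volumes every night at 03:00
0 3 * * * /usr/bin/docker system prune --all --force --volumes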
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
Some notes on this issue:
- Never directly make snapshots of the ix-applications dataset outside of the SCALE Apps Backup API (for which we have guides available)
- Never EVER delete any snapshot of the ix-applications dataset
- Run daily docker prune commands to get rid of old docker containers and their snapshots.

Combined, these three steps should keep the number of snapshots manageable, without the need to overzealously decommission things.
Hmm, and what happens if I want to replicate my ix-applications dataset to another TN for backup purposes? That involves a snapshot, does it not?
I accept that this is of limited use (at the moment), as I also use external storage a lot, but I can snapshot that as well.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Hmm, and what happens if I want to replicate my ix-applications dataset to another TN for backup purposes? That involves a snapshot, does it not?
I was thinking the same. Whenever I put SCALE into real production use, how would I do backup? I snapshot all my VM zvols, all my shares and all my iocage jails and replicate that to a second system at a different location.
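
A sketch of the hand-rolled equivalent, with hypothetical pool and host names (the built-in periodic snapshot and replication tasks do the same thing for you):
Code:
# recursive snapshot of the datasets to protect
zfs snapshot -r tank/vms@backup-2022-06-01
# incremental replication stream since the previous backup snapshot,
# received on the remote box
zfs send -R -i tank/vms@backup-2022-05-01 tank/vms@backup-2022-06-01 | \
    ssh backuphost zfs receive -F backup/vms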
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134

ian351c

Patron
Joined
Oct 20, 2011
Messages
219
Can you please share the link to the case? I would like to follow.
Oops: NAS-115821
It looks like there can be an issue where a /mnt/mnt folder is created and that breaks mounting the kubelet dataset properly, which in turn breaks starting Apps. Waqar from iX removed that folder and everything is happy upon reboot.
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
I was thinking the same. Whenever I put SCALE into real production use, how would I do backup? I snapshot all my VM zvols, all my shares and all my iocage jails and replicate that to a second system at a different location.

@Patrick M. Hausen

If you use the default install method of TrueCharts then, as far as I can tell, you can't back it up.
If, however, you do what TrueCharts says you shouldn't (apparently it may break "revert") and move the config files and data files (if any) to a hostpath elsewhere, then you can back up the config of the container. No need to back up ix-applications, as that is all repeatable (sort of - see below)

Caveat: There appears to be no way to back up the container config (as opposed to the app inside the container), so recreating the container may be a bit of a ballache working out what you have done / are doing. I suppose you could use a load of screenshots, but if you have many apps then that will be less than entirely practical (as in not)

Yeah, I know - thread resurrection - but at least it's this year.

I have moved most (not quite all - yet) of my containers away from TrueCharts onto an Ubuntu VM running under SCALE, not because of the quality of the charts but for exactly the reason above. There is no repeatability with the config. I can back up the app configs / data files. But if I had to rebuild the NAS, or the ix-applications dataset got corrupted and needed resetting, then I cannot restore / repeat the container configs. It's actually this same issue that makes the TrueCharts support process poor and, I suspect, difficult
 

truecharts

Guru
Joined
Aug 19, 2021
Messages
788
@Patrick M. Hausen

If you use the default install method of TrueCharts then, as far as I can tell, you can't back it up.
If, however, you do what TrueCharts says you shouldn't (apparently it may break "revert") and move the config files and data files (if any) to a hostpath elsewhere, then you can back up the config of the container. No need to back up ix-applications, as that is all repeatable (sort of - see below)

Caveat: There appears to be no way to back up the container config (as opposed to the app inside the container), so recreating the container may be a bit of a ballache working out what you have done / are doing. I suppose you could use a load of screenshots, but if you have many apps then that will be less than entirely practical (as in not)

Yeah, I know - thread resurrection - but at least it's this year.

I have moved most (not quite all - yet) of my containers away from TrueCharts onto an Ubuntu VM running under SCALE, not because of the quality of the charts but for exactly the reason above. There is no repeatability with the config. I can back up the app configs / data files. But if I had to rebuild the NAS, or the ix-applications dataset got corrupted and needed resetting, then I cannot restore / repeat the container configs. It's actually this same issue that makes the TrueCharts support process poor

No, our advice is to always use the default PVC config option and use TrueTool for backups as described on the website.
 