SOLVED [SCALE] Mass cleanup/removal of ix-applications snapshots solution required

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
As discussed, there are other container engines, there's also a method of using overlayfs on zfs. For Bluefin, we expect to make one of these work.
Noted, and appreciated. But this is likely to be too late for me.

As reported in another thread, I have actually damaged my test installation while attempting to trim snapshots.
I then took the opportunity to destroy the pool and re-do the setup. To avoid the overhead of loading and updating the whole TrueCharts catalog for just two apps, namely Pi-hole and TrueCommand, I tried using "plain" docker images. These two happen to be covered in the iXsystems documentation, so it should be straightforward, right?

NO!

Set up TrueCommand as described: All systems would be lost when stopping and restarting the container because there was no persistent storage. So this part of the official documentation is plain wrong:
You can link additional SCALE storage to this application by adding Host Path Volumes entries.
<IMAGE>
This is not typically required for TrueCommand.
This IS ACTUALLY REQUIRED for any useful installation…

Then Pi-hole from a docker image… There, setting up persistent storage is described, but the datasets have to be created in advance, or else one has to go back and redo the previous steps; it's not convenient to just follow the documented steps as one goes. Also, critical settings for port forwarding and storage can only be found by squinting at screenshots; they are not in the main text. And since there are quite a few of those settings, the process involves a lot of clicking "Add" buttons and moving up and down a column of settings that gets WAY TOO LONG for comfort and easy overview of what one does.
All that to end up with an instance which uses port 9053 instead of the standard 53… meaning that it is essentially useless—or involves a lot of additional work to get the rest of the network looking up its DNS on port 9053. Known issue. Kudos to @ornias for getting port 53 to work as intended in TrueCharts!
But I'm afraid that kind of sorcery is above my abilities, so the next step of my experiments with containers will be to set up a small Linux server with Portainer as a GUI and the two containers above. Upon success, I look forward to terminally decommissioning the TrueNAS SCALE test platform.

Positive note: The 'docker prune' cron job appears to have somewhat kept snapshots in check during the process (about 40-50 of them, for just two containers).
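For readers who want to see what such a cron job might look like, here is a minimal sketch. The exact command used in the thread is not shown in this section, so the two prune calls below are an assumption; the `run` helper and the `DRY_RUN` variable are hypothetical names added for illustration. By default the script only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch of a periodic Docker cleanup (assumed commands, not the thread's exact job).
# With DRY_RUN=1 (the default here) commands are only printed, nothing is removed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

docker_cleanup() {
    # Remove stopped containers, then images not used by any container.
    run docker container prune --force
    run docker image prune --all --force
}

docker_cleanup
```

Setting `DRY_RUN=0` makes the script actually prune, so it is worth reading the printed commands first. Note that removing unused images also removes images belonging to apps that are currently stopped.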
 

truecharts

Guru
Joined
Aug 19, 2021
Messages
788
To avoid the overhead of loading and updating the whole TrueCharts catalog for just two apps, namely Pi-hole and TrueCommand, I tried using "plain" docker images.

There seems to be some misunderstanding here...
Most old overhead issues (slowdowns in the UI) have been solved in the latest releases.

But that's all the overhead there is; this topic is mostly about snapshot overhead due to docker-compose.
Our catalog does not cause any significant snapshot overhead, CPU load or ram consumption just by being loaded and the disk usage won't be significant either.
 

mkarwin

Dabbler
Joined
Jun 10, 2021
Messages
40
I don't disagree.... but we have resource and time constraints. Off-the-shelf solutions are preferred.. either existing code or developed and tested contributions. It's a good discussion topic for any developers on the discord channel.
I understand and agree... there's only so much one can do within a certain period of time without writing everything from scratch. And in the existing environment, it's not feasible to write all the code from scratch when there are projects already there, in use and tested by the people responsible for the respective tech/solutions. With constrained resources, all one can do is put these building blocks together, make them click together and produce a product/solution that is greater than the sum of its parts. The benefit is that you have this community to help locate any bugs/issues encountered now, to point out user misconceptions/expectations about system behaviour, or to propose improvements that might make the solution better in the long run. That's also why you have JIRA: it's easier to arrange limited resources to deal with the crucial things first.

Still, while it is currently an official release, it's not exactly an enterprise-ready stable build, and AFAIK you're planning an enterprise release later in the year... Once that is in use, and the enterprise environment starts to use the docker and kubernetes features this platform offers in full, you might end up with more issues reported faster than anticipated. Thus issues such as these would need to be polished out/fixed before the official enterprise green light. Otherwise, god forbid, you might end up losing paying iX customers. So I believe some solution will come later, once the resources are available to work on this. I only wish it came sooner rather than later ;)
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Noted, and appreciated. But this is likely to be too late for me.

As reported in another thread, I have actually damaged my test installation while attempting to trim snapshots.
I then took the opportunity to destroy the pool and re-do the setup. To avoid the overhead of loading and updating the whole TrueCharts catalog for just two apps, namely Pi-hole and TrueCommand, I tried using "plain" docker images. These two happen to be covered in the iXsystems documentation, so it should be straightforward, right?

NO!

Set up TrueCommand as described: All systems would be lost when stopping and restarting the container because there was no persistent storage. So this part of the official documentation is plain wrong:

This IS ACTUALLY REQUIRED for any useful installation…

Then Pi-hole from a docker image… There, setting up persistent storage is described, but the datasets have to be created in advance, or else one has to go back and redo the previous steps; it's not convenient to just follow the documented steps as one goes. Also, critical settings for port forwarding and storage can only be found by squinting at screenshots; they are not in the main text. And since there are quite a few of those settings, the process involves a lot of clicking "Add" buttons and moving up and down a column of settings that gets WAY TOO LONG for comfort and easy overview of what one does.
All that to end up with an instance which uses port 9053 instead of the standard 53… meaning that it is essentially useless—or involves a lot of additional work to get the rest of the network looking up its DNS on port 9053. Known issue. Kudos to @ornias for getting port 53 to work as intended in TrueCharts!
But I'm afraid that kind of sorcery is above my abilities, so the next step of my experiments with containers will be to set up a small Linux server with Portainer as a GUI and the two containers above. Upon success, I look forward to terminally decommissioning the TrueNAS SCALE test platform.

Positive note: The 'docker prune' cron job appears to have somewhat kept snapshots in check during the process (about 40-50 of them, for just two containers).

Thanks for the feedback and sorry for your frustration.

We have deliberately not recommended TrueCommand as a TrueNAS app... just because it's too complex to have one system managing itself. When things go wrong, all the tools are broken and compound the problems.

Where there is a TrueCharts app, we do recommend that version over Docker. There's some catalog set-up time, but it is generally tiny compared with the system integration issues. It would be useful to see what you think of that process.

Glad the docker prune makes things manageable. The new TrueCharts docker compose app can also give you a Portainer UI for running docker images.
 

Etorix

We have deliberately not recommended TrueCommand as a TrueNAS app... just because it's too complex to have one system managing itself. When things go wrong, all the tools are broken and compound the problems.
The TrueNAS Scale platform was set up for the sole purpose of running containers, and learning a bit about that. All my storage is on Core and I do NOT intend to move to Scale. Besides, even if I considered moving, the snapshot clutter issue would make it inconvenient to have a "converged" setup on anything but a secondary or tertiary backup server: on a main server, I'd expect to actually come and use the snapshots, so I would not want my storage snapshots lost in a forest of container snapshots.

Where there is a TrueCharts app, we do recommend that version over Docker. There's some catalog set-up time, but it is generally tiny compared with the system integration issues. It would be useful to see what you think of that process.
I actually had TrueCharts apps before, and did not like the lengthy download and update time, nor the unsorted catalog. (It feels like a wall of mysterious names and obscure icons; no description, so even if there were something which might be of use to me, I would not find it.)
When I damaged my installation by pruning it too hard, I took the opportunity to learn a bit more and directly use docker containers. I found the vertically-oriented interface rather cumbersome and clicky: One has to click and go through each section in order, even if there's nothing to do. It is not possible to save a half-finished template and come back. When there are custom settings (port forwarding, storage, environment,…) the interface takes a lot of screen real estate and it quickly becomes impossible to see all the settings without scrolling, which is a pain (what do the developers use? 80" panoramic screens in vertical mode???)
After a few tries I managed to have a usable, persistent, TrueCommand (error in documentation: post #21 above) but no usable Pi-Hole because port 53 has to be remapped above 9000, where clients do NOT expect to find the DNS service.
The process was educational and let me eventually appreciate the work done by TrueCharts, with direct buttons to the container's interface—and Pi-Hole on port 53, yay! (But it's still terribly heavy compared to running Pi-Hole natively on a Raspberry Pi…)

So I went a little further down the rabbit hole and set up a smaller server with Ubuntu 20.04 LTS (X10SBA-L: a puny Celeron J1900 with a grand total of 2 available SATA ports), running Portainer-CE, Pi-Hole and TrueCommand. Portainer's interface, with its extensive array of options, will take some learning but is already much more pleasant than that of SCALE: At least a detailed and meaningful display of all settings fits in the screen before hitting the Start button! And, with some more Docker tinkering (macvlan), I managed to give Pi-Hole its port 53.
The OS and containers all fit on one old SSD, which is a better use of resources than having (at least) one drive for boot and (at least) one drive for apps with SCALE. (Remember, the whole exercise was initially to run a grand total of two containers, now bumped by 50% with a third container to have a GUI for managing the other two…)
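The macvlan approach mentioned above can be sketched roughly as follows. The post does not give the actual settings, so every value here (interface name, subnet, gateway, IP address, network name) is a placeholder, and `macvlan_commands` is a hypothetical helper that only prints the two Docker commands rather than running them. A real Pi-hole container would also need its volumes and environment variables.

```shell
#!/usr/bin/env bash
# All values below are placeholders for illustration, not taken from the post.
PARENT_IF="eth0"            # host NIC attached to the LAN
SUBNET="192.168.1.0/24"
GATEWAY="192.168.1.1"
PIHOLE_IP="192.168.1.53"    # a free address outside the DHCP range
NET_NAME="lan_macvlan"

macvlan_commands() {
    # Print (do not execute) the network creation and container start commands.
    echo "docker network create -d macvlan --subnet=$SUBNET --gateway=$GATEWAY -o parent=$PARENT_IF $NET_NAME"
    echo "docker run -d --name pihole --network $NET_NAME --ip $PIHOLE_IP pihole/pihole"
}

macvlan_commands
```

Because the container gets its own address on the LAN, it can listen on port 53 without competing with anything on the host. One known trade-off of macvlan is that the host itself cannot reach the container directly by default.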

In retrospect, it would have been easier to keep Pi-Hole as a native app on a Raspberry Pi attached to the Fritz!Box modem, and forget about TrueCommand—or run it on demand from the desktop Docker application on my Mac. (Pity there's no ARM version of TrueCommand, but I understand the target market is for managing BIG server fleets, not home setups which can admittedly do with managing each NAS individually.)
But most of the tinkering and educational parts would have been lost… :wink:

My Scale testbed was shut down this evening and laid to rest, waiting for the dreaded Repurposing Screwdriver… :eek:
If the Ubuntu docker server proves to be stable, it will be my first Linux computer. I had been a Linux-avoider all the years from teenage programming (DOS) through college (Solaris, NeXT), home (MacOS, BeOS) and work (Windows, OS/2). And I knew nothing about docker before this exercise ("Kubernetes", "Helm" and "Gluster" are still mostly gibberish to me…).

I still do not understand what is the target market for the container/virtualisation part of TrueNAS Scale:
For complete novices like yours truly, the catalogues are not helpful because there's no way to look for apps by function and discover. One has to know what one wants to run to find it.
But, from reading threads in this forum, it seems that those who actually know about containers want either (i) plain Docker and no Kubernetes or (ii) an even more sophisticated Kubernetes environment than what TrueNAS Scale provides.
That would leave an uncertain middle ground of users who already know what apps they need but want the apps installed at the click of a button (Synology-style) and not set up anything themselves. Knowledgeable, but not too much.

And, obviously, I have not been convinced by Angelfish. Hope this feedback helps for the next versions…
 

morganL

The TrueNAS Scale platform was set up for the sole purpose of running containers, and learning a bit about that. All my storage is on Core and I do NOT intend to move to Scale. Besides, even if I considered moving, the snapshot clutter issue would make it inconvenient to have a "converged" setup on anything but a secondary or tertiary backup server: on a main server, I'd expect to actually come and use the snapshots, so I would not want my storage snapshots lost in a forest of container snapshots.


I still do not understand what is the target market for the container/virtualisation part of TrueNAS Scale:
For complete novices like yours truly, the catalogues are not helpful because there's no way to look for apps by function and discover. One has to know what one wants to run to find it.
But, from reading threads in this forum, it seems that those who actually know about containers want either (i) plain Docker and no Kubernetes or (ii) an even more sophisticated Kubernetes environment than what TrueNAS Scale provides.
That would leave an uncertain middle ground of users who already know what apps they need but want the apps installed at the click of a button (Synology-style) and not set up anything themselves. Knowledgeable, but not too much.

Agreed that the target market for SCALE is not running only-containers... there are plenty of platforms and software that do that. By definition, a high function storage platform will add some unnecessary complexity. On a single server, Docker and Portainer do a great job.

The target market is shared storage with a scale-out option + containers (and VMs). There are no good open source projects that do all these well.

You have two servers to work with.... many home users will want the one server (and backup options)
In the business world, many will want a small cluster that both serves data and runs some critical apps.
In the enterprise world, they want very large clusters.... some with apps and sometimes without apps. Some Apps are actually storage functions (e.g. backup, or object store).

Angelfish is the 1st version of a decade-plus release process. At iX we practice Kaizen... continuous improvement. We move reasonably quickly, but can't commit to moving a lot faster. Hope that explains our plan.

We understand that for some use-cases, SCALE will not be the best solution - especially in the short term. However, we do and will continue to work with users where we can add value to the use-cases we have identified above. Our hope is that we will be a good open source partner to a community with these needs and welcome constructive criticism.

We agree that the excess docker-zfs snapshots are annoying and are working through the options to address them.
 

Etorix

Agreed that the target market for SCALE is not running only-containers... there are plenty of platforms and software that do that. By definition, a high function storage platform will add some unnecessary complexity. On a single server, Docker and Portainer do a great job.
Agreed here. I was not in the target market to begin with…

Thanks for reading through anyway. If I boil it down to one item (besides the snapshot issue), SCALE needs a more user-friendly interface for setting up docker containers: more compact, less directive (no fixed path, no need to edit sections where there are no custom settings).

You have two servers to work with.... many home users will want the one server (and backup options)
Well, that goes against the advice to not run TrueCommand in TrueNAS, doesn't it?

We agree that the excess docker-zfs snapshots are annoying and are working through the options to address them.
Thumbs up!
My new Linux server possibly still has some issues with excess overlays… but this is now out of sight, and I'm satisfied to assume that my tiny server will not run out of inodes and that the issue is under control by default because overlay2 on ext4 is Docker's native platform.
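To see which storage driver a Docker host actually uses, `docker info --format '{{.Driver}}'` prints its name. The small helper below is a hypothetical illustration (the function name and messages are not from the thread) that turns that name into a one-line reminder of how layers are stored.

```shell
#!/usr/bin/env bash
check_storage_driver() {
    # $1 = driver name as reported by: docker info --format '{{.Driver}}'
    case "$1" in
        overlay2) echo "overlay2: layers are directories on the backing filesystem" ;;
        zfs)      echo "zfs: layers are ZFS snapshots and clones" ;;
        *)        echo "$1: other storage driver" ;;
    esac
}

# Usage on a host with Docker installed:
#   check_storage_driver "$(docker info --format '{{.Driver}}')"
```

With the zfs driver each image layer becomes a snapshot plus clone, which is where the snapshot counts discussed in this thread come from; with overlay2 the layers never show up in any snapshot list.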
 

morganL

Agreed here. I was not in the target market to begin with…

Thanks for reading through anyway. If I boil it down to one item (besides the snapshot issue), SCALE needs a more user-friendly interface for setting up docker containers: more compact, less directive (no fixed path, no need to edit sections where there are no custom settings).


Well, that goes against the advice to not run TrueCommand in TrueNAS, doesn't it?


Thumbs up!
My new Linux server possibly still has some issues with excess overlays… but this is now out of sight, and I'm satisfied to assume that my tiny server will not run out of inodes and that the issue is under control by default because overlay2 on ext4 is Docker's native platform.

The clearer advice is "don't rely on TrueCommand running on a TrueNAS that it is also managing."

It's better to run TrueCommand on a laptop or the Cloud version.
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
I actually had TrueCharts apps before, and did not like the lengthy download and update time, nor the unsorted catalog. (It feels like a wall of mysterious names and obscure icons; no description, so even if there were something which might be of use to me, I would not find it.)

Just throwing my 10 cents in. I think there is some context around this that helps understand the current state.

In the grand scheme of things, it was a big deal/accomplishment to branch out and offer Scale as an additional product. So, a lot of long-time FreeNAS users like myself see the availability of these apps at all as a positive. Presumably, how they are organized in the GUI will be polished and improved over time as more of a "Stage 2" focus.

In some ways, the "wall of apps" issue is a credit because it indicates that the interest and user contributions to add new apps quickly maxed out the first version of a catalog.
 

truecharts

In some ways, the "wall of apps" issue is a credit because it indicates that the interest and user contributions to add new apps quickly maxed out the first version of a catalog.

Some interesting info about this:
The number of Apps and the state the current Apps system is in are already steps ahead of what was planned in late 2020.
The scale (hurhur) of what is released now is so many times larger than what was expected that, even with the fixes that got added during 2021, it simply would never be something to call "perfect".

But at least we all got it into a state that is much more than the "glorified tech-demo" that was mostly the aim in 2020.
 

Etorix

In some ways, the "wall of apps" issue is a credit because it indicates that the interest and user contributions to add new apps quickly maxed out the first version of a catalog.
Fair point, and credit to the TrueCharts team for their work. But now a better UI and way to manage the "wall of apps" is needed sooner than later.

And, as this thread and your own thread illustrate, the automatic ix-applications snapshots are an issue.
The "docker prune" cronjob should be automagically set by the system.
Even with pruning, it seems that the steady state is about 100-200 snapshots per container, and this drowns any user-defined snapshots in the UI. Maybe there should be separate interfaces to (i) snapshots of the application pool and (ii) snapshots of all other pools (i.e. "storage" pools). Or maybe there should be no GUI at all for ix-applications snapshots, if the user is never supposed to manage or delete them manually, but then the system should reliably manage and keep these snapshots in check.
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
For Bluefin, we expect to make one of these work.
Can you detail more on this? I have 712 snapshots for 6 apps, on current Scale release. The only partial fix I found is zfs-prune-snapshots, to remove all snapshots older than a week, for all pools, which trimmed down the 712 snapshots to 365 (many snapshots are 0B which cannot be deleted):
Code:
zfs-prune-snapshots 1w

What would be the expected fix in Bluefin? Thank you.
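A read-only way to see where the snapshots come from is to count them per dataset. The sketch below assumes snapshot names arrive on standard input, one per line, in the usual `dataset@snapshot` form; `count_snapshots_per_dataset` is a hypothetical helper name, and `tank` in the usage comment is a placeholder pool name.

```shell
#!/usr/bin/env bash
count_snapshots_per_dataset() {
    # Input: one snapshot name per line (dataset@snapshot).
    # Output: "<count><TAB><dataset>", largest count first.
    awk -F'@' 'NF == 2 { n[$1]++ } END { for (d in n) printf "%d\t%s\n", n[d], d }' \
        | sort -k1,1nr -k2,2
}

# Usage on a live system (lists only, deletes nothing):
#   zfs list -H -t snapshot -o name -r tank/ix-applications | count_snapshots_per_dataset | head
```

This only lists and counts; it does not change anything on the pool.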
 

truecharts

Can you detail more on this? I have 712 snapshots for 6 apps, on current Scale release. The only partial fix I found is zfs-prune-snapshots, to remove all snapshots older than a week, for all pools, which trimmed down the 712 snapshots to 365 (many snapshots are 0B which cannot be deleted):
Code:
zfs-prune-snapshots 1w

What would be the expected fix in Bluefin? Thank you.

This is incredibly risky, as Docker on ZFS depends on those snapshots and their clones.
You should never manually touch any snapshot inside ix-applications.
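In line with that warning, a cautious first step is to list old snapshots while skipping everything under ix-applications, without destroying anything. The filter below is a sketch: `old_user_snapshots` is a hypothetical name, and it assumes tab-separated `name` and `creation` columns as printed by `zfs list -Hp -t snapshot -o name,creation`, where `-p` gives the creation time as a Unix timestamp.

```shell
#!/usr/bin/env bash
old_user_snapshots() {
    # $1    = cutoff as a Unix timestamp
    # stdin = "name<TAB>creation" lines from: zfs list -Hp -t snapshot -o name,creation
    awk -F'\t' -v cutoff="$1" '
        $1 ~ /\/ix-applications(\/|@)/ { next }     # skip anything under ix-applications
        ($2 + 0) < (cutoff + 0)        { print $1 } # older than the cutoff
    '
}

# Usage: list (not delete) non-app snapshots older than one week:
#   zfs list -Hp -t snapshot -o name,creation | old_user_snapshots "$(( $(date +%s) - 7*24*3600 ))"
```

The output is only a list of candidates to review by hand.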
 

kiler129

Dabbler
Joined
Apr 16, 2016
Messages
22
This is a true madness - with just 6 apps I have over 1K snapshots. Trying to do a backup of ix-applications dataset multiplied them. Now a reboot takes 15+ minutes when the drive containing ix-applications is attached. Attempting to delete the backup dataset from webUI caused the webUI to log me out before the operation even finished.

Honestly, with this issue not solved the apps cannot really be used on TNS.
 

truecharts

"Trying to do a backup of ix-applications"
Hope you followed our guide, never manually made a snapshot and read our important excludes when doing zfs send-recv ;-)
 

Etorix

Hope you followed our guide, never manually made a snapshot and read our important excludes when doing zfs send-recv ;-)
Fair warning, but a direct link to said guide would be useful.
I hope you agree that it would be better if these automatic snapshots were hidden from the GUI, and automatically protected/excluded in GUI-generated replication tasks—and that it would be best if there was no such overwhelming amount of app-related snapshots in the first place.

Let's see how it improves in future releases.
 

kiler129

"Trying to do a backup of ix-applications"
Hope you followed our guide, never manually made a snapshot and read our important excludes when doing zfs send-recv ;-)
Let's just say I did a RTFM after the fact :wink:

Fair warning, but a direct link to said guide would be useful.
Backup and Restore | TrueCharts

I hope you agree that it would be better if these automatic snapshots were hidden from the GUI, and automatically protected/excluded in GUI-generated replication tasks (...)
I think there should be a suggestion to exclude things, similar to how the sub-datasets are hidden in the GUI. However, they shouldn't totally hide all snapshots from the GUI, as this will create confusion about why some things are shown and some aren't.

(...) and that it would be best if there was no such overwhelming amount of app-related snapshots in the first place.

Let's see how it improves in future releases.
https://github.com/moby/moby/issues/41055 - to my understanding this isn't really about TNS but how ZFS driver works in Docker.
 

morganL

Wanted to wrap this up by making it clear that SCALE Bluefin includes OverlayFS for ZFS. This removes all the excess snapshots.

I'll mark this as solved.

 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
And it seems to work.
I have 478 Snapshots of the pool with ix-applications on (and other stuff). Almost all of them are my snapshots and not the apparently self-breeding ones from before, of which there is an apparent total lack. I can now see the wood for the trees.

@kiler129 your link appears to be broken
 