SOLVED Scheduled Periodic Snapshot Tasks - Never Happen.

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Hi
I wonder if anyone can throw light on this mystery. One of the features of TrueNAS that first attracted me to it was the idea of scheduled snapshots. Yet despite setting this up when I first built the NAS back in April, with a scheduled weekly snapshot, I find that now, nearly 3 months later, it has never fired, not once.

My relatively simple home TrueNAS has just 2 top level datasets, which I call 'NAS' and 'System'. This is how the 'Periodic Snapshot Tasks' section looks:



Note the state is 'PENDING' - they have never run. Since the setup is identical for both, here's a look at the Edit section for the NAS task:

screenshot.1.jpg


Because this section is so uncomplicated, I can't think what I could have done wrong here, or why these schedules are never actioned. They just stay 'Pending'. For what it's worth, I have executed several snapshots manually, and that has had no impact on the failure of these scheduled tasks. What could be going on here, anyone please? Thank you!
 

Joined
Oct 22, 2019
Messages
3,641
I've personally had plenty of issues when using uppercase letters in automatic snapshot names, as well as when putting an underscore immediately after the timestamp string.

I don't believe this is a bug with ZFS or manual snapshots. I think it has something to do with how zettarepl deals with particular characters/strings used in automatic snapshots and replication tasks.

Essentially, these two snapshot naming schemes, used in my automatic snapshots and/or replication tasks, gave me problems in the past (the problematic part is noted after each one):
  • WEEKLY-auto-%Y-%m-%-d_%H-%M (the uppercase "WEEKLY")
  • auto-weekly-%Y-%m-%-d_%H-%M_6exp (the "_6exp" after the timestamp)
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Hi
Thank you! That's just the sort of irrational gotcha that I'd be (and have been) defenceless against.

I have no idea what zettarepl is but I shall Google it tomorrow. You suggest this bug/'feature' may affect Replication tasks too. This worries me, because I use the same naming scheme (uppercase elements & underscore) there too, and they work correctly. That suggests maybe this isn't the issue with auto Snapshots? Only way to find out is to test it. It's very late here now so that's on the agenda for tomorrow. I'll keep you posted on if it's a fix in this case! Thanks again for this tip.
 
Joined
Oct 22, 2019
Messages
3,641
That suggests maybe this isn't the issue with auto Snapshots? Only way to find out is to test it. It's very late here now so that's on the agenda for tomorrow. I'll keep you posted on if it's a fix in this case! Thanks again for this tip.

It's a tricky one, I know. I'm going off of some hiccups from my past usage of FreeNAS/TrueNAS, but the best way to know for sure is to set up a new Periodic Snapshot Task and have it make a snapshot of a dataset every 5 minutes, so you don't have to wait long to test this solution.

Under Minutes you would enter: */5
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Hi Winnie!
Thank you! I tried the exact test you suggested and... it worked! Thank you So Much! I would never have solved this without you.

Two questions arising if I may. First, can you tell me about your */5 syntax? It worked perfectly, but I also tried just 5. Since there's a field for minutes, I would have thought that values between 0 and 59 were invited, fool that I am. In fact '5' did not work and nothing was triggered. So what does '*/5' actually translate as, and should I use this same syntax for specifying hours and days too?

My second question: since these scheduled Snapshots also specify how long to keep the snapshot for (3 weeks in my case), if I delete the schedule once it's fired, will that leave the Snapshots it created 'orphaned', or will they still be deleted 3 weeks after their creation date?

Thank you!
 
Joined
Oct 22, 2019
Messages
3,641
Two questions arising if I may. First, can you tell me about your */5 syntax? It worked perfectly, but I also tried just 5. Since there's a field for minutes, I would have thought that values between 0 and 59 were invited, fool that I am. In fact '5' did not work and nothing was triggered. So what does '*/5' actually translate as, and should I use this same syntax for specifying hours and days too?
I might sound harsh or critical of TrueNAS, but please understand I think it's a great product overall. With that out of the way, what follows might sound like ranting or being negative.

The way options and tooltips are presented and described in TrueNAS CORE leaves much to be desired. Some sections are archaic, while others are downright confusing. Why do they make you manually "type in" an expiration lifetime? Why not choose values from a "wheel" or drop-down? Why do they make you type in asterisks and integers in other fields? Why not let us click on a calendar to create a template? I know these user-friendly additions would require a lot of work from the developers, but they would fit the mantra of "TrueNAS is an appliance, use the GUI!"

When you enter only a number in the Minutes field, it means the task will only run "on that minute of the hour".
  • In other words, if you enter 5, it means it will run at 00:05, 01:05, 02:05, 03:05, etc.
  • If you enter */5, the asterisk means "every minute" and the /5 means "in steps of 5", so it will run every 5 minutes: 00:00, 00:05, 00:10, 00:15, etc.
There is an extra field below that will let you specify a "window" when this is allowed, such as from 00:00 to 06:00, meaning after 6am the task will be ignored until the next midnight arrives.

The same principle applies to Hours and Days. I suggest keeping the task simple and not trying to go crazy with a highly nuanced schedule, especially for home use.
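
To make the difference between 5 and */5 concrete, here is a minimal Python sketch of how a cron-style Minutes field can be read. It is only an illustration of the notation: the function name minutes_matched is made up for this example, it handles just the three simple forms discussed here, and it is not how TrueNAS itself parses the field.

```python
def minutes_matched(field: str) -> list[int]:
    """Return the minutes of the hour (0-59) selected by a simplified cron-style field.

    Hypothetical helper for illustration only; supports "*", "*/N" and a single number.
    """
    if field == "*":
        # A bare asterisk selects every minute of the hour.
        return list(range(60))
    if field.startswith("*/"):
        # "*/N" means: start from every minute, then keep only each N-th one.
        step = int(field[2:])
        return list(range(0, 60, step))
    # A plain number selects exactly that minute of each hour.
    return [int(field)]


print(minutes_matched("5"))    # [5]
print(minutes_matched("*/5"))  # [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
```

So a plain 5 gives one run per hour (at five past), while */5 gives twelve runs per hour.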


My second question: since these scheduled Snapshots also specify how long to keep the snapshot for (3 weeks in my case), if I delete the schedule once it's fired, will that leave the Snapshots it created 'orphaned', or will they still be deleted 3 weeks after their creation date?
The experience that I and others have had is yes, it will leave them orphaned and untouched. To this day, I'm still trying to figure out how expired snapshots are handled, and whether using similarly named snapshots risks an accidental "overlap" where snapshots become casualties, deleted because they share the same naming schema as those from another task.

From what I understand, zettarepl parses the names of the snapshots, and if there's a match between the timestamp and "X amount of time before the system clock's current time" then it deletes the snapshot. Either that, or it only parses the names to see if a snapshot belongs to a certain Periodic Snapshot Task, and from there it reads the "creation time" of the snapshot to determine if it should be deleted. Honestly, I could never figure it out for certain. :frown:

A rule of thumb I use is to describe the snapshot first, then append with a hyphen, followed by the timestamp string. Nothing more. Some examples include:
  • auto-hourly-%Y-%m-%-d_%H-%M
  • auto-weekly-%Y-%m-%-d_%H-%M
  • auto-monthly-%Y-%m-%-d_%H-%M
  • backup-%Y-%m-%-d_%H-%M
 
Joined
Oct 22, 2019
Messages
3,641
On a related note, I hope others chime in and vote on the issue at iXsystems' Jira bug tracker. It's related to your concern about expired / protected snapshots:


 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
Hi WinnieLinnie!
I really can't thank you enough for your detailed answer; seriously appreciated, I can tell you! The practice of needing a '*/' before the time measure is SO counter-intuitive and user-hostile. Surely the situation where the user has some reason to want things to run at x minutes past the hour is so obscure, certainly relative to the simple case of 'every x minutes/hours/days', that the perverse notation should only be required to express that perverse requirement?!! I agree with everything you said about creating a drop-down, although as something of an HTML guru, I disagree that it would take much time to do that. It would be a matter of minutes to fix that, honestly. Perhaps that's something to add to the Jira? And bravo on your comment about the 'use the GUI' mantra. For people like myself with zero Linux(esque) and limited CLI experience, the GUI is my only option. The way of doing things in the GUI at the moment is about as user-friendly as an air raid.

Thank you again for everything!
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
On a related note, I hope others chime in and vote on the issue at iXsystem's Jira bug tracker. It's related to your concern about expired / protected snapshots:
I just got bitten by this issue. I can't see how to vote for the ticket in Jira. I do have an account, sure I have voted in the past, no clue why I can't see how to do so now.
Help, please
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188
I just got bitten by this issue. I can't see how to vote for the ticket in Jira. I do have an account, sure I have voted in the past, no clue why I can't see how to do so now.
Help, please
Hi!
Happy to help! Just log into your Jira account, then click on the 'Vote' count near the top right corner of the Jira issue page. Note, it may be covered by an unhelpful pop-up that appears over exactly that area. Just click the X in the top right corner of the pop-up to see the 'Vote' count beneath.

Cheers.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
I've personally had plenty of issues when using uppercase letters in automatic snapshot names, as well as when putting an underscore immediately after the timestamp string.

I don't believe this is a bug with ZFS or manual snapshots. I think it has something to do with how zettarepl deals with particular characters/strings used in automatic snapshots and replication tasks.

Essentially, these two snapshot naming schemes, used in my automatic snapshots and/or replication tasks, gave me problems in the past (the problematic part is noted after each one):
  • WEEKLY-auto-%Y-%m-%-d_%H-%M (the uppercase "WEEKLY")
  • auto-weekly-%Y-%m-%-d_%H-%M_6exp (the "_6exp" after the timestamp)
Thanks for highlighting this. I'd recommend reporting it as a bug (is it reported?).

Even if the cause is zettarepl, TrueNAS should have a goal of minimizing these headaches by warning users about these mistakes, preventing them, or autocorrecting them. In the short term, the documentation could be improved: https://www.truenas.com/docs/core/tasks/periodicsnapshottasks/
 

NumberSix

Contributor
Joined
Apr 9, 2021
Messages
188

From what I understand, zettarepl parses the names of the snapshots, and if there's a match between the timestamp and "X amount of time before the system clock's current time" then it deletes the snapshot. Either that, or it only parses the names to see if a snapshot belongs to a certain Periodic Snapshot Task, and from there it reads the "creation time" of the snapshot to determine if it should be deleted. Honestly, I could never figure it out for certain. :frown:

Hi
I think I tracked down how zettarepl decides which Snapshots to delete and how. It uses a config file written in YAML. Within that there is a section, and here I quote from the documentation at https://github.com/truenas/zettarepl:

# This is a very important parameter that defines how your snapshots would
# be named depending on their creation date.
# zettarepl does not read snapshot creation date from metadata (this can be
# very slow for reasonably big amount of snapshots), instead it relies
# solely on their names to parse their creation date.
# Due to this optimization, naming schema must contain all of "%Y", "%m",
# "%d", "%H" and "%M" format strings to allow unambiguous parsing of string
# to date and time accurate to the minute.
# Do not create two periodic snapshot tasks for same dataset with naming
# schemas that can be mixed up, e.g. "snap-%Y-%m-%d-%H-%M" and
# "snap-%Y-%d-%m-%H-%M". zettarepl won't be able to check for it on early
# stage and will get confused.
naming-schema: snap-%Y-%m-%d-%H-%M

So I hope that clarifies one mystery: zettarepl deletes on the basis of the date in the Snapshot name and not on the basis of its creation timestamp in the metadata.
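
To illustrate the principle (and only the principle), here is a small Python sketch of name-based expiry. This is not zettarepl's actual code: the function names parse_snapshot_time and is_expired, the fixed "now" value and the choice to leave non-matching names alone are all my own assumptions. The only thing taken from the documentation is the example naming schema.

```python
from datetime import datetime, timedelta

# Naming schema copied from the quoted zettarepl example.
NAMING_SCHEMA = "snap-%Y-%m-%d-%H-%M"


def parse_snapshot_time(name, schema=NAMING_SCHEMA):
    """Return the datetime encoded in a snapshot name, or None if it does not fit the schema."""
    try:
        return datetime.strptime(name, schema)
    except ValueError:
        return None


def is_expired(name, lifetime, now, schema=NAMING_SCHEMA):
    """Decide expiry purely from the date in the name (hypothetical logic, for illustration)."""
    created = parse_snapshot_time(name, schema)
    if created is None:
        # Assumption of this sketch: a name that does not match is simply left alone.
        return False
    return now - created > lifetime


now = datetime(2021, 7, 4, 0, 0)   # fixed "current time" so the example is repeatable
three_weeks = timedelta(weeks=3)

print(is_expired("snap-2021-06-01-00-00", three_weeks, now))  # True  (33 days old)
print(is_expired("snap-2021-06-27-00-00", three_weeks, now))  # False (7 days old)
print(is_expired("manual-backup", three_weeks, now))          # False (name carries no date)
```

Notice that nothing in this sketch ever looks at ZFS metadata; the name is the only source of the date.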

And yes, this obscure way of formatting the periodicity of the schedule should be reported as a bug, as MorganL has suggested. Let me know if it happens. I for one will add my vote!

Thanks.
 
Joined
Oct 22, 2019
Messages
3,641
So I hope that clarifies one mystery: zettarepl deletes on the basis of the date in the Snapshot name and not on the basis of its creation timestamp in the metadata.

Just based on that, I would advise using only the default snapshotname-%Y-%m-%d_%H-%M for all snapshots and tasks. After all, "day" and "month" can be easily confused, especially if there aren't enough snapshots to see a long-term pattern. "Is 08-11 August 11th or November 8th?"
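
A quick way to see the day/month mix-up is to parse one name with both of the schemas that the zettarepl documentation warns about. The snapshot name below is an invented example; the parsing uses Python's standard datetime.strptime.

```python
from datetime import datetime

name = "snap-2021-08-11-02-00"  # invented example name

# The same string is valid under both schemas, but yields two different dates.
as_month_day = datetime.strptime(name, "snap-%Y-%m-%d-%H-%M")
as_day_month = datetime.strptime(name, "snap-%Y-%d-%m-%H-%M")

print(as_month_day.date())  # 2021-08-11 (August 11th)
print(as_day_month.date())  # 2021-11-08 (November 8th)
```

Both calls succeed, so nothing in the name itself reveals which task created the snapshot or how old it really is.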

Many thanks for confirming it is indeed all based on the snapshot's name, and that the creation time is never used. This further emphasizes the importance of carefully naming your snapshots and tasks.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Hi!
Happy to help! Just log into your Jira account, then click on the 'Vote' count near the top right corner of the Jira issue page. Note, it may be covered by an unhelpful pop-up that appears over exactly that area. Just click the X in the top right corner of the pop-up to see the 'Vote' count beneath.
Thanks, I finally found it: It was under the "More" button - an option to "Add Vote":

1625362347036.png


For m,e
 