PLEASE HELP! "Restore Previous Versions" Semi-working after upgrade from FreeNAS 9.10.2 (Read: Desperately need your wisdom)

M H

Explorer
Joined
Sep 16, 2013
Messages
98
So after running on FreeNAS 9.10.2-U6 for approximately the past 8 years, I decided to update to 11.3-U5 and then TrueNAS 12. Everything seems to be working well; the main hurdle has been the learning curve with the new GUI and how things are done. I like the updated GUI, and it seems to do most things better.

Anyway, a MAJOR issue I ran into is that Windows' "Restore Previous Versions" is working only partially, and seemingly at random.

At first, the previous versions didn't show up at all. After a bunch of fiddling around (recreating periodic snapshot tasks, deleting and recreating my SMB shares, cleaning up old snapshots short of deleting them all, letting snapshots be created and only then setting up the SMB share), I was able to get some to show. For context, I have about a year's worth of snapshots, taken hourly, 5 days a week.

But it's now completely random what shows in Windows. The snapshots are being taken correctly according to the TrueNAS GUI, but only a random subset makes it to "Restore Previous Versions." I've also tried downgrading to 11.3-U5, with the same issue. Both versions show the snapshots randomly, although each showed different individual snapshots for the same day (for example, 11.3-U5 showed 6AM, 10AM, and 2PM, while 12.0 showed 7AM, 8AM, 10AM, and 1PM, all from the same snapshots listed in the GUI). I've attached an example of the difference between the FreeNAS GUI and "Restore Previous Versions."

I haven't tried downgrading to 9.10.2 again, but I'm sure that will work normally as it did for so long. I really miss the option on the new GUI to be able to select which snapshot tasks are used for "Restore Previous Versions." I'm sure there was a reason they removed it, but I really need help here.

We use this feature often! We have thousands of customer folders in our main data folders and many people accessing them, so all kinds of stupid things happen: people accidentally dragging folders into other folders, deleting folders because they disabled the delete confirmation, ransomware encrypting an entire folder, people editing files that others are currently working on or that haven't been assigned to them; the list goes on. This feature has saved us too many times to count, and very quickly: open the older version, copy/paste, done.

I've searched and searched and haven't found a way to get this working the way it used to (and should). Any help would be greatly appreciated, as we're down for the weekend and I have until Sunday night to decide whether I'm going all the way back to 9.10.2. I have config backups for it, but there are iSCSI shares and other things that won't just come over 1 to 1.

Oh wise gurus of Free/TrueNAS, please show me the way! Thank you. Stay well and stay safe.
 

Attachments

  • Previous Versions Same Day.png
  • Snapshot List.png

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
As far as I remember, you will see only non-zero-size snapshots. Meaning that if a snapshot has zero size (so nothing changed since the last one), it won't be listed in "Previous Versions". I guess that was changed back in 10.2 or so? Anyway, I don't see anything about this in the recent Docs, so I might be completely wrong here. Yet I am quite sure I saw some post/thread about this a while ago. Maybe our SMB GrandMaster @anodos could help/clarify this? :]
 

M H

Great! Thank you! That's a definite start. Now how does the "Allow Taking Empty Snapshots" option affect this, if at all? It would make perfect sense that only non-zero snapshots are displayed, especially since my testing today has not really changed the contents of the dataset much. Oh mighty @anodos, please chime in so that I know I can reliably start our work week next week in case of stupidity. Mine and potentially someone else's. :)

Ashampoo_Snap_Saturday, December 12, 2020_21h07m18s_001_.png
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
I have about a year's worth of snapshots, taken hourly, 5 days a week

So that means 120 snapshots per week, for 52 weeks, for a total of 6,240 snapshots. That is way too many.

I recommend you re-design your snapshot schedule on a multi-generation model. Here is what I do:
1 snapshot per 15 minutes, saved for 3 days
1 snapshot per hour, saved for 1 week
1 snapshot per day, saved for 4 months
1 snapshot per week, saved for 1 year
1 snapshot per 4 weeks, saved for 4 years

I also take snapshots only between 5:00 AM and midnight.

That gives me:
204 snapshots (15 minutes)
119 snapshots (1 hour)
122 snapshots (1 day)
52 snapshots (1 week)
52 snapshots (4 weeks)

Total is 549 snapshots. That model gives me better granularity in the short term, similar granularity in the mid term and much longer retention, all of that for about 9% of your snapshot count.
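
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the per-tier counts are copied from the post as given (not re-derived from the schedule), and 24 snapshots per day is the assumption behind the 120-per-week estimate.

```python
# Per-tier snapshot counts, copied from the post rather than re-derived.
tier_counts = {
    "every 15 minutes, kept 3 days": 204,
    "hourly, kept 1 week": 119,
    "daily, kept 4 months": 122,
    "weekly, kept 1 year": 52,
    "every 4 weeks, kept 4 years": 52,
}

# Flat schedule: assumes 24 hourly snapshots a day, 5 days a week, for 52 weeks.
flat_total = 24 * 5 * 52

tiered_total = sum(tier_counts.values())
share = tiered_total / flat_total

print(f"flat schedule:   {flat_total} snapshots")
print(f"tiered schedule: {tiered_total} snapshots")
print(f"tiered / flat:   {share:.1%}")
```

The ratio works out to a little under 9%.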

The fact that you have so many of them may explain why Windows is having so much trouble scanning all of them and ends up with random results.

Maybe if you let it scan for ages, it will get there and discover all of them?

Still, I recommend you re-design that policy into a multi-generation model fitting your needs. You can surely simplify that model greatly...
 

M H

I absolutely appreciate this suggestion. But I don't believe that this is the issue. On 9.10.2, I had over 22,000 snapshots at any given time (I only take snapshots 5 days a week, 7AM to 7PM when our design guys are around) and Windows had absolutely no problem loading them. It was instantaneous. I simply didn't age them out because I liked the security and saw no negative performance effect. The only thing that took forever to load was the snapshot listing in the FreeNAS GUI, which needed a solid 3-5 minutes to populate. I'll post a screenshot of them still listed as soon as I get back to a machine that can access it.

I will absolutely take your suggestion under consideration and truly appreciate it. I assume that with differing schedules as long as they're of the same dataset, they will show up in "Previous Versions"?

EDIT: I was a little low on my estimate of 22,000. More like 40,000. This is on my decommissioned server and "Previous Versions" still works perfectly on 9.10.2.

Ashampoo_Snap_Saturday, December 12, 2020_23h09m43s_002_.png

Here you can see the behavior on the decommissioned server with the 40,000 snapshots. I select the folder, click "Restore Previous Versions," and it populates instantly and scrolls perfectly (the lag is because I'm on remote desktop, screen capturing, etc.)

https://drive.google.com/file/d/10fgq-de8tnXD4hc0UXOj5kwk-Jk10koJ/view?usp=sharing


On the updated production server, as you can see below, I cut down the snapshots as much as I could in case this was a newly introduced constraint. But I will take any suggestion at this point. I desperately need this resolved by Monday morning, or I know the shit will hit the fan sometime in the coming weeks. Again, I can't thank you enough for your attention, time and suggestions.

Ashampoo_Snap_Saturday, December 12, 2020_23h18m29s_003_.png
 

Heracles

I had over 22,000 snapshots at any given time

Yes, but your Windows server "discovered" each and every one of them one at a time. Now, it must discover all of them at once...

I assume that with differing schedules as long as they're of the same dataset, they will show up in "Previous Versions"?

I do not expose my snapshots outside of my FreeNAS server, so I have no actual experience with this specific point. What I can tell you is that whenever a, say, 15-minute snapshot expires and its data is still required by the next 1-hour snapshot, the 1-hour one inherits that data and keeps it. It is as if these snapshots grow every now and then when a lower generation expires. So yes, I do expect all of these snapshots to show, because they are indeed linked by the dataset.

I have also been Windows-free for well over a decade now. My recollection is that it is designed to work half the time and achieves that goal very well...
 

M H

Thank you for your post and help. I don't think I understand. The example of the server with 40,000 snapshots is working perfectly fine and did so for 8 years (it is now decommissioned). The issue I'm having is with my newer production server, which does not behave at all like my old server did. I have exactly the same dataset arrangement, exactly the same snapshot schedule, exactly the same shares; everything is identical except the FreeNAS version. The old server ran 9.10.2 and was exactly what I needed. The new server runs TrueNAS 12.0 and will not let Windows list my "Restore Previous Versions" entries properly.

Something is wrong or not configured correctly and I simply cannot put my finger on it. I'm in the process of downgrading back to 9.10.2 to confirm that everything works there. I will then upgrade one major version at a time until I find the version at which "Restore Previous Versions" stops functioning correctly. I was hoping to avoid this and that someone would have a solution. I've found nothing but outstanding experience and knowledge on these forums. They've saved me quite a few times, for which I'm grateful.

Anyway, if anyone could please guide me towards something to try, I'd greatly appreciate it, because the update to FreeNAS 11.3 and TrueNAS 12 is inevitable, so I'd like to get it working rather than resorting to old versions.

Again, thanks everyone who's contributed so far, I truly appreciate it.
 

Heracles

The example of the server with 40,000 snapshots is working perfectly fine

Maybe, but that server started with access to a pool with no snapshots. An hour later, it had to "learn" about a first snapshot. Next hour, another snapshot is to be discovered. Again, one snapshot at a time, one per hour, that is a rate that the server can keep up with.

Now, your new server must discover and achieve in an instant what the other had 8 years to discover and achieve. That is a pretty big difference, and it has nothing to do with the FreeNAS server.
 

M H

Ahh! Now I completely understand what you're saying and it makes sense. Sorry that I was a little slow picking up what you're putting down. I think that you were on the right track. I took the risk (because we have an identical off-site server for replication that has all of our snapshots) and deleted ALL of my snapshots last night after making sure everything was configured correctly. Sure enough, all the snapshots taken today are showing up in "Previous Versions." All of them, for every hour taken, freaking amazing! It's weird how many times it's happened now with FreeNAS that things just started working after a day or two. I haven't really figured out the why, but it's working for now and I guess that's what really matters anyway.

Ashampoo_Snap_Sunday, December 13, 2020_15h10m31s_001_.png

Ashampoo_Snap_Sunday, December 13, 2020_15h11m49s_002_.png


And there it is, I can go into work tomorrow knowing that any potential s*** in the fan will be able to be cleaned out with the garden hose. Thanks for all your prompt help everyone, on a weekend at that. As I've said before, a lot of good knowledgeable people in these forums.

Stay safe, stay healthy, and stay classy.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
There are limits to how many snapshots a Windows client can see in "Previous Versions". The upper bound depends on client version / configuration, but I think it is typically somewhere around 1,000. In legacy versions of FreeNAS, we restricted the snapshots presented as shadow copies based on snapshot "task". Starting in 11.3, we switched to presenting all snapshots (because a snapshot is a snapshot), but this presented a couple of problems:
1) how do you limit the number of snapshots presented to users?
2) how do you keep performance up (so that you don't have 1000 smbd processes all calling zfs_iter_snapshots_sorted())?

The answer to (1) that works in most cases is as follows:
a) We only present snapshots where ZFS_PROP_WRITTEN > 0. If WRITTEN is 0, then no unique data is contained in the snapshot. This means that it is not relevant to end-users (we only care about snapshots where something is different).
b) The FSCTL to query previous versions takes a path as an argument (you're getting an array of shadow copy info for a given path). We also only return previous versions in which the path being queried actually exists, and if it's a file, then we only present snapshots with unique timestamps.
In most cases (a) + (b) will yield only a handful of relevant snapshots.
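
As a rough illustration of filter (a), the sketch below parses tab-separated output of the kind produced by `zfs list -H -p -t snapshot -o name,written <dataset>` and keeps only snapshots whose `written` value is above zero. It is not the code Samba runs; the function name and the sample snapshot names are made up for the example.

```python
def snapshots_with_changes(listing: str) -> list[str]:
    """Return names of snapshots whose 'written' property is greater than zero.

    `listing` is expected to look like the output of
        zfs list -H -p -t snapshot -o name,written <dataset>
    i.e. one "name<TAB>bytes" pair per line, with no header.
    """
    kept = []
    for line in listing.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, written = line.split("\t")
        if int(written) > 0:
            kept.append(name)
    return kept


# Hypothetical sample: hourly snapshots, two of them taken with no changes.
sample = (
    "tank/data@auto-2020-12-12_07-00\t0\n"
    "tank/data@auto-2020-12-12_08-00\t131072\n"
    "tank/data@auto-2020-12-12_09-00\t0\n"
    "tank/data@auto-2020-12-12_10-00\t4096\n"
)

for name in snapshots_with_changes(sample):
    print(name)
```

With this sample only the 08-00 and 10-00 snapshots survive, which is the kind of "some hours are missing" picture a quiet dataset would produce in "Previous Versions".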

For cases where this isn't sufficient, you can set a pattern for either inclusion or exclusion (accepts Unix wildcard characters ?*).
shadow:include="*daily*" for instance will only present snapshots with the word "daily" in their names. shadow:exclude="archive*" will exclude all snapshots with names starting with "archive".
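
To get a feel for how such patterns select snapshots, here is a small sketch using Python's `fnmatch` module, which implements Unix-style `?` and `*` wildcards. It only approximates the matching described above (Samba's own matching may differ in details), applying both patterns at once is an assumption of this sketch, and the snapshot names are invented.

```python
from fnmatch import fnmatchcase


def visible_snapshots(names, include=None, exclude=None):
    """Apply an optional include pattern, then an optional exclude pattern."""
    result = []
    for name in names:
        if include is not None and not fnmatchcase(name, include):
            continue  # does not match the include pattern
        if exclude is not None and fnmatchcase(name, exclude):
            continue  # matches the exclude pattern
        result.append(name)
    return result


# Hypothetical snapshot names.
names = [
    "auto-daily-2020-12-12",
    "auto-hourly-2020-12-12_08-00",
    "archive-2019-q4",
    "manual-before-upgrade",
]

print(visible_snapshots(names, include="*daily*"))
print(visible_snapshots(names, exclude="archive*"))
```

The first call keeps only the "daily" snapshot; the second keeps everything except the one whose name starts with "archive".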

The answer to (2) is of course the answer-to, and root-of, so many problems in computing - caching the results.
At present each smbd process keeps its own shadow copy cache that's lazy-initialized (as users actually start trying to access shadow copies). Snapshot query results are cached by default for 5 minutes. Mappings from a timestamp-prefixed client path to the absolute path in the ZFS snapshot directory are stored in what is basically an rb-tree with a maximum size of 512 KiB (when the limit is hit, the least recently used entries are dropped).
The end result of this is fairly quick performance when navigating around previous versions, but there is potential for some user confusion as it may take 5 minutes for a recent snapshot to "appear".
This latter issue primarily affects people reviewing / testing the product, but I don't think it will be an issue in real-life production because people usually want to recover older versions of files than the one in the snapshot taken a couple of minutes ago.
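
The "may take 5 minutes to appear" effect is what any time-based cache does. Below is a minimal sketch of the idea, not Samba's implementation: the class name, the fake clock and the snapshot names are all hypothetical, and only the 300-second lifetime comes from the description above.

```python
import time


class SnapshotListCache:
    """Cache a snapshot listing and refresh it only after `ttl` seconds."""

    def __init__(self, fetch, ttl=300.0, clock=time.monotonic):
        self._fetch = fetch      # callable returning the current snapshot list
        self._ttl = ttl          # 300 s mirrors the 5-minute default
        self._clock = clock      # injectable so the example can fake time
        self._cached = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._ttl
        if expired:
            self._cached = list(self._fetch())
            self._fetched_at = now
        return self._cached


# Simulated time and a snapshot list that grows behind the cache's back.
fake_now = [0.0]
snapshots = ["hourly-07-00"]
cache = SnapshotListCache(lambda: snapshots, clock=lambda: fake_now[0])

first = cache.get()                # populates the cache
snapshots.append("hourly-08-00")   # a new snapshot is taken
fake_now[0] = 120.0
stale = cache.get()                # 2 minutes later: still the cached list
fake_now[0] = 300.0
fresh = cache.get()                # 5 minutes later: the new snapshot appears

print(first, stale, fresh, sep="\n")
```

Someone testing by taking a snapshot and immediately opening "Previous Versions" sits in the `stale` window, which is why the new entry can seem to be missing.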

40,000 snapshots will still be painful in terms of how long it takes to return results. You can get a rough ballpark figure by timing how long it takes to complete "zfs list -t snap <dataset>" (since that uses the same libzfs call that we use in samba).
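
If you want to repeat that measurement a few times, the timing can be scripted. This is a sketch that assumes `zfs` is on the PATH and that you substitute your own dataset name; the function name and the `run` parameter (there only so the call can be swapped out) are hypothetical.

```python
import subprocess
import time


def time_snapshot_listing(dataset, run=subprocess.run):
    """Return the seconds taken by `zfs list -t snap <dataset>`."""
    argv = ["zfs", "list", "-t", "snap", dataset]
    start = time.perf_counter()
    # Discard the listing itself; only the elapsed time matters here.
    run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start
```

On the NAS itself, something like `print(f"{time_snapshot_listing('tank/data'):.2f} s")` would print the elapsed time, where `tank/data` is a placeholder dataset name. Treat the number as a rough ballpark only.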
 