I keep going back and forth between thinking you're right and thinking this isn't actually an issue. I had to break out some paper and diagram a few example cases before I was convinced that there actually needs to be a change. As you say, if the second most recent snapshot is empty and the most recent is not, the current code will not consider either. With your updated code, the second most recent snapshot would still be considered.
Thanks for noticing this.
How does this look? I haven't tested it.
Code:
latest = None
for snapshot in sorted(snapshots[dataset], key=lambda snapshot: snapshots[dataset][snapshot]['creation'], reverse=True):
if not snapshot.startswith("auto-") \
or snapshots[dataset][snapshot]['type'] != "snapshot" \
del snapshots[dataset][snapshot]
continue
if not latest:
latest = snapshot
del snapshots[dataset][snapshot]
continue
if snapshots[dataset][snapshot]['used'] != '0' \
or snapshots[dataset][snapshot]['freenas:state'] != '-' \
or snapshot in deleted[dataset].keys():
del snapshots[dataset][snapshot]
continue
# Stop if no snapshots are in the list
if not snapshots[dataset]:
del snapshots[dataset]
continue
I also kept the loop together by sorting the hash by creation date and then keeping track of 'latest'. This also means that before we were looping over N, checking the list size, then sorting N-D, and checking the size. Correcting the algorthim meant looping over N, checking the list size, sorting N-D1, checking the list size, looping over N-D1-1, and checking the list size again. Now it sorts N, loops N, checks the list size. I'll leave the Big O comparison as an exercise for the reader, but while the incorrect algorthim was probably technically faster, the correct method shouldn't be significantly slower. Pfft, how many snapshots do people have anyway. In my experience it's deleting snapshots that actually takes time and once the scripts are running periodically the delay isn't noticeable because they're only deleting one or two snapshots anyway. Heck they're probably run by cron so it's not noticeable at all. Even if they're only run once a day or week, odds are in that case they're going to be run during downtime when it's not going to impact anyone anyway. The takeaway here is that simple bug reports lead to far too much thought.
Unfortunately, I won't be able to commit a fix for at least 12 hours. I think I'll be able to review a pull request much sooner though, so If you have your own patch, feel free.