Noob needs help with Snapshot errors

choong

Dabbler
Joined
Jul 2, 2014
Messages
10
I set up a new replication target by sending snapshots from Server1 (9.10.2) to Server2 (11.2). Everything ran fine for the first few hours. When I came in the next day, I noticed that snapshots had stopped. So I deleted all the snapshots on both S1 and S2 and tried to rebuild everything. Now S1 has stopped taking snapshots: even with a 5-minute interval, nothing appears. Searching here for answers, I saw someone say that 9.10.2 had problems with snapshots, so I upgraded S1 to 11.2. Still nothing happens. Creating a snapshot manually by clicking the button works just fine.

Although the manual says that errors from a failed snapshot go to /var/log/messages, I checked there and found nothing. I checked debug.log and it shows this:
Oct 29 12:43:01 FPFS /autosnap.py: [tools.autosnap:143] Process 69525 gone
Oct 29 12:46:01 FPFS /autosnap.py: [tools.autosnap:136] Checking if process 70332 is still alive
Oct 29 12:46:01 FPFS /autosnap.py: [tools.autosnap:143] Process 70332 gone
Oct 29 12:46:17 FPFS uwsgi: [middleware.notifier:179] Popen()ing: /sbin/zfs list -t volume -o name -H
Oct 29 12:46:17 FPFS uwsgi: [middleware.notifier:179] Popen()ing: /sbin/zfs list -p -t snapshot -H -S creation -o name,used,available,referenced,mountpoint,freenas:vmsynced
Oct 29 12:47:01 FPFS /autosnap.py: [tools.autosnap:136] Checking if process 70470 is still alive
Oct 29 12:47:01 FPFS /autosnap.py: [tools.autosnap:143] Process 70470 gone

I checked the cron log and saw this:
Oct 29 12:45:00 FPFS /usr/sbin/cron[70332]: (root) CMD (/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1)
Oct 29 12:45:00 FPFS /usr/sbin/cron[70333]: (root) CMD (/usr/libexec/atrun > /dev/null 2>&1)
Oct 29 12:46:00 FPFS /usr/sbin/cron[70470]: (root) CMD (/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1)
Oct 29 12:47:00 FPFS /usr/sbin/cron[70771]: (root) CMD (/usr/local/bin/python /usr/local/www/freenasUI/tools/autosnap.py > /dev/null 2>&1)

So I created a snapshot task with the default values, ran the Python script manually, and got this:

[root@FPFS /usr/local/bin]# python /usr/local/www/freenasUI/tools/autosnap.py
Traceback (most recent call last):
  File "/usr/local/www/freenasUI/tools/autosnap.py", line 301, in <module>
    if snap_expired(snap_infodict, snaptime):
  File "/usr/local/www/freenasUI/tools/autosnap.py", line 111, in snap_expired
    snapinfo_expirationtime = snapinfo_expirationtime + timedelta(days=int(365.2425 * snap_ttl_value))
OverflowError: date value out of range

My system date and time look normal. Is that error related to the previous snapshot task that I created and deleted, or is something wrong with my autosnap.py?

Most importantly, could someone advise how to fix this?
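
For anyone reading the traceback above: the failing line adds a lifetime in years to a snapshot timestamp, and Python's datetime cannot represent anything past the year 9999. The sketch below reproduces that arithmetic in isolation. It is not the FreeNAS code; the function name `expiration_time`, the timestamp, and both lifetime values are made up for illustration. It only shows that an unexpectedly large lifetime value is enough to raise this exact error even when the system clock is correct.

```python
from datetime import datetime, timedelta


def expiration_time(snap_time, ttl_years):
    """Same arithmetic as the line in the traceback (hypothetical wrapper)."""
    return snap_time + timedelta(days=int(365.2425 * ttl_years))


# Hypothetical snapshot timestamp, chosen to match the dates in the logs.
snap_time = datetime(2019, 10, 29, 12, 45)

# A small lifetime stays inside the range datetime can represent.
print(expiration_time(snap_time, 2))

# A very large lifetime pushes the result past datetime.max (year 9999).
try:
    expiration_time(snap_time, 8000)  # hypothetical oversized value
except OverflowError as exc:
    print("OverflowError:", exc)
```

If this is what is happening, the value that overflows would come from an existing snapshot rather than from the task settings, which is consistent with the script failing regardless of how the task is configured.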
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
You need to allow enough time between snapshots for the first replication to finish.
 

choong

Dabbler
Joined
Jul 2, 2014
Messages
10
Sorry, I misunderstood you, and I think you misunderstood me. Replication is fine and works like a charm. It's the snapshots that are the problem. In fact, I have an identical copy replicated to S2 that is now almost a week old, because snapshots stopped working after the first few.
 

choong

Dabbler
Joined
Jul 2, 2014
Messages
10
Yes. It turns out there were old snapshots that the UI did not show, on both 9.10.2 and 11.2-U6, and the snapshot task won't run while those old snapshots exist. So I manually deleted all the old snapshots via the CLI and it's running again (not sure for how long). It's currently running 11.2-U6.

I'll find out whether it ran normally over the weekend when I get in tomorrow.
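
A rough sketch of how leftover snapshots like these could be found from the CLI listing, since the UI did not show them. It assumes the text comes from something like `zfs list -H -p -t snapshot -o name,creation`, where `-H` gives tab-separated lines without a header and `-p` prints the creation time as epoch seconds. The function name, the `tank/data` dataset, and the sample values are hypothetical. It only prints `zfs destroy` commands for review and does not run them.

```python
from datetime import datetime, timedelta, timezone


def snapshots_older_than(zfs_list_output, max_age_days, now):
    """Return names of snapshots created more than max_age_days before now.

    Expects tab-separated "name<TAB>creation" lines, creation in epoch seconds.
    """
    cutoff = now - timedelta(days=max_age_days)
    old = []
    for line in zfs_list_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, creation = line.split("\t")
        created = datetime.fromtimestamp(int(creation), tz=timezone.utc)
        if created < cutoff:
            old.append(name)
    return old


# Hypothetical listing: one snapshot from Oct 1 and one from Oct 29, 2019 (UTC).
sample = (
    "tank/data@auto-20191001.0900-2w\t1569920400\n"
    "tank/data@auto-20191029.1245-2w\t1572353100\n"
)
now = datetime(2019, 11, 1, tzinfo=timezone.utc)

for name in snapshots_older_than(sample, 14, now):
    print("zfs destroy", name)  # review before running anything
```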
 

choong

Dabbler
Joined
Jul 2, 2014
Messages
10
OK, the snapshots are running and the replications are running, but expired snapshots are not being removed. Deleting an expired snapshot from the UI does nothing; I had to remove them manually via SSH. I assumed that once the replication task succeeded, expired snapshots would be removed. Right now I'm looking at a 4-day-old snapshot with a 2d expiry.

So it looks like snapshots are still too buggy for me to use.
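
To check which snapshots are past their expiry without relying on the UI, something like the sketch below could be run against a list of snapshot names. It assumes automatic snapshots are named `<dataset>@auto-YYYYMMDD.HHMM-<count><unit>` with units h, d, w, or y; that naming scheme, the helper names, and the `tank/data` dataset are assumptions, not taken from the FreeNAS source. The year case reuses the formula visible in the traceback earlier in the thread.

```python
import re
from datetime import datetime, timedelta

# Assumed naming scheme: <dataset>@auto-YYYYMMDD.HHMM-<count><unit>
NAME_RE = re.compile(r"@auto-(\d{8}\.\d{4})-(\d+)([hdwy])$")


def lifetime(count, unit):
    """Convert a count and unit letter into a timedelta."""
    if unit == "h":
        return timedelta(hours=count)
    if unit == "d":
        return timedelta(days=count)
    if unit == "w":
        return timedelta(weeks=count)
    return timedelta(days=int(365.2425 * count))  # year formula from the traceback


def expired_snapshots(names, now):
    """Return the names whose creation time plus lifetime is at or before now."""
    expired = []
    for name in names:
        match = NAME_RE.search(name)
        if match is None:
            continue  # not an automatic snapshot under the assumed scheme
        taken = datetime.strptime(match.group(1), "%Y%m%d.%H%M")
        try:
            expires = taken + lifetime(int(match.group(2)), match.group(3))
        except OverflowError:
            continue  # lifetime too large to represent, so never expired
        if expires <= now:
            expired.append(name)
    return expired


# Hypothetical names: a 4-day-old and a 1-day-old snapshot, both with a 2d expiry.
names = [
    "tank/data@auto-20191028.0900-2d",
    "tank/data@auto-20191031.0900-2d",
    "tank/data@manual-before-upgrade",
]
print(expired_snapshots(names, datetime(2019, 11, 1, 9, 0)))
```

Skipping names that overflow, instead of letting the exception escape, keeps one odd snapshot name from stopping the whole pass.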
 