I have a bunch of periodic snapshot and replication tasks configured with different naming schemes and retention timings. Today I noticed that none of my daily snapshots are being cleaned up. New ones are created, but nothing is being deleted.
Looking in the zettarepl log, I see that a few days ago a retention task (shown as zettarepl.zettarepl) ran and deleted the various daily snapshots. Since then, every retention run shown as zettarepl.zettarepl reports "Local retention failed: error listing snapshots" for one remote host. The log then shows zettarepl connecting to the other remote hosts but not destroying any snapshots.
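In case anyone wants to check their own log for the same pattern, here is a rough sketch of how the retention-related lines could be filtered out. The helper name `retention_lines` and the keywords it searches for are my own assumptions, not anything provided by zettarepl; pass in the path to wherever your zettarepl log lives.

```python
import re
import sys


def retention_lines(log_text):
    """Return the lines that mention retention or snapshot destruction.

    Hypothetical helper: the keywords are a guess at what is relevant,
    not a documented zettarepl log format.
    """
    pattern = re.compile(r"retention|destroy", re.IGNORECASE)
    return [line for line in log_text.splitlines() if pattern.search(line)]


if __name__ == "__main__":
    # Usage: python retention_lines.py <path-to-zettarepl-log>
    if len(sys.argv) != 2:
        sys.exit("usage: retention_lines.py <path-to-zettarepl-log>")
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for line in retention_lines(fh.read()):
            print(line)
```

If the output shows "Local retention failed" on every run with no destroy lines after it, that matches what I'm seeing.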
It's expected that zettarepl can't list snapshots on that particular remote host, since it's down for maintenance. What's unusual is that this appears to prevent all snapshot cleanup, even for snapshots that aren't associated with that server or with any replication task at all. I've taken other hosts down for maintenance before and this didn't happen.
I believe this has something to do with the fact that the replication task for the down host is a pull, while all of my other tasks are push replication. I'm temporarily bringing the down host back up to see whether snapshots are properly deleted tonight.
Can anyone else reproduce this issue, or is it a bug unique to my setup?