Replication failing and requiring manual intervention

izomiac

Dabbler
Joined
May 3, 2018
Messages
19
I'm currently running TrueNAS-22.02-RC.1-1 (SCALE) on two machines; the hardware details are in my signature. Lately, TrueNAS has been struggling to replicate a couple of snapshots, and I keep getting the following error:

[2021/12/18 12:45:04] ERROR [replication_task__task_3] [zettarepl.replication.run] For task 'task_3' non-recoverable replication error ReplicationError('Last full ZFS replication failed to transfer all the children of the snapshot Pool/Iona@auto-2021-12-12_00-15. The snapshot Pool/Iona/Backup@auto-2021-12-12_00-15 was not transferred. Please run `zfs destroy Pool/Iona@auto-2021-12-12_00-15` on the target system and run replication again.')
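The error names one child snapshot that never reached the target. A rough way to see every child that is missing, not just the first one reported, is to compare snapshot name lists from both machines. This is only a sketch: `missing_on_target` is a hypothetical helper, and it assumes the datasets have the same names on both sides, as the `Pool/Iona` paths in the error suggest.

```shell
#!/bin/sh
# Hypothetical helper: given two files of snapshot names (one per line),
# print the names found in the first (source) but not in the second (target).
missing_on_target() {
    src_sorted=$(mktemp) && tgt_sorted=$(mktemp) || return 1
    sort "$1" > "$src_sorted"
    sort "$2" > "$tgt_sorted"
    # comm -23 keeps only the lines unique to the first sorted file.
    comm -23 "$src_sorted" "$tgt_sorted"
    rm -f "$src_sorted" "$tgt_sorted"
}

# Possible usage (the file names are placeholders):
#   zfs list -H -t snapshot -o name -r Pool/Iona > source.txt   # on the source
#   zfs list -H -t snapshot -o name -r Pool/Iona > target.txt   # on the target
#   missing_on_target source.txt target.txt
```

The helper only reads the two lists; it does not touch any pool.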

Snapshot details:
Code:
root@iona:/mnt/Pool/Iona/Home/izomiac# zfs list -t snapshot | grep Iona | grep auto-2021-12-12_00-15
Pool/Iona@auto-2021-12-12_00-15                 136K  -   461K  -
Pool/Iona/Backup@auto-2021-12-12_00-15          300M  -  2.05T  -
Pool/Iona/Home@auto-2021-12-12_00-15              0B  -   341K  -
Pool/Iona/Home/izomiac@auto-2021-12-12_00-15    188K  -   149G  -
Pool/Iona/Media@auto-2021-12-12_00-15          55.6M  -  9.57T  -
Pool/Iona/Media-Private@auto-2021-12-12_00-15  67.2M  -  1.04T  -
Pool/Iona/Working@auto-2021-12-12_00-15         256K  -  66.2G  -
root@iona:/mnt/Pool/Iona/Home/izomiac# zfs send -nvRw -I Pool/Iona@auto-2021-12-05_00-15 Pool/Iona@auto-2021-12-12_00-15
[116 lines clipped referring to snapshots taken after 2021-12-12]
total estimated size is 54.4G
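
If the dry-run estimate is needed in a script (for example, to decide whether a transfer is worth attempting over a metered link), the final line of the output above can be picked out with `sed`. A minimal sketch; `estimated_send_size` is a hypothetical name, and it assumes the summary line has exactly the wording shown in the post.

```shell
#!/bin/sh
# Hypothetical helper: read `zfs send -n -v` output on stdin and print the
# value from its "total estimated size is ..." line.
estimated_send_size() {
    sed -n 's/^total estimated size is \(.*\)$/\1/p' | tail -n 1
}

# Demonstration with the line quoted in the post. On a real system the
# dry-run output would be piped in instead, e.g.:
#   zfs send -nvRw -I Pool/Iona@auto-2021-12-05_00-15 \
#       Pool/Iona@auto-2021-12-12_00-15 2>&1 | estimated_send_size
printf 'total estimated size is 54.4G\n' | estimated_send_size
```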


Replication Task:
[Attachment: Replication Task.png]

This is the fifth time I've had to manually destroy that incomplete snapshot, and I had to destroy the monthly 2021-12-01 snapshot twice as well. The weekly 2021-12-05 snapshot succeeded without any intervention. I'm not sure when this became an issue, since I had to do some major hardware/software work on Takao, and replication was limited to my most essential files until I had a chance to physically go there and fix/upgrade it in November. I get a similar error on the replication task going the other way, but less frequently, since those snapshots tend to be smaller. My internet connection isn't great (LTE), so interruptions are inevitable, and I can't spare the bandwidth to transfer every weekly snapshot multiple times (50 GB x 5 failures per week = 250 GB, or up to 1 TB extra per month). I'd also love for my replications to be automatic rather than requiring daily intervention. Any suggestions?
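
For reference, the bandwidth figure in the post can be worked through as below. The 50 GB and 5 failures come from the post; the four-week month is an assumption made here to reach the roughly 1 TB total.

```shell
#!/bin/sh
# Figures from the post: about 50 GB per weekly snapshot, 5 failed attempts.
snapshot_gb=50
failures_per_week=5
weeks_per_month=4   # assumption: a four-week month, for a round figure

wasted_per_week=$((snapshot_gb * failures_per_week))
wasted_per_month=$((wasted_per_week * weeks_per_month))

echo "extra per week:  ${wasted_per_week} GB"
echo "extra per month: ${wasted_per_month} GB"
```

That is 250 GB of repeated transfers per week, or about 1 TB over a month.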
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947

izomiac

Dabbler
Joined
May 3, 2018
Messages
19
Excellent, I'll eagerly await the next release then. Too bad I missed your ticket; the only hit on Google for the error message is this unanswered post, which is probably a different issue. OTOH, I can't exactly complain about someone not writing down the error message for Jira before fixing the issue. I've certainly done that a time or two, lol.
 

TheNiTz

Cadet
Joined
Dec 22, 2021
Messages
6
Same issue here, running the latest version as of 4/7/2022, TrueNAS-SCALE-22.02.0.1. If I run the task manually it succeeds; when it runs automatically, it fails after the first time.

Log Path

/var/log/jobs/3467.log

Log Excerpt

[2022/04/07 13:00:08] INFO [replication_task__task_3] [zettarepl.replication.pre_retention] Pre-retention destroying snapshots: []
[2022/04/07 13:00:08] ERROR [replication_task__task_3] [zettarepl.replication.run] For task 'task_3' non-recoverable replication error ReplicationError('Last full ZFS replication failed to transfer all the children of the snapshot SDDs@auto-2022-04-07_13-00. The snapshot HDDs/SSDreplica/ix-applications/docker/6c8ca07846c178d3f005420ec2d66f490a525702debf2ef37eae978f53e4bcf1@auto-2022-04-07_13-00 was not transferred. Please run `zfs destroy -r HDDs/SSDreplica@auto-2022-04-07_13-00` on the target system and run replication again.')

Error

[EFAULT] Last full ZFS replication failed to transfer all the children of the snapshot SDDs@auto-2022-04-07_13-00. The snapshot HDDs/SSDreplica/ix-applications/docker/6c8ca07846c178d3f005420ec2d66f490a525702debf2ef37eae978f53e4bcf1@auto-2022-04-07_13-00 was not transferred. Please run `zfs destroy -r HDDs/SSDreplica@auto-2022-04-07_13-00` on the target system and run replication again.
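
Since the error text already contains the cleanup command, it can be pulled out of the job log rather than retyped. The sketch below only prints the command for review and does not run it; `suggested_destroy_cmd` is a hypothetical helper, and it assumes the message keeps the "Please run `...` on the target system" wording seen in both posts.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print the `zfs destroy ...`
# command that the error message quotes in backticks. Nothing is executed.
suggested_destroy_cmd() {
    sed -n 's/.*Please run `\(zfs destroy [^`]*\)` on the target system.*/\1/p'
}

# Demonstration with the tail of the message from the post. Against the real
# log it could be used as:
#   suggested_destroy_cmd < /var/log/jobs/3467.log
msg='Please run `zfs destroy -r HDDs/SSDreplica@auto-2022-04-07_13-00` on the target system and run replication again.'
printf '%s\n' "$msg" | suggested_destroy_cmd
```

Printing instead of executing is deliberate: a recursive destroy on the target is worth checking by eye before it is run.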
 

TheNiTz

Cadet
Joined
Dec 22, 2021
Messages
6
Solved: I removed the ix-applications folder from the source. Since then, the replication has completed by itself every time.
 