Replication problem, what to make of the errors?


Daniel Claesson

Dabbler
Joined
May 31, 2016
Messages
35
Hi all,

I have a problem with my replication tasks. I replicate my main FreeNAS box to a secondary FreeNAS box located in another part of the building.

I have a recursive snapshot of the whole ZFS volume on the main box, and I have created a replication task that sends that snapshot to the secondary box.
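For reference, this is roughly how I check what the snapshot task has created on each side; the pool name is mine, and the target path on the secondary box is assumed from my replication config:

# On PUSH: list everything the recursive snapshot task has created
zfs list -r -t snapshot -o name,creation vipera

# On PULL: the same check against the replication target
zfs list -r -t snapshot -o name,creation Tank/Rep-backup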

If I log in to my main box and check the status of the replication task, it is stated as "success / up to date".
But I still get e-mail notifications like these:

##
The replication failed for the local ZFS vipera/jails/plexmediaserver_1 while attempting to
apply incremental send of snapshot auto-20160812.2132-1w -> auto-20160813.2132-1w to 192.168.5.101
##
The replication failed for the local ZFS vipera/jails/.warden-template-pluginjail-clean-clone while attempting to
apply incremental send of snapshot auto-20160812.2132-1w -> auto-20160813.2132-1w to 192.168.5.101
##
The replication failed for the local ZFS vipera/jails/owncloud_1 while attempting to
apply incremental send of snapshot auto-20160812.2132-1w -> auto-20160813.2132-1w to 192.168.5.101
##
The replication failed for the local ZFS vipera/jails/transmission_1 while attempting to
apply incremental send of snapshot auto-20160812.2132-1w -> auto-20160813.2132-1w to 192.168.5.101
##

It sure looks like it's only the jails that are giving errors.
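If I understand the mechanism right, the failing step is essentially an incremental zfs send between the two snapshots named in the mail. Running one by hand might tell me more than the notification does; a rough sketch (dataset and snapshot names taken from the error above, receive path assumed from my config):

# Manual equivalent of the incremental send the task reports as failing
zfs send -i vipera/jails/plexmediaserver_1@auto-20160812.2132-1w \
         vipera/jails/plexmediaserver_1@auto-20160813.2132-1w \
  | ssh 192.168.5.101 zfs receive -F Tank/Rep-backup/jails/plexmediaserver_1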

Some info on systems and config:
- Periodic Snapshot Task config: Volume=vipera, Recursive=Yes, When=06:00-01:00 every day, Frequency=every 1 day, Keep=1 week, VMware sync=Yes

- Replication Task config: Volume=vipera, Remote host=IP of the secondary box, Remote ZFS volume=Tank/Rep-backup, Delete stale snapshots=Yes, Replication Stream Compression=lz4, no kB/s limit, allowed to run 24/7

HW Configs (short version):
- Main Box= Supermicro X9SRL-F, Intel Xeon E5-2620 v1, 32GB ECC RAM, Boot 160GB Intel 320 SSD, Storage 6x 2TB Seagate drives in RAIDZ2

- Secondary Box= HP ProLiant DL120 G6, Intel Pentium G6950, 12GB ECC RAM, Boot SanDisk 8GB USB, Storage 4x 2TB Seagate drives in RAIDZ1

Have I had a "brainfart" and messed up my replication task config, or what else could be making the replication of the jails fail?

Sorry for any bad English in this topic.

Best Regards
Daniel Claesson
Sweden
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Nothing obvious jumps out. Have you tried the "initialize the remote side" option? This will delete the data on PULL and start a fresh replication. Also, any chance of a network issue? Is it only happening to the jail datasets? (You can try creating separate replication tasks for the lower-level datasets as a test.)
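A quick way to see where the two sides have diverged is to compare the snapshot lists directly; something like this (dataset paths assumed from your post, adjust the prefixes to your layout):

# On PUSH: dump the jail snapshots, stripped of the pool prefix
zfs list -H -o name -t snapshot -r vipera/jails | sed 's|^vipera/||' | sort > /tmp/push.txt

# Fetch the same list from PULL, stripped of its prefix, and diff the two
ssh 192.168.5.101 "zfs list -H -o name -t snapshot -r Tank/Rep-backup/jails" \
  | sed 's|^Tank/Rep-backup/||' | sort > /tmp/pull.txt
diff /tmp/push.txt /tmp/pull.txt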
 

Daniel Claesson

Dabbler
Joined
May 31, 2016
Messages
35
Hi, and thanks for your reply.

The "initialize the remote side" seem to be a legacy thing, as it is not present in version 9.10
See documentation: http://doc.freenas.org/9.10/freenas_storage.html#replication-tasks
But here in the old docs it is present: http://olddoc.freenas.org/index.php/Replication_Tasks

Network issues shouldn't be a factor; I have a very basic setup: a single VLAN for the FreeNAS boxes and the clients, through managed HP switches. There are no other indications of a network error or disturbance: no dropped packets, low latency, and overall good performance.

As I see it, there is no other option than to redo the snapshots and the replication from scratch, with the datasets as separate tasks. I will keep this thread up to date on how things progress.
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
Whenever I have seen those error messages, it has been one of these things:

1) There is a dataset that I recently added on "PULL" within the dataset tree being replicated that is not on "PUSH". That can throw error messages (e.g. cloning a previous snapshot to pull files off of it and forgetting to delete it).

2) I know the system should clear out snapshots on PULL for datasets that no longer exist on PUSH, but I had to do it manually once to fix errors similar to what you're seeing.

Would it be possible for you just to delete all snapshots on PULL and start fresh?
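If you go that route, something like this on PULL should do it; eyeball the list from the first command before piping it into destroy (target path assumed from the thread):

# On PULL: list every snapshot under the replication target first
zfs list -H -o name -t snapshot -r Tank/Rep-backup

# Then feed the same list to zfs destroy, one snapshot at a time (destructive!)
zfs list -H -o name -t snapshot -r Tank/Rep-backup | xargs -n 1 zfs destroy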
 

Daniel Claesson

Dabbler
Joined
May 31, 2016
Messages
35
Hi,

I did a complete reinstall/reconfigure of the whole replication flow. On PULL I deleted the dataset and created a new one without any data on it.
After that I reconfigured PUSH with new snapshot and replication tasks, this time with every dataset as a separate task, before adding a snapshot and replication task for the whole pool.

But so far I still get some errors, though not the same ones as before. This error was generated last night.
##
The replication failed for the local ZFS vipera/ESXi-Storage while attempting to
send snapshot manual-20160220 to 192.168.5.101
##
Replication vipera/ESXi-Storage -> 192.168.5.101:TANK/Rep-backup failed: Failed: vipera/ESXi-Storage (manual-20160220)
##

This might be a one-time error, and things may start to work after a while, once more snapshot tasks have completed.
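One thing I plan to check is whether that old manual snapshot is the culprit; roughly like this (snapshot name taken from the error mail, destroy only if it turns out to be stale):

# On PUSH: does the manual snapshot from the error still exist?
zfs list -t snapshot | grep manual-20160220

# If it is stale, remove it so the task stops trying to send it (destructive!)
zfs destroy vipera/ESXi-Storage@manual-20160220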
Any thoughts on this?
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Can you provide a screenshot of your snapshot jobs and replication jobs? Do you have replication running on each dataset individually, as well as a recursive one for the whole pool?
 

Daniel Claesson

Dabbler
Joined
May 31, 2016
Messages
35
Hi,

Of course I can, see below.
At the moment I only have individual replication tasks running on each dataset, and no recursive one for the whole pool. This seems to work; since my last reply I haven't received any more error messages.
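To keep an eye on it going forward, I compare the newest snapshots on each side; roughly like this (IP and target path as in my config):

# On PUSH: the most recent snapshots, sorted by creation time
zfs list -t snapshot -o name,creation -s creation -r vipera | tail -5

# On PULL: the newest auto-* snapshot should match the one on PUSH
ssh 192.168.5.101 zfs list -t snapshot -o name,creation -s creation -r Tank/Rep-backup | tail -5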

This is the view of "Periodic Snapshot Tasks" on PUSH:
Periodic_snapshot_task_PUSH.png

This is the view of "Replication Tasks" on PUSH:
Replication_tasks_PUSH.png

This is the view of "Snapshots" on PULL:
Snapshots_View_PULL.png
 