How to continue replication using already existing snapshots?

Status: Not open for further replies.

DaPlumber

Patron
Joined
May 21, 2014
Messages
246
This is driving me up the wall:

I have a bunch of snapshots on datasets that were replicated to another (in this case local) pool by selecting the recursive option on the base dataset in both cases. Now I don't want to snapshot or replicate all the datasets in the pool, so each dataset has its own snapshot and replication job in the GUI. The snapshots are continuing just fine, but the replication is erroring out, complaining about the existing snapshots. Sure, I could "initialize" the destination and wipe everything out and start over, but I don't want to do THAT.

How on earth do I get replication to continue with existing valid snapshots?

I know how I'd do this on the CLI, and snapshots have "serial numbers" and inheritance, so it should "just work", but I'm trying to be a good boy and not go behind the GUI's back... :rolleyes:
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Sorry, you need to provide more info. I'm not understanding what you are trying to do... :P
 

DaPlumber

Patron
Joined
May 21, 2014
Messages
246
Sorry, I'll try to be clearer: none of this has been CLI, all GUI:

- Replicating datasets between two pools, let's call them "tank" and "tankbak"
- Both are local to the same FreeNAS 9.2.1.5-RELEASE box so the "remote" is localhost
- keys are set up correctly and are tested
- Started by doing this the "lazy" way and snapshotted and replicated the "root" dataset - tank - recursively (i.e. checked the recursive checkbox in both the snapshot and replication tasks)
- Stopped this as I want to replicate only some of the datasets off tank, let's call them tank/ds1, tank/ds2, tank/ds4
- tank/ds3 no longer has snapshots (deleted) and tankbak/ds3 has been deleted completely
- tank/ds[124] are now snapshotting successfully independently with new jobs continuing the existing snapshots
- replication to tankbak/ds[124] is failing with a complaint of existing snapshots
- I don't want to re-replicate tank/ds[124] all over again, just continue the snapshots from the existing ones

e.g. message: "Replication tank/ds1 -> localhost failed: cannot receive new filesystem stream: destination has snapshots (eg. tankbak/ds1@auto-20140619.1236-2w) must destroy them to overwrite it"

Does the GUI truly have no way of stopping a replication snapshot stream and restarting it with a different job? ZFS fully supports this; in fact, incremental sends depend on snapshots forming a parent/child chain in order to work, and that chain is obviously intact, since the snapshot tasks themselves are continuing just fine.
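
For the record, this is roughly what I'd do at the CLI to pick up from an existing common snapshot (just a sketch; the second snapshot name below is made up for the example):
Code:
# incremental send from the snapshot both pools already have up to a newer one
zfs send -i tank/ds1@auto-20140619.1236-2w tank/ds1@auto-20140620.1236-2w | \
    zfs receive -F tankbak/ds1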

I can't help feeling I'm missing something simple?
 

panz

Guru
Joined
May 24, 2013
Messages
556
After bug(?) #5293 I'm going to send/receive snapshots manually, because I don't trust the replication routines anymore. I made a terrible mistake: automatically replicating snapshots past their expiration date and... BOOM :) it broke.

So, for me, no more "automated" things. A little more work, but more control.
 

DaPlumber

Patron
Joined
May 21, 2014
Messages
246
@DaPlumber: unfortunately panz' solution is not easily applicable for everyone... But maybe the first comment by "Monarch Dodra" on this FreeNAS bug report can help you solve your problem? https://bugs.freenas.org/issues/1467

All the best!


Yeah, being able to edit which snap is the latest etc. would appear to be the fix, but that's not exposed in 9.2.1.5; that bug is talking about 9.3. I'm puzzled as to why this is being tracked outside of ZFS, which manages snapshot streams, inheritance, and hierarchy internally.

Either way, I couldn't wait, so I checked the "initialize" box and re-copied the data. Sigh. If this was remote over a slow link I'd be a little irked tho'.
 

DaPlumber

Patron
Joined
May 21, 2014
Messages
246
So I finally got the local replication running (after the initialize and re-copy, annoying!) via the GUI, which means via ssh. Even with the "quick cipher" (i.e. arcfour) option checked, I'm still only getting about half the throughput I got with a direct pipe. Time to go add a "me too" to the enhancement request, I guess? :cool:
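
For comparison, the "direct pipe" I was measuring against is just a plain local send/receive with no ssh in the middle (a sketch, using one of my datasets as the example):
Code:
# straight pool-to-pool copy on the same box, no ssh overhead
zfs send tank/ds1@auto-20140619.1236-2w | zfs receive -F tankbak/ds1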
 

panz

Guru
Joined
May 24, 2013
Messages
556
If you're replicating to a trusted machine (i.e. over your trusted network) you should use netcat to pipe the send to the receive side. I'm using it and it's astonishingly fast. No more "automated" replication for me until the developers take care of the many problems affecting their routines.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
If you're replicating to a trusted machine (i.e. over your trusted network) you should use netcat to pipe the send to the receive side. I'm using it and it's astonishingly fast. No more "automated" replication for me until the developers take care of the many problems affecting their routines.

You should do a guide on that @panz. There's probably quite a few people that would find great value in a guide on that topic. Not the least of which is me. ;)
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Netcat replication is pretty easy.

On Nas2 (start the listening side first):

# nc -l 8023 | zfs receive tank/dataset

On Nas1:

# zfs send tank/dataset@snap | nc nas2 8023


This is from memory, but I think it's right. You might need a -F on the receive line if you are doing the initial replication. This also assumes the machine you're replicating to is called "nas2" and that DNS works. Change the port to whatever you want to replicate over.

I have no problem averaging over 100 MB/sec with netcat replication. But then, when I've monitored ssh 'fast cipher' replication via "systat -ifstat", I've seen it run at full gigabit too. So even with the 'weak' ssh encryption, gigabit is still the bottleneck. CPU usage would probably be higher with the built-in ssh replication compared to netcat, but it's not a big deal.
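
Once the first full copy is done, follow-ups can be incremental over the same kind of pipe (again from memory / just a sketch, and the snapshot names are only examples):

On Nas2:

# nc -l 8023 | zfs receive tank/dataset

On Nas1:

# zfs send -i tank/dataset@snap1 tank/dataset@snap2 | nc nas2 8023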
 

panz

Guru
Joined
May 24, 2013
Messages
556
You should do a guide on that @panz. There's probably quite a few people that would find great value in a guide on that topic. Not the least of which is me. ;)

That's exactly what I'd like to do: I'm revising my FreeNAS setup (just moved the data and destroyed previous pools).

I'm going to thoroughly test all the replication procedures with a "real" setup (so, NOT in a VM) and I'm going to double-check the command-line examples ;)
 

dwoodard3950

Dabbler
Joined
Dec 16, 2012
Messages
18
Did you ever come up with a solution for how to really restart replication with an existing set of snapshots? I had an occasion where replication failed when the backup destination went down for a day. I have a script I use to bring things up to date, but I'd prefer to use the GUI scheme, as it handles the notifications and status messages better than I do. As an example, I can run a "catch-up" routine (shown below) and then restart the replication, but the result is always the same as you described above, which requires me to delete snapshots.
Code:
$ZFS send -V -R -I $snap1 $DATASET@$snap2 | \
ssh $SSH_OPT $REMUSER@$REMHOST \
$ZFS recv -dvF $REMPOOL

I've tinkered with the freenas:state property, but still no success. The result was that I had to "initialize", which, as you noted, requires a re-copy of all the data. That's not horrible for local machines, but it could be devastating for off-site replication.
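
In case it helps anyone else, the sort of thing I was tinkering with looked like this (a sketch only; I'm assuming autorepl tracks the last-sent snapshot through the freenas:state user property, and the LATEST value is my guess, not documented behaviour):
Code:
# see what state the replicator has recorded on each snapshot
zfs get -r -t snapshot -o name,property,value freenas:state tank/ds1

# hypothetical: mark the newest snapshot that exists on both sides so the next
# run sends an incremental from it instead of a full stream
zfs set freenas:state=LATEST tank/ds1@auto-20140619.1236-2w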
 