Staging replication data


memitim

Cadet
Joined
Nov 25, 2014
Messages
4
I have a client with two sites with FreeNAS servers that have data replicating between them. I just added a snapshot task for a dataset that is just shy of 2 TB at one site and would like to replicate the data. However, on their heavily abused WAN link, it'll take about three months or so to transfer the snapshot.

Therefore, I'd like to load the data onto an external hard drive and ship it to the other site. Putting the data-laden initial snapshot on an external drive seems feasible using zfs send/receive, but I'm not sure how a snapshot replication task in FreeNAS would treat the data if I loaded it into the target dataset on the remote NAS. It would suck to go through all that just to have FreeNAS ignore the data and try syncing it all anyhow.

Has anyone done something similar who can offer some advice on how to get this data staged on the remote server so that ZFS replication can take over with incrementals?
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
I'd think you'd be able to send / receive the data to the external drive, transfer the data to the other server via station-wagon-net, and send / receive the data to the other server. You should then be able to replicate incrementals over the WAN.
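
A rough sketch of what I mean, assuming the external drive is mounted at /mnt/external and using placeholder pool/dataset/snapshot names:

Code:
# On the source: dump the full snapshot stream to a file on the external drive
zfs send tank/mydata@snap1 > /mnt/external/mydata-snap1.zfs

# On the destination, once the drive arrives: load the stream into the target pool
zfs receive backuppool/mydata < /mnt/external/mydata-snap1.zfs

# After that, only incrementals need to cross the WAN
zfs send -i tank/mydata@snap1 tank/mydata@snap2 | ssh root@remote-nas zfs receive backuppool/mydata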
 

memitim

Cadet
Joined
Nov 25, 2014
Messages
4
That was what I was figuring. I just did some testing by running zfs send on the first snapshot from one dataset:

Code:
zfs send -R RaidZ/Test-data@auto-20141126.1228-1w | openssl enc -aes-256-cbc -a -salt -pass pass:password > /data/testfile


and then zfs receive into a different dataset:

Code:
openssl enc -d -aes-256-cbc -a -in /data/testfile | zfs receive -F -d RAIDZ/Test-target


That worked and the new dataset with the snapshot showed up in the destination, although the test file that I put in the source dataset was not there, so it seems the snapshot was exported as a local snapshot rather than as a file stream to the remote server.

When I then created the snapshot replication task from the source to the destination in FreeNAS, the messages log indicates:

Code:
Nov 26 14:28:01 vo-nas01 autorepl.py: [tools.autorepl:414] Remote and local mismatch after replication: RAID-Z2/Test-data: local=auto-20141126.1400-1w vs remote=auto-20141126.1228-1w
Nov 26 14:28:01 vo-nas01 autorepl.py: [common.pipesubr:58] Popen()ing: /usr/bin/ssh -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 vo-nas01 "zfs list -Ho name -t snapshot -d 1 RAID-Z2/Test-target/Test-data | tail -n 1 | cut -d@ -f2"
Nov 26 14:28:01 vo-nas01 autorepl.py: [tools.autorepl:431] Replication of RAID-Z2/Test-data@auto-20141126.1400-1w failed with cannot receive new filesystem stream: destination has snapshots (eg. RAID-Z2/Test-target/Test-data@auto-20141126.1228-1w) must destroy them to overwrite it

Between the first message and the "Replication" column for the snapshot in the "ZFS Snapshots" list showing a status of "OK," it looks like FreeNAS at least recognizes the manually copied snapshot as the first in the set, but then fails to continue replication of the other snapshots.

Continuing to poke at it; ideas are welcome.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
I've had to bring the remote server "home" for the initial sync. Depending on the distance it's the easiest solution.
 

Toddebner

Dabbler
Joined
Mar 28, 2014
Messages
11
Mlovelace,

How did you accomplish getting replication working again? I have the following setup and experienced the same error as memitim.

FNas1 - Backup Server at Work. 500GB data created by windows server nightly backup
FNas2 - Remote Replication Server at Home.

I got replication working via SSH between the two sites in a small test with 2 GB of data; it took several hours.
Knowing that it would take 100+ days to sync the 500 GB, I took FNas2 into the office.
Changed the IP, gateway, and DNS settings and connected it to the network.
Then had to redo the replication task setup because the IP address was changed.
Was able to sync everything.
Took FNas2 home and changed the IP address, etc. back.
Changed the IP address on the replication task on FNas1 (office).
Changed the IP address inside of the SSH key on the replication task.
Saved it, and replication started on the next periodic snapshot.

Then received the same error as memitim.

I either did something wrong, or I need the same answer as memitim.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
My offsite location uses a site-to-site VPN and is in the same IP scheme as the "home" location, so I didn't have to re-IP the offsite FN. I set up the snapshot replication task as described in the FN manual with the offsite server at "home" for the initial sync. Then I sent the offsite FN to the offsite location and it didn't skip a beat on replication. I had to bring it back "home" again for another initial sync when a new large dataset was created.
 

memitim

Cadet
Joined
Nov 25, 2014
Messages
4
I wish that I could simply relocate the server and replicate it locally. :) Unfortunately both are production storage units; I lost a couple of hours of Thanksgiving to triaging one of them when it went down from a power outage, so days of downtime are completely out.

The steps that I've come up with so far are:

Source server

1. Kill atime on the source dataset to prevent mismatch errors from the timestamp changing while the data is in transit.
2. Send the snapshot to a file on the external drive.

Destination server:

1. Receive the file into the target dataset.
2. Create the replication task.
3. Wait for it to throw the error, "Replication of <newer snapshot> failed with cannot receive new filesystem stream: destination has snapshots (eg. <sent snapshot>) must destroy them to overwrite it."
4. Run a manual sync of the <newer snapshot> indicated in the error message.

At which point the replication task resumes syncing the remaining snapshots normally. I still need to test whether the data itself is contained in the initial snapshot. If it is getting sent in the second manual sync then that would defeat the purpose of the entire exercise, but if it is contained in the first snapshot transferred on the external drive then this should do it.
 

memitim

Cadet
Joined
Nov 25, 2014
Messages
4
Yeah, at least in testing. The destination site currently doesn't have enough space available to contain the source data so I haven't been able to send it up. But I did test using a 500MB test dataset and it worked fine.

I found that the snapshot referenced in the error that I quoted in step #3 is usually the snapshot that was manually imported to the destination. Not sure why that wouldn't be consistent. But looking in the /var/log/messages log at the same time as that error also shows the message regarding the actual snapshot that the source system was trying to send. By manually sending an incremental snapshot between the one that was manually transferred and the one that the source failed to replicate (zfs send -i <manually transferred snapshot> <failed snapshot> | ssh root@<destination server> zfs receive <target dataset>), the scheduled replication then took over successfully.

The other thing to note, at least in replications between the 9.2.1-7 source and 9.2.1-5 destination that I have to support here, is that the target directory is actually a subdirectory of the specified target with the name of the source dataset. For example, if replicating the dataset "Test-data" from the source and the replication task gets configured to use the target dataset "ZPOOL2/Test-target", the actual target directory is "ZPOOL2/Test-target/Test-data." Important to note since the manual snapshot import and incremental snapshot syncs have to use that directory.

Here is my current documentation so far:

On the source FreeNAS server:

1. Kill atime on the source dataset to avoid a mismatch error by the time the data arrives at the destination:

Code:
zfs set atime=off ZPOOL1/Test-data
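
(Optional) To confirm the property change took effect, a quick check is:

Code:
zfs get atime ZPOOL1/Test-data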


2. zfs send the initial snapshot to a file:

Code:
zfs send <zpool>/<dataset>@<snapshot> | openssl enc -aes-256-cbc -a -salt -pass pass:<password> > <destination file>


For example, the following command saves the first snapshot (auto-20141126.1228-1w) of the "Test-data" dataset on the "ZPOOL1" zpool to the file "/data/mount/externaldrive1/testfile" on the external drive, using the encryption password "testpass":

Code:
zfs send ZPOOL1/Test-data@auto-20141126.1228-1w | openssl enc -aes-256-cbc -a -salt -pass pass:testpass > /data/mount/externaldrive1/testfile


On the destination FreeNAS server:

3. zfs receive the file into the target dataset:

Code:
openssl enc -d -aes-256-cbc -a -in /data/testfile | zfs receive -F ZPOOL2/Test-target/Test-data


Note that the directory specified in the destination is actually a subdirectory of the target dataset. This is because FreeNAS configures replication tasks to use a subdirectory named after the source dataset within the dataset specified in the replication task. Therefore, the replication task has "ZPOOL2/Test-target" defined as the target dataset but sends the snapshot of the "Test-data" dataset to the "ZPOOL2/Test-target/Test-data" directory.
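
To confirm where the received stream actually landed, a quick check on the destination (using the names from this example) is:

Code:
zfs list -r -t all ZPOOL2/Test-target


Both the nested dataset "ZPOOL2/Test-target/Test-data" and the received snapshot should show up in that listing.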

Back on the source FreeNAS server:

4. Create the replication task.

5. Wait for the following error to appear in the Status column of the applicable entry in the ZFS Replication tab of the web interface:

Code:
"Replication of <snapshot> failed with cannot receive new filesystem stream: destination has snapshots (eg. <snapshot name>) must destroy them to overwrite it."


6. Open the /var/log/messages log and locate the above error message.

7. Near that error, there will be the following message:

Code:
Remote and local mismatch after replication: ZPOOL1/Test-data: local=auto-20141210.1756-1w vs remote=auto-20141126.1228-1w


8. Copy the name of the snapshot next to "local=". In the above example, the snapshot name would be "auto-20141210.1756-1w".
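
If the log is busy, something like this will pull out the most recent mismatch line instead of scrolling through the whole file:

Code:
grep "Remote and local mismatch" /var/log/messages | tail -n 1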

9. Run a manual incremental sync from the source to the destination server:

Code:
zfs send -i ZPOOL1/Test-data@auto-20141126.1228-1w ZPOOL1/Test-data@auto-20141210.1756-1w | ssh root@test-nas02 zfs receive ZPOOL2/Test-target/Test-data


This sends the incremental changes between the initial snapshot manually imported into the destination (Test-data@auto-20141126.1228-1w in this example) and the snapshot that the replication task attempted and failed to send (Test-data@auto-20141210.1756-1w in this example). Again, note that the destination is a subdirectory within the target dataset named after the source, not the target dataset itself.

10. Ensure that replication succeeds after the next scheduled snapshot on source.
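
A quick way to verify (using the names from this example) is to list the snapshots on the destination after the next run and confirm the newest one matches the latest snapshot on the source:

Code:
zfs list -H -o name -t snapshot -d 1 ZPOOL2/Test-target/Test-data | tail -n 1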
 