Confused about replication...

Status
Not open for further replies.

jspcto

Cadet
Joined
Oct 24, 2015
Messages
8
Hi all... I'm looking for a little help understanding how replication can accomplish what is stated in the documentation here http://doc.freenas.org/9.3/freenas_storage.html#replication-tasks.

"A replication task allows you to automate the copy of ZFS snapshots to another system over an encrypted connection. This allows you to create an off-site backup of a ZFS dataset or pool."​

The part that confuses me is the second sentence about creating a backup of a dataset or pool. According to the documentation on Periodic Snapshots (excerpt below), snapshots only contain data which has changed.

"A periodic snapshot task allows you to schedule the creation of read-only versions of ZFS volumes and datasets at a given point in time. Snapshots can be created quickly and, if little data changes, new snapshots take up very little space. For example, a snapshot where no files have changed takes 0 MB of storage, but as you make changes to files, the snapshot size changes to reflect the size of the changes."​

So, if a replication task sends ZFS snapshots (which are only changes to the original dataset), how can it be an off-site backup as mentioned in the documentation? In the event of a catastrophic event, how could those snapshots be used to rebuild datasets if they only contain changes?

It is possible I'm confused about the intent of replication or the concept of snapshots, but I'm trying to better understand this feature and potentially apply it to my scenario. In short, I'd like to mirror selected datasets to an off-site FreeNAS server to help mitigate against catastrophic loss. I've looked at rsync as well, and I'm wondering if that might be better suited for this use case.

Thanks in advance for any light you can shed on this topic.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
Replication is what you want. The initial replication transfers all the data: when the source snapshot is compared to the remote, everything counts as a difference, because the remote is empty and the source has all your data.

Not a very long-winded answer, but I hope that clears things up a little.
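A toy sketch of that idea (block names and contents are made up, this is not real zfs behaviour, just the comparison logic):

```python
# Toy model of snapshot replication: the remote starts empty, so the
# first send transfers every block; later sends carry only the changes.

def send(snapshot, remote):
    """Return the blocks that must travel over the wire, then apply them."""
    delta = {k: v for k, v in snapshot.items() if remote.get(k) != v}
    remote.update(delta)
    return delta

snap1 = {"a": "data1", "b": "data2", "c": "data3"}
remote = {}

first = send(snap1, remote)
print(len(first))   # 3 -- everything goes, because the remote was empty

snap2 = dict(snap1, b="data2-changed")   # one block modified since snap1
second = send(snap2, remote)
print(len(second))  # 1 -- only the changed block is sent
```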
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Think of it this way: the first snapshot has the bulk of the data in it, and the following snapshots contain the delta information.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
So, if a replication task sends ZFS snapshots (which are only changes to the original dataset)
A replication task doesn't "send a snapshot"; it makes the state of the destination match the state of the source as of a given snapshot, by sending the difference between what the snapshot represents and what the destination already has.

Imagine turning the pages of a book and inserting a bookmark at page 100, then continuing to turn pages until you get to 110. The first bookmark takes up almost no space in the book, but if you wanted to replicate what it represents, you'd have to copy 100 pages. Later, if you had another bookmark at page 110, and someone already has the first 100 pages, you only need to copy 10 more pages to give them what that 2nd bookmark represents. On the other hand, if they don't already have the first 100 pages, you'd have to copy all 110 pages that the 2nd bookmark represents.
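The bookmark analogy in code, if it helps (the page numbers are just the ones from the story above):

```python
# A "bookmark" (snapshot) is only a page number and takes almost no space,
# but reproducing what it represents means copying every page the
# destination does not already have.

def pages_to_copy(bookmark, pages_destination_has):
    return max(0, bookmark - pages_destination_has)

print(pages_to_copy(100, 0))    # first bookmark, empty destination: 100 pages
print(pages_to_copy(110, 100))  # second bookmark, first 100 already there: 10
print(pages_to_copy(110, 0))    # second bookmark, empty destination: 110
```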
 

jspcto

Cadet
Joined
Oct 24, 2015
Messages
8
Okay, thanks to all who have responded. I appreciate the explanations... makes sense. I was misinterpreting the documentation line about how replication "automates the copy of ZFS snapshots". I read that to mean that it was only sending exact copies of existing snapshots on the source system. Instead, it sounds like a comparison is made between the remote dataset and the source dataset, and snapshots are sent for the delta. As you said, that would mean the initial replication snapshot would be "everything", since the delta between the systems is exactly that: everything. After that, I suppose the subsequent snapshots sent to the remote would be essentially identical to those stored on the source system, since replication snapshots are sent on the same frequency as the source snapshots.

This makes me wonder what would happen in the event that communication between the source and remote was disrupted for a period of time. Would all of the "missed" snapshots that are stored on the source system be individually sent when communication was reestablished or would it really be more like the initial sync that looks at the delta once between the two systems? I'm assuming it is the latter... if so, I think I get it now. If not... maybe a bit more tutelage is required.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
what would happen in the event that communication between the source and remote was disrupted for a period of time
The way I understand it, as long as source and destination have a previous snapshot in common, a subsequent replication will only send the differences, even if intervening replications were missed.

Someone please correct me if this is wrong.
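A sketch of what that would mean after an outage, assuming incremental sends work from the last snapshot both sides share (the snapshot names and file versions below are hypothetical):

```python
# One incremental delta from the last common snapshot to the newest one
# carries all the intervening changes, even if several scheduled
# replications were missed along the way.

# State of the dataset at each snapshot (hypothetical file -> version map).
snapshots = {
    "snap1": {"f1": 1, "f2": 1},
    "snap2": {"f1": 2, "f2": 1},            # replication missed
    "snap3": {"f1": 2, "f2": 2, "f3": 1},   # replication missed
    "snap4": {"f1": 3, "f2": 2, "f3": 1},   # link restored here
}

def incremental(old, new):
    """Everything that differs between two snapshot states."""
    return {k: v for k, v in new.items() if old.get(k) != v}

# The remote last received snap1; a single incremental snap1 -> snap4
# catches it up without replaying snap2 and snap3 individually.
delta = incremental(snapshots["snap1"], snapshots["snap4"])
remote = dict(snapshots["snap1"])
remote.update(delta)
print(remote == snapshots["snap4"])   # True
```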
 

Spacemarine

Contributor
Joined
Jul 19, 2014
Messages
105
I was misinterpreting the documentation line about how replication "automates the copy of ZFS snapshots".

I don't think you "misinterpreted" the documentation in a way that makes it your fault. I think the documentation is just very weak when it comes to replication. If you read it word by word, you naturally end up with your interpretation. I've been trying to find more information on this topic, as the documentation doesn't really cover all the implications of doing a replication.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
When you do a replication, any data which is in the local snapshot, but not on the remote system, is transmitted to the remote system so that it can then be referenced by the remote snapshot, which will then be a copy of the local snapshot.

In the future as data is deleted from your pool, it will not be removed if it is referenced by older snapshots. Those snapshots will then grow in size. It's as if you delete it from the current pool and it gets added to the old snapshot. (It was actually always there)

Once all the snapshots which reference the deleted data are removed, then the space occupied by the deleted data will be reclaimed.
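A rough sketch of that space accounting, with made-up block names (real ZFS tracks this per block internally; this just shows the reference idea):

```python
# A deleted block is not freed while any snapshot still references it;
# the space comes back only when the last such snapshot is destroyed.

live = {"a", "b", "c"}
snapshots = {"snap1": {"a", "b", "c"}}   # snapshot taken before the deletion

live.discard("b")                        # delete "b" from the live dataset

def referenced(live, snapshots):
    """All blocks still held by the live dataset or any snapshot."""
    blocks = set(live)
    for snap in snapshots.values():
        blocks |= snap
    return blocks

print("b" in referenced(live, snapshots))   # True: snap1 still holds it

del snapshots["snap1"]                      # destroy the last snapshot...
print("b" in referenced(live, snapshots))   # False: space can be reclaimed
```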
 

Spacemarine

Contributor
Joined
Jul 19, 2014
Messages
105
That first sentence makes it very clear, thank you! Would it be correct to add the following to that first sentence: in order for the replication to start, the remote system has to have an empty dataset (in which case the whole local dataset is transmitted) OR a dataset that shares a common snapshot with the local system (in which case that snapshot will be used as the starting point)?

As I am writing this, I begin to wonder: what if they share a common snapshot, but that snapshot is not the newest one? What happens to the newer snapshots? Will they be replaced? Or must the common snapshot be the newest one?
 

Robert Smith

Patron
Joined
May 4, 2014
Messages
270
One thing needs to be clarified: zfs send does not check the destination in any way. It is your responsibility to generate a correct stream; if it is not correct, zfs receive will fail.

zfs send does not need to see the destination at all. You can send to a file and carry that file to the destination by car, for example.

HTH
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
There has to be a common snapshot (or the whole pool will be sent/replaced). When replicating, the -F option is passed to zfs receive, which forces the remote pool to roll back to the state of the last common snapshot before the receive can go ahead.

Thus any divergent changes in the remote pool/dataset will be lost.
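A sketch of that rollback behaviour (snapshot names are hypothetical; lists are ordered oldest to newest, and this only models the bookkeeping, not the actual zfs receive -F implementation):

```python
# Find the newest snapshot both sides share, discard anything the remote
# made after it, and the incremental stream can then apply cleanly.

source = ["snap1", "snap2", "snap3", "snap4"]
remote = ["snap1", "snap2", "snap2-local-edit"]   # remote has diverged

def rollback_point(source, remote):
    """Newest snapshot present on both sides, or None."""
    common = [s for s in remote if s in source]
    return common[-1] if common else None

base = rollback_point(source, remote)
print(base)            # snap2 -- the last common snapshot

# -F rolls the remote back, dropping its divergent snapshots:
remote = remote[: remote.index(base) + 1]
print(remote)          # ['snap1', 'snap2']
# An incremental send from snap2 to snap4 now brings it up to date.
```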
 