Replication - I want to understand how safe it is

Gen8 Runner (Contributor)
Hey there,
As mentioned in another thread, I just set up a second machine (so one is TrueNAS, the other FreeNAS). I am using the TrueNAS machine as a backup server for my FreeNAS machine, which is done via Replication.

But there is one big question mark in my head that I don't understand yet. So let me give you an example:
- Today at 12am, I run a Replication task for a dataset with 10TB; that snapshot has a lifetime of one year
- FreeNAS sends the 10TB snapshot to my TrueNAS machine
- Now I add 10GB of data to my dataset every week (-> 520GB in one year)
- Those 520GB are sent to the TrueNAS machine by monthly snapshots
- On 03.09.2021 at 12am, the lifetime of my 10TB snapshot is reached and TrueNAS destroys the snapshot
- Now only the 520GB of added data still exists on the TrueNAS backup machine, and the 10TB of data from the initial snapshot is gone

Is this the way it works or do the 10TB still exist on the pool?
I could imagine that the 10TB of data is not destroyed (because FreeNAS / TrueNAS is so crazy reliable), but that would be a discrepancy with the snapshot lifetime.

Who can solve that mystery? What is the technical background, and what exactly happens to the data after the lifetime of the snapshot is reached? Because, as far as I know, the following snapshots ONLY include the changed data blocks.
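
For reference, as far as I understand it, the Replication task is basically doing something like this under the hood (the pool, dataset, and snapshot names here are just made up for the example):
Code:
# Initial replication: the full 10TB snapshot is sent once.
zfs send tank/mydata@initial | ssh truenas zfs recv backup/mydata

# Every later replication is incremental: only the blocks that changed
# since the previous snapshot are sent.
zfs send -i tank/mydata@initial tank/mydata@2020-10 | ssh truenas zfs recv backup/mydata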

Thanks in advance!
 
PhiloEpisteme
Is this the way it works or do the 10TB still exist on the pool?
I could imagine that the 10TB of data is not destroyed (because FreeNAS / TrueNAS is so crazy reliable), but that would be a discrepancy with the snapshot lifetime.
I believe this is correct. Below is my understanding of the situation. If this is incorrect, hopefully someone will correct me.

A snapshot really is like a picture of your data at a given time. zfs does try to be efficient, though. It only records new information for those files which have changed since a prior snapshot. This way, if you have 10TB of data and only change or add 1GB of data before the next snapshot, zfs only stores the extra 1GB, not an entire new copy of the original 10TB. When a snapshot expires, zfs deletes only the data which is not referenced by any other snapshot or by the dataset itself.
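
If you want to see that split on a real system, the command zfs list -o space breaks down how much space is held only by snapshots (USEDSNAP) versus by the live dataset (USEDDS). The dataset name below is just a placeholder:
Code:
# "tank/datasetFoo" is a made-up name; substitute your own pool/dataset.
zfs list -o space tank/datasetFoo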

This is not a discrepancy with the snapshot lifetime. Let's say you have a dataset with only the following two files.
Code:
datasetFoo
- file1 1GB
- file2 1GB

These files live in your dataset and take up a combined 2GB of data. When you first take a snapshot of datasetFoo, zfs writes references to file1 and file2. I will mark such references with a * below. It does not rewrite the files; they are already there. These references take up much, much less space than the original files.
Code:
snap1
- file1*
- file2*

datasetFoo
- file1 1GB
- file2 1GB

Now, say you change file1. Well, zfs is copy-on-write. That means that REALLY the old version of file1 still exists and zfs writes a new version. Let's call this new version file1`.

Code:
snap1
- file1*
- file2*

datasetFoo
- file1` 1GB
- file2 1GB

also on disk
- file1 1GB


Well, the dataset has the most up-to-date version of file1, which is file1`. snap1 still points to the old file1 and to file2. Now you're consuming ~3GB of storage.
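
(If you want to check this kind of thing on a real system, you can list the snapshots of a dataset; the USED column is the space unique to each snapshot, i.e. old blocks the live dataset no longer uses, and REFER is the total amount of data the snapshot points at. The name below is made up.)
Code:
zfs list -r -t snapshot -o name,used,refer tank/datasetFoo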

So then you take ANOTHER snapshot.

Code:
snap1
- file1*
- file2*

snap2
- file1`*
- file2*

datasetFoo
- file1` 1GB
- file2 1GB

also on disk
- file1 1GB


Just like snap1, when snap2 was made it simply recorded a note that snap1, snap2, and datasetFoo all refer to file2. Okay, now you want to delete snap1. zfs deletes the references to file1 and file2 within snap1. It then checks whether file1 and file2 are referenced anywhere else, either in another snapshot or in a dataset. If so, it keeps them; otherwise, it deletes them. So we end up with the following.

Code:
snap2
- file1`*
- file2*

datasetFoo
- file1` 1GB
- file2 1GB


There is no inconsistency in the snapshot lifetime of snap2. Consider what would happen if deleting snap1 also deleted all references to file2. If that were the case, snap2 would lack the reference to file2 and would therefore no longer actually be a snapshot of datasetFoo because you could not use snap2 to recover datasetFoo!
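
If you're ever curious how much space destroying a snapshot would actually free up, you can do a dry run first; nothing is deleted, it just reports what would be reclaimed (dataset name is again just an example):
Code:
# -n = dry run, -v = verbose ("would reclaim ...")
zfs destroy -nv tank/datasetFoo@snap1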

As an aside, my mental model of this is highly influenced by how git manages blobs, trees, and commits.

Anyway, I hope this helps. If I've got something wrong, hopefully someone will chime in.
 

Heracles (Wizard)
Hi there,

Below is my understanding of the situation. If this is incorrect, hopefully someone will correct me.

You are close to it... :smile:

zfs does try to be efficient, though. It only records new information for those files which have changed since a prior snapshot.

ZFS works at the block level, not the file level.

Well, the dataset has the most up-to-date version of file1, which is file1`. snap1 still points to the old file1 and to file2. Now you're consuming ~3GB of storage.

Not exactly... If you completely overwrite the 1GB file with 1GB of new data, then you are right. But should you only change a small part of that file, only a few blocks end up modified, and so only those blocks take up extra space.
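
You can watch that behaviour with the written property, which reports how much new data has been written since the latest snapshot (or since a specific snapshot). The dataset and snapshot names below are only examples:
Code:
# After overwriting a small part of a big file, only the rewritten blocks
# show up as new data, not the whole file.
zfs get written tank/somedataset
zfs get written@yesterday tank/somedataset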

If ZFS worked at the file level, it would not be able to take snapshots of zvols...
 

Gen8 Runner (Contributor)
Hey all, and many thanks already to @PhiloEpisteme for your super detailed explanation, and thanks to @Heracles for the additional context.

So, in short, you could say:
The snapshot AFTER the INITIAL snapshot is always like a "picture" of the WHOLE dataset (in my example, 10TB AND the monthly added data) and not just of the changed data blocks (10GB each month)!?
Consequently, even if the INITIAL 10TB snapshot were deleted, the next snapshot (snapshot 1, 2, 3, whatever) will always say: Hey, the INITIAL snapshot doesn't exist anymore, but in my snapshot 1 "picture" I still see that all the 10TB of files (+ the data added up to that point) exist on the main FreeNAS server, so KEEP IT ALL (the 10TB & the newly added data) on the backup server.

Is that correct, in easy and non-technical words?
 
PhiloEpisteme
ZFS works at the block level, not the file level.
Thanks for the clarification @Heracles!

The snapshot AFTER the INITIAL snapshot is always like a "picture" of the WHOLE dataset (in my example, 10TB AND the monthly added data) and not just of the changed data blocks (10GB each month)!?
It sounds like you've got it. I might say it more simply: each snapshot is like a picture. Deleting an old snapshot will not affect other snapshots. As long as you have a snapshot, you can always restore your data from that "picture". The specifics about which blocks are kept or deleted are about maximizing space efficiency and are an implementation detail.

Perhaps a short way to state which blocks are kept is this: a block is kept if there exists at least one reference to it, either from a snapshot or from the current version of a dataset.
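
If you want to convince yourself on the backup machine, you can list the snapshots on the replication target; even after older snapshots expire, the newest snapshot's REFER column will still show the full size of the dataset it can restore (the target name below is just an example):
Code:
zfs list -r -t snapshot -o name,used,refer backup/mydata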
 

Gen8 Runner (Contributor)
@PhiloEpisteme
Alright, perfect! Then I now know that I don't need to worry about the expiration of snapshots.
I always thought that a snapshot was just a snap of the CHANGED blocks and not of the WHOLE dataset's blocks.
Then it works perfectly fine for my backup solution with Replication!

Great community here! :smile:
 