Replication with daily changed filenames but same data


fips

Dabbler
Joined
Apr 26, 2014
Messages
43
Hi,

my hypervisor (Proxmox) creates a full backup file of each of my VMs every day.
If I now set up a replication task between two FreeNAS boxes, does ZFS replication understand that the data is more or less the same even though the filenames are different?
I don't want to transfer 600GB every day when only a few MB have changed...

The procedure I have in mind would be:
The hypervisor creates backup files on an NFS share (the "actual backup" dataset) on the first FreeNAS box.
A replication task copies the data to the second FreeNAS box.
All backup files are moved to an archive dataset, so the "actual backup" dataset is empty.
Next day:
The hypervisor creates new backup files on the NFS share (the "actual backup" dataset) on the first FreeNAS box.
A replication task copies only the incremental data to the second FreeNAS box.
All backup files are moved to the archive dataset, so the "actual backup" dataset is empty again.
And so on for the following days...

Am I thinking too "simply" about this?
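In ZFS terms, what I picture is roughly this (just a sketch; the pool name "tank" and the host "freenas2" are placeholders, not my real setup):

Code:
# day 1: snapshot the backup dataset and seed the second box with a full send
zfs snapshot tank/actual-backup@2017-08-16
zfs send tank/actual-backup@2017-08-16 | ssh freenas2 zfs recv tank/actual-backup

# day 2: snapshot again and send only the blocks that changed in between
zfs snapshot tank/actual-backup@2017-08-17
zfs send -i tank/actual-backup@2017-08-16 tank/actual-backup@2017-08-17 | ssh freenas2 zfs recv tank/actual-backup

The question is whether that incremental send stays small when each day's backup is a brand-new file.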
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
My quick assessment is no. It sounds like you're creating local copies of unchanged data with each daily backup. You could use "rsync --link-dest" to de-duplicate this when you transfer to the remote backup. Alternatively, you could find a way to make the local backup rewrite only modified files, and use snapshots to manage your backup history. Replication would then match the local incremental behavior.
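Something along these lines (untested, and the paths are made up):

Code:
# files identical to yesterday's copy become hard links instead of new data
rsync -a --link-dest=/mnt/backups/2017-08-16 /mnt/actual-backup/ /mnt/backups/2017-08-17/

The catch is that --link-dest only links files that are completely unchanged, name and content.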

I am not familiar with Proxmox or its options for creating backups.
 

Pezo

Explorer
Joined
Jan 17, 2015
Messages
60
You could turn on deduplication, if possible; then you wouldn't even store the duplicate portions of those files.
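Roughly (the dataset and pool names are just examples):

Code:
zfs set dedup=on tank/backups    # only affects blocks written from now on
zpool get dedupratio tank        # the ratio is tracked pool-wide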
 

fips

Dabbler
Joined
Apr 26, 2014
Messages
43
I am not familiar with Proxmox or its options for creating backups.
It just creates LZO archives like:
vzdump-qemu-105-2017_08_16-23_30_46.vma.lzo

The next day its name is, for example:
vzdump-qemu-105-2017_08_17-22_31_00.vma.lzo
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
rsync --link-dest can only link against unchanged files, so that won't be of help to you here.
I think your only option to get incremental archives would be to turn on dedupe. And I'm pretty sure the first rule of dedupe is, "do not turn on dedupe".

Actually, this Proxmox thread (along with a few others; search "proxmox incremental backup") seems to indicate it's possible. Could you disable compression and back up to a non-dated folder? As long as the backup tool is smart enough not to rewrite a file that hasn't changed, you could get incremental backups with ZFS snapshots.
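On the Proxmox side that might look something like this (untested; the storage path is a guess, check the vzdump man page):

Code:
# uncompressed dump, so unchanged regions of the image stay byte-identical
vzdump 105 --compress 0 --dumpdir /mnt/pve/freenas-backup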
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
One of the tricky things with backups is the format. Just one little change in the LZO-compressed image causes it to be a unique file. Virtual disk images, zvols, and database files have the same issue.

All my client backups are Linux at present, though I used to have two Solaris servers. So, in order to keep both a history of backups and make the backups incremental, I use this ZFS dataset configuration:

POOL
POOL/backups
POOL/backups/CLIENT1
POOL/backups/CLIENT1@Date_of_Backup
...
POOL/backups/CLIENT2
POOL/backups/CLIENT2@Date_of_Backup

My backup software is rsync. It compares my last backup's files with the current client's files, and only transfers the changes. After the backup finishes, I take a ZFS snapshot named with the date of the backup. Works great for me.
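In rough terms, each run boils down to something like this (simplified; a real run also needs the usual excludes for /proc, /sys, and so on):

Code:
# pull only the changes since the last run into the same dataset
rsync -aH --delete root@CLIENT1:/ /mnt/POOL/backups/CLIENT1/
# then freeze that state with a dated snapshot
zfs snapshot POOL/backups/CLIENT1@2017-08-17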

But in your case, you may want the client that uses the virtual disk image to perform the backups. Then you can get incremental backups to work properly.

There may be other options, I just don't know them.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
I experimented with VM backups to a deduplicated dataset. I was seeing 30x dedup ratios. Cool.

I.e., a full VM backup into a date-named folder each day.

BUT the dedup performance hit to the pool was high... though for a dedicated VM backup pool, perhaps it's worth it.

BUT you can't use external compression.
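For reference, the dedicated dataset setup was roughly this (pool name made up):

Code:
zfs create -o dedup=on -o compression=lz4 tank/vm-backups
zpool list tank    # the DEDUP column shows the pool-wide ratio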

Now, a related question... does anyone know if the zfs send stream is sent deduplicated?
 

fips

Dabbler
Joined
Apr 26, 2014
Messages
43
Thanks so far for your input.
Well, I don't want to create incremental backups from the hypervisor.
For me it's a big advantage to have a single backup file that I can roll back to if needed, or restore on a new machine.

If only one file per VM exists on the "actual backup" dataset, how can deduplication work?
Or do I have to set it on the second FreeNAS box?

But, as Arwen said, if something in the LZO compression changes, I can forget about deduplication.
@Arwen:
You keep all backups on both clients?
 

Pezo

Explorer
Joined
Jan 17, 2015
Messages
60
But, as Arwen said, if something in the LZO compression changes, I can forget about deduplication.
Can't you disable compression if you enable dedup?
 

Pezo

Explorer
Joined
Jan 17, 2015
Messages
60
But you would only store about one of those big backup files' worth of data, and then only the changes for subsequent backups.
What size are we talking about, uncompressed?
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
And ZFS supports compression.
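E.g. on the backup dataset itself (dataset name made up):

Code:
zfs set compression=lz4 tank/vm-backups
zfs get compressratio tank/vm-backups    # what the pool-side compression achieves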
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
@Arwen:
You keep all backups on both clients?
It's for home, so I perform backups only once a month. (And I actually have 3 clients, plus a temporary build box that I don't back up yet.) I probably have 20 or 30 snapshots for my oldest client. But since they are snapshots, incremental based on changes, I see perhaps 100MB to 1GB of difference between snapshots. It depends on how many OS packages got changed during an OS update.
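You can see the per-snapshot deltas with something like:

Code:
# USED is the space unique to each snapshot, i.e. roughly the delta
zfs list -t snapshot -r -o name,used,refer POOL/backups/CLIENT1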

With 7TB available, I see no reason to worry about it at present. When I start getting low on free space, I'll start deleting some of the oldest client backup snapshots.

Note that I keep my media separate. Plus, the computer documentation, source code packages, and firmware that I keep for my devices (Ethernet switches, FreeNAS OS images, ISO boot images, etc.) are also separate from the client backups. Basically, those are kept for historical reasons.
 

fips

Dabbler
Joined
Apr 26, 2014
Messages
43
But you would only store about one of those big backup files' worth of data, and then only the changes for subsequent backups.
What size are we talking about, uncompressed?
Well, compressed (fast, with LZO), all backups together are about 600GB; uncompressed is hard to say, I'd guess something like 1-1.5TB.
 