Deduplicating backup solution?

Status
Not open for further replies.

giox069

Dabbler
Joined
Jun 1, 2012
Messages
28
I'm using FreeNAS with ZFS, holding about 3.5 TB of data.

Every night the data is backed up over a slow (10 Mbps) Internet link using rsync, copying only changed files. The destination is a Synology NAS, not FreeNAS.

Sometimes, during the day, a lot of data (e.g. 60-70 GB) is moved around within the ZFS filesystem. rsync treats the moved files as new, so the nightly backup takes too long to complete.

Is there a way to avoid re-sending these moved files? Unfortunately I cannot use ZFS replication to a Synology NAS.

Thank you.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Your pipe isn't big enough for the job. You can either (1) get a larger pipe or (2) move less data. Deduplication isn't going to help you. You can shop around for faster Internet or set up a dedicated site-to-site link (like Metro Ethernet). You can also figure out a way to physically carry nightly incremental backups offsite. Your car has lots of bandwidth.
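
To put a number on "the pipe isn't big enough", here is a rough back-of-the-envelope calculation in Python. It assumes the full 10 Mbps is available as upload bandwidth and uses decimal gigabytes; the function name and the `efficiency` parameter are illustrative only, and real throughput will be lower once protocol overhead is counted.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to push size_gb (decimal GB) over a link of link_mbps.

    efficiency is an assumed fraction of the nominal link rate that is
    actually usable (1.0 = the ideal, overhead-free case).
    """
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for size in (60, 70):
        print(f"{size} GB at 10 Mbps: {transfer_hours(size, 10):.1f} h")
```

Even in the ideal case, 60-70 GB works out to roughly 13 to 16 hours, which is longer than a typical overnight backup window.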

If your FreeNAS server has available hotswap bays and proper server hardware, you can always create a second zpool. You could just get a single large disk (I'd probably go with 6TB in your case) and replicate to it during the day. When you are getting ready to leave, export the single-disk pool, pull the disk, and take it home with you. This works decently well if you take my approach to work (first in, last out).
 
Joined
Jan 9, 2015
Messages
430

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
The only dedupe that would be of any value costs well beyond the price of buying another FreeNAS (or other ZFS-capable) machine in order to take advantage of replication. And the dedupe I'm talking about isn't the one built into FreeNAS; it's a large-scale, network-based appliance.

I'm confused why you think dedupe would be helpful, though. Are the 60+ GB all the same files? Or are they in some way duplicates?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Presumably it's because files are being moved around and he's using rsync, which sees a moved file as a new file at a new path. I guess ZFS replication would potentially solve this problem, since it operates at the block level, unless users are moving data from one dataset to another.
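
One way to "move less data" without ZFS on the far end is to detect moves yourself and replay them on the destination before rsync runs. The following is only a sketch under several assumptions: the function names are hypothetical, an index from the previous night is available for comparison, and the Synology allows some way to rename files remotely. Hashing several terabytes every night is also expensive, so a real script would probably cache digests by size and modification time.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root):
    """Map relative path -> content digest for every file under root."""
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            index[os.path.relpath(full, root)] = file_digest(full)
    return index


def find_moves(old_index, new_index):
    """Return (old_path, new_path) pairs where content is unchanged but the path differs."""
    # Paths that disappeared since the last run, grouped by content digest.
    gone = defaultdict(list)
    for path, digest in old_index.items():
        if path not in new_index:
            gone[digest].append(path)

    moves = []
    for path, digest in sorted(new_index.items()):
        # A new path whose content matches a vanished path is treated as a move.
        if path not in old_index and gone[digest]:
            moves.append((gone[digest].pop(), path))
    return moves
```

Each pair returned by `find_moves` could then be applied as a rename on the backup target, so that the following rsync run finds the files already in place and skips them.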
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
As far as I'm aware, 'zfs send' un-deduplicates data first. There was mention of a -D option or something to have zfs send actually send a deduplicated data stream, but I never got it to work right.

Too bad, because one use case for dedupe I thought of was sending full VM disk images over a local network to a local deduplicated dataset, then replicating to a remote site over a limited-bandwidth connection. Ideally it would only replicate changed blocks. That doesn't seem to be possible, at least not the last time I looked.
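
The "only replicate changed blocks" idea can be illustrated with a small Python sketch that compares two versions of an image block by block. This is not how zfs send is implemented; the 4096-byte block size and the function names are assumptions made for the example.

```python
import hashlib


def block_digests(data: bytes, block_size: int = 4096):
    """SHA-256 digest of each fixed-size block in data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = 4096):
    """Indexes of blocks in new that differ from, or do not exist in, old."""
    old_d = block_digests(old, block_size)
    new_d = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_d)
        if i >= len(old_d) or digest != old_d[i]
    ]
```

Only the blocks listed by `changed_blocks` would need to cross the slow link, which is why a one-byte change inside a large VM image does not have to mean re-sending the whole image.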
 

Joshua Parker Ruehlig

Hall of Famer
Joined
Dec 5, 2011
Messages
5,949
To double check, is the data being compressed while being sent? I think rsync has built-in compression, but I'm not sure it's the best. In my testing, using zfs send to send about 8 GB once a week to an off-site location, plzip was the best overall performer (considering compression, network, and decompression time).
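
A quick way to run this kind of comparison on your own data is to time a few compressors on a representative sample. The sketch below uses only Python's standard library; `lzma` stands in for plzip here because both are LZMA-based, but it writes the xz format rather than lzip and is single-threaded, so the timings are only indicative. The function name is made up for the example.

```python
import bz2
import lzma
import time
import zlib


def compare_codecs(data: bytes):
    """Return {codec: (compressed_size / original_size, seconds_to_compress)}."""
    codecs = {
        "zlib": (zlib.compress, zlib.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Sanity check: the round trip must reproduce the input exactly.
        assert decompress(packed) == data
        results[name] = (len(packed) / len(data), elapsed)
    return results


if __name__ == "__main__":
    sample = b"freenas backup sample " * 10000
    for name, (ratio, seconds) in compare_codecs(sample).items():
        print(f"{name}: ratio {ratio:.4f}, {seconds:.4f} s")
```

On a 10 Mbps link the network is usually the bottleneck, so a slower compressor with a better ratio can still win overall, which matches the experience described above.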
 