Backup/Dataset replication with zfs send/receive

NASbox

Neophyte Sage
Joined
May 8, 2012
Messages
591
I'm hoping that someone with experience can help me validate my concept for removable local backups before I dig deeply into the research on the exact commands and how to script it.

TL;DR: Can I selectively update a backup created with zfs send -R | zfs receive? If necessary, I am OK with destroying my current backup and redoing it if the initial backup must be created non-recursively.

BACKGROUND/DETAILS
Context: Home environment

I just expanded my FreeNAS, and before the expansion I had a pool and 4 datasets and I created a snapshot on each of them so that the pool looked like this:

TANK
TANK@SNAP
TANK/DATASET1
TANK/DATASET1@SNAP
TANK/DATASET2
TANK/DATASET2@SNAP
TANK/DATASET3
TANK/DATASET3@SNAP
TANK/DATASET4
TANK/DATASET4@SNAP

I was able to back this up to a single-drive removable pool with a recursive zfs send/receive, something like: zfs send -R TANK@SNAP | zfs receive BACKUP01
(I can't remember the exact syntax. It was a single-line command something like what I've shown here.)

I have since added two large datasets used for server and workstation backups:

TANK/BACKUPDATASET1
TANK/BACKUPDATASET2

which I don't want to back up on pool BACKUP01.

Can I selectively update my removable drive backup of pool TANK and DATASET1-4 by creating new snapshots "SNAP2" and doing something like:

zfs send TANK@SNAP | zfs receive BACKUP01/TANK@SNAP2
zfs send TANK/DATASET1@SNAP | zfs receive BACKUP01/TANK/DATASET1@SNAP2
zfs send TANK/DATASET2@SNAP | zfs receive BACKUP01/TANK/DATASET2@SNAP2
zfs send TANK/DATASET3@SNAP | zfs receive BACKUP01/TANK/DATASET3@SNAP2
zfs send TANK/DATASET4@SNAP | zfs receive BACKUP01/TANK/DATASET4@SNAP2
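
For reference, a minimal sketch of what the incremental form of these commands might look like. It assumes the BACKUP01/TANK/... layout used above and that SNAP already exists on both pools. With -i, zfs send takes the old and the new snapshot, and zfs receive is given the target dataset (the snapshot name comes from the stream). The function name is made up for illustration, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: prints the incremental send/receive commands instead of running them.
# The BACKUP01/<source path> layout and the dataset list are taken from this post.

incremental_cmds() {
    old_snap=$1   # snapshot already present on both pools, e.g. SNAP
    new_snap=$2   # newly created snapshot on the source, e.g. SNAP2
    shift 2
    for ds in "$@"; do
        # -i sends only what changed between old_snap and new_snap.
        # The receive target is a dataset; the snapshot name comes from the stream.
        printf '%s\n' "zfs send -i ${ds}@${old_snap} ${ds}@${new_snap} | zfs receive BACKUP01/${ds}"
    done
}

incremental_cmds SNAP SNAP2 TANK TANK/DATASET1 TANK/DATASET2 TANK/DATASET3 TANK/DATASET4
```

Because only the listed datasets are sent, TANK/BACKUPDATASET1 and TANK/BACKUPDATASET2 are simply left out. If the copy on BACKUP01 has been modified since the last receive, the receive is refused unless the target is rolled back first (zfs receive -F does that).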

and then going forward continue the process by:

Destroying the original snapshots to allow space to be reclaimed on BACKUP01 with something like:

zfs destroy TANK@SNAP
zfs destroy TANK/DATASET1@SNAP
zfs destroy TANK/DATASET2@SNAP
zfs destroy TANK/DATASET3@SNAP
zfs destroy TANK/DATASET4@SNAP
zfs destroy BACKUP01/TANK@SNAP
zfs destroy BACKUP01/TANK/DATASET1@SNAP
zfs destroy BACKUP01/TANK/DATASET2@SNAP
zfs destroy BACKUP01/TANK/DATASET3@SNAP
zfs destroy BACKUP01/TANK/DATASET4@SNAP

and then creating another snapshot SNAP3 and doing another set of zfs send/receive:

zfs send TANK@SNAP2 | zfs receive BACKUP01/TANK@SNAP3
zfs send TANK/DATASET1@SNAP2 | zfs receive BACKUP01/TANK/DATASET1@SNAP3
zfs send TANK/DATASET2@SNAP2 | zfs receive BACKUP01/TANK/DATASET2@SNAP3
zfs send TANK/DATASET3@SNAP2 | zfs receive BACKUP01/TANK/DATASET3@SNAP3
zfs send TANK/DATASET4@SNAP2 | zfs receive BACKUP01/TANK/DATASET4@SNAP3
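
The whole cycle described above (new snapshot, send the delta, then drop the previous snapshot on both sides) could be sketched roughly as follows. This is an assumption-laden outline, not a tested script: the function name is hypothetical, the BACKUP01/<source path> layout is the one from this post, and the commands are only printed. The old snapshot is destroyed only after the new one has been received, because the newest snapshot common to both pools is the base for the next incremental.

```shell
#!/bin/sh
# Sketch only: prints one rotation step per dataset; nothing is executed.
# The BACKUP01/<source path> layout is assumed from the post.

rotate_cmds() {
    old=$1   # snapshot that currently exists on both pools
    new=$2   # snapshot to create and replicate
    shift 2
    for ds in "$@"; do
        printf '%s\n' "zfs snapshot ${ds}@${new}"
        # The destroys are chained with && so they only run if the receive succeeded.
        printf '%s\n' "zfs send -i ${ds}@${old} ${ds}@${new} | zfs receive BACKUP01/${ds} && zfs destroy ${ds}@${old} && zfs destroy BACKUP01/${ds}@${old}"
    done
}

rotate_cmds SNAP2 SNAP3 TANK TANK/DATASET1 TANK/DATASET2 TANK/DATASET3 TANK/DATASET4
```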

I want to "own the snapshot names" and control the whole process so that I can use the FreeNAS automatic snapshots for day to day protection and not worry if a replication snapshot gets deleted.

The datasets in question are low turnover, and tend to grow through addition, so the deltas should be relatively small (a few GB at most) and there shouldn't be a lot of deletions to waste space on the protective snapshot.

This seems very simple to implement and appears to offer speed and data integrity/management benefits over using rsync. (With the old system, a problem with the rsync source would corrupt the backup as well. Also, a zfs diff DATASET@SNAP DATASET instantly tells me exactly which files are not backed up, and could conveniently be used to create interim backups.)
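
As a rough illustration of the zfs diff idea, the filter below turns zfs diff -H output into a plain list of paths that would need an interim backup. It assumes the tab-separated layout of the -H option (change type, path, and for renames the new path as a third column); the function name is made up for this sketch.

```shell
#!/bin/sh
# Sketch only: reads `zfs diff -H` style lines on stdin and prints the paths
# that were added, modified, or renamed since the snapshot.
# Assumed columns (tab-separated): change type, path, and for renames the new path.

changed_paths() {
    awk -F '\t' '
        $1 == "+" || $1 == "M" { print $2 }   # added or modified
        $1 == "R"              { print $3 }   # renamed: report the new name
    '
}

# Intended use (not run here):
#   zfs diff -H TANK/DATASET1@SNAP TANK/DATASET1 | changed_paths
```

Deleted entries (type "-") are skipped on purpose, since there is nothing left to copy for them.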

If I need to recreate the initial individual snapshots (i.e. not use a base created with a recursive replication), I'm fine doing that since I have 2 backup copies and TANK is RAID-Z2.

I am also wondering what happens if BACKUP01 gets completely full?
Does this scenario put the whole FreeNAS system at risk?
For example, if SNAP3 filled my backup pool (e.g. because of cryptolocker or similar), could I still continue to use BACKUP01 by destroying the partially complete SNAP3?
(In this case I would hope I could roll back to SNAP2 on both TANK and BACKUP01, then copy any clean versions of files created since SNAP2 to the main pool TANK and run the backup again. Would BACKUP01 recover from being completely full by destroying the failed replication snapshot SNAP3?)
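
One way to lower the risk of filling BACKUP01 might be to compare the estimated stream size with the free space before sending anything. The check below is only a sketch: the function name and the 10% reserve are arbitrary assumptions, and the comments show where the two numbers could plausibly come from. As far as I know, a receive that fails partway is discarded rather than kept, unless the resumable -s option was used.

```shell
#!/bin/sh
# Sketch only: decides whether an estimated stream fits on the backup pool
# while leaving some headroom. reserve_pct is a hypothetical safety margin.

fits_on_backup() {
    est_bytes=$1            # estimated size of the stream, in bytes
    avail_bytes=$2          # free space on the backup pool, in bytes
    reserve_pct=${3:-10}    # share of the free space to keep unused
    usable=$(( avail_bytes - avail_bytes * reserve_pct / 100 ))
    [ "$est_bytes" -le "$usable" ]
}

# Where the numbers could come from (not run here):
#   zfs send -nvP -i TANK/DATASET1@SNAP2 TANK/DATASET1@SNAP3   # dry run, prints a size estimate
#   zfs list -Hp -o avail BACKUP01                              # free space in bytes

if fits_on_backup 900 1000 10; then echo "fits"; else echo "too big"; fi
```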

Am I missing anything? Any special options I should include?

Any suggestions/advice/cautions/questions or other constructive input is welcomed.

================================================

For the benefit of anyone who might care what the project is that I'm working on:

I'm creating a removable local backup solution for my home/home office.

My plan is to map my pool into manageable groups and assign the paths to be backed up to a specified backup drive. The backup script will then identify the removable drive that is loaded in the hot swap bay (by its serial number) and automatically mount its pool, snapshot the required datasets on the main pool, replicate the deltas to the backup pool, scrub the backup, check the SMART data and unmount the backup drive.

The only manual intervention should be loading the drive and starting the backup script, checking for successful completion, and unloading and storing the drive at the end.
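
A rough outline of how such a script could be structured is below. Everything specific in it is a placeholder: the serial numbers, the second pool name BACKUP02, the device path and the function names are hypothetical, and the script only prints the planned commands. It also assumes each backup pool mirrors the source paths (BACKUP01/TANK/...).

```shell
#!/bin/sh
# Sketch only: prints the planned steps; nothing is executed.
# Serial numbers, BACKUP02 and the device path are hypothetical placeholders.

pool_for_serial() {
    # Map the serial number of the loaded drive to the pool that lives on it.
    case "$1" in
        SERIAL-AAAA) echo "BACKUP01" ;;
        SERIAL-BBBB) echo "BACKUP02" ;;
        *) return 1 ;;
    esac
}

backup_plan() {
    serial=$1; dev=$2; old=$3; new=$4
    shift 4
    pool=$(pool_for_serial "$serial") || { echo "unknown drive: $serial" >&2; return 1; }
    printf '%s\n' "zpool import ${pool}"
    for ds in "$@"; do
        printf '%s\n' "zfs snapshot ${ds}@${new}"
        printf '%s\n' "zfs send -i ${ds}@${old} ${ds}@${new} | zfs receive ${pool}/${ds}"
    done
    # zpool scrub returns immediately; a real script would have to wait for the
    # scrub to finish (for example by polling zpool status) before going on.
    printf '%s\n' "zpool scrub ${pool}"
    printf '%s\n' "smartctl -a ${dev}"
    printf '%s\n' "zpool export ${pool}"
}

backup_plan SERIAL-AAAA /dev/ada4 SNAP2 SNAP3 TANK/DATASET1 TANK/DATASET2
```

Keeping the plan as printed text makes it easy to review before anything touches the pools; the same lines could later be executed one by one with error checking after each step.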
 

NASbox

Neophyte Sage
Joined
May 8, 2012
Messages
591
Did you decide upon a solution?

Thanks for asking. Still trying to figure out what to do. Had a limited amount of time to work on it, and the learning curve is steep.

I've been doing a few manual snapshots of other datasets and manually pruning.

I'm also trying to set up a little test environment - I hope a dataset with a few nested datasets is equivalent to a pool with several datasets for test purposes.

Can anyone tell me if partial snapshot backup/restore on datasets within TANK/JUNKDATA

TANK
TANK/JUNKDATA
TANK/JUNKDATA/DATASET1
TANK/JUNKDATA/DATASET2
TANK/JUNKDATA/DATASET3
TANK/JUNKDATA/DATASET4
TANK/JUNKDATA/DATASET5
TANK/JUNKDATA/DATASET6

is going to behave the same way as if I were operating on a structure like this:

TANK
TANK/DATASET1
TANK/DATASET2
TANK/DATASET3
TANK/DATASET4
TANK/DATASET5
TANK/DATASET6

Any input would be much appreciated.
 

Jatrabari

Member
Joined
Sep 23, 2017
Messages
98

Were you able to finish this project?
 