Hey all, sorry if this has been posted before, but I couldn't find any threads on the forum that explain all of this concretely. If it has been covered, please feel free to point me to the relevant threads.
My scenario: I have a ZFS machine with ~100 TB of storage that I need to back up to a non-ZFS storage machine (a Synology, specifically). This means I can't take advantage of zfs send/receive, which would obviously be the preferable way to back up (and restore) my server, so it seems I'm relegated to using plain old rsync to push data. If I simply rely on rsync to detect the changes, it can take an exceedingly long time to do the necessary checksums, etc., so ideally I'd like to avoid that and leverage ZFS snapshots to manage this instead.
My system config is roughly:
- 1 NVMe boot drive (not mirrored)
- 1 NVMe "scratch" drive, containing the system dataset, ix-applications, container config info, etc. (not mirrored)
- 6-drive RAIDZ2 array (long-term data storage and backups of the scratch drive datasets)
The process I have in mind is roughly:
- Take snapshots of all relevant datasets.
- Run zfs diff between those snapshots and the previous ones to capture any changes.
- Parse that diff (see below) and combine the results into a single list of files/directories that have changed.
- Rsync only the contents of that list.
For the diff, I wrote a small script that parses the zfs diff output into a single file of changes to rsync. The diff output looks like this:
M /mnt/tank1/renders/A
- /mnt/tank1/renders/A/art.mp4
M /mnt/tank1/renders/B
+ /mnt/tank1/renders/B/bocce.mp4
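To make the parsing step concrete, here is a minimal sketch of how output like the above could be split into "changed" and "removed" paths. This is only an illustration, not the actual script: the function name parse_zfs_diff and the SAMPLE text are made up for the example, and it assumes each line is a change marker followed by whitespace and a path, with renames shown as "R old -> new". Real zfs diff output may escape spaces and non-ASCII characters in paths, which this sketch does not handle.

```python
# Sample text copied from the diff lines shown above.
SAMPLE = """\
M /mnt/tank1/renders/A
- /mnt/tank1/renders/A/art.mp4
M /mnt/tank1/renders/B
+ /mnt/tank1/renders/B/bocce.mp4
"""


def parse_zfs_diff(text):
    """Split zfs diff-style text into (changed, removed) sorted path lists.

    Hypothetical helper: assumes "<marker><whitespace><path>" per line.
    """
    changed, removed = set(), set()
    for raw in text.splitlines():
        parts = raw.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        kind, path = parts
        if kind == "-":
            removed.add(path)
        elif kind in ("+", "M"):
            changed.add(path)
        elif kind == "R":
            # Treat a rename as "old path removed, new path changed".
            old, sep, new = path.partition(" -> ")
            removed.add(old)
            if sep:
                changed.add(new)
    return sorted(changed), sorted(removed)


if __name__ == "__main__":
    changed, removed = parse_zfs_diff(SAMPLE)
    print("changed:", changed)
    print("removed:", removed)
```

Keeping removals in a separate list makes it easier to decide later whether to delete them explicitly or let rsync handle them.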
I suspect I could simply keep track of the directories whose contents changed and rsync only those, so that anything deleted would automatically be removed and anything added would be added. Otherwise, I could list the specific files/directories that were added or removed, if I knew rsync would actually delete the files that are no longer there.
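As a rough sketch of the "rsync only the changed directories" idea: rsync's --delete option removes files on the receiving side that no longer exist on the sending side, but only within the directories actually being synchronized, so syncing each changed directory with a trailing slash should cover both additions and removals. The helper names, the source root /mnt/tank1, and the destination backup@synology:/volume1/backup below are all assumptions for illustration. The commands are only built and printed, not executed, and --dry-run is included by default.

```python
import posixpath
import shlex


def prune_nested(dirs):
    """Drop directories that sit inside another listed directory."""
    kept = []
    for d in sorted({posixpath.normpath(d) for d in dirs}):
        # A parent always sorts before its children, so checking
        # against what has been kept so far is enough.
        if not any(d == k or d.startswith(k.rstrip("/") + "/") for k in kept):
            kept.append(d)
    return kept


def rsync_commands(dirs, src_root, dest_root, dry_run=True):
    """Build one recursive rsync command per changed directory."""
    cmds = []
    for d in prune_nested(dirs):
        rel = posixpath.relpath(d, src_root)
        cmd = ["rsync", "-a", "--delete"]
        if dry_run:
            cmd.append("--dry-run")
        # Trailing slashes mean "sync the contents of this directory".
        cmd += [d + "/", posixpath.join(dest_root, rel) + "/"]
        cmds.append(cmd)
    return cmds


if __name__ == "__main__":
    changed_dirs = ["/mnt/tank1/renders/A", "/mnt/tank1/renders/B"]
    # Hypothetical destination; adjust to the real Synology share.
    for cmd in rsync_commands(changed_dirs, "/mnt/tank1",
                              "backup@synology:/volume1/backup"):
        print(shlex.join(cmd))
```

Two caveats with this sketch: -a recurses into each listed directory, so a change near the top of a large tree still walks everything below it, and the parent of each destination directory is assumed to already exist on the receiving side.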
For things that aren't datasets, like the boot drive, I have simply been copying /data as well. I've also seen mention of the TrueNAS SCALE config being backed up to /var/db/system/configs-*, but I haven't been able to determine what I actually need to back up.
My main goal, of course, is to have a complete set of backups that could be used to restore my entire TrueNAS SCALE machine should something fail catastrophically, but it isn't clear to me exactly what I need to back up beyond the data, configuration, etc. I recently had one of my TrueNAS NVMe drives fail (unfortunately I don't have enough slots for a mirror), so I had to manually delete and restore the pools from the redundant HDD backups. It wasn't a two-second process, but it wasn't hard either. I guess I'm trying to make sure that if other significant failures happen in the future, like the boot drive or, god forbid, the entire RAIDZ2 array, I can restore the datasets, or at the very least the data.
Given all this, here are my actual questions:
- Right now I am only backing up data from my ZFS machine to the Synology, since the Synology is not a ZFS machine. Is there a better way to back up datasets to a non-ZFS machine such that the dataset (metadata?) is also backed up but the files remain directly accessible? I.e., to speed up recreating the datasets and data once a failed ZFS machine is usable again.
- What should I actually be backing up for the system/boot drive? Are /data and /var/db/system/configs-* both needed? Is one preferred?
- Is there an intelligent way to manage this through the GUI? I couldn't find a way to set up an rsync job that does something specific like this without rsyncing everything.
- Is there any way to automate the zfs diff of snapshots other than manually determining the last two, diffing them, and then constructing a list? Or, better yet, to set up an rsync task that leverages diffs directly?
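For reference, the manual steps in the last question (find the last two snapshots, then diff them) could be scripted roughly as follows. This is a sketch under assumptions, not a tested solution: it assumes the zfs CLI is on the PATH, that sorting by creation time with "zfs list -s creation" is the right notion of "last two", and the dataset name tank1/renders is only a guess based on the paths above. The function names are made up for the example.

```python
import subprocess


def last_two_snapshots(listing):
    """Pick the two newest names from a creation-sorted snapshot listing."""
    names = [ln.strip() for ln in listing.splitlines() if ln.strip()]
    if len(names) < 2:
        raise ValueError("need at least two snapshots to diff")
    return names[-2], names[-1]


def list_snapshots(dataset):
    """List one dataset's snapshot names, oldest first."""
    result = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name",
         "-s", "creation", "-d", "1", dataset],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def diff_latest(dataset):
    """Run zfs diff between the two most recent snapshots of a dataset."""
    older, newer = last_two_snapshots(list_snapshots(dataset))
    result = subprocess.run(
        ["zfs", "diff", older, newer],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical dataset name; needs to run where zfs is available.
    print(diff_latest("tank1/renders"))
```

Something like this could be run from a cron job after each snapshot task, with the diff text then fed into whatever builds the rsync list.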