
Cloud Backup Snapshots using RClone?

BetYourBottom

Member
Joined
Nov 26, 2016
Messages
131
I've been thinking about this quite a bit lately. It seems most backup solutions take forever to scan your files for changes, or upload everything every time, or don't allow easy versioning. (Rclone doesn't seem to have any versioning, and Duplicati takes a long time to compile and send files for larger datasets.)

I was wondering if it might be possible to send ZFS snapshots directly to a cloud backup. Since snapshots are virtually instant, all the change detection is handled by ZFS itself, so you don't need lengthy checksumming or inaccurate timestamping. Snapshots can be diffed using tools built into ZFS by default, so filtering out only the changed data is handled for free by ZFS as well. Versioning comes with the territory when dealing with snapshots too, so it seems that a way to upload snapshots directly would drastically simplify backups.

Based on all the reading I've been doing lately on ZFS and RClone, I came up with a command that I think should work (I haven't been able to test it yet; I'm moving drives and will use the freed ones later for a test pool). I'd like some feedback on whether something like this could work as simply as I think it may.

zfs send pool@longestlife | gzip | gsplit --bytes=10M --filter 'rclone rcat remote:path/to/$FILE'

The first command should send a full snapshot to stdout to create a baseline, with longestlife referring to the periodic snapshot that expires the latest (I would like to integrate this with periodic snapshotting). Then gzip should compress the data stream coming from zfs send and feed it into gsplit; gsplit then takes the stream, chops it into 10MB chunks, and hands each chunk to rclone to upload to a path on your remote.
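A minimal local sketch of the same chunking pipeline, for anyone who wants to try the mechanics without a pool or a remote. The stand-ins are assumptions: random bytes replace the output of zfs send, and writing into a temporary directory replaces rclone rcat. The filter is written in single quotes so that $FILE is expanded by the shell that split starts for each chunk, not by the calling shell (where it would be empty).

```shell
#!/bin/sh
# Stand-ins (assumptions): random data replaces `zfs send`, and a local
# directory replaces `rclone rcat remote:path/to/$FILE`.
set -eu

# GNU split is installed as gsplit on FreeBSD and as split on most Linux systems.
if command -v gsplit >/dev/null 2>&1; then SPLIT=gsplit; else SPLIT=split; fi

WORK=$(mktemp -d)
mkdir "$WORK/remote"
export DEST="$WORK/remote"   # exported so the filter's shell can see it

# About 300 KB of sample data in place of a snapshot stream.
head -c 300000 /dev/urandom > "$WORK/stream.bin"

# Compress, then cut the compressed stream into 100 KiB chunks.
# split sets $FILE to the chunk name (chunk.aa, chunk.ab, ...) for each filter run.
gzip -c < "$WORK/stream.bin" \
  | "$SPLIT" --bytes=100K --filter='cat > "$DEST/$FILE"' - chunk.

# Restore: concatenate the chunks in name order, then decompress once.
cat "$DEST"/chunk.* | gzip -dc > "$WORK/restored.bin"

cmp "$WORK/stream.bin" "$WORK/restored.bin" && echo "round trip OK"
CHUNKS=$(ls "$DEST" | wc -l | tr -d ' ')
echo "chunks written: $CHUNKS"
```

Because gzip runs before the split, the chunks are not individually valid gzip files; they only decompress once they are joined back together in order.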

My current concern is that this might swamp your RAM by doing everything through stdout, and/or load up 1000s of rclone instances at once, causing other performance issues. I'm not an expert on how these commands work with stdout, but in the ideal case I was hoping there would be some blocking, so that only so many instances of rclone are open before everything pauses itself and no giant RAM cache is created.
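On the blocking question: a pipe has a small fixed-size buffer, so a writer stalls when the reader falls behind. As far as I understand GNU split, it also waits for each filter process to exit before starting the next one. The sketch below checks that locally. The filter is a stand-in for rclone (an assumption) that uses an atomic mkdir as a lock and logs an "overlap" line if a previous filter is still running.

```shell
#!/bin/sh
# Checks whether split ever runs two --filter commands at the same time.
set -eu

if command -v gsplit >/dev/null 2>&1; then SPLIT=gsplit; else SPLIT=split; fi

WORK=$(mktemp -d)
export LOCK="$WORK/lock" LOG="$WORK/log"
: > "$LOG"

# Stand-in for rclone: mkdir is atomic, so it fails if another filter holds the lock.
FILTER='if mkdir "$LOCK" 2>/dev/null; then
          echo "start $FILE" >> "$LOG"; cat > /dev/null; rmdir "$LOCK"
        else
          echo "overlap $FILE" >> "$LOG"; cat > /dev/null
        fi'

# 50000 bytes in 10 KiB chunks gives five filter invocations.
head -c 50000 /dev/zero | "$SPLIT" --bytes=10K --filter="$FILTER" - part.

STARTS=$(grep -c '^start' "$LOG" || true)
OVERLAPS=$(grep -c '^overlap' "$LOG" || true)
echo "filters started: $STARTS, overlapping: $OVERLAPS"
```

If that holds on your system, there should only ever be one rclone process at a time, with zfs send and gzip paused behind it while each chunk uploads.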

If this works out, then for newer versions the only things that'd need to change would be the remote directory and the zfs send command, which would become something like zfs send -i pool@longestlife pool@latestsnap to send a diff of the two snapshots.
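To make the full versus incremental difference concrete, here is a small dry-run sketch that only prints the two pipelines and runs nothing. The function name backup_cmd and the remote directories remote:path/to/full and remote:path/to/incr are hypothetical; the snapshot names are the ones used above.

```shell
#!/bin/sh
# Dry run: prints the pipeline that would be executed, without running it.
set -eu

# Hypothetical helper. $1 = remote directory, $2 = base snapshot,
# $3 = optional newer snapshot (switches to an incremental send).
backup_cmd() {
    remote_dir=$1
    base=$2
    latest=${3:-}
    if [ -n "$latest" ]; then
        send="zfs send -i $base $latest"
    else
        send="zfs send $base"
    fi
    # $FILE stays literal here; split expands it per chunk at run time.
    printf '%s | gzip | gsplit --bytes=10M --filter '\''rclone rcat %s/$FILE'\''\n' \
        "$send" "$remote_dir"
}

FULL=$(backup_cmd remote:path/to/full pool@longestlife)
INCR=$(backup_cmd remote:path/to/incr pool@longestlife pool@latestsnap)

echo "$FULL"
echo "$INCR"
```

Keeping each stream in its own remote directory avoids mixing chunk names from different snapshots.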

To do a full restore I would think something like this would work:

rclone cat remote:path/to/dir | zcat | zfs receive

Basically the reverse of the previous command, and would need to be run on each of the uploaded snapshots as well.

Issues I can see with this are:
  • It might be difficult to rebase the backup on a newer snapshot. Say you have a snapshot that lasts for 1 year: this backup solution would work until then, but when it expires, the only way I see it working right now would be to do a full upload based on the newest 1-year snapshot.
  • I don't know how a partial restore would work. I'm not familiar enough with zfs send and zfs receive to know how they handle partial restores, so I can't figure out how it would work with the upload solution.
  • I don't know how rclone would deal with errors during transfer, how much it confirms file integrity, or how an error would propagate back to stop the backup should an issue arise.
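On the last point, one part can be checked locally: as far as I understand GNU split, it exits with a non-zero status as soon as a filter command fails, so a failing upload should stop the remaining chunks from being sent. In this sketch the filter is a stand-in for rclone (an assumption) that deliberately fails on the second chunk. It says nothing about how rclone itself verifies integrity.

```shell
#!/bin/sh
# Checks how a failing --filter command affects split's exit status.
set -u

if command -v gsplit >/dev/null 2>&1; then SPLIT=gsplit; else SPLIT=split; fi

WORK=$(mktemp -d)
export LOG="$WORK/log"
: > "$LOG"

# Stand-in for rclone: log the chunk name, consume the data,
# then return failure for the second chunk (part.ab).
FILTER='echo "$FILE" >> "$LOG"; cat > /dev/null; [ "$FILE" != part.ab ]'

STATUS=0
head -c 50000 /dev/zero \
  | "$SPLIT" --bytes=10K --filter="$FILTER" - part. 2>/dev/null || STATUS=$?

SEEN=$(wc -l < "$LOG" | tr -d ' ')
echo "split exit status: $STATUS, chunks attempted: $SEEN of 5"
```

Note that a plain sh pipeline only reports the status of its last command, so a failure in zfs send or gzip further up the pipe would need separate handling.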
I'd love some feedback on this, especially if it's to tell me someone smarter has already figured out how to use ZFS's own tools directly for backups.
 

hescominsoon

Senior Member
Joined
Jul 27, 2016
Messages
403

BetYourBottom

Member
Joined
Nov 26, 2016
Messages
131
I wasn't looking for a certain service, I was looking for something that could be used in a more generic sense.
Rclone can connect to a ton of different online storage providers and, thus, if a command like this works (I haven't been able to test it yet), then it'd let you send zfs snapshots to arbitrary providers.

They wouldn't even need a ZFS backend to deal with the snapshots, because this should send them as split-up, gzipped, and encrypted files. So as long as they can store generic files and the restore command works, it shouldn't need any special support beyond what rclone provides.
 

hescominsoon

Senior Member
Joined
Jul 27, 2016
Messages
403
rclone doesn't do snapshot transfers. I wish it did. Until it does, your best bet is a provider like rsync, which is native ZFS and can meet your requirements...
 

hescominsoon

Senior Member
Joined
Jul 27, 2016
Messages
403
my apologies i thought you would read the documentation at rsync before snarking back... rsync sets up a remote ZFS filesystem. You could then set that as a replication target using the GUI of TNC/TN. Then you have native ZFS synchronization, and it should check your boxes. rclone isn't necessary then, and the resource usage is in line with native ZFS send/receive.
 

BetYourBottom

Member
Joined
Nov 26, 2016
Messages
131
my apologies i thought you would read the documentation at rsync before snarking back...
I appreciate that you are the only person to actually reply but your first post was still a link with no additional comment (which is rude).
I replied nicely explaining why that service was not exactly relevant and restated why I'm interested in this specific idea/solution.
Then you seemingly ignored that comment to just say "rclone doesn't do snapshot transfers" as if I didn't just make an entire topic about a sequence of commands that was meant to allow for snapshot transfers over rclone.

My snark is because it seems you are ignoring the point of this topic.
The point is a sequence of commands that, in theory, will let you send snapshots to arbitrary cloud providers via rclone, plus discussion of whether the commands would work and of ways to make them work or improve them.

I also appreciate that you like that service or think that it would be especially convenient for snapshot replication. However, I have no interest in an additional cloud storage provider at the moment for various reasons and would like to continue exploring options that might allow me to transfer snapshots to generic storage providers.
 