How to backup files *to* TrueNAS with rsync?

troudee

Explorer
Joined
Mar 26, 2020
Messages
69
Hello everybody!

I am planning to have a few Windows machines do their backups onto my new TrueNAS Core installation. For the image backups (done every now and then), I am planning to have SMB shares.

For the daily-run backup script, I was planning to use rsync to sync the most important files into a directory at the NAS (securing the latter with periodic snapshots).

Overall question:
As I think the world works, rsync cannot play its main strengths (only the changed parts are transferred and written (!)) if I just rsync from Windows to an SMB share (using SMB), is that correct?

If that is correct, the next question arises:

How to set up the users?
I was thinking about creating a group (e.g. rsync-users) that has no technical meaning, plus some users that should use rsync over SSH. But what should I set as their shells then? I do not think it is a good idea to give them shell access, but there is just scponly and nothing like rsynconly.
 
Joined
Oct 22, 2019
Messages
3,641
You would use either the SSH (encrypted stream, slower) or Rsync protocol daemon mode (non-encrypted stream, faster).

The free client (command line only) for Windows, which comes as a complete bundle, is known as cwRsync Client. It can be used to run a regular (daily, weekly, etc.) rsync task via the Windows Task Scheduler.

No need to do this over SMB: it works over SSH and Rsync.

You will need to use the --inplace option to prevent extraneous copies (seen as different records by ZFS), which can yield larger snapshot sizes. Using a "copy-on-write" technique on top of a "copy-on-write" file system is inefficient. See here for a more comprehensive discussion: Adding / removing files from an existing .zip archive, how does ZFS handle it?
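
To see why this matters on a copy-on-write file system, here is a minimal shell sketch (no rsync involved) that contrasts the two update styles. Without --inplace, rsync builds the updated file as a temporary copy and renames it over the original; with --inplace it writes into the existing file. On ZFS an in-place write still allocates new blocks, but only for the records that changed, whereas the rename approach rewrites every record of the file. The file names and the helper inode_of are made up for illustration.

```shell
# Illustration only: file names are hypothetical, and the two update
# strategies are mimicked with plain POSIX tools instead of calling rsync.
workdir=$(mktemp -d)

# Print the inode number of a file (first field of `ls -i`).
inode_of() { ls -i "$1" | awk '{print $1}'; }

printf 'AAAA-BBBB-CCCC\n' > "$workdir/data.bin"
inode_before=$(inode_of "$workdir/data.bin")

# Style 1: overwrite 4 bytes inside the existing file (the --inplace style).
printf 'XXXX' | dd of="$workdir/data.bin" bs=1 seek=5 conv=notrunc 2>/dev/null
inode_inplace=$(inode_of "$workdir/data.bin")

# Style 2: write a complete new copy, then rename it over the original
# (rsync's default behaviour when --inplace is not given).
sed 's/CCCC/YYYY/' "$workdir/data.bin" > "$workdir/.data.bin.tmp"
mv "$workdir/.data.bin.tmp" "$workdir/data.bin"
inode_replaced=$(inode_of "$workdir/data.bin")

echo "in-place update kept the inode: $inode_before -> $inode_inplace"
echo "temp-copy update replaced it:   $inode_inplace -> $inode_replaced"

rm -rf "$workdir"
```

The inode staying the same in the first case is the visible sign that the existing file was modified rather than replaced.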

I shall share my personal setup, and feel free to tweak it to your liking. Just need to compile everything together so that it is easy to read and review.
 

troudee

Thank you very much for your reply! :smile:

> As I think the world works, rsync cannot play its main strengths (only the changed parts are transferred and written (!)) if I just rsync from Windows to an SMB share (using SMB), is that correct?
> You would use either the SSH (encrypted stream, slower) or Rsync protocol daemon mode (non-encrypted stream, faster).
> [...]
> No need to do this over SMB: it works over SSH and Rsync.

That sounds exactly like I imagined it to be, great. As far as I know, Rsync in daemon mode on the TrueNAS does not offer any authentication, right? Then I would prefer the SSH method with extra credentials per PC – if somebody/something evil gains control over it, it cannot spoil the backups of the others!

> The free client (command line only) for Windows, which comes as a complete bundle, is known as cwRsync Client. It can be used to run a regular (daily, weekly, etc.) rsync task via the Windows Task Scheduler.
Is there any objection against just using rsync from Cygwin? That would have been my first way to look for any Unix tool. :wink:

> You will need to use the --inplace option to prevent extraneous copies (seen as different records by ZFS), which can yield larger snapshot sizes. Using a "copy-on-write" technique on top of a "copy-on-write" file system is inefficient. See here for a more comprehensive discussion: Adding / removing files from an existing .zip archive, how does ZFS handle it?
Yes, that was the main reason why I wanted to use rsync in the first place!! :smile: I was planning to ask about that later on in this thread, but now that you've already mentioned it, these would be the parameters I'd use to make the backups:

Code:
rsync --verbose --progress \
      --recursive --links --times \
      --inplace --no-whole-file \
      --delete --delete-after \
      <SRC> <DSTonNAS>

(for day-to-day sync)

Code:
rsync --verbose --progress \
      --checksum \
      --recursive --links --times \
      --inplace --no-whole-file \
      --delete --delete-after \
      <SRC> <DSTonNAS>

(for not-every-day-to-day, but regularly, to get files that changed without updating times or growing, like TrueCrypt Containers)

> I shall share my personal setup, and feel free to tweak it to your liking. Just need to compile everything together so that it is easy to read and review.
I would be delighted if you have the time to do that! Especially how you solved the question of the users' shells: How do you restrict your rsync users from logging in via SSH (and probably using some newly-found privilege escalation bug)?
 
Joined
Oct 22, 2019
Messages
3,641
The setup for the client (Windows)
  1. Download cwRsync Client
  2. Install it to C:\Program Files\cwrsync\
  3. Add to your User's Path the following entry: C:\Program Files\cwrsync\bin
    • This is done via Control Panel > System > Advanced System Settings > Environment Variables
    • Select "Path" and click "Edit"
    • Click "New"
    • Enter or paste C:\Program Files\cwrsync\bin
    • Now rsync can directly be invoked from the Command Prompt, PowerShell, or batch file (.bat) without specifying the executable or entire path
  4. Make a hidden folder named .bin in your User's home directory; this is where the .bat file will be kept
  5. Create a public/private key pair with ssh-keygen (from the cwRsync package)
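
Step 5 could look roughly like the following sketch, run from a shell where the bundled ssh-keygen is on the Path. The key type (ed25519), the comment string and the file name are assumptions, not something the package dictates. The empty passphrase is also an assumption, chosen so that a scheduled task is not stopped by a prompt; it means the private key file itself must be kept safe.

```shell
# Sketch: create a passphrase-less key pair for an unattended backup task.
# Key type, comment and file name are assumptions chosen for illustration.
make_backup_key() {
    key_path=$1
    mkdir -p "$(dirname "$key_path")"
    # -q quiet, -t key type, -N "" empty passphrase, -C comment, -f output file
    ssh-keygen -q -t ed25519 -N "" -C "rsync-backup" -f "$key_path"
    echo "Public key to paste on the server: $key_path.pub"
}

# Example call (hypothetical file name):
#   make_backup_key "$HOME/.ssh/id_ed25519_nas"
```

Note that ssh-keygen asks before overwriting an existing key file, so pick a file name that is not in use yet.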


The setup for the TrueNAS Server (SSH)
  1. Enable and start the SSH service
  2. Create a new user account (or use any existing account)
  3. Make sure to assign them a real home directory (cannot use "/nonexistent"), and give this directory read, write, execute (rwx) permissions only for User; not Group nor Other
  4. Copy + paste the public key (previously generated) in the "SSH Public Key" form (or upload it to the server)
  5. After saving these changes, double-check that the User's .ssh hidden folder has read, write, execute (rwx) permissions only for User; not Group nor Other
  6. This user will need read and write permissions for the dataset / directory to be sync'd on the server
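
If you prefer to check steps 3 and 5 from a shell instead of the GUI, the permissions boil down to something like this sketch. It is demonstrated on a throwaway directory; on the NAS the argument would be the user's real home directory. The helper name lock_down_home is made up, and restricting authorized_keys to mode 600 is common OpenSSH practice that I am assuming here, not something stated in the steps above.

```shell
# Sketch: restrict a home directory and its .ssh folder to the owner only.
lock_down_home() {
    home_dir=$1
    mkdir -p "$home_dir/.ssh"
    touch "$home_dir/.ssh/authorized_keys"
    chmod 700 "$home_dir" "$home_dir/.ssh"        # rwx for User only
    chmod 600 "$home_dir/.ssh/authorized_keys"    # rw for User only (assumption)
}

# Demonstration on a temporary directory (remove it afterwards with rm -rf).
demo_home=$(mktemp -d)
lock_down_home "$demo_home"
ls -ld "$demo_home" "$demo_home/.ssh" "$demo_home/.ssh/authorized_keys"
```

The listing should show no permission bits for Group or Other on any of the three entries.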


Set up the rsync command and options to be run when invoked on the client (Windows)
  1. Create a .bat file inside your hidden .bin folder with the following template, change it to reflect your preferences, and name it something like rsync_to_nas.bat
    • Code:
      TIMEOUT /T 15 /NOBREAK > NUL
      
      DEL /F /Q "%APPDATA%\cwrsync\rsync_to_nas_log.txt"
      
      rsync -v -a -H -h --inplace --no-whole-file --delete-delay --info=BACKUP,COPY,DEL,REMOVE,SKIP,STATS --log-file="/cygdrive/c/Users/winnielinnie/AppData/Roaming/cwrsync/rsync_to_nas_log.txt" /cygdrive/c/Users/winnielinnie/ winnielinnie@192.168.0.100:/mnt/mainpool/homebackups/winnielinnie/
      • The first line waits 15 seconds (which might be ideal if invoking this script upon waking up the computer)
      • The next line clears the log file, since rsync appends to the log, which can keep growing in size
      • The following rsync options I use:
        • -v: be verbose
        • -a: archive mode
        • -H: treat hard-links as hard-links
        • -h: use human-readable numbers
        • --inplace: do not make copies of files being updated (very important for ZFS destinations)
        • --no-whole-file: forces the delta-transfer algorithm; this is already the default for network transfers such as rsync over SSH, but still nice to write out explicitly
        • --delete-delay: as fast as --delete-during, but files that are missing from the source are removed from the destination only after the transfer has finished
        • --info: what type of information to include in the log file
        • --log-file: where the log file is to be stored
      • The source and destination paths can work with variables or the "/cygdrive/c/" format
      • Pay close attention to the trailing slash ( / ) for the source and destination
      • --delete does not need to be given separately, since --delete-delay implies it
      • The username on your Windows 10 computer does not need to match the user account name on your TrueNAS server. I just used "winnielinnie" for both; it's not required.
      • The log file can be really cluttered if directly rsync'ing everything in your User's home folder, since it contains some locked and untouchable files.
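
One way to cut down that clutter is to skip the locked files with --exclude. The sketch below only builds and prints the command line so it can be reviewed first; it does not run rsync. The three patterns are hypothetical examples of locked or throwaway profile files and would need adjusting to your own machine, and the helper name build_rsync_cmd is made up.

```shell
# Sketch: print an rsync command line with exclude patterns (nothing is run).
# The exclude patterns are hypothetical examples, not a complete list.
build_rsync_cmd() {
    src=$1
    dst=$2
    set -- rsync -v -a -H -h --inplace --no-whole-file --delete-delay \
        --exclude='NTUSER.DAT*' \
        --exclude='ntuser.dat.LOG*' \
        --exclude='AppData/Local/Temp/' \
        "$src" "$dst"
    printf '%s\n' "$*"
}

build_rsync_cmd "/cygdrive/c/Users/winnielinnie/" \
    "winnielinnie@192.168.0.100:/mnt/mainpool/homebackups/winnielinnie/"
```

Adding --dry-run to a real invocation is a cheap way to see what would be transferred or deleted before letting it loose on the backup.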


Set up the rsync task using Windows Task Scheduler
  1. Start Menu > Task Scheduler > Task Scheduler Library
  2. Create Task
    • Under "General"
      • Name: Rsync to NAS
      • Select "Run whether user is logged on or not"
    • Under "Triggers"
      • New > Choose a preset or select your own custom schedule
    • Under "Actions"
      • New > Start a program
      • Program/script: Browse to the rsync_to_nas.bat file
    • Under "Conditions" and "Settings"
      • You can choose to wake the computer up, if desired and supported
      • You can choose to allow this task to be manually run, on-demand
      • Do NOT enable the option to "only run if the following network connection is available". It causes issues (or just won't even start the task.)

Read over the options and steps carefully, and remember this is just a template. You can have fun with it and do your own unique spins and implementations. This should theoretically work with cygwin + rsync (though cwRsync is basically just a bundled version of it, and doesn't even require a separate cygwin install.) This should also work with acrosync, though I don't recommend it, as it appears to be abandonware.

I haven't had time to really review my post, so expect me to go over it again and make corrections or add new information.
 
Joined
Oct 22, 2019
Messages
3,641
> As far as I know, Rsync in daemon mode on the TrueNAS does not offer any authentication, right?
It does, but it's rudimentary authentication, and the TrueNAS GUI doesn't even offer such options. You have to manually enter them under "Auxiliary Parameters." It works, but it's less secure. However, there is a noticeable performance increase compared to SSH. Regardless, SSH is still very fast, much more secure, and doesn't require a daemon to run on the server.


> How do you restrict your rsync users from logging in via SSH (and probably using some newly-found privilege escalation bug)?
Restrict users from SSH? You need a user account with SSH access in order to connect to the server and transfer the files (over an encrypted connection.) You cannot restrict the user from using SSH, while also requiring SSH to use rsync over the network. Unless I misread that question?

EDIT: Correction! Change the user account's Shell to "scponly". Apparently, it still allows rsync transfers, not just scp. :smile: If the user tries to log in, they will receive this error: "WinSCP: this is end-of-file:0"

Now the user can still use rsync, but is not allowed access to a terminal session via ssh.


> (for not-every-day-to-day, but regularly, to get files that changed without updating times or growing, like TrueCrypt Containers)
Is that a hard requirement for you? Since you can always disable the option to "preserve modification times" (at least it's an option with VeraCrypt, and I'm hoping you've switched to VeraCrypt.) The --checksum option can really slow things down, especially for larger files.
 

troudee

Wow, that's amazing! Thank you very much for all your effort!! :smile:

I'll try and do a few experiments, especially if the scponly works and does not negate --inplace/--no-whole-file.

You write that the rsync-ssh user on the NAS needs a home directory. Is that needed for the SSH key only, or for some more magic I do not understand yet?

> (for not-every-day-to-day, but regularly, to get files that changed without updating times or growing, like TrueCrypt Containers)
> Is that a hard requirement for you? Since you can always disable the option to "preserve modification times" (at least it's an option with VeraCrypt, and I'm hoping you've switched to VeraCrypt.) The --checksum option can really slow things down, especially for larger files.
Yes, it's not only TrueCrypt (or VeraCrypt), sometimes my users have to set the times of certain files to certain dates for various reasons. Since that happens not very often, the --checksum variant is not thought to be run frequently, just now and then.
 
Joined
Oct 22, 2019
Messages
3,641
> Is that needed for the SSH key only, or for some more magic I do not understand yet?
Yup! You got it right: for the SSH key, which is required to run this as an automated task, otherwise it will prompt for the user's passphrase (which isn't feasible when running it in the background as a routine.)
 

troudee

> Yup! You got it right
Great. One step closer to a working system!

A question about your rsync command options: You are using -H and -a. The latter is short for -rlptgoD, and -D is short for
Code:
     --devices               preserve device files (super-user only)
     --specials              preserve special files


Are the -H --devices --specials that useful in the Windows world? Do "device files" and "special files" (whatever that means) even exist there? And what about the hard links? Doesn't that have the risk of not including something that is hardlinked in a user's folder but lies outside it in reality? As far as I understand, if I have this structure...
Code:
C:\
    Users\
        troudee\
            stuff\
                cat-photo-1.jpg
                [...]
                cat-photo-4738947.jpg
                veryImportantFile.txt
            Documents
                veryImportantFile.txt (Hardlink to ../stuff/veryImportantFile.txt)

...and take a rsync backup of C:\Users\troudee\Documents only, not using -H, it will generate a copy of veryImportantFile.txt because it just follows the hard link. When I use -H, it will only create a hard link of the file in the backup, pointing nowhere?
 
Joined
Oct 22, 2019
Messages
3,641
> Are the -H --devices --specials that useful in the Windows world? Do "device files" and "special files" (whatever that means) even exist there?
I have a mix of Linux and Windows computers, and sort of use a similar format across the board. That's why I don't recommend you use my template "as is" since everyone has their own unique setups and needs. :smile: As for the other options, they're all "inclusive" when using the -a (archive) option. It's an old habit of mine to use -a as a "de facto" option for my rsync tasks. Good catch, though, since some of the included options are redundant or meaningless, especially for a Windows system.

> And what about the hard links? Doesn't that have the risk of not including something that is hardlinked in a user's folder but lies outside it in reality?
You continue with...
> ...and take a rsync backup of C:\Users\troudee\Documents only, not using -H, it will generate a copy of veryImportantFile.txt because it just follows the hard link. When I use -H, it will only create a hard link of the file in the backup, pointing nowhere?
Bingo! You've once again illustrated why my template cannot be used ubiquitously for all users. :tongue: Nearly all of my rsync tasks include an entire "user home" or complete contents of a file-system that resides on another device or partition, such as a USB drive.
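
One nuance on the hard-link question, offered as a hedged aside: a hard link is a second name for the same data, not a pointer to another path, so it cannot dangle. As far as the rsync man page describes it, -H only links together names that are both inside the transfer set; a file whose other name lies outside it should simply be copied as a regular file. The sketch below (plain shell, no rsync, in a throwaway directory, reusing the file names from the example) shows the "two names, one inode" behaviour. The helper inode_of is made up.

```shell
# Sketch: two hard-linked names share one inode; neither "points" to the other.
demo=$(mktemp -d)
mkdir -p "$demo/stuff" "$demo/Documents"
printf 'important\n' > "$demo/stuff/veryImportantFile.txt"
ln "$demo/stuff/veryImportantFile.txt" "$demo/Documents/veryImportantFile.txt"

# Print the inode number of a file (first field of `ls -i`).
inode_of() { ls -i "$1" | awk '{print $1}'; }

inode_stuff=$(inode_of "$demo/stuff/veryImportantFile.txt")
inode_docs=$(inode_of "$demo/Documents/veryImportantFile.txt")
echo "inode under stuff/: $inode_stuff, inode under Documents/: $inode_docs"

# Removing one name leaves the data reachable through the other name.
rm "$demo/stuff/veryImportantFile.txt"
content_after=$(cat "$demo/Documents/veryImportantFile.txt")
echo "content still readable via Documents/: $content_after"

rm -rf "$demo"
```

It is still worth testing with your own rsync build what ends up on the NAS when only Documents is synced with -H.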

You seem to have a good grasp on this, and you should definitely "make it your own". That's why I ended my post with,
> Read over the options and steps carefully, and remember this is just a template. You can have fun with it and do your own unique spins and implementations.


Every user has their own unique setup, and so treat my step-by-step guide as a rough idea of how you can get it to work with your TrueNAS box. I can vouch that what I wrote above works as expected with cwRsync and (previously) acrosync, on Windows 7 and 10. I'm willing to bet rsync + cygwin shouldn't be much different. Best of luck, @troudee. If anything can be clarified, please let me know. If you come across any tricks or shortcuts, share them. I'm constantly trying out new things, myself. :cool:
 

troudee

Thank you very much, I will try it! :smile:
I hope you don't mind me asking all that stuff, but I tend to make plans that grow larger and larger and then at the end, when I read the last bit of documentation ("what does that mean, --debug-disable-mirror? Why would an elevator have a mirror? ... Wait ... This is a braking system for cars???") I stumble over some tiny detail and realize that my mental model of the whole thing has been wrong all the time. :grin: So I try to ask the questions as early as possible.
 

troudee

I've tested it with the scponly and a cygwin-rsync now. Created a pseudorandom 1GB file and synced it, then manipulated the file (rename or not, manipulate content or not, old or new timestamp) and then synced again (datetime-based command or checksum-based command).

Because I am lazy when it comes to thinking, I did the test for all 16 combinations of rename-or-not/manipulate-content-or-not/old-or-new-timestamp/timebased-checksumbased.

What shall I say, it always performed the way I thought it would, so I would count that as a success. :smile: :smile:
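
For anyone who wants to repeat this, the test matrix can be enumerated with four nested loops, roughly as in this sketch. It only prints the 16 cases; the actual file manipulation and the rsync runs are left out, and the labels are made up for illustration.

```shell
# Sketch: enumerate the 2 x 2 x 2 x 2 test matrix described above.
count=0
for name in keep-name rename; do
    for content in same-content changed-content; do
        for mtime in old-timestamp new-timestamp; do
            for mode in time-based checksum-based; do
                count=$((count + 1))
                printf '%2d: %s / %s / %s / %s\n' \
                    "$count" "$name" "$content" "$mtime" "$mode"
            done
        done
    done
done
echo "total combinations: $count"
```

Each printed line would correspond to one "prepare the file, then sync" run.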

These are my two commands:

Code:
# Slower variant: select changed files by checksum
# (written as a bash array so that each option can carry a comment)
opts=(
    --verbose           # increase verbosity
    --checksum          # skip based on checksum, not mod-time & size
    --recursive         # recurse into directories
    --times             # preserve modification times
    --inplace           # update destination files in-place
    --no-whole-file     # always use the delta-transfer algorithm (negates --whole-file)
    --delete            # delete extraneous files from dest dirs
    --delete-delay      # find deletions during, delete after
    --progress          # show progress during transfer
    --stats             # give some file-transfer stats
    --human-readable    # output numbers in a human-readable format
    --fuzzy             # find similar file for basis if no dest file
    --log-file="<...>"
)
rsync "${opts[@]}" "<SRC-DIR>/" "<USER>@<HOST>:<DST-DIR>/"


Code:
# Faster variant: select updated files by their size and modification time
# (written as a bash array so that each option can carry a comment)
opts=(
    --verbose           # increase verbosity
    --recursive         # recurse into directories
    --times             # preserve modification times
    --inplace           # update destination files in-place
    --no-whole-file     # always use the delta-transfer algorithm (negates --whole-file)
    --delete            # delete extraneous files from dest dirs
    --delete-delay      # find deletions during, delete after
    --progress          # show progress during transfer
    --stats             # give some file-transfer stats
    --human-readable    # output numbers in a human-readable format
    --fuzzy             # find similar file for basis if no dest file
    --log-file="<...>"
)
rsync "${opts[@]}" "<SRC-DIR>/" "<USER>@<HOST>:<DST-DIR>/"
 