RSYNC Access Mode Question and RSYNC Issues

j.lanham · Explorer · Joined Aug 25, 2021 · Messages: 68
I've finally found the Archive flag that allowed me to rsync at least somewhat successfully to our new server. There are a couple of issues, though, that I was hoping to get help with.

What exactly does Access Mode under the RSYNC module do? Is it the access mode from the source to the destination? I.e., if I change it from RW (which is how I set it up initially) to Read, will it no longer be able to write to the target server? If true, and the module is the target, I don't understand why the default is Read for rsync modules.

Even though I have it running as root on the target server, my full rsync last night failed with the following errors on only two different datasets.

rsync: [generator] failed to set permissions on "<full file path>" (in <Source Server>): Operation not permitted (1)

and

rsync: [receiver] failed to set permissions on "<full file path>" (in <Source Server>): Operation not permitted (1)

Does this indicate that rsync is trying to set permissions on the source server, or that it can't set them FROM the source server?

Is there anything I can do to correct it to make sure I get a full sync including permissions?
 
Joined Oct 22, 2019 · Messages: 3,641
What exactly does Access Mode under the RSYNC module do?
The path you have selected (on the left-hand side): do you want to allow clients to PUSH to it ("write"), to PULL from it ("read"), or allow BOTH ("write" and "read")?

Say for example a client will only ever use this path (on your NAS server) as a "backup destination". This is the most common use of clients backing up to your NAS server. In that case, you would set the Access Mode to "Write Only".
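
Under the hood, the module's GUI fields end up as an entry in the rsync daemon's rsyncd.conf. A minimal sketch of a "Write Only" backup module (module name and path are hypothetical):

# Clients may push (upload) into this module, but may not pull (download) from it.
[backups]
    path = /mnt/tank/backups
    read only = false
    write only = true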


why the default is Read for rsync modules.
No idea why. In my opinion, the default should be "Write and Read" or "Write Only".


Is there anything I can do to correct it to make sure I get a full sync including permissions?
Can you elaborate? Are you trying to back up from your TrueNAS server to another location? Is this other location Unix/Linux or Windows? Or rather, are you trying to back up from some client PC to your TrueNAS server?
 

j.lanham

Can you elaborate? Are you trying to back up from your TrueNAS server to another location? Is this other location Unix/Linux or Windows? Or rather, are you trying to back up from some client PC to your TrueNAS server?
This is a backup sync to another server. We have two TrueNAS servers set up: one that's our primary file server, and one we just put in to have an onsite live backup in case the primary craters. I set up the backup server, in terms of datasets, exactly like the primary server. The rsync is a push from the primary to the backup.

The source server is the primary file server that is doing a push rsync to the backup server.

If you set the module to write only, how does it determine what has changed between the last sync and the current one, if it can't read from the backup server to compare files, etc.?
 
Joined Oct 22, 2019 · Messages: 3,641
This is a backup sync to another server.
How is this other server initiating the rsync task? Rsync Modules do not initiate anything.

In case there's confusion, Rsync Tasks and Rsync Modules are completely unrelated in TrueNAS.


The rsync is a push from the primary to the backup.
How? Did you configure an Rsync Task to do this? If so, then you're not using the Rsync Modules. (Those are used differently.)


If you set the module to write only, how does it determine what has changed between the last sync and the current one, if it can't read from the backup server to compare files, etc.?
"Write Only" and "Read Only" when concerning an Rsync Module is not about "read-write" permissions. They only refer to the "direction" of the transfers. (Think of it as "Incoming" and "Outgoing".)

But this may all be moot because of what I explained above.
 

j.lanham

How? Did you configure an Rsync Task to do this? If so, then you're not using the Rsync Modules. (Those are used differently.)
There is an rsync task on the primary initiating a module sync with the backup server. The backup server has the module set up on it.
 
Joined Oct 22, 2019 · Messages: 3,641
The backup server has the module set up on it.
In that case, you can set it to "Write Only" on the backup server's Module, and make sure the username or UID set on the source side has full permissions for the path on the source and destination side. (Or run it as root.)
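
On the module side, that identity maps to the uid/gid parameters of the underlying rsyncd.conf entry (module name and path hypothetical). The daemon performs its file operations as this user, so preserving arbitrary owners and modes from the source generally requires root:

[backups]
    path = /mnt/tank/backups
    # If unset, a root-run daemon typically drops to an unprivileged user,
    # which produces "failed to set permissions ... Operation not permitted".
    uid = root
    gid = wheel

Alternatively, if preserving that metadata doesn't matter, adding --no-perms --no-owner --no-group to the task's extra rsync options sidesteps the error, at the cost of not replicating it.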

(It can help to show your setup.)
 

j.lanham

(It can help to show your setup.)
These are the rsync and corresponding module setups. The rsync setup is in two screenshots because of screenshot limitations.
 

Attachments

  • Truenas setup source screen 1.png
  • Truenas setup source screen 2.png
  • Truenas module setup target screen.png
Joined Oct 22, 2019 · Messages: 3,641
With the above, the Rsync Task still fails with the same error?

EDIT: As an aside, why are you using Rsync for onsite backups to another TrueNAS server? Why not ZFS replication?
 

j.lanham

With the above, the Rsync Task still fails with the same error?
Yes
EDIT: As an aside, why are you using Rsync for onsite backups to another TrueNAS server? Why not ZFS replication?
That would be ideal; however, the setup for the SSH connection isn't very well documented. For instance, it implies that the keys for the sender are set up on the sender, not the receiver, whereas as I understand it the key is used to log into the receiver, which doesn't have the private key to validate the public key against. Things along those lines.
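
For reference, my understanding of the usual OpenSSH flow (hostnames hypothetical): the private key stays on the sender, and the receiver checks the connection against the public key only:

# On the sender (primary): generate the key pair; the private key stays here.
ssh-keygen -t rsa -b 4096 -f ~/.ssh/replication -N ""

# Install ONLY the public key on the receiver (backup):
cat ~/.ssh/replication.pub | ssh root@backup-server 'cat >> ~/.ssh/authorized_keys'

# The sender then authenticates by proving it holds the matching private key:
ssh -i ~/.ssh/replication root@backup-server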

If I moved to ZFS replication, would I have to start with a clean dataset on the receiver?
 

j.lanham

EDIT: As an aside, why are you using Rsync for onsite backups to another TrueNAS server? Why not ZFS replication?
Thank you for all of your advice. I finally finished setting up the ZFS replication, and it worked a treat. One final question: once you've replicated the snapshot, how do you turn it into a read-write dataset? I know the default is to mark it read-only, which I did on the first replication just for simplicity. I know there is a way to mark it read-write, but there's not a whole lot of information out there, it seems.
 
Joined Oct 22, 2019 · Messages: 3,641
If I moved to ZFS replication, would I have to start with a clean dataset on the receiver?
Yes, which you already tried yourself. (Keep in mind you must always have a "common" snapshot between the source and destination in order to send incremental replications.)


One final question: once you've replicated the snapshot, how do you turn it into a read-write dataset?
Leave it as "Read-Only". Any dataset which is a destination for replications should always remain "Read-Only" and never modified, unless you switch to it as your new "source" (and something else will become the new "backup".)
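
For completeness, the property itself is a one-liner (pool/dataset names hypothetical), but only flip it if the backup is being promoted to the new source:

zfs set readonly=off backup/Data    # promote the replica to read-write
zfs set readonly=on backup/Data     # re-protect it while it remains a backup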
 

j.lanham

Yes, which you already tried yourself. (Keep in mind you must always have a "common" snapshot between the source and destination in order to send incremental replications.)
One question about the "common" snapshots. How many snapshots do you HAVE to retain on the backup? The question becomes: if you delete the initial snapshot from the backup dataset, does it remove all of the data from the backup dataset?

Leave it as "Read-Only". Any dataset which is a destination for replications should always remain "Read-Only" and never modified, unless you switch to it as your new "source" (and something else will become the new "backup".)
Thanks for the recommendation. I really appreciate it.
 
Joined Oct 22, 2019 · Messages: 3,641
One question about the "common" snapshots. How many snapshots do you HAVE to retain on the backup?
The only snapshot you "have" to retain on the destination side is the most recent "common" snapshot. Otherwise, an incremental replication is not possible. This is why the "retention policy" is important, and you have to make sure you don't inadvertently prune such snapshots.

The only time a snapshot can be destroyed from either side is if there is a more recent snapshot shared between the two sides. It is up to that point in time that the older snapshots can be safely deleted. (Whether manually or automatically. Keep in mind that if done "automatically", you need to confirm that there are safeguards that don't allow common snapshots to be destroyed.)
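
In plain zfs terms, an incremental replication anchors on that common snapshot (pool, dataset, and snapshot names hypothetical):

# @monday exists on BOTH sides; @friday is new on the source:
zfs send -i tank/Data@monday tank/Data@friday | ssh root@backup-server zfs recv backup/Data

# If @monday is pruned from either side first, the incremental stream has
# nothing to anchor to, and a full replication from scratch is required.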


The question becomes: if you delete the initial snapshot from the backup dataset, does it remove all of the data from the backup dataset?
Not at all. In fact, you can delete all snapshots from the destination side, and you will still have all the data intact that existed in the filesystem at the moment-in-time of the latest snapshot. ("Latest", as in before you destroyed all snapshots.) However, incremental replications will be impossible now, and you'll have to start all over from scratch if you do this.

The "initial" snapshot simply represents the first time a full replication was transferred. There's nothing special about it in of itself. (It's useful to keep if there exists important historical data, or if you want a "safety net" in which you can use it as an "emergency" common snapshot, just in case something happens later down the road. However, this "emergency" common snapshot will still need to replicate all the differences since that specific day in the past, so it might still be a large incremental transfer. It also requires that this "emergency" common snapshot exists on the source side as well.
 

j.lanham

The only snapshot you "have" to retain on the destination side is the most recent "common" snapshot. Otherwise, an incremental replication is not possible. This is why the "retention policy" is important, and you have to make sure you don't inadvertently prune such snapshots.
Do you know what the recommended or suggested retention policy would be for the backup and the primary systems, in terms of making sure there's a common snapshot for continued replication of changes, as opposed to a full replication? I would assume that retaining all snapshots on the target system would eventually take up all of the disk space. Right now I have the production system set for a week's retention, and to save the snapshot if it hasn't been replicated yet. The replication task is set to not expire any datasets on the backup server. I know I'm asking some basic questions; I have a vague understanding of how snapshots work, but I'm trying to get my head around the details so I can understand exactly what's going on. I do appreciate all of your advice.
 
Joined Oct 22, 2019 · Messages: 3,641
I would assume that retaining all snapshots on the target system would eventually take up all of the disk space.
Snapshots consume zero space, until you start deleting files, in which case the snapshot "holds onto" the space that is being consumed by "deleted" files.

This is why (me personally) I would recommend a long-lived retention policy, especially for "backup" and "archival" datasets, on both ends. A "year" is a good minimum.

If you really believe you'll be creating and deleting a lot of data throughout the life of this dataset, then perhaps a shorter retention policy is more applicable.

EDIT: To illustrate this using an extreme example: If you have a dataset where you keep creating new files and data, and it grows and grows and grows, and you're taking hundreds of snapshots every day, and by the end of the year you've accumulated 73,000 snapshots, then all these snapshots will still consume no additional space! (As long as you've never deleted any files on the dataset.)

* Technically not "no additional space", since there are kilobytes' worth of metadata involved. Besides, 73,000 snapshots would be quite a long list to parse through. :tongue:
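
You can watch this on either server; the USED column shows the space each snapshot uniquely "holds onto" (dataset name hypothetical):

zfs list -r -t snapshot -o name,used,referenced tank/Data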
 

j.lanham

Snapshots consume zero space, until you start deleting files, in which case the snapshot "holds onto" the space that is being consumed by "deleted" files.
I suppose this is a late time to explain, but we have a single zvol made up of multiple disks. This contains our system data set, under which we built multiple data sets for our data; essentially, each data set is a share. When we started coming up with the idea of a live backup server, we set up another dataset and moved all of our other datasets under it, because as I understand it, you can't replicate a system data set to another system data set. So our Data dataset contains all of our other datasets. This is what we replicated (recursively) to the target server. I'm just explaining this because it gives a lot of context to my questions.

If snapshots are removed automatically, through the replication job's retention periods, does it remove the snapshots for all of the underlying datasets, or does it only apply to the master dataset?

We set up the server initially as a test bed, and then when COVID hit, we had to push it into production in a hurry.
 
Joined Oct 22, 2019 · Messages: 3,641
I suppose this is a late time to explain, but we have a single zvol made up of multiple disks. This contains our system data set, under which we built multiple data sets for our data; essentially, each data set is a share. (...) So our Data dataset contains all of our other datasets.
This is hard to follow, and I think you're using the wrong terminology. Do you mean "zpool" when you write "zvol"? Do the "multiple disks" refer to your vdev(s)? Do you mean "root dataset of the pool" when you write "system data set"?


If snapshots are removed automatically, through the replication job's retention periods, does it remove the snapshots for all of the underlying datasets, or does it only apply to the master dataset?
I'm not entirely sure about this. It's a question for iXsystems. The underlying software used is their in-house "zettarepl". The behavior of pruning snapshots, and whether it will happen recursively into your nested children, may in fact be based on whether you only have a single dataset selected as a destination, or if you select multiple different datasets.
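
One zfs detail that helps reason about it (names hypothetical): recursive snapshots share one name across every child dataset, and destroying with -r prunes that name from all children as well:

# One snapshot per child dataset, all sharing the same name:
zfs snapshot -r tank/Data@manual-2021-10-07

# -r on destroy removes that snapshot from tank/Data and every child:
zfs destroy -r tank/Data@manual-2021-10-07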
 

j.lanham

This is hard to follow, and I think you're using the wrong terminology. Do you mean "zpool" when you write "zvol"? Do the "multiple disks" refer to your vdev(s)? Do you mean "root dataset of the pool" when you write "system data set"?
Yes, I am. I did mean zpool. The root dataset in the zpool is marked as the system dataset. I'm still trying to get my head around the terminology.

I'm not entirely sure about this. It's a question for iXsystems. The underlying software used is their in-house "zettarepl". The behavior of pruning snapshots, and whether it will happen recursively into your nested children, may in fact be based on whether you only have a single dataset selected as a destination, or if you select multiple different datasets.
I'll try to see if I can. Yes, the target of the replication is a single dataset. The backup server is set up, in terms of the root dataset of the pool, exactly like the production machine.
 
Joined Oct 22, 2019 · Messages: 3,641
I miswrote something. I meant to write:

(...) may in fact be based on whether you only have a single dataset selected as a source, or if you select multiple different datasets.
 