rsync issues when building new server: ~tmp~ dirs & endless syncs

Status
Not open for further replies.
Joined
Jul 15, 2018
Messages
4
Hi forum. Long time listener, first time caller.

I am setting up a brand new FreeNAS 11.1 box and migrating all data and jails from my ageing 9.3 install. I couldn't do zfs snapshot transfer (apparently I needed 9.10 or later on the older server, I'm not sure why) so I switched on rsyncd on the new server and set up an rsync task on the old one to push my data over.

The task is set up with the following options ticked: recursive, preserve permissions, delay updates. It uses the rsync module with a module name that exists on the newer server. It sends stuff over as root.

At first, it seemed to work: stuff started copying. I gave it a couple of days to think about it. That's when things went a bit weird:
  • The old server has ~4 TB on it. When I checked, the new drive had ~8 TB of data! On closer inspection, this is because there's dupes of almost all the files in .~tmp~ directories. This is presumably linked to the "delay updates" setting in the rsync task.
  • If I look around the new server's filesystem, the transfer looks about complete. I can see lots of files that should be there. But there could be some missing, of course.
  • If I do a manual rsync command line from the old server to the new (using ssh for transport), it still starts copying files, even if they appear to be present on the new server. I did
    Code:
    rsync -arvz --progress
    .
  • If I do md5sums on the old and new files, they have the same signatures on both servers. Yet rsync insists on pushing fresh copies.
I think I understand why the .~tmp~ directories are there. But I don't understand why the rsync task and the rsync command line are insisting on pushing files that appear to already be present on the destination. Something to do with permissions, maybe? Or something else? Any advice on what I can do to debug this / get confident my files are transferred OK so I can get on with unhooking my old server? Thanks in advance!
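For anyone debugging the same symptom: by default rsync does not compare checksums. Its "quick check" skips a file only when both size and modification time match on the two sides, so identical md5 sums do not guarantee a skip. The options listed for the task above do not mention preserving times, so differing mtimes are one possible cause here; that is an assumption, not something confirmed in this thread. A minimal sketch of the comparison, using a hypothetical helper name (`quick_check_matches`) and throwaway files:

```sh
# Approximates rsync's default quick check: two copies count as "the same"
# only if both size and modification time match. rsync looks at metadata only;
# this sketch uses wc -c for the size purely for portability.
# quick_check_matches is a hypothetical helper name, not an rsync feature.
quick_check_matches() (
    a=$1 b=$2
    size_a=$(( $(wc -c < "$a") ))
    size_b=$(( $(wc -c < "$b") ))
    [ "$size_a" -eq "$size_b" ] || exit 1
    # -nt / -ot compare mtimes; if neither file is newer, the mtimes are equal.
    if [ "$a" -nt "$b" ] || [ "$a" -ot "$b" ]; then
        exit 1
    fi
    exit 0
)

# Demo on throwaway files: identical content, different mtimes.
demo=$(mktemp -d)
printf 'same bytes\n' > "$demo/old"
printf 'same bytes\n' > "$demo/new"
touch -t 201801010000 "$demo/old"
touch -t 201807150000 "$demo/new"
if quick_check_matches "$demo/old" "$demo/new"; then
    echo "quick check: match, rsync would skip"
else
    echo "quick check: mismatch, rsync would transfer"
fi
rm -rf "$demo"
```

If the mtimes turn out to differ, a run that preserves times (`-t`, which `-a` includes) should bring them into line so that later runs can skip unchanged files.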
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Hi forum. Long time listener, first time caller.
Funny.
delay updates.
Nothing shows up until everything is done. I don't use that one because I want to see the data appear on the new system as it is copied.
On closer inspection, this is because there's dupes of almost all the files in .~tmp~ directories. This is presumably linked to the "delay updates" setting in the rsync task.
Absolutely correct, which is part of the reason I don't use it.
If I do a manual rsync command line from the old server to the new (using ssh for transport), it still starts copying files, even if they appear to be present on the new server. I did
You could have borked everything up by tampering with it while it was already doing a sync. The temp files are not cleaned up until the initial transfer completes successfully and partially copied files do not show up.
If I do md5sums on the old and new files, they have the same signatures on both servers. Yet rsync insists on pushing fresh copies.
Because it is a task in progress. The list of files to copy was created before the process started and it is ignoring your additional efforts. You just need to wait for the first rsync to finish or you will have to wipe the pool and start over to make sure you don't end up with a bunch of garbage data.

If you could list all the hardware details for the old and new system, that would be great. Here is an example:
https://forums.freenas.org/index.php?threads/updated-forum-rules-4-11-17.45124/

To transfer the data more quickly, when I was doing what you are doing a few years ago, I connected all the drives to a single system and imported the old pool into the new server alongside the new pool. That made it much easier to copy all the files (at high speed) from the old pool.
I had to have both chassis next to one another and use long SAS cables to reach from one system to the other, but I copied about 7TB of data over in a weekend where it was going to take a week or more to do it over the network.
 
Joined
Jul 15, 2018
Messages
4
You could have borked everything up by tampering with it while it was already doing a sync. The temp files are not cleaned up until the initial transfer completes successfully and partially copied files do not show up.
...
Because it is a task in progress. The list of files to copy was created before the process started and it is ignoring your additional efforts. You just need to wait for the first rsync to finish or you will have to wipe the pool and start over to make sure you don't end up with a bunch of garbage data.

Hmmm. But: on the new disk I have this structure:
Code:
dir/file
dir/.~tmp~/file

On the old disk I have:
Code:
dir/file
If I do md5sums on those files, they report as the same. When I do
Code:
rsync -arvz -e ssh dir/* root@newserver:/dir
...it starts to copy file over. What I'm confused about is why that happens, as (AFAIK) rsync doesn't store any state outside the filesystem, and the only state I think it should be probing (is file the same in both places) is true and should result in a noop. You say that "The list of files to copy was created before the process started" -- so you're saying some state is persisted between the transfer I tried to do via rsyncd and the ones I am trying to do via ssh? How?
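One way to see what rsync itself considers different, rather than guessing, is a dry run with itemized output (`-n` / `--dry-run` and `-i` / `--itemize-changes` are standard rsync options). Each output line starts with an 11-character code of the form `YXcstpoguax`; a `t` in the fifth position, for example, means the modification time differs. The sketch below decodes the common letters. The helper name `explain_itemize` is hypothetical, and the sample codes are illustrative, not output captured from this transfer.

```sh
# To produce the codes (nothing is copied because of -n):
#   rsync -ani -e ssh dir/ root@newserver:/dir/
# explain_itemize is a hypothetical helper that decodes one such code.
explain_itemize() (
    code=$1
    case $code in
        '<'*|'>'*) out=transfer ;;      # file data would be sent or received
        .*)        out=attrs-only ;;    # no data transfer, attributes may change
        *)         out=other ;;         # creation, hard link, or a message
    esac
    attrs=$(printf '%s' "$code" | cut -c3-8)   # the "cstpog" positions
    case $attrs in *+*) out="$out new" ;; esac
    case $attrs in *c*) out="$out checksum" ;; esac
    case $attrs in *s*) out="$out size" ;; esac
    case $attrs in *[tT]*) out="$out time" ;; esac
    case $attrs in *p*) out="$out perms" ;; esac
    case $attrs in *o*) out="$out owner" ;; esac
    case $attrs in *g*) out="$out group" ;; esac
    printf '%s\n' "$out"
)

explain_itemize '<f..t......'
explain_itemize '.f...p.....'
```

If files you believe are identical show only `t`, the data is probably fine and only the timestamps differ. Adding `--checksum` makes rsync compare content instead of size and mtime, at the cost of reading every file on both sides.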

I think the next step is to purge some of the .~tmp~ directories and see if it still does rogue copy operations. Failing that, I can blow it all away and do the entire copy again, although I'd rather avoid that if possible, and I'd like to understand what's going on.

If you could list all the hardware details for the old and new system, that would be great. Here is an example:
https://forums.freenas.org/index.php?threads/updated-forum-rules-4-11-17.45124/
New system is this, old system is a HP Microserver N40L with 8 GB RAM and a passive DC-DC PSU upgrade. Nothing in the hardware config should affect how rsync is working, though, right?

Edit to add -- new machine has all 6 drives in a single RAIDZ2 vdev. Old machine has 3x drives in a RAIDZ1.

To transfer the data more quickly, when I was doing what you are doing a few years ago, I connected all the drives to a single system and imported the old pool into the new server alongside the new pool. That made it much easier to copy all the files (at high speed) from the old pool.
I had to have both chassis next to one another and use long SAS cables to reach from one system to the other, but I copied about 7TB of data over in a weekend where it was going to take a week or more to do it over the network.
It's a good idea but sadly a non-starter for various reasons, not least of which is both old and new systems having fully utilised SATA ports.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
It's a good idea but sadly a non-starter for various reasons, not least of which is both old and new systems having fully utilised SATA ports.
SATA drives work perfectly on SAS controllers:
https://www.ebay.com/itm/Dell-H310-...0-IT-Mode-for-ZFS-FreeNAS-unRAID/162834659601
One of those and a set of cables:
https://www.ebay.com/itm/2x-Mini-SA...d-Breakout-Internal-Cable-3-Feet/372255907421
will handle 8 drives no problem.
You can get them for even less if you are willing to flash the firmware yourself.
I just copied the 9TB of data in my main pool to a backup pool last night in about 2 hours and 15 minutes because I had all the drives connected to the same SAS controller. Probably this weekend I will put the backup pool into a different chassis and set up a scheduled task to keep the two pools in sync over the network. I was pretty impressed with copying 9TB in 2 hours. I used zfs send | zfs receive and I know that was much faster than rsync would be, because I have done it using rsync before.
 
Joined
Jul 15, 2018
Messages
4
I'm sure the SAS card is very nice but I can't justify $55 on more hardware just to speed up a one-off transfer. I'm impatient but not that impatient. As for zfs send, as I said, when I tried to set up snapshot transfers using the FreeNAS UI it told me the source version (9.3) was too old. I don't think upgrading the old box is the right answer, just to try a different transfer option -- there's always a non-zero chance something goes wrong. I'll plow on with rsync for now.
 
Joined
Jul 15, 2018
Messages
4
Documenting this for future reference:

On the new server (hostname freenas), I have no .~tmp~ files under the path I am testing with:
Code:
root@freenas:/mnt/mainvolume/Media/Videos/TV/30 Rock/Season 01 # ls .~*
ls: No match.
Picking a file at random, I see the same md5sum for it on the old (hostname bran) and new servers:
Code:
root@freenas:/mnt/mainvolume/Media/Videos/TV/30 Rock/Season 01 # md5 1x01\ -\ Pilot.avi
MD5 (1x01 - Pilot.avi) = 3168ee2fb61c2f3a88d0eee9b185793c

[root@bran] /mnt/mainvolume/Media/Videos/TV/30 Rock/Season 01# md5 1x01\ -\ Pilot.avi
MD5 (1x01 - Pilot.avi) = 3168ee2fb61c2f3a88d0eee9b185793c
And yet, if I start an rsync pushing to the new server from the old, it attempts to send this file, for no reason I can fathom:
Code:
[root@bran] /mnt/mainvolume/Media/Videos/TV/30 Rock/Season 01# rsync -arvvz -e ssh -s * "root@192.168.0.109:/mnt/mainvolume/Media/Videos/TV/30 Rock/Season 01/" --progress
opening connection using: ssh -l root 192.168.0.109 rsync --server -svvlogDtprze.iLsfx  (7 args)
protected args: . "/mnt/mainvolume/Media/Videos/TV/30 Rock/Season 01/"  (2 args)
root@192.168.0.109's password:
sending incremental file list
delta-transmission enabled
1x01 - Pilot-thumb.jpg
		 32,067 100%   29.91MB/s	0:00:00 (xfr#1, to-chk=83/84)
1x01 - Pilot.avi
	 50,830,632  27%   16.08MB/s	0:00:08  ^CKilled by signal 2.


I'm more or less resigned to just doing the whole transfer again (with --delete to clear down the spurious .~tmp~ files, and this time just using the command line rather than the FreeNAS UI) but I sure am curious what's going on.
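If the goal is confidence that every file arrived intact, rather than spot-checking one file, a checksum manifest of each tree can be generated and diffed. A rough sketch, assuming it is run once per server against the same relative root: `hash_file` and `manifest` are hypothetical helper names, `.~tmp~` directories are deliberately excluded, and file names containing newlines are not handled.

```sh
# Print the MD5 of one file, using md5sum (Linux) or md5 -q (FreeBSD).
hash_file() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum < "$1" | cut -d' ' -f1
    else
        md5 -q "$1"
    fi
}

# Emit "hash  ./relative/path" for every regular file under $1,
# skipping anything inside a .~tmp~ directory, sorted by path.
manifest() (
    cd "$1" || exit 1
    find . -type f ! -path '*/.~tmp~/*' | LC_ALL=C sort |
    while IFS= read -r f; do
        printf '%s  %s\n' "$(hash_file "$f")" "$f"
    done
)

# Usage (paths are placeholders):
#   manifest /mnt/mainvolume/Media > /tmp/manifest.txt   # on each server
#   diff old-manifest.txt new-manifest.txt               # after copying one over
```

Hashing several terabytes reads every byte on both servers, so expect this to take hours; it can be run per subdirectory to check the tree in stages.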
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I'm sure the SAS card is very nice but I can't justify $55 on more hardware just to speed up a one-off transfer. I'm impatient but not that impatient. As for zfs send, as I said, when I tried to set up snapshot transfers using the FreeNAS UI it told me the source version (9.3) was too old. I don't think upgrading the old box is the right answer, just to try a different transfer option -- there's always a non-zero chance something goes wrong. I'll plow on with rsync for now.
It doesn't need to be a "one-off". I run my SATA drives from a SAS controller all the time. I think it is more reliable that way, but you're welcome to do it however you like.
The reason it said the source was too old is probably that the feature flags in the ZFS pool have changed. I know there were one, maybe two, ZFS version updates between 9.3 and now. You wouldn't be able to upgrade directly from 9.3 anyhow. You would have to go to 9.10 first and try zfs send from there; if that didn't work, you would need to upgrade to 11.1, but that would all be quite the pain. It might take you an hour or two...

The slow part of what you are doing is the network connection, though. If you are running a 1Gb network, that is going to be a choke point, but rsync will not fully utilize the network because it is designed to run in the background and leave system resources available to serve current requests. That is part of the reason it goes so slowly. That is why I was saying that doing the transfer in my system from local pools using send and receive took about 2 hours, where doing the same transfer with rsync took more than 8 hours and used more CPU resources.
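As a rough sanity check on the figures in this thread, the best-case time for a given amount of data at a given link speed is simple arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gb/s = 10^9 bits/s) and ignores protocol overhead, disk speed, and CPU limits, so real transfers will be slower. The function names are hypothetical.

```sh
# Best-case hours to move <terabytes> over a link of <megabits per second>.
transfer_hours() {
    awk -v tb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", tb * 1e12 * 8 / (mbps * 1e6) / 3600 }'
}

# Average rate in MB/s implied by moving <terabytes> in <seconds>.
implied_mb_per_s() {
    awk -v tb="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", tb * 1e12 / secs / 1e6 }'
}

transfer_hours 4 1000        # ~4 TB over gigabit Ethernet
implied_mb_per_s 9 8100      # 9 TB in 2 h 15 min
```

By this estimate, 4 TB needs roughly nine hours at gigabit line rate in the best case, and 9 TB in 2 hours 15 minutes works out to an average of about 1.1 GB/s.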
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I am on my 8th build and have migrated data from one FreeNAS system to another that many times. I have seen the problem you are experiencing and I tried to tell you why it happened, but you must have misunderstood my meaning.
delay updates.
When you use delay updates through the GUI, it copies everything into this temp folder that the GUI creates and doesn't move those files over to the directory they are supposed to be in until the rsync has completed successfully. If something else comes along and makes changes while it is working, or it gets interrupted, it can leave those temp folders / files hanging around with no cleanup script to eliminate them.
The old server has ~4 TB on it.
With that little bit of data, you could have it copied in a couple hours, if you would listen to advice.
I gave it a couple of days to think about it.
Instead, you are spending days and you are going to have to start over if you want to be sure everything is copied accurately.
I am not just throwing out random garbage here, I really am trying to tell you a way to save yourself days of time.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
there's always a non-zero chance something goes wrong. I'll plow on with rsync for now.
An upgrade would only affect your boot pool. If it worked, you would have the option to update the ZFS features. Once you do that, you would not be able to go back, but if you had a problem with the upgrade, before updating the ZFS features, you could always roll back to the previous boot environment.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
You say that "The list of files to copy was created before the process started" -- so you're saying some state is persisted between the transfer I tried to do via rsyncd and the ones I am trying to do via ssh? How?
I was talking about the one that was started through the GUI. When you start the rsync, it goes out and does a comparison and makes a list of the data to copy, then starts to work through the list. If you delete or move a file from the source directory while the transfer is in progress, the system will have an error (it keeps going) but it will tell you about the missing files that couldn't be copied. So if you kick off another rsync process before the first one is done, they don't know about each other and they can step on each other. Also, if you kill the rsync before it is done, it never cleans up those temp files, which are only created if you use the setting to delay updates.
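For leftover .~tmp~ directories like the ones described in this thread, it may be worth listing them and their sizes before removing anything. A cautious sketch, with a hypothetical helper name and a placeholder path; the delete step is left commented out on purpose and should only be run once no rsync is active against the tree.

```sh
# List every directory named .~tmp~ under $1 with its disk usage in KiB.
# -prune stops find from descending into a match once it has been reported.
list_tmp_dirs() {
    find "$1" -type d -name '.~tmp~' -prune -exec du -sk {} +
}

# Usage (path is a placeholder):
#   list_tmp_dirs /mnt/mainvolume
# To delete after reviewing the list, with no rsync running:
#   find /mnt/mainvolume -type d -name '.~tmp~' -prune -exec rm -rf {} +
```

Comparing the total against the extra space used on the new pool gives a quick indication of whether the temp directories account for the whole discrepancy.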
 