RSYNC failing at 90%+ sending from QNAP to FN8.3

Status
Not open for further replies.
Joined
Jan 13, 2013
Messages
3
I am trying to use Rsync to replicate one share on a Qnap TurboNAS 859 (4x2TB in RAID6) to a FreeNAS server elsewhere in the same building (on the same network) purely for backup purposes.

The Qnap is acting as the Time Machine target for a single OSX machine and it is this (hidden!) directory, currently containing about 1.5TB of data (the machine is used mainly for video editing), that I'm trying to sync. I have successfully set up an Rsync module on the FreeNAS machine and an Rsync job on the Qnap and things progress well (if slowly - I'm getting about 300Mbit/s only across what should be a gigabit link) until the transfer gets to something like 95% complete whereupon I get the following from the FreeNAS box:

Code:
Jan 13 04:47:49 backupnas rsyncd[90856]: rsync error: timeout in data send/receive (code 30) at io.c(137) [receiver=3.0.9]
Jan 13 04:47:49 backupnas rsyncd[90856]: rsync: connection unexpectedly closed (280 bytes received so far) [generator]
Jan 13 04:47:49 backupnas rsyncd[90856]: rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
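
For anyone wanting to track these failures across repeated attempts, here is a minimal Python sketch that pulls the exit code, role and message out of log lines like the three above. The regular expression and the field names are my own assumptions based only on these sample lines, not an official rsync log format.

```python
import re

# Pattern fitted to the "rsync error" lines quoted above; field names are illustrative.
ERROR_RE = re.compile(
    r"rsync error: (?P<message>.+?) \(code (?P<code>\d+)\) "
    r"at (?P<source>\S+)\((?P<line>\d+)\) \[(?P<role>\w+)=(?P<version>[\d.]+)\]"
)

def parse_rsync_error(log_line):
    """Return a dict of error fields, or None if the line is not an rsync error line."""
    match = ERROR_RE.search(log_line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["code"] = int(fields["code"])
    fields["line"] = int(fields["line"])
    return fields

LOG = """\
Jan 13 04:47:49 backupnas rsyncd[90856]: rsync error: timeout in data send/receive (code 30) at io.c(137) [receiver=3.0.9]
Jan 13 04:47:49 backupnas rsyncd[90856]: rsync: connection unexpectedly closed (280 bytes received so far) [generator]
Jan 13 04:47:49 backupnas rsyncd[90856]: rsync error: error in rsync protocol data stream (code 12) at io.c(605) [generator=3.0.9]
"""

# Keep only the lines that parsed as errors (the middle line is informational).
errors = [e for e in map(parse_rsync_error, LOG.splitlines()) if e]
for e in errors:
    print(e["code"], e["role"], e["message"])
```

The first error (code 30, from the receiver) is the timeout itself; the code 12 from the generator looks like a consequence of the connection then being closed.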


About an hour later I get an email from the Qnap that says (slightly obfuscated):

Server Name: MacNAS
IP Address: 172.22.x.x
Date/Time: 2013/01/13 05:48:36
Level: Error
[Remote Replication] MacNASBackup failed: rsync error: timeout in data send/receive (code 30) at io.c(137) [receiver=3.0.9]. Begin 3rd retry.

At this point transfers re-start, but it seems as though the Qnap starts from scratch rather than just completing the previously almost-complete transfer. After three retries the Qnap fails the sync:

Server Name: MacNAS
IP Address: 172.22.x.x
Date/Time: 2012/12/09 13:43:21
Level: Error
[Remote Replication] MacNASBackup failed: rsync error: timeout in data send/receive (code 30) at io.c(137) [receiver=3.0.9].

(obviously this was from a previous attempt - the current attempt hasn't finally failed yet, but this is the fourth time I've tried)

The Qnap has an option to use "RTRR" (so-called Real Time Remote Replication) to another server and I could use this instead of Rsync, except that while I can manually enter the path for the Time Machine directory in the Rsync setup, I can't enter it in RTRR (which uses a file browser which can't see the Time Machine directory).

The FreeNAS box is one I have built fairly recently and some initial problems turned out to be a dodgy CPU which booted fine but crashed under load (errors in the L1 cache IIRC). It has been replaced temporarily with a spare processor until I get around to swapping the original. Current spec. for the FreeNAS box is then:

  • AMD A8-3870 APU (was originally, and will be, an A4-3400)
  • Motherboard: Asus F1A75V-Evo
  • RAM: 16GB as 2x8GB DDR3-1600
  • Discs: 8x750G Seagate ST9750420AS 2.5" in caddies (4 to a 5.25" bay - thoroughly recommend this as a solution to more drives in less space) with empty bays for 8 more drives
  • Storage configured as one RAIDZ2 dev giving about 4.5TB(decimal) online
  • SATA: 6+1 onboard SATA3 ports (4 in use) and 3x 4-port SATA3 cards (cheap Startech ones) in PCIe x4 slots (4 ports on one card in use at the moment)
  • Network: Onboard Realtek 8111 because the nice Intel NIC missed the order and is coming in the next batch of stuff
  • FreeNAS: 8.3.0 (x64)
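
As a sanity check on the "about 4.5TB (decimal)" figure in the list above, here is a rough Python sketch of the usual RAID-Z capacity arithmetic. The helper name is made up, and it deliberately ignores ZFS metadata, padding and reserved space, so the real usable figure will be somewhat lower.

```python
def raidz_usable_bytes(disks, disk_bytes, parity):
    """Rough usable capacity of one RAID-Z vdev: data disks times disk size.

    Ignores ZFS metadata, padding and reserved space (an assumption for simplicity).
    """
    return (disks - parity) * disk_bytes

# 8 x 750GB (decimal) drives in RAIDZ2, i.e. two drives' worth of parity.
usable = raidz_usable_bytes(disks=8, disk_bytes=750 * 10**9, parity=2)
print(usable / 10**12)  # decimal terabytes
print(usable / 2**40)   # binary tebibytes
```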


The intervening network is 3x managed gigabit switches with gigabit fibre switch-switch links. This sector of the network was upgraded specifically to allow the NAS boxes (and the Mac) to talk at gigabit speeds.

Any thoughts gladly received. Backing up the Time Machine archive is the first thing, then there is a second QNAP with more general files that I wish to back up. This is a work setup, but I'm also considering how I can back up my home FreeNAS box, probably to another unit offsite. If I can get Rsync working on an internal network I'm hoping it's not too much extra work to get it going across t'internet :smile:

Thanks.

Martin.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I have no idea how to fix your exact issue, but I'll give you some info and some ideas...

Rsync does a checksum comparison between source and destination. This seems to ALWAYS be CPU bound, limiting people to, at best, 300Mb/sec (gee.. that's what you are getting!). I got the same speed doing a direct link between 2 quite powerful machines with 16GB+ of RAM each. The issue is Rsync is single threaded. One core will sit at 100% and the rest idle.

During my experimenting and testing rsync seems to have a timeout for a single file transfer. If one file takes too long the rsyncing of that file will be aborted. I assume this is to prevent a timeout from locking out rsync indefinitely. You may be able to get around this issue by manually copying the files to the exact same location on the destination, then doing an rsync.

However, from my experience (and someone disagreed with this assessment, so this is up for debate) rsync will recalculate all checksums again the next time it is run. Both the source and destination will calculate checksums, then any files that don't agree will be resent. No, the 2 machines don't seem to keep any kind of "database" of files, because they have no way of knowing if those files have been tampered with or corrupted, so the safest bet is to recalculate every time. Remember that really slow 300Mb/sec speed you were getting? That's how fast the checksums are calculated. So your rsyncs will still take hours and hours to complete, and your network traffic will be limited only to the changed files.
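
On the point that is up for debate: as far as I can tell from the rsync man page, without `-c`/`--checksum` rsync decides what to send using a "quick check" of file size and modification time, and only checksums every file up front when `-c` is given. Here is a minimal Python sketch of that idea. The function name is made up, and this is an approximation for illustration, not rsync's actual code.

```python
import os
import tempfile

def quick_check_differs(src_path, dst_path):
    """Approximate a size-and-mtime 'quick check': True means the file would be sent."""
    if not os.path.exists(dst_path):
        return True
    src, dst = os.stat(src_path), os.stat(dst_path)
    # Compare size and whole-second modification time only; contents are never read.
    return src.st_size != dst.st_size or int(src.st_mtime) != int(dst.st_mtime)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.bin")
    dst = os.path.join(tmp, "dst.bin")
    for path in (src, dst):
        with open(path, "wb") as fh:
            fh.write(b"x" * 1024)
    os.utime(src, (1_000_000, 1_000_000))
    os.utime(dst, (1_000_000, 1_000_000))
    same = quick_check_differs(src, dst)      # sizes and mtimes match
    os.utime(src, (2_000_000, 2_000_000))
    touched = quick_check_differs(src, dst)   # mtime changed, contents identical

print(same, touched)
```

If that reading is right, a repeat run over mostly unchanged files should cost a directory walk rather than a full read of every file, unless `-c` is in the job's options.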

I know what Time Machine is, but that's all. But if it does something like make a single big file and do stuff inside that big file, rsync will resend that file every time it is run since it is always changing. This would particularly suck if my previous paragraph is correct, because first the checksums will be calculated on both ends (hours and hours of processing) and then the files will be sent (again.. hours and hours).
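
To put a number on "hours and hours", here is a quick back-of-envelope sketch in Python. It assumes the 1.5TB is decimal terabytes and that the rate is sustained for the whole run, neither of which is guaranteed.

```python
def transfer_hours(size_bytes, rate_bits_per_second):
    """Hours needed to move size_bytes at a sustained rate given in bits per second."""
    return size_bytes * 8 / rate_bits_per_second / 3600

hours_at_300 = transfer_hours(1.5e12, 300e6)    # the ~300Mbit/s actually observed
hours_at_1000 = transfer_hours(1.5e12, 1000e6)  # an ideal, fully used gigabit link
print(round(hours_at_300, 1), round(hours_at_1000, 1))
```

So a full pass at the observed rate is roughly half a day, which fits a job that gets to 90%+ overnight before failing.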

You may want to set up some kind of cronjob to just copy the files and choose to ignore/overwrite certain things.

Good luck! I'd really like to know what your final choice is and how it works out for you.
 
Joined
Jan 13, 2013
Messages
3
cyberjock said:
> I have no idea how to fix your exact issue, but I'll give you some info and some ideas...
>
> Rsync does a checksum comparison between source and destination. This seems to ALWAYS be CPU bound, limiting people to, at best, 300Mb/sec (gee.. that's what you are getting!). I got the same speed doing a direct link between 2 quite powerful machines with 16GB+ of RAM each. The issue is Rsync is single threaded. One core will sit at 100% and the rest idle.

Ha! Thanks for that - being processor-bound is somehow less of a downer than being network-bound, probably because there's less I can do about it :smile:

cyberjock said:
> During my experimenting and testing rsync seems to have a timeout for a single file transfer. If one file takes too long the rsyncing of that file will be aborted. I assume this is to prevent a timeout from locking out rsync indefinitely.
>
> I know what Time Machine is, but that's all. But if it does something like make a single big file and do stuff inside that big file rsync will resend that file every time it is run since it is always changing.

It's entirely possible that a big file is the cause, but it's not something I've ever seen noted in any documentation. As far as I can tell, Time Machine doesn't create a single big file; what it does seems to be very similar to the ZFS snapshot function. You can even browse the archive manually if you don't want to use the interface because unchanged files are presented in-place as hard links.

In my specific case the Mac is being used (as I noted) mainly for video editing and one thing we all know about videos is that they can be huge. If what you say is correct then it could just be one video file causing the problems. I wonder if there's a way to change this timeout, and whether it would work if done only at the receiving end (i.e. the FreeNAS box) as I doubt there's a way to do it at the sending end (the QNAP)?
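
On changing the timeout: going by the rsync and rsyncd.conf man pages, the timeout behind error code 30 is an I/O timeout (rsync exits if no data moves for that many seconds) rather than a per-file limit. It can be set with `--timeout=SECONDS` on the client or with a `timeout` parameter in a daemon module, where the daemon's value overrides the client's choice. Whether the QNAP job allows extra options, and whether a longer timeout actually cures this failure, is untested here. Below is a minimal Python sketch that only builds the command line and the module section. The source path, module name and dataset path are hypothetical placeholders, and `backupnas` is the hostname from the log above.

```python
def rsync_command(source, host, module, timeout_seconds):
    """Build an rsync argv list for pushing to a daemon module with an explicit I/O timeout."""
    return [
        "rsync",
        "-a",                            # archive mode
        "--partial",                     # keep partially transferred files for the next run
        f"--timeout={timeout_seconds}",  # I/O timeout in seconds; 0 means no timeout
        source,
        f"{host}::{module}/",            # daemon syntax: host::module/path
    ]

def rsyncd_module(name, path, timeout_seconds):
    """Render a minimal rsyncd.conf module section that sets the 'timeout' parameter."""
    return (
        f"[{name}]\n"
        f"    path = {path}\n"
        f"    read only = false\n"
        f"    timeout = {timeout_seconds}\n"
    )

# Hypothetical values for illustration only.
cmd = rsync_command("/path/to/timemachine/", "backupnas", "macnas_backup", 3600)
stanza = rsyncd_module("macnas_backup", "/mnt/tank/macnas", 3600)
print(" ".join(cmd))
print(stanza)
```

Setting the value only on the FreeNAS side would at least be a cheap experiment, since the module section lives on the receiving box.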

cyberjock said:
> You may be able to get around this issue by manually copying the files to the exact same location on the destination, then doing an rsync.
>
> You may want to set up some kind of cronjob to just copy the files and choose to ignore/overwrite certain things.
>
> Good luck! I'd really like to know what your final choice is and how it works out for you.

I'd really like to be able to do a simple file copy (all I need is the things to be synchronised every now and again), but as I mentioned the Time Machine share on the QNAP is hidden, even to the QNAP's own file browser, so I can't set it up as a share accessible to other network devices. The only place you can browse it is on the Mac itself. Reading through the FreeNAS documentation, Time Machine shares have to be shared using a slightly odd protocol in order for Time Machine to "see" them and in FreeNAS it's possible to set this on a share-by-share basis. The QNAP box will only allow one such share. I can't mediate the backup through the Mac (which would probably be even slower) as the Mac is used by several people, any of whom would complain if there was a file transfer going on in the background the whole time, and all of whom will shut the computer down when they've finished anyway.

I'm going to have a play with the QNAP (might even try their forums, assuming there are any) and will let you know if I discover anything relevant. Thanks for taking the time to reply.

Hwyl!

M.
 
Joined
Jan 13, 2013
Messages
3
I've been browsing the QNAP forums (difficult as there doesn't seem to be a search - had to use my Google-fu) and came across an interesting self-answered thread:

http://forum.qnap.com/viewtopic.php?p=262779

Essentially he seemed to have the same problem as I have. He answered it like this:

> Believe it or not, the reason for this problem was a wrong MTU value (Jumbo frames) - da**, that took a lot of time to find it out

I've no idea if this is likely to be a problem here - I've not fiddled with MTU on either the QNAP or FreeNAS. Are they likely to be set differently? The intervening network should handle any size without issue, but as far as I know everything's set to 1500...
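
One way to check whether both ends really agree on 1500 is to send a ping with the don't-fragment bit set and the largest payload that should fit. The payload size is the MTU minus the IPv4 and ICMP headers. Here is a small Python sketch of that arithmetic, assuming IPv4 with no IP options. The exact ping flags for setting don't-fragment vary by platform, so check the local man page.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo header

def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER

print(max_ping_payload(1500), max_ping_payload(9000))
```

If a payload of 1472 bytes gets through with don't-fragment set and nothing larger does, the path is at the standard 1500 end to end.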

In another thread I found the following, which confirms what you have said:

http://forum.qnap.com/viewtopic.php?p=103912

> In my opinion large files is the main problem. I have four QNAP's under my controle. Most of the rsync jobs are working fine and even with large files it will take some time but eventually they get through. But the jobs that are almost always failing are large outlook .pst files. For some reason 90 percent of the time these backups will fail (and I have three qnaps with large .pst files). But for the life I do not know the reason why sometimes it works and most of the time it fails.

and in this thread:
http://forum.qnap.com/viewtopic.php?f=22&t=23640&p=105982

(hope I copied that correctly) there have obviously been ongoing issues with Rsync on the QNAPs, though as that thread is from 2010 I don't know how relevant it is today.

I'll keep looking (it's great being at work on Sunday - fewer people bothering me ;-)

Hwyl!

M.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I doubt MTU is your problem. You'd know because you'd have changed the MTU. The default for virtually everything is 1500 unless you change it.
 

pete_c20

Dabbler
Joined
Nov 23, 2012
Messages
23
There's an option in Rsync called '--no-whole-file'. I'm no expert on Rsync but this option is present in the default setup in the rsync client I use (QTDSync). Is it possible to add to the options on the rsync command line in your stuff?
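
For background on what `--no-whole-file` asks for: it tells rsync to use its delta-transfer algorithm, sending only the parts of a file that differ. As I read the man page, that is already the default over a network and whole-file is only the default between local paths. Here is a much simplified Python sketch of the idea, comparing fixed-size blocks at fixed offsets. Real rsync uses a rolling checksum so it can match blocks at any offset, so treat this as an illustration only; the function name and the tiny block size are made up.

```python
import hashlib

def changed_blocks(old, new, block_size=4):
    """Return indexes of fixed-size blocks in `new` that differ from the same block in `old`."""
    def digests(data):
        return [hashlib.sha256(data[i:i + block_size]).digest()
                for i in range(0, len(data), block_size)]
    old_d, new_d = digests(old), digests(new)
    # A block is "changed" if it is past the end of the old data or its hash differs.
    return [i for i, d in enumerate(new_d) if i >= len(old_d) or d != old_d[i]]

old = b"AAAABBBBCCCCDDDD"
new = b"AAAABxBBCCCCDDDDEEEE"
print(changed_blocks(old, new))  # → [1, 4]
```

Only blocks 1 (modified) and 4 (appended) would need to cross the wire, which is why a large file with small changes need not be resent in full.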

Edited to add -
When you say that rsync aborts at 95% and re-starts, is it re-starting the one big file or the whole job? I've never seen rsync restart an entire job other than to send the incremental file list up to the point where it needs to continue.

If it is aborting on a single large file, what is the file size? Let me know and I'll attempt to replicate a possible time-out issue here.
 