Super Slow transfer using replication task over the internet

onlineforums

Explorer
Joined
Oct 1, 2017
Messages
56
I'm a bit confused and wondering if there is a way of diagnosing an issue I'm currently experiencing.

I have 500 GB of data transferring from one TrueNAS box to another TrueNAS box that is off site. This is done via a replication task over SSH. So far, in about 5 days and 12 hours, it has transferred a total of 162 GB of the 500 GB. This is outrageously slow and not normal behavior (the original transfer, which was well over 500 GB, only took a few days). I'm thinking about killing the process and manually starting it again, but I don't want to do that if it means starting over at 0 and waiting another 5 days for only 162 GB.

The top command shows 95% idle CPU and 777M free memory.

Are there any commands I can try to get a sense of where the bottleneck is? Both the sender and receiver have high-throughput fiber internet connections, and both boxes are Mini XLs, so I don't think it is an overhead issue.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Can you pinpoint where the delay occurs? What is your effective up- and downstream speed in both locations? Is the provider limiting speed via SSH/port 22?
 

onlineforums

Explorer
Joined
Oct 1, 2017
Messages
56
Can you pinpoint where the delay occurs? What is your effective up- and downstream speed in both locations? Is the provider limiting speed via SSH/port 22?
I suppose that is what I'm asking in this thread (how to pinpoint the delay). The up- and downstream speeds are sufficient on both sides (50mbps each way). I run sshd on a custom port.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
50 mbps is probably 50 Mbps, which would be roughly 6 MByte/s raw (a bit less after protocol overhead). Correct?
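For reference, here is a quick back-of-the-envelope in shell (using the ~5.5 days and 162 GB from the first post) showing how far off the observed rate is from what a 50 Mbps link should deliver:

```shell
# Expected: 500 GB over a sustained 50 Mbps link (~6.25 MB/s).
bytes_per_sec=$((50 * 1000 * 1000 / 8))
expected_hours=$((500 * 1000 * 1000 * 1000 / bytes_per_sec / 3600))
echo "expected: ~${expected_hours} hours for 500 GB"

# Observed: 162 GB in roughly 5.5 days (5 days + 12 hours).
elapsed_sec=$((5 * 86400 + 12 * 3600))
observed_kbps=$((162 * 1000 * 1000 * 1000 / elapsed_sec / 1000))
echo "observed: ~${observed_kbps} kB/s"
```

At full line rate this would be roughly a one-day transfer; the observed average works out to a few hundred kB/s, so something on the path is clearly the limit.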

The first thing to test would be the line speed for SSH. What do you get from a simple SSH copy (scp command)?
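Something like this would do it (hostname, port, and paths below are placeholders; substitute your own custom sshd port):

```shell
# Create a 1 GiB throwaway file of random data. Random data matters:
# a file of zeros would compress and fake the result.
dd if=/dev/urandom of=/tmp/testfile bs=1M count=1024

# Time the copy; divide 1024 MB by the elapsed seconds for MB/s.
time scp -P 2222 /tmp/testfile user@remote-host:/tmp/

# Clean up on both ends afterwards.
rm /tmp/testfile
```

If scp also crawls at a few hundred kB/s, the problem is the network path or SSH itself, not the replication task.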
 

Alex_K

Explorer
Joined
Sep 4, 2016
Messages
64
iperf3 could also be used for a raw network throughput test.
sometimes they (datacenters) give you 50 mbps only when using a multi-threaded copy, while giving ~10-15 mbps per connection. hard to help with that
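A sketch of that test, with the hostname as a placeholder, comparing a single stream against parallel streams:

```shell
# On the receiving box: run an iperf3 server (default port 5201).
iperf3 -s

# On the sending box: single stream for 30 seconds...
iperf3 -c remote-host -t 30

# ...then 4 parallel streams. A big gap between the two totals
# suggests per-connection rate shaping somewhere on the path.
iperf3 -c remote-host -t 30 -P 4
```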

probably it'll boil down to playing with TCP window and buffer tunables; also, passing the stream through mbuffer helps a lot. if you have a lot of free CPU, then use, for example, pigz for replication stream compression

and you can resume a zfs transfer when it is done from the console; not sure about the gui
initiating the zfs send with something like this:
zfs send -vR snapshot-we-replicate | pigz -9c | mbuffer -s 128k -m 1G | ssh user@host "mbuffer -s 128k -m 1G | pigz -d | zfs recv -sveF where_to_dataset"


then, to resume an interrupted transfer with otherwise unchanged parameters, replace
"-R snapshot-we-replicate"
with
"-t receive_resume_token"
where the value of receive_resume_token is a property of the filesystem or volume which is received into.
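A minimal sketch of that resume, assuming the receive side was started with zfs recv -s as above (pool/dataset names are placeholders):

```shell
# On the receiving box: read the token left behind by an interrupted 'zfs recv -s'.
zfs get -H -o value receive_resume_token tank/where_to_dataset

# On the sending box: resume from that token instead of re-sending with -R.
# RESUME_TOKEN is the opaque string printed by the command above.
RESUME_TOKEN=paste-token-here
zfs send -v -t "$RESUME_TOKEN" | ssh user@host "zfs recv -sveF tank/where_to_dataset"
```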

in my experience on slow links, netcat did not visibly speed things up, but maybe in your use case it would.


when you see a flat ~50mbps in your network graph and your CPU is reasonably loaded, you have probably done everything you could

other than requesting that the datacenters temporarily lift your network limit for the sake of a big one-time data transfer; some may agree to do that, even free of charge
 

onlineforums

Explorer
Joined
Oct 1, 2017
Messages
56
ChrisRJ and Alex_K thank you for your replies.

ChrisRJ - I actually went to test a large single file using scp (to monitor it, rather than going the iperf3 route), and the source replication task in the GUI now shows "ERROR"; clicking on it states "Network connection timeout." So I'm going to run this manually and see if it is any quicker.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
they (datacenters)

You misspelled "crappy colocation or low end hosting vendor" :smile:

Budget-minded service providers will sometimes do this kind of thing to help take the sharp peaks off their own 95th-percentile billing to their upstreams, but at a real data center you can get circuits to whatever service providers are present onsite, and typically these are easy to max out. The common suspects for cheap high-speed bandwidth are Cogent and Hurricane Electric, the latter of which is advertising 10Gbps for $700/mo. Their POP list is at http://he.net/ip_transit.html
 