
Replicate Performance

Status
Not open for further replies.

hokan

Member
Joined
Feb 10, 2017
Messages
42
When I make terabyte-sized changes so I can see how replicate performs, it doesn't perform as well as I expected.

My source and target machines are connected via Intel 10 Gbit links. iperf tells me that I can get well over 9 Gbit/sec.

My replication runs at 1 to 1.5 Gbit/sec. I'm not sure why there's variation, but in any case, I expected it to be faster.

The most interesting thing I see is on the Disk section of the Reporting tab, which shows that, during replication, a number of disks have very high latency: more than half a second on the sending machine and a bit lower on the receiving machine.

Is this the best I can do? If not, what kinds of things should I be looking for?
 

hokan

Member
Joined
Feb 10, 2017
Messages
42
Sending Machine Top:
Code:
55 processes:  3 running, 52 sleeping
CPU:  1.3% user,  0.0% nice,  3.6% system,  0.1% interrupt, 95.0% idle
Mem: 24M Active, 879M Inact, 60G Wired, 1245M Free
ARC: 56G Total, 11G MFU, 45G MRU, 2859K Anon, 203M Header, 187M Other
Swap: 32G Total, 184M Used, 32G Free

  PID USERNAME  THR PRI NICE  SIZE  RES STATE  C  TIME  WCPU COMMAND
15148 root  1  52  0 24700K 10240K pipewr 30  3:18  40.87% lz4c
15152 root  1  47  0 56752K  8320K CPU38  38  3:45  37.06% ssh
15150 root  1  49  0 12340K  2960K pipdwt 27  3:19  36.28% dd
15149 root  1  83  0 12340K  2960K CPU5  5  3:22  35.79% dd
15151 root  1  26  0  9256K  1956K select 14  0:47  10.60% pipewatcher
15146 root  2  26  0 48620K  3736K pipewr 36  0:54  10.50% zfs
41223 root  12  20  0  247M 43324K nanslp 32 150:41  0.10% collectd




Receiving Machine Top:
Code:
36 processes:  1 running, 35 sleeping
CPU:  0.9% user,  0.0% nice,  3.4% system,  0.4% interrupt, 95.4% idle
Mem: 191M Active, 1101M Inact, 59G Wired, 2291M Free
ARC: 57G Total, 113M MFU, 56G MRU, 29M Anon, 105M Header, 183M Other
Swap: 32G Total, 32G Free

  PID USERNAME  THR PRI NICE  SIZE  RES STATE  C  TIME  WCPU COMMAND
83352 root  1  52  0  152M 74928K select 39  4:25  41.89% sshd
83359 root  1  37  0 20604K 10336K piperd 11  2:13  23.78% lz4c
83360 root  1  28  0 44388K  3648K piperd 30  1:32  13.77% zfs
84485 root  1  20  0 50644K 11944K CPU18  18  0:00  0.29% top
4861 root  12  20  0  247M 45356K nanslp 31  39:09  0.00% collectd
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,505
That doesn't look far off, assuming I'm looking at the same thing. A 9Gbit connection divided by 8 bits per byte would give about 1Gbyte per second replications.

Not sure about the disk latency.
 

hokan

Member
Joined
Feb 10, 2017
Messages
42
My network utilization at the switch connecting the two FreeNAS boxes reports almost 10 gigabits/second during my iperf test, but 1 to 1.5 gigabits/second during a replicate.
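Bits and bytes are easy to mix up in this comparison, so a small conversion helps keep them apart. This is only a minimal sketch: the function name `gbit_to_mbyte` is made up for illustration, and it assumes decimal units (1 Gbit = 1000 Mbit, 8 bits per byte).

```shell
#!/bin/sh
# Hypothetical helper: convert a link rate in Gbit/s to MB/s (decimal units).
gbit_to_mbyte() {
    awk -v g="$1" 'BEGIN { printf "%.1f\n", g * 1000 / 8 }'
}

gbit_to_mbyte 9      # roughly what iperf reported
gbit_to_mbyte 1.5    # upper end of the observed replication rate
```

Read this way, the replication rate reported at the switch is a small fraction of what the link can carry, not most of it.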
 

hokan

Member
Joined
Feb 10, 2017
Messages
42
Seems that it's encryption that is slowing you down. What's your hardware?

EDIT: Maybe something with your RAID card?

My hardware setup is in my signature.

I have encryption turned off. In any case, ssh is using less than 50% of a cpu core and I have 40 cores available.

It certainly could be something with my RAID card (which is running in JBOD mode), but I don't know how to determine that.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,392
How fast is your pool? On the send and receive side?

 

hokan

Member
Joined
Feb 10, 2017
Messages
42
I'm not sure what you mean? Both systems were built with identical hardware and are running the same level of FreeNAS, so I'd expect them to be the same performance.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,392
Well, what is their performance? Maybe you can only send 1 Gbit/s because your pool is maxed out.
 

hokan

Member
Joined
Feb 10, 2017
Messages
42
Code:
[root@rascal-1] /mnt/MainVolume/yy# dd if=/dev/zero of=/mnt/MainVolume/yy/ddfile bs=2048k count=10k
10240+0 records in
10240+0 records out
21474836480 bytes transferred in 6.950249 secs (3089793850 bytes/sec)
[root@rascal-1] /mnt/MainVolume/yy# dd if=/mnt/MainVolume/yy/ddfile of=/dev/null bs=2048k count=10k
10240+0 records in
10240+0 records out
21474836480 bytes transferred in 3.003808 secs (7149204052 bytes/sec)



It looks like this dd test is a whole lot faster for me than the folks in the linked thread experienced. I wonder why?

And it looks like it's writing at 3 gigabytes/second; much faster than my replication.
 
Last edited:

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,392
Great job testing your memory speed. Turn off compression and run the test again.
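The reason for that suggestion: a stream of zeros compresses to almost nothing, so with compression on, very little of the data from /dev/zero ever reaches the disks. A rough illustration follows, assuming gzip as a stand-in compressor (the thread does not say which algorithm the dataset uses) and a made-up helper name, `compressed_size`.

```shell
#!/bin/sh
# Hypothetical helper: read 1 MiB from the given device, compress it with
# gzip (a stand-in, not necessarily the pool's algorithm), and print the
# compressed size in bytes.
compressed_size() {
    dd if="$1" bs=1024 count=1024 2>/dev/null | gzip -c | wc -c | tr -d ' '
}

zero_size=$(compressed_size /dev/zero)
random_size=$(compressed_size /dev/urandom)

echo "1 MiB of zeros compresses to $zero_size bytes"
echo "1 MiB of random data compresses to $random_size bytes"
```

The zeros shrink to a tiny fraction of their size while the random data does not shrink at all, which is why a /dev/zero benchmark on a compressed dataset says little about disk throughput.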

 

hokan

Member
Joined
Feb 10, 2017
Messages
42
Yes, I have the fastest memory; the best memory; none better...

Code:
[root@rascal-1] ~# dd if=/dev/zero of=/mnt/MainVolume/yy/ddfile bs=2048k count=10k
10240+0 records in
10240+0 records out
21474836480 bytes transferred in 43.964731 secs (488455996 bytes/sec)
[root@rascal-1] ~# dd if=/mnt/MainVolume/yy/ddfile of=/dev/null bs=2048k count=10k
10240+0 records in
10240+0 records out
21474836480 bytes transferred in 3.935560 secs (5456615207 bytes/sec)


With compression off, the write speeds are now closer to those in the previously referenced link, but still a fair bit faster than replication. The read speeds are, perhaps, due to the ARC?
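To compare the dd figures with the network figures, it helps to put them in the same unit. A minimal sketch, assuming decimal gigabits and a made-up function name, `bytes_to_gbit`:

```shell
#!/bin/sh
# Hypothetical helper: convert the bytes/sec figure printed by dd
# into Gbit/s (decimal units: 1 Gbit = 1e9 bits).
bytes_to_gbit() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b * 8 / 1e9 }'
}

bytes_to_gbit 488455996     # uncompressed write rate from the dd run above
bytes_to_gbit 5456615207    # read rate from the same run
```

Even the uncompressed write rate works out to several Gbit/s, well above the 1 to 1.5 Gbit/s seen during replication.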
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,392
And are these results the same on both servers?

Also try to just scp a file from one to the other.

 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,505
dd if=/dev/zero of=/mnt/MainVolume/yy/ddfile bs=2048k count=100k
Just a usage note: dd(1) takes lots of different smart units, so bs=2048k can be replaced with bs=2m.
 

hokan

Member
Joined
Feb 10, 2017
Messages
42
With compression off:
Code:
[root@rascal-1] ~# dd if=/dev/zero of=/mnt/MainVolume/yy/ddfile bs=1g count=10k
10240+0 records in
10240+0 records out
10995116277760 bytes transferred in 26664.244631 secs (412354313 bytes/sec)
[root@rascal-1] ~# dd if=/mnt/MainVolume/yy/ddfile of=/dev/null bs=1g count=10k
10240+0 records in
10240+0 records out
10995116277760 bytes transferred in 8545.197300 secs (1286701277 bytes/sec)

So 412 megabytes/sec writing and 1,286 megabytes/sec reading.

I copied a file from one system to the other with scp:
scp -o Cipher=none 4.tar 172.19.108.15:/mnt/MainVol-2/yy/.
Looks like this does 3.4 gigabits/second. SSH was 100% CPU on both systems.

I tried to approximate the pipe that does the replicate and came up with this test:
Code:
cat 4.tar | lz4c | /bin/dd obs=1m 2> /dev/null | /bin/dd obs=1m 2> /dev/null | /usr/local/bin/pipewatcher $$ | /usr/local/bin/ssh -ononeenabled=yes -ononeswitch=yes -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 172.19.108.15 "/usr/bin/env lz4c -d > /mnt/MainVol-2/xx/d33.d"

Network use with this was close to what I got with replicate; about 1.2 gigabits/second.

I removed one stage at a time from the pipeline and found that pipewatcher didn't seem to cost anything, but removing one of the dd commands sped things up a bit, and removing both dd commands got me up to around 2 gigabits/second.

I wonder what those dd commands are for and why there's two of them?

I removed the decompress step on the remote side and that didn't help.

I removed lz4c so now we're down to this:
Code:
cat 4.tar | /usr/local/bin/ssh -ononeenabled=yes -ononeswitch=yes -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 172.19.108.15 "cat - > /mnt/MainVol-2/xx/d33.d"


And we got 3.9 gigabits/second; ssh is just under 100% of a CPU.

So, put everything else back in the pipe, except for the lz4c:
Code:
cat 4.tar | /bin/dd obs=1m 2> /dev/null | /bin/dd obs=1m 2> /dev/null | /usr/local/bin/pipewatcher $$ | /usr/local/bin/ssh -ononeenabled=yes -ononeswitch=yes -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -o ConnectTimeout=7 -p 22 172.19.108.15 "cat > /mnt/MainVol-2/xx/d33.d"

With this it's down to around 2 gigabits/second and no CPU core is more than 50% busy.

I'm not sure what to make of all this.
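The stage-by-stage approach above can be repeated locally, without ssh, to see what each stage costs on its own. The following is only a sketch under stated assumptions: `measure_stage` is a made-up helper, the input is zeros from /dev/zero instead of the real tar file, block sizes are written in bytes so the same line runs on both BSD and GNU dd, and the timing has whole-second resolution, so small sizes will report 0 s.

```shell
#!/bin/sh
# Hypothetical helper: push N MiB of zeros through one pipeline stage
# (a command string run via sh -c) and report bytes out and elapsed seconds.
measure_stage() {
    stage=$1
    mib=$2
    start=$(date +%s)
    bytes=$(dd if=/dev/zero bs=1048576 count="$mib" 2>/dev/null \
        | sh -c "$stage" | wc -c | tr -d ' ')
    end=$(date +%s)
    echo "$stage: $bytes bytes out in $((end - start)) s"
}

# Baseline, then one and two re-blocking dd stages like those in the pipe.
measure_stage "cat" 64
measure_stage "dd obs=1048576 2>/dev/null" 64
measure_stage "dd obs=1048576 2>/dev/null | dd obs=1048576 2>/dev/null" 64
```

Raising the MiB count until the baseline takes several seconds makes the differences between stages easier to see.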
 
