Replication really slow, where is my bottleneck?

Status
Not open for further replies.

RlainTheFirst

Dabbler
Joined
Sep 29, 2017
Messages
14
Hi!
I am in the process of replicating my entire FreeNAS at home (with almost 3TB of data) onto a new setup (with old hardware) which will be placed offsite after initial sync.
The main server is a quad-core Xeon and does not seem to break a sweat over this replication on any of the reporting graphs.

Under Reporting on the receiver side, interface traffic seems to peak at 112.2 Mbit/s.
This is on a gigabit network. 14 MB/s seems really slow, and the sync is on its third day, closing in on completion, but it still has about 20% left.
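As a rough sanity check on those numbers, the unit conversion and a transfer-time estimate can be sketched as below. This assumes "almost 3TB" means about 3 * 10^12 bytes and that the rate stays constant, which a real replication will not do. The function names are made up for illustration.

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8.0


def transfer_days(total_bytes, mbyte_per_s):
    """Days needed to move total_bytes at a constant rate given in MB/s (10^6 bytes/s)."""
    seconds = total_bytes / (mbyte_per_s * 1e6)
    return seconds / 86400.0


rate = mbit_to_mbyte(112.2)        # peak seen on the receiver's interface graph
days = transfer_days(3e12, rate)   # assumption: ~3 TB total, steady rate
print(f"{rate:.1f} MB/s, about {days:.1f} days for 3 TB")
```

At the peak rate this works out to roughly two and a half days, so a sync that averages somewhat below the peak being on its third day is consistent with the graph.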

Both machines have 8GB ram.
CPU on receiver side seems to hover around 80% utilization.
top -P shows both cores/logical threads usually idling around 20-30%.
I can't tell whether the receiver is dual core or single core with hyperthreading. It is a Dell PowerEdge SC1420; dmesg output is appended at the end of this post.
To check for network issues I ran iperf, output shown below, and the network reporting graph showed a huge spike as expected.
Code:
iperf -c 10.0.0.120
------------------------------------------------------------
Client connecting to 10.0.0.120, TCP port 5001
TCP window size: 32.8 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.111 port 15467 connected with 10.0.0.120 port 5001
[ ID] Interval  Transfer  Bandwidth
[  3]  0.0-10.0 sec  962 MBytes  807 Mbits/sec



So it seems that neither CPU nor network limits the transfer rate.
On the receiver side I have two 4TB drives striped. There is no room or need for mirrors, I think, since the only time I'll have to go there is if the house burns down.
Main server has mirrored disk all the way.

The disk busy graph on the receiver hovers around 15% for both disks, so the disks don't seem to be a bottleneck either.

The only idea I currently have is that it might be a single core with hyperthreading, so the 20-30% idle I see just means the second hardware thread is not fully used, and raw compute power is the real limiting factor.
Is there a way I can confirm this?

Also, what is the ARC Hit Ratio graph? The sender shows around 25% and the receiver around 70%.


Any response greatly appreciated!


Dmesg output (some of it):
Code:
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.06-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0xf41  Family=0xf  Model=0x4  Stepping=1
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,DTES64,MON,DS_CPL,CNXT-ID,CX16,xTPR>
  AMD Features=0x20000800<SYSCALL,LM>
  TSC: P-state invariant
real memory  = 9193914368 (8768 MB)
avail memory = 8195547136 (7815 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: <DELL  PE1420 >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 hardware threads
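The last dmesg line above already hints at the core layout. A minimal sketch of reading it, assuming the line follows the "N package(s) x N core(s) x N hardware threads" pattern and that a missing term means a count of 1 (that second part is an assumption about how FreeBSD prints it, not something confirmed here). The function name is hypothetical.

```python
import re


def parse_smp_topology(line):
    """Extract package/core/thread counts from a 'FreeBSD/SMP: ...' dmesg line.

    Assumption: a term that is absent from the line means a count of 1.
    """
    topo = {"packages": 1, "cores": 1, "threads": 1}
    patterns = {
        "packages": r"(\d+) package\(s\)",
        "cores": r"(\d+) core\(s\)",
        "threads": r"(\d+) (?:hardware|SMT) threads",
    }
    for key, pattern in patterns.items():
        match = re.search(pattern, line)
        if match:
            topo[key] = int(match.group(1))
    return topo


line = "FreeBSD/SMP: 1 package(s) x 2 hardware threads"
print(parse_smp_topology(line))
```

Read that way, the line describes one package with one core and two hardware threads, which together with the HTT feature flag points towards a single hyperthreaded core rather than a dual core.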
 

RlainTheFirst

Dabbler
Joined
Sep 29, 2017
Messages
14
I found these sites:
http://www.dell.com/downloads/global/products/pedge/en/sc1420_specs.pdf
https://ark.intel.com/products/series/59213/Legacy-Intel-Xeon-Processors

On ark.intel I searched for all matches of 2.80 and found that only one of them has two cores, and that processor has 4MB cache, which the first link suggests is not possible.
So it would seem that it has a single-core, hyperthreaded CPU in it.
Maybe it really is the bottleneck then. What do you guys think? Does this seem reasonable?
 

c32767a

Patron
Joined
Dec 13, 2012
Messages
371
Try a manual rsync between two test directories on the two NASes and see how it performs.
I wouldn't expect that CPU to light your network on fire, but it should be able to get out of its own way.
You might be running into encryption overhead. If a manual rsync works better with no encryption, you might want to try turning off any CPU-intensive operations like encryption and compression. (Note the obvious consequences of turning off encryption. :) )
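One way to run that comparison is to time the same copy twice, once over SSH and once against an rsync daemon module (unencrypted), and compare MB/s. A rough sketch follows. The paths, the host, and the module name `test` in the commented usage are placeholders, the daemon module would have to be set up on the receiver first, and the helper names are invented for this example.

```python
import subprocess
import time


def rsync_command(src, dest):
    """Build a plain archive-mode rsync invocation (no compression flag)."""
    return ["rsync", "-a", src, dest]


def throughput_mb_s(num_bytes, seconds):
    """Average throughput in MB/s (10^6 bytes per second)."""
    return num_bytes / seconds / 1e6


def timed_rsync(src, dest):
    """Run rsync and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(rsync_command(src, dest), check=True)
    return time.monotonic() - start


# Hypothetical usage, with about 1 GB of test data in /mnt/tank/test/:
#   over SSH (encrypted):   timed_rsync("/mnt/tank/test/", "root@10.0.0.120:/mnt/backup/test/")
#   via daemon (plaintext): timed_rsync("/mnt/tank/test/", "rsync://10.0.0.120/test/")
# then compare throughput_mb_s(1e9, elapsed) for the two runs.
```

If the daemon run is much faster than the SSH run, encryption on the old CPU is the likely limit. If both are equally slow, look elsewhere.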
 