How to improve replication performance

Status
Not open for further replies.

Dave Genton

Contributor
Joined
Feb 27, 2014
Messages
133
I am using ZFS snapshots and replication to mirror my primary FreeNAS server to another node for disaster recovery and peace of mind. Both servers reside on the same segment, the same switch in fact, and I am looking to boost the performance.

I upgraded both servers to 10Gb and the performance is not much better. They are on a dedicated VLAN for replication and the config cannot get any simpler. I have turned off encryption by setting it to "none", but speed, throughput, and CPU usage seem no different than with "Fast" encryption mode. Each day the first 6 or so smaller datasets typically sync pretty quickly, all done in an hour or so. Then the last dataset, at 5TB, goes on for 24 hours even if there have been no changes since the last snapshot/replication. Why so long? Why is it not faster with encryption disabled? CPU should at least go down...

I have made a great many changes over the months, but I am back at square one with minimal sysctl tunables, and it runs at its fastest by far like this. All the recommended tweaks just slow it down, so I stuck with the Chelsio T520 NIC and tweaked only the obvious ones: mss, mssdflt, and the send and receive spaces with auto increments, as documented online. After I removed all the other suggested tunables, throughput rose instantly from 1.3 Gbps max to 2.0 Gbps max, with CPU rising 20% as well. TSO/LSO is enabled in hardware. Changing the MTU actually breaks replication quite often, with ssh having issues, but when it tolerates MTU 9000 it stays at 2 Gbps on the dedicated 10Gb links, with CPU nearing 60% on an E3-1241 v3 Xeon that was always at 20% for replication in the past. It is a low-latency, high-throughput network, but the servers are not taking off and flying on it the way others can. I figure it is the replication specifically, as I can easily and consistently hit 8 Gbps transfers on these servers with iperf3 tests etc.

Are there any known config changes that would speed up replication between nodes? Resources are not an issue, with Xeons and 32GB ECC RAM in each box and 8 7200 RPM HDDs. Any assistance would be appreciated, as I have hit the wall and need to speed this process back up a bit and at least justify the 10Gb NICs and switches I just purchased, which cut very little time off my daily replications.

Minmss and mssdflt are back to 536/1448 for best performance. Raising mssdflt for MTU 9000 didn't help; it actually hurt by 700 Mbps for some reason. The newest update 2, installed this week, again allows replication without encryption; before that I had to keep it on "fast" or it wouldn't work. No encryption, but CPU is much higher...

d-
 
Joined
Feb 2, 2016
Messages
574
What is your largest snapshot? How much actual data is being replicated? You say the dataset is 5TB but is that the snapshot size?
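
For scale, here is a rough back-of-the-envelope check. It assumes the whole 5 TB really is being sent during the 24 hours you describe, and that "TB" means decimal terabytes; both are assumptions on my part, and the function name is just illustrative.

```python
def implied_rate(bytes_sent, seconds):
    """Return the average transfer rate as (MB/s, Gbps), decimal units."""
    bytes_per_second = bytes_sent / seconds
    mb_per_s = bytes_per_second / 1e6
    gbps = bytes_per_second * 8 / 1e9
    return mb_per_s, gbps


if __name__ == "__main__":
    # Assumption: the full 5 TB dataset is transferred in 24 hours.
    mb_per_s, gbps = implied_rate(5e12, 24 * 3600)
    print(f"{mb_per_s:.1f} MB/s, {gbps:.2f} Gbps")
```

If that assumption holds, the average works out to well under 1 Gbps, which would point away from the network and toward the disks or the amount of data actually being sent.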

What is the raw read speed of the sending host? What is the raw write speed of the receiving host?
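
If you do not have numbers handy, a minimal sketch like the one below gives a rough sequential read figure for a single large file. The path you pass in is whatever file you pick on the pool (hypothetical here), and the result will be inflated if the file is already cached in ARC, so treat it as a ballpark only, not a proper benchmark.

```python
import time


def measure_read_throughput(path, block_size=1024 * 1024):
    """Read `path` sequentially and return the throughput in MB/s (decimal)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python's own buffering; OS and ARC caching still apply.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    return total / elapsed / 1e6 if elapsed > 0 else float("inf")
```

Run it against a file much larger than RAM on the sending host to keep caching from dominating the result.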

Cheers,
Matt
 