
Replication Stream Compression


Johnny Fartpants

Neophyte Sage
Joined
Jul 3, 2015
Messages
571
Hi All,

This is just a bit of knowledge sharing so hope somebody finds this helpful.

For a long time now all my systems have been using LZ4 compression during replication, as testing (a long time ago) suggested this was a good idea. I should add that all my datasets use LZ4 compression by default.

However, I was recently troubleshooting an issue with one of my replica boxes whereby it was randomly rebooting. It appears the IPMI watchdog was hard resetting it during the replication window. I watched it during replication the other night and noticed lz4c was using a lot of CPU, between 90-100%, so I assumed this was freezing the system and causing the watchdog to reset it. I then started looking at all my other systems, and both the sending and receiving systems also had very high CPU usage thanks to lz4c, but I figured it had been like this for a while so it wasn't an issue. However, I've never been able to replicate quickly even though all systems have 10Gb network connections, and I'm often stuck between 100Mbps and about 500Mbps.

I decided to re-investigate compression during replication and essentially disabled it, and now all my systems are pushing 2Gbps. Suddenly ssh is consuming about 80-90% CPU (it was around 5% before), but everything seems happy. Interestingly, lz4c has now vanished from the CPU usage stats during replication.
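To put that in context, the replicated stream essentially boils down to a pipeline like the one below. This is only a simplified sketch with hypothetical host and dataset names; the exact commands and flags FreeNAS runs internally differ.

Code:
# With stream compression enabled: lz4c compresses on the sender and
# decompresses on the receiver, which is where the CPU time goes
# (assuming lz4c accepts -d for decompression, like the lz4 CLI)
zfs send tank/data@snap | lz4c | ssh replica "lz4c -d | zfs receive -F backup/data"

# With stream compression disabled: only the ssh encryption uses CPU
zfs send tank/data@snap | ssh replica "zfs receive -F backup/data"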

Anyway, take from it what you will, but it might be worth revisiting your replication setup if you feel you should be able to get the sort of speeds I am.

All the best.
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,128
In FreeNAS 11.3, as part of a general replication rewrite, we are going to start using ZFS-native compressed replication, which was added recently. It should both save CPU time and improve throughput.
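On the command line, native compressed replication is just the -c (--compressed) flag on zfs send; a minimal sketch with hypothetical pool, dataset and host names:

Code:
# Blocks that are already LZ4-compressed on disk are sent as-is,
# so neither end has to recompress or decompress the stream
zfs send -c tank/data@auto-snap | ssh replica zfs receive -F backup/data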

The watchdog fires when watchdogd in user space cannot get a sufficient time slice for 128 seconds. I suppose it may theoretically be caused by extremely high load, but considering it means the system is unresponsive for over two minutes, things would already be in a pretty bad state.
 

Mlovelace

Neophyte Sage
Joined
Aug 19, 2014
Messages
1,065
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
 

Johnny Fartpants

Neophyte Sage
Joined
Jul 3, 2015
Messages
571
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
It's all internal. That sounds cool, how do you do that?
 

Mlovelace

Neophyte Sage
Joined
Aug 19, 2014
Messages
1,065
Here are the commands I use.
Code:
# On the source system: create the snapshot to transfer
zfs snapshot pool/dataset@relocate

# On the source system: send the snapshot through mbuffer and pv into netcat
zfs send pool/dataset@relocate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023

# On the destination system: listen on port 8023 and receive the stream
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset


So, 'nc' is the netcat command; the '-w' flag is the wait time to connect (it gives you a chance to hit enter on both sides), and '-l' tells it to listen on the given port, which can be anything not likely to be in use.

I pipe netcat into mbuffer (mbuffer buffers I/O operations and displays the throughput rate; it is multi-threaded, supports network connections, and offers more options than the standard buffer). '-s' is the block size of the incoming data, so I match it to the recordsize of the dataset (1M in my case), and '-m' is the total buffer size.

Then I pipe that into pv to monitor the stream in real time. (Pipe Viewer (pv) is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion.)
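For follow-up runs, an incremental send can go through the same pipe. This is just a sketch; the @relocate2 snapshot name is hypothetical, and the destination must still have the @relocate snapshot from the first transfer.

Code:
# On the destination system: listen exactly as before
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset

# On the source system: take a new snapshot and send only the changes (-i = incremental)
zfs snapshot pool/dataset@relocate2
zfs send -i pool/dataset@relocate pool/dataset@relocate2 | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023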
 

MexiCali

Newbie
Joined
Apr 16, 2019
Messages
1
What if I wanted to compress those snapshots? Could you give an example of how I would do that?
 

Mlovelace

Neophyte Sage
Joined
Aug 19, 2014
Messages
1,065
What if I wanted to compress those snapshots? Could you give an example of how I would do that?
Add the -c flag to the zfs send command, and blocks that were compressed on disk will remain compressed as they pass through the pipe to the other end.
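Using the same pipe as above, it would look something like this (same hypothetical pool/dataset and placeholder address):

Code:
# On the source system: -c sends the blocks exactly as they are compressed on disk
zfs send -c pool/dataset@relocate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023

# On the destination system: unchanged; the received blocks stay compressed
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset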
 

Chris Moore

Wizened Sage
Joined
May 2, 2015
Messages
10,062
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
Is this still working for you? Any improvements or changes you would share?
 

Mlovelace

Neophyte Sage
Joined
Aug 19, 2014
Messages
1,065
Is this still working for you? Any improvements or changes you would share?
I haven't run this command since the upgrade to TrueNAS 12. Is there something broken with it on the latest version?
 

Chris Moore

Wizened Sage
Joined
May 2, 2015
Messages
10,062
I haven't run this command since the upgrade to TrueNAS 12. Is there something broken with it on the latest version?
I am using it now to migrate to new hardware, and it appears to be working fine on TrueNAS 12. I was just curious if there was any "new and improved" method.
 

Tabmowtez

Member
Joined
Nov 12, 2020
Messages
34
I'm doing the exact same thing, syncing my datasets from my old TrueNAS box to my new one.
It's maxing out both 1Gbps links, so I didn't bother figuring out whether there is a better/faster way :)
When there are lots of smaller files it obviously slows down a little, but that's to be expected, I think.

Code:
receiving full stream of tank/media@send into data/media@send
5.30TiB 21:01:36 [ 112MiB/s] [73.4MiB/s]
 