Replication Stream Compression

Joined
Jul 3, 2015
Messages
926
Hi All,

This is just a bit of knowledge sharing so hope somebody finds this helpful.

For a long time now, all my systems have been using LZ4 compression during replication, as, after testing (a long time ago), this seemed like a good idea. I should add that all my datasets use LZ4 compression by default.

However, I was recently troubleshooting an issue with one of my replica boxes, which was randomly rebooting. It turned out the IPMI watchdog was hard resetting it during the replication window. I watched it during replication the other night and noticed lz4c was using a lot of CPU, around 90-100%, so I assumed this was freezing the system and causing the watchdog to reset it. I then started looking at all my other systems, and both the sending and receiving sides also had very high CPU usage thanks to lz4c, but I figured it had been like this for a while so it wasn't an issue. However, I've never been able to replicate quickly, even though all systems have 10Gb network connections; I'm often stuck between 100Mbps and about 500Mbps.

I decided to re-investigate compression during replication and essentially disabled it, and now all my systems are pushing 2Gbps. Suddenly ssh is consuming about 80-90% CPU (it was about 5% before), but everything seems happy. Interestingly, lz4c has now vanished from the CPU usage stats during replication.

Anyway, take from it what you will, but it might be worth revisiting your replication setup if you feel you should be getting the sort of speeds I am now.
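
If you want to sanity-check where your own bottleneck is, something like the following works as a rough comparison (just a sketch; it assumes pv is installed, and the dataset/host names are examples):

Code:
# Raw read/serialise speed on the sending box: no network, no compression in the pipe
zfs send tank/dataset@auto-2019 | pv > /dev/null

# The same stream pushed over plain ssh with no extra compression added
zfs send tank/dataset@auto-2019 | pv | ssh replica-host "cat > /dev/null"

Watching top on both ends while these run makes it fairly obvious whether the disks, ssh, or something else in the pipe is the limiting factor.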

All the best.
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
In FreeNAS 11.3, as part of a general replication rewrite, we are going to start using ZFS-native compressed replication, which was added recently. It should both save CPU time and improve throughput.
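
For anyone replicating by hand in the meantime, native compressed send is just the -c flag on zfs send; roughly like this (names are examples, and the exact pipeline the new replication engine will use may differ):

Code:
# -c sends blocks as they are stored on disk (already LZ4-compressed),
# so neither end has to recompress or decompress the stream in transit
zfs send -c tank/dataset@auto-2019 | ssh replica-host zfs receive -F backup/dataset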

The watchdog fires when watchdogd in user space cannot get any CPU time for 128 seconds. I suppose that could theoretically be caused by extremely high load, but considering it means the system has been unresponsive for over two minutes, things must already be pretty bad at that point.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
 
Joined
Jul 3, 2015
Messages
926
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
It's all internal. That sounds cool, how do you do that?
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Here are the commands I use.
Code:
zfs snapshot pool/dataset@relocate

# sending host:
zfs send pool/dataset@relocate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023

# receiving host:
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset


So, 'nc' is the netcat command. The '-w' flag is the connection timeout (it gives you a chance to hit enter on both sides), '-l' puts it in listen mode, and 8023 is the port, which can be anything not likely to be in use.

I pipe netcat into mbuffer (mbuffer buffers I/O operations and displays the throughput rate; it is multi-threaded, supports network connections, and offers more options than the standard buffer). '-s' is the block size of the incoming data, so I match it to the recordsize of the dataset (1M in my case), and '-m' is the total buffer size.

Then I pipe that into pv to monitor the stream in real time (Pipe Viewer (pv) is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion).
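
If you keep the @relocate snapshot around, a later catch-up can be done the same way with an incremental send (a sketch along the same lines; the @relocate2 snapshot name is just an example):

Code:
zfs snapshot pool/dataset@relocate2

# receiving host:
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset

# sending host: only the blocks changed since @relocate go over the wire
zfs send -i @relocate pool/dataset@relocate2 | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023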
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
Is this still working for you? Any improvements or changes you would share?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I haven't run this command since the upgrade to TrueNAS 12. Is there something broken with it on the latest version?
I am using it now to migrate to new hardware. It appears to be working fine on TrueNAS 12. I was just curious whether there was any "new and improved" method.
 

Tabmowtez

Dabbler
Joined
Nov 12, 2020
Messages
36
I'm doing the exact same thing, syncing my datasets from my old TrueNAS box to my new one.
It is maxing out both 1Gbps links, so I didn't bother figuring out whether there is a better/faster way :)
When there are lots of smaller files it obviously slows down a little, but that's to be expected, I think.

Code:
receiving full stream of tank/media@send into data/media@send
5.30TiB 21:01:36 [ 112MiB/s] [73.4MiB/s]
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
Here are the commands I use.
Code:
zfs snapshot pool/dataset@relocate

zfs send pool/dataset@relocate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023

nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset


So, 'nc' is the netcat command. The '-w' flag is the connection timeout (it gives you a chance to hit enter on both sides), '-l' puts it in listen mode, and 8023 is the port, which can be anything not likely to be in use.

I pipe netcat into mbuffer (mbuffer buffers I/O operations and displays the throughput rate; it is multi-threaded, supports network connections, and offers more options than the standard buffer). '-s' is the block size of the incoming data, so I match it to the recordsize of the dataset (1M in my case), and '-m' is the total buffer size.

Then I pipe that into pv to monitor the stream in real time (Pipe Viewer (pv) is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion).


Just to help other newbies who are using TrueNAS / FreeNAS: I have done what this guy says and it worked for me.
Note that the third command (the "nc -w 20 -l ..." one) is keyed in on the receiving machine.
The XX.XX.XX.XX is the IP address of the receiving machine.

I created a 65GB sparse zvol with zstd max compression on my test PC.
I shut down my favourite Ubuntu VM (Docker for Core! works great) and took a snapshot.
Then I used the above commands with zfs send -c added, and it worked fine; it appears to have retained my sparse volume too, which is important to me.
On the destination PC I was able to boot the VM from the zvol without issue, super impressive.
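
For anyone wanting to do the same, the zvol variant looks roughly like this (a sketch with made-up names; -c is what keeps the blocks compressed in flight):

Code:
zfs snapshot pool/vms/ubuntu-zvol@migrate

# receiving host:
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/vms/ubuntu-zvol

# sending host: -c sends the blocks as stored on disk, already compressed
zfs send -c pool/vms/ubuntu-zvol@migrate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023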

Hopefully other people Googling how to copy / migrate from one machine to another will be able to find this.

Now I can emulate / test upgrading from Core to SCALE on a test machine with my very important Linux VM zvol and see if KVM can boot it. I'm going to assume so, but it will be fun to find out.

Thanks all in this thread.
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
Hey Team,

So this is kind of off topic but maybe someone could help.
This command
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset

doesn't work on Proxmox.
Obviously this thread is focused on TrueNAS, but I'm curious whether the transfer will still work if I omit the pv and mbuffer commands.
Or should I just find pv and mbuffer and install them?
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
`pv` there, I guess, is only for cosmetics. Without mbuffer it would probably become significantly slower.
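
So a bare-bones version of the pipeline would be something like this (a sketch; no buffering or progress display, but it should work anywhere nc is available):

Code:
# receiving host:
nc -w 20 -l 8023 | zfs receive -vF pool/dataset

# sending host:
zfs send pool/dataset@relocate | nc -w 20 XX.XX.XX.XX 8023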
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
`pv` there, I guess, is only for cosmetics. Without mbuffer it would probably become significantly slower.


Ultimately, it seems the disk image from TrueNAS is incompatible in some capacity.
I ended up settling on this command:

zfs send SSDVM/VM/UbuntuServer | ssh root@192.168.0.250 zfs receive -F rpool/data/vm-101-disk-0

It is indeed writing into the destination server, but alas, she won't boot.
I may just re-install Ubuntu 22.04 on the new system in a few days, but I thought this might be a fun experiment. No idea what the issue is; maybe it's RAW mode or something, I'm just not sure.
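
For what it's worth, the snapshot-based form of that ssh pipeline would be something like this (the @migrate snapshot name is made up, and whether it changes the boot behaviour is a separate question; that may come down to how Proxmox expects the disk to be presented):

Code:
zfs snapshot SSDVM/VM/UbuntuServer@migrate
zfs send SSDVM/VM/UbuntuServer@migrate | ssh root@192.168.0.250 zfs receive -F rpool/data/vm-101-disk-0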
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
I decided to boot the VM with an Ubuntu USB key and see what I could see.

Code:
root@proxmox:~# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
rpool                         34.3G   194G    96K  /rpool
rpool/ROOT                    4.09G   194G    96K  /rpool/ROOT
rpool/ROOT/pve-1              4.09G   194G  4.09G  /
rpool/data                    30.1G   194G    96K  /rpool/data
rpool/data/subvol-102-disk-0  11.7G  28.3G  11.7G  /rpool/data/subvol-102-disk-0
rpool/data/vm-100-disk-0        56K   194G    56K  -
rpool/data/vm-101-disk-0      18.4G   194G  18.4G  -

(it's vm-101-disk-0)
So it has pushed 18GB of content from TrueNAS to Proxmox, but if my basic understanding is correct, the destination content is, well, dead/scrambled. It's not a filesystem, right? So clearly I'd need to tackle this differently.
Probably something in my send/recv command.
 