Replication Stream Compression

Joined
Jul 3, 2015
Messages
926
Hi All,

This is just a bit of knowledge sharing so hope somebody finds this helpful.

For a long time now, all my systems have been using LZ4 compression during replication, as, after testing (a long time ago), this seemed like a good idea. I should add that all my datasets use LZ4 compression by default.

However, I was recently troubleshooting an issue with one of my replica boxes, which was randomly rebooting. It turned out the IPMI watchdog was hard resetting it during the replication window. I watched it during replication the other night and noticed lz4c was using a lot of CPU, around 90-100%, so I assumed this was freezing the system and causing the watchdog to reset it. I then started looking at all my other systems, and both the sending and receiving sides also had very high CPU usage thanks to lz4c, but I figured it had been like this for a while so it wasn't an issue. However, I've never been able to replicate quickly, even though all systems have 10Gb network connections; I'm often stuck between 100Mbps and about 500Mbps.

I decided to re-investigate compression during replication and essentially disabled it, and now all my systems are pushing 2Gbps. Suddenly ssh is consuming about 80-90% CPU (it was about 5% before), but everything seems happy. Interestingly, lz4c has now vanished from the CPU usage stats during replication.

Anyway, take from it what you will, but it might be worth revisiting your replication setup if you feel you should be getting the sort of speeds I am now.
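
If you want to sanity-check where your own bottleneck is, something like the following works as a rough comparison (just a sketch; it assumes pv is installed, and the dataset/host names are examples):

Code:
# Raw read/serialise speed on the sending box: no network, no compression in the pipe
zfs send tank/dataset@auto-2019 | pv > /dev/null

# The same stream pushed over plain ssh with no extra compression added
zfs send tank/dataset@auto-2019 | pv | ssh replica-host "cat > /dev/null"

Watching top on both ends while these run makes it fairly obvious whether the disks, ssh, or something else in the pipe is the limiting factor.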

All the best.
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
In FreeNAS 11.3, as part of a general replication rewrite, we are going to start using ZFS-native compressed replication, which was added recently. It should both save CPU time and improve throughput.
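
For anyone replicating by hand in the meantime, native compressed send is just the -c flag on zfs send; roughly like this (names are examples, and the exact pipeline the new replication engine will use may differ):

Code:
# -c sends blocks as they are stored on disk (already LZ4-compressed),
# so neither end has to recompress or decompress the stream in transit
zfs send -c tank/dataset@auto-2019 | ssh replica-host zfs receive -F backup/dataset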

The watchdog fires when watchdogd in user space cannot get any CPU time for 128 seconds. I suppose that could theoretically be caused by extremely high load, but considering it means the system has been unresponsive for over two minutes, things must already be pretty bad at that point.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
 
Joined
Jul 3, 2015
Messages
926
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
It's all internal. That sounds cool, how do you do that?
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Here are the commands I use.
Code:
zfs snapshot pool/dataset@relocate

# sending host:
zfs send pool/dataset@relocate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023

# receiving host:
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset


So, 'nc' is the netcat command. The '-w' flag is the connection timeout (it gives you a chance to hit enter on both sides), '-l' puts it in listen mode, and 8023 is the port, which can be anything not likely to be in use.

I pipe netcat into mbuffer (mbuffer buffers I/O operations and displays the throughput rate; it is multi-threaded, supports network connections, and offers more options than the standard buffer). '-s' is the block size of the incoming data, so I match it to the recordsize of the dataset (1M in my case), and '-m' is the total buffer size.

Then I pipe that into pv to monitor the stream in real time (Pipe Viewer (pv) is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion).
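
If you keep the @relocate snapshot around, a later catch-up can be done the same way with an incremental send (a sketch along the same lines; the @relocate2 snapshot name is just an example):

Code:
zfs snapshot pool/dataset@relocate2

# receiving host:
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset

# sending host: only the blocks changed since @relocate go over the wire
zfs send -i @relocate pool/dataset@relocate2 | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023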
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Are you replicating across a WAN connection or is it all internal networks? If it's all internal, and you don't need an encrypted stream, you could push the snapshots through netcat. I move just over 3TiB per hour when I pipe zfs send/receive through netcat.
Is this still working for you? Any improvements or changes you would share?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I haven't run this command since the upgrade to TrueNAS 12. Is there something broken with it on the latest version?
I am using it now to migrate to new hardware. It appears to be working fine on TrueNAS 12. I was just curious whether there was any "new and improved" method.
 

Tabmowtez

Dabbler
Joined
Nov 12, 2020
Messages
36
I'm doing the exact same thing, syncing my datasets from my old TrueNAS box to my new one.
It is maxing out both 1Gbps links, so I didn't bother figuring out whether there is a better/faster way :)
When there are lots of smaller files it obviously slows down a little, but that's to be expected, I think.

Code:
receiving full stream of tank/media@send into data/media@send
5.30TiB 21:01:36 [ 112MiB/s] [73.4MiB/s]
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
Here are the commands I use.
Code:
zfs snapshot pool/dataset@relocate

zfs send pool/dataset@relocate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023

nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset


So, 'nc' is the netcat command. The '-w' flag is the connection timeout (it gives you a chance to hit enter on both sides), '-l' puts it in listen mode, and 8023 is the port, which can be anything not likely to be in use.

I pipe netcat into mbuffer (mbuffer buffers I/O operations and displays the throughput rate; it is multi-threaded, supports network connections, and offers more options than the standard buffer). '-s' is the block size of the incoming data, so I match it to the recordsize of the dataset (1M in my case), and '-m' is the total buffer size.

Then I pipe that into pv to monitor the stream in real time (Pipe Viewer (pv) is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion).


Just to help other newbies who are using TrueNAS / FreeNAS: I have done what this guy says and it worked for me.
Note that the third command (the "nc -w 20 -l ..." one) is keyed in on the receiving machine.
The XX.XX.XX.XX is the IP address of the receiving machine.

I created a 65GB sparse zvol with zstd max compression on my test PC.
I shut down my favourite Ubuntu VM (Docker for Core! works great) and took a snapshot.
Then I used the above commands with zfs send -c added, and it worked fine; it appears to have retained my sparse volume too, which is important to me.
On the destination PC I was able to boot the VM from the zvol without issue, super impressive.
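
For anyone wanting to do the same, the zvol variant looks roughly like this (a sketch with made-up names; -c is what keeps the blocks compressed in flight):

Code:
zfs snapshot pool/vms/ubuntu-zvol@migrate

# receiving host:
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/vms/ubuntu-zvol

# sending host: -c sends the blocks as stored on disk, already compressed
zfs send -c pool/vms/ubuntu-zvol@migrate | mbuffer -q -s 1024k -m 1G | pv -b | nc -w 20 XX.XX.XX.XX 8023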

Hopefully other people Googling how to copy / migrate from one machine to another will be able to find this.

Now I can emulate / test upgrading from Core to SCALE on a test machine with my very important Linux VM zvol and see if KVM can boot it. I'm going to assume so, but it will be fun to find out.

Thanks all in this thread.
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
Hey Team,

So this is kind of off topic but maybe someone could help.
This command
nc -w 20 -l 8023 | mbuffer -q -s 1024k -m 1G | pv -rtab | zfs receive -vF pool/dataset

doesn't work on Proxmox.
Obviously this thread is focused on TrueNAS, but I'm curious whether the transfer will still work if I omit the pv and mbuffer commands.
Or should I just find pv and mbuffer and install them?
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
`pv` there, I guess, is only for cosmetics. Without mbuffer it would probably become significantly slower.
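
So a bare-bones version of the pipeline would be something like this (a sketch; no buffering or progress display, but it should work anywhere nc is available):

Code:
# receiving host:
nc -w 20 -l 8023 | zfs receive -vF pool/dataset

# sending host:
zfs send pool/dataset@relocate | nc -w 20 XX.XX.XX.XX 8023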
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
`pv` there, I guess, is only for cosmetics. Without mbuffer it would probably become significantly slower.


Ultimately, it seems the disk image from TrueNAS is incompatible in some capacity.
I ended up settling on this command:

zfs send SSDVM/VM/UbuntuServer | ssh root@192.168.0.250 zfs receive -F rpool/data/vm-101-disk-0

It is indeed writing into the destination server, but alas, she won't boot.
I may just re-install Ubuntu 22.04 on the new system in a few days, but I thought this might be a fun experiment. No idea what the issue is; maybe it's RAW mode or something, I'm just not sure.
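
For what it's worth, the snapshot-based form of that ssh pipeline would be something like this (the @migrate snapshot name is made up, and whether it changes the boot behaviour is a separate question; that may come down to how Proxmox expects the disk to be presented):

Code:
zfs snapshot SSDVM/VM/UbuntuServer@migrate
zfs send SSDVM/VM/UbuntuServer@migrate | ssh root@192.168.0.250 zfs receive -F rpool/data/vm-101-disk-0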
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,374
I decided to boot the VM with an Ubuntu USB key and see what I could see.

Code:
root@proxmox:~# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
rpool                         34.3G   194G    96K  /rpool
rpool/ROOT                    4.09G   194G    96K  /rpool/ROOT
rpool/ROOT/pve-1              4.09G   194G  4.09G  /
rpool/data                    30.1G   194G    96K  /rpool/data
rpool/data/subvol-102-disk-0  11.7G  28.3G  11.7G  /rpool/data/subvol-102-disk-0
rpool/data/vm-100-disk-0        56K   194G    56K  -
rpool/data/vm-101-disk-0      18.4G   194G  18.4G  -

(it's vm-101-disk-0)
So it has pushed 18GB of content from TrueNAS to Proxmox, but if my basic understanding is correct, the destination content is, well, dead/scrambled. It's not a filesystem, right? So clearly I'd need to tackle this differently.
Probably something in my send/recv command.
 