can't maintain ssh connection to perform large zfs send

seanm

Guru
Joined
Jun 11, 2018
Messages
570
I'm trying to zfs send a 6 TB pool from some old disks to some new disks. It gets a few hundred GB through, but then the ssh connection always drops with:

client_loop: send disconnect: Broken pipe

I've set ServerAliveInterval in my client's ~/.ssh/config but that doesn't seem to help. Could it be the TrueNAS dropping the connection?

I've also noticed that the zfs send/receive consumes 100% CPU, I guess because of decompression/recompression?

Thanks for any suggestions...
 
Joined
Oct 22, 2019
Messages
3,641
Why not use tmux, so you needn't keep an SSH session alive and active?

Secondly, for such a large transfer, you should be generating a "resume token" in case you need to resume, rather than starting all over again.
 
Joined
Oct 22, 2019
Messages
3,641
tmux (like screen) lets you run processes in separate sessions. These sessions continue to exist in the background. You can create, kill, list, enter, and exit tmux sessions.

This cheat sheet makes for a good bookmark and guide. (CTRL + B is your friend!)
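
In case it helps, a bare-bones walkthrough (the session name is just a placeholder):

tmux new -s migrate        # start a named session, then run your long job inside it
# detach with CTRL + B, then D -- the job keeps running in the background
tmux ls                    # list existing sessions
tmux attach -t migrate     # re-attach to the session later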

---

ZFS has a feature called a "resume token". Upon the first (or subsequent) run of a "zfs recv" command, the -s flag instructs the receiving dataset (the "destination" of the transfer) to periodically generate a unique resume token. (I think it updates its resume token every 10 seconds, or something like that.)

You can manually view the resume token on the destination dataset like so:
zfs get receive_resume_token newpool/mydata

It's a very long string! It contains the information needed to resume a ZFS send/recv, and it supersedes the subsequent "zfs send" options (because you cannot change any "send options" for a resumed transfer).

---

Your first attempt to send/recv might look something like this (notice there is no -t flag on the send, but there is a -s flag on the recv):

zfs send -w -R oldpool/mydata@migrate_2022-05-21_00-00 | zfs recv -v -s -d -F newpool

However, if it is interrupted and you wish to resume without starting all over, you can invoke the -t flag on the send side, which tells it to use a "resume token" found on the destination dataset (notice the recv side always invokes the -s flag, but the send side only invokes -t, without all the other options, nor snapshot names, nor pool names, nor dataset names, etc):

zfs send -t `zfs get -H receive_resume_token newpool/mydata | cut -f3` | zfs recv -v -s -d -F newpool

If you look closely, you'll see I embedded a command within a command. This is because the -t flag needs only the pure resume token string without extraneous fields or words.

To test this yourself, you can see the string without extraneous fields like so:
zfs get -H receive_resume_token newpool/mydata | cut -f3

In fact, during a send/recv transfer, you can watch this string continuously update in real time:
watch -n 10 "zfs get -H receive_resume_token newpool/mydata | cut -f3"
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
The resume token is working great, thanks. So I'll eventually get this thing copied now.

But for tmux, would it need to be installed on my TrueNAS or on the client I'm sshing from?
 

NugentS

MVP
Joined
Apr 16, 2020
Messages
2,947
It's on the server. SSH in and type tmux.
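
Something like this (the user/hostname are just an example):

ssh root@truenas           # log in to the TrueNAS box
tmux                       # start tmux there, then run the zfs send | recv inside it
# if the ssh connection drops, just ssh back in and re-attach:
tmux attach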
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
OK, thanks all. tmux keeps the zfs send alive when the ssh connection dies, and the resume token also works.

Alas, even with `nice -n 20` the zfs send/receive jumps the CPU to 100% and everything else the NAS is running (notably VMs) becomes uselessly slow.

Is there any way to run zfs send/receive at rock-bottom priority?
 
Joined
Oct 22, 2019
Messages
3,641
Is there any way to run zfs send/receive at rock-bottom priority?
You applied "nice" on both ends (send and recv)?

Is this a raw stream? If not, it means that the records would have to decrypt -> re-encrypt (if you're using encryption).

How much RAM do you have? What CPU?
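
(For reference, the relevant properties can be checked like so -- the dataset name is just a placeholder:)

zfs get encryption,compression oldpool/mydata

With "zfs send -w" the blocks go out exactly as they're stored on disk, so there's no decrypt/re-encrypt or decompress/recompress step in flight.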
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
You applied "nice" on both ends (send and recv)?
Good point - I missed that. Alternatively use an explicit shell invocation: nice -n 20 sh -c "zfs send ... | zfs receive ..."
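
Spelled out with the commands from earlier in the thread (the snapshot/pool names are only the example ones), that would be roughly:

nice -n 20 sh -c "zfs send -w -R oldpool/mydata@migrate_2022-05-21_00-00 | zfs recv -v -s -d -F newpool"

or, nice-ing each side of the pipe individually:

nice -n 20 zfs send -w -R oldpool/mydata@migrate_2022-05-21_00-00 | nice -n 20 zfs recv -v -s -d -F newpool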
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
Oh, indeed I only put one 'nice', will try on both ends.

It's an Intel Xeon E5-2630 v4 @ 2.20GHz with 64 GiB RAM.

It's not that the CPU usage is unexpected: since the pools use different compression types, I presume it's uncompressing and recompressing. What is unexpected is that it pulverizes the system so badly that my FAMP VM is literally unable to serve a simple webpage. 6 TB will take days anyway, I'd rather it take its time. :)
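
(In case it's useful to anyone following along, the compression settings on both sides can be compared with something like this, dataset names being placeholders:)

zfs get compression oldpool/mydata newpool/mydata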
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
my FAMP VM
So that's a FreeBSD/Apache/MySQL(MariaDB)/PHP stack? Why in a VM? If you are running that on FreeBSD, a jail would have much lower overhead.
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
So that's a FreeBSD/Apache/MySQL(MariaDB)/PHP stack? Why in a VM? If you are running that on FreeBSD, a jail would have much lower overhead.

Yes, FreeBSD/Apache/MySQL/PHP. In a VM because TrueNAS is forever far behind FreeBSD and I want to run FreeBSD 13, but jails would limit me to FreeBSD 12 (well, until a few days ago, yes, I know Core 13 is out now).
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
That's why they switched to tracking -stable. You could run 12.3 jails on TN 12 and now 13.1 jails on TN 13 ... not that far behind anymore. OPNSense made the same switch, btw.
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
Yeah, I'm happy to see that change to -stable. But, it doesn't help me today. :)

I also tried "nice -n 20 sh -c "blah blah"" and after it had run for literally 5 seconds my website became inaccessible. :(
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
So I discovered the -R flag, which seems to send a more 'raw' stream where the destination dataset ends up with the same compression algorithm. I had hoped this would greatly reduce CPU usage, but alas, no. It still consumes 98% CPU to do this send/receive, and after a few minutes all the VMs I'm running turned to molasses.

Is it impossible to do a large send/receive on a live system??
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
I'm doing that all day with hourly replications. All VMs on SSDs or spinning drives?
 
Joined
Oct 22, 2019
Messages
3,641
So I discovered the -R flag and that seems to send a more 'raw' stream where the destination dataset ends up with the same type of compression algorithm. I had hoped this would greatly reduce CPU usage, but alas, no. It still consumes 98% CPU to do this send/receive
I'll be honest, that does seem odd for what should be mostly data transfer. The CPU shouldn't get slammed just to do a raw ZFS send/recv.

Maybe run htop and sort by CPU% to see which process(es) are taking the CPU to its limit?
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
The VMs are on a mirror of 2 SSDs. The send/receive is between a pool of old spinning HDs and a pool of newer spinning HDs, that is, the VMs are not on a pool involved in the copy at all.

htop says it's all kernel:

[screenshot of htop: CPU usage almost entirely in kernel threads]
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
I had the idea to try just doing a plain old 'cp', from and to the same pools, and the insane CPU usage occurs there too. Then I tried copying from yet a different pool to my new pool, and same issue again. (In fact, killing the cp doesn't even calm my CPUs down, I had to unmount the destination pool.)

So it's not zfs send/receive per se.

Then I created a new dataset on the new/destination pool, this time using lighter compression (zstd-3), and now things work reasonably. So it seems the culprit is that my destination was using zstd-19, and/or that my source is using gzip-9.
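
For anyone who lands here later, that roughly amounts to something like (dataset names are only examples):

zfs create -o compression=zstd-3 newpool/mydata2     # new dataset with lighter compression
zfs set compression=zstd-3 newpool/mydata            # or change an existing dataset (only affects newly written blocks)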
 