We use replication from one server to a backup server. It's not much data to
replicate, around 1.9 TiB in 36 datasets (4400 snapshots).
Transmitting the first snapshot of each dataset maxed out the network connection
(1 Gbit/s up/down). After that the transfer stalled: each subsequent snapshot
takes several minutes, with only a few KiB to a few MiB transferred every few
minutes. It took more than 24 hours to transmit 2000 snapshots (and it's
still doing its thing).
To test, I manually transferred all the data to the remote side:
Code:
zfs snapshot -r pool@foo
zfs send -Rv pool@foo | ssh freenas-backup zfs receive -Fdu newpool
This transfer, including all 4400 snapshots, took around 6 hours, which seems
much more reasonable. The FreeNAS replication, however, transferred only half
of the snapshots in 24 hours. Something seems wrong, and I suspect it has
something to do with the “zfs list” process (on PUSH), which “ps” shows
almost all the time:
Code:
/sbin/zfs list -t snapshot -H
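To see whether that listing really runs near-continuously, something like this can be run on PUSH during replication (a quick sketch, not part of FreeNAS; the `[z]fs` pattern just keeps grep from matching itself):

```shell
# Rough check (my assumption about where the time goes): sample "ps" a few
# times and count how often the snapshot listing is running. If it shows up
# in nearly every sample, the replication script is spending its time
# re-listing all ~4400 snapshots instead of sending data.
hits=0
for i in 1 2 3; do
    if ps ax | grep '[z]fs list -t snapshot' > /dev/null; then
        hits=$((hits + 1))
    fi
    sleep 1
done
echo "zfs list seen in $hits of 3 samples"
```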
But since I don't know the nitty-gritty details of the replication, I might be wrong. Furthermore, the log file is filled with thousands of lines like these:
Code:
Aug 26 21:48:20 freenas autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs get -H freenas:state tank/foo/bar@auto-20140703.0700-2m
Aug 26 21:48:20 freenas autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs get -H freenas:state tank/foo/bar@auto-20140719.0700-2m
Aug 26 21:48:20 freenas autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs get -H freenas:state tank/foo/bar@auto-20140717.0700-2m
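With ~4400 snapshots, one Popen() per snapshot means thousands of forked "zfs get" processes per pass, which would fit the low CPU/network/disk picture. As a sanity check, the repeated lines can be counted per timestamp; here against a sample file built from the three lines above (the path /tmp/autosnap-sample.log is just for illustration):

```shell
# Count "zfs get" forks logged by autosnap.py within one second.
# All three sample lines share the timestamp 21:48:20, i.e. one fork
# per snapshot in rapid succession.
cat <<'EOF' > /tmp/autosnap-sample.log
Aug 26 21:48:20 freenas autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs get -H freenas:state tank/foo/bar@auto-20140703.0700-2m
Aug 26 21:48:20 freenas autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs get -H freenas:state tank/foo/bar@auto-20140719.0700-2m
Aug 26 21:48:20 freenas autosnap.py: [tools.autosnap:58] Popen()ing: /sbin/zfs get -H freenas:state tank/foo/bar@auto-20140717.0700-2m
EOF
forks=$(grep -c 'Popen()ing: /sbin/zfs get' /tmp/autosnap-sample.log)
echo "$forks zfs get forks at 21:48:20"
```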
CPU, network, and disk usage are low on both PUSH and PULL during the
replication. The SSH cipher is disabled, and I left replication stream
compression on the default setting (lz4).
Am I alone with this, or is it a known issue? Should I file a bug report? It's
a little hard to reproduce in a VM, since you need quite a lot of data,
including a bunch of snapshots.
Specs PUSH:
FreeNAS-9.2.1.7-RELEASE-x64
32 GiB RAM, 2 mirrored vdevs
Specs PULL:
FreeNAS-9.2.1.7-RELEASE-x64
8 GiB RAM, RAIDZ-1