George Kyriazis
Hi there,
I have two FreeNAS systems configured as follows:
FreeNAS-8.3.1-RELEASE-p2-x64
1 mirrored pool (2x4TB) on each system
1/2 hour snapshots scheduled on PUSH (expiring after 2 days)
daily and weekly snapshots
zfs replication from PUSH to PULL
filesystem ~8% full.
The machines are geographically separated on a corporate network, with bandwidth between them of ~5-15Mb/sec.
ZFS replication takes too long to complete; in fact, my replication is backlogged. The network is not the bottleneck.
I can divide the replication of each snapshot into 3 observable phases:
1. Network activity. PUSH does a zfs send, PULL does a zfs receive, and everything works great. This phase takes about 15 minutes.
2. No network activity. PUSH has exited zfs send, but PULL is still executing zfs receive. zpool iostat indicates that PULL is still performing disk activity. This also takes about 15 minutes.
3. zfs inherit is executed after zfs send/receive. This one takes about 15 minutes too, with disk activity on PULL.
Phases 2 and 3 bring the time of a snapshot replication to more than 1/2 hour, so my replication is lagging behind. This didn't always happen; it started happening lately.
Any way you cut it, taking ~45 minutes per snapshot, with only 1/3 of it spent on network traffic, seems like too much overhead for replication.
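To put rough numbers on the backlog, here is a small Python sketch. It assumes snapshots are replicated strictly one at a time, that each replication takes a constant 45 minutes (the three ~15 minute phases described above), and that a new snapshot is created every 30 minutes. The function and variable names are only for illustration and are not anything from FreeNAS.

```python
def replication_backlog(elapsed_min, snapshot_interval_min=30, replication_min=45):
    """Estimate how many snapshots are still waiting after elapsed_min minutes.

    Simplifying assumptions: one snapshot is created per interval,
    replication is strictly serial, and every replication takes the same time.
    """
    created = elapsed_min // snapshot_interval_min
    replicated = elapsed_min // replication_min
    # Replication can never get ahead of snapshot creation.
    return max(created - replicated, 0)


# Approximate phase durations in minutes, as observed on my systems.
phases = {
    "send/receive with network traffic": 15,
    "receive continues after send exits": 15,
    "zfs inherit": 15,
}
total_min = sum(phases.values())
network_share = phases["send/receive with network traffic"] / total_min

print(f"total per snapshot: {total_min} min")      # 45 min
print(f"network share: {network_share:.0%}")       # 33%
print(f"backlog after 24h: {replication_backlog(24 * 60)} snapshots")  # 16
```

Under these assumptions the backlog grows by about 16 snapshots per day, which matches what I am seeing: replication never catches up as long as one snapshot takes longer than the 30 minute snapshot interval.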
I'd like to get other people's feedback on whether this timing makes sense. What is zfs doing in the "dead" receive period? Why does zfs inherit take so long?
Thanks!
George