If you have a lot of data to move around between boxes and you're on a secure 10 GbE network, you can do some initial replication tasks manually and get vastly improved replication speeds. This doesn't use encryption, so I wouldn't use this method on an unsecured network.
I'm setting up two new storage servers for work. We have about 9TB of data. First I rsync'd the data to box1, then I used a replication task to replicate the datasets to box2 (both FreeNAS-11.0-U1, box2 being a warm standby). Both operations took days. I had to start over and didn't want to wait days again, so I decided to figure out if I could do the initial replication more quickly.
The replication task seemed to max out at about 500-600Mbps (about 60MB/s max) which was surprising, as both these boxes have 10gig interfaces. I was using fast encryption (not sure why I didn't turn this off) but even then, ssh was using maybe 25% CPU so I don't think that was the bottleneck.
In fact, I stumbled on a ticket that I can't seem to find right now that addressed this, I think. It has to do with how FreeNAS runs the replication tasks - something with that process slows it down considerably.
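To put the throughput numbers above in perspective, here is a rough back-of-the-envelope sketch. The `transfer_hours` helper is a made-up name for illustration, and the 1250 MB/s figure is just the theoretical line rate of 10 GbE (10 Gbps / 8), not something I measured.

```shell
# Hours needed to move SIZE_TB terabytes at RATE_MBS megabytes per second.
# Decimal units (1 TB = 1,000,000 MB); shell integer math rounds down.
transfer_hours() {
    size_tb=$1
    rate_mbs=$2
    echo $(( size_tb * 1000000 / rate_mbs / 3600 ))
}

echo "60 MB/s (replication task): $(transfer_hours 9 60) hours"
echo "1250 MB/s (10 GbE line rate): $(transfer_hours 9 1250) hours"
```

For 9TB that works out to roughly 41 hours at 60MB/s versus about 2 hours at line rate, which is why the slow replication task felt worth working around.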
What ended up working really well was using nc as a transport for the initial replication (you can use nc to open basic TCP sockets and transfer data, among other things). Using nc, I was able to get 500MB/s (about 4 Gbps), which is obviously quite the improvement.
First, I set up my snapshot tasks on box1 how I want them to be in production. I let FreeNAS take the initial snapshots.
Then on box2 (the receiver) I ran:
Code:
nc -w 120 -l 8888 | zfs receive poolname/dataset_being_replicated
I didn't have to create the dataset first.
Then on box1 (the sender) I ran
Code:
zfs list -t snap | grep dataset_to_replicate
to find the snapshot names for the dataset I wanted to replicate. I copied the latest snapshot name with the intent of replicating all snapshots up to that one. Then I ran:
Code:
zfs send -R pool/dataset_to_replicate@auto-20170712.1333-1d | pv -ptera -s SIZE_OF_DATASET | nc -w 20 box2 8888
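One possible way to fill in the SIZE_OF_DATASET placeholder in the command above, as a sketch only: ask ZFS for the dataset's `used` property in exact bytes and hand that to `pv -s`. The function names here are hypothetical, and I'm assuming your `zfs list` supports `-p` (exact, parsable numbers). Note that `used` is the on-disk (post-compression) size, so on a compressed dataset the stream may be larger and the ETA will only be approximate.

```shell
# Print the space used by a dataset in exact bytes.
# -H drops the header, -p prints exact numbers, -o used selects one column.
dataset_bytes() {
    zfs list -H -p -o used "$1"
}

# Print the sender pipeline with the size filled in.
# This is a dry run: it only echoes the command, nothing is sent.
# Arguments: snapshot (pool/dataset@snapname), receiver host, port.
print_send_command() {
    snapshot=$1
    host=$2
    port=$3
    dataset=${snapshot%@*}          # strip the @snapname part
    size=$(dataset_bytes "$dataset") || return 1
    echo "zfs send -R $snapshot | pv -ptera -s $size | nc -w 20 $host $port"
}
```

Calling something like `print_send_command pool/dataset_to_replicate@auto-20170712.1333-1d box2 8888` on box1 prints the full pipeline so you can look it over before pasting it in.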
Running the data through pv shows transfer speed and time elapsed, and if you provide -s XXXg it will give you a rough idea of how long the replication will take (it's not 100% accurate, but it's close). Without pv, you won't get any output until the transfer is complete.
Once this was finished, I set up a replication task in FreeNAS to take care of replication from now on. Everything seems to be working just fine.
Oh, you'll probably also want to run
Code:
zfs set readonly=on pool/dataset
on box2 after the initial replication is complete, because this is what FreeNAS does on the initial replication.
Just wanted to post this because I had to dig and dig to figure this out (as always with FreeNAS / ZFS). If I did anything really stupid here, please let me know. I've been using FreeNAS for a couple of years now and I still feel like a complete novice just on the verge of totally destroying everything even though I'm pretty sure I'm past that point by now!
EDIT: I forgot to re-enable replication from box1 to box2 and when I did, replication started over from scratch again. I *think* this is because all the snapshots on box2 were stale, but I don't know for sure. Make sure you test this procedure well!
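One thing that may be worth checking before re-enabling the replication task: an incremental zfs send needs a snapshot that exists on both boxes as its starting point. Below is a minimal sketch for comparing the two sides. `common_snapshots` is a made-up helper name, and the file paths in the comments are placeholders. I can't say for sure this was the cause in my case.

```shell
# Print the snapshot names that appear in both files (one name per line).
# $1: names from the sender, $2: names from the receiver.
# grep -F treats patterns as fixed strings, -x requires whole-line matches.
common_snapshots() {
    grep -Fx -f "$1" "$2"
}

# Example of building the input files (paths and dataset names are placeholders):
#   on box1: zfs list -H -t snapshot -o name -r pool/dataset | sed 's/.*@//' > /tmp/box1.snaps
#   on box2: zfs list -H -t snapshot -o name -r pool/dataset | sed 's/.*@//' > /tmp/box2.snaps
# then copy one file over and run:
#   common_snapshots /tmp/box1.snaps /tmp/box2.snaps
```

If this prints nothing, the two boxes share no snapshot, and a full replication from scratch is what I'd expect to happen.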