I believe I've run into a bug in FreeNAS 8.2-beta3 with ZFS dataset replication when you have sub-datasets and recursive snapshots turned on. I can understand if the behaviour is there to prevent accidental data loss, but it also appears to have the unexpected side effect of not replicating the sub-datasets.
I have a ZFS dataset structure that looks like this:
Code:
(local system - charon)
NAME                      USED   AVAIL  REFER  MOUNTPOINT
data1                     826G   2.55T  51.0G  /mnt/data1
data1/vm_backups          141M   2.55T   136K  /mnt/data1/vm_backups
data1/vm_backups/email    112K   2.55T   112K  /mnt/data1/vm_backups/email
data1/vm_backups/unix     70.9M  2.55T  70.8M  /mnt/data1/vm_backups/unix
data1/vm_backups/windows  70.3M  2.55T  70.2M  /mnt/data1/vm_backups/windows
I don't know if it's necessarily a good idea to split my VMware VDR backups into separate datasets, but that's a discussion for another topic. (It made sense in my brain at the time.)
Anyway, I set up periodic recursive snapshots on data1/vm_backups.
This meant I ended up with a set of snapshots like this:
Code:
(local system - charon)
NAME                                            USED   AVAIL  REFER  MOUNTPOINT
data1                                           826G   2.55T  51.0G  /mnt/data1
data1/vm_backups                                141M   2.55T   136K  /mnt/data1/vm_backups
data1/vm_backups@auto-20120613.1133-1h             0       -   136K  -
data1/vm_backups/email                          112K   2.55T   112K  /mnt/data1/vm_backups/email
data1/vm_backups/email@auto-20120613.1133-1h       0       -   112K  -
data1/vm_backups/unix                          70.9M   2.55T  70.8M  /mnt/data1/vm_backups/unix
data1/vm_backups/unix@auto-20120613.1133-1h      64K       -  70.8M  -
data1/vm_backups/windows                       70.3M   2.55T  70.2M  /mnt/data1/vm_backups/windows
data1/vm_backups/windows@auto-20120613.1133-1h   64K       -  70.2M  -
(I was originally doing daily snapshots that expired after a few days, but I shortened the schedule for testing.)
I then went to the ZFS Replication tab and started entering the information:
Volume/dataset: data1/vm_backups (this is the only selectable option at this point)
Remote ZFS filesystem name: tank/charon
Recursively replicate and remove stale snapshot on remote side: YES (checked)
Initialize remote side for once: YES (checked)
And filled in the remote details.
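For context, here is roughly what I assumed recursive replication with those settings would boil down to on each run after the initial sync. This is just my mental model, not the actual command from the autorepl logs, and the snapshot names simply follow my auto snapshot naming:

Code:
# My assumption of what each recursive replication cycle should amount to
# (not the actual autorepl command; snapshot names follow my auto snapshot schedule)
zfs send -R -i auto-20120613.1133-1h data1/vm_backups@auto-20120613.1148-1h | \
    /usr/bin/ssh -i /data/ssh/replication atropos "/sbin/zfs receive -Fd tank/charon"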
After some testing, I found I was getting the dataset I specified, and I was certainly getting snapshots - old snapshots were even expiring - but I wasn't getting the recursive part: the sub-datasets never showed up on the remote side.
Code:
(remote system - atropos)
tank/charon                                   57.3K  1.15T  26.6K  /tank/charon
tank/charon/vm_backups                        30.6K  1.15T  30.6K  /tank/charon/vm_backups
tank/charon/vm_backups@auto-20120613.1133-1h      0      -  30.6K  -
tank/charon/vm_backups@auto-20120613.1148-1h      0      -  30.6K  -
I'd done some googling on how ZFS replication works, and the commands in the log files looked correct. I even tested them by hand, and they did run.
I then deleted the replication task, the periodic snapshot task, and all the existing snapshots, and tried the process manually: I created a recursive snapshot and replicated it, basing the command partially on what I'd seen in the log files, but without silencing errors or anything of the sort. It worked perfectly.
Code:
zfs snapshot -r data1/vm_backups@1
zfs send -R data1/vm_backups@1 | /usr/bin/ssh -i /data/ssh/replication atropos "/sbin/zfs receive -Fd tank"
Code:
(local system - charon)
NAME                        USED   AVAIL  REFER  MOUNTPOINT
data1                       813G   2.56T  51.0G  /mnt/data1
data1/vm_backups            141M   2.56T   136K  /mnt/data1/vm_backups
data1/vm_backups@1             0       -   136K  -
data1/vm_backups/email      112K   2.56T   112K  /mnt/data1/vm_backups/email
data1/vm_backups/email@1       0       -   112K  -
data1/vm_backups/unix      70.8M   2.56T  70.8M  /mnt/data1/vm_backups/unix
data1/vm_backups/unix@1        0       -  70.8M  -
data1/vm_backups/windows   70.2M   2.56T  70.2M  /mnt/data1/vm_backups/windows
data1/vm_backups/windows@1     0       -  70.2M  -
Code:
(remote system - atropos)
NAME                        USED   AVAIL  REFER  MOUNTPOINT
tank                       1.50T   1.15T  33.3K  /tank
tank/vm_backups             137M   1.15T  30.6K  /tank/vm_backups
tank/vm_backups@1              0       -  30.6K  -
tank/vm_backups/email      25.3K   1.15T  25.3K  /tank/vm_backups/email
tank/vm_backups/email@1        0       -  25.3K  -
tank/vm_backups/unix       68.8M   1.15T  68.8M  /tank/vm_backups/unix
tank/vm_backups/unix@1         0       -  68.8M  -
tank/vm_backups/windows    68.6M   1.15T  68.6M  /tank/vm_backups/windows
tank/vm_backups/windows@1      0       -  68.6M  -
So, I remembered one line that shows up in the log file whenever I set up a new replication task:
Jun 12 18:00:01 charon autorepl[70896]: Creating tank/vm_backups on remote system
This was something I had not done in my testing when logging in and trying it manually. So I got curious: what if I do the first replication of the periodic snapshot by hand, and then let the ZFS replication interface take over from there?
This appeared to work. It seems I also needed to set freenas:state=LATEST when doing this, after which things settled into a proper cycle of snapshot, replicate, wait, repeat. (A rough sketch of what I ran is below.)
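For reference, this is roughly what that manual first pass looked like. I'm reconstructing it, so treat it as a sketch rather than the exact commands; in particular, setting freenas:state on the local snapshot is my understanding of what the autorepl script checks on the next run, not something I've verified in the code:

Code:
# Rough sketch of the manual first replication before letting the GUI take over.
# (Reconstructed; the auto snapshot name is from the listings above, and setting
# freenas:state=LATEST on the local snapshot is my assumption about what autorepl
# looks for afterwards.)
zfs send -R data1/vm_backups@auto-20120613.1133-1h | \
    /usr/bin/ssh -i /data/ssh/replication atropos "/sbin/zfs receive -Fd tank/charon"
zfs set freenas:state=LATEST data1/vm_backups@auto-20120613.1133-1h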
I'm thinking the correct behaviour would be to detect sub-datasets and create those on the remote side as well as the dataset in question; something along the lines of the sketch below. That assumes the creation of the dataset on the remote end is intentional, and not some leftover workaround for very old versions of zfs send/receive.
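To illustrate what I mean, here's a sketch using my dataset and host names, purely as an example; I haven't looked at how autorepl actually creates the remote dataset:

Code:
# Sketch of the suggested behaviour: enumerate the local sub-datasets and
# pre-create each one on the remote side, instead of only the top-level dataset.
# (Illustrative only; uses my dataset/host names from above.)
for ds in $(zfs list -r -H -o name data1/vm_backups); do
    remote_ds="tank/charon/${ds#data1/}"
    /usr/bin/ssh -i /data/ssh/replication atropos "/sbin/zfs create -p ${remote_ds}"
done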