zvol replication "dataset is busy"

Status
Not open for further replies.

macmac1

Dabbler
Joined
Apr 9, 2014
Messages
17
Hello everybody,

I have two FreeNAS boxes running version FreeNAS-9.2.0-RELEASE-x64 (ab098f4).

Here is my pool and dataset configuration (both boxes have the same volumes and datasets configured):

# zpool list
NAME      SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
zfsVol1  21.8T   463G  21.3T   2%  1.00x  ONLINE  /mnt

# zfs list
NAME                                        USED  AVAIL  REFER  MOUNTPOINT
zfsVol1                                    6.19T  4.18T   244K  /mnt/zfsVol1
zfsVol1/jails                              1.29G  4.18T   482K  /mnt/zfsVol1/jails
zfsVol1/jails/.warden-template-pluginjail   805M  4.18T   805M  /mnt/zfsVol1/jails/.warden-template-pluginjail
zfsVol1/jails/bacula-sd_1                   271M  4.18T  1.05G  /mnt/zfsVol1/jails/bacula-sd_1
zfsVol1/jails/btsync_1                      247M  4.18T  1.03G  /mnt/zfsVol1/jails/btsync_1
zfsVol1/zfsDataset1                        2.31G  4.18T  2.08G  /mnt/zfsVol1/zfsDataset1
zfsVol1/zfsDataset2                         238K  4.18T   238K  /mnt/zfsVol1/zfsDataset2
zfsVol1/zfsVolume1                         2.06T  6.03T   220G  -
zfsVol1/zfsVolume2                         2.06T  6.24T  84.1M  -
zfsVol1/zfsVolume3                         2.06T  6.24T  55.6M  -

I have created periodic snapshot tasks for zfsVol1/zfsDataset1 and zfsVol1/zfsDataset2, and replication of these works.

But I also have zvols, which I export as iSCSI targets used by a VMware ESX server.
I tried to set up replication of zfsVol1/zfsVolume2 as well; periodic snapshots are configured for it.

But for the replication task, I get the following in /var/log/messages:

Apr 9 16:04:01 x48svr61xfn1 autorepl.py: [common.pipesubr:71] Executing: (/sbin/zfs send -V -R zfsVol1/zfsVolume2@auto-20140409.1527-2w | /bin/dd obs=1m | /bin/dd obs=1m | /usr/bin/ssh -c arcfour256,arcfour128,blowfish-cbc,aes128-ctr,aes192-ctr,aes256-ctr -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -q -l root -p 22 192.168.200.204 "/sbin/zfs receive -F -d zfsVol1 && echo Succeeded.") > /tmp/repl-39921 2>&1
Apr 9 16:04:05 x48svr61xfn1 autorepl.py: [common.pipesubr:57] Popen()ing: /usr/bin/ssh -c arcfour256,arcfour128,blowfish-cbc,aes128-ctr,aes192-ctr,aes256-ctr -i /data/ssh/replication -o BatchMode=yes -o StrictHostKeyChecking=yes -q -l root -p 22 192.168.200.204 "zfs list -Hr -o name -t snapshot -d 1 zfsVol1/zfsVolume2 | tail -n 1 | cut -d@ -f2"
Apr 9 16:04:05 x48svr61xfn1 autorepl.py: [tools.autorepl:332] Replication of zfsVol1/zfsVolume2@auto-20140409.1527-2w failed with 119620+609 records in 58+1 records out 61415564 bytes transferred in 2.483186 secs (24732567 bytes/sec) 119952+1 records in 58+1 records out 61415564 bytes transferred in 2.485535 secs (24709192 bytes/sec) cannot receive new filesystem stream: dataset is busy

"dataset is busy" seems to be the crucial part, but what does it mean? Should the target volume be unmounted during replication? Or is my pool configuration wrong for such a scenario (both iSCSI zvols and datasets on the same pool)?

What am I doing wrong? Is such iSCSI/zvol replication possible at all? Is there a step-by-step guide available, as there is for dataset replication?
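For reference, the transfer that autorepl.py performs can also be tried by hand from PUSH to see the error directly. This is only a sketch, using the destination host, key path, and snapshot name from the log above; adjust them for your setup:

```shell
# Manual version of the replication transfer, run on PUSH.
SNAP=zfsVol1/zfsVolume2@auto-20140409.1527-2w

# -R sends the snapshot with its properties; the remote receive
# uses -F (force rollback) and -d (derive dataset name from stream).
zfs send -R "$SNAP" | \
  ssh -i /data/ssh/replication root@192.168.200.204 \
    "zfs receive -F -d zfsVol1"

# If something on the destination holds zfsVol1/zfsVolume2 open
# (e.g. it is exported over iSCSI and mounted by ESX), the receive
# fails with: cannot receive new filesystem stream: dataset is busy
```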
 

macmac1

Dabbler
Joined
Apr 9, 2014
Messages
17
Partially solved: "dataset is busy" appears when the iSCSI target of the second FreeNAS box is mounted on the ESX server.
When I disable iSCSI on the second FreeNAS box, or disconnect the target from the ESX server, replication completes without errors, BUT:
when I then mount the second box's iSCSI target (the replication destination) on the ESX server, ESX does not see a valid VMFS partition there (it wants to reformat it).
Is there any way to replicate such iSCSI volumes?
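In case it helps anyone, the workaround I used can be scripted on PULL. This is a rough sketch and assumes FreeNAS 9.2's iSCSI daemon runs as the istgt service (check your rc service name before relying on it):

```shell
# On PULL: stop the iSCSI daemon so nothing holds the destination
# zvol open while the receive runs (assumed service name: istgt).
service istgt stop    # or disable the iSCSI service in the web UI

# ...wait for the replication task on PUSH to complete...

# Re-enable iSCSI export once the snapshot has been received.
service istgt start
```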
 
D

dlavigne

Guest
I asked a dev who suggested the following:

This is probably untested. To apply a snapshot update to a zvol, you have to unmount the zvol since changing a disk out from under a user will destroy the data, or worse. I suspect the scripts don't check for this.

It is probably worth creating a bug report at bugs.freenas.org so that this part of the replication system can be checked. If you do, post the issue number here.
 

macmac1

Dabbler
Joined
Apr 9, 2014
Messages
17
I finally did it this way (I will refer to the peers as PUSH and PULL, as in the FreeNAS replication documentation):

  • At PUSH, create periodic snapshots for the selected zvol and a replication task pointing to PULL.
  • Wait until the replication process sends a snapshot to PULL.
  • At PULL, select the desired snapshot and clone it. (Note: in version 9.2.0, which I started with, I had to remove PUSH's volume path from the suggested name for the clone. After upgrading to 9.2.1.3, this was no longer necessary.)
  • The newly created clone now appears under "Active Volumes" at PULL.
  • At PULL, create a new iSCSI extent. Select Type = "Device", choose the newly created snapshot clone from the list, and assign the extent to an iSCSI target.
  • When you try to restart the iSCSI service, it will fail, and you will find a message like "Auto size error" in /var/log/messages. This is due to this bug: https://bugs.freenas.org/issues/3120. You have to reboot the PULL box to get rid of the error.
  • After PULL reboots, make sure the iSCSI service is started.
  • You now have what you wanted: a copy of PUSH's iSCSI zvol, exported over iSCSI from PULL.
  • When you add the new iSCSI target (from PULL) to the same ESX server that has the original PUSH volume mounted as a datastore, you must choose "Assign a new signature" when adding the storage.
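The GUI clone step above corresponds roughly to this on the CLI. A sketch only; the snapshot name is an example, and zfsVol1/zfsVolume2-clone is a hypothetical name I chose for illustration:

```shell
# On PULL: list the snapshots that replication has delivered
# for the zvol, then clone the one you want to export.
zfs list -t snapshot -r zfsVol1/zfsVolume2

# Clone the chosen snapshot into a new zvol-backed dataset.
zfs clone zfsVol1/zfsVolume2@auto-20140409.1527-2w \
    zfsVol1/zfsVolume2-clone

# The clone can then be selected as the device for a new
# iSCSI extent in the web UI.
```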
For me this was pretty complex, and I would highly appreciate an example of zvol replication in the FreeNAS docs.

And my question to the FreeNAS experts persists: is this the right way to maintain zvol replication?
 
D

dlavigne

Guest
It looks like that bug just got a partial fix. If you get a chance to test the fixes in a 9.2.2 alpha or beta, let us know if this procedure is still necessary so we can have updated docs for the 9.2.2 release.
 