ZFS replication deleted existing dataset.

Fab Sidoli

Contributor
Joined
May 15, 2019
Messages
114
I am trying to set up dataset replication of user data between two FreeNAS boxes and have a query with the process.

I have Box A (source) and Box B (destination).

On Box A I have the following pool structure:

store
-> group
----> group1
----> group2
-> home
----> user1
----> user2

etc...

Each is a dataset in its own right, and on Box A I can see that these live under /mnt/store/...

On Box B I set up the pool store which also lives under /mnt/.

I have set up an SSH connection on Box A to allow the replication to take place.

Under Tasks -> Periodic Snapshot Tasks I have created two schedules, one for store/group and one for store/home (apparently you can't have one for store as this is the system dataset).

I set each of these to do recursive snapshots and have defined a temporary schedule for testing purposes.

Under Tasks -> Replication Tasks I have defined a task for doing the ZFS send/recv from Box A -> B.

I'm doing a PUSH over SSH using the SSH connection previously set up. The source datasets are selected as store/group and store/home. The target destination is set to store. The Recursive option is set.

The replication appears to take place and within the BUI I can see that the pools are "replicated". However, /mnt/store/group and /mnt/store/home on Box B appear to be empty when viewed on the command line using 'ls -l' (see below). 'zfs list' does show the datasets like the BUI, however.

# ls -lR /mnt/store
total 27
drwxr-xr-x 2 root wheel 3 Apr 3 00:01 group
drwxr-xr-x 2 root wheel 3 Apr 3 00:12 home

/mnt/store/group:
total 1
-rw-r--r-- 1 root wheel 0 Feb 20 21:02 .windows

/mnt/store/home:
total 1
-rw-r--r-- 1 root wheel 0 Feb 20 21:02 .windows


The first time I tested replication a while back I could actually see the datasets under /mnt/store/{group,home}. This time around, however, the first new replication actively deleted the datasets under store on Box B. It has left .zfs/snapshot directories in place however.

I'm not sure I understand what is going on.
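For anyone following along, the first thing I'd check from the shell on Box B is whether the received datasets are simply unmounted rather than gone (just a guess on my part; dataset names as above):

```shell
# On Box B: received datasets can exist in 'zfs list' but be unmounted,
# which makes their mountpoint directories look empty to 'ls'.
zfs get -r -o name,property,value mounted,mountpoint,readonly store

# If they show mounted=no, try mounting everything:
zfs mount -a

# The snapshots are reachable regardless of mount state:
ls /mnt/store/group/.zfs/snapshot
```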

Fab
 

dariop84

Cadet
Joined
Mar 21, 2016
Messages
3
Hi Fab,

I had the same issue: recursive datasets were replicated and then deleted (or invisible, I should say, since I was still able to see them from the GUI in the Storage tab).
I also had some (child) datasets that were visible (from the command line/SSH) but empty...

I "solved" it by creating a dedicated replication task for each child dataset, avoiding the "recursive" option.
Of course, if you have a lot of child datasets this is not a good solution, but I don't have a better one for now.
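From the shell, that workaround looks roughly like this (the hostname "boxb" and the snapshot name "@repl-test" are placeholders, not anything FreeNAS creates for you):

```shell
# One send/recv per child dataset instead of a single recursive task.
# 'zfs list -H -o name -r' prints the parent first; 'tail -n +2' skips it.
for ds in $(zfs list -H -o name -r store/home | tail -n +2); do
    zfs send "${ds}@repl-test" | ssh boxb zfs recv -F "$ds"
done
```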

I also had issues with the "advanced" replication task creation; there is a bug (?) where selecting SSH+NETCAT prevents you from browsing the destination datasets in the "Target Dataset" section.

Ah, and another bug (?) I found: even if you already have a Periodic Snapshot Task and active snapshots for a given dataset, the basic replication task creation always creates a new Periodic Snapshot Task. You then have to edit the replication task afterwards to point it at your existing snapshot task (and manually delete the one that was created automatically).

In general, I'd say the replication process in 11.3-U1 is somewhat buggy.
 

Fab Sidoli

Contributor
Joined
May 15, 2019
Messages
114
That's interesting to know.

Any ZFS gurus out there that can confirm one way or another whether this behaviour is expected? Essentially, I want to replicate data (and config) to the second box so that I can drop it in place for the primary should it go bang.
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
The replication appears to take place and within the BUI I can see that the pools are "replicated". However, /mnt/store/group and /mnt/store/home on Box B appear to be empty when viewed on the command line using 'ls -l' (see below). 'zfs list' does show the datasets like the BUI, however.
Pardon me if I don't understand, but is that not just because ZFS snapshots are hidden from directory browsing by default? Are you sure they are gone? What happens if you try to clone one of the datasets that are supposed to be deleted to a new location?
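The clone test I have in mind would look something like this (the snapshot and dataset names here are made up; substitute whatever 'zfs list -t snapshot' shows on Box B):

```shell
# Clone a replicated snapshot to a scratch dataset and inspect it:
zfs clone store/group/group1@auto-2020-04-03_00-00 store/group1-check
ls -l /mnt/store/group1-check

# Remove the clone when done:
zfs destroy store/group1-check
```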
 

Fab Sidoli

Contributor
Joined
May 15, 2019
Messages
114
Pardon me if I don't understand, but is that not just because ZFS snapshots are hidden from directory browsing by default? Are you sure they are gone? What happens if you try to clone one of the datasets that are supposed to be deleted to a new location?

What do you mean by hidden here? Do you mean from the user or from the system admin - or does ZFS not discriminate here?

I suppose my expectations may be wrong, but to me replication implies that the datasets will be copied from one host to the other with the structure left in place. From the CLI, which I'm most comfortable with, 'ls -l /mnt/store/group' gives different output on Host A than on Host B. On the former I am presented with a list of group directories; on the latter the directory appears empty, although I have verified that the snapshots do live in .zfs/snapshot. I have been able to clone one of the datasets, but I naively assumed I wouldn't have to do this.

What I'm after is a means of "failover" from Host A to B in the event Host A fails. Assuming everything else is in place for this to happen, I just want to be able to share out the dataset on B, but it looks like I need to clone them first.

Sorry if these are silly questions. I last used a ZFS appliance about six years ago, and that was a Solaris box where everything was done via the CLI. My recollection from back then is that zfs send and receive mirrored the datasets under the main mount points on each host, without anything being hidden from view.
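For what it's worth, my current understanding (which may be wrong) is that failover shouldn't need clones at all: the received datasets hold the real data, and replication targets typically arrive read-only and sometimes unmounted. Once replication to Box B is stopped, cutting over would look roughly like:

```shell
# Make the replicated datasets writable and mounted so they can be
# shared out from Box B. Do this only after actually failing over;
# the next incremental receive will fail once the data has changed.
zfs set readonly=off store/home
zfs set readonly=off store/group
zfs mount -a
```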
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
Sorry, I was apparently confused. I thought the files were hidden inside the .zfs subfolder on the destination as well, but when checking my own replication destination I can actually see all the files there from the command line. Mind you, I am replicating each dataset to its own recipient dataset inside a common parent. Somehow I think ZFS prefers it that way.
 

Fab Sidoli

Contributor
Joined
May 15, 2019
Messages
114
I've read suggestions that this is the way to do it, but it's not scalable when each user's home directory is a dataset and you have several hundred users.
 

Fab Sidoli

Contributor
Joined
May 15, 2019
Messages
114
I found this, which seems like a bug. A reboot has worked. I'll see what the next replication does.

 
