Replication causing datasets to act unmounted

Status
Not open for further replies.

philiplu

Explorer
Joined
Aug 10, 2014
Messages
58
I'm seeing similar issues as mentioned in some other forum posts, like ZFS Replication Fails from time to time and ZFS Replication: Dataset becomes unavailable on receiving end after receiving snapshot. While replication is going on, I start seeing syslog messages like these:

Oct 1 16:30:05 Marvin collectd[42972]: statvfs(/mnt/bakset1/bu/users/user1) failed: No such file or directory
Oct 1 16:30:05 Marvin collectd[42972]: statvfs(/mnt/bakset1/bu/users/user2) failed: No such file or directory

That's just a symptom - the real problem seems to be that zfs receive can sometimes unmount a dataset, even though zfs mount will still show the dataset as mounted.

What's interesting is that I'm not seeing this with GUI-configured replication, the way those other forum posts are. Instead, I'm seeing it with a script of mine that pushes an on-demand replication to a local pool; I use it to create rotating backup pools that are hot-swapped and stored off-site. The replication script tries to mimic the same zfs commands that GUI replication uses (at least, as of before 9.3.1 - I haven't checked exactly what's changed in the new replication scheme). Here are example commands I use to replicate locally:

zfs snapshot -r pool/bu@bakset1-20151001.1616
zfs send -e -R -I bakset1-20150929.1304 pool/bu@bakset1-20151001.1616 | zfs receive -v -F -d bakset1
zfs destroy -r -v pool/bu@bakset1-20150929.1304

Once that zfs receive starts, the collectd errors start appearing. If I manually look at the tree in the PULL pool bakset1, the top-level received dataset at bakset1/bu doesn't have the actual nested datasets within it. Instead, it just has the placeholder mountpoint directories for those datasets. You can tell because the mountpoints are empty, and they all have the same modification time, from when the bakset1 destination was first created, instead of various times corresponding to the modifications of the source datasets under pool/bu. The collectd syslog errors appear because those datasets are nested two levels down, e.g. bakset1/bu/users/user1, and those mountpoint directories can no longer be found since their parent directory is acting unmounted.

Even though the nested datasets act like they're unmounted, they still show up in the output from zfs mount.
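
In case anyone wants to check for the same half-mounted state, comparing what ZFS reports with what the kernel actually has mounted should show the mismatch (the pool and dataset names below are from my setup, so adjust as needed):

Code:
# What ZFS believes - the nested datasets still report mounted=yes
zfs list -r -o name,mounted,mountpoint bakset1/bu

# What the kernel actually has mounted - the nested datasets are missing here
mount -p -t zfs | grep bakset1

# The mountpoint directories exist, but they are empty placeholders
ls -l /mnt/bakset1/bu/users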

I fix the problem by detaching and reattaching the destination pool in the GUI storage tab, but the problem recurs the next time I replicate.

Anyone have any idea why a zfs receive would half-way unmount a dataset?

I've attached the script I use to manually replicate, though the details there aren't important to the errors I'm seeing.
 

Attachments

  • repl-to-bakset.py.txt
    4 KB · Views: 256

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
Check that your volume is not set as readonly.
If it is, then set it to readonly=off.
If you detach or zpool export bakset1 and then run "zpool import bakset1" from the CLI, you should see errors about datasets not being mounted. That is most likely the sign your volume is set to readonly=on.
Once you set it to readonly=off and import it again, it should mount all the datasets and include the volume in the GUI.
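
Roughly the sequence I mean, assuming the volume is named bakset1:

Code:
# Check the readonly property first, at both the pool and dataset level
zpool get readonly bakset1
zfs get -r readonly bakset1

# If it comes back on, turn it off on the pool's root dataset
zfs set readonly=off bakset1

# Export and re-import from the CLI and watch for "cannot mount" errors
zpool export bakset1
zpool import bakset1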
 

philiplu

Explorer
Joined
Aug 10, 2014
Messages
58
No, not readonly. zpool get readonly bakset1 returns readonly=off. I would expect, if readonly was on, that the zfs receive would have failed. But it doesn't - the backup datasets look correct, once I detach/attach the pool.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
No, not readonly. zpool get readonly bakset1 returns readonly=off. I would expect, if readonly was on, that the zfs receive would have failed. But it doesn't - the backup datasets look correct, once I detach/attach the pool.
readonly=on on a volume doesn't prevent it from receiving replication snapshots. It blocks every other type of write, but not replication.
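
A quick way to convince yourself of that (the pool and snapshot names are just placeholders for your own):

Code:
zfs set readonly=on bakset1
# Normal writes are refused...
touch /mnt/bakset1/bu/testfile     # fails with "Read-only file system"
# ...but a replication stream is still accepted
zfs send -i pool/bu@old pool/bu@new | zfs receive -F -d bakset1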
 

philiplu

Explorer
Joined
Aug 10, 2014
Messages
58
I still don't understand what's going wrong, but I do have a good workaround. Leaving this here in case anyone else runs into a similar issue.

To recap - I'm doing a local incremental replication, zfs send -R -I piped to zfs receive -F -d. For instance, I replicate all datasets under pool/bu to my backup volume bakset1, receiving those datasets into a matching tree at bakset1/bu. When the receive starts, the system acts like the datasets nested within bakset1/bu are no longer mounted - an ls -l /mnt/bakset1/bu will show the mountpoints for the nested datasets, but all of those mountpoints are empty. Running zfs mount, though, will show that the datasets are still 'officially' mounted. That then triggers the collectd entries in the system log mentioned in the original post.

I think I occasionally saw something like this months ago, back when I had a GUI replication task replicating locally in a similar fashion, but it was intermittent. There are also posts elsewhere from people seeing the same collectd statvfs error messages, which leads me to believe the problem can still sometimes occur.

So anyway, since the receive is making the destination datasets act as if they're unmounted, and they don't need to be mounted when the zfs receive runs, I tried forcibly unmounting the datasets before replicating, then remounting after the receive. That seems to be working - no more syslog errors, and no more datasets that seem stuck in a half-unmounted/half-mounted state.

Here are the modifications I made to my original example commands:

zfs unmount bakset1/bu
zfs snapshot -r pool/bu@bakset1-20151001.1616
zfs send -e -R -I bakset1-20150929.1304 pool/bu@bakset1-20151001.1616 | zfs receive -v -F -d bakset1
zfs destroy -r -v pool/bu@bakset1-20150929.1304
zfs mount -a

and I've attached the modified script I use to perform the local replication.
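
After the final zfs mount -a, it doesn't hurt to double-check that everything really came back (again, names are from my setup):

Code:
# Every dataset under bakset1 should report mounted=yes again,
# and the mountpoints should contain the replicated data
zfs list -r -o name,mounted,mountpoint bakset1
ls -l /mnt/bakset1/bu/users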
 

Attachments

  • repl-to-bakset.py.txt
    4.1 KB · Views: 244

balrog76

Dabbler
Joined
Jul 25, 2015
Messages
10
AFAIK, this is working as designed.

cut'n'paste from here: http://docs.oracle.com/cd/E19253-01/819-5461/gbchx/index.html

Receiving a ZFS Snapshot
Keep the following key points in mind when you receive a file system snapshot:

  • Both the snapshot and the file system are received.

  • The file system and all descendent file systems are unmounted.

  • The file systems are inaccessible while they are being received.

  • The original file system to be received must not exist while it is being transferred.

  • If the file system name already exists, you can use the zfs rename command to rename the file system.

 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Actually, this issue is a little more complicated than that.

Those Oracle links are nice, but you can't always trust the Oracle documentation anymore, because OpenZFS is not the same as the Oracle ZFS implementation. So it's best to simply *never* use those links unless you've verified yourself that everything said is 100% accurate. It tends to be accurate, but it's wrong for OpenZFS often enough to really ruin someone's day.

The readonly=on attribute hasn't really been used until recently in FreeNAS/TrueNAS. You used to be able to write to the dataset on a destination box, and that could create problems:

- What do you do when Samba tries to write something and you want to replicate something at the same time?
- What do you do when Samba tries to access a dataset and it isn't available because replication is in progress?
- What do you do when Samba is happily looking at a dataset, and suddenly a new snapshot finishes replicating that changes the directory structure you were looking at 2 seconds ago?

Lots of edge cases suddenly manifest themselves and the end-result is generally hate, discontent, and confusion for the services involved.

Now, the replicator FreeNAS uses requires the readonly attribute to be set to on. This prevents Samba, NFS, or iSCSI from trying to initiate a write at the same time as replication, but it does nothing for the other scenarios. This change in behavior has only recently (2-3 months?) become a problem, because the replicator was rewritten to allow for new scenarios (A->B and A->C at the same time).

In one thread (https://forums.freenas.org/index.ph...receiving-end-after-receiving-snapshot.30821/) the poster mentions that he was sharing out the data; if readonly=off (which he said it was), then the problems are very likely self-inflicted, and the odd behaviors follow from those problems.

I've worked with lots of systems, and never seen these problems myself, nor had anyone complain about these issues. So I'm guessing people are trying to do silly things, have something misconfigured somehow, or something else along those lines.

Until the recent replicator code rewrite, it really wasn't a good idea to share out the PULL system's replicated storage over Samba, NFS, etc., because of possible edge cases. I know people who set up the shares, but they were the "backup", so they weren't ever mounted except to grab a file or two as necessary.

From reading all of this, I get the feeling that people think they can do things with replicated targets that they probably shouldn't be doing: mounting them and accessing them as if there is no reason for concern. Until the readonly=on attribute was used, I don't think that was ever supported, recommended, or maybe even tested. You cannot use network sharing protocols if the underlying filesystem might spontaneously change without warning (which is what happens when a snapshot finishes replicating... one second you have one particular filesystem, the next you have a totally different one). Even if you log into a Samba share and then go change something from an ssh session, if what you change conflicts with Samba's expectations, bad things can and do happen. This is also why you don't share out the same location over both NFS and Samba: the file system can change because of a write from Samba, but nfsd isn't happy when the file system suddenly changes with no warning.

With this thread in particular, I'm reading the Python code that is in the original post, and it makes it pretty clear that the intention is NOT to use the built-in replication service that FreeNAS provides in the WebGUI. This screams "not supported" and "no clue how the system will respond". Not that the Python code won't work, but something someone put together in their spare time isn't as well tested, vetted, and reproducible as the replicator that everyone around the world using FreeNAS is running, with common experiences and common results.

FWIW, when the new replicator was first released, it was a little buggy. It wasn't good with edge cases and specific scenarios. It created quite a few headaches at first, because so many problems came up all of a sudden that I couldn't make heads or tails of whether a given report was a misconfiguration, an upgrade issue, or an actual bug (they were usually bugs).
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I've experienced the original problem where the ZFS dataset was unmounted, and that was on a PULL box used only as a backup (no shares or anything). Fortunately, over the past 3 months the replication issues, including this one, seem to have been fixed.

I think it's strange that the replicated dataset on PULL isn't read-only, though, given the possibility of the user-induced issues mentioned above.



 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Your situation is completely different. Please stick to your post. This issue was that the datasets would not appear at all. Thank you!
 

elangley

Contributor
Joined
Jun 4, 2012
Messages
109
Your situation is completely different. Please stick to your post. This issue was that the datasets would not appear at all.


Okay and sorry for the hijack. I will remove the post.
 

brumnas

Dabbler
Joined
Oct 4, 2015
Messages
33
Sorry for reanimating this, but it seems to be the same issue I have - I'm replicating locally, i.e. PUSH and PULL are both 127.0.0.1. PUSH creates snapshots every hour; PULL receives them once daily, at night. The PULL side is not shared and isn't doing anything fancy - I only need it as a backup.

PUSH = /mnt/pool1, with some recursive datasets
PULL = /mnt/pool2, no data but backup for pool1

The PULL side has a dataset called "backup", which is the target for the replication. So when pool1/livedatabackup is replicated, it should end up in /mnt/pool2/backup/livedatabackup.

Log:
Code:
nasko collectd[9633]: statvfs(/mnt/pool2/backup/livedatabackup) failed: No such file or directory

- this message comes every 10 minutes (what process is triggering it?)
- the "livedatabackup" dataset originally lives in pool1
- some datasets are NOT error-logged - strangely, the "more tricky" ones, like Samba shares, are OK, but the never-shared ones are logged

The only "complications" in this simple scenario I can think off:
- I'm replicating locally, e.g. both PUSH/PULL are 127.0.0.1
- The "problematic" (those err logged) datasets @pool1 were created _after_ the very first snapshot/replication was ran

pool2 status:
- the replicated copies seem to be OK, e.g. when cloning from them I get the original files
- the copied datasets are displayed as existing, but those in the error log are empty - this is suspicious somehow (see the check right after this list)
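
To see whether those "empty" copies are really mounted datasets or just leftover placeholder directories, something like this should help (the paths are from my setup):

Code:
zfs list -r -o name,mounted,mountpoint pool2/backup
# A dataset showing mounted=yes here while its mountpoint directory is
# empty is in the same half-unmounted state described earlier in the thread
ls -l /mnt/pool2/backup/livedatabackup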

Open issues:
- Which process retries every 10 minutes, fails, and logs those "statvfs" errors?
- Can creating new datasets on PUSH after the first replication cause problems on PULL?
- After the very latest replication, should the PULL pool look exactly like PUSH, i.e. mounted datasets containing files, etc.?

Thank you,
Andrej
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
I would like to add my situation to this as well, as it seems I am experiencing something very similar to @brumnas, with a slight variation:

I am replicating over SSH between 2 FreeNAS boxes. I have recursive snapshots set up on PUSH and a replication task that runs once every night. Everything is replicated one for one. My only difference is that I created a new dataset called "example" on the PULL box that is not on the PUSH server. After the replication runs every night (successful, confirmed), the PULL server starts throwing the following errors every 10 minutes:
Code:
Jul  6 12:44:05 axio-backup collectd[8869]: statvfs(/mnt/axio-backup/tunnelhere) failed: No such file or directory
Jul  6 12:44:15 freenas-backup collectd[8869]: statvfs(/mnt/freenas-backup/example) failed: No such file or directory
Jul  6 12:44:25 freenas-backup collectd[8869]: statvfs(/mnt/freenas-backup/example) failed: No such file or directory
Jul  6 12:44:35 freenas-backup collectd[8869]: statvfs(/mnt/freenas-backup/example) failed: No such file or directory
Jul  6 12:44:45 freenas-backup collectd[8869]: statvfs(/mnt/freenas-backup/example) failed: No such file or directory
Jul  6 12:44:55 freenas-backup collectd[8869]: statvfs(/mnt/freenas-backup/example) failed: No such file or directory


If I reboot, the error messages go away until the next replication finishes.
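
Based on the workaround earlier in this thread, remounting might be enough instead of a full reboot (untested on my side):

Code:
# Remount any datasets the receive left unmounted
zfs mount -a
# The statvfs messages should stop once the path exists again
tail -f /var/log/messages | grep statvfs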
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
Has anyone else come across this? It seems the bug submitted above didn't completely resolve this issue.

My console continues to be flooded with the "statvfs" error messages referencing the one dataset that I created on PULL, while everything else in the pool is a replicated dataset from the PUSH box.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Provide the output of 'ls /mnt/freenas-backup'. My guess is that example doesn't exist (or isn't mounted), likely because you are replicating into freenas-backup, and collectd is still trying to poll its stats. I've found that if I want usable space on a PULL system, I use a sub-dataset as the replication target, not the main pool dataset.
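
In other words, instead of receiving straight into the pool's root dataset, something like this (dataset names are just examples):

Code:
# On PULL: create a dedicated parent dataset to receive into, so the
# pool root (and anything you create next to it, like "example") is left alone
zfs create freenas-backup/replica

# The replication target then becomes freenas-backup/replica;
# with -d, tank/data@snap lands at freenas-backup/replica/data@snap
zfs send -R tank/data@snap | zfs receive -F -d freenas-backup/replica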
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
@depasseg makes sense. Is there an easy way to move all of the replicated data into a new dataset I create, without having to wipe the backup server and start from scratch? I have over 5TB of data and would prefer not to move it all over again, though I would like to preserve the existing snapshots that have already been sent to PULL along with the data.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
You should be able to pause the replication on PUSH, create another dataset on PULL, create a recursive snapshot, and replicate from one dataset to the other (assuming you have enough space for 2 copies), then reconfigure PUSH to point at the new dataset. "Should" being the key word.
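
Something along these lines, done locally on the PULL box ("data" stands in for whatever top-level dataset the replication created, and "replica" for the new parent):

Code:
# Create the new parent dataset on PULL
zfs create freenas-backup/replica

# Snapshot the replicated dataset recursively and copy it locally;
# -R carries the existing snapshots along, and with -d the copy
# lands at freenas-backup/replica/data
zfs snapshot -r freenas-backup/data@migrate-1
zfs send -R freenas-backup/data@migrate-1 | zfs receive -F -d freenas-backup/replica

# After verifying the copy (and its snapshots), reclaim the space
zfs destroy -r freenas-backup/data

Then point the replication task on PUSH at freenas-backup/replica.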
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
You should be able to pause the replication on PUSH, create another dataset on PULL, create a recursive snapshot, and replicate from one dataset to the other (assuming you have enough space for 2 copies), then reconfigure PUSH to point at the new dataset. "Should" being the key word.

I do have enough space and will do so. I will edit this post when completed to report back. Thanks for the help. Hopefully this stops the messages.
 