Being bitten by FreeBSD fts_read/statfs/ZFS issue - what to avoid and does the next release fix it?

Status
Not open for further replies.

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
I'm being badly affected on my FreeNAS box by a known, tracked FreeBSD ZFS issue related to statfs, along with two related reports. Unfortunately I'm not clear what I must avoid in order not to trigger it, or how to identify the existing snapshots or nested datasets that need renaming as a workaround.

This is the issue where fts_read returns "no such file or dir" when one runs ls -1R or find in a directory that is, or contains somewhere in its hierarchy, a .zfs/snapshot directory. The second link says it's due to the 63-or-88 byte statfs struct limit mentioned in the user guide (1.2), which affects the mounting of snapshots. But there isn't much detail, and it isn't clear in this context exactly what the relevant "Filesystem paths" are when a pool contains snapshots of a nested dataset (or snapshots of a root dataset that contains nested datasets), so I can't fix it on my NAS.

The problem shows up in two ways.
  • find and ls error out when searching files and directories on my NAS from the console. I'm not clear about the exact criteria that trigger the issue, so I can't tell which snapshots to delete or rename to fix it.
  • I can't be sure whether snapshots in a nested dataset hierarchy make the issue worse. Suppose I create a pool comprising a top-level dataset that contains 3 or 4 levels of nested datasets, like this: production_aberdeen/public_files/current/marketing, where marketing contains only ordinary files and directories. (We might need different zfs settings on the marketing dataset than on its parents.) Snapshots of production_aberdeen will imply snapshots of marketing, or marketing might have snapshots enabled in its own right. Judging by the output of zfs list, the existence of nested datasets seems to imply nested .zfs/snapshot directories. If I have nested datasets and then use find or ls -1R, I'm guessing that at some point they will need to mount a .zfs/snapshot, might hit a path longer than the 63-or-88 byte limit, and error out. I can't be sure.

At the moment it's difficult or impossible to use find or ls properly to search my pool from the console. So my questions are:
  1. Does FreeBSD 11.1 in the upcoming FreeNAS 11.1 fix this, as the above bug report suggests, or is it only a partial fix? (Or does a resolution require full ino64, including 1024-byte statfs paths, so that it will only be resolved when FreeBSD 12 lands in FreeNAS during or after 2019?)
  2. What exactly are the criteria for nested dataset names/paths and their snapshot names that trigger this issue? I need to know what to avoid on my NAS if I want find/ls to always work, and how to find the datasets and manual snapshots that need renaming because they trigger it.

Note - I've tried to narrow down the specific snapshot(s) that can't be traversed, using both truss find... and a nested find . -maxdepth 1 -exec find {} ... \; but neither shows me a specific snapshot that's a problem. Perhaps I'm misinterpreting the truss output, but it seems to say that the path lengths were okay..okay..okay..and then it fails, even though the last path shown was the same length as earlier paths that didn't fail.
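One way to test the path-length theory without relying on truss is to compute the path each snapshot would be mounted at and flag the long ones. The sketch below is only that: it assumes the limit that matters is the 88-byte figure quoted above (counting a terminating NUL), that the relevant "filesystem path" is the snapshot's mount path under .zfs/snapshot, and that every dataset is mounted at its default location under /mnt. None of those assumptions is confirmed in this thread, and the function names are made up for illustration.

```python
import sys

# Assumption: the 88-byte figure from the thread is the limit that matters,
# and it includes the terminating NUL byte.
MNAMELEN = 88


def snapshot_mount_path(snapshot, mount_root="/mnt"):
    """Return the presumed mount path for a 'pool/dataset@snap' name.

    Assumes the dataset uses its default mountpoint under mount_root.
    """
    dataset, _, snap = snapshot.partition("@")
    return f"{mount_root}/{dataset}/.zfs/snapshot/{snap}"


def too_long(snapshots, mount_root="/mnt", limit=MNAMELEN):
    """Yield (byte_length, path) for paths that would not fit in `limit` bytes."""
    for name in snapshots:
        name = name.strip()
        if "@" not in name:
            continue  # skip blank lines and plain dataset names
        path = snapshot_mount_path(name, mount_root)
        nbytes = len(path.encode("utf-8"))
        if nbytes + 1 > limit:  # +1 for the terminating NUL
            yield nbytes, path


if __name__ == "__main__":
    # Reads one snapshot name per line from standard input.
    for nbytes, path in too_long(sys.stdin):
        print(nbytes, path)
```

Saved under a hypothetical name such as check_snap_paths.py, it could be fed the snapshot list with zfs list -H -t snapshot -o name | python3 check_snap_paths.py. If the paths it prints line up with the places where find fails, that would support the theory; if not, the limit or the path being measured is probably something else (for example the 63-byte figure, or the dataset@snapshot name rather than the mount path).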
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,600
Sorry to hear about your troubles, and I don't have any real help.

I do suggest you file a FreeNAS bug for it. They will likely say that upstream will fix it. But at least you can get the developers to match the bug against a FreeBSD 11.x release and let you know, via the bug, when FreeNAS has it fixed.
 

Stilez

Guru
Joined
Apr 8, 2016
Messages
529
Arwen said: "I do suggest you create a FreeNAS bug against it. They will likely state that upstream will fix. But, at least you can get the developers to match the bug against a FreeBSD 11.x release and let you know, (via the bug), when FreeNAS has it fixed."
What I'm hoping to learn is the exact criteria that trigger this issue. It can be worked around by carefully renaming or deleting snapshots, or by limiting the locations and depths of nested datasets, but I don't know *exactly* which elements make up the relevant "filesystem paths" that are subject to the 63-or-88 byte limits.

Without that info I can't work around it.
 