SOLVED ZFS Error

eewiz

Explorer
Joined
Oct 14, 2021
Messages
50
Hello,

I'm new to TrueNAS and really need some help.
My machine:
Core 13.0-U5, 32GB memory, Xeon E3-1200 4-core, 128K SSD (boot), 12 x 8TB spinners in a RAIDZ2 pool.

There are no user files stored in the "pool" dataset.
Notice that the "pool" dataset has USEDDS=238K which I assume belongs to invisible stuff like snapshots, metadata and I don't know what else.
I assume that USEDDS=238K for the "pool" dataset is probably a normal situation.
Code:
root@plum[/]# zfs list -o space pool
NAME  AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool  3.37T  63.0T        0B    238K             0B      63.0T

There are no user files stored in the "pool/eds" dataset either.
Notice that "pool/eds" has USEDDS=23.9T which is nowhere near correct.
Code:
root@plum[/]# zfs list -o space pool/eds
NAME      AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool/eds  3.37T  62.9T        0B   23.9T             0B      39.0T

The USEDCHILD=39.0T is correct. The "pool/eds" child datasets do add up to 39.0T.

I need to find and delete the non-visible USEDDS=23.9T data from the "pool/eds" dataset.
Here is the complete "pool/eds" dataset storage space.
Code:
root@plum[/]# zfs list -o space -r pool/eds
NAME               AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool/eds           3.37T  62.9T        0B   23.9T             0B      39.0T
pool/eds/backup    3.37T   881G     1.04M    881G             0B         0B
pool/eds/bt        3.37T  35.2M        0B   35.2M             0B         0B
pool/eds/common    3.37T   226G        0B    226G             0B         0B
pool/eds/drivers   3.37T   120G     36.6K    120G             0B         0B
pool/eds/fixes     3.37T  46.9G        0B   46.9G             0B         0B
pool/eds/games     3.37T  5.36T     36.6K   5.36T             0B         0B
pool/eds/keepass   3.37T   484K        0B    484K             0B         0B
pool/eds/lib       3.37T   721G     36.6K    721G             0B         0B
pool/eds/music     3.37T  1.58T     36.6K   1.58T             0B         0B
pool/eds/rch-ec    3.37T   334G     36.6K    334G             0B         0B
pool/eds/sm        3.37T   817G     36.6K    817G             0B         0B
pool/eds/software  3.37T  3.08T      951K   3.08T             0B         0B
pool/eds/urb       3.37T  1.49T        0B   1.49T             0B         0B
pool/eds/user      3.37T  66.2G      951K   66.2G             0B         0B
pool/eds/video     3.37T  24.4T     1.58G   24.4T             0B         0B
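
The claim that the child datasets add up to 39.0T can be checked against the listing above. This is a minimal sketch under stated assumptions: the rows are copied from that listing without the header and the pool/eds row, the suffixes are treated as binary multiples (K, M, G, T), and the helper name `sum_used_tib` is hypothetical.

```shell
# Child rows copied from the listing above (header and the pool/eds row omitted).
children_listing='pool/eds/backup    3.37T   881G     1.04M    881G             0B         0B
pool/eds/bt        3.37T  35.2M        0B   35.2M             0B         0B
pool/eds/common    3.37T   226G        0B    226G             0B         0B
pool/eds/drivers   3.37T   120G     36.6K    120G             0B         0B
pool/eds/fixes     3.37T  46.9G        0B   46.9G             0B         0B
pool/eds/games     3.37T  5.36T     36.6K   5.36T             0B         0B
pool/eds/keepass   3.37T   484K        0B    484K             0B         0B
pool/eds/lib       3.37T   721G     36.6K    721G             0B         0B
pool/eds/music     3.37T  1.58T     36.6K   1.58T             0B         0B
pool/eds/rch-ec    3.37T   334G     36.6K    334G             0B         0B
pool/eds/sm        3.37T   817G     36.6K    817G             0B         0B
pool/eds/software  3.37T  3.08T      951K   3.08T             0B         0B
pool/eds/urb       3.37T  1.49T        0B   1.49T             0B         0B
pool/eds/user      3.37T  66.2G      951K   66.2G             0B         0B
pool/eds/video     3.37T  24.4T     1.58G   24.4T             0B         0B'

# Hypothetical helper: sums the USED column (field 3) and prints the total in TiB.
sum_used_tib() {
    awk '
        function to_tib(value,    unit, number) {
            unit = substr(value, length(value), 1)
            number = substr(value, 1, length(value) - 1) + 0
            if (unit == "T") return number
            if (unit == "G") return number / 1024
            if (unit == "M") return number / 1024 / 1024
            if (unit == "K") return number / 1024 / 1024 / 1024
            if (unit == "B") return number / 1024 / 1024 / 1024 / 1024
            return 0   # unrecognised suffix: ignored in this sketch
        }
        { total += to_tib($3) }
        END { printf "%.1f\n", total }
    '
}

printf '%s\n' "$children_listing" | sum_used_tib
```

The total comes out at 39.0, which matches USEDCHILD for pool/eds. Each displayed value is already rounded by ZFS, so the agreement only holds to the displayed precision.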

I have searched Google and this forum for several days and have found no answer.
I believe it is saying that there is 23.9TB of data stored in the "pool/eds" dataset, but ls does not show it.
Code:
root@plum[/]# ls -al /mnt/pool/eds
total 378
drwxrwxrwx  17 root  wheel  17 Jun 10 02:42 .
drwxrwxrwx   6 root  wheel   6 Jun 10 17:10 ..
drwxrwxrwx+ 16 root  wheel  18 Apr 30 19:37 backup
drwxrwxrwx+  2 root  wheel   3 Dec  5  2021 bt
drwxrwxrwx+ 20 root  wheel  20 Jun  5  2018 common
drwxrwxrwx+ 40 root  wheel  40 Dec  4  2019 drivers
drwxrwxrwx+ 33 root  wheel  33 Mar 26  2021 fixes
drwxrwxrwx+  9 root  wheel   9 Jan  7  2021 games
drwxrwxrwx+  2 root  wheel   3 Jun  3 01:27 keepass
drwxrwxrwx+ 29 root  wheel  32 Nov 21  2019 lib
drwxrwxrwx+ 17 root  wheel  19 Mar 30 00:13 music
drwxrwxrwx+  5 root  wheel   5 Dec  2  2021 rch-ec
drwxrwxrwx+  8 root  wheel   8 May  5  2019 sm
drwxrwxrwx+  7 root  wheel   7 Oct 23  2021 software
drwxrwxrwx+  9 root  wheel   9 Jun 10 04:36 urb
drwxrwxrwx   7 root  wheel   7 May 10 05:35 user
drwxrwxrwx+  7 root  wheel   7 Jun  4 01:02 video
root@plum[/]#

"ls -a" is supposed to list dot-hidden files and none show up.
"ls -d .*/" also shows no dot-hidden directories in the "pool/eds" dataset.
Code:
root@plum[/]# cd /mnt/pool/eds
root@plum[/mnt/pool/eds]# ls -d .*/
zsh: no matches found: .*/
root@plum[/mnt/pool/eds]#

Although I know that "/mnt/pool/eds/.zfs" exists, "ls -d .*/" does not show the ".zfs" folder either.
So at this point I must assume that the "pool/eds" dataset contains another folder that, like ".zfs", is not visible, but that consumes 23.9TB of storage space.

I don't know what I'm missing and need a little help.
Thank You
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
You could have previously had an actual "folder" that shares the same name as one of the child datasets. But now the child dataset "mounts over" that folder.

In order to rule this out, you need to first unmount all your child datasets, and then recheck with "du":
Code:
du -hs /mnt/pool/eds/*

Keep in mind that the GUI does not offer a way to unmount datasets; it has to be done on the command line. I'm not sure how finicky the TrueNAS middleware is if you unmount a dataset.
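
The unmount step could be planned as a dry run first. This is only a sketch: the helper name `plan_child_unmounts` is hypothetical, it prints commands rather than running them, and it assumes the input lists the parent before its children, which is how `zfs list -H -o name -r` orders its output.

```shell
# Hypothetical helper: reads dataset names (one per line, parent first) and
# prints the unmount commands for the children only, deepest datasets first.
plan_child_unmounts() {
    parent=$1
    awk -v parent="$parent" '
        $0 != parent { names[++count] = $0 }
        END { for (i = count; i >= 1; i--) print "zfs unmount " names[i] }
    '
}

# Dry run against a hand-written list; on the real system the input would be
#   zfs list -H -o name -r pool/eds
printf '%s\n' pool/eds pool/eds/backup pool/eds/video | plan_child_unmounts pool/eds
```

After reviewing the printed commands, they would be run as root, followed by the "du" check above. The children can be mounted again afterwards with `zfs mount -a` or a reboot.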
 

eewiz

Thank you winnielinnie for the reply.
All of the files in /mnt/pool/eds/video had somehow been deleted and an automatic snapshot had already been created after the deletion.
I deleted that current snapshot of the empty /mnt/pool/eds/video dataset.
I cloned a one-day-old snapshot of /mnt/pool/eds/video to /mnt/pool/eds/video-clone.
I ssh'd in, then su - then I mv'd each folder from /mnt/pool/eds/video-clone/folder-name to /mnt/pool/eds/video one folder at a time.
Storage Used rose from 58% to 94% during the move. It took about 28 hours to move the full 23.9TB.
It's surprising to me that it took 28 hours to manipulate the snapshots concerned, since the move created no new actual data.
/mnt/pool/eds/video-clone was emptied and all of the data was correctly moved to /mnt/pool/eds/video.
I then went to the GUI and deleted the empty /mnt/pool/eds/video-clone.

I never made a snapshot of /mnt/pool/eds/video-clone, so I expected the Storage Used to return to 58%, but it remained at 94%.
But, obviously, creating the /mnt/pool/eds/video-clone dataset must have created a system default (invisible?) snapshot.
One could create a detailed ZFS dataset structure in TrueNAS and never, ever, create a single snapshot yet the system will still work correctly.
Hence the system must create system default snapshots, even if the admin never purposefully makes any manual or automatic snapshots.
I can find absolutely no information about this, other than "deleting a dataset will also delete all snapshots for the dataset".
I assume that also means that deleting a dataset will also delete the system default snapshot including any other snapshots that were ever created.
It appears that in my case, the system default snapshot of /mnt/pool/eds/video-clone may not have been deleted when the dataset was deleted.

I automatically make and retain snapshots for one week.
It has now been 10 days since the data move and all snapshots made before the data move have been automatically removed from the system.
I am still stuck at 94% Storage Used.
You can see above that the /mnt/pool/eds dataset shows USEDDS=23.9T.
23.9TB is the amount of data that was cloned from the /mnt/pool/eds/video snapshot to the temporary /mnt/pool/eds/video-clone dataset.
There is no data in the /mnt/pool/eds dataset, only more datasets that are currently, correctly detailed by the USEDCHILD=39.0T value.
Actually, there never was any data in the /mnt/pool/eds dataset.
Cloning the /mnt/pool/eds/video snapshot to the /mnt/pool/eds/video-clone dataset created a new dataset in /mnt/pool/eds, not USEDDS data.
The cloning to a new dataset should have increased USEDCHILD from 39TB to 63TB because the /mnt/pool/eds/video-clone dataset was simply a new dataset added to /mnt/pool/eds.
It was not 23.9TB of raw data added to /mnt/pool/eds, which would increase USEDDS to 23.9TB.
It was 23.9TB added to /mnt/pool/eds in the form of a new dataset /mnt/pool/eds/video-clone, which should have increased USEDCHILD.

I'm stuck here and don't know where to look next.
Thank You
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
zfs list -t snapshot pool/eds?
 
winnielinnie
I think you need to better grasp the concepts of "folders", "datasets", and "mounts", and their differences. You're going too fast and trying things before even confirming what the problem is.

I gave a very simple idea of what you can do: "unmount" all children, then use "du" to check which folder under /mnt/pool/eds/* accounts for 24 TB of data. From there you could have definitely seen the culprit, and then decided on the next step.

But now I'm lost reading your follow-up post. At one point you might have again confused a folder for a dataset, or vice versa. (Folders and mounted datasets appear the same when using the command-line tools. One day "video" might be a folder in your "eds" dataset, then the next day it's a mountpoint for the dataset pool/eds/video, in which case the former's contents will no longer be available until you unmount the "video" dataset.)

My suspicion is that at one point you had the dataset pool/eds and populated it with folders and files. One of these "folders" shares its name with a new child dataset you later created under pool/eds. Upon creation, this new dataset was automatically mounted at the path /mnt/pool/eds/name. If "name" is the same as the folder name, the dataset "overlaps" the folder, which causes confusion. The original folder under the dataset pool/eds still contains the files, yet you can neither access nor see them because the folder is currently "overlapped" by a mounted dataset that shares its name.

There are other ways this can happen as well, including doing things in the wrong order with encrypted datasets.
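
Since "ls -l" shows folders and mounted datasets the same way, one way to tell them apart from the shell is to ask "df" where a path is mounted. This is a rough sketch, not a TrueNAS-specific tool: the helper name `is_mountpoint` is hypothetical, and it assumes mount points contain no spaces.

```shell
# Hypothetical helper: succeeds when the directory is itself a mount point,
# fails when it is an ordinary folder inside some other filesystem.
# Returns 2 when the path is not an accessible directory.
# Assumes mount points contain no spaces (df output is split on whitespace).
is_mountpoint() {
    target=$(cd "$1" 2>/dev/null && pwd -P) || return 2
    mounted_on=$(df -P "$target" 2>/dev/null | awk 'NR == 2 { print $NF }')
    [ "$mounted_on" = "$target" ]
}

# Example: a freshly created scratch directory is a plain folder.
scratch=$(mktemp -d)
mkdir "$scratch/video"
if is_mountpoint "$scratch/video"; then
    echo "mounted dataset or other filesystem"
else
    echo "plain folder"
fi
rm -r "$scratch"
```

On the system in this thread, the same check could be pointed at /mnt/pool/eds/video before and after unmounting the child dataset. ZFS also reports this itself through the `mounted` and `mountpoint` dataset properties.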
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
eewiz said:
It's surprising to me that it took 28 hours to manipulate the snapshots concerned, since the move created no new actual data.

If you moved the data from one dataset (even if it's a snapshot) to another, you created new data. You now have two copies of the data, one in your old snapshot(s), and one in the location where you moved the data to. Just because the old data and the new data are the same doesn't mean that this doesn't take twice the space.

eewiz said:
Storage Used rose from 58% to 94% during the move. It took about 28 hours to move the full 23.9TB.

Yeah, that's about what you'd expect to happen.

eewiz said:
I never made a snapshot of /mnt/pool/eds/video-clone, so I expected the Storage Used to return to 58%, but it remained at 94%.
But, obviously, creating the /mnt/pool/eds/video-clone dataset must have created a system default (invisible?) snapshot.
One could create a detailed ZFS dataset structure in TrueNAS and never, ever, create a single snapshot yet the system will still work correctly.
Hence the system must create system default snapshots, even if the admin never purposefully makes any manual or automatic snapshots.
I can find absolutely no information about this, other than "deleting a dataset will also delete all snapshots for the dataset".
I assume that also means that deleting a dataset will also delete the system default snapshot including any other snapshots that were ever created.
It appears that in my case, the system default snapshot of /mnt/pool/eds/video-clone may not have been deleted when the dataset was deleted.

This is gibberish. In ZFS, space is freed when the last reference to a block is released. Snapshots and clones can both hold open a reference to a block for as long as they exist. There's no such thing as a "system default snapshot" or "invisible snapshot". If you ask the right question, ZFS will tell you about it. It seems most likely to me that your snapshot strategy left one hanging around somewhere or that you mounted a dataset on top of an existing directory or something like that.
 
winnielinnie
Are you a time-traveler?

This is a thread you started on Friday (before this thread):

@eewiz I mean this with no offense: It appears you did all these (risky, massive) troubleshooting steps before you even asked for help about "Where did 24 TB go?" in this new thread.

In other words, first you did a bunch of stuff with snapshots, cloning, massive transfers, and then later you created this new thread without mentioning what you already did. Then you do mention it, as if it's something you tried after reading my first reply? (Yet the reality is that you already did these things?)

This makes it very difficult and confusing to troubleshoot, let alone help you.

You should have started this thread with everything you already did; rather than mention it later...
 

eewiz

Oh, I apologize, but I tried leading off with the nitty-gritty in this post:
https://www.truenas.com/community/threads/unrecoverable-storage-space.110489/
No one has ever replied to that post, so I figured I would post again in an attempt to pique some interest before laying out the nitty-gritty.

Thank you winnielinnie, you hit the nail on the head.
I unmounted all of the datasets below /mnt/pool/eds to find the following:
Code:
root@plum[/mnt/pool/eds]# ls -l
total 285
drwxr-xr-x  2 root  wheel  2 Nov 14  2021 backup
drwxr-xr-x  2 root  wheel  2 Nov 13  2021 bt
drwxr-xr-x  2 root  wheel  2 Nov 15  2021 common
drwxr-xr-x  2 root  wheel  2 Nov 14  2021 drivers
drwxr-xr-x  2 root  wheel  2 Nov 14  2021 fixes
drwxr-xr-x  2 root  wheel  2 Nov 17  2021 games
drwxr-xr-x  2 root  wheel  2 Dec  2  2021 keepass
drwxr-xr-x  2 root  wheel  2 Nov 19  2021 lib
drwxr-xr-x  2 root  wheel  2 Nov 21  2021 music
drwxr-xr-x  2 root  wheel  2 Nov 21  2021 rch-ec
drwxr-xr-x  2 root  wheel  2 Nov 23  2021 sm
drwxr-xr-x  2 root  wheel  2 Nov 20  2021 software
drwxr-xr-x  2 root  wheel  2 Nov 13  2021 urb
drwxr-xr-x  2 root  wheel  2 Nov 14  2021 user
drwxrwxrwx+ 7 root  wheel  7 Jun  1 07:08 video

The above are all folders, not datasets.
I originally thought that the unmounts did not work until I noticed all of the permissions had changed, except for the "video" folder.
Ultimately, I had to share the "eds" dataset to view it in Windows before I could determine that what is seen above are folders, not datasets.
I see no way to tell the difference from an "ls -l" output.

On day one, two years ago, I must have created folders in the /mnt/pool/eds dataset, only to later discover that folders cannot be replicated.
Since folders residing in a dataset are totally invisible in the GUI there was no visual indication, nor warning thrown, that datasets were being created with the same name as existing folders.
All of the folders were empty except for the /mnt/pool/eds/video folder which held the 23.9TB in question.
I renamed the video folder to "video_dir" and rebooted TrueNAS.
After verifying that all the datasets were correct, I started an "rm -R video_dir".
I am currently waiting for the rm command to finish.

Again, thank you very much for leading me to the solution.
I understand that the [clone-snap, move-data, delete-empty-clone] method I employed did not work, but now I wonder why.

Should I have not done [clone-snap, move-data, delete-empty-clone] or was the problem caused by the underlying, same-named folder?
I don't understand why the data was actually moved from the "video-clone" dataset to the "video" folder and not the "video" dataset.
I expected that ZFS would have just moved the file metadata from the "video-clone" dataset's metadata table (no snapshot was ever created) to the "video" dataset's snapshot like a move in Windows simply changes the file allocation table, or whatever it's called these days in Windows, leaving the actual file data untouched.
I expected the move to take no time at all because I expected that no new data would be created.

I am curious, if the underlying "video" folder did not exist, would the [clone-snap, move-data, delete-empty-clone] method have worked?

Originally, I simply deleted the last snapshot, which was auto-created after all of the files in the "video" dataset were deleted, then rebooted.
This did not return the "video" dataset to normal, it was still empty.
Then I tried to ROLLBACK the now last, good snapshot, taken the day before the files were deleted and nothing happened.
The "video" dataset remained empty.
Then I did the [clone-snap, move-data, delete-empty-clone] maneuver to ultimately discover here today that the data was moved to the "video" folder instead of the "video" dataset.

IMHO, when it comes to TrueNAS everything is a risky maneuver with a massive learning curve.
There simply are no tutorials with clear steps and numerous examples to learn from.
Every new thing done in TrueNAS is an exhausting exercise of trial and error.

I can report that the rm was successful.
After a reboot, the plum server is now back to 59% Used Space.

Thank You
 

jgreco

eewiz said:
Should I have not done [clone-snap, move-data, delete-empty-clone] or was the problem caused by the underlying, same-named folder?
I don't understand why the data was actually moved from the "video-clone" dataset to the "video" folder and not the "video" dataset.
I expected that ZFS would have just moved the file metadata from the "video-clone" dataset's metadata table (no snapshot was ever created) to the "video" dataset's snapshot like a move in Windows simply changes the file allocation table, or whatever it's called these days in Windows, leaving the actual file data untouched.
I expected the move to take no time at all because I expected that no new data would be created.

In Windows, when you move a file from c: to d:, what happens?

The same thing happens in UNIX when moving across a mountpoint. UNIX does not have "drive letters" but, just like DOS, it cannot just "change the file allocation table" while "leaving the actual file data untouched". That doesn't work in Windows either, contrary to what you suggest. You've confused file operations happening ACROSS filesystems with file operations happening WITHIN filesystems.
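
Whether a given "mv" will be a quick rename or a full copy can be estimated beforehand by checking whether source and destination sit on the same mounted filesystem. This is a rough sketch: the helper names `mount_of` and `same_filesystem` are hypothetical, and it assumes mount points contain no spaces.

```shell
# Hypothetical helper: prints the mount point that holds a given path.
# Assumes mount points contain no spaces (df output is split on whitespace).
mount_of() {
    df -P "$1" 2>/dev/null | awk 'NR == 2 { print $NF }'
}

# Hypothetical helper: succeeds when both paths live on the same mounted
# filesystem, which is the case where mv can simply rename.
same_filesystem() {
    first=$(mount_of "$1")
    second=$(mount_of "$2")
    [ -n "$first" ] && [ "$first" = "$second" ]
}

# Example with two folders inside one scratch directory.
scratch=$(mktemp -d)
mkdir "$scratch/video-clone" "$scratch/video"
if same_filesystem "$scratch/video-clone" "$scratch/video"; then
    echo "mv renames in place: fast, no data copied"
else
    echo "mv copies every block, then deletes the source"
fi
rm -r "$scratch"
```

Every ZFS dataset, including a clone, is mounted as its own filesystem, so two sibling datasets in the same pool would report different mount points here, and a move between them is a copy followed by a delete.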

eewiz said:
IMHO, when it comes to TrueNAS everything is a risky maneuver with a massive learning curve.
There simply are no tutorials with clear steps and numerous examples to learn from.
Every new thing done in TrueNAS is an exhausting exercise of trial and error.

There are numerous ZFS tutorials and lessons to learn from. They generally apply to TrueNAS. Moreover, ZFS itself is nearing twenty years old and is very well documented. There are extensive manual pages explaining the operation of snapshots and clones, even functionality not supported via the GUI. The UNIX manual is an administrator's basic knowledge resource, dating back from well before the World Wide Web.

ZFS is amazingly easy to experiment with, because you can do most things that would be done at the top level within nested datasets. Don't know for sure what to expect? Lab it out on a test dataset or two. When done, you just delete the dataset(s) and the experiment goes poof. There is absolutely a steep learning curve for UNIX filesystem knowledge, and ZFS doesn't make it easier. But it's entirely predictable and incredibly resilient if you've familiarized yourself with the operational knowledge and the available options.
 

eewiz

Hello All,

I went to the OSU college library circa 1983 and read the UNIX user's and programmer's manuals.
I also read about DOS 2.0.
I was using CP/M at that time.
I chose DOS due to the serious complexity extant within the UNIX system.
I purchased an IBM AT in 1985.
It was difficult to choose between the 10MB and the 20MB hard drive.
Due to all the floppies I had at that time, I chose the 20MB model which cost more than most complete computers cost today.

To use an automotive analogy, assume that you are an extraterrestrial that comes from a planet with transportation technology that simply requires a thought about your new destination to cause your vehicle to go there.
The UNIX, ZFS and TrueNAS documentation teach you everything you need to know about the gas pedals, brake pedals, steering wheels, gas tanks, engines, transmissions, trunks, hoods, lights, wipers, seats and windows of the UNIX and ZFS systems, using that automotive analogy.
But they don't teach you anything about how to drive that analogous automobile.
What do you do if traffic is merging from the right, what is all this red, yellow and green light stuff, why do you have to let up on the brake pedal as the machine comes to a stop?

Without analogy:
One will discover after much effort that ZFS cannot replicate folders, only datasets.
Do not passphrase encrypt /mnt/pool. It will break an out-of-the-box TrueNAS system.
You will not see any folders residing in a dataset from the TrueNAS GUI.
Having folders in a dataset can be dangerous and may cause serious issues if a dataset is created with the same name as a folder (which folder cannot be seen from the GUI).
Why can't I make push replication work when pull replication works just fine?
What does one do when one discovers that all of the files in a dataset have been deleted?
What does one do when the ROLLBACK button does not restore a dataset?
Does the dataset to be rolled back need to be removed first or empty, or may it contain data that will be clobbered by the ROLLBACK?
What's a correct way to split a dataset into two datasets?
etc... etc... etc...

I have not found that type of information except from forum posts, with their kind responders, due to others being in the same or similar SNAFU.
Also, someone else's SNAFU is almost never identical to one's own.
There you may find concise, step-by-step instructions, that may be evaluated to determine if those steps may solve one's issue.
With hundreds of programs having many parameters each, needed to manage a UNIX-Like/ZFS system, one doesn't even know what is needed out of those hundreds of programs to start to solve whatever issue one may face, without directed guidance, or years of experience.
Of course, with GUI systems like TrueNAS, if things don't go wrong, one will never gain that required experience.
Although, without a GUI system, most, like me, will not even try to move from Windows to a UNIX-like system.
I tried installing Slackware in the mid-90s.
When installation was complete I was presented with a command line prompt.
Dir did not work.
I reinstalled Windows 95.
I tried Linux again around 2010, after Linux was sporting GUIs.
I wasted a whole day researching how to do something as simple as increasing the screen resolution.
I reinstalled Windows XP.
I now have Debian, Ubuntu, Manjaro and Mint and their multiple desktops running in VirtualBox, but I still can't use them like I can Windows.
IMHO, when Linux finally implements something akin to the Windows registry feature it may become something usable by general interest users.
Linux/UNIX/BSD is a fine system for a corporate environment where admins with years of training and experience deal with all the complexities required to set up a system for a user.
There the user simply has to use the features provided and never has to look under the hood.

I still have not found a Linux/UNIX/BSD implementation where one can simply drag something to a desktop to create a shortcut on that desktop.
This is such a powerful feature that I cannot understand why it has not been implemented in one or another of the Linux/UNIX/BSD desktops.
I know, that is not the paradigm supported by the thousands of authors who have written the disparate pieces of Linux/UNIX/BSD.
The whole life of Linux/UNIX/BSD is dependencies, not integration.

I do wish to thank all of the authors that labored to produce the TrueNAS system.

Thank You
All for now
 