Phantom Data Use after deleting clone and all snapshots?

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
After recovering data from my cloned snapshot, I went ahead and deleted the clone dataset and all my snapshots... yet my used space did not go down.

Every command I try says there is nothing using the space: no snapshots, no folders. It's like there is 7TB of dead space on the pool...
Please let me know if anyone has any ideas or has fixed a similar issue!

"zfs remap" seems to be deprecated so I wasn't able to try that, but here are the clues I've found so far:

Used space is too high (~7TB, which seems like it was never reclaimed from the cloned snapshot that was deleted):
used space too high.jpg


No snapshots:
no snapshots.jpg


zfs list shows nothing using the space: 15TB used, but only 7.5TB referenced?

zfs list.jpg
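A per-dataset breakdown like the sketch below should show which bucket the space falls into, i.e. snapshots vs. the dataset itself vs. children (assuming the pool is named STORAGE, as in the output later in the thread):

Code:
# Break down USED into snapshots, dataset, refreservation and children
zfs list -r -t filesystem,volume -o space STORAGE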


Here is the total space graph from while I was recovering the data from the clone; the jump is after I added new drives and finished transferring.
It looks like the total space goes down while the used space remains the same, which doesn't seem right to me:
total space.jpg


Thank you for any help or ideas!
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
New clue, I think:

The written and used-by-dataset values show 7.5TB,

but there is also a 7.8TB "used-by-children" value, a property I also see in other people's output. Any idea if a value that large is normal?

Code:
root@truenas[~]# zfs get all STORAGE
NAME     PROPERTY              VALUE                  SOURCE
STORAGE  type                  filesystem             -
STORAGE  creation              Sat Oct 2 15:43 2021   -
STORAGE  used                  15.4T                  -
STORAGE  available             2.24T                  -
STORAGE  referenced            7.57T                  -
STORAGE  compressratio         1.01x                  -
STORAGE  mounted               yes                    -
STORAGE  quota                 none                   default
STORAGE  reservation           none                   default
STORAGE  recordsize            128K                   default
STORAGE  mountpoint            /mnt/STORAGE           default
STORAGE  sharenfs              off                    default
STORAGE  checksum              on                     default
STORAGE  compression           lz4                    received
STORAGE  atime                 on                     default
STORAGE  devices               on                     default
STORAGE  exec                  on                     default
STORAGE  setuid                on                     default
STORAGE  readonly              off                    default
STORAGE  jailed                off                    default
STORAGE  snapdir               visible                local
STORAGE  aclmode               passthrough            received
STORAGE  aclinherit            passthrough            local
STORAGE  createtxg             1                      -
STORAGE  canmount              on                     default
STORAGE  xattr                 on                     default
STORAGE  copies                1                      local
STORAGE  version               5                      -
STORAGE  utf8only              off                    -
STORAGE  normalization         none                   -
STORAGE  casesensitivity       sensitive              -
STORAGE  vscan                 off                    default
STORAGE  nbmand                off                    default
STORAGE  sharesmb              off                    default
STORAGE  refquota              none                   default
STORAGE  refreservation        none                   default
STORAGE  guid                  7893211037971188226    -
STORAGE  primarycache          all                    default
STORAGE  secondarycache        all                    default
STORAGE  usedbysnapshots       0B                     -
STORAGE  usedbydataset         7.57T                  -
STORAGE  usedbychildren        7.78T                  -
STORAGE  usedbyrefreservation  0B                     -
STORAGE  logbias               latency                default
STORAGE  objsetid              54                     -
STORAGE  dedup                 off                    default
STORAGE  mlslabel              none                   default
STORAGE  sync                  standard               local
STORAGE  dnodesize             legacy                 default
STORAGE  refcompressratio      1.01x                  -
STORAGE  written               7.57T                  -
STORAGE  logicalused           15.5T                  -
STORAGE  logicalreferenced     7.69T                  -
STORAGE  volmode               default                default
STORAGE  filesystem_limit      none                   default
STORAGE  snapshot_limit        none                   default
STORAGE  filesystem_count      none                   default
STORAGE  snapshot_count        none                   default
STORAGE  snapdev               hidden                 default
STORAGE  acltype               nfsv4                  default
STORAGE  context               none                   default
STORAGE  fscontext             none                   default
STORAGE  defcontext            none                   default
STORAGE  rootcontext           none                   default
STORAGE  relatime              off                    default
STORAGE  redundant_metadata    all                    default
STORAGE  overlay               on                     default
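To see which child is supposedly holding that 7.78T, something like the sketch below would list every dataset under the pool, largest first (the only thing I'm assuming is the pool name STORAGE from the output above; the child names are whatever was created):

Code:
# List all datasets under the pool, sorted by space used (descending)
zfs list -r -o name,used,referenced,usedbysnapshots -S used STORAGE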
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi,

Give your TrueNAS some time to actually free the space, mark it as such, etc. This is part of a cleanup process and is not instant...
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Hi,

Give your TrueNAS some time to actually free the space, mark it as such, etc. This is part of a cleanup process and is not instant...
Thanks, I read that too. I went ahead and stopped the scrub I was running and I'll let it sit, but as far as I can tell nothing is happening.
Is there a cleanup process that I can watch for in "top"?

I've already restarted once; could that have interrupted the cleanup?
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Just don't waste your time in front of your screen... See you tomorrow, and have fun until then :smile:
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Just don't waste your time in front of your screen... See you tomorrow, and have fun until then :smile:
I'm 99% sure nothing is happening, can anyone help?

There's no way an enterprise solution would keep 50% of your storage unavailable for hours with no indication of progress.

Here is my CPU and disk usage; it seems flatlined:
cpu usage.jpg


These disks are the most active, but it's only a few KB and they are writing data, not deleting it:
disk usage.jpg


I think the only activity is periodic 5-second log writes:
disk usage.jpg


I really don't want to have to order a pair of hard drives just to re-create the pool and then return them. Is there a rental service, maybe?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
There's no way an enterprise solution would keep 50% of your storage unavailable for hours with no indication of progress.

Haaahhahahah. Remind me to tell you the story about the "enterprise solution" that had to be rebooted, a filer that got wedged when its snapshot reserve got messed up. Enterprise solutions can actually be the worst of the bunch.

These disks are the most active, but it's only a few KB and they are writing data, not deleting it:
View attachment 50887

There is no category for "deleting" data. Deleting is the process of writing new metadata that outdates the previously written information.

So this is quite possibly doing something, and small rates could represent lots of seeking. I'm not convinced either way, actually.
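If you want to check whether the disks are actually busy, the usual tools would be something like the sketch below (not a prescription; STORAGE is the pool name from earlier in the thread):

Code:
# Per-vdev I/O statistics, refreshed every 5 seconds
zpool iostat -v STORAGE 5
# On TrueNAS CORE / FreeBSD, gstat shows per-disk %busy (heavy seeking shows
# up as high %busy with low throughput)
gstat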

I really don't want to have to order a pair of hard drives just to re-create the pool and then return them. Is there a rental service, maybe?

Amazon? Best Buy? The usual suspects. :smile:
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Well, I had 2.20 TB free when I went to bed, and I have exactly the same amount free now that I'm up...
I just tried exporting/importing the pool for kicks, but no luck; it looks like I managed to break TrueNAS...

I'm surprised no one is interested in trying to debug this issue; IMO it restricts my use of TrueNAS to personal side-projects and hobbies.
No way would I trust this software at my actual job! :grin: I guess I'll order a pair of drives for Monday so I can re-create the pool.

Does anyone know of other NAS software that is more reliable?
I think I might just bite the bullet and get Windows Server so I can use ReFS; all I really need is file resilience...
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
The old OpenZFS feature Async Destroy shows up here:
Code:
> zpool get freeing rpool
NAME   PROPERTY  VALUE    SOURCE
rpool  freeing   0        -

If it's anything but zero, then it's still deleting your clones and snapshots.

Last year I read about a potentially new OpenZFS feature; I don't remember its name, but for practical purposes I will call it Async Delete. It was meant to put larger file deletions on a background task instead of making the user wait. It is very similar to Async Destroy and perhaps uses some of the same code. I don't know if it ever got implemented, nor, if so, in which OpenZFS version.

PS: What I mean by an "old" OpenZFS feature is that Async Destroy was one of the first new features added after OpenZFS split from Oracle ZFS. Anyone who used ZFS long enough saw the need for this feature.
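If you want to watch that property over time, a simple loop works (a sketch in Bourne shell syntax; substitute your own pool name, I'm using STORAGE from the earlier output):

Code:
# Print the pool's "freeing" value every 30 seconds; it should count down
# toward 0 while an async destroy is running in the background
while true; do
    zpool get -H -o value freeing STORAGE
    sleep 30
done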
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@JoeKickass - Do you have any zVolumes?

There is a known space waster when using zvols with a large block size while the file system inside the zvol uses small block sizes, especially on RAID-Zx. I don't know all the details.
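If there were zvols, a check along these lines would show whether block-size overhead is inflating usage (a sketch; STORAGE/somezvol is a hypothetical name, not one from this thread):

Code:
# Compare logical vs. allocated space and the block size of a zvol
zfs get volblocksize,volsize,used,logicalused,referenced STORAGE/somezvol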
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
@JoeKickass - Do you have any zVolumes?

There is a known space waster when using zvols with a large block size while the file system inside the zvol uses small block sizes, especially on RAID-Zx. I don't know all the details.
Thank you for the replies, but no, I only created a few datasets, no zvols, and I never changed the block sizes.

Unfortunately "get freeing" says nothing is being freed, so at least now I know for sure,
but it's a serious issue if cloning snapshots means the space can never be freed... how do other people use this feature?

Nothing Freeing.jpg
 

Paul5

Contributor
Joined
Jun 17, 2013
Messages
117
I'm going to give my meaningless 2 cents worth here:

1- Download and install the FileZilla FTP client on your PC, if you don't already have it.
2- In the FreeNAS 'Services' page, enable FTP or SFTP (whichever you prefer) and enable 'Allow root login'.
3- From FileZilla, log in as root, go to /mnt, then navigate your pools and datasets to see if anything odd is present.
3a- To track down that many TB, break down the datasets or folders one at a time in FileZilla: right-click and select 'add to the list', which will give you a size for that dataset/folder in the bottom right-hand corner after it calculates.

4- If you manually lock and unlock your pool/datasets, first try navigating them unlocked, then try with them locked. FileZilla should still be able to see the locked datasets if you are using the new ZFS encryption. The reason is that via FTP I can actually write to locked datasets, phantomly filling up the disks.

If nothing is obvious, also try the setting that makes snapshots visible and look again (there's a sketch of that below).
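A sketch of that snapshot-visibility step (assuming the top-level dataset STORAGE; the zfs get all output earlier already shows snapdir=visible, so it may already be set here):

Code:
# Make snapshots browsable under the dataset's mountpoint
zfs set snapdir=visible STORAGE
ls /mnt/STORAGE/.zfs/snapshot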
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@JoeKickass - Sorry I was not more helpful. I've been using ZFS for more than 10 years (at work, and since 2014 at home), and I've never "lost" data space that I knew about.


One thing that bit me a long time ago was copying sparse files. I was having to re-partition some Solaris servers (before ZFS), and a simple "rsync -aH" copied some of the sparse files with zero fill. It was pretty annoying until I figured out I needed to add the "-S" option, i.e. "rsync -aHS". Recently I saw one of my work servers with a 100-gigabyte OS disk but a 1-terabyte sparse file.

Sparse file copies are almost certainly not your problem, but I am just highlighting that I have seen blow-ups before. It will eventually have a logical explanation.
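For reference, the sparse-friendly form is roughly this (a sketch with placeholder paths, not commands from this thread):

Code:
# -a archive, -H preserve hard links, -S handle sparse files efficiently
rsync -aHS /source/dir/ /destination/dir/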
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Thanks for all the help. I never did solve the problem, but I got the new 10TB drives in and was able to transfer the data and re-create the pool.

In the end this was a blessing in disguise: I was able to convert my ages-old mirror array to a RAID-Z2 and gained 2TB!

I can only guess it was something to do with the way I restored data from the snapshot:
1) create snapshot
2) clone snapshot
3) cut-paste from clone to current dataset
4) delete clone dataset

In the future, it sounds like it's better to transfer everything to the clone dataset, promote it, and then delete your original dataset (something like the sketch below).
Please let me know if I have that right; it's very unintuitive...
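For anyone who finds this later, I believe the promote-based workflow would look roughly like the sketch below (the dataset, snapshot, and clone names are hypothetical):

Code:
# 1. Clone the snapshot you want to recover from
zfs clone STORAGE/data@good STORAGE/recovered
# 2. Promote the clone so it takes ownership of the shared blocks and no
#    longer depends on the origin snapshot
zfs promote STORAGE/recovered
# 3. The old dataset (now a dependent clone) can be destroyed without
#    leaving its blocks pinned
zfs destroy -r STORAGE/data
# 4. Optionally rename the clone into the old dataset's place
zfs rename STORAGE/recovered STORAGE/data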

TBH, I will do everything in my power to never have to clone a snapshot again. To me, snapshots seem like a payday loan or a casino... sure, they give you what you want, but they'll take away much more when all is said and done!
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
In general, snapshots are fine with ZFS. It uses copy-on-write for any new or changed items, so the snapshot feature was more or less built in due to that design criterion.

My home systems have ZFS snapshots for both the OS, using alternate boot environments (which include clones), and for my home dataset: 24 hourly ones and 7 daily ones (but taken hourly until 11pm, when it rolls over to the next day). This saves me from having to re-create or restore minor files that I edited or overwrote by mistake.

That said, out-of-control snapshots can suck up a lot of space if you have a lot of churn in your data. Plus, having a plan for ZFS snapshot retention is, in most ways, more important than how many you have.
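If churn is the suspect, a quick way to see which snapshots are holding space (a sketch; STORAGE is the pool name from earlier in the thread):

Code:
# List snapshots, largest space consumers first
zfs list -r -t snapshot -o name,used,referenced,creation -S used STORAGE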
 