Phantom Data Use after deleting clone and all snapshots?

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
After recovering data from my cloned snapshot, I went ahead and deleted the clone dataset and all my snapshots... yet my used space did not go down.

Every command I try says there is nothing using the space: no snapshots, no folders. It's like there is 7TB of dead space on the pool...
Please let me know if anyone has any ideas or has fixed a similar issue!

"zfs remap" seems to be deprecated so I wasn't able to try that, but here are the clues I've found so far:

Used space is too high (~7TB, which seems like it was never reclaimed from the cloned snapshot that was deleted):
used space too high.jpg


No snapshots:
no snapshots.jpg


zfs list shows nothing using the space: 15TB used, but only 7.5TB referenced?

zfs list.jpg
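A per-dataset breakdown like the sketch below should show which bucket the space falls into, i.e. snapshots vs. the dataset itself vs. children (assuming the pool is named STORAGE, as in the output later in the thread):

Code:
# Break down USED into snapshots, dataset, refreservation and children
zfs list -r -t filesystem,volume -o space STORAGE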


Here is the total space graph from while I was recovering the data from the clone; the jump is after I added new drives and finished transferring.
It looks like the total space goes down while the used space remains the same, which doesn't seem right to me:
total space.jpg


Thank you for any help or ideas!
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
New clue, I think:

The written and used-by-dataset values show 7.5TB,

but there is also a 7.8TB "used-by-children" value, a property I also see in other people's output. Any idea if a value that large is normal?

Code:
root@truenas[~]# zfs get all STORAGE
NAME     PROPERTY              VALUE                  SOURCE
STORAGE  type                  filesystem             -
STORAGE  creation              Sat Oct 2 15:43 2021   -
STORAGE  used                  15.4T                  -
STORAGE  available             2.24T                  -
STORAGE  referenced            7.57T                  -
STORAGE  compressratio         1.01x                  -
STORAGE  mounted               yes                    -
STORAGE  quota                 none                   default
STORAGE  reservation           none                   default
STORAGE  recordsize            128K                   default
STORAGE  mountpoint            /mnt/STORAGE           default
STORAGE  sharenfs              off                    default
STORAGE  checksum              on                     default
STORAGE  compression           lz4                    received
STORAGE  atime                 on                     default
STORAGE  devices               on                     default
STORAGE  exec                  on                     default
STORAGE  setuid                on                     default
STORAGE  readonly              off                    default
STORAGE  jailed                off                    default
STORAGE  snapdir               visible                local
STORAGE  aclmode               passthrough            received
STORAGE  aclinherit            passthrough            local
STORAGE  createtxg             1                      -
STORAGE  canmount              on                     default
STORAGE  xattr                 on                     default
STORAGE  copies                1                      local
STORAGE  version               5                      -
STORAGE  utf8only              off                    -
STORAGE  normalization         none                   -
STORAGE  casesensitivity       sensitive              -
STORAGE  vscan                 off                    default
STORAGE  nbmand                off                    default
STORAGE  sharesmb              off                    default
STORAGE  refquota              none                   default
STORAGE  refreservation        none                   default
STORAGE  guid                  7893211037971188226    -
STORAGE  primarycache          all                    default
STORAGE  secondarycache        all                    default
STORAGE  usedbysnapshots       0B                     -
STORAGE  usedbydataset         7.57T                  -
STORAGE  usedbychildren        7.78T                  -
STORAGE  usedbyrefreservation  0B                     -
STORAGE  logbias               latency                default
STORAGE  objsetid              54                     -
STORAGE  dedup                 off                    default
STORAGE  mlslabel              none                   default
STORAGE  sync                  standard               local
STORAGE  dnodesize             legacy                 default
STORAGE  refcompressratio      1.01x                  -
STORAGE  written               7.57T                  -
STORAGE  logicalused           15.5T                  -
STORAGE  logicalreferenced     7.69T                  -
STORAGE  volmode               default                default
STORAGE  filesystem_limit      none                   default
STORAGE  snapshot_limit        none                   default
STORAGE  filesystem_count      none                   default
STORAGE  snapshot_count        none                   default
STORAGE  snapdev               hidden                 default
STORAGE  acltype               nfsv4                  default
STORAGE  context               none                   default
STORAGE  fscontext             none                   default
STORAGE  defcontext            none                   default
STORAGE  rootcontext           none                   default
STORAGE  relatime              off                    default
STORAGE  redundant_metadata    all                    default
STORAGE  overlay               on                     default
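To see which child is supposedly holding that 7.78T, something like the sketch below would list every dataset under the pool, largest first (the only thing I'm assuming is the pool name STORAGE from the output above; the child names are whatever was created):

Code:
# List all datasets under the pool, sorted by space used (descending)
zfs list -r -o name,used,referenced,usedbysnapshots -S used STORAGE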
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi,

Give your TrueNAS some time to actually free the space, mark it as such, etc. This is part of a cleanup process and is not instant...
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Hi,

Give your TrueNAS some time to actually free the space, mark it as such, etc. This is part of a cleanup process and is not instant...
Thanks, I read that too. I went ahead and stopped the scrub I was running and I'll let it sit, but as far as I can tell nothing is happening.
Is there a cleanup process that I can watch for in "top"?

I've already restarted once; could that have interrupted the cleanup?
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Just don't waste your time in front of your screen... See you tomorrow, and have fun until then :smile:
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Just don't waste your time in front of your screen... See you tomorrow, and have fun until then :smile:
I'm 99% sure nothing is happening, can anyone help?

There's no way an enterprise solution would keep 50% of your storage unavailable for hours with no indication of progress.

Here is my CPU and disk usage; it seems flatlined:
cpu usage.jpg


These disks are the most active, but it's only a few KB and they are writing data, not deleting it:
disk usage.jpg


I think the only activity is periodic 5-second log writes:
disk usage.jpg


I really don't want to have to order a pair of hard drives just to re-create the pool and then return them. Is there a rental service, maybe?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
There's no way an enterprise solution would keep 50% of your storage unavailable for hours with no indication of progress.

Haaahhahahah. Remind me to tell you the story about the "enterprise solution" that had to be rebooted, a filer that got wedged when its snapshot reserve got messed up. Enterprise solutions can actually be the worst of the bunch.

These disks are the most active, but it's only a few KB and they are writing data, not deleting it:
View attachment 50887

There is no category for "deleting" data. Deleting is the process of writing new metadata that outdates the previously written information.

So this is quite possibly doing something, and small rates could represent lots of seeking. I'm not convinced either way, actually.
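If you want to check whether the disks are actually busy, the usual tools would be something like the sketch below (not a prescription; STORAGE is the pool name from earlier in the thread):

Code:
# Per-vdev I/O statistics, refreshed every 5 seconds
zpool iostat -v STORAGE 5
# On TrueNAS CORE / FreeBSD, gstat shows per-disk %busy (heavy seeking shows
# up as high %busy with low throughput)
gstat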

I really don't want to have to order a pair of hard drives just to re-create the pool and then return them. Is there a rental service, maybe?

Amazon? Best Buy? The usual suspects. :smile:
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Well, I had 2.20 TB free when I went to bed, and I have exactly the same amount free now that I'm up...
I just tried exporting/importing the pool for kicks, but no luck; it looks like I managed to break TrueNAS...

I'm surprised no one is interested in trying to debug this issue; IMO it restricts my use of TrueNAS to personal side-projects and hobbies.
No way would I trust this software at my actual job! :grin: I guess I'll order a pair of drives for Monday so I can re-create the pool.

Does anyone know of other NAS software that is more reliable?
I think I might just bite the bullet and get Windows Server so I can use ReFS; all I really need is file resilience...
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
The old OpenZFS feature Async Destroy shows up here:
Code:
> zpool get freeing rpool
NAME   PROPERTY  VALUE    SOURCE
rpool  freeing   0        -

If it's anything but zero, then it's still deleting your clones and snapshots.

Last year I read about a potentially new OpenZFS feature; I don't remember its name, but for practical purposes I will call it Async Delete. It was meant to put larger file deletions on a background task instead of making the user wait. It is very similar to Async Destroy and perhaps uses some of the same code. I don't know if it ever got implemented, nor, if so, in which OpenZFS version.

PS: What I mean by an "old" OpenZFS feature is that Async Destroy was one of the first new features added after OpenZFS split from Oracle ZFS. Anyone who used ZFS long enough saw the need for this feature.
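If you want to watch that property over time, a simple loop works (a sketch in Bourne shell syntax; substitute your own pool name, I'm using STORAGE from the earlier output):

Code:
# Print the pool's "freeing" value every 30 seconds; it should count down
# toward 0 while an async destroy is running in the background
while true; do
    zpool get -H -o value freeing STORAGE
    sleep 30
done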
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@JoeKickass - Do you have any zVolumes?

There is a known space waster when using zvols with a large block size while the file system inside the zvol uses small block sizes, especially on RAID-Zx. I don't know all the details.
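If there were zvols, a check along these lines would show whether block-size overhead is inflating usage (a sketch; STORAGE/somezvol is a hypothetical name, not one from this thread):

Code:
# Compare logical vs. allocated space and the block size of a zvol
zfs get volblocksize,volsize,used,logicalused,referenced STORAGE/somezvol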
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
@JoeKickass - Do you have any zVolumes?

There is a known space waster when using zvols with a large block size while the file system inside the zvol uses small block sizes, especially on RAID-Zx. I don't know all the details.
Thank you for the replies, but no, I only created a few datasets, no zvols, and I never changed the block sizes.

Unfortunately "get freeing" says nothing is being freed, so at least now I know for sure,
but it's a serious issue if cloning snapshots means the space can never be freed... how do other people use this feature?

Nothing Freeing.jpg
 

Paul5

Contributor
Joined
Jun 17, 2013
Messages
117
I'm going to give my meaningless 2 cents worth here:

1- Download and install the FileZilla FTP client on your PC, if you don't already have it.
2- In the FreeNAS 'Services' page, enable FTP or SFTP (whichever you prefer) and enable 'Allow root login'.
3- From FileZilla, log in as root, go to /mnt, then navigate your pools and datasets to see if anything odd is present.
3a- To track down that many TB, break down the datasets or folders one at a time in FileZilla: right-click and select 'add to the list', which will give you a size for that dataset/folder in the bottom right-hand corner after it calculates.

4- If you manually lock and unlock your pool/datasets, first try navigating them unlocked, then try with them locked. FileZilla should still be able to see the locked datasets if you are using the new ZFS encryption. The reason is that via FTP I can actually write to locked datasets, phantomly filling up the disks.

If nothing is obvious, also try the setting that makes snapshots visible and look again (there's a sketch of that below).
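A sketch of that snapshot-visibility step (assuming the top-level dataset STORAGE; the zfs get all output earlier already shows snapdir=visible, so it may already be set here):

Code:
# Make snapshots browsable under the dataset's mountpoint
zfs set snapdir=visible STORAGE
ls /mnt/STORAGE/.zfs/snapshot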
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@JoeKickass - Sorry I was not more helpful. I've been using ZFS for more than 10 years (at work, and since 2014 at home), and I've never "lost" data space that I knew about.


One thing that bit me a long time ago was copying sparse files. I was having to re-partition some Solaris servers (before ZFS), and a simple "rsync -aH" copied some of the sparse files with zero fill. It was pretty annoying until I figured out I needed to add the "-S" option, i.e. "rsync -aHS". Recently I saw one of my work servers with a 100-gigabyte OS disk but a 1-terabyte sparse file.

Sparse file copies are almost certainly not your problem, but I am just highlighting that I have seen blow-ups before. It will eventually have a logical explanation.
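For reference, the sparse-friendly form is roughly this (a sketch with placeholder paths, not commands from this thread):

Code:
# -a archive, -H preserve hard links, -S handle sparse files efficiently
rsync -aHS /source/dir/ /destination/dir/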
 

JoeKickass

Dabbler
Joined
Nov 15, 2021
Messages
12
Thanks for all the help. I never did solve the problem, but I got the new 10TB drives in and was able to transfer the data and re-create the pool.

In the end this was a blessing in disguise: I was able to convert my ages-old mirror array to a RAID-Z2 and gained 2TB!

I can only guess it was something to do with the way I restored data from the snapshot:
1) create snapshot
2) clone snapshot
3) cut-paste from clone to current dataset
4) delete clone dataset

In the future, it sounds like it's better to transfer everything to the clone dataset, promote it, and then delete your original dataset (something like the sketch below).
Please let me know if I have that right; it's very unintuitive...
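For anyone who finds this later, I believe the promote-based workflow would look roughly like the sketch below (the dataset, snapshot, and clone names are hypothetical):

Code:
# 1. Clone the snapshot you want to recover from
zfs clone STORAGE/data@good STORAGE/recovered
# 2. Promote the clone so it takes ownership of the shared blocks and no
#    longer depends on the origin snapshot
zfs promote STORAGE/recovered
# 3. The old dataset (now a dependent clone) can be destroyed without
#    leaving its blocks pinned
zfs destroy -r STORAGE/data
# 4. Optionally rename the clone into the old dataset's place
zfs rename STORAGE/recovered STORAGE/data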

TBH, I will do everything in my power to never have to clone a snapshot again. To me, snapshots seem like a payday loan or a casino... sure, they give you what you want, but they'll take away much more when all is said and done!
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
In general, snapshots are fine with ZFS. It uses copy-on-write for any new or changed items, so the snapshot feature was more or less built in due to that design criterion.

My home systems have ZFS snapshots for both the OS, using alternate boot environments (which include clones), and for my home dataset: 24 hourly ones and 7 daily ones (but taken hourly until 11pm, when it rolls over to the next day). This saves me from having to re-create or restore minor files that I edited or overwrote by mistake.

That said, out-of-control snapshots can suck up a lot of space if you have a lot of churn in your data. Plus, having a plan for ZFS snapshot retention is, in most ways, more important than how many you have.
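If churn is the suspect, a quick way to see which snapshots are holding space (a sketch; STORAGE is the pool name from earlier in the thread):

Code:
# List snapshots, largest space consumers first
zfs list -r -t snapshot -o name,used,referenced,creation -S used STORAGE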
 