Data Written doesn't add up to Total Allocation in dataset

couchbed

Dabbler
Joined
Dec 14, 2022
Messages
19
One of my datasets shows:

Total Allocation: 357 GiB
Data Written: 197 GiB

What's the extra 160 GiB? I checked the snapshots, there are some, but definitely not 160 GiB worth. The pie chart only has the Data Written section, which it says is 55% of the data. What's the rest?

[Attachment: truenas-issue.png]

A similar question was asked in this thread, but it was a difference of 1GiB, not 160. https://www.truenas.com/community/t...cation-dedupe-off-should-i-be-worried.106517/
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
I presume you're using RAIDZ of one or other kind?

You need to read up on how RAIDZ allocates space. Expect to lose something like 20% of your capacity to it, depending on how blocks are allocated across the number of disks in your VDEVs.

Read the bit about space allocation here:
 

couchbed

Dabbler
Joined
Dec 14, 2022
Messages
19
Unless I'm extremely confused, that's not the issue. I'm talking about a dataset, not a pool. My other datasets have identical Data Written and Total Allocation.

I have 2 4TB drives in a mirror.
 
Joined
Oct 22, 2019
Messages
3,641
You can pull back the curtains with:
Code:
zfs list -t filesystem -o space pool0/steven

This will show you more relevant information, such as data used by snapshots and children.
 

couchbed

Dabbler
Joined
Dec 14, 2022
Messages
19
Okay, yes, we're getting somewhere. By running the command you said, I get
Code:
root@truenas[~]# zfs list -t filesystem -o space pool0/steven
NAME          AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
pool0/steven  2.80T   357G      159G    197G             0B         0B
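
For reference, ZFS defines USED as the sum of the four USED* columns shown here. A quick arithmetic sketch using the values above (the variable names are made up for illustration, and the figures are the rounded GiB values as displayed, which is presumably why the result is 1G short of the reported 357G):

```shell
# Values copied by hand from the `zfs list -o space` output above, in GiB.
usedsnap=159        # USEDSNAP: space held only by snapshots
usedds=197          # USEDDS: space used by the dataset itself
usedrefreserv=0     # USEDREFRESERV
usedchild=0         # USEDCHILD

# USED is the sum of the four components.
used_sum=$((usedsnap + usedds + usedrefreserv + usedchild))
echo "components add up to ${used_sum}G (reported USED: 357G)"
```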

So the space IS taken up by snapshots. However, I can't figure out why the snapshots are taking up all that space. If I run a command to look at the snapshots, it doesn't add up:
Code:
root@truenas[~]# zfs list -r -t snapshot -o name,creation,used,refer pool0/steven
NAME                                        CREATION                USED     REFER
pool0/steven@auto-monthly-2023-02-01_00-00  Wed Feb  1  0:00 2023  4.21M      291G
pool0/steven@auto-monthly-2023-03-01_00-00  Wed Mar  1  0:00 2023  39.6M      322G
pool0/steven@auto-monthly-2023-04-01_00-00  Sat Apr  1  0:00 2023   183M      328G
pool0/steven@auto-monthly-2023-05-01_00-00  Mon May  1  0:00 2023   333M      339G
pool0/steven@auto-monthly-2023-06-01_00-00  Thu Jun  1  0:00 2023   240K      197G
pool0/steven@auto-daily-2023-07-01_00-00    Sat Jul  1  0:00 2023     0B      197G
pool0/steven@auto-monthly-2023-07-01_00-00  Sat Jul  1  0:00 2023     0B      197G
pool0/steven@auto-daily-2023-07-17_00-00    Mon Jul 17  0:00 2023   181M      197G
pool0/steven@auto-hourly-2023-07-21_18-00   Fri Jul 21 18:00 2023     0B      197G

By my understanding, the numbers in the USED column should add up to 159G, but they're not even close.
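
As a rough check of how far apart the two figures are, the USED column above can be totalled in the shell. This is only a sketch: the values are pasted by hand from the listing rather than read from ZFS, the variable names are hypothetical, and the suffixes are treated as binary units (K = KiB, M = MiB).

```shell
# Per-snapshot USED values, copied by hand from the listing above.
used_values="4.21M 39.6M 183M 333M 240K 0B 0B 181M 0B"

# Put one value per line, convert each to MiB, and total them.
total_mib=$(echo "$used_values" | tr ' ' '\n' | awk '
  {
    unit  = substr($0, length($0))               # last character: B, K, M or G
    value = substr($0, 1, length($0) - 1) + 0    # numeric part
  }
  unit == "K" { sum += value / 1024 }
  unit == "M" { sum += value }
  unit == "G" { sum += value * 1024 }
  END { printf "%.0f", sum }                     # "0B" entries add nothing
')

echo "per-snapshot USED adds up to about ${total_mib} MiB"
```

The total comes to well under 1 GiB, against the 159G reported in USEDSNAP.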
 
Joined
Oct 22, 2019
Messages
3,641
> the numbers in the USED column should add up to 159G
No, they "shouldn't".

Snapshots have records that overlap.

If data exists in more than one snapshot, it is not counted in any single snapshot's USED value. USED only shows the space that is unique to that one snapshot.

To prove it, run this command. Make sure NOT to run it with root or sudo privileges. Make sure you use the "-n" flag for a dry run, so that no snapshots are destroyed.
Code:
zfs destroy -n -v pool0/steven@auto-monthly-2023-02-01_00-00%auto-hourly-2023-07-21_18-00


I REPEAT: DO NOT RUN THIS COMMAND AS ROOT OR WITH SUDO PRIVILEGES, AND MAKE SURE TO USE THE "-n" FLAG.
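
The overlap can be illustrated with a toy model that does not touch ZFS at all. In this sketch each word stands for one block, and the block IDs, variable names, and the `blocks_only_in` helper are all made up for illustration. It is a simplification of how ZFS accounts for space, and it assumes sh or bash (zsh does not split unquoted variables by default).

```shell
# Toy model: each word is one block. Names and block IDs are made up.
live="a"            # blocks the dataset references right now
snap1="a b c"       # blocks referenced by the older snapshot
snap2="a b c d"     # blocks referenced by the newer snapshot

# Count the blocks in the first argument that appear in none of the others.
blocks_only_in() {
  target=$1
  shift
  others=" $* "
  count=0
  for block in $target; do
    case "$others" in
      *" $block "*) ;;                  # also referenced elsewhere
      *) count=$((count + 1)) ;;        # referenced only by the target
    esac
  done
  echo "$count"
}

# Per-snapshot USED: blocks that only this one snapshot still holds.
used_snap1=$(blocks_only_in "$snap1" "$live" "$snap2")
used_snap2=$(blocks_only_in "$snap2" "$live" "$snap1")

# USEDSNAP: blocks held by any snapshot but no longer by the live dataset.
all_snap_blocks=$(echo "$snap1 $snap2" | tr ' ' '\n' | sort -u | tr '\n' ' ')
used_by_snaps=$(blocks_only_in "$all_snap_blocks" "$live")

echo "snap1 USED=$used_snap1 snap2 USED=$used_snap2 USEDSNAP=$used_by_snaps"
```

Blocks b and c were deleted from the live dataset but are shared by both snapshots, so neither snapshot reports them as USED, yet both still count toward USEDSNAP.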
 
Joined
Oct 22, 2019
Messages
3,641
Here is a color graphic representation of this phenomenon:

I explain it more in this post:
 

couchbed

Dabbler
Joined
Dec 14, 2022
Messages
19
Okay, I believe you. I currently don't know how to run zfs commands as a non-root user because there's some issue with the PATH when I'm not root. But regardless, what you're saying makes sense.

If I wanted to free up space, I should be able to tell fairly well which snapshots are the big ones based on the REFER column, right? Because that's basically telling me, if you were to revert to this snapshot, this is the size the dataset would be, right? If I deleted the first four snapshots out of the dataset, it should shrink the Total Allocated size down to about 197G, right?
 
Joined
Oct 22, 2019
Messages
3,641
> I currently don't know how to run zfs commands as a non-root user because there's some issue with the PATH when I'm not root.
I exaggerated my warning to be EXTRA EXTRA safe. You can still run the above command as root. However, if you accidentally forget to use "-n" or make a typo, then you risk destroying all your snapshots. As long as the command is correct and you use "-n", it will only execute as a "dry run".


> If I wanted to free up space, I should be able to tell fairly well which snapshots are the big ones based on the REFER column, right? Because that's basically telling me, if you were to revert to this snapshot, this is the size the dataset would be, right? If I deleted the first four snapshots out of the dataset, it should shrink the Total Allocated size down to about 197G, right?
Theoretically, yes. Based on the output, it looks like there was a mass deletion event between May 1st and June 1st.


Once again, using the "-n" flag, you can check how much space will really be freed up:
Code:
zfs destroy -n -v pool0/steven@auto-monthly-2023-02-01_00-00%auto-monthly-2023-05-01_00-00

This will simulate ("dry run") a destruction of the first four snapshots, from February 1st through May 1st.
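
The `first%last` range syntax used in that command can be sketched as a small helper. The function name `dry_run_destroy_cmd` is hypothetical, and it deliberately only prints the command instead of running it, with "-n" hard-coded so the flag cannot be forgotten.

```shell
# Hypothetical helper: print (not run) a dry-run destroy for a snapshot range.
dry_run_destroy_cmd() {
  dataset=$1
  first=$2
  last=$3
  # -n = dry run, -v = verbose; "first%last" selects the inclusive range.
  echo "zfs destroy -n -v ${dataset}@${first}%${last}"
}

cmd=$(dry_run_destroy_cmd pool0/steven \
  auto-monthly-2023-02-01_00-00 auto-monthly-2023-05-01_00-00)
echo "$cmd"
```

Printing the command first gives a chance to check the snapshot names for typos before pasting it into a shell.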
 

couchbed

Dabbler
Joined
Dec 14, 2022
Messages
19
Thanks for your help! I appreciate it.
The "Used by Snapshots" metric really ought to be in the WebUI. They have a nice pretty pie chart, but for all of my datasets it's just all one color, lol.
 