Reserved ZFS drive space?


froggie

Cadet
Joined
Jun 4, 2013
Messages
9
FreeNAS 8.3.1
Six 3TB Drives

I have six 3TB drives in RAID-Z1. I am expecting about 13.6TB of usable space, after parity and the manufacturers' decimal-TB "lies". Instead, I have 12.8TB of usable space, so roughly 800GB is missing. I know that by default I lose 2GB per drive for swap, but that is only 12GB. I do not understand where the rest went. I do not have any snapshots or child datasets.
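(That estimate is just five data drives' worth of capacity: 5 x 3,000,000,000,000 bytes is about 13.64 TiB in the 1024-based units the GUI reports.)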

I know that some systems, such as Fedora, reserve a percentage of the space (5% by default on ext filesystems) for root, and a simple command reclaims that space for data. I searched for something similar in ZFS but have not found any quotas or reservations set.
http://vishesh-yadav.com/blog/2011/09/01/decrease-reserved-space-in-ext2ex3ext4-filesystems/
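For the record, I looked for them with something like this (my pool is named media) and everything came back as none/default:

#zfs get -r quota,refquota,reservation,refreservation media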
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
There is no reserved space that I am aware of, and your 13.6TB estimate is correct, minus the 12GB for swap space. You will need to figure out whether you really have full 3TB drives and whether the 1000 vs. 1024 accounting is what's biting you. That makes a noticeable difference, though I'm not sure it would account for a full 800GB.

List the drive models you have and the rest of your equipment configuration.

EDIT: I'm such an idiot... You cannot look at any of these values and assume they are exact. For example, if you use 'zfs list', the numbers will not add up neatly. You will get approximately 13.6TB of usable space, but the way ZFS works, it may report less. Take comfort in the fact that you should have the correct amount of space; nothing is really sucking up 800GB.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Ok, this is straightforward to figure out.

Look to see how many sectors you have. Go into the command prompt and use "dmesg" or look in /var/run/dmesg.boot. You will see entries like

ada4: <ST3000DM001-9YN166 CC4H> ATA-8 SATA 3.x device
ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)

So this machine I'm looking at happens to have four of them. The usable disk partition is slightly smaller.

# gpart list ada4
Geom name: ada4
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 5860533134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada4p1
Mediasize: 2147483648 (2.0G)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r1w1e0
rawuuid: 25128729-3857-11e2-8e3c-00505699477a
rawtype: 516e7cb5-6ecf-11d6-8ff8-00022d09712b
label: (null)
length: 2147483648
offset: 65536
type: freebsd-swap
index: 1
end: 4194431
start: 128
2. Name: ada4p2
Mediasize: 2998445415936 (2.7T)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r1w1e2
rawuuid: 25289ef1-3857-11e2-8e3c-00505699477a
rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
label: (null)
length: 2998445415936
offset: 2147549184
type: freebsd-zfs
index: 2
end: 5860533134
start: 4194432
Consumers:
1. Name: ada4
Mediasize: 3000592982016 (2.7T)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r2w2e4

So FreeNAS winds up with 4 x 2998445415936 = 11993781663744 bytes of raw disk space for ZFS. Divide that by 1099511627776 (one terabyte, the 1024-based kind ZFS uses) and you get 10.9TB.
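If you want to double-check that from the shell, a bc one-liner using the partition size from the gpart output above gets the same number:

# echo "scale=1; 4 * 2998445415936 / 1099511627776" | bc
10.9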

Now we ask ZFS what it has.

# zpool list
NAME   SIZE   ALLOC  FREE  CAP  DEDUP  HEALTH  ALTROOT
pool   10.9T  10.1T  807G  92%  1.88x  ONLINE  /mnt

And it has 10.9T. But that's not "usable" space, that's just raw disk space, which is how ZFS accounts for it in "zpool list". And since it's RAIDZ2 on a 4 disk pool, I would expect usable space to be pretty close to half that.
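For six drives, assuming they hand ZFS roughly the same sized partition as mine do, the same arithmetic gives the raw figure, and then the five-sixths of it that is data rather than RAIDZ1 parity:

# echo "scale=2; 6 * 2998445415936 / 1099511627776" | bc
16.36
# echo "scale=2; 16.36 * 5 / 6" | bc
13.63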

So I hope your "zpool list" says ~16.36TB, since all the 3TB drives have about the same number of sectors. Let us know.
 

froggie

Cadet
Joined
Jun 4, 2013
Messages
9
They are 3TB WD Red drives.

#dmesg
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: <WDC WD30EFRX> ATA-9 SATA 3.x device
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)

looks like that would be 16.37TB (or 13.64TB after RAID-Z1)

#zpool list
NAME    SIZE   ALLOC  FREE   CAP  DEDUP  HEALTH  ALTROOT
media   16.2T  3.57T  12.7T  21%  1.00x  ONLINE  /mnt


The FreeNAS Storage GUI says 12.8 TiB for the size and my clients say 12.8TB or 12.9TB for available space on the share.
 

froggie

Cadet
Joined
Jun 4, 2013
Messages
9
I also read somewhere that ZFS on Solaris will reserve 1/64 of the space for its own use. That would be about 256GB for my setup, which would still leave about 500GB missing.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
Code:
FREE
12.7T


What's the output of zfs list?
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
I don't know about anyone else, but my "zpool list" numbers are messed up.

I have 10 3tb drives in raidz3. So capacity is 7x3tb raw. Actual usable capacity is about 17.8tb.

zfs list:

Code:
NAME                USED  AVAIL  REFER  MOUNTPOINT
nas1pool            11.8T  6.02T  905G  /mnt/nas1pool


There's other datasets, but USED + AVAIL = 17.8, as I expected.

Here's zpool list:

Code:
root@nas ~ # zpool list
NAME      SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
nas1pool  27.2T  17.8T  9.47T    65%  1.00x  ONLINE  /mnt


There's no way the pool has a capacity of 27.2tb. Also, 17.8tb can't be allocated. That's the total capacity of the pool.

I've always ignored zpool list because the numbers were just plain wrong. Is it just me?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The FreeNAS Storage GUI says 12.8 TiB for the size and my clients say 12.8TB or 12.9TB for available space on the share.

Mmm hmm. So the pool isn't missing any space, which means the real question is your understanding of the numbers, how the clients represent the space, and so on. No harm in that, but it is complicated.

We've seen new users come in puzzled before, and what you should know is that ZFS is wicked more complex than your average filesystem. You can do all sorts of things to mess with it, including turning on compression, deduplication, multiple copy storage, snapshots, puttering with blocksizes, etc.
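If you want to rule those out, the properties are all visible from the command line; something along these lines (media being your pool, per the zpool list you posted) will show whether any of them are in play, and whether any snapshots exist:

# zfs get compression,dedup,copies,recordsize media
# zfs list -t snapshot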
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I don't know about anyone else, but my "zpool list" numbers are messed up.

I have 10 3tb drives in raidz3. So capacity is 7x3tb raw. Actual usable capacity is about 17.8tb.

zfs list:

Code:
NAME                USED  AVAIL  REFER  MOUNTPOINT
nas1pool            11.8T  6.02T  905G  /mnt/nas1pool


There's other datasets, but USED + AVAIL = 17.8, as I expected.

Here's zpool list:

Code:
root@nas ~ # zpool list
NAME      SIZE  ALLOC  FREE    CAP  DEDUP  HEALTH  ALTROOT
nas1pool  27.2T  17.8T  9.47T    65%  1.00x  ONLINE  /mnt


There's no way the pool has a capacity of 27.2tb. Also, 17.8tb can't be allocated. That's the total capacity of the pool.

I've always ignored zpool list because the numbers were just plain wrong. Is it just me?

Yes, it's just you. As I just finished typing, "ZFS is wicked more complex".

Your capacity is not 7x3TB raw. It is 10x3TB raw. "raw" means raw. You have 10 drives. They're probably allocating 2998445415936 bytes (see above) to ZFS. That's 29984454159360 total spanned bytes. Divide by 1099511627776 (1TB). No $#!+ ... it comes out to 27.2. Look at your zpool list size. Yes you have a 27.2T pool. ZFS says so. The math says so too.

So the thing is, if you're using RAIDZ3 on ten disks, roughly 30% of everything allocated is parity. That means that while your POOL shows 17.8T allocated, that figure includes the parity blocks. Assuming you haven't twiddled any special features like compression or snapshots, the stored file data works out to roughly 17.8TB * 0.7 ≈ 12.4TB ... and lo and behold you just finished showing us that you have 11.8TB used. The remaining half-terabyte or so is metadata plus the fact that RAIDZ parity overhead runs a bit above the nominal 30% for small blocks, not anything actually going missing. I hope I can be forgiven the hand-waving. :eek:
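Running the same sort of check for your setup, assuming your ten drives give ZFS the same 2998445415936-byte partitions mine do:

# echo "scale=1; 10 * 2998445415936 / 1099511627776" | bc
27.2
# echo "scale=1; 17.8 * 7 / 10" | bc
12.4

The first number is your zpool SIZE; the second is the data share of your 17.8T ALLOC.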
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
oh and p.s. don't feel bad about having a rough time deciphering it. It becomes a necessary skill when dealing with a pool with dozens of datasets and reservations, and then other complications like compression, snapshots, and dedup.
 

froggie

Cadet
Joined
Jun 4, 2013
Messages
9
[Attached screenshot: "Screen shot 2013-06-12 at 9.33.15 PM.png" — the Drobo's capacity readout]


I put 5 of the drives into a 5-bay Drobo that I am also testing with. This is what it looks like, and it makes complete sense to me (raw and usable space). The zpool list command also makes sense, showing the raw space. The actual usable space under ZFS, however, is what puzzles me... I guess I just need to drink the Kool-Aid and consider it magical, lol.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's not magical, but it can be fun to bash your head against. It really helps to compare apples to apples, such as looking only at what the ZFS command line utilities report, because adding in other random ways of displaying space just complicates things.

You saw above that it was trivial to decode what was going on with titan_rw's pool, but you kind of need to know how the puzzle works, and it can get substantially more complex than a single dataset. On one of the systems here, for example, most of the datasets report a given number of GB "AVAIL", but the actual usable free space on the pool is a larger number because of reservations. It turns out the true free space on the pool is the AVAIL space for one of the datasets, plus the private reservations, minus the used space for the others... this is wicked annoying to work out, so the cheat is to just look at "zpool list" and know how the pool was built. If you have no clue what I'm talking about, I'm happy for you; keep it that way and your head won't ache.
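If you ever do need to untangle one of these, the least painful starting point is probably the space breakdown view, which splits each dataset's USED into snapshots, the dataset itself, refreservations, and children (substitute your own pool name):

# zfs list -r -o space yourpool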
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Ahh, you're right. I didn't know 'zpool list' was including parity.

I did have another example where jpaetzel on irc couldn't even figure out why a zvol was reporting what it was reporting. I'll see if I can recreate that.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Ahh, you're right. I didn't know 'zpool list' was including parity.

I did have another example where jpaetzel on irc couldn't even figure out why a zvol was reporting what it was reporting. I'll see if I can recreate that.

Easy peasy, twiddle with the reservation vs the volume size (google for "zfs sparse volume") for lots of fun and confusion.
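For instance (names made up), a sparse zvol is just one created with -s, which skips the refreservation entirely, so "used" stays near zero until blocks are actually written:

# zfs create -s -V 900G nas1pool/sparsetest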
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
This isn't using sparse, as far as I know. I created a 900G zvol via the GUI, formatted it as UFS, and then wrote 824G of data to it from /dev/zero with dd. Here are the results:

Code:
root@nas ~ # zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
nas1pool/testzvol  2.16T  3.90T  2.16T  -
 
root@nas ~ # df -h
/dev/zvol/nas1pool/testzvol    885G    824G    -9G  101%    /mnt/test
 
root@nas ~ # du -sh /mnt/test/
824G    /mnt/test/
 
root@nas ~ # zfs get all nas1pool/testzvol
NAME               PROPERTY              VALUE                  SOURCE
nas1pool/testzvol  type                  volume                 -
nas1pool/testzvol  creation              Wed Jun 12 20:25 2013  -
nas1pool/testzvol  used                  2.16T                  -
nas1pool/testzvol  available             3.90T                  -
nas1pool/testzvol  referenced            2.16T                  -
nas1pool/testzvol  compressratio         1.00x                  -
nas1pool/testzvol  reservation           none                   default
nas1pool/testzvol  volsize               900G                   local
nas1pool/testzvol  volblocksize          8K                     -
nas1pool/testzvol  checksum              on                     default
nas1pool/testzvol  compression           off                    default
nas1pool/testzvol  readonly              off                    default
nas1pool/testzvol  copies                1                      default
nas1pool/testzvol  refreservation        928G                   local
nas1pool/testzvol  primarycache          all                    default
nas1pool/testzvol  secondarycache        all                    inherited from nas1pool
nas1pool/testzvol  usedbysnapshots       0                      -
nas1pool/testzvol  usedbydataset         2.16T                  -
nas1pool/testzvol  usedbychildren        0                      -
nas1pool/testzvol  usedbyrefreservation  0                      -
nas1pool/testzvol  logbias               latency                default
nas1pool/testzvol  dedup                 off                    default
nas1pool/testzvol  mlslabel                                     -
nas1pool/testzvol  sync                  standard               default
nas1pool/testzvol  refcompressratio      1.00x                  -
nas1pool/testzvol  written               2.16T                  -

 


Note that "du", "volsize", and "refreservation" all report what I'd call 'correct' numbers.

"used", "referenced", "usedbydataset", and "written" don't seem to be correct. There are no snapshots, as reflected by "usedbysnapshots" being 0. Compression and dedup are off. Copies is 1. Nothing sparse is going on that I'm aware of. "zfs list" has weird numbers, as does "df". How do you get -9G free?

I was at a loss to explain the numbers. So was jpaetzel when I asked him about it on irc. There very well may be some things zfs is doing in the backend that I'm not aware of.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
If you create an FFS filesystem with a free space reservation and then let root fill the filesystem, of course it gets more than 100% full; the "avail" number is what's left after the minfree reserve is subtracted, so it goes negative once you push past the reserve, and only root is allowed to do that.
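Your own df output shows the mechanics, assuming the stock 8% minfree: 885G size minus 824G used leaves about 61G genuinely free, and subtracting the roughly 70G reserve (8% of 885G) is what produces the -9G "avail" and the 101% capacity.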

fix 1: don't do that as root. do it as a regular user.

fix 2: "umount /mnt/test; tunefs -m 0 /dev/zvol/nas1pool/testzvol; mount /dev/zvol/nas1pool/testzvol /mnt/test"

fix 1 is better.

I once looked at a similar problem for the used...etc...written stuff early on and determined that it was doing something confusing but internally consistent somehow. There's something somewhere on a blog post that talks about it but I can't find it.
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
So with fix 1, use a regular user to put data on the ufs drive?

I tried fix 2 on the current ufs zvol. That did change the reported size by 'df'. But the numbers in 'zfs get all nas1pool/testzvol' are still what I'd call 'weird'.

Should I try recreating the zvol and writing the data as a regular user?
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Tried recreating it.

Created a 900G zvol and formatted it as UFS. df -h showed 814G available, so I created a 700G test file with dd as a regular user.

Code:
root@nas ~ # df -h
/dev/zvol/nas1pool/testzvol    885G    700G    114G    86%    /mnt/test
 
root@nas ~ # du -sh /mnt/test/
700G    /mnt/test/

All this seems right. Here is where the 'funny' numbers start:
Code:
root@nas ~ # zfs list
NAME               USED   AVAIL  REFER  MOUNTPOINT
nas1pool/testzvol  1.84T  4.20T  1.84T  -
root@nas ~ # zfs get all nas1pool/testzvol
NAME               PROPERTY              VALUE                  SOURCE
nas1pool/testzvol  type                  volume                 -
nas1pool/testzvol  creation              Thu Jun 13 11:34 2013  -
nas1pool/testzvol  used                  1.84T                  -
nas1pool/testzvol  available             4.20T                  -
nas1pool/testzvol  referenced            1.84T                  -
nas1pool/testzvol  compressratio         1.00x                  -
nas1pool/testzvol  reservation           none                   default
nas1pool/testzvol  volsize               900G                   local
nas1pool/testzvol  volblocksize          8K                     -
nas1pool/testzvol  checksum              on                     default
nas1pool/testzvol  compression           off                    default
nas1pool/testzvol  readonly              off                    default
nas1pool/testzvol  copies                1                      default
nas1pool/testzvol  refreservation        928G                   local
nas1pool/testzvol  primarycache          all                    default
nas1pool/testzvol  secondarycache        all                    inherited from nas1pool
nas1pool/testzvol  usedbysnapshots       0                      -
nas1pool/testzvol  usedbydataset         1.84T                  -
nas1pool/testzvol  usedbychildren        0                      -
nas1pool/testzvol  usedbyrefreservation  0                      -
nas1pool/testzvol  logbias               latency                default
nas1pool/testzvol  dedup                 off                    default
nas1pool/testzvol  mlslabel                                     -
nas1pool/testzvol  sync                  standard               default
nas1pool/testzvol  refcompressratio      1.00x                  -
nas1pool/testzvol  written               1.84T                  -

Why does it show that it's using 1.84T? That's way more than the parity would make up for.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
So with fix 1, use a regular user to put data on the ufs drive?

Right.

I tried fix 2 on the current ufs zvol. That did change the reported size by 'df'. But the numbers in 'zfs get all nas1pool/testzvol' are still what I'd call 'weird'.

Should I try recreating the zvol and writing the data as a regular user?

No, it wasn't intended to fix that. It was just illustrating your error with FFS/UFS space reporting. Blocks allocated by FFS and written to the ZFS zvol device are written regardless of the UID, which is only dealt with in the FFS layer.

I don't recall exactly what the issue with the ZFS reporting was, just that it was pretty easy to characterize and it had some sort of internal consistency, and then I saw someone's blog post about it that led me to dismiss it as irrelevant anyway. You could try mentioning it on the freebsd-fs mailing list (not the forum, the mailing list) and see what the opinion over there is. If I weren't so busy I'd try it on IllumOS and see what happens there.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
Why does it show that it's using 1.84T? That's way more than the parity would make up for.
Which is double this:
Code:
nas1pool/testzvol  refreservation        928G                  local
Other people have run into this before, but I was unable to reproduce the issue myself.
 