According to the manual, ZFS changes the record size dynamically if not configured to a fixed value like 1M, if I understand it correctly:
ZFS does use a dynamic record size, but a record can never be larger than the recordsize property you have set. So by default it can't go higher than 128KiB unless you raise the recordsize property. FWIW, recordsize can also only be set to powers of 2.
This is kind of irritating. I understand that it may be hard to calculate the free space, but why can the occupied space not be calculated and displayed exactly?
It can, I guess. It's just not implemented.
Well is the occupied space really meaningful if the capacity and free space can't be calculated?
The capacity and free space are not just difficult to calculate, they are impossible to calculate since the amount of capacity you have depends on how much data you end up writing to the different record sizes. You can't predict ahead of time how much data you are going to write to each recordsize dataset.
So remember, when I write a 10GiB file for instance to my 12-wide RAIDZ2, zfs used shows 9GiB "USED" for that file.
(disk usage) also shows 9GiB.
But remember, the total capacity reported for the pool is lower than the real figure, because ZFS subtracts the overhead up front on the assumption that everything is stored in 128KiB records. So if ZFS showed how much disk space was actually being used by the file(s), we would eventually get to the point where zfs says we are using more disk space than the capacity of the pool, and that's just confusing.
ZFS also can't really assume 1MiB records when calculating the capacity, because it can't rule out that you will fill the pool with 128KiB records. If you did that, the pool would become completely full even though it still reported hundreds of GiB, or even more than 1TiB, free. It's better to underestimate than to overestimate.
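To put rough numbers on that overhead, here is a minimal Python sketch of how many sectors one record takes up on a RAIDZ vdev. It is a simplified model, not ZFS code: the function names are made up for illustration, and it assumes ashift=12 (4KiB sectors), no compression, and that each record gets its own parity sectors and is padded up to a multiple of (parity + 1) sectors.

```python
import math

def raidz_alloc_sectors(record_bytes, width, nparity, ashift=12):
    """Sectors allocated for one record (simplified RAIDZ model, ashift assumed)."""
    sector = 1 << ashift
    data = math.ceil(record_bytes / sector)
    # Parity sectors are added per stripe row of (width - nparity) data sectors.
    parity = nparity * math.ceil(data / (width - nparity))
    # The allocation is padded up to a multiple of (nparity + 1) sectors.
    unit = nparity + 1
    return math.ceil((data + parity) / unit) * unit

def efficiency(record_bytes, width, nparity, ashift=12):
    """Fraction of the allocated sectors that hold actual data."""
    data = math.ceil(record_bytes / (1 << ashift))
    return data / raidz_alloc_sectors(record_bytes, width, nparity, ashift)

# 12-wide RAIDZ2, as in the example pool from this thread
for label, size in (("128KiB", 128 * 1024), ("1MiB", 1024 * 1024)):
    sectors = raidz_alloc_sectors(size, 12, 2)
    print(f"{label}: {sectors} sectors, {efficiency(size, 12, 2):.1%} data")
```

Under those assumptions a 128KiB record takes 42 sectors to hold 32 sectors of data, while a 1MiB record takes 309 sectors to hold 256, so the larger records waste noticeably less space on this layout.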
So to try to reiterate. You can see the actual size of the file with
You can also see the amount of disk space the filesystem thinks it's using, but that number needs to relate to the total capacity of the filesystem or else everything gets screwy. So since the reported pool capacity is smaller than it really is, the 1MiB recordsize files need to show up smaller than they really are and the 128KiB recordsize files need to show up exactly as large as they are.
Say you have a 10TiB "capacity" pool. Best case is you write 10TiB of 128KiB recordsize files and you fit exactly 10TiB of files on the pool. Worst case is you write 10TiB of 1MiB recordsize files to the pool and you are left with 1TiB free. Nobody is gonna complain about that extra free space.
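A rough way to see where a number like 9GiB comes from is to scale the file's real raw-space cost by the cost that the capacity figure assumes for 128KiB records. The Python sketch below does that under the same simplified RAIDZ model (ashift=12, no compression, file made of whole records). The helper names are hypothetical, and this approximates the idea rather than reproducing the actual ZFS accounting code.

```python
import math

GiB = 1024 ** 3

def raw_cost_per_data_sector(record_bytes, width, nparity, ashift=12):
    """Allocated sectors per data sector for one record (simplified model)."""
    data = math.ceil(record_bytes / (1 << ashift))
    parity = nparity * math.ceil(data / (width - nparity))
    unit = nparity + 1  # allocations are padded to a multiple of this
    alloc = math.ceil((data + parity) / unit) * unit
    return alloc / data

def reported_used(file_bytes, recordsize, width=12, nparity=2,
                  assumed_recordsize=128 * 1024):
    """Approximate USED when capacity accounting assumes 128KiB records."""
    actual = raw_cost_per_data_sector(recordsize, width, nparity)
    assumed = raw_cost_per_data_sector(assumed_recordsize, width, nparity)
    return file_bytes * actual / assumed

for label, rs in (("128KiB", 128 * 1024), ("1MiB", 1024 * 1024)):
    used = reported_used(10 * GiB, rs)
    print(f"10GiB file, recordsize {label}: about {used / GiB:.2f}GiB used")
```

With these assumptions, the 10GiB file written with 1MiB records comes out at roughly 9.2GiB on a 12-wide RAIDZ2, while the same file written with 128KiB records comes out at 10GiB.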
The other option is you instead state that the pool "capacity" is 11TiB. The best case is you write 11TiB of 1MiB recordsize files as you expect and it fills up the pool. Worst case is you queue up 10.5TiB of 128KiB recordsize files to this 11TiB capacity pool, but it fails after writing only 10TiB as the pool ends up full because the 128KiB records can't be efficiently stored on that number of disks in the RAIDZ2 and it had to add padding.
This is obviously problematic as the admin needs to know that the data he plans to write will fit ahead of time without having to do extra math himself.