How is free space in ZFS determined?

CLEsportsfan

Dabbler
Joined
Nov 23, 2020
Messages
20
After reading the "path to success for block storage" link, I have a question about how exactly ZFS determines "free space" as far as performance goes. Does it go by unallocated space on the drives, or by unused space within allocated sections?

For example: if I have a 10TB pool (say, two 10TB drives in a mirror) and I create an 8TB volume for an ESXi datastore, but I only have 5TB of files stored on it, what does ZFS consider my "free space"? The 2TB unallocated, the 3TB unused inside the volume, or the 5TB difference between the drive size and what's actually stored on the drive?

Obviously free space is a good thing for ZFS performance. Your responses will basically tell me whether it's better to make a volume only slightly bigger than the files I have to store, or to create the biggest volume I can regardless of how many files I'll have.

Sorry if this is too much of a "newb" question. I searched around the forum and couldn't find a specific answer.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
TL;DR: make multiple sparse zvols holding VMFS6 datastores of 1-2TB each, and scale up once you determine how well your data compresses, but don't overcommit your storage unless you are 100% certain you aren't shooting yourself in the foot.

Longpost:

From a "performance" perspective, you'll start with all of your space being "free" because the zvol will have no data inside it. ZFS will have no records written to disk. From a logical perspective your pool is 20% free, but for performance it will be as if you were 100% free.

Once you write that 5TB of files into it, the next step depends on how well the compression handles it. If you get no compression at all (very unlikely on regular VM files) then you have 5TB of records written. Your pool is now 50% free. If it compresses 1.25:1 then you only write 4TB of physical data for that 5TB of logical, and your pool is 60% free.
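
If you want to watch those numbers yourself, the zvol's logicalused/used and compressratio properties show the logical-versus-physical split, and zpool list shows what the pool considers free. A quick sketch (pool name "tank" and zvol name "esxi-zvol" are just placeholders for your own names):

# Logical data written vs. physical space consumed, plus the achieved compression ratio
zfs get volsize,logicalused,used,compressratio tank/esxi-zvol

# Pool-wide view: SIZE, ALLOC, FREE, FRAG and CAP columns
zpool list tank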

The next bit of fun comes when you start making changes/updates/deletes from that VMFS filesystem. If it's a thick provisioned zvol (as they are by default) then there's no ability to reclaim that free space. Due to the copy-on-write nature of ZFS, after another 5TB (compressing to 4TB) of changes/updates/deletes, the zvol is fully allocated. VMFS thinks it's writing into "empty space" but ZFS is still protecting the old logical VMFS block/LBA and has to consider a "new write" as "overwrite" - the pool is treated as 20% free, and performance is suffering.
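
The "thick by default" behaviour shows up as a refreservation on the zvol; a sparse zvol simply doesn't carry that reservation. A rough illustration (placeholder names again):

# A thick zvol reserves its full volsize up front (refreservation is set)
zfs create -V 8T tank/thick-zvol
zfs get volsize,refreservation tank/thick-zvol

# A sparse zvol (-s) sets refreservation=none, so only written blocks consume pool space
zfs create -s -V 8T tank/sparse-zvol
zfs get volsize,refreservation tank/sparse-zvol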

This is why you use thin provisioned or "sparse" zvols - when those are used in combination with VMFS6 (or VMFS5 with manual space reclamation) your change/updates/deletes in VMFS are passed down to the ZFS layer as SCSI UNMAP commands. ZFS can then free up the old/stale/dirty records that have newer versions and make truly free space. While this doesn't negate the fragmentation effects of copy-on-write, it means it can now see nice clear slabs of empty space to write into. All 5TB of your data changes? You might have a surge of consumption, but over time the UNMAP/reclamation will trend it back to the original 4TB it compresses to. Your pool is still 60% free, and it keeps that steady level of performance.
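
On the ESXi side, VMFS6 reclaims space automatically (asynchronously), while VMFS5 needs the unmap run by hand. Roughly, from the ESXi shell (the datastore label is a placeholder):

# VMFS5: manually reclaim dead space and pass UNMAP down to the backing zvol
esxcli storage vmfs unmap -l MyDatastore

# VMFS6 on recent ESXi: check that automatic space reclamation is enabled
esxcli storage vmfs reclaim config get -l MyDatastore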

However, since you don't know how well your data compresses, it's a good idea not to spend all your bits in one place. Start with a 1TB/2TB zvol, apply VMFS6, and throw some data on it. See how much space is actually consumed at the ZFS level. Repeat as necessary.
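
As a concrete sketch of that test (the names and the 16K volblocksize are just example choices, not a recommendation):

# Create a 2TB sparse zvol, share it over iSCSI, and format it as a VMFS6 datastore
zfs create -s -V 2T -o volblocksize=16K tank/vmfs-test

# After copying some representative VMs onto it, see what ZFS actually consumed
zfs list -o name,volsize,logicalused,used tank/vmfs-test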
 

CLEsportsfan

Dabbler
Joined
Nov 23, 2020
Messages
20
Thank you very much for the information! The VMs we're migrating are already thick provisioned right now, so I'll have to research how to convert them, if that's possible. It might just have to be the data drives attached to the VMs if I can't convert the vmdks.
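
(For what it's worth, thick vmdks can usually be converted: a Storage vMotion or cold migration to the new datastore lets you pick "Thin Provision" as the target disk format, and from the ESXi shell vmkfstools can clone a disk thin. The paths below are placeholders:)

# Clone a thick vmdk to a thin copy (with the VM powered off), then repoint the VM at the new disk
vmkfstools -i /vmfs/volumes/old-ds/myvm/myvm.vmdk -d thin /vmfs/volumes/new-ds/myvm/myvm.vmdk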
 