Using 100% of the disk space for iSCSI

fabioteixei

Dabbler
Joined
Feb 10, 2022
Messages
11
Hi there.

I searched, but I still don't understand why I can't use 100% of the available space on my disks to create vdevs and zvols.

I have five 480 GB SSDs and I'd like to use them as a kind of JBOD for storing Hyper-V VHDs over iSCSI.

I'm not worried about parity right now. I use an external backup for safety, and this is a home lab environment. If I lose any data, I can rebuild the whole lab and all of the VMs with no problem.

Can anyone please help me understand why I can't use all the available space on the disks?

What's the recommended configuration for this kind of setup?

Thanks.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Your questions and the inevitable ones you didn't ask are answered in the linked resource.

 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It's worth pointing out that filling up a filesystem to 100% is not a recipe for success on any filesystem, with any workload.
 

fabioteixei

Dabbler
Joined
Feb 10, 2022
Messages
11
Your questions and the inevitable ones you didn't ask are answered in the linked resource.

I have read that, but the doubt remains.

That article says it's ideal to create a zvol using only 50% of the pool. Does that mean I have to throw away 50% of my disk space?

I don't think it's feasible to leave 50% of my total disk space inaccessible. It's like throwing half the money I paid for the disks in the trash.

I intend to use my TrueNAS only for iSCSI. I know I won't use all of the space all the time, but blocking the possibility of using that space in an emergency (moving data from one server to another, for instance) doesn't seem like a good use of my hardware.

Can't I at least create two zvols, each with 50%?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
“Throwing money away” is a rather simplistic and inaccurate analysis. Context matters, and fundamentally this is a matter of trading storage capacity for performance, through the magic of computer science. Fill it up more and performance will degrade substantially. Fill it to 95% and it’ll be like watching paint dry. Fill it to 100% and you’ll have a mess on your hands.

And who said anything about blocking storage space?
 

fabioteixei

Dabbler
Joined
Feb 10, 2022
Messages
11
And who said anything about blocking storage space?
What I mean by blocking is that 50%, or even 20%, of all my disk space will be unavailable if I follow the recommendations.

I know I can force the size to 100% if I want to, and in the end I don't really have to use TrueNAS or even iSCSI. I can create an SMB 3.0 share and map it to my Hyper-V server, since that's a supported storage location for Hyper-V.

Well, anyway, thanks for your help. I will evaluate my options.

But you say that getting to 95% or 100% can get me in trouble. Can you expand on that?

Remember, it's a home lab with no really important data (maybe my Plex VM movie library) and with an external backup, so resilience is not a concern for this use case.

Again, thanks very much.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Regardless of the protocol used to share it (SMB/NFS/iSCSI) filling a ZFS pool to 100% will do Very Bad Things.

This is because of the nature of ZFS as a "transactional" and "copy-on-write" filesystem. What this boils down to is that there's no such thing as a "partial write" - similar to how a SQL or other transactional database won't perform an operation unless the entire transaction can be committed to the DB tables, ZFS won't write a new record, change an existing one, or mark one for deletion unless there's enough space to write a new copy of the necessary data and/or metadata indicating as such - and then commit that metadata/pool state change all the way up the tree, until finally changing the uberblock to say "the new pool state is valid."

So if you fill the filesystem to a true 100% full, there's no space for ZFS to indicate "hey, I'd like to delete this 128K record" because it doesn't have a way to keep the pool's "current state" valid/immutable for the past transaction (it can't overwrite or delete in-place) while writing the metadata to say "delete record XYZ" for the "future state."
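If it helps to see the idea in miniature, here's a toy copy-on-write allocator in Python - purely illustrative, nothing to do with actual ZFS internals - showing why even a delete needs a little free space to land:

Code:
# Toy copy-on-write store: every change, including a delete, must first
# allocate space for the new state before the old state can be freed.
# Purely illustrative - not how ZFS is actually implemented.

class CowStore:
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.used = 0

    def write(self, blocks):
        if self.used + blocks > self.capacity:
            raise RuntimeError("no space for new data blocks")
        self.used += blocks

    def delete(self, blocks):
        # Even a delete needs one free block to record the "future state"
        # metadata before the superseded blocks can actually be released.
        if self.used + 1 > self.capacity:
            raise RuntimeError("pool 100% full: no room to record the delete")
        self.used += 1            # new metadata written and committed
        self.used -= blocks + 1   # old data plus superseded metadata freed afterwards

store = CowStore(capacity_blocks=100)
store.write(100)   # fill the pool to a true 100%
store.delete(10)   # raises: there's no room left to record the deletion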

In regards to the other fill levels - with block storage (or SMB being treated as block-equivalent by serving VHD(X) files) the challenge is that you'll end up with fragmentation. Using all NAND lets you avoid the latency penalty of physical disk seeks, but you'll still likely see some degradation in write performance if you manage to outrun the garbage-collection routines on your SSDs and have to write to dirty/partially used blocks. Leaving some free space lets the SSD write into unmapped space, which is faster.

50% was Ye Olde Thumbrule for when you'd start to see noticeable pain on spinning disks. NAND you can usually push higher than that, but as mentioned before, watch out for the GC routines. Better SSDs tend to be able to push closer to the wall; it depends on their firmware, amount of internal overprovisioning, etc. If your SSDs are Intel DC/HGST/etc. you may have no problems until 80%+ - if they're SuperHappyFunBee from the Amazon bargain bin, less so.

But here's what you can do.

ZFS does inline compression very well using LZ4 or ZSTD (former tends to be faster, latter tends to have better compression - test with your dataset!) so you can certainly create a sparse ZVOL that's around half the size of your pool. You're striping 5x480G so you'll get roughly 2.3T usable in the pool, make a 1T sparse ZVOL, and then start loading data on it. Compare the logical size of data that you put on it (VHD allocated sizes) and see what kind of compression numbers you get. If you're getting a relatively conservative 1.33:1 compression ratio, that lets you make another 1T ZVOL and only use a grand total of about 1.5T of actual NAND to hold 2T of VHDs. Well under the margin of error.
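To put rough numbers on that sizing (back-of-the-envelope only - measure your own compression ratio, the 1.33:1 figure here is just an assumed example):

Code:
# Back-of-the-envelope sizing for a 5 x 480 GB stripe with sparse ZVOLs.
# Illustrative numbers only; the compression ratio is an assumed example.

usable_tb = 2.3                # approximate usable pool size from 5 x 480 GB striped

zvol_count = 2                 # two 1 TB sparse ZVOLs presented to Hyper-V
zvol_tb = 1.0
compression_ratio = 1.33       # conservative LZ4/ZSTD result, for example

logical_tb = zvol_count * zvol_tb
physical_tb = logical_tb / compression_ratio

print(f"logical VHD data: {logical_tb:.1f} TB")
print(f"physical NAND used: {physical_tb:.2f} TB of {usable_tb} TB usable")
print(f"pool fill level: {physical_tb / usable_tb:.0%}")   # roughly 65%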

Warning, here be potential dragons.

If you get better compression, and/or you're absolutely confident that you won't mess something up you can decide to overcommit storage by adding a third 1T ZVOL (3T logical) and make the necessary blood sacrifice to the compression gods to fit that into 2.25T, just squeaking into that 2.3T physical space. But if you're running a 5-drive stripe you're probably okay with some risk anyways. ;)
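For that overcommitted case, the required ratio is easy to sanity-check (just arithmetic, not a guarantee your data will compress that well):

Code:
# What ratio is needed to fit 3 TB of logical ZVOLs into ~2.25 TB physical?
logical_tb = 3 * 1.0          # three 1 TB sparse ZVOLs
target_physical_tb = 2.25     # just under the ~2.3 TB usable pool

required_ratio = logical_tb / target_physical_tb
print(f"required compression ratio: {required_ratio:.2f}:1")   # ~1.33:1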

Cheers
 

fabioteixei

Dabbler
Joined
Feb 10, 2022
Messages
11
Thanks, great info.

And how about deduplication?

My system is a 2x Xeon with 48 GB of RAM. Because I will be using it to host VHDs, almost all of them Windows on the exact same version, I think deduplication might give me even more available space, but I'm not sure about the possibility of data loss or the performance impact.

Do you have any thoughts on using dedup in this scenario?

Again, thanks for the help. Because it's a lab environment, it's the best place to test the possibilities.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
And how about deduplication?

Short answer is "don't" - deduplication is generally not recommended unless you have a lot of resources to throw at it (not just RAM, but low-latency storage such as Optane for the metadata/deduplication tables) - @Stilez has written a couple of excellent resources chronicling their adventures:



I would stick with just seeing what results you get from LZ4 or maybe ZSTD compression (bias towards space-savings vs performance) - if you do choose to experiment with deduplication (it's a homelab, experimentation is what they're for, right?) I'd suggest splitting your virtual image into separate disks for OS/Application/Data and only storing the OS disk on the dedup-enabled ZVOL. You'll still be subject to the bathtub curve of performance on non-Optane SSDs but there will be less data to index.
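If you want to gauge the cost before flipping dedup on, a commonly cited rule of thumb is very roughly 320 bytes of RAM (or fast metadata storage) per unique block in the dedup table - treat the exact figure as an assumption, since it varies with pool layout and record/volblock size:

Code:
# Rough dedup table (DDT) footprint estimate for a dedup-enabled ZVOL.
# The ~320 bytes per unique block is a commonly cited rule of thumb, not exact.

TIB = 1024**4

dedup_data_bytes = 1 * TIB      # size of the dedup-enabled ZVOL (example)
volblocksize = 16 * 1024        # 16K volblocksize, for example
bytes_per_entry = 320           # assumed rule-of-thumb value

unique_blocks = dedup_data_bytes // volblocksize
ddt_gib = unique_blocks * bytes_per_entry / 1024**3

print(f"unique blocks: {unique_blocks:,}")
print(f"estimated DDT size: {ddt_gib:.0f} GiB")   # ~20 GiB for this example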
 