compression: slows down zfs+server or just the humans accessing files?

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
I don't think the manual is clear on this topic.

Does choosing a high compression level affect
1) scrubs, replications (SMART tests?) and ZFS housekeeping, thus slowing down the server as a whole,
2) or just the read/write times when humans want to access a compressed file,
3) or both?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
I believe it's 3, according to the Guide, Section 10.2.10.2:

ZFS automatically compresses data as it is written to a compressed dataset or zvol and automatically decompresses that data as it is read.
Any reads/writes in a compressed dataset will incur compression overhead.
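For context on how that works in practice, here's a minimal sketch (the dataset name tank/data is hypothetical): compression is a per-dataset property, it only applies to blocks written after it is set, and ZFS reports the ratio it actually achieves as a read-only property:

# enable compression on a dataset; only data written afterwards is compressed
zfs set compression=lz4 tank/data

# check the algorithm in use and the ratio actually achieved
zfs get compression,compressratio tank/data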
 

guermantes

Patron
Joined
Sep 27, 2017
Messages
213
Samuel Tai said:
I believe it's 3, according to the Guide, Section 10.2.10.2:

ZFS automatically compresses data as it is written to a compressed dataset or zvol and automatically decompresses that data as it is read.
Any reads/writes in a compressed dataset will incur compression overhead.

Yes, I read that section, but I find it ambiguous, as "read" could also be taken to mean accessed by a human. It seems counter-intuitive to me that ZFS would care about the decompressed data when it comes to parity checking and housekeeping tasks. Would not ZFS have more reason to be concerned with the compressed ones and zeros actually being stored on the drives?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
See this STH test of various ZFS compression algorithms.
 

subhuman

Contributor
Joined
Nov 21, 2019
Messages
121
SMART tests are definitely not affected, as they're run by the drive's own firmware, not by FreeNAS (or any OS, for that matter).
Someone can feel free to enlighten me, but I can't see a scenario where a scrub would need to decompress data, meaning I expect it isn't affected by compression level (directly). Indirectly, a scrub should theoretically be faster with higher compression, since scrubs only check used disk space and higher compression means less used disk space.
As for replication, if the dataset and the data stream use different compression, then I expect it must decompress and re-compress the data. I would hope there's a check so that if the dataset and data stream use the same compression, it doesn't waste CPU cycles decompressing data and then immediately re-compressing it with the same algorithm.
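For what it's worth, newer OpenZFS releases do have that: zfs send -c (--compressed) ships blocks exactly as they sit on disk, so nothing is decompressed on the sending side or re-compressed on the receiving side. A minimal sketch, assuming a hypothetical snapshot tank/data@snap1, a receiving host named backuphost, and an OpenZFS version recent enough to support -c:

# send blocks still compressed as stored on disk; no decompress/re-compress on either end
zfs send -c tank/data@snap1 | ssh backuphost zfs receive backup/data

# a scrub likewise only walks allocated blocks; ALLOC here is the compressed on-disk footprint
zpool list tank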
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Samuel Tai said:
See this STH test of various ZFS compression algorithms.
Side note: the STH test uses Optane and SATA SSDs, but compression is often even more beneficial when your backing vdevs are spinning disks and the achieved compression ratio is high. In the sample there they got nearly 2x compression, so your HDDs only receive half as much workload as they would without compression.
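One quick way to see how much work the disks are being spared (the dataset name tank/data is again hypothetical) is to compare the logical, uncompressed size against what's actually stored:

# logicalused = uncompressed size of the data; used = what actually hits the disks
zfs get logicalused,used,compressratio tank/data

At a 2x compressratio, used comes out at roughly half of logicalused, which is the reduced HDD workload described above.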
 