Theoretical: dedupe AND compression?

Status: Not open for further replies.

AgentZero · Dabbler · Joined Jan 7, 2013 · Messages: 24
Before anyone says that dedupe is a resource hog and should never be used with ZFS, let's assume the volume in question is no larger than 1TB and the server has ample RAM - 64GB. Let's also assume it is storing virtual machines that are for the most part very similar, and that the expected dedupe ratio, based on testing, is at least 1.9:1.

Further, let's assume the volume is composed of mirrored pairs of 15K SAS drives, with SSDs for L2ARC and ZIL, and is presented to VMware over NFS.

Now the hypothetical question: deduplication works by storing pointers when duplicate blocks are detected, and if two blocks are identical in raw form, they should also be identical once compressed. My question is really about which operation happens first in the write path. Does the dedupe engine checksum the block first, then compress it if it is found to be unique, before writing it to disk? Or is the block compressed first and then checksummed against the DDT?
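For what it's worth, here is a quick sanity check of the "identical raw implies identical compressed" assumption - a toy Python sketch, not ZFS code; zlib and SHA-256 merely stand in for whatever compressor and checksum the pool actually uses:

    import hashlib
    import zlib

    # Two byte-identical "blocks", like ones a cloned VM image might produce.
    block_a = b"\x00" * 2048 + b"guest OS page " * 128
    block_b = b"\x00" * 2048 + b"guest OS page " * 128

    # Deterministic compression: identical input gives identical output...
    comp_a = zlib.compress(block_a, 6)
    comp_b = zlib.compress(block_b, 6)
    assert comp_a == comp_b

    # ...so checksums taken after compression still collide, and the
    # blocks remain candidates for deduplication.
    assert hashlib.sha256(comp_a).digest() == hashlib.sha256(comp_b).digest()
    print("compressed blocks identical; dedupe still applies")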
 

cyberjock · Inactive Account · Joined Mar 25, 2012 · Messages: 19,526
Before a write, the order of operations is:
  1. Compression - data is compressed (if enabled)
  2. Encryption - data is encrypted (if enabled; N/A for the open-source version of ZFS)
  3. Checksum - a checksum is computed over the (compressed) data
  4. Deduplication - blocks whose checksum already exists in the DDT are replaced with references to the existing copy
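To make that ordering concrete, here is a toy sketch in Python (not actual ZFS code - the ddt/disk/write_block names are invented for illustration, and real ZFS uses LZ4/gzip with fletcher4 or SHA-256):

    import hashlib
    import zlib

    ddt = {}    # dedup table: checksum of compressed block -> (address, refcount)
    disk = []   # pretend storage; the list index stands in for a disk address

    def write_block(raw: bytes) -> int:
        compressed = zlib.compress(raw)            # 1. compress first
        # 2. (encryption would happen here; N/A for open-source ZFS)
        key = hashlib.sha256(compressed).digest()  # 3. checksum the compressed data
        if key in ddt:                             # 4. dedupe against the DDT last
            addr, refs = ddt[key]
            ddt[key] = (addr, refs + 1)            # duplicate: bump refcount, skip the write
            return addr
        disk.append(compressed)                    # unique: actually write it
        ddt[key] = (len(disk) - 1, 1)
        return len(disk) - 1

    # Two identical VM blocks end up as a single on-disk copy.
    first = write_block(b"guest OS block" * 256)
    second = write_block(b"guest OS block" * 256)
    assert first == second and len(disk) == 1

The point of the sketch: the DDT is keyed on the checksum of the compressed block, which is why identical raw blocks still dedupe even with compression enabled.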
 

fracai · Guru · Joined Aug 22, 2012 · Messages: 1,212
Dedup occurs after encryption? Wouldn't that make the two incompatible? Oh wait, since encryption is N/A for the ZFS that FreeNAS uses, I presume GELI encryption occurs as the final step, below ZFS at the block-device layer?
 

AgentZero · Dabbler · Joined Jan 7, 2013 · Messages: 24
Thanks CJ - I was hoping it was the other way around, with dedupe occurring first so that duplicates are removed up front and the additional compression and/or encryption work is skipped for them.
 