Fast ZFS Dedup - Sponsorship Request

Status
Not open for further replies.

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
It’s no secret that ZFS and deduplication have had performance issues in the past. Use cases that demand high read performance have aligned well with deduplication, given that compressed and deduplicated data is stored only once in the ZFS read cache. Sustained write performance, however, has presented challenges: the overhead of managing dedup metadata has restricted performance and limited pool sizes. iX has started a new “Fast Dedup” project to address both the performance and scalability limits.

Previous Dedup Work

Special vdevs in TrueNAS 12 added a dedicated group of SSDs to hold the deduplication table, increasing performance and expanding acceptable use-cases. With a dedup vdev in place, the small random I/O patterns of the deduplication table (DDT) could be quickly handled by a few high-performance SSDs rather than competing with data writes to the pool HDDs. Our own TrueNAS Community users conducted research and shared detailed results of configurations that worked well with their workloads. While dedup write performance was improved, several key limitations remained, such as the inherent write amplification of updating the DDT. Even with a dedup vdev, if the DDT does not fit entirely in RAM, ZFS pool performance is still much lower.
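
To put rough numbers on why DDT size matters, here is a back-of-envelope estimate in Python. The ~320 bytes per in-core entry is a commonly cited rule-of-thumb figure rather than an exact OpenZFS constant, and the helper name is invented for illustration:

```python
# Rough back-of-envelope estimate of classic (pre-Fast-Dedup) DDT RAM needs.
# Assumes ~320 bytes of RAM per unique block tracked, a commonly cited
# rule-of-thumb figure rather than an exact OpenZFS constant.

BYTES_PER_DDT_ENTRY = 320  # assumed in-core size per unique block

def ddt_ram_gib(deduped_data_tib: float, avg_block_kib: float) -> float:
    """Estimate the RAM (GiB) needed to hold the whole DDT in memory."""
    unique_blocks = (deduped_data_tib * 2**40) / (avg_block_kib * 2**10)
    return unique_blocks * BYTES_PER_DDT_ENTRY / 2**30

# 10 TiB of deduped data at a 64 KiB average block size -> ~50 GiB of DDT
print(f"{ddt_ram_gib(10, 64):.0f} GiB")
# The same 10 TiB at 16 KiB blocks -> ~200 GiB, far beyond typical RAM
print(f"{ddt_ram_gib(10, 16):.0f} GiB")
```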

Proposals for improving deduplication have been written previously, including Matt Ahrens’ Dedup Performance paper presented at the OpenZFS Developer Summit. Some of the ideas in that proposal are being used in the new project.

The Fast Dedup Project

iXsystems and Klara Systems have started a new Fast Dedup project in conjunction with the OpenZFS community. The work is underway and will require significant resources to complete.

Fast Dedup will include a rearchitecting of the DDT to improve performance. The new Fast Dedup Tables (FDT) have several significant new properties:
  • Dynamically sized to fit into RAM,
  • Broken into smaller, more manageable chunks for efficient updating,
  • Automatically pruned of non-dedupable data for better RAM efficiency,
  • Updated through a log-based write process that massively reduces write amplification (a conceptual sketch follows below).

All of these will result in a significant increase in performance when compared to the existing dedup experience.
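
As a purely conceptual sketch of the log-based idea (not the actual OpenZFS code; the class and method names below are invented for illustration), the pattern is to append DDT changes to a sequential log and fold them into the table in batches, trading many small random writes for occasional larger, batched ones:

```python
# Conceptual model only: a dedup table that batches updates through a log.
# The real Fast Dedup code in OpenZFS is far more involved; the names here
# are invented for illustration.

class LoggedDedupTable:
    def __init__(self, flush_threshold: int = 1024):
        self.table = {}              # checksum -> reference count
        self.log = []                # pending (checksum, delta) updates
        self.flush_threshold = flush_threshold

    def record_write(self, checksum: bytes) -> None:
        # Instead of updating the table in place (small random I/O on disk),
        # append the change to a sequential log.
        self.log.append((checksum, +1))
        if len(self.log) >= self.flush_threshold:
            self.flush()

    def record_free(self, checksum: bytes) -> None:
        self.log.append((checksum, -1))
        if len(self.log) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Fold the accumulated log into the table in one batch, so each
        # entry is rewritten at most once per flush rather than per block.
        for checksum, delta in self.log:
            count = self.table.get(checksum, 0) + delta
            if count <= 0:
                self.table.pop(checksum, None)  # drop entries no longer referenced
            else:
                self.table[checksum] = count
        self.log.clear()
```

Batching the changes this way is where the write-amplification savings come from: the cost of rewriting a table region is paid once per flush instead of once per written block.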

iXsystems is looking for sponsors and collaborators who could benefit from this new Fast Dedup capability and leverage it to make new storage use-cases possible or to improve the performance of existing ones. If this sounds like your organization, click here to share your interest. We’ll then send you a sponsorship information package and answer any questions you might have.

Sponsorship contributions of between $5,000 and $30,000 are sought from commercial users of the free OpenZFS or the free TrueNAS software. There is no expectation that home or small users will contribute. TrueNAS Enterprise users have already effectively contributed via their business relationship with iX.

With funding, we aim to complete development this year. Once completed, Fast Dedup will be merged into OpenZFS and introduced as part of a TrueNAS update in 2024.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
How does this align with the work that's being done with Block Cloning:


Block Cloning only works when files/blocks are deliberately copied within a pool. This might happen with a copy command via VAAI or SMB. It's an easier process to make efficient.

Fast dedup will work if two clients store the same or similar files... no copy process involved.

The lead engineer on Block Cloning is also involved in the Fast Dedup design.
 

Chin-Fah HEOH

Dabbler
Joined
Dec 14, 2016
Messages
22
The dedup thing has been a bugbear for me whilst engaging in the field, and I have been interested in how TrueNAS can enhance DFR (data footprint reduction) without the performance pains. We do have a few TrueNAS customers in Asia who have gotten better capacity reduction ratios using compression than with the dedupe of other brands like NetApp.

I have also followed Pawel's Block Cloning for two OpenZFS summits now, and it would be interesting to see how his work gets pushed downstream.

Either way, whether it is dedup, compression (whichever algorithm), Block Cloning, or even some compaction tech for OpenZFS, I want something we can crow about rather than being pinned in the corner by TrueNAS competitors all the time. I am following this thread.
 
Joined
Oct 22, 2019
Messages
3,641
Block Cloning only works when files/blocks are deliberately copied within a pool. This might happen with a copy command like VAAI, SMB. Its an easier process to make efficient.
But even still, this will have to be explicitly done by the user / application? For example, using SMB or "cp" will still make a copy of the file in the traditional method? You would have to explicitly invoke something like "--reflink=always" to leverage block cloning, yes?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
But even still, this will have to be explicitly done by the user / application? For example, using SMB or "cp" will still make a copy of the file in the traditional method? You would have to explicitly invoke something like "--reflink=always" to leverage block cloning, yes?

Yes, the higher-level software (e.g. Samba) has to use the copying APIs... so there is plumbing to do beyond OpenZFS.

The general dedupe doesn't need any plumbing, but it is not as efficient from a compute/metadata perspective.
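
For example, an application that wants block cloning to kick in has to go through a clone-aware copy path rather than a plain read/write loop. A minimal sketch on Linux with Python 3.8+ follows (the paths are hypothetical, and whether the kernel actually clones depends on the filesystem and the OpenZFS version and settings):

```python
# Minimal sketch: copying a file through os.copy_file_range (Linux, Python 3.8+).
# On a filesystem/kernel combination that supports it (for example OpenZFS 2.2+
# with block cloning enabled), the kernel may satisfy this call by cloning
# blocks instead of physically copying data; a userspace read()/write() loop
# never can. Paths below are hypothetical.
import os

def clone_aware_copy(src_path: str, dst_path: str) -> None:
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        offset = 0
        while remaining > 0:
            copied = os.copy_file_range(src.fileno(), dst.fileno(),
                                        remaining, offset, offset)
            if copied == 0:
                break
            offset += copied
            remaining -= copied

clone_aware_copy("/mnt/tank/data/big.iso", "/mnt/tank/data/big-copy.iso")
```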
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
The deduplication table (DDT)'s modest random I/O patterns might be easily handled by a few high-performance SSDs with a dedup vdev in place as opposed to competing with data writes to the pool HDDs.

That logic works if the DDT fits within RAM... not when it overflows.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The deduplication table (DDT)'s modest random I/O patterns might be easily handled by a few high-performance SSDs with a dedup vdev in place as opposed to competing with data writes to the pool HDDs.
While dedicated dedup vdevs with high-performance SSDs do help, it's important to quantify a bit what constitutes "high-performance" in this case, as DDT I/O patterns can be very far from "modest" during deletes that cause large table updates:

[Attached image: I/O throughput graph from a dedup vdev during a large DDT update]


Note that this is roughly 1 GB/s of throughput at 4 KB I/O sizes, which is not something that's easy to do.
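
To put that figure in perspective, the quick arithmetic (taking 1 GB as 10^9 bytes) is:

```python
# Quick conversion of ~1 GB/s at 4 KiB I/O size into an IOPS figure.
throughput = 1 * 10**9        # bytes per second observed during the update
io_size = 4 * 1024            # bytes per operation
print(f"{throughput / io_size:,.0f} IOPS")  # ~244,000 IOPS
```

That is on the order of a quarter of a million random 4 KiB operations per second sustained against the dedup vdev.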

Fast Dedup will significantly help this through the log-based updates, as well as through overall table size reduction and management.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
This development project is now complete. I will lock the thread.

The next phase is integration with OpenZFS 2.3.

 