Fast ZFS Dedup - Sponsorship Request

Status
Not open for further replies.

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
It’s no secret that ZFS and deduplication have had performance issues in the past. Use cases that demand high read performance have aligned well with deduplication, given that compressed and deduplicated data is stored only once in the ZFS read cache. Sustained write performance, however, has presented challenges: the overhead of managing dedup metadata has restricted performance and limited pool sizes. iX has started a new “Fast Dedup” project to address both the performance and scalability limits.

Previous Dedup Work

Special vdevs in TrueNAS 12 added a dedicated group of SSDs to hold the deduplication table, increasing performance and expanding acceptable use-cases. With a dedup vdev in place, the small random I/O patterns of the deduplication table (DDT) could be quickly handled by a few high-performance SSDs rather than competing with data writes to the pool HDDs. Our own TrueNAS Community users conducted research and shared detailed results of configurations that worked well with their workloads. While dedup write performance was improved, several key limitations remained, such as the inherent write amplification of updating the DDT. Even with a dedup vdev, if the DDT does not fit entirely in RAM, ZFS pool performance is still much lower.
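
To put rough numbers on why DDT size matters, here is a back-of-envelope estimate in Python. The ~320 bytes per in-core entry is a commonly cited rule-of-thumb figure rather than an exact OpenZFS constant, and the helper name is invented for illustration:

```python
# Rough back-of-envelope estimate of classic (pre-Fast-Dedup) DDT RAM needs.
# Assumes ~320 bytes of RAM per unique block tracked, a commonly cited
# rule-of-thumb figure rather than an exact OpenZFS constant.

BYTES_PER_DDT_ENTRY = 320  # assumed in-core size per unique block

def ddt_ram_gib(deduped_data_tib: float, avg_block_kib: float) -> float:
    """Estimate the RAM (GiB) needed to hold the whole DDT in memory."""
    unique_blocks = (deduped_data_tib * 2**40) / (avg_block_kib * 2**10)
    return unique_blocks * BYTES_PER_DDT_ENTRY / 2**30

# 10 TiB of deduped data at a 64 KiB average block size -> ~50 GiB of DDT
print(f"{ddt_ram_gib(10, 64):.0f} GiB")
# The same 10 TiB at 16 KiB blocks -> ~200 GiB, far beyond typical RAM
print(f"{ddt_ram_gib(10, 16):.0f} GiB")
```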

Proposals for improving deduplication have been written previously, including Matt Ahrens’ Dedup Performance paper presented at the OpenZFS Developer Summit. Some of the ideas in that proposal are being used in the new project.

The Fast Dedup Project

iXsystems and Klara Systems have started a new Fast Dedup project in conjunction with the OpenZFS community. The work is underway and will require significant resources to complete.

Fast Dedup will include a rearchitecting of the DDT to improve performance. The new Fast Dedup Tables (FDT) have several significant new properties:
  • Dynamically sized to fit into RAM,
  • Broken into smaller, more manageable chunks for efficient updating,
  • Automatically pruned of non-dedupable data for better RAM efficiency,
  • Updated through a log-based write process that massively reduces write amplification (a conceptual sketch follows below).

All of these will result in a significant increase in performance when compared to the existing dedup experience.
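
As a purely conceptual sketch of the log-based idea (not the actual OpenZFS code; the class and method names below are invented for illustration), the pattern is to append DDT changes to a sequential log and fold them into the table in batches, trading many small random writes for occasional larger, batched ones:

```python
# Conceptual model only: a dedup table that batches updates through a log.
# The real Fast Dedup code in OpenZFS is far more involved; the names here
# are invented for illustration.

class LoggedDedupTable:
    def __init__(self, flush_threshold: int = 1024):
        self.table = {}              # checksum -> reference count
        self.log = []                # pending (checksum, delta) updates
        self.flush_threshold = flush_threshold

    def record_write(self, checksum: bytes) -> None:
        # Instead of updating the table in place (small random I/O on disk),
        # append the change to a sequential log.
        self.log.append((checksum, +1))
        if len(self.log) >= self.flush_threshold:
            self.flush()

    def record_free(self, checksum: bytes) -> None:
        self.log.append((checksum, -1))
        if len(self.log) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Fold the accumulated log into the table in one batch, so each
        # entry is rewritten at most once per flush rather than per block.
        for checksum, delta in self.log:
            count = self.table.get(checksum, 0) + delta
            if count <= 0:
                self.table.pop(checksum, None)  # drop entries no longer referenced
            else:
                self.table[checksum] = count
        self.log.clear()
```

Batching the changes this way is where the write-amplification savings come from: the cost of rewriting a table region is paid once per flush instead of once per written block.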

iXsystems is looking for sponsors and collaborators who could benefit from this new Fast Dedup capability and leverage it to make new storage use-cases possible or to improve the performance of existing ones. If this sounds like your organization, click here to share your interest. We’ll then send you a sponsorship information package and answer any questions you might have.

Sponsorship contributions of between $5,000 and $30,000 are sought from commercial users of the free OpenZFS or the free TrueNAS software. There is no expectation that home or small users will contribute. TrueNAS Enterprise users have already effectively contributed via their business relationship with iX.

With funding, we aim to complete development this year. Once completed, Fast Dedup will be merged into OpenZFS and introduced as part of a TrueNAS update in 2024.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
How does this align with the work that's being done with Block Cloning:


Block Cloning only works when files/blocks are deliberately copied within a pool. This might happen with a copy command via VAAI or SMB. It's an easier process to make efficient.

Fast dedup will work if two clients store the same or similar files... no copy process involved.

The lead engineer on Block Cloning is also involved in the Fast Dedup design.
 

Chin-Fah HEOH

Dabbler
Joined
Dec 14, 2016
Messages
22
The dedup thing has been a bugbear for me whilst engaging in the field, and I have been interested in how TrueNAS can enhance DFR (data footprint reduction) without the performance pains. We do have a few TrueNAS customers in Asia who have gotten better capacity reduction ratios using compression than with the dedupe of other brands like NetApp.

I have also followed Pawel's Block Cloning for two OpenZFS summits now, and it would be interesting to see how his work gets pushed downstream.

Either way, whether it is dedup, compression (whichever algorithm), Block Cloning, or even some compaction tech for OpenZFS, I want something we can crow about rather than being pinned in the corner by TrueNAS competitors all the time. I am following this thread.
 
Joined
Oct 22, 2019
Messages
3,641
Block Cloning only works when files/blocks are deliberately copied within a pool. This might happen with a copy command like VAAI, SMB. Its an easier process to make efficient.
But even still, this will have to be explicitly done by the user / application? For example, using SMB or "cp" will still make a copy of the file in the traditional method? You would have to explicitly invoke something like "--reflink=always" to leverage block cloning, yes?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
But even still, this will have to be explicitly done by the user / application? For example, using SMB or "cp" will still make a copy of the file in the traditional method? You would have to explicitly invoke something like "--reflink=always" to leverage block cloning, yes?

Yes, the higher-level software (e.g. Samba) has to use the copying APIs... so there is plumbing to do beyond OpenZFS.

The general dedupe doesn't need any plumbing, but it is not as efficient from a compute/metadata perspective.
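
For example, an application that wants block cloning to kick in has to go through a clone-aware copy path rather than a plain read/write loop. A minimal sketch on Linux with Python 3.8+ follows (the paths are hypothetical, and whether the kernel actually clones depends on the filesystem and the OpenZFS version and settings):

```python
# Minimal sketch: copying a file through os.copy_file_range (Linux, Python 3.8+).
# On a filesystem/kernel combination that supports it (for example OpenZFS 2.2+
# with block cloning enabled), the kernel may satisfy this call by cloning
# blocks instead of physically copying data; a userspace read()/write() loop
# never can. Paths below are hypothetical.
import os

def clone_aware_copy(src_path: str, dst_path: str) -> None:
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        offset = 0
        while remaining > 0:
            copied = os.copy_file_range(src.fileno(), dst.fileno(),
                                        remaining, offset, offset)
            if copied == 0:
                break
            offset += copied
            remaining -= copied

clone_aware_copy("/mnt/tank/data/big.iso", "/mnt/tank/data/big-copy.iso")
```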
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
The deduplication table (DDT)'s modest random I/O patterns might be easily handled by a few high-performance SSDs with a dedup vdev in place as opposed to competing with data writes to the pool HDDs.

That logic works if the DDT fits within RAM... not when it overflows.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The deduplication table (DDT)'s modest random I/O patterns might be easily handled by a few high-performance SSDs with a dedup vdev in place as opposed to competing with data writes to the pool HDDs.
While dedicated dedup vdevs with high-performance SSDs do help, it's important to quantify a bit what constitutes "high-performance" in this case, as DDT I/O patterns can be very far from "modest" during deletes that cause large table updates:

[Attached image: I/O throughput graph from a dedup vdev during a large DDT update]


Note that this is roughly 1 GB/s of throughput at 4 KB I/O sizes, which is not something that's easy to do.
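
To put that figure in perspective, the quick arithmetic (taking 1 GB as 10^9 bytes) is:

```python
# Quick conversion of ~1 GB/s at 4 KiB I/O size into an IOPS figure.
throughput = 1 * 10**9        # bytes per second observed during the update
io_size = 4 * 1024            # bytes per operation
print(f"{throughput / io_size:,.0f} IOPS")  # ~244,000 IOPS
```

That is on the order of a quarter of a million random 4 KiB operations per second sustained against the dedup vdev.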

Fast Dedup will significantly help this through the log-based updates, as well as through overall table size reduction and management.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
This development project is now complete. I will lock the thread.

The next phase is integration with OpenZFS 2.3.

 