Trying to make sense of odd dedupe performance

Mitch_DMZ

Cadet
Joined
May 4, 2022
Messages
2
I've been deep-diving into TrueNAS and ZFS recently, and I'm seeing some perplexing behavior specifically with dedupe.

With dedupe disabled, performance easily saturates the 1GbE connection in read and write for a 50GB transfer; processor flutters up to 10-20%.
-Deleted and rebuilt the dataset with dedupe enabled-
Performance is good for much of the transfer, but intermittently drops to almost nothing for 30+ seconds on write while the hard drives thrash wildly. Memory and processor load appear unchanged.
-Added a cheap 16GB NGFF drive on USB as a dedicated dedup table (DDT) vdev-
Performance is back up to non-dedupe levels with no more apparent spikes and hitches, and the hard drives are much happier.
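
(In case it helps anyone reproduce this, the DDT can be inspected with the commands below - "tank" is just a stand-in for the pool name.)

# dedup ratio and table summary
zdb -D tank

# more detailed dedup statistics, including entry counts and sizes
zdb -DD tank

# DDT histogram alongside normal pool status
zpool status -D tank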

I'm under the impression that the dedupe tables should be stored in RAM when no dedicated vdev is available; however, experience indicates that this is not the case.
I feel like there is some part of the dedupe implementation that I'm fundamentally misunderstanding.

Any insight would be appreciated.

Thanks!

Tangentially: I'm hoping to scavenge a couple of 10GbE NICs for further performance testing. 1GbE is quite obviously my main bottleneck at the moment.

Hardware:
Dell R620, E5-2630 (v1), 32GB memory, 3x mirrored vdevs of 2x 10K RPM SAS drives (6 total), zstd compression. +/- one cheap Kingston RBU-SNS4151S3/16G on a USB adapter.

Disclaimer: This is a purely test system. I slapped some old hardware together specifically for experimenting.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Afternoon Mitch,

Short answer is "yes, you're fundamentally misunderstanding some things about ZFS dedup" - while the tables do live in RAM, they also need a permanent home, and that home is often the bottleneck on writes since the table has to be updated for each new record. Without a special or dedup vdev, that permanent home is the pool vdevs - in your case, your SAS drives. They're fast spinning disk, but still spinning disk - and DDT operations are almost exclusively 4K random I/O.
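
If you do want to give the DDT a faster permanent home on a real build, the mechanism is a dedup allocation-class vdev. A minimal sketch, assuming a pool named "tank" and spare SSDs at /dev/sdX and /dev/sdY:

# single device - fine for a throwaway test pool
# NOTE: unlike a cache (L2ARC) device, a dedup vdev is pool-critical;
# if it dies, the pool dies with it
zpool add tank dedup /dev/sdX

# mirrored variant - the sane choice for anything you care about
zpool add tank dedup mirror /dev/sdX /dev/sdY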

I'd recommend this excellent resource summary from @Stilez on deduplication performance for more details.


I'd also like to give a thumbs-up for this line:

Disclaimer: This is a purely test system. I slapped some old hardware together specifically for experimenting.

Absolutely the right approach when learning.
 

Mitch_DMZ

Cadet
Joined
May 4, 2022
Messages
2
Ahoy,
In ZFS, metadata writes themselves generate further writes. Changing a block changes its checksum, which has to be stored in a block, which changes *that* block's checksum, and alters free space, which alters the spacemap data, which alters *its* checksums, which have to be stored, which alters other metadata checksums.
...
When a pool uses dedup, every single block read requires lookup reads in the dedup table. Every single block written requires both dedup table reads beforehand (when reading what's on disk already), and then also dedup table updates at writeout.

This is precisely what I was missing. Thanks much, @HoneyBadger! And a big thanks to @Stilez for the write-up on their own experiments with deduplication.
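
Doing the back-of-the-envelope math for my test (assuming the commonly cited ~320 bytes of in-core DDT per unique block, and the default 128 KiB recordsize):

50 GiB / 128 KiB per record  ≈ 409,600 unique records
409,600 records x ~320 bytes ≈ 125 MiB of DDT

So the table itself fits in my 32GB of RAM many times over - the pain was never ARC capacity, it was the 4K random I/O needed to keep the on-disk copy of the table in sync.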
 