If I have a metadata vdev, do I need a dedup vdev?

SwisherSweet

Contributor
Joined
May 13, 2017
Messages
139
I'm running a simple backup server with RAID-Z2 across 6 SATA drives. I also use de-duplication on my pool, as I store a good amount of duplicate data. I want to add a metadata vdev, but I read somewhere that metadata vdevs also store the de-duplication tables. If this is true, is there really a need to add a separate dedup vdev, and if so, why?

Again, this is just a simple backup server for a home office: mostly Time Machine backups and a dumping ground for less important data while I sort through it.

Thank you.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
No, you don't need a separate de-dup vDev if you have a Metadata vDev. In theory, you can select what to store in a ZFS Special vDev. Like Metadata, small files, or de-dup.

One word of caution. It is highly recommended to have the same level of redundancy in any special vDev as in the data vDevs. In your case, using RAID-Z2 you would need 3 physical storage devices for your special vDev (for example, as a 3-way mirror). That way, you can lose any 2 devices in either vDev and not lose data. Loss of a special vDev will generally mean loss of the entire pool, requiring a complete rebuild and full restore from backups.
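
To make the redundancy rule above concrete, here is a minimal Python sketch that compares how many device failures each vdev can survive. The layout names and both helper functions are made up for illustration; they are not part of any ZFS tool or API.

```python
# Sketch only: models failure tolerance per vdev layout.
# failures_tolerated() and special_matches_data() are hypothetical helpers.

def failures_tolerated(layout: str, devices: int) -> int:
    """Return how many devices a single vdev can lose without data loss."""
    parity = {"raidz1": 1, "raidz2": 2, "raidz3": 3}
    if layout in parity:
        if devices <= parity[layout]:
            raise ValueError("not enough devices for this RAID-Z level")
        return parity[layout]
    if layout == "mirror":
        if devices < 2:
            raise ValueError("a mirror needs at least 2 devices")
        return devices - 1  # an n-way mirror survives n-1 failures
    if layout == "single":
        return 0
    raise ValueError(f"unknown layout: {layout}")


def special_matches_data(data_vdev, special_vdev) -> bool:
    """True if the special vdev tolerates at least as many failures as the data vdev."""
    return failures_tolerated(*special_vdev) >= failures_tolerated(*data_vdev)


data = ("raidz2", 6)  # the 6-drive RAID-Z2 described in this thread
print(special_matches_data(data, ("mirror", 2)))  # False
print(special_matches_data(data, ("mirror", 3)))  # True
```

A 2-way mirror survives only one failure, so it would be the weakest link next to a RAID-Z2 data vdev; a 3-way mirror matches it.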

A cheaper option is to use an L2ARC device for de-dup. Since it holds only a copy of the de-dup table from the data pool, loss of the L2ARC device is not fatal; it just slows things down. There are some caveats; for example, persistent L2ARC can help keep the speed after a reboot.
 

SwisherSweet

Contributor
Joined
May 13, 2017
Messages
139
No, you don't need a separate de-dup vDev if you have a Metadata vDev.
Thank you for confirming this.

In theory, you can select what to store in a ZFS Special vDev. Like Metadata, small files, or de-dup.
If I just create a "Metadata" vdev for my pool in the TrueNAS Core GUI, what will my special vdev store?

you would need 3 physical storage devices for your special vDev
Noted, thank you.

A cheaper option is to use an L2ARC device for de-dup. Since it holds only a copy of the de-dup table from the data pool, loss of the L2ARC device is not fatal; it just slows things down. There are some caveats; for example, persistent L2ARC can help keep the speed after a reboot.
I also plan on using an NVMe drive as an L2ARC for my pool, in addition to the metadata vdev. Again, since I'm using the GUI to create the Cache (L2ARC) vdev, what is stored in the cache vdev by default (such as de-dup tables)?

Thanks again!
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
I'm not convinced that what has been said here is true... I think without a dedup VDEV, you don't get dedup data guaranteed to be written to the special VDEV you already have. See here: https://www.truenas.com/community/t...store-deduplication-tables.109528/post-756688 and the post immediately following it. Although to be fair, the posts after that all seem to be supporting that it does go to the metadata VDEV... I remain a skeptic.

If I just create a "Metadata" vdev for my pool in the TrueNAS Core GUI, what will my special vdev store?
Nothing but metadata (until you set the Metadata (Special) Small Block Size of one or more of your datasets):
This value represents the threshold block size for including small file blocks into the special allocation class. Blocks smaller than or equal to this value will be assigned to the special allocation class while greater blocks will be assigned to the regular class. Valid values are zero or a power of two from 512B up to 1M. The default size is 0 which means no small file blocks will be allocated in the special class. Before setting this property, a special class vdev must be added to the pool.
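
As a rough illustration of the property description quoted above, the following Python sketch checks whether a value is acceptable and whether a file block of a given size would land in the special class. Both function names are hypothetical, and the rules are taken only from the quoted text (zero, or a power of two from 512B to 1M; blocks smaller than or equal to the threshold qualify). It models file data blocks only; metadata goes to the special vdev regardless of this setting.

```python
# Sketch only: mirrors the quoted description of special_small_blocks.
# Neither function is part of ZFS; they just restate the documented rule.

MIN_BLOCK = 512            # 512B, smallest non-zero value per the quoted text
MAX_BLOCK = 1024 * 1024    # 1M, largest value per the quoted text


def is_valid_special_small_blocks(value: int) -> bool:
    """Zero, or a power of two between 512B and 1M inclusive."""
    if value == 0:
        return True
    is_power_of_two = value > 0 and (value & (value - 1)) == 0
    return is_power_of_two and MIN_BLOCK <= value <= MAX_BLOCK


def goes_to_special(block_size: int, threshold: int) -> bool:
    """A file block qualifies when the threshold is non-zero and the block is <= it."""
    return threshold > 0 and block_size <= threshold


print(is_valid_special_small_blocks(32768))   # True
print(is_valid_special_small_blocks(3000))    # False
print(goes_to_special(16384, 32768))          # True
print(goes_to_special(131072, 32768))         # False
print(goes_to_special(4096, 0))               # False (default: opt-in disabled)
```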

Also:
Nothing (until you add or re-add content to your pool), since there's no automated process to retrofit the VDEV into the filesystem. The metadata on the metadata VDEV will only represent the metadata of files written after you add it to the pool.

Perhaps helpful to that end, a method to manually do it: https://github.com/markusressel/zfs-inplace-rebalancing

what is stored in cache vdev by default (such as de-dup tables)?
You can check that with zfs get secondarycache (that's a per-dataset setting, so all datasets will be shown). The default is "all", which means both metadata and data are cached for reading in L2ARC where the system allows it.
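
For anyone who would rather script that check, here is a small Python sketch that collects the secondarycache value per dataset. It assumes the zfs command is on the PATH and uses its scripted output mode (-H, tab-separated); the dataset names in the sample are invented, and the helper names are hypothetical.

```python
# Sketch only: wraps `zfs get secondarycache` and parses its scripted output.
import subprocess


def parse_secondarycache(output: str) -> dict:
    """Parse tab-separated 'name<TAB>value' lines into a dict."""
    settings = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, value = line.split("\t", 1)
        settings[name] = value
    return settings


def read_secondarycache() -> dict:
    """Run zfs on the local system (requires ZFS to be installed)."""
    result = subprocess.run(
        ["zfs", "get", "-H", "-t", "filesystem,volume",
         "-o", "name,value", "secondarycache"],
        capture_output=True, text=True, check=True,
    )
    return parse_secondarycache(result.stdout)


# Invented sample output, so the parser can be tried without a pool.
sample = "tank\tall\ntank/backups\tall\ntank/scratch\tmetadata\n"
for dataset, value in parse_secondarycache(sample).items():
    print(dataset, value)
```

On a real system, calling read_secondarycache() instead of using the sample string should list every filesystem and volume with its current setting (all, none, or metadata).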

Generally, I recommend not using L2ARC until you've verified with arc_summary that you're dropping stuff out of your ARC (which should already be 64GB or more before we even discuss it).
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
This is what the ZFS manual page for zpoolconcepts has to say about "Special Allocation Class" vDevs. Note the specific reference, multiple times, to deduplication tables. All that said, I don't know what the TrueNAS GUI will do... Never used the Special Allocation Devices before.

Special Allocation Class
Allocations in the special class are dedicated to specific block types. By default this includes all metadata, the indirect blocks of user data, and any deduplication tables. The class can also be provisioned to accept small file blocks.

A pool must always have at least one normal (non-dedup/-special) vdev before other devices can be assigned to the special class. If the special class becomes full, then allocations intended for it will spill back into the normal class.

Deduplication tables can be excluded from the special class by unsetting the zfs_ddt_data_is_special ZFS module parameter.

Inclusion of small file blocks in the special class is opt-in. Each dataset can control the size of small file blocks allowed in the special class by setting the special_small_blocks property to nonzero. See zfsprops(7) for more info on this property.
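
Putting the manual page excerpt into code form may help with the original question. The Python sketch below is a simplified model of which allocation class a block would prefer, not the actual ZFS implementation. The function name, the block type labels, and the special_full flag are invented for illustration, and the rule that a dedicated dedup vdev takes the tables when one exists is an assumption that goes beyond the excerpt.

```python
# Sketch only: simplified model of allocation class selection.
# preferred_class() and its arguments are hypothetical, not a ZFS API.

def preferred_class(block_type, *, has_special, has_dedup=False,
                    ddt_data_is_special=True, block_size=0,
                    special_small_blocks=0, special_full=False):
    """Return 'dedup', 'special', or 'normal' for a block of the given type."""
    if block_type == "ddt":
        if has_dedup:
            # Assumption: a dedicated dedup vdev takes the tables if present.
            return "dedup"
        wants_special = has_special and ddt_data_is_special
    elif block_type in ("metadata", "indirect"):
        wants_special = has_special
    elif block_type == "data":
        # Small file blocks are opt-in via special_small_blocks.
        wants_special = (has_special and special_small_blocks > 0
                         and block_size <= special_small_blocks)
    else:
        raise ValueError(f"unknown block type: {block_type}")

    # A full special class spills back into the normal class.
    if wants_special and not special_full:
        return "special"
    return "normal"


print(preferred_class("ddt", has_special=True))                             # special
print(preferred_class("ddt", has_special=True, ddt_data_is_special=False))  # normal
print(preferred_class("ddt", has_special=True, has_dedup=True))             # dedup
print(preferred_class("data", has_special=True, block_size=131072))         # normal
print(preferred_class("metadata", has_special=True, special_full=True))     # normal
```

Under this model, a pool with a special vdev and the default parameter value sends deduplication tables to the special vdev, which is why a separate dedup vdev would not be required.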
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
OK, you sold me on it with this:
A pool must always have at least one normal (non-dedup/-special) vdev before other devices can be assigned to the special class. If the special class becomes full, then allocations intended for it will spill back into the normal class.
Deduplication tables can be excluded from the special class by unsetting the zfs_ddt_data_is_special ZFS module parameter.

and a quick check from arc_summary shows me:
ddt_data_is_special 1

That's on TrueNAS CORE 13-U4, but I see no reason to think it would be otherwise on other versions (I already checked Bluefin and it's the same).
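
The same parameter can probably be read without arc_summary. The Python sketch below assumes a Linux-based system (such as SCALE), where ZFS module parameters are normally exposed under /sys/module/zfs/parameters; on FreeBSD-based CORE the equivalent is likely the sysctl vfs.zfs.ddt_data_is_special, which is an assumption and is not covered by this code. The helper names are hypothetical.

```python
# Sketch only: reads the module parameter from Linux sysfs if it exists.
from pathlib import Path

# Assumed location on Linux; not present on FreeBSD-based systems.
LINUX_PARAM = Path("/sys/module/zfs/parameters/zfs_ddt_data_is_special")


def parse_flag(text: str) -> bool:
    """Interpret the parameter file content: '0' means unset, anything else set."""
    return text.strip() != "0"


def ddt_data_is_special(path: Path = LINUX_PARAM):
    """Return True/False for the parameter, or None if it cannot be read."""
    try:
        return parse_flag(path.read_text())
    except OSError:
        return None


print(ddt_data_is_special())
```

A result of None only means the file could not be read on that machine; it says nothing about how the pool is configured.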
 