Metadata vdev layout for RAIDZ2

Shigure

Dabbler
Joined
Sep 1, 2022
Messages
39
I have read a lot about metadata vdevs recently, so I have the general idea. As I'm still pretty new to TrueNAS and ZFS, it probably makes more sense to ask before I do it.

I planned a 5-wide RAIDZ2 data pool with 16TB HDDs, and want to add a metadata vdev to it. I have 4x 1TB SATA SSDs lying around, so here is what I'm thinking:

The two options are a 4-wide RAIDZ2 as the metadata vdev, or a 3-way mirror with one hot spare as the metadata vdev.
The 3-way mirror should be slightly faster than RAIDZ2 for metadata because those are all small files (I assume). Both allow 2 drives to fail, but the 3-way mirror comes with a hot spare in addition. And 1TB of metadata space should be enough for a 5-wide RAIDZ2 of 16TB HDDs (about 2.5% of the storage, taking into account ZFS's suggestion to keep 20% of capacity free).
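
For anyone checking the numbers, here is a rough sketch of that arithmetic in Python. It assumes decimal TB throughout and ignores RAIDZ padding, metadata overhead, and the TB/TiB difference; the function name is only for illustration.

```python
def raidz_usable_tb(width, parity, disk_tb):
    """Naive usable capacity of one RAIDZ vdev: data disks times disk size."""
    return (width - parity) * disk_tb

# Data pool from the post: 5-wide RAIDZ2 of 16TB HDDs.
data_tb = raidz_usable_tb(width=5, parity=2, disk_tb=16)

# Keep roughly 20% free, as the post mentions.
working_tb = data_tb * 0.80

# The two layouts for the 4x 1TB SSDs.
raidz2_special_tb = raidz_usable_tb(width=4, parity=2, disk_tb=1)
mirror_special_tb = 1  # a 3-way mirror exposes one disk's worth of space

special_share = mirror_special_tb / working_tb

print(f"data pool, naive usable: {data_tb} TB")
print(f"data pool at 80% fill:   {working_tb:.1f} TB")
print(f"special vdev, RAIDZ2:    {raidz2_special_tb} TB")
print(f"special vdev, mirror:    {mirror_special_tb} TB")
print(f"mirror / working set:    {special_share:.1%}")
```

So the "about 2.5%" figure holds up for the mirror layout, under these simplifications.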

So I should go for 3-way mirror, is that correct?

(Still waiting for a bifurcation card and an extension cable to arrive so don't really have the ability to test...)
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
There is a difference between using a special vDev for metadata, (aka file directory information, access control lists, etc), and small files. If I remember correctly, either or both are possible, but it would depend on your application which would be better.

What is your application?
Please describe the rest of your hardware.

Another option is to use 3 x 1TB SATA SSDs for small files, and use the last 1TB SATA SSD as an L2ARC for metadata only. This depends on the amount of memory you have, because an L2ARC requires some memory for the reference table of the L2ARC entries. And of course, the application matters. Even if you did use one 1TB SATA SSD for L2ARC, in an emergency you can detach it at any time to replace a failing 1TB SATA SSD in the special vDev.
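
To put a rough number on the memory point, here is a back-of-the-envelope sketch in Python. The per-entry header size is an assumption (it varies between OpenZFS versions, so `HEADER_BYTES` is a placeholder and not a documented constant), and the 1TB SSD is rounded to 1024 GiB to keep the arithmetic simple.

```python
def l2arc_header_ram_gib(l2arc_gib, avg_block_kib, header_bytes):
    """RAM (GiB) used by the in-memory entries that index a full L2ARC."""
    cached_blocks = l2arc_gib * 1024 * 1024 / avg_block_kib
    return cached_blocks * header_bytes / 1024**3

HEADER_BYTES = 96   # ASSUMPTION: placeholder per-entry size, check your version
L2ARC_GIB = 1024    # ASSUMPTION: one 1TB SSD, rounded for easy math

# Smaller cached blocks mean more entries, so more RAM spent on the index.
for avg_block_kib in (128, 16, 4):
    ram = l2arc_header_ram_gib(L2ARC_GIB, avg_block_kib, HEADER_BYTES)
    print(f"{avg_block_kib:>3} KiB blocks -> {ram:.2f} GiB of RAM for headers")
```

The takeaway is only the shape of the relationship: the smaller the average cached block, the more RAM a full L2ARC of a given size costs.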
 

Shigure

Dabbler
Joined
Sep 1, 2022
Messages
39
Thanks. The rest of my hardware: Ryzen 5750G, 64GB ECC memory (already maxed out), LSI9208i, and a 10G NIC connected to my switch. I have 2x 2TB NVMe SSDs planned for an application pool for things like Nextcloud, Transmission, and other jails.

The NAS will mainly be used as a media server for anime, music, comics, and photos (those are relatively small and I have a good number of them). I also plan to set up Syncthing on it to back up some important stuff from other devices.
I'm not sure L2ARC will really help with only 64GB of RAM for a 48TB array, which is why I didn't consider it at first. The people I saw using L2ARC usually have 128GB of RAM or more. But yeah, I wasn't aware that I could detach it to replace a failing drive in another array.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
64GB is considered to be the minimum required to use L2ARC and have a positive effect on the system, though the advice is to use it once you max out your RAM (if it's more than 64GB).

For special vdevs you should always go with mirrors, and a 3-way mirror is enough to match your RAIDZ2 redundancy.

Suggested reading:
A special VDEV can store metadata such as file locations and allocation tables. The allocations in the special class are dedicated to specific block types. By default, this includes all metadata, the indirect blocks of user data, and any deduplication tables. The class can also be provisioned to accept small file blocks. This is a great use case for high performance but smaller sized solid-state storage. Using a special vdev drastically speeds up random I/O and cuts the average spinning-disk I/Os needed to find and access a file by up to half.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
In this case, I was suggesting using the L2ARC for metadata only, not data. So 64GB is probably good enough. But you can always test before and after.
 

Shigure

Dabbler
Joined
Sep 1, 2022
Messages
39
Thanks. As my setup has already maxed out RAM at 64GB, L2ARC is an option but might not really be beneficial, if I'm understanding the old posts I read correctly. If I already have a metadata vdev for my data vdev, would L2ARC still be worth it?

Yeah, I think I do need to do some testing. Based on what I read in other posts, L2ARC shouldn't be too large, otherwise it will put too much pressure on ARC. Is that right? Since I only have 64GB of RAM, maybe I need to use a smaller drive or a smaller partition for L2ARC if I really want to add it to my setup?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
L2ARC could (in theory) still be useful since it can also hold normal data and not only metadata.
The main difference between a metadata-only L2ARC and a special vdev is that the former is volatile* while the latter isn't (and losing it kills your pool).

And yes, exactly: an L2ARC shouldn't be too large relative to ARC, though I don't remember the exact ratio... maybe it was 1:6 or 1:8 (ARC:L2ARC).
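
As a quick illustration of what those half-remembered ratios would imply, here is a small Python sketch. The ARC size is an assumption (ARC is smaller than installed RAM, and 48 GiB is just a hypothetical figure for a 64GB box), and the ratios are the two guesses above, not confirmed guidance.

```python
def l2arc_ceiling_gib(arc_gib, l2arc_per_arc):
    """Largest L2ARC (GiB) that stays within a given ARC:L2ARC ratio."""
    return arc_gib * l2arc_per_arc

ASSUMED_ARC_GIB = 48  # ASSUMPTION: hypothetical ARC size on a 64GB system

for ratio in (6, 8):  # the 1:6 and 1:8 guesses from the post
    ceiling = l2arc_ceiling_gib(ASSUMED_ARC_GIB, ratio)
    print(f"1:{ratio} -> up to about {ceiling} GiB of L2ARC")
```

Under those assumptions a full 1TB cache device would be well past either ratio, which is why a smaller drive or partition comes up.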

*Although it can be set to be persistent.
Read more about L2ARC in the following reference.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I don't expect that this workload will benefit from a "regular" data L2ARC, as it's either sequential reads ("anime" is assumed to be video files here) or something that isn't latency-sensitive. A metadata-only L2ARC is an option, but are you actually seeing downward pressure on ARC that's causing the metadata to be evicted?

I guess my question here is "what performance challenge are you looking to solve?"

As your main vdev is a RAIDZ2, bear in mind that adding any of the "special" type vdevs (meta, small file, etc) will be a permanent addition. Only LOG and CACHE vdevs can be removed from pools with top-level RAIDZ vdevs.
 

Shigure

Dabbler
Joined
Sep 1, 2022
Messages
39
I think the only downside of a persistent L2ARC is that a restart might take a long time, though I don't expect to restart my NAS very often.

I didn't find a good answer on the ratio, but I'm pretty sure 1TB is just too much lol. Maybe I'll try 512GB first, if I do add L2ARC.

TBH I don't have a performance challenge yet. I'm planning how to use my existing hardware for the best performance, so I can set things up and then forget about it (not really forget, just stop fiddling around).

And yes, anime here basically means video files. I also have a lot of photos, but I guess they aren't really latency-sensitive data either, right? So I'm debating whether I really need an L2ARC drive.

I learned that if the special vdev dies, the pool dies, so I chose a 3-way mirror for the metadata vdev, and the remaining drive will be a hot spare or L2ARC. I will run some benchmarks to see if my workload can really benefit from a metadata-only L2ARC, though I guess the gain will probably be negligible lol
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I don't think that a metadata-only L2ARC will give you any benefit since you already have a special vdev.

You were probably fine without either tbh.
 

Shigure

Dabbler
Joined
Sep 1, 2022
Messages
39
Yeah, I think that might be the final answer I end up with after I set everything up and run benchmarks xd.
But I always learn a lot through the process, and as always, thank you guys for the advice.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
The thought process is correct, and a 3-way mirror as special vdev to a raidz2 pool should be safe.
But if you have 64 GB RAM (which would be a fair size for a ca. 50 TB pool), a fair amount of metadata could be in ARC already. It seems that you're trying to solve the problem of browsing through large directories (photos, possibly with some "sidecar" files?) before hitting an actual issue.
I'd suggest building the HDD pool and using it. If browsing is too slow, then try adding an NVMe drive as L2ARC. If that's not satisfactory, or if the issue is with regularly writing files, then go for the special vdev, but only as a last resort because it is an irreversible choice.
 

Shigure

Dabbler
Joined
Sep 1, 2022
Messages
39
You actually pointed out one of my (bad) habits: thinking too much about potential issues before I really hit an actual one.
And trying things before adding a metadata vdev is a really good point, since once I add it I cannot remove it without destroying the pool. Now I basically have everything listed out with its own destination, just waiting for my riser card...
 