Choosing SSD and configuring mirrored pools

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
TrueNAS-13.0-U5.1
Motherboard: Asus TUF Gaming B450M Plus II
RAM: 64GB
CPU: AMD Ryzen 5 5600G with Radeon Graphics

File access is almost always via SMB.

Boot pool: Mirrored 2 x 500GB SSD
Main storage pool: RAIDZ1 (4 x 8TB spinning rust) (typo: this originally said GB)

I intend to add an ASUS Hyper M.2 x16 Gen 4 card
with 2 mirrored vdevs: one for jails and Jellyfin metadata, and the other for a special small-file vdev.

First question: can the small-file special vdev handle files from 2 pools (the main pool and the jail/Jellyfin pool), and is that wise?

I am looking at 2TB SK Hynix Platinum P41 SSDs for both mirrors.

The special small-file vdev should be formatted 512n, as I understand it. Do I have the correct SSD for this purpose?

Should the jail/Jellyfin metadata mirror be formatted 4096 or 512n?
If 4096, can it share the small-file vdev with the main pool? Is this a wise decision?

Thanks for considering these questions
 
Last edited:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
First question: can the small-file special vdev handle files from 2 pools (the main pool and the jail/Jellyfin pool), and is that wise?
Not via the GUI and not supported in general, but technically can be done. Not wise/recommended.

The special small-file vdev should be formatted 512n, as I understand it. Do I have the correct SSD for this purpose?

Should the jail/Jellyfin metadata mirror be formatted 4096 or 512n?
If 4096, can it share the small-file vdev with the main pool? Is this a wise decision?
I don't think you want different ashift settings for VDEVs in the same pool: https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#alignment-shift-ashift

I don't think it's wise to mess with it and there's no clear benefit as I see it.
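
If you want to see what ashift your VDEVs actually ended up with, the Shell can tell you (pool name here is just an example, and on TrueNAS CORE you may need to point zdb at the cache file as shown):

zpool get ashift npool
# 0 on the pool property just means "auto-detect at VDEV creation"
zdb -U /data/zfs/zpool.cache -C npool | grep ashift
# shows the ashift recorded for each top-level VDEV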
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
with 2 mirrored vdevs: one for jails and Jellyfin metadata, and the other for a special small-file vdev.

First question: can the small-file special vdev handle files from 2 pools (the main pool and the jail/Jellyfin pool), and is that wise?

I am looking at 2TB SK Hynix Platinum P41 SSDs for both mirrors.

The special small-file vdev should be formatted 512n, as I understand it. Do I have the correct SSD for this purpose?

Should the jail/Jellyfin metadata mirror be formatted 4096 or 512n?
If 4096, can it share the small-file vdev with the main pool? Is this a wise decision?

Thanks for considering these questions

There be dragons here. Proceed cautiously. If both drives of a small vDEV mirror go, so does your pool. I would stick to data-center grade SSDs, like Intel. For example, I use three mirrored S3610s that have enormous write tolerance, and even a 1.6TB version is less than $100 or so on eBay.

As an aside, I'm really not a fan of how Hynix tries to tart up their specifications site vs. the simple, sober Intel PDF.
 

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
Not via the GUI and not supported in general, but technically can be done. Not wise/recommended.


I don't think you want different ashift settings for VDEVs in the same pool: https://openzfs.github.io/openzfs-docs/Performance and Tuning/Workload Tuning.html#alignment-shift-ashift

I don't think it's wise to mess with it and there's no clear benefit as I see it.

Point one -- reasonable; I will not do it.
Point two -- that seems to make sense: for the jail/Jellyfin pool, keep the same ashift as the spinning rust.


For the special allocation vdev, a 512n format, if I read correctly, is better for small files, say less than 1K, 1.5K, or 2K (maybe 2.5K). (I think anything 2K+ is, in my thinking, close enough to 4K.) The whole purpose of the special allocation vdev, from what I read, is to better handle small files below a chosen length of up to 1M, but I think the 3 or 4 values above are better. If this is to work, doesn't the ashift value need to be 2?


I believe the boot pool on the small SSDs is also 512n, but I may be wrong. I tried zfs get all boot-pool but I did not see ashift.
 
Last edited:

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
There be dragons here. Proceed cautiously. If both drives of a small vDEV mirror go, so does your pool. I would stick to data-center grade SSDs, like Intel. For example, I use three mirrored S3610s that have enormous write tolerance, and even a 1.6TB version is less than $100 or so on eBay.

As an aside, I'm really not a fan of how Hynix tries to tart up their specifications site vs. the simple, sober Intel PDF.
I need M.2 NVMe drives to attach to the Asus M.2 x16 Gen 4 card; these appear to be conventional SATA drives.

Thanks
 
Last edited:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
zpool get ashift

In a pool with VDEVs of mixed ashift, you'll have issues to add or remove VDEVs with the GUI and will need close attention with CLI to handle that.
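
For reference, adding the special mirror from the CLI while forcing the ashift to match would look roughly like this (device names are placeholders, so check yours carefully before running anything):

zpool add -o ashift=12 npool special mirror /dev/nvd0 /dev/nvd1
zpool status -v npool
# confirm the special mirror shows up where you expect before putting data on it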
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
I need M.2 NVMe drives to attach to the Asus M.2 x16 Gen 4 card; these appear to be conventional SATA drives.
Intel also makes data-center-quality M.2 drives. On the other hand, paying only $108 at Amazon for 2TB is pretty tasty. It's not a tradeoff I'd make, so I hope you'll have a good set of current backups.
 

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
Intel also makes data-center-quality M.2 drives. On the other hand, paying only $108 at Amazon for 2TB is pretty tasty. It's not a tradeoff I'd make, so I hope you'll have a good set of current backups.
I may be wrong, but I do not believe Intel is in the SSD business now; according to what I read, they sold it to Solidigm.

Other possibilities I also found

Intel 670p Series M.2 2280 2TB PCI-Express 3.0 x4 QLC -- I don't like Gen3 or QLC; am I right?
Intel is not making SSDs anymore, if I read correctly.

Solidigm™ P44 Pro Series 2TB PCIe GEN 4 NVMe 4.0 x4 M.2 2280 3D NAND Internal Solid State Drive

They purchased the SSD business from Intel, as I understand it; I do not know their reputation.

SABRENT 2TB Rocket Nvme PCIe 4.0 M.2 2280 Internal SSD

Samsung 990 PRO Series - 2TB PCIe Gen4 x4 NVMe 2.0c - M.2 Internal SSD

These might be commercial grade (maybe)

WD_BLACK 2TB SN850X NVMe Internal Gaming SSD Solid State Drive - Gen4 PCIe, M.2 2280

Silicon Power 2TB UD90 NVMe 4.0 Gen4 PCIe M.2 SSD
 
Last edited:

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
zpool get ashift

In a pool with VDEVs of mixed ashift, you'll have issues to add or remove VDEVs with the GUI and will need close attention with CLI to handle that.
Are we talking about the same thing?

I mean the special vdev activated by the following

"Metadata (Special) small block size"

"This value represents the threshold block size for including small file blocks into the special allocation class. Blocks smaller than or equal to this value will be assigned to the special allocation class while greater blocks will be assigned to the regular class. Valid values are zero or a power of two from 512B up to 1M. The default size is 0 which means no small file blocks will be allocated in the special class. Before setting this property, a special class vdev must be added to the pool. See zpool(8) for more details on the special allocation"
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
For the special allocation vdev, a 512n format, if I read correctly, is better for small files, say less than 1K,
If you're talking about 512n/4Kn format, then that's ashift... but the "cutoff" for the special VDEV is called special_small_blocks... maybe that's what you meant to be talking about.

Picking a number for special_small_blocks needs to take into account the speed of your special VDEV at dealing with smaller files and the space you want to give to it... but it also makes no sense at all to set it to a number equal to or greater than the recordsize of the dataset/pool (that would mean you only use the special VDEV until it's full).
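
As a sketch (dataset name and cutoff are only examples, and the cutoff needs to stay below the dataset's recordsize), the property is set per dataset once the special VDEV exists:

zfs set special_small_blocks=64K npool/mydataset
zfs get special_small_blocks,recordsize npool/mydataset
# check the cutoff against the recordsize on the same dataset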
 
Last edited:

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
If you're talking about 512n/4Kn format, then that's ashift... but the "cutoff" for the special VDEV is called special_small_blocks... maybe that's what you meant to be talking about.

Picking a number for special_small_blocks needs to take into account the speed of your special VDEV at dealing with smaller files and the space you want to give to it... but it also makes no sense at all to set it to a number equal to or greater than the recordsize of the dataset/pool (that would mean you only use the special VDEV until it's full).
Yes, I meant to create a SPECIAL allocation vdev so I could set special_small_blocks to a value of roughly 2K, so that files of that byte length and smaller land in the special allocation vdev. The main (and, based on your advice, only) pool this will be associated with will contain media files (large), backups (also large), and synced Windows home groups; many files will be larger than 2K bytes. How should that special vdev be formatted, given it will be loaded with files equal to or smaller than the special_small_blocks value?

see TrueNAS help text included in 9:08 AM post.
 
Last edited:

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
zpool get ashift

In a pool with VDEVs of mixed ashift, you'll have issues to add or remove VDEVs with the GUI and will need close attention with CLI to handle that.
I currently have two pools and used the command you indicated:
boot-pool has an ashift of 0 (I do not believe I altered that; I believe it was the default).
npool (my main pool) has an ashift of 12, as we both expected.

That is currently all the pools I have.

I am thinking about adding a fast mirrored pool (fpool, for lack of a better name) with, as you suggest, an ashift of 12.
It will contain jails and Jellyfin metadata, and maybe more.

I am also thinking of adding (associating) a mirrored special allocation class (aka a special metadata small-file vdev) to npool and storing small files less than 2K in length in it. See the TrueNAS help text in the post from 9:08 AM above.
 
Last edited:

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
If you're talking about 512n/4Kn format, then that's ashift... but the "cutoff" for the special VDEV is called special_small_blocks... maybe that's what you meant to be talking about.

Picking a number for special_small_blocks needs to take into account the speed of your special VDEV at dealing with smaller files and the space you want to give to it... but it also makes no sense at all to set it to a number equal to or greater than the recordsize of the dataset/pool (that would mean you only use the special VDEV until it's full).
I do not understand this answer at all (and the confusion is probably on my side). My understanding is that a special metadata small-file vdev can be associated with a data vdev (RAIDZ1, 4 x 8TB spinning rust). To do that, one must allocate the special metadata small-file vdev with its own disks. I propose to use 2 NVMe (Gen4) SSDs (picking the right SSDs is also something I am trying to determine) as a mirror for that special metadata small-file vdev (very fast, as I understand it, compared to spinning rust on a SATA connection). Is that possible?

If it is possible, then the special metadata small-file vdev must, I believe, be allocated with an ashift value. Since it is supposed to hold small files, is that ashift value 2 or 12, or does the creation of the special metadata small-file vdev choose whatever ashift it wants? And are the drives formatted in a corresponding manner? (If the drives require a special format, how do I know which one, especially if the creation of the special metadata small-file vdev may choose its own ashift value?)

Once this special metadata small-file vdev is created and associated with the RAIDZ1 data vdev, I believe I can set the special_small_blocks value, then proceed to rename and copy the datasets in the RAIDZ1 data vdev where I want the small files in the special metadata small-file vdev, and lastly delete the old datasets; I believe the conversion is then over. Have I missed a step, do I not understand the process, or am I completely out in left field trying to do something that is impossible?

By the way -- thanks for the help. The end of the confusion on my part will be understanding, regardless of whether this can be done or not.
 
Last edited:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
All I'm trying to tell you is that a mix of differently ashifted VDEVs won't be handled by the GUI, so anything you subsequently do with that pool regarding adding or removing VDEVs with the GUI will fail with a message that's a little cryptic if you're not expecting it (as it says something like only pools with mirrored top-level VDEVs support removal... even if that's the case for your pool).

You can (probably/sort-of) manage it properly with CLI, so:

either match your HDDs' ashift with the SSDs and have plain sailing, or mix it and deal with the trouble.
 

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
All I'm trying to tell you is that a mix of differently ashifted VDEVs won't be handled by the GUI, so anything you subsequently do with that pool regarding adding or removing VDEVs with the GUI will fail with a message that's a little cryptic if you're not expecting it (as it says something like only pools with mirrored top-level VDEVs support removal... even if that's the case for your pool).

You can (probably/sort-of) manage it properly with CLI, so:

either match your HDDs' ashift with the SSDs and have plain sailing, or mix it and deal with the trouble.
I may have fixed the ashift part -- I tried. Does any of the rest of this make sense?

Please, when you answer, if you tell me my ashift is messed up, also consider telling me whether the rest of this is possible if I get the ashift right. Am I wasting your time with something that is impossible regardless of the ashift value? I believe the following is what you are trying to tell me is the standard configuration -- that is what I am aiming for: the tested, known, standard configuration of these items.

My understanding is that a special metadata small-file vdev can be associated with a data vdev (RAIDZ1, 4 x 8TB spinning rust). To do that, one must allocate the special metadata small-file vdev with its own disks. I propose to use 2 NVMe (Gen4) SSDs (picking the right SSDs is also something I am trying to determine) as a mirror for that special metadata small-file vdev (very fast, as I understand it, compared to spinning rust on a SATA connection). Is that possible?

The ashift for the special metadata small-file vdev has to be 12, because the associated RAIDZ1 data vdev is also 12. I agreed long ago that the other data pool, fpool, would be ashift 12. Is the formatting requirement for the SSDs I will add to the special metadata small-file vdev and to fpool then 4096n?

Once this special metadata small-file vdev is created and associated with the RAIDZ1 data vdev, I believe I can set the special_small_blocks value, then proceed to rename and copy the datasets in the RAIDZ1 data vdev where I want the small files in the special metadata small-file vdev, and lastly delete the old datasets; I believe the conversion is then over. Have I missed a step, do I not understand the process, or am I completely out in left field trying to do something that is impossible?
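
In case it helps to see it spelled out, this is roughly the sequence I am picturing (device and dataset names are placeholders; I have not run any of it):

# 1. add the mirrored special vdev, matching the pool's ashift
zpool add -o ashift=12 npool special mirror /dev/nvd0 /dev/nvd1
# 2. existing blocks only move when rewritten, so rename the dataset out of the way
zfs rename npool/mydata npool/mydata_old
zfs snapshot -r npool/mydata_old@migrate
# 3. copy it back, setting the small-file cutoff on the new copy as it is written
zfs send -R npool/mydata_old@migrate | zfs receive -o special_small_blocks=2K npool/mydata
# 4. delete the old copy once everything checks out
zfs destroy -r npool/mydata_old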
 
Last edited:

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
A special metadata VDEV works and is put into place as you mention.
 

pasha-19

Dabbler
Joined
Feb 15, 2021
Messages
19
sretalla: Thanks

I also read about L2ARC possibly being used for all or part of the function of a special metadata vdev. Does this option remove some of the risk associated with the special metadata VDEV, because the data in the L2ARC version is maintained in its original location in the main pool, so loss of the L2ARC does not result in loss of the associated pool data? As an alternative, can the jails and Jellyfin metadata be forced (or does it just automatically happen) onto the L2ARC from the main pool, again protecting against corruption of the L2ARC resulting in data loss?

The more I read about SSD power-failure behavior, and the less I find about a UPS mitigating this issue, the more I feel the information available is negligible. The inability to get an actual hardware recommendation, even now that my implementation process is confirmed as accurate, places the possibility of implementing either SSD-based vdev beyond my understanding and/or ability. From what I read, all the drives I have identified above are probably considered commercial grade and not recommended.

Thanks to all.
 
Last edited:

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I intend to add an ASUS Hyper M.2 x16 Gen 4 card
with 2 mirrored vdevs: one for jails and Jellyfin metadata, and the other for a special small-file vdev.

First question: can the small-file special vdev handle files from 2 pools (the main pool and the jail/Jellyfin pool), and is that wise?
The SSD pool won't benefit much from a small-file VDEV; using it with the HDD pool has its merits.

A single VDEV can't be in two different pools.


I also read about L2ARC possibly being used for all or part of the function of a special metadata vdev. Does this option remove some of the risk associated with the special metadata VDEV, because the data in the L2ARC version is maintained in its original location in the main pool, so loss of the L2ARC does not result in loss of the associated pool data?

You can set L2ARC as metadata-only. It's a different thing from how fusion pools work. It also generally needs at least 64GB of RAM in order to be beneficial and not harm performance.
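
If you go that route, the knobs look roughly like this (pool and device names are placeholders):

zpool add npool cache /dev/nvd2
# attaches an L2ARC device to the pool
zfs set secondarycache=metadata npool
# caches only metadata from this pool's datasets in L2ARC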



As an alternative, can the jails and Jellyfin metadata be forced (or does it just automatically happen) onto the L2ARC from the main pool, again protecting against corruption of the L2ARC resulting in data loss?

L2ARC is a cache; it doesn't protect from corruption. That's work done at the block level.

You should read the following resources.
 
Last edited:

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Couple of comments

If you're seeking very fast seek performance, then an L2ARC can be super beneficial. If the contents of your drive don't change much and you want to speed up finding stuff (rsync in particular benefits from this), then consider a metadata-only, persistent L2ARC. L2ARC is awesome because if it fails, so what. The sVDEV is different!

Small-file sVDEVs are great for databases and other small files that change a lot -- but you need to have a very good idea of how many small files you expect to have, or how to reduce that number. For example, I crammed what used to be simple OS backups into sparsebundles, dramatically reducing my file count. Sparsebundles and similar approaches also help speed up non-ZFS backups by reducing the number and increasing the size of the files that need to be transferred.
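
One crude way to get that picture from the Shell (path and cutoff are only examples, and this goes by file size rather than block size):

find /mnt/npool/mydata -type f -size -64k | wc -l
# files that would fall under a 64K cutoff
find /mnt/npool/mydata -type f | wc -l
# total file count, for comparison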

Trouble is, the ZFS UI still cannot tell you directly how full an sVDEV is. Once it spills over, its benefit may be gone and performance will become a lot less predictable. I wish the UI could give me a sense of how full the sVDEV is.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
I wish the UI could give me a sense of how full the sVDEV is
If the "Shell" page is part of the UI...

zpool list -v poolname
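
The ALLOC and FREE columns on the special mirror's own line show how much of it is in use.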

Otherwise... wishes get turned into reality by feature requests (via the Report a bug link at the top of the page here).
 