Some general TrueNAS and ZFS questions

thomas-hn · Dec 11, 2021

Hello,

Even after reading a lot about TrueNAS and ZFS, some points are still unclear to me. Maybe someone could help improve my understanding as a beginner.

Datasets & Data Organization
- How should personal data be organized into datasets and pools? One solution I have seen is to organize the data into datasets by importance, so that datasets can be backed up according to their importance. Do you have examples of how you have organized your datasets?
- There is a lot of discussion in the forum that performance drops sharply on a nearly full pool/dataset, and that a 100% full pool cannot even delete files anymore (because of the copy-on-write mechanism). Is there a simple mechanism to reserve around 20% of the disk space for emergencies? Does it work to simply create an additional dataset with some reserved space, which can be reduced to zero if we run out of space so that files can be deleted again? (Sorry for this question, which has been discussed a lot here, but there are so many different answers that I have not really understood what works and what does not.)
VDevs
- I have found that TrueNAS recommends WD drives. However, is it recommended to mix drives of different makes and models in a VDev to avoid data corruption caused by bugs in a specific drive model?
- How do TrueNAS and ZFS behave when drives of different makes are used? For example, if an 8TB drive from WD and one from Seagate are combined, does TrueNAS automatically use the size of the slightly smaller drive?
ZPools
- I have learned that ZFS "fragments" over time and that a defrag is not possible. The only way to "defrag" seems to be to remove the whole pool and recreate it. Would it also be possible to create a new pool, copy over all data to the new pool and simply redirect all "references" from the old pool to the new one?
- Are there any rules on how to arrange Pools over VDevs? For example, if there are three VDevs A, B, C, is it possible to create one Pool spread over VDev A and B, another Pool over VDev A and C, and one pool only located on VDev B?
Encryption
- Do you recommend encrypting complete pools or encrypting at the dataset level?
TrueNAS, SSD & TRIM
- Is it necessary to use TRIM for SSDs?
- Sometimes it is recommended to use SSDs supporting RZAT (deterministic TRIM; deterministic read zeros after TRIM), while other sources write "TRIM is not used and not needed due to ZFS's copy on write drive leveling."
- Does encrypting pools on SSDs cause problems (because of possible write amplification)?
Optimizations for SSDs
- Which configuration parameters should be optimized for SSDs?
  - Disable atime updates?
  - Using compression and deduplication to reduce the writes to the SSD vdevs, prolonging the lifetime.
  - Sector Size
    - How to configure the logical sectors, block sizes, etc.?
Config DB
- Do you recommend placing the Config DB on a separate dataset or pool? I assume that a separate pool has the advantage that it can simply be imported after a new TrueNAS installation to restore all TrueNAS settings, without having to import all user data pools at the same time.

Thanks a lot in advance,

Thomas

NugentS · Dec 11, 2021

I'll pitch in on some of this

Datasets & Data Organization
This is very much down to how you want to do it. The world is your oyster. However, I tend to organise by protocol, disk type & purpose. So if I have a pool called BigPool where most of my data is, then I have a dataset below that called SMB, below which I have specific datasets for shares.
NFS has its own top-level dataset, as does iSCSI. I try to make the names descriptive, so later on I know what's there. I have other pools, by disk type and size, for other purposes.

BigPool is a pool of 8*12TB in mirrored vdevs
     iSCSI
          iscsi zvol for vmware
     NFS
          NFS.BigPool.HDD.NAS - NFS Share
     SMB
          Install - SMB Share
          Media - SMB Share
          Common - SMB Share
          etc
     Snapshots
          OtherPools
               JailPool
               NVMePool
               SSDPool

Then I have 3 other Pools

JailPool - which is 2 * 800GB mirrored SSD
     iocage
     Jail-Data - Jail Data stored outside the actual jail - making rebuilding the jail easier
NVMePool - which is 2 * 2TB NVMe mirrored SSD
     iSCSI
          iscsi zvol
SSDPool - which is 6*1.6TB SSD mirrored pairs
     iSCSI
         iscsi zvol

The Snapshot dataset under BigPool contains replicated snapshots of the other pools for a few days as a just in case.

This is how I organise myself - I haven't found a big problem with it yet. Given that datasets inherit their options from the datasets above them, I can change the record size of all my SMB datasets in one go at the SMB dataset level, and then adjust for specific cases further down.

I use mirrored pairs for increased IOPS, although I have toyed with the idea of making BigPool a Z3 or Z2 for a little extra capacity. However, using mirrors means that I can easily increase capacity by adding 2 disks rather than wide vdevs of lots of disks. To improve performance of iSCSI and NFS I use a SLOG on SSDPool & BigPool; BigPool also has a special vdev for metadata on some Intel DC SSDs. I try to keep a pool at 50% or thereabouts, again for performance. BigPool is at 54%. I don't always succeed - but definitely <80%.
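
As a rough illustration of the arithmetic behind those fill targets, here is a minimal Python sketch. It assumes each mirror vdev contributes one disk's worth of capacity and ignores ZFS overhead and TB/TiB differences. The function names and the 50% / 80% defaults are only illustrative, taken from the figures in this post.

```python
def mirror_pool_capacity_tb(disk_count, disk_size_tb, mirror_width=2):
    """Approximate usable capacity of a pool built from mirror vdevs.

    Each mirror vdev stores one disk's worth of data, however wide it is.
    """
    if disk_count % mirror_width != 0:
        raise ValueError("disk_count must be a multiple of mirror_width")
    vdev_count = disk_count // mirror_width
    return vdev_count * disk_size_tb


def fill_status(used_tb, capacity_tb, target=0.50, limit=0.80):
    """Classify pool usage against a soft target and a hard limit."""
    ratio = used_tb / capacity_tb
    if ratio >= limit:
        return "over limit"
    if ratio > target:
        return "above target"
    return "ok"


# BigPool from the post: 8 x 12TB disks in 2-way mirrors, 54% used.
capacity = mirror_pool_capacity_tb(8, 12)
print(capacity)                                 # 48
print(fill_status(0.54 * capacity, capacity))   # above target
```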

vDevs
I don't mix drives - but I would have no issues doing so on a scratch pool. On a primary pool, I don't mix disks within vdevs. Different vdevs like specials / SLOGs etc. have different disks appropriate to their function, obviously.

ZPools
Fragmentation is not necessarily what you think it is. ZFS always tries to write a file consecutively, and as long as there is plenty of spare space available it will normally do that. I THINK that fragmentation refers to how it won't necessarily write one file immediately after another, depending on where the empty space is and how big it is. [Someone else may care to pitch in here]
You don't seem to get vdevs and pools.
Pools are 1 or more vdevs added together for capacity / IOPS / some special purpose. vdevs do NOT traverse pools. vdevs are physical disks grouped together for a purpose. [Caveat] It is possible to put multiple partitions on a single disk and turn those partitions into vdevs that belong to different pools - but this is generally not recommended and can't be done in the GUI. I do this specifically on a pair of Optanes that I use on two different pools as SLOGs.

Datasets are logical separators inside pools that contain data. They do not traverse pools, but do traverse vdevs obviously

Encryption
Your choice. Encryption can be an utter nuisance when trying to recover from a problem. I only encrypt a single small dataset containing confidential information.

TrueNAS, SSD & TRIM
No idea - I turn TRIM on as a matter of course

Optimizations for SSDs
No real idea. If drives can support 4K then I use 4K. I use compression as a matter of course and do not use dedupe (the use case for dedupe is rare and requires significant hardware resources). I disable atime as a matter of course. When I add a new drive I use another PC to set the sector size where applicable before adding it to TN.

ConfigDB
Once you build the boot pool (through the TN install) and then add a new pool, the system dataset is automatically moved to it. Where you put it doesn't really matter - I actually put it on some high-endurance SSDs (it only really matters if you are trying to get some HDDs to go to sleep, in which case don't keep the system dataset on that pool). What I do do is get a copy of the config file emailed to me every day - so I always have an up-to-date copy off the NAS, somewhere I can get to it in case my boot media dies and I need to rebuild.

The point is mostly that TN is very configurable around your use case. What I do works for me (so far), it might not work for everyone. I do back up (twice) and have most of the data offsite as a further backup so if / when I figure out a better way of doing things I can just trash the setup and start again by restoring the data.

Remember to consider backups as part of your design strategy. I use both replication to another QNAS box and file-by-file backup to a Synology (which, despite being a bit clunky, does the job really well, backing up SMB and VMs individually without issue).

Ericloewe · Dec 11, 2021

thomas-hn said:
How to organize the personal data into datasets and pools? One solution I have seen is, to organize the data regarding its importance into datasets, so that datasets can simply be backed up based on their importance. Do you have examples on how you have organized your datasets?

If it's to be treated differently, separate datasets. It's also a good idea to divide huge datasets into smaller ones, in case you need to shuffle them around to a different pool or something at a later date.

thomas-hn said:
There is a lot of discussion in the forum here that on a highly filled Pool/Dataset

This does not apply to datasets, just to the pool.

thomas-hn said:
Does it work to simply create an additional dataset with some reserved space that can be reduced to zero in case we run out of space and to be able to delete some files?

Sure, but it's still best to avoid the 100% mark.
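
A minimal sketch of that placeholder-dataset idea, written as a small Python helper that only builds the command strings. The `refreservation` property and the `zfs create -o` / `zfs set` syntax are standard ZFS. The pool name `tank`, the dataset name `reserved`, and the 20% figure are assumptions for illustration.

```python
def reserve_commands(pool, pool_size_gib, percent=20, name="reserved"):
    """Build the zfs commands to create, and later release, a placeholder
    dataset that holds back part of the pool for emergencies."""
    if not 0 < percent < 100:
        raise ValueError("percent must be between 0 and 100")
    reserve_gib = pool_size_gib * percent // 100
    dataset = f"{pool}/{name}"
    create = f"zfs create -o refreservation={reserve_gib}G {dataset}"
    # Dropping the reservation hands the space back to the pool.
    release = f"zfs set refreservation=none {dataset}"
    return create, release


create_cmd, release_cmd = reserve_commands("tank", 1000)
print(create_cmd)   # zfs create -o refreservation=200G tank/reserved
print(release_cmd)  # zfs set refreservation=none tank/reserved
```

The release command is the part that matters in an emergency: once the reservation is gone, the pool has free blocks again and deletes can proceed.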

thomas-hn said:
I have found that TrueNAS recommends WD drives.

Ehh, that's Marketing really, not much more to it if you're building your own.

thomas-hn said:
However, is it recommended to mix drives of different makes and models in a VDev to avoid data corruption because of bugs in a specific drive model?

Not particularly, because it's not very practical unless you have a simple setup.

thomas-hn said:
8TB drive from WD and Seagate are combined, does TrueNAS automatically use the minimally smaller drive size?

Yes, though this isn't a common thing these days.
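
To make the "smallest drive wins" rule concrete, here is a rough Python sketch. It treats every member of a vdev as if it were the size of the smallest disk and ignores ZFS metadata and padding overhead. The byte counts in the example are made up, not measured from real WD or Seagate drives.

```python
def mirror_vdev_capacity(disk_sizes_bytes):
    """A mirror can hold no more than its smallest member."""
    return min(disk_sizes_bytes)


def raidz_vdev_capacity(disk_sizes_bytes, parity):
    """Approximate RAIDZ capacity: every disk counts as the smallest one,
    and `parity` disks' worth of space goes to redundancy."""
    data_disks = len(disk_sizes_bytes) - parity
    if data_disks < 1:
        raise ValueError("need more disks than parity level")
    return min(disk_sizes_bytes) * data_disks


# Two nominally "8TB" drives whose exact sizes differ slightly (hypothetical).
drives = [8_001_563_222_016, 8_000_000_000_000]
print(mirror_vdev_capacity(drives))  # 8000000000000
```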

thomas-hn said:
I have learned that ZFS "fragments" over time and that a defrag is not possible. The only way to "defrag" seems to be to remove the whole pool and recreate it. Would it also be possible to create a new pool, copy over all data to the new pool and simply redirect all "references" from the old pool to the new one?

Don't worry about it, but yes, that's more or less it.
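
For the "copy everything to a new pool" route, one common approach is a recursive snapshot followed by a replication stream. The sketch below only assembles the command strings in Python. `zfs snapshot -r`, `zfs send -R` and `zfs receive -F` are standard ZFS commands, while the pool names and the snapshot name `migrate` are placeholders. It does not cover repointing shares, jails or the system dataset afterwards.

```python
def migration_commands(old_pool, new_pool, snapshot="migrate"):
    """Build the commands to replicate a whole pool into a freshly created one."""
    snap = f"{old_pool}@{snapshot}"
    return [
        # Snapshot every dataset in the old pool at the same moment.
        f"zfs snapshot -r {snap}",
        # Send all datasets, properties and snapshots; -F lets the receive
        # overwrite the new pool's empty root dataset.
        f"zfs send -R {snap} | zfs receive -F {new_pool}",
    ]


for cmd in migration_commands("oldpool", "newpool"):
    print(cmd)
```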

thomas-hn said:
Are there any rules on how to arrange Pools over VDevs? For example, if there are three VDevs A, B, C, is it possible to create one Pool spread over VDev A and B, another Pool over VDev A and C, and one pool only located on VDev B?

A pool is made up of vdevs and data is distributed across vdevs. You cannot share a vdev between pools. You could do some trickery like partitions or NVMe namespaces, but I can't think of a good reason to do so.
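
The ownership rule can be sketched as a small Python check. It models each pool as a list of vdev names and reports any vdev claimed by more than one pool. The pool names and the A/B/C layout are just the hypothetical example from the question.

```python
def find_shared_vdevs(pools):
    """Return the vdevs that appear in more than one pool, with their pools."""
    owners = {}
    for pool_name, vdevs in pools.items():
        for vdev in vdevs:
            owners.setdefault(vdev, []).append(pool_name)
    return {vdev: names for vdev, names in owners.items() if len(names) > 1}


# The layout proposed in the question: A and B are each claimed twice.
proposed = {"pool1": ["A", "B"], "pool2": ["A", "C"], "pool3": ["B"]}
print(find_shared_vdevs(proposed))
# {'A': ['pool1', 'pool2'], 'B': ['pool1', 'pool3']}

# A valid layout: every vdev belongs to exactly one pool.
print(find_shared_vdevs({"pool1": ["A", "B"], "pool2": ["C"]}))  # {}
```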

thomas-hn said:
Do you recommend encrypting complete pools or encrypting at the dataset level?

Different goals. Want easy disk disposal, or to make sure nothing leaks via metadata? Full-disk encryption. Want to be able to do encrypted sends to untrusted endpoints, or just don't want to encrypt your media? Native encryption.
You can also do both, but that's getting obscene.

thomas-hn said:
TrueNAS, SSD & TRIM

Is it necessary to use TRIM for SSDs?

Sometimes it is recommended to use SSDs supporting RZAT (deterministic TRIM; deterministic read zeros after TRIM), while other sources write "TRIM is not used and not needed due to ZFS's copy on write drive leveling."

Does encrypting pools on SSDs cause problems (because of possible write amplification)?

TRIM is good, but typically not 100% essential. It should work out of the box without too many issues these days.
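
On OpenZFS, TRIM can be enabled per pool with the `autotrim` property or run by hand with `zpool trim`. The Python sketch below just builds those command strings. The pool name `ssdpool` is a placeholder, and whether automatic or manual TRIM suits a given SSD is left open here.

```python
def trim_commands(pool, automatic=True):
    """Build zpool commands to enable automatic TRIM or run a one-off TRIM."""
    if automatic:
        return [f"zpool set autotrim=on {pool}"]
    return [
        f"zpool trim {pool}",       # start a manual TRIM of the whole pool
        f"zpool status -t {pool}",  # show per-vdev TRIM status
    ]


print(trim_commands("ssdpool"))
print(trim_commands("ssdpool", automatic=False))
```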

thomas-hn said:
Optimizations for SSDs

Which configuration parameters should be optimized for SSDs?

Disable atime updates?

Using compression and deduplication to reduce the writes to the SSD vdevs, prolonging the lifetime.

All these apply to HDDs as well. Disabling atime is good to avoid needless writes and snapshots containing just new atimes. Compression is basically free and typically a speedup. Stay away from dedup.
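
Those three settings map onto ordinary dataset properties. Here is a small Python sketch that generates the corresponding `zfs set` commands. `atime`, `compression` and `dedup` are real ZFS properties, but picking `lz4` as the compression algorithm and the dataset name `tank/data` are assumptions made for the example.

```python
SUGGESTED_PROPERTIES = {
    "atime": "off",        # avoid a metadata write on every read
    "compression": "lz4",  # assumed algorithm; the post only says "compression"
    "dedup": "off",        # stay away from dedup
}


def property_commands(dataset, properties=SUGGESTED_PROPERTIES):
    """Build one `zfs set` command per property for the given dataset."""
    return [f"zfs set {name}={value} {dataset}" for name, value in properties.items()]


for cmd in property_commands("tank/data"):
    print(cmd)
```

Child datasets inherit these values unless they override them, so setting them once near the top of the hierarchy is usually enough.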

thomas-hn said:
How to configure the logical sectors, block sizes, etc.?

If your SSDs can be told to report larger sectors, such as 4k or 8k, you might as well.
As for the recordsize property... For datasets with mostly large files, go with 1M. Databases and virtual disks? Smaller, depends on the specifics. Anything else? Leave the default alone.
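
The recordsize advice can be written down as a lookup, sketched here in Python. The 1M value for large files comes from the post. The 16K value for databases is only an assumed example of "smaller", since the right number depends on the application. Any other workload returns `None`, meaning the default is left alone.

```python
RECORDSIZE_BY_WORKLOAD = {
    "large_files": "1M",  # from the post: datasets with mostly large files
    "database": "16K",    # assumption: should match the database's page size
}


def recordsize_command(dataset, workload):
    """Return a `zfs set recordsize=...` command, or None to keep the default."""
    recordsize = RECORDSIZE_BY_WORKLOAD.get(workload)
    if recordsize is None:
        return None
    return f"zfs set recordsize={recordsize} {dataset}"


print(recordsize_command("tank/media", "large_files"))  # zfs set recordsize=1M tank/media
print(recordsize_command("tank/home", "general"))       # None
```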
