Newbie question: Pools and Datasets

rofe

Cadet
Joined
Aug 1, 2022
Messages
5
Hi,

I'm new to both TrueNAS and FreeBSD.

I've installed TrueNAS following the guide, but I can't seem to find an explanation of Pools, or of whether I need to create a Dataset myself.

As I understand it, a Pool is one or more disks in a Data VDev (Virtual Device?) configuration. When creating a VDev, TrueNAS automatically creates a Dataset (a filesystem, ZFS?) called Root Dataset.

Can I use this for data or do I need to create my own Dataset or Zvol on top of the Pool?

What is best practice?
In my searches I found that some people have problems because permissions on the Root Dataset can't be changed.

Can I make multiple Datasets on the same Pool, and will the Datasets automatically/dynamically share the available space in the Pool between them?

Regards
Ronni
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
It's best to create a dataset(s) and store your data there rather than in the root of the pool.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Can I make multiple Datasets on the same Pool, and will the Datasets automatically/dynamically share the available space in the Pool between them?
Simple answer is YES. Sharing the space is the default but you can make quota limits if desired. Make sure you read the user guide.
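
For anyone who finds a toy model easier than prose, here is a minimal Python sketch of the idea. It is not how ZFS does its accounting (it ignores snapshots, reservations and overhead), and the `Pool`, `Dataset` and `available` names are made up for illustration. It only shows that every dataset sees the same shared free space unless a quota caps it.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Dataset:
    used: int = 0                 # bytes currently stored in this dataset
    quota: Optional[int] = None   # None means "no quota set"


@dataclass
class Pool:
    size: int                                                   # usable bytes in the pool
    datasets: Dict[str, Dataset] = field(default_factory=dict)  # hypothetical registry

    def free(self) -> int:
        """Space not yet used by any dataset; shared by all of them."""
        return self.size - sum(d.used for d in self.datasets.values())

    def available(self, name: str) -> int:
        """What one dataset can still write: pool free space, capped by its quota."""
        ds = self.datasets[name]
        if ds.quota is None:
            return self.free()
        return max(0, min(self.free(), ds.quota - ds.used))


GiB = 1024 ** 3
pool = Pool(size=1000 * GiB)
pool.datasets["documents"] = Dataset(used=100 * GiB)                # no quota
pool.datasets["media"] = Dataset(used=300 * GiB, quota=400 * GiB)   # capped at 400 GiB

print(pool.available("documents") // GiB)  # 600
print(pool.available("media") // GiB)      # 100
```

"documents" can grow into all 600 GiB that are still free, while "media" stops at its quota even though the pool has room left.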

And there are a lot of new terms to understand so take your time. Google is a good search engine to use to look up things, for example "truenas create dataset".
 

erikaJ

Cadet
Joined
Aug 4, 2022
Messages
1
Which version of TrueNAS do you have, CORE or SCALE? The instructions are similar but the UIs have differences.
 

rofe

Cadet
Joined
Aug 1, 2022
Messages
5
Thank you for helping out.

I'm reading the guides and documentation, as I move forward, also reading different sources - sometimes I just need to get it explained in different ways :smile:

Great with the quota, didn't know that.

I'm running TrueNAS Core due to ZFS and BSD.
 
Joined
Oct 22, 2019
Messages
3,641
I'm just going to say this in here. I hate how since ZFS's inception, they blurred the lines between the "pool" and the "top-level root dataset".

You issue "zfs" commands against datasets.

You issue "zpool" commands against pools and vdevs.

So far so good.

But new users get confused about the differences between pools and datasets because upon creating a pool, a dataset of the same name is automatically created as the one and only root dataset. And yes, this root dataset is a dataset... with dataset properties, and it is a filesystem in which files/folders can be saved directly. (So "myPool" is a pool or dataset, depending on what you're trying to do....)
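
To show the split, here is a small Python sketch that only builds and prints a few commands; it does not run anything. The `zpool_cmd` and `zfs_cmd` helpers are made-up names for illustration, and "myPool" is the example name used in this post. The same name ends up meaning the pool in one command and the root dataset in the next.

```python
import shlex
from typing import List

POOL = "myPool"  # example name from this post


def zpool_cmd(*args: str) -> List[str]:
    """Build a command aimed at the pool and its vdevs (hypothetical helper)."""
    return ["zpool", *args]


def zfs_cmd(*args: str) -> List[str]:
    """Build a command aimed at a dataset (hypothetical helper)."""
    return ["zfs", *args]


commands = [
    zpool_cmd("status", POOL),               # "myPool" as the pool: vdev layout and health
    zfs_cmd("list", "-r", POOL),             # "myPool" as the root dataset, plus its children
    zfs_cmd("get", "compression", POOL),     # a dataset property read from the root dataset
    zfs_cmd("create", f"{POOL}/documents"),  # a child dataset under the root dataset
]

# Nothing is executed here; the commands are only printed.
for cmd in commands:
    print(shlex.join(cmd))
```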

To this day, I don't understand why this is the behavior of ZFS.

You cannot simply create a pool without a root dataset of the same name. (Please demonstrate if this is even possible.)

But I believe to be able to do so would have made ZFS much more versatile than it is today.



This is what we're stuck with:
  • myPool (zpool)
    • myPool
      • documents
      • media
      • legal
      • VMs
      • temporary


Yet it should be like so upon creating a pool:
  • myPool (zpool)


So then you create some root datasets:
  • myPool (zpool)
    • personal
    • other


And now you can structure a layout that intuitively makes more sense for inheritance, organizing, and simplified replications:
  • myPool (zpool)
    • personal
      • documents
      • media
      • legal
    • other
      • VMs
      • temporary

Now, in the very first example (of how ZFS is today), how would you do a recursive replication of only the important/personal datasets as a single unit, and drop it into a backup pool? It takes more configuration (such as picking only specific datasets, doing exclusions, etc). And this will not automatically include newly created datasets that are added later. And you'd be dumping them directly under the backup pool's own root dataset.

Now look at the last example. You simply replicate "personal" to your backup pool. Done. Even if you create new datasets under "personal" at a later time, they will be included by default. Also, "personal" remains its own root dataset in the backup pool. Same layout as the source. Oh, want to make recursive changes to the dataset properties of "personal" and below? You issue such commands against "personal". Done. This is no different than the current behavior issuing "zfs" dataset commands against "myPool", except that you're dealing with your own root datasets, and you're not stuck with a non-negotiable "same name as the pool" root dataset.



There are ways around this limitation now, but they involve creating your own "placeholder" pseudo-root datasets, which is what I resorted to, and it works for the most part. It's just not as sleek as the alternative in which ZFS could have avoided "baking in" an immutable root dataset of the same name upon pool creation.


Here's what my "pseudo-roots" look like:
  • myPool (zpool)
    • myPool
      • personal
        • documents
        • media
        • legal
      • other
        • VMs
        • temporary

It's redundant, and somewhat silly, but it works.
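
To make the replication point concrete, here is a rough Python sketch that treats dataset names as plain paths and works out what a recursive operation on one dataset would cover. It does not talk to ZFS at all. `recursive_members` is a made-up helper, and "photos" is a hypothetical dataset added later.

```python
from typing import List


def recursive_members(datasets: List[str], root: str) -> List[str]:
    """Datasets covered by a recursive operation on `root`: root plus its descendants."""
    return [d for d in datasets if d == root or d.startswith(root + "/")]


# Flat layout: everything sits directly under the pool's root dataset.
flat = ["myPool", "myPool/documents", "myPool/media", "myPool/legal",
        "myPool/VMs", "myPool/temporary"]

# Pseudo-root layout: "personal" and "other" act as placeholder parents.
nested = ["myPool",
          "myPool/personal", "myPool/personal/documents",
          "myPool/personal/media", "myPool/personal/legal",
          "myPool/other", "myPool/other/VMs", "myPool/other/temporary"]

# Flat: the important datasets have to be listed by hand...
picked_by_hand = ["myPool/documents", "myPool/media", "myPool/legal"]
# ...and a dataset created later is missed until someone updates that list.
flat.append("myPool/photos")
print("myPool/photos" in picked_by_hand)  # False

# Recursing from the top of the flat layout drags in VMs and temporary too.
print("myPool/VMs" in recursive_members(flat, "myPool"))  # True

# Pseudo-root: one recursive target, and later additions come along for free.
nested.append("myPool/personal/photos")
for name in recursive_members(nested, "myPool/personal"):
    print(name)
```

The last loop prints "myPool/personal" and everything below it, including the new "photos" dataset, and nothing from "other".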
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
As I understand it, a Pool is one or more disks in a Data VDev (Virtual Device?) configuration. When creating a VDev, TrueNAS automatically creates a Dataset (a filesystem, ZFS?) called Root Dataset.
Let's keep it simple.

Drives go into vdevs (that's where redundancy is created).
One or more vdevs go into a pool (always as a stripe: this is where performance is defined).
This is the physical layer, managed by zpool — but keep it with the GUI as much as possible.

Datasets are how data is organised; one may think of datasets as partitions of the raw storage space of the pool.
This is the logical layer, managed by zfs — again, use the GUI, avoid making changes from the command line.
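
For the physical layer, a toy Python model may help. It is a rough sketch only: the capacity figures ignore all ZFS overhead, and the `Vdev`, `Pool` and `tolerated_failures` names are made up for illustration. It shows that redundancy is decided inside each vdev, while the pool adds the vdevs together and is lost if any single vdev is lost.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vdev:
    kind: str               # "mirror", "raidz1", "raidz2" or "raidz3"
    disks_tb: List[float]   # size of each member disk, in TB

    def tolerated_failures(self) -> int:
        """How many member disks can fail without losing this vdev."""
        if self.kind == "mirror":
            return len(self.disks_tb) - 1
        return int(self.kind[-1])  # raidz1 -> 1, raidz2 -> 2, raidz3 -> 3

    def usable_tb(self) -> float:
        """Rough usable space; every member counts as the smallest disk."""
        smallest = min(self.disks_tb)
        if self.kind == "mirror":
            return smallest
        return smallest * (len(self.disks_tb) - self.tolerated_failures())


@dataclass
class Pool:
    vdevs: List[Vdev]

    def usable_tb(self) -> float:
        """Vdevs are striped together, so their space adds up."""
        return sum(v.usable_tb() for v in self.vdevs)

    def survives(self, failed_per_vdev: List[int]) -> bool:
        """The pool survives only if every vdev survives."""
        return all(f <= v.tolerated_failures()
                   for v, f in zip(self.vdevs, failed_per_vdev))


# Two 2-way mirrors of 4 TB disks, striped into one pool.
pool = Pool(vdevs=[Vdev("mirror", [4.0, 4.0]), Vdev("mirror", [4.0, 4.0])])
print(pool.usable_tb())       # 8.0
print(pool.survives([1, 1]))  # True: one disk lost in each mirror
print(pool.survives([2, 0]))  # False: one whole mirror lost, so the pool is lost
```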

Can I use this for data or do I need to create my own Dataset or Zvol on top of the Pool?
Always create your own datasets and do not store any data in the top dataset.
 