First ZPool advice

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
Hi guys,

I'm looking for some advice on a configuration for my first zPool. The system this relates to is here: https://www.truenas.com/community/t...-on-feasibility-of-first-treunas-setup.91526/

I separated this out since I have a very specific question: zPools and VDEVs.

My current understanding (see the sketch after this list):
- Any zPool consists of 1 or more VDEVs
- Redundancy is handled at the VDEV level
- The VDEVs are striped together to form the zPool
- A VDEV that fails takes the whole zPool down, with no hope of recovery
- A VDEV has the write performance of a single disk
- A zPool's write speed depends on the number of VDEVs
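
Or, to put my understanding in command form (a sketch only; pool and disk names are just placeholders):

# one pool built from two RAIDZ2 VDEVs; writes are striped across both
zpool create tank \
    raidz2 sda sdb sdc sdd sde sdf \
    raidz2 sdg sdh sdi sdj sdk sdl

# should list both VDEVs and their member disks
zpool status tank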

Usage for this zpool:
- Because of the limitations of ZFS, I will be using this to house completed series and movies. Once the data is there, it should stay put and not be moved around, so little fragmentation should occur

I'll be getting 12x3TB tomorrow and am looking into the following configurations (rough capacity math after the list):
- 1x RAIDZ3 of 11 disks + 1 hot spare: simple, straightforward 3-disk redundancy, bad efficiency
- 2x RAIDZ2 of 6 disks each: better (write) speed, 2 to 4 disks of redundancy depending on how the failures fall, and even worse efficiency since I should keep one more disk lying around to rebuild
- 2x RAIDZ1 of 5 disks each + 2 hot spares: same write speed as the RAIDZ2 option but worse redundancy; efficiency is about the same
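
(If I have the parity math right, all three options land on the same raw capacity: the 11-disk RAIDZ3 leaves 8 data disks, 2x 6-disk RAIDZ2 leaves 2x4 = 8, and 2x 5-disk RAIDZ1 leaves 2x4 = 8, so 8 x 3TB = 24TB raw in every case. The real difference is the redundancy level and how many disks sit idle as spares.)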

So now for my questions:
- Having multiple VDEVs in a pool seems risky to me, especially at lower RAIDZ levels
- What configuration would you advise for this use case? The data will not be backed up; losing the array is an inconvenience, but nothing mission-critical
- Is there maybe a better configuration?

After this I'll add these zPools:
- A single disk for downloads: no redundancy, it's not meant to store data
- 2x SATA SSDs in a mirror for VMs and such
- 2x M.2 SSDs for later use
- A zPool where series that aren't complete will go
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Why are you building your first zPool on the ALPHA software TrueNAS SCALE (as you posted this in the SCALE forum section)?

SCALE is primarily meant for testing; I would advise against using it as your first ZFS learning platform.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Having multiple VDEVs in a pool seems risky to me, especially at lower RAIDZ levels
You manage risk at the VDEV level, so I don't see what you mean.

A VDEV has the write performance of a single disk
In terms of IOPS, yes, but throughput can be much more than a single disk (dependent on transaction group size and fragmentation of the pool/vdev).

even worse efficiency since I should keep one more disk lying around to rebuild
Why are you considering hot spares? They are essentially wasted disks in a RAIDZ pool and they would usually be spinning for nothing if you can physically access the box in enough time to do a replacement when SMART indicates it's needed.

2x RAIDZ2 of 6 disks each: better (write) speed
You will potentially have more IOPS, but the speed may be the same or worse than 1 VDEV.

I will be using this to house completed series and movies.
Why would you need anything more than a single RAIDZ2 VDEV for this use case?


As a general comment, don't confuse the two components of "speed" with each other... IOPS and throughput are different and the limitations of RAIDZ are around IOPS, not throughput.
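
If you want to see the two components separately on your own pool, something like fio will show it (a sketch only; the dataset path, sizes and runtime are placeholders):

# small random writes: the IOPS side, which is where RAIDZ is limited
fio --name=iops --directory=/mnt/tank/fiotest --rw=randwrite --bs=4k \
    --size=1g --runtime=60 --time_based --ioengine=posixaio --iodepth=16

# large sequential writes: the throughput side, which RAIDZ handles well
fio --name=throughput --directory=/mnt/tank/fiotest --rw=write --bs=1m \
    --size=4g --runtime=60 --time_based --ioengine=posixaio --iodepth=16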
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
Why are you building your first zPool on the ALPHA software TrueNAS SCALE (as you posted this in the SCALE forum section)?

SCALE is primarily meant for testing; I would advise against using it as your first ZFS learning platform.

Hi, to start off, this will be running in parallel to my existing NAS for testing. The reason I wanted to start with TrueNAS SCALE is that it fits my use case much better!

I need/want to be able to run docker and kubernetes. Now on TrueNAS CORE I could do this by running a Linux VM.

From what I read on the TrueNAS site, the plan is to have it release-ready this year. I'll probably be in the testing phase for that long anyway.
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
@sretalla: thanks for the reply!

You manage risk at the VDEV level, so I don't see what you mean.
If a single VDEV fails, the whole zPool fails, right? So you make the data on one VDEV reliant on the data on another.

In terms of IOPS, yes, but throughput can be much more than a single disk (dependent on transaction group size and fragmentation of the pool/vdev).
Hmm, yeah, I get what you're saying. The difficulty is that it's very hard to find any concrete data on performance comparisons.

Why are you considering hot spares? They are essentially wasted disks in a RAIDZ pool and they would usually be spinning for nothing if you can physically access the box in enough time to do a replacement when SMART indicates it's needed.
Hehe, well, mostly out of habit :P Good point though! Can you configure alarms on the SMART values?

Why would you need anything more than a single RAIDZ2 VDEV for this use case?
Well, because from what I read you want to avoid large VDEVs. Also performance; however, since it's mostly sequential and write-once, that's not really an important metric. I just have little patience :P

As a general comment, don't confuse the two components of "speed" with each other... IOPS and throughput are different and the limitations of RAIDZ are around IOPS, not throughput.
Yeah, of course. To be honest, it's more a case of "it'd be cool if" :P So I'm going to be realistic and drop any write speed requirements. Anyway, I'm limited to 1GbE atm.

So I keep reading that the number of disks is key. Could and should I create a 12x3TB RAIDZ2 VDEV? That would give me at least 22TB of usable space (counting the 80% rule due to fragmentation; however, for my use case I imagine I could get to 95% without too many issues).
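
(For the record, my back-of-the-envelope math: 12 disks minus 2 parity leaves 10 data disks x 3TB = 30TB raw, which is roughly 27.3TiB; 80% of that is about 21.8TiB, hence the 22TB.)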
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
If a single VDEV fails, the whole zPool fails, right? So you make the data on one VDEV reliant on the data on another.
Yes, but, for example, in a RAIDZ2 pool with 2 VDEVs, you would expect to be able to lose 2 disks without data loss (and you can... actually as many as 4 if you're lucky with the distribution of the failed disks between VDEVs).

That comes at a cost of an additional 2 disks lost to redundancy for each VDEV, but that's the risk level you chose, so you should be (or need to become) fine with that.

I don't see how you can call it risky when the pool is managing risk at least as well as (and sometimes better than) you planned.

In the case of a single VDEV pool of RAIDZ2 you have a 100% chance of pool loss with 3 disks gone... with a 2 VDEV RAIDZ2 pool, there's something like a 60-70% chance the pool can survive a 3rd disk loss and maybe 40% that it can survive a 4th loss.

Let me make those values precise... (note that these use the smallest possible RAIDZ2 VDEVs, which is actually the friendliest case: with wider VDEVs, more of the remaining disks sit in the degraded VDEV, so the odds shift against the pool)

8 disks, 2 VDEVs of RAIDZ2, 4 disks each.

The first 2 failures happen in VDEV1, leaving 2 disks in VDEV1, and VDEV1 is now at risk of causing pool loss...

The next disk to fail has a 2-in-6 chance of landing in VDEV1, so that's a 33% chance of pool loss (with wider VDEVs, more of the remaining disks would be in VDEV1, so the odds would be worse), meaning a 67% chance of survival.

That leaves 5 disks, 2 in VDEV1 and 3 in VDEV2, so there's a 2-in-5 chance the next lost disk is in VDEV1; that's 40%, so a 60% chance of survival. (You could say 60% of 67% to get to this point, so about 40%.)

If you want to have the data security/insecurity offered by UNRAID, you'll need to switch NAS products

Can you configure alarms on the SMART values?
Not sure if that's actually working right now in SCALE, but it will eventually. You need to set up an email address for the root account and set up the alerts option.
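
In the meantime you can always query SMART by hand from a shell (the device name is a placeholder):

# health summary plus the full attribute table
smartctl -a /dev/sda

# kick off a short self-test right away
smartctl -t short /dev/sda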

Could and should I create a 12x3TB RAIDZ2 VDEV?
12 wide is generally considered the limit of reasonable VDEV width, so you may want to consider staying away from it, but the ultimate choice is yours. Resilver and scrub times increase with VDEV width, so that's your only real concern with going to 12.

You'll lose 2 additional disks of capacity to do 2 VDEVs, so that's the downside with that option... gain of double the IOPS for the pool though, so it's not a total loss... even though it sounds like IOPS won't matter much for you.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
I need/want to be able to run docker and kubernetes. Now on TrueNAS CORE I could do this by running a Linux VM.
Please be aware that using docker directly is not officially supported.

The rest of your reasoning is pretty solid though :)
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105

Yes, but, for example, in a RAIDZ2 pool with 2 VDEVs, you would expect to be able to lose 2 disks without data loss (and you can... actually as many as 4 if you're lucky with the distribution of the failed disks between VDEVs).

That comes at a cost of an additional 2 disks lost to redundancy for each VDEV, but that's the risk level you chose, so you should be (or need to become) fine with that.

I don't see how you can call it risky when the pool is managing risk at least as well as (and sometimes better than) you planned.

In the case of a single VDEV pool of RAIDZ2 you have a 100% chance of pool loss with 3 disks gone... with a 2 VDEV RAIDZ2 pool, there's something like a 60-70% chance the pool can survive a 3rd disk loss and maybe 40% that it can survive a 4th loss.

Let me make those values precise... (note that these use the smallest possible RAIDZ2 VDEVs, which is actually the friendliest case: with wider VDEVs, more of the remaining disks sit in the degraded VDEV, so the odds shift against the pool)

8 disks, 2 VDEVs of RAIDZ2, 4 disks each.

The first 2 failures happen in VDEV1, leaving 2 disks in VDEV1, and VDEV1 is now at risk of causing pool loss...

The next disk to fail has a 2-in-6 chance of landing in VDEV1, so that's a 33% chance of pool loss (with wider VDEVs, more of the remaining disks would be in VDEV1, so the odds would be worse), meaning a 67% chance of survival.

That leaves 5 disks, 2 in VDEV1 and 3 in VDEV2, so there's a 2-in-5 chance the next lost disk is in VDEV1; that's 40%, so a 60% chance of survival. (You could say 60% of 67% to get to this point, so about 40%.)
Sorry, what I meant is 1x RAIDZ3 vs 2x RAIDZ2.

12 wide is generally considered the limit of reasonable VDEV width, so you may want to consider staying away from it, but the ultimate choice is yours. Resilver and scrub times increase with VDEV width, so that's your only real concern with going to 12.

You'll lose 2 additional disks of capacity to do 2 VDEVs, so that's the downside with that option... gain of double the IOPS for the pool though, so it's not a total loss... even though it sounds like IOPS won't matter much for you.
Yeah, that's why I'm really debating doing the 12-disk RAIDZ2, since that would be the most cost-efficient. Also, I currently run RAID5 on a Synology; it's better than that anyway ;)

Regarding resilver times: I really don't care. I haven't had many disks die on me, and when they do, I usually know long before they actually fail.
 

beagle

Explorer
Joined
Jun 15, 2020
Messages
91
Bear in mind future expansion: with 2x 6-disk RAIDZ2 you could grow the pool by adding another 6-disk vdev (as sketched below), whilst with a single 12-disk RAIDZ2 you would need to add another 12-disk vdev, or replace all 12 disks with larger ones.

Disclaimer: although you could expand your pool by adding vdevs with different configurations (e.g. mirrored pairs), that is not recommended.
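
A rough sketch of what that looks like (pool and device names are placeholders):

# pool starts life as a single 6-disk RAIDZ2 vdev
zpool create tank raidz2 sda sdb sdc sdd sde sdf

# later: stripe in a second 6-disk RAIDZ2 vdev to grow the pool
zpool add tank raidz2 sdg sdh sdi sdj sdk sdl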
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Really? The feature and roadmap infographics give me the idea they are: https://www.truenas.com/truenas-scale/

It would be disappointing if the infographics represent incorrect info :(
Yes, really.
That ribbon in the infographic displays underlying technologies.
It does not say "native docker" or "native k8s" or "native kvm" support; it tries to show which underlying technologies SCALE uses.
None of those technologies is officially supported for direct access by the end user.
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
Yes, really.
That ribbon in the infographic displays underlying technologies.
It does not say "native docker" or "native k8s" or "native kvm" support; it tries to show which underlying technologies SCALE uses.
None of those technologies is officially supported for direct access by the end user.
First off, having "native" support would be really weird. A piece of software runs natively if it either runs bare-metal, customized for the platform, or interfaces directly with the OS using dedicated calls for that hardware, so without any emulation or translation in between. That these aren't running natively is a given.

If what you say is true, then the TrueNAS website is very misleading!

Also, looking at the sheet(s), it's neither written nor formulated this way. Could you please add some references that support this? And also a reference to what the products actually do support, which is a weird thing to have to ask, but hey, here we are :P
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
First off, having "native" support would be really weird. A piece of software runs natively if it either runs bare-metal, customized for the platform, or interfaces directly with the OS using dedicated calls for that hardware, so without any emulation or translation in between. That these aren't running natively is a given.
Well, "native" isn't really the best choice of words indeed.
Let's call it "direct": direct docker, helm, k8s or kvm access is not supported.

I'll try to keep that in mind next time, because native indeed means something else entirely.

If what you say is true, then the TrueNAS website is very misleading!
Yes. Do you want a list of it...
That's not even a joke; remember the "hyperconvergence" selling point? That's also not fully true. The first release does not look to have k8s clustering, nor VM migration, features one expects from hyperconverged solutions. While "technically" still hyperconverged, that's stretching the truth.

Also, looking at the sheet(s), it's neither written nor formulated this way.
Yeah. Not my way of communicating features either.

Could you please add some references that support this?
The general rule with TrueNAS:
- If it's not in the GUI, it's not officially supported
- If it's not in the API, it's not officially supported either

In the case of docker(-compose) it's also all over the forums, pretty much everywhere you look related to SCALE, though the search function is (sadly enough) a broken piece of garbage. Mostly it's @morganL advising people to use kompose (which builds half-broken helm charts from docker-compose).
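
For reference, the suggested workflow is roughly this (from memory, so double-check the flags against the kompose docs):

# emit plain k8s manifests from a compose file
kompose convert -f docker-compose.yml

# or ask for a Helm chart instead
kompose convert -f docker-compose.yml -c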

And also a reference to what the products actually do support, which is a weird thing to have to ask, but hey, here we are :P
Well, it's still ALPHA (soon to be BETA), so not everything is fleshed out.
In short, I think it's best described as:
The same as CORE, except:
- Added Gluster support (Using TrueCommand 2.0 or API)
- Plugins/Jails replaced by SCALE Apps/k8s (via GUI or API)
- KVM instead of bhyve (via GUI or API)

It's all not that much different, luckily :)
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
Well, "native" isn't really the best choice of words indeed.
Let's call it "direct": direct docker, helm, k8s or kvm access is not supported.

I'll try to keep that in mind next time, because native indeed means something else entirely.
Hehe, no worries. I understood what you meant.

Also, I do think I saw the option to create VMs, so I think KVM is usable by the user. My current plan is to just run a VM, which in turn runs docker and kubernetes. That's fine, since nothing taxing will be running there and most of my kubernetes platform will run on three RPis.

Yes. Do you want a list of it...
That's not even a joke; remember the "hyperconvergence" selling point? That's also not fully true. The first release does not look to have k8s clustering, nor VM migration, features one expects from hyperconverged solutions. While "technically" still hyperconverged, that's stretching the truth.
Well, I mean, never ask a guy like me if he wants a list :P It's OK though, I think your time might be better spent!

On the Kubernetes front, from what I read it'll have k3s, not k8s. Hyperconverged is one of those sales buzzwords I don't even read, to be honest, since there is no agreed-upon meaning and/or definition. It's just something to put on your cake to make it more attractive!

Yeah. Not my way of communicating features either.

The general rule with TrueNAS:
- If it's not in the GUI, it's not officially supported
- If it's not in the API, it's not officially supported either
Where have the good old days gone, when we could nail them to a board in the town center and have people throw rotten fruit at them ;)

In the case of docker(-compose) it's also all over the forums, pretty much everywhere you look related to SCALE, though the search function is (sadly enough) a broken piece of garbage. Mostly it's @morganL advising people to use kompose (which builds half-broken helm charts from docker-compose).
Well, yeah, I mean docker-compose and helm are, well, not compatible. Helm is a sort of docker-compose for Kubernetes. I have never seen a tool that successfully converts between them.

Well, it's still ALPHA (soon to be BETA), so not everything is fleshed out.
In short, I think it's best described as:
The same as CORE, except:
- Added Gluster support (Using TrueCommand 2.0 or API)
- Plugins/Jails replaced by SCALE Apps/k8s (via GUI or API)
- KVM instead of bhyve (via GUI or API)

It's all not that much different, luckily :)
Well, that was also one of the reasons I thought: why not just start off with SCALE? I do understand the risk, though, and appreciate you pointing it out to me!
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Also, I do think I saw the option to create VMs, so I think KVM is usable by the user. My current plan is to just run a VM, which in turn runs docker and kubernetes. That's fine, since nothing taxing will be running there and most of my kubernetes platform will run on three RPis.
Yes, using the API or GUI, that is. Which is an abstracted form of using KVM ;-)

Well, I mean, never ask a guy like me if he wants a list :P It's OK though, I think your time might be better spent!
Yeah, I think making lists of PR donkey-poo from PR articles is going to be another day job ;-)

On the Kubernetes front, from what I read it'll have k3s, not k8s.
k3s is a k8s distribution; it's still "Kubernetes" as far as most discussions here are concerned. I stopped using k3s, because users can hardly find any good documentation on it; they need to use the native k8s docs.

Hyperconverged is one of those sales buzzwords I don't even read, to be honest, since there is no agreed-upon meaning and/or definition. It's just something to put on your cake to make it more attractive!
Yeah... "Think runs docker and haz harddrive" basically.

Where have the good old days gone, when we could nail them to a board in the town center and have people throw rotten fruit at them ;)
Don't we do that all the time here at the forums? ;-)

Well, yeah, I mean docker-compose and helm are, well, not compatible. Helm is a sort of docker-compose for Kubernetes. I have never seen a tool that successfully converts between them.
The primary issue is that the goals differ.
Docker-compose tries to just be a "values.yaml" equivalent, from a Helm perspective, while Helm works more like a templating engine that crafts deployments based on a values.yaml and allows the contents to be customised.
Though the work on creating k8s objects from docker-compose style files is very interesting too :)
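
Roughly, the difference in use looks like this (made-up chart name and values, just to show the shape of each workflow):

# compose: the file itself is the complete deployment definition
docker-compose -f docker-compose.yml up -d

# helm: templates (full of {{ .Values.image }}-style placeholders) get
# rendered into k8s objects, with values.yaml entries as the knobs
helm template myapp ./mychart --set image=nginx:1.21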


Well that was also one of the reasons I thought why not just start off with SCALE. I do understand the risk though and appreciate you pointing it out to me!
Being risk-aware is fine; as long as someone is risk-aware, they tend not to blame other people when things break :)
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
Yes, using the API or GUI, that is. Which is an abstracted form of using KVM ;-)
Well, yeah. That's actually why I chose a product like this and didn't just go with a Linux distro. I do understand this comes with limitations, and circumventing those limitations is not a good idea.

k3s is a k8s distribution; it's still "Kubernetes" as far as most discussions here are concerned. I stopped using k3s, because users can hardly find any good documentation on it; they need to use the native k8s docs.
Ah, yeah, I understand; for all intents and purposes it's the same in this case. I just wanted to point it out, since it does come with its own limitations.

Don't we do that all the time here at the forums? ;-)
Yeah, I know; putting all that frustration into the first throw is just not the same as writing a post, which the receiving party often doesn't understand anyway...

Though the work on creating k8s objects from docker-compose style files is very interesting too :)
Well, yeah, I actually had to do this recently. Thing is, those are completely different things.

Being risk-aware is fine; as long as someone is risk-aware, they tend not to blame other people when things break :)
Hey, hey, I'll blame whomever I want for my inadequacies. You hear me :P
 