60 x 14TB Drive TrueNAS backup solution

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27
I would not use a 1U rack server, as it would likely not have enough PCIe slots. You more or less want a separate 12Gbps SAS 8e card for each 60-disk external JBOD, so plan ahead for PCIe slots. And if your system board does not have enough Ethernet performance, you will want a PCIe slot for an Ethernet card or two as well.

A 2U server might suffice, depending on its base I/O and the number and speed of its PCIe slots.


And I agree with others, TrueNAS Core would likely serve you better. Unless you need some special local app that is supplied in SCALE via its Apps, Core will likely be more stable and probably slightly faster. TrueNAS SCALE is just too much of a work in progress and will likely remain so for another year.


On the subject of dRAID, there is a quirk in dRAID that involves integrated hot spares. Well, not so much a quirk as its main feature. But with 60 disks split across multiple stripe groups plus hot spares, less is known about how to configure dRAID for different uses compared to RAID-Zx.

And questions about dRAID remain (at least for me), such as: can dRAID have non-integrated hot spares?
Is it best to split 60 drives per card? The setup will consist of two JBODs with 60 drives in each. Would it be ideal to daisy-chain the second JBOD or add it to a second SAS card? The server we have has a total of 6 PCIe x8 and x16 slots. Keep in mind, the second set of 60 drives will come later, once the 500TB of data is copied to the initial 60-bay JBOD.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
The only event where I lost data stored in ZFS was my first backup system, built years ago with FreeBSD 8. I had three 12-disk JBODs daisy-chained to a 1U server and built a single pool across all three enclosures.

Although all JBODs had redundant power supplies, we did not properly monitor them and eventually one of the enclosures just lost power.
The pool was lost. Fortunately this was a central backup system for our data centre and nobody needed a restore at that time. So I could rebuild from scratch and no customer ever noticed.

What I did since then was think about the "blast radius". So if possible I would create separate pools for each enclosure. Should the power/HBA/cabling/whatever fail, the pool is offline but not destroyed. Fix the problem, re-import, fine.

The downside is some manual management of the distribution of your data across the pools.
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27
Although all JBODs had redundant power supplies, we did not properly monitor them and eventually one of the enclosures just lost power.
The pool was lost.

This scares me. If I have two JBODs and one of them completely goes offline due to a backplane issue, and I obtain another unit to replace it, does this mean I have lost all my data?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
This scares me. If I have two JBODs and one of them completely goes offline due to a backplane issue, and I obtain another unit to replace it, does this mean I have lost all my data?
This depends on the architecture of your pool(s). Explaining that was the intention of my post.

Generally a pool is built from VDEVs and redundancy is provided at the VDEV level. Lose a VDEV and the pool is gone.

If you build a separate pool per enclosure, a failure of the "metal" will only mean that the pool goes offline. The data on the disks will most probably be fine, and after repair, re-import, and a scrub, all will be well. It is just like pulling the power on a single-server system: you may lose some in-flight data, but generally ZFS is very robust in these scenarios.

If you build a single pool from multiple VDEVs, with each VDEV wholly contained in one enclosure, so for example:

Enclosure 1: 6x 10-disk-wide RAIDZ2
Enclosure 2: same

If one enclosure goes south the pool is toast. ZFS does not tolerate the loss of a VDEV. Again, redundancy is provided at the VDEV level.
You might be lucky and ZFS "notices" the VDEV loss fast enough and offlines the pool. So maybe you can re-import after repair, but there's no guarantee.

Alternatively, picture a pool built from mirror VDEVs, with each mirror pair made up of one disk from each enclosure. If one enclosure fails, the data is still there and live, but without any redundancy. Lose one more drive and the pool is gone.

One could build a pool from 3-way mirrors with 3 enclosures. Of course one would want three separate HBAs for that.
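
If it helps to picture the difference, here is a toy sketch (hypothetical layouts, purely illustrative, not a recommendation): a VDEV survives the loss of an enclosure only if it does not lose more members than its parity level, and the pool survives only if every VDEV survives.

```python
# Toy "blast radius" model, purely illustrative: a vdev survives if the
# number of its members lost with the enclosure does not exceed its parity,
# and the pool survives only if every vdev survives.

def pool_survives(vdevs, failed_enclosure):
    for parity, member_enclosures in vdevs:
        lost = sum(1 for enc in member_enclosures if enc == failed_enclosure)
        if lost > parity:
            return False
    return True

# Layout A: one pool, 6x 10-wide RAIDZ2 per enclosure (each vdev inside one box)
layout_a = [(2, [1] * 10) for _ in range(6)] + [(2, [2] * 10) for _ in range(6)]

# Layout B: one pool of 2-way mirrors, one leg in each enclosure
layout_b = [(1, [1, 2]) for _ in range(60)]

print(pool_survives(layout_a, 1))  # False -> pool is toast
print(pool_survives(layout_b, 1))  # True  -> degraded, but alive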

I read from your posts that you intend to maximise net capacity, at the possible expense of performance, but generally want a "bullet-proof" system.

So I think going with one pool per enclosure is a good idea in your case.

HTH,
Patrick
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Is it best to split 60 drives per card? The setup will consist of two JBODs with 60 drives in each. Would it be ideal to daisy-chain the second JBOD or add it to a second SAS card? The server we have has a total of 6 PCIe x8 and x16 slots. Keep in mind, the second set of 60 drives will come later, once the 500TB of data is copied to the initial 60-bay JBOD.
In my opinion, yes, a separate 12Gbps SAS card with 8e ports for each enclosure would be best. Your enclosures are quite large, 60 disks, so you want more SAS lanes and higher throughput per lane (assuming your JBOD enclosure's SAS expander supports SAS 3's 12Gbps per-lane speed).

Daisy-chaining works better with a smaller number of disks in the upstream enclosure, for example HBA <-> 12-disk JBOD <-> 12-disk JBOD. But 60 behind 60 is too many, in my opinion.

And since you likely want low downtime, pre-planning & installing 2 x 12Gbps SAS cards with 8e ports, plus cables and rack space, is warranted. In theory, you can add the second 60-disk JBOD live to your NAS, since it would be on a dedicated SAS HBA.

There are some SAS 16e cards, which might be useful. But those generally need a full-height PCIe slot for the 4 x SAS connectors on the PCIe back panel.
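
As a rough sanity check on bandwidth (round numbers assumed, not measurements), here is the kind of back-of-the-envelope arithmetic behind that opinion; the 250 MB/s per-disk figure is just an assumption for a modern 14TB drive.

```python
# Back-of-the-envelope SAS bandwidth check (assumed round numbers, not benchmarks).
# SAS 3 runs 12 Gbit/s per lane with 8b/10b encoding, so ~1.2 GB/s usable per lane.

GB_S_PER_LANE = 12 * (8 / 10) / 8      # ~1.2 GB/s per SAS 3 lane
HDD_SEQ_MB_S = 250                     # optimistic sequential rate per HDD (assumption)

disks_per_jbod = 60
disk_aggregate = disks_per_jbod * HDD_SEQ_MB_S / 1000   # GB/s the disks could stream

print(f"60 disks streaming      : ~{disk_aggregate:.1f} GB/s")
print(f"one x4 SAS 3 uplink     : ~{4 * GB_S_PER_LANE:.1f} GB/s")
print(f"one 8e HBA (2 x4 ports) : ~{8 * GB_S_PER_LANE:.1f} GB/s")
```

Even a dedicated 8e card is the bottleneck for 60 streaming disks, and daisy-chaining the second 60-disk JBOD behind the same card would roughly halve what each enclosure gets during scrubs and resilvers.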


Do your enclosures support 2 x 4-lane SAS connectors?
Do they also support SAS 3's 12Gbps speed for their SAS expanders?


Failure modes depend on circumstances. As has been said, ZFS pool loss comes from losing a vDev. However, a loss of communication (HBA overheating, cable failure or JBOD power failure) does not generally result in ZFS pool loss. Your pool would go offline and you would have to fix the problem. ZFS was designed with this specific failure in mind.

The example where @Patrick M. Hausen lost his pool was probably due to unnoticed vDev redundancy loss before the enclosure lost power. That is the purpose of the e-mail alerts: to let you know of failures so you can deal with them.
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27

Do your enclosures support 2 x 4-lane SAS connectors?
I have to check on this one.

Do they also support SAS 3's 12Gbps speed for their SAS expanders?
The enclosures have 12G SAS ports.
 
Joined
Jul 3, 2015
Messages
926
Yeah, I can vouch for ZFS's ability to bounce back after having the rug pulled from underneath it.

I had a 90-bay chassis recently lose one of its three backplanes, resulting in the loss of 30 drives simultaneously. Although the pool wasn't happy, as you'd expect, once the hardware was fixed the pool re-imported, fixed its errors, and no data was lost. The pool had about 500TB of data on it, so that would have been annoying to lose.
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27

Hi Johnny,

In your case, with a 90-drive setup, how are the drives configured? Are they organized into one large RAID or split into multiple RAIDs, etc.? Thanks.
 
Joined
Jul 3, 2015
Messages
926
I use 6 x 15-disk Z3 vdevs, so one big pool. I run lots of these systems (20-plus) and have done so for about 7 years, and that works well for my use case, which is big file storage. I also have 4 drives in each server head as hot spares, which is overkill, but I'm a belt-and-braces kind of guy. All my systems replicate to another identical system in another DC.
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27
Thanks for the reply. Looks like I will be going with a similar setup of 15-drive vdevs, each RAIDZ2 or RAIDZ3.
 
Joined
Jul 3, 2015
Messages
926
Performance and rebuild times are your two main considerations. For me, I needed no more than enough to saturate one or two 10Gb connections at best, which this setup can do, ish (more like 15Gb). Secondly, rebuilds of a half-full pool, in my case 500TB, take about 5-6 days, so I'm guessing at full capacity (80% factor included) it will be about double that. I did run some 60-bay all-in-ones back in the day and used 6 x 10-disk Z2 for them. The more vdevs you have, the more performance.
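
As a rough sketch of where the 5-6 days comes from (the resilver rate below is just an assumption picked so the numbers line up with what I see, not a benchmark):

```python
# Rough resilver-time estimate: data that must be rewritten onto the
# replacement disk divided by a sustained resilver rate.  The 15 MB/s
# rate is an assumption chosen to line up with the 5-6 days above.

def resilver_days(pool_data_tb, data_disks, resilver_mb_s=15):
    tb_on_failed_disk = pool_data_tb / data_disks
    seconds = tb_on_failed_disk * 1e12 / (resilver_mb_s * 1e6)
    return seconds / 86400

# 6 x 15-wide Z3 = 72 data disks
print(f"half full (~500TB): ~{resilver_days(500, 72):.1f} days")
print(f"full     (~1000TB): ~{resilver_days(1000, 72):.1f} days")   # roughly double
```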
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27
Since you have more experience with large volumes, let me ask you this: If I set up 60 drives and copy 500+ TB of data to them, then add an additional 60 empty drives to expand the pool size, will this process take a lot of time? These will be blank drives added to the existing pool to increase its size. Will the data be moved across the drives, or will the newly added drives simply be used for storing new data as it is written?
 
Joined
Jul 3, 2015
Messages
926
It will happen instantly and no data will be moved. You would have what I call an unbalanced pool. All future writes will massively favour the empty vdevs until parity is restored across the pool.

PS: size parity, that is, not data parity.
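
Something like this simplified sketch shows what I mean; the real ZFS allocator is more involved, but the bias works out similarly, and the 520TiB/500TiB figures are just placeholders.

```python
# Simplified model of writes to an unbalanced pool: assume each write is
# split across vdevs in proportion to their free space.  (The real ZFS
# allocator is more involved, but the bias works out similarly.)

old_vdevs = {"size_tib": 520.0, "used_tib": 500.0}   # original shelf, nearly full
new_vdevs = {"size_tib": 520.0, "used_tib": 0.0}     # freshly added shelf, empty

def write(tib):
    free_old = old_vdevs["size_tib"] - old_vdevs["used_tib"]
    free_new = new_vdevs["size_tib"] - new_vdevs["used_tib"]
    total = free_old + free_new
    old_vdevs["used_tib"] += tib * free_old / total
    new_vdevs["used_tib"] += tib * free_new / total

write(100.0)   # the next 100 TiB of backups
print(f"old shelf took ~{old_vdevs['used_tib'] - 500:.1f} TiB")   # ~3.7 TiB
print(f"new shelf took ~{new_vdevs['used_tib']:.1f} TiB")         # ~96.3 TiB
```

Existing data stays where it is; only new writes drift the pool back toward balance over time.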
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27
Hello,


I received all the gear and will be putting it all together. I just wanted to ask if this configuration will work fine or if I should do it differently.
I have the chassis with 60 bays, and I will be installing 60 x 14TB drives. My plan is to create four vdevs, namely vdev01, vdev02, vdev03, and vdev04, using RAIDZ2 with 15 drives each. These vdevs will be added to the pool as one large volume, which I will share for storage purposes. Later on, I intend to set up another identical chassis with drives and add vdev05, vdev06, vdev07, and vdev08 to the original pool I created.
Please let me know your thoughts on this configuration.
 
Joined
Jul 3, 2015
Messages
926
The process sounds fine; I would just suggest you carefully consider the 15-disk Z2. I run 15-disk Z3, and I have no doubt there are members of the forum who wouldn't even go there, but you are pushing a bit further, so give it some thought. I assume you are trying to keep costs down and get as much space as possible?
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
With 60 disks, I'd use 5 x 12-disk RAID-Z2 vDevs. Even though it is only 3 disks fewer per vDev, it is safer for disk-failure replacement and gives you one more vDev to stripe data across, at the low price of 2 extra parity disks.

Please note, if you start having slowdowns because of the 15-disk RAID-Z2 vDevs, there is very little you can do. RAID-Zx vDev removal does not exist in OpenZFS. So you would be stuck building a replacement server, AND storage, to offload the work. Then you could "fix" the original server's too-wide RAID-Zx vDev.
 
Joined
Jul 3, 2015
Messages
926
Thinking about this, here are some of your options.

Option 1. 4 x 15-disk Z2 = 520TiB usable @ 80% (SLOWEST)

Option 2. 10 x 6-disk Z2 = 400TiB usable @ 80% (FASTEST)

Option 3. 5 x 12-disk Z2 = 500TiB usable @ 80%

Option 4. 6 x 10-disk Z2 = 480TiB usable @ 80%

When I look at it this way, I would choose either option 3 or 4 unless I needed more performance. I wouldn't pick option 1 in any scenario, as it loses performance and resilience and only gains 20TiB.
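
For reference, the rough arithmetic behind those numbers (assuming 14TB ≈ 12.7TiB per drive and an 80% fill target; real ZFS metadata and padding overhead shaves the figures down a little further):

```python
# Rough usable-capacity arithmetic for the options above.  Assumes
# 14 TB ~= 12.7 TiB per drive, 80% fill target, and ignores ZFS
# metadata/padding overhead (which trims the real numbers a little).

TIB_PER_DRIVE = 14e12 / 2**40    # ~12.7 TiB

def usable_tib(vdevs, width, parity, fill=0.8):
    return vdevs * (width - parity) * TIB_PER_DRIVE * fill

options = [
    ("Option 1: 4 x 15-disk Z2", 4, 15, 2),
    ("Option 2: 10 x 6-disk Z2", 10, 6, 2),
    ("Option 3: 5 x 12-disk Z2", 5, 12, 2),
    ("Option 4: 6 x 10-disk Z2", 6, 10, 2),
]
for label, v, w, p in options:
    print(f"{label}: ~{usable_tib(v, w, p):.0f} TiB usable")
```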
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27
Hello,

I have chosen the configuration of 5 x 12-disk RAID-Z3 vdevs. However, I'm wondering if using RAID-Z3 is necessary in this case. I have added the vdevs to the pool and shared it out. In terms of settings and options, I have disabled snapshots, SMART/TRIM, and compression. Are there any other settings or configurations that I should consider or need to set up? I want to clarify that this setup is exclusively for rsync storage and not intended for use as a media server or NAS for playing videos or similar purposes. It is solely for backup purposes.

Thanks,
 

Gagik

Dabbler
Joined
Feb 21, 2023
Messages
27
I think I forgot to ask: when you create the vdevs as RAID, are they instantly ready? The drives don't have to set up parity or anything like that? With RAID cards it takes time for the drives to bake and get the RAID set ready. Thanks.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
It can take ZFS a few seconds to update its on-disk configuration information when you add a RAID-Zx vDev (or any vDev, for that matter). But, as you say, it is NOT initializing the entire disk set. Any write will then go to empty space, with the parity information written as part of the write. Thus, there is no RAID-5/6 write hole in ZFS.

In the case of 5 x 12-disk RAID-Z3 vDevs versus using RAID-Z2, it is about risk acceptance and how much storage you can / want to dedicate to parity. YOU have to decide. We can only advise (like about a too-wide RAID-Zx vDev...).
 