Best practices for large amount of discs

Status
Not open for further replies.

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
I have a Norco case with 24 hot-swap bays. After reading cyberjock's guide, I want to make sure I understand the implications of my choices.

Originally, I started with 8 discs, then added another 8 discs and finished the case out with the last round of 8. I have each group of 8 running raidz1. Is this a stupid way to have done it? How is the performance impacted by doing it this way? Would it be better to do all 24 and then do raidz3? Is raidz3 enough if all 24 drives were 4TB each? Does anyone have data on what the best practices are for setting up a pool when using a large amount of discs?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
RAIDZ1 of 8 drives *is* stupid. 8 drives should be RAIDZ2 at the minimum. RAIDZ1 is not recommended around here because of the potential for URE errors causing a loss of the pool.
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
I have a Norco case with 24 hot-swap bays. After reading cyberjock's guide, I want to make sure I understand the implications of my choices.

Originally, I started with 8 discs, then added another 8 discs and finished the case out with the last round of 8. I have each group of 8 running raidz1. Is this a stupid way to have done it? How is the performance impacted by doing it this way? Would it be better to do all 24 and then do raidz3? Is raidz3 enough if all 24 drives were 4TB each? Does anyone have data on what the best practices are for setting up a pool when using a large amount of discs?
I do not consider myself an expert, but allow me to pass on what I have picked up by reading here for the past two years or so.

I agree with CyberJock that RAIDZ1 is asking for trouble: what happens if a second drive fails while the replacement for the first failed drive is still being resilvered?

The best performance is achieved by using 2^x + y drives, where x is 1, 2 or 3 and y is the number of extra/parity drives. So, for example, a RAIDZ3 setup of 7 drives (2^2 + 3) or a RAIDZ2 setup of 10 drives (2^3 + 2). So if I had 24 drives available, I would probably configure them as two sets of 11 drives each in RAIDZ3 (2^3 + 3) and keep two drives as spares.
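
For anyone who wants to play with that rule of thumb, here is a minimal Python sketch. It only encodes the 2^x + y guideline as stated in this post; the function names are made up for illustration, and it says nothing about how ZFS itself behaves.

```python
def rule_of_thumb_widths(parity, max_exponent=3):
    """Vdev widths of the form 2**x + parity for x = 1..max_exponent."""
    return [2 ** x + parity for x in range(1, max_exponent + 1)]


def plan(total_drives, width):
    """Fit as many vdevs of `width` as possible; leftover drives become spares."""
    vdevs, spares = divmod(total_drives, width)
    return vdevs, spares


for parity, name in ((1, "RAIDZ1"), (2, "RAIDZ2"), (3, "RAIDZ3")):
    print(name, rule_of_thumb_widths(parity))

# 24 bays filled with 11-wide RAIDZ3 vdevs (2^3 + 3)
print(plan(24, 11))  # (2, 2): two vdevs, two spares
```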
 

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
So expanding the pool by adding 8 drives at a time is not reckless? As long as I am using RAIDZ2 for each vdev? Am I right to assume that each vdev would have its own parity drive in that situation?
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
So expanding the pool by adding 8 drives at a time is not reckless? As long as I am using RAIDZ2 for each vdev? Am I right to assume that each vdev would have its own parity drive in that situation?
You mean two parity drives, I assume? I don't think it's reckless from a data security point of view, but such a configuration does not comply with the recommendations for optimal performance; I don't know what the performance disadvantage would be.
 

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
What are the recommendations for optimal performance? Sounds like adding vdevs to expand the pool is not a good idea?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
What are the recommendations for optimal performance? Sounds like adding vdevs to expand the pool is not a good idea?

He said it in post 3...
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
...So, for example, a RAIDZ2 setup of 7 drives (2^2 + 3) or ...

This should say a "RAIDZ3 setup of 7 drives". The math is correct (Z300M indicates 3 parity drives), but it was labeled wrong. Typo, I'm assuming.
 

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
So if I had 24 drives available, I would probably configure them as two sets of 11 drives each in RAIDZ3 (2^3 + 3) and keep two drives as spares.

That just seems like such a waste of drives and space. I would literally have 8 drives for use out of 12. Are you going with business needs in mind or a server with a bunch of movies and music being streamed to various devices every now and then? I understand that it's my data and I'm ultimately responsible for how well I protect it so I'll probably do the RAIDZ3 but man, this is getting pricey. However, I am using a product that is next gen and would help prevent the kinds of issues I would face if I had stuck with nix and LVM or some other "solution".
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
That just seems like such a waste of drives and space. I would literally have 8 drives for use out of 12. Are you going with business needs in mind or a server with a bunch of movies and music being streamed to various devices every now and then? I understand that it's my data and I'm ultimately responsible for how well I protect it so I'll probably do the RAIDZ3 but man, this is getting pricey. However, I am using a product that is next gen and would help prevent the kinds of issues I would face if I had stuck with nix and LVM or some other "solution".

If all you are doing is home stuff, then I wouldn't worry about performance. If your use case is really "every now and then", the difference between optimal performance and suboptimal performance will barely register. But you literally did ask for "optimal performance," and that often doesn't correspond to optimum space use.

Assuming your use case is similar to mine (one or two users, with one or two simultaneous streams at worst), I'd do 3x 8-disk RAIDZ2. Then you can do 8 disks at a time. There's no problem adding vdevs to pools.
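
To put rough numbers on the space trade-off between the layouts mentioned so far, here is a small Python sketch. It just counts data drives (width minus parity, per vdev); the 4 TB drive size is the hypothetical from the first post, and real usable space will be lower once ZFS overhead and TB vs TiB are accounted for.

```python
DRIVE_TB = 4  # assumption: the 4 TB hypothetical from the first post


def data_drives(vdevs, width, parity):
    """Number of drives' worth of data capacity across all vdevs."""
    return vdevs * (width - parity)


# (vdevs, drives per vdev, parity drives per vdev)
layouts = {
    "3 x 8-disk RAIDZ1": (3, 8, 1),
    "3 x 8-disk RAIDZ2": (3, 8, 2),
    "2 x 11-disk RAIDZ3 + 2 spares": (2, 11, 3),
    "1 x 24-disk RAIDZ3": (1, 24, 3),
}

for name, (vdevs, width, parity) in layouts.items():
    d = data_drives(vdevs, width, parity)
    print(f"{name}: {d} data drives, roughly {d * DRIVE_TB} TB before overhead")
```

By this count, 3x 8-disk RAIDZ2 gives 18 data drives against 16 for two 11-disk RAIDZ3 vdevs plus spares.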
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
This should say a "RAIDZ3 setup of 7 drives". The math is correct (Z300M indicates 3 parity drives), but it was labeled wrong. Typo, I'm assuming.
Thanks for the correction. Yes, a typo, which I've now corrected.
 

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
If all you are doing is home stuff, then I wouldn't worry about performance. If your use case is really "every now and then", the difference between optimal performance and suboptimal performance will barely register. But you literally did ask for "optimal performance," and that often doesn't correspond to optimum space use.

Assuming your use case is similar to mine (one or two users, with one or two simultaneous streams at worst), I'd do 3x 8-disk RAIDZ2. Then you can do 8 disks at a time. There's no problem adding vdevs to pools.

Thank you. You are correct that I did ask for optimum, and that doesn't always mean more storage so much as protection of the data on that storage. I would still be hurting if I lost the data, so I do plan on getting old school with the latest tape backup devices, which will let me store large amounts and keep the tape count to a minimum.

Something so simple as a home media server being this involved is kinda nuts. However I have always been referred to as crazy and I have been having a blast with this so they must be right.
 

Z300M

Guru
Joined
Sep 9, 2011
Messages
882
That just seems like such a waste of drives and space. I would literally have 8 drives for use out of 12. Are you going with business needs in mind or a server with a bunch of movies and music being streamed to various devices every now and then? I understand that it's my data and I'm ultimately responsible for how well I protect it so I'll probably do the RAIDZ3 but man, this is getting pricey. However, I am using a product that is next gen and would help prevent the kinds of issues I would face if I had stuck with nix and LVM or some other "solution".
My response indicating how I would configure 24 drives if I had them was simply taking the number of drives you already have (or are proposing to use). If I did plan a setup that required 16 drives plus parity, I probably would use two RAIDZ2 pools each of ten drives (2^3 + 2), a total of 20 drives. Plus maybe one or two spares.

You are the one suggesting a total of 24 drives. How did you arrive at this magic number? I would not feel compelled to use 24 drives just because the case holds that many.
 

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
My response indicating how I would configure 24 drives if I had them was simply taking the number of drives you already have (or are proposing to use). If I did plan a setup that required 16 drives plus parity, I probably would use two RAIDZ2 pools each of ten drives (2^3 + 2), a total of 20 drives. Plus maybe one or two spares.

You are the one suggesting a total of 24 drives. How did you arrive at this magic number? I would not feel compelled to use 24 drives just because the case holds that many.

Sorry if it sounded like I was complaining. I went with a 24-bay case since I need to store at least 60TB of data and wanted a case that would let me expand as I go and that offers expansion cases for future proofing. That's where I got the 24 number from.
 

solarisguy

Guru
Joined
Apr 4, 2014
Messages
1,125
If you have time to test, create 4 pools of raidz2 with 6 disks each and compare their write performance to that of 3 pools of raidz2 with 8 disks each.

Also, with ZFS, it is recommended to never use more than 80% storage space in the dataset (filesystem), so your 60TB might not be attainable if you value your data.
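
A quick way to check the 80% guideline against the 60TB target is sketched below in Python. The 80% figure comes from this post and the 4 TB drive size from the first post's hypothetical; the function names are illustrative, and filesystem overhead and TB vs TiB are ignored.

```python
import math


def min_data_drives(needed_tb, drive_tb, max_fill_percent=80):
    """Data drives needed so that needed_tb stays within the fill limit."""
    pool_tb = needed_tb * 100 / max_fill_percent
    return math.ceil(pool_tb / drive_tb)


def comfortable_tb(data_drive_count, drive_tb, max_fill_percent=80):
    """Capacity you can fill before crossing the fill limit."""
    return data_drive_count * drive_tb * max_fill_percent / 100


print(min_data_drives(60, 4))  # 19 data drives of 4 TB for 60 TB at 80% full
print(comfortable_tb(18, 4))   # 18 data drives (3 x 8-disk RAIDZ2)
print(comfortable_tb(16, 4))   # 16 data drives (2 x 11-disk RAIDZ3)
```

With 4 TB drives, neither 18 nor 16 data drives reaches 60TB once the 80% limit is applied, which is the point being made above.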
 

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
If you have time to test, create 4 pools of raidz2 with 6 disks each and compare their write performance to that of 3 pools of raidz2 with 8 disks each.

Also, with ZFS, it is recommended to never use more than 80% storage space in the dataset (filesystem), so your 60TB might not be attainable if you value your data.


I have been doing testing with both setups and the raidz3 with 2 vdevs of 11 disks each seems to have better performance.

Last question: can I configure the 2 spare drives as hot spares that would automatically be put into use if one of the drives failed?
 

KempelofDoom

Explorer
Joined
Apr 11, 2014
Messages
72
I've been reading several posts and links about deduplication. Since I'm using an E5-2630V2 CPU, I wonder if I'll see much overhead. Does it dedupe blocks or files? Adding an SSD to this array for L2ARC to make up for the lack of RAM seems like a workaround. I have 128GB RAM now and am looking at a 50TB volume, so at 5GB of RAM per TB, a 250GB SSD should suffice? Or will enabling dedupe really impact performance?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
No.

1. Dedup works on the block level.
2. SSDs do NOT make up for a lack of RAM. They never have and won't without recoding ZFS.
3. 50TB will be ungodly slow to dedup. As the DDT gets bigger it becomes a larger and larger drag for every single block of data being written. Even the fastest CPUs can be crippled by large DDTs.
4. Dedup works great for small sets of data that have huge numbers of duplicates. That's ALL. It is not and will *never* be a cost-effective way to get more bang for your buck with regards to disk space aside from that. RAM is $80 for 8GB, and you can buy 2000GB of disk space for that kind of money. You plan to see a 250 fold savings on disk space? If not, it's not cost effective.

So no, give up on dedup. On TrueNAS boxes it doesn't even come with the option of enabling it because it's so incredibly useless. If you buy a TrueNAS box and you want dedup we have to specifically enable it for you, and we only do that if you can make a case for it being a win-win for your setup. I haven't seen a box yet that has it enabled either. ;)
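
The arithmetic behind point 4 can be laid out in a few lines of Python. All figures are the ones quoted in this thread (the $80 prices above and the 5GB-of-RAM-per-TB rule of thumb from the question), not current prices or an official sizing formula, and the names are illustrative.

```python
RAM_GB_PER_80_USD = 8      # from the post above: $80 buys 8 GB of RAM
DISK_GB_PER_80_USD = 2000  # from the post above: $80 buys 2000 GB of disk


def break_even_ratio():
    """How many GB of disk the same money buys per GB of RAM."""
    return DISK_GB_PER_80_USD / RAM_GB_PER_80_USD


def ddt_ram_gb(pool_tb, ram_gb_per_tb=5):
    """RAM suggested by the 5 GB per TB rule of thumb quoted in the question."""
    return pool_tb * ram_gb_per_tb


print(break_even_ratio())  # 250.0, the "250 fold" figure
print(ddt_ram_gb(50))      # 250 GB for a 50 TB volume, vs. 128 GB installed
```

By that rule of thumb, the 50TB volume would want roughly twice the RAM the box has.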
 