cronsloth
Cadet
- Joined
- Jul 26, 2017
- Messages
- 9
Just started at a new studio, and I got to inherit the previous admin's config decisions (great :p). I was told we had 8 disks in RAIDZ2, but when I did the math, we were sitting at about half the usable space I was expecting to see. Due to the space restriction, the previous admin also disabled snapshots, since he thought there wasn't enough space to spare.
So anyways, I finally got the time to dig into it, and found something interesting.
I'm trying to wrap my brain around why this config was chosen over something else. I'd love any input or opinions on the matter, and if there's something extra beneficial about this RAID config that I'm unaware of, I'm all ears... I'm sure there was a reason, but I'm not sure it still applies. So before thinking about changing anything, I wanted opinions from people smarter than me ;).
As background on workload: we use Autodesk products (Maya, 3ds Max, etc.), which can be decently read/write heavy, and we may add some compositing software in the future. We will eventually have around 50 users hitting the server at any given time, which is what spawned my curiosity about performance vs. redundancy.
So we have an 8-disk array populated with 6TB drives (~5.4TiB each after formatting) = ~43.2TiB raw capacity before parity.
The current pool config is this:
Code:
  pool: home
 state: ONLINE
config:

        NAME                                            STATE
        home                                            ONLINE
          raidz2-0                                      ONLINE
            gptid/626c4aa8-aa92-11e6-8f61-00259058a2ce  ONLINE
            gptid/828aab96-a9bd-11e6-afa2-00259058a2ce  ONLINE
            gptid/0a503125-a898-11e6-8a89-00259058a2ce  ONLINE
            gptid/f915be70-ab61-11e6-a3f5-00259058a2ce  ONLINE
          raidz2-1                                      ONLINE
            gptid/dfb3ecea-ac89-11e6-8582-00259058a2ce  ONLINE
            gptid/3d590c22-ac25-11e6-8beb-00259058a2ce  ONLINE
            gptid/4990c399-aaec-11e6-b21c-00259058a2ce  ONLINE
            gptid/05d19174-a90f-11e6-c40d-00259058a2ce  ONLINE

df -h results in:

        home  20T  14T  6.8T  67%  /mnt/home
A bit of googling made me realize something.
Instead of a typical RAIDZ2 where all disks are part of the same vdev with two disks' worth of parity, the current pool is split into 2 vdevs, with EACH vdev configured as a 4-disk RAIDZ2, so essentially a RAID60.
This means each vdev has only 10.8TiB usable (4 disks minus 2 lost to parity = 2 x 5.4TiB). Combine those together in the pool and you get about ~21.6TiB usable, which is exactly where we are.
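The capacity math above can be sketched quickly. This is just back-of-envelope arithmetic with my assumed ~5.4TiB of formatted capacity per 6TB drive, not anything queried from ZFS; real pools lose a bit more to metadata and free-space headroom.

```python
DISK_TIB = 5.4  # assumed formatted capacity of one 6TB drive


def raidz_usable(disks_per_vdev, parity, vdevs=1):
    """Rough usable TiB: each vdev contributes (n - parity) data disks."""
    return (disks_per_vdev - parity) * DISK_TIB * vdevs


# Current layout: 2 vdevs, each a 4-disk RAIDZ2
current = raidz_usable(disks_per_vdev=4, parity=2, vdevs=2)

# Alternative: a single 8-disk RAIDZ2 vdev
single = raidz_usable(disks_per_vdev=8, parity=2, vdevs=1)

print(round(current, 1))  # 21.6 -- matches the ~20T that df reports
print(round(single, 1))   # 32.4 -- roughly 50% more usable space
```

So a single 8-wide RAIDZ2 would give back the two disks' worth of space that the second vdev's parity currently eats.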
The consequences of this are mostly performance-based, but the hit to usable storage is no small concern either.
I found a website that did some throughput tests, and his results showed this config dead last in the race for throughput (which would make sense, with all that parity needing to be calculated).
He aptly calls this config "RAIDZ2 x 2" in his write-up (which is a really good read for understanding the trade-offs):
ZFS RAID Performance Comparisons
I guess the question is:
Is this 'RAIDZ2 x 2' really the best way to strike a nice balance between redundancy AND performance?
Also, if someone wanted to link me to something, or take the time to explain the performance differences of having multiple vdevs in a pool, I'd appreciate it.
Do you get better performance from 2 vdevs of 4 disks (combined into a single pool) vs. a pool with 1 vdev of 8 disks?
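To make my question concrete, here's the crude mental model I've pieced together so far (my assumptions, not benchmark data): ZFS stripes across top-level vdevs, a RAIDZ vdev behaves roughly like ONE disk for small random I/O, and sequential throughput scales roughly with the number of data disks. The per-disk numbers are guesses for a generic 7200rpm drive.

```python
DISK_IOPS = 150  # assumed random IOPS for one 7200rpm drive
DISK_MBPS = 150  # assumed sequential MB/s for one drive


def pool_estimate(vdevs, disks_per_vdev, parity):
    """Very rough pool performance under the model described above."""
    data_disks = (disks_per_vdev - parity) * vdevs
    return {
        "random_iops": vdevs * DISK_IOPS,    # ~1 disk of IOPS per vdev
        "seq_mbps": data_disks * DISK_MBPS,  # data disks stream in parallel
    }


two_vdevs = pool_estimate(vdevs=2, disks_per_vdev=4, parity=2)
one_vdev = pool_estimate(vdevs=1, disks_per_vdev=8, parity=2)

print(two_vdevs)  # {'random_iops': 300, 'seq_mbps': 600}
print(one_vdev)   # {'random_iops': 150, 'seq_mbps': 900}
```

If that model is even roughly right, the current layout trades streaming throughput (fewer data disks) for better random I/O (more vdevs) — which is part of why I'm asking which matters more for our workload.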
What are some of the ways multiple vdevs can be used to support multiple artist workloads and throughput needs, and what are the recommendations for scaling?
For example: I once heard a guy suggest a rule of thumb of about 2 to 3 disks per artist (to handle the workload),
then breaking those up into multiple 9-disk RAIDZ2 vdevs within the pool.
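For what it's worth, applying that hearsay rule to our situation gives numbers like these (the 2–3 disks-per-artist figure and the 9-disk vdev width are his numbers, not any official guidance):

```python
import math

ARTISTS = 50
VDEV_WIDTH = 9  # 9-disk RAIDZ2 vdevs, per the rule of thumb

for per_artist in (2, 3):  # low and high end of the disks-per-artist rule
    disks = ARTISTS * per_artist
    vdevs = math.ceil(disks / VDEV_WIDTH)
    print(f"{per_artist}/artist -> {disks} disks -> {vdevs} vdevs")
# 2/artist -> 100 disks -> 12 vdevs
# 3/artist -> 150 disks -> 17 vdevs
```

That's obviously way beyond our current 8 bays, which is partly why I'm skeptical the rule applies at our scale.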
Thanks!