
Getting the most out of ZFS pools!

Status
Not open for further replies.

hpux735

Newbie
Joined
May 27, 2011
Messages
1
Getting the most out of a ZFS pool takes a little bit of math and some thought about how you want your system to work. There is a central trade-off between redundancy, performance, and capacity: choose any two. For ultimate performance and capacity, there is no beating a striped array. For redundancy and performance, use mirrors. RAID 5 and 6 strike a balance, and RAID 5+0 or 6+0 (striped arrays of RAID 5s or 6s) even more so. If you have a huge disk array using FreeNAS and ZFS, there are many opportunities for optimizing your configuration, but doing so requires a trick.

First, an example of deciding on a configuration:

Let's assume that you have a 96-drive disk array (big, I know, but bear with me).
I can tell you now that you DO NOT want a 96-drive RAID-6, or even a 94-drive RAID-6 with 2 spares. Not only is that an unwieldy Z-pool, ZFS doesn't like it either. That document says that fewer than 40 devices is recommended, but fewer than 12 is preferred. As a side benefit, if you want to expand the capacity later by adding devices to the pool, you can, but only in chunks of the same size that the stripe already uses. In the same example, if you built up your disk array from 12-device JBODs (Just a Bunch Of Disks) and were using 12-device pools, you could add capacity one JBOD at a time.

Moving on to the tradeoffs... I've made a spreadsheet that helps you calculate these numbers for an arbitrary configuration, but for the above example:

If each pool contains 4 or fewer disks, I'll assume RAID-5, unless it's 2 disks, which will be mirrored. From 5 to 12 disks, I assume RAID-6. My design criterion is to have at least 1 hot spare.

The '#' column is the number of disks in each pool, 'Spares' and 'Parity' are counted in disks, 'Lost' is the share of capacity lost to parity or spares, and 'Read' and 'Write' are multiples of a single disk's speed, assuming that the disk is the bottleneck.

Code:
#    Spares  Parity  Lost     Read  Write  Raid level
1    0       0.00    0.00%    96    96     Striping
2    0       48.00   50.00%   96    1      Mirroring
3    0       32.00   33.33%   64    64     Raid-5 (Z)
4    0       24.00   25.00%   72    72     Raid-5 (Z)
5    1       38.00   40.63%   57    57     Raid-6 (Z2)
6    0       32.00   33.33%   64    64     Raid-6 (Z2)
7    5       26.00   32.29%   65    65     Raid-6 (Z2)
8    0       24.00   25.00%   72    72     Raid-6 (Z2)
9    6       20.00   27.08%   70    70     Raid-6 (Z2)
10   6       18.00   25.00%   72    72     Raid-6 (Z2)
11   8       16.00   25.00%   72    72     Raid-6 (Z2)
12   0       16.00   16.67%   80    80     Raid-6 (Z2)
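
The arithmetic behind the parity rows of the table can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet itself: the function name `layout` is made up here, and it assumes what the table rows show, namely one parity disk per pool for 2 to 4 disks, two parity disks per pool for 5 to 12 disks, and leftover disks becoming spares. The last value it returns is the number of data disks, which matches the Read/Write columns for the RAID-5 and RAID-6 rows only.

```python
def layout(total_disks, pool_size):
    """Split total_disks into equal pools; leftover disks become spares."""
    pools = total_disks // pool_size
    spares = total_disks - pools * pool_size
    if pool_size == 1:
        parity_per_pool = 0   # striping: no redundancy
    elif pool_size <= 4:
        parity_per_pool = 1   # mirror (2 disks) or RAID-5 / RAID-Z (3-4 disks)
    else:
        parity_per_pool = 2   # RAID-6 / RAID-Z2 (5-12 disks), as in the table
    parity = pools * parity_per_pool
    lost = (parity + spares) / total_disks      # fraction of raw capacity
    data_disks = total_disks - parity - spares  # disks left for data
    return spares, parity, lost, data_disks


# Print the parity rows for the 96-drive example.
for size in range(3, 13):
    spares, parity, lost, data_disks = layout(96, size)
    print(f"{size:2d}  {spares}  {parity:2d}  {lost:7.2%}  {data_disks}")
```

Changing the first argument lets you try other array sizes with the same assumptions.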


Now, in convenient graph form:
raid.jpg

I decided to use 5-disk pools, because that was the only option that left a single spare disk. I ran the math out until the next time there was a single spare, and it didn't occur until I had pools of 19 disks each, which is too many for me. I will admit that I used RAID-5 rather than RAID-6 because it's a development machine.
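
The search for pool sizes that leave exactly one spare is a one-line remainder check. A minimal sketch, assuming (as above) that leftover disks become spares; the function name `sizes_with_spares` is invented for this illustration:

```python
def sizes_with_spares(total_disks, spares, max_size):
    """Pool sizes from 2 to max_size that leave exactly `spares` disks over."""
    return [n for n in range(2, max_size + 1) if total_disks % n == spares]


# For the 96-drive array, look for a single leftover disk.
print(sizes_with_spares(96, 1, 24))
```

For 96 drives and pool sizes up to 24 the only matches are 5 and 19, which is the gap described above.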

Now the question: "How do I make FreeNAS build a configuration like this?" It's not available in the web interface, but you can make it work. The trick is to use the command-line zpool command. First, you need a list of disk devices. I don't remember how I did this in FreeNAS, but I'll update this when I remember, or someone can reply with the answer. Once you have the list of devices, paste it into a text file. Put the device names for each pool on a single line, one line per pool, and leave the spare devices on a separate line. Like this:

zpool create data \
raidz c1t5000C50025F5D3B7d0 c1t5000C50025F6448Fd0 c1t5000C50025FCD6C7d0 c1t5000C50025FD2D3Fd0 c1t5000C50025FD5A0Fd0 \
raidz c1t5000C50025FD9D9Fd0 c1t5000C50025FD93E3d0 c1t5000C50025FD669Fd0 c1t5000C50025FDACFBd0 c1t5000C50025FDBC27d0 \
raidz c1t5000C50025FDC39Bd0 c1t5000C50025FDC393d0 c1t5000C50025FDDD93d0 c1t5000C50025FDE9FFd0 c1t5000C50025FDF6C7d0 \
raidz c1t5000C50025FDFE4Bd0 c1t5000C50025FE1BB3d0 c1t5000C50025FE2C93d0 c1t5000C50025FE3D6Bd0 c1t5000C50025FE012Fd0 \
raidz c1t5000C50025FE174Fd0 c1t5000C50025FE226Fd0 c1t5000C50025FE446Bd0 c1t5000C50025FE494Bd0 c1t5000C50025FE2867d0 \
raidz c1t5000C50025FE5503d0 c1t5000C50025FED877d0 c1t5000C50025FF90FBd0 c1t5000C50025FFAC3Fd0 c1t5000C50025FFAF83d0 \
raidz c1t5000C50025FFB7C7d0 c1t5000C50025FFB027d0 c1t5000C50025FFB99Fd0 c1t5000C50025FFB917d0 c1t5000C50025FFC9CFd0 \
raidz c1t5000C50025FFD1DBd0 c1t5000C50025FFD2AFd0 c1t5000C50025FFD79Bd0 c1t5000C50025FFD787d0 c1t5000C50025FFE13Bd0 \
raidz c1t5000C50025FFE88Bd0 c1t5000C50025FFF55Fd0 c1t5000C5002600AFE7d0 c1t5000C5002600B7A7d0 c1t5000C5002600B587d0 \
raidz c1t5000C5002600BADBd0 c1t5000C5002600BE13d0 c1t5000C5002600C1E7d0 c1t5000C5002600C8A3d0 c1t5000C5002600CB47d0 \
raidz c1t5000C5002600CF2Bd0 c1t5000C5002600DDF7d0 c1t5000C5002600E62Fd0 c1t5000C5002600E483d0 c1t5000C5002600EDABd0 \
raidz c1t5000C5002600F02Bd0 c1t5000C5002600F027d0 c1t5000C5002600F33Bd0 c1t5000C5002600F96Bd0 c1t5000C5002600FA27d0 \
raidz c1t5000C5002601A47Bd0 c1t5000C5002601A237d0 c1t5000C5002601D96Fd0 c1t5000C5002601D197d0 c1t5000C5002601DF77d0 \
raidz c1t5000C5002602E77Fd0 c1t5000C5002604EF4Bd0 c1t5000C5002604F19Fd0 c1t5000C5002604F313d0 c1t5000C50026014DA7d0 \
raidz c1t5000C50026016A87d0 c1t5000C50026016AF7d0 c1t5000C50026016BEBd0 c1t5000C50026016C03d0 c1t5000C50026016CD3d0 \
raidz c1t5000C50026017A5Bd0 c1t5000C50026017CFBd0 c1t5000C50026018D6Bd0 c1t5000C50026018D97d0 c1t5000C50026019D9Bd0 \
raidz c1t5000C50026033BE7d0 c1t5000C50026050EBFd0 c1t5000C500260103A7d0 c1t5000C500260112FBd0 c1t5000C500260113DFd0 \
raidz c1t5000C500260135CBd0 c1t5000C500260315B3d0 c1t5000C5002601043Bd0 c1t5000C5002603314Fd0 c1t5000C5002603335Fd0 \
raidz c1t5000C50026033293d0 c1t5000C50026013853d0 c1t5000C50026017323d0 c1t5000C50026017623d0 c1t5000C50026052597d0 \
spare c1t5000C50026016737d0


Add "raidz" at the start of each pool line and ' \<CR>', where <CR> is an actual carriage return, at the end of every line except the last, and put "zpool create data \<CR>" at the top of the document. Finally, add "spare" at the start of the line with your spare disks.
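
If editing 96 device names by hand sounds error-prone, the same text can be generated. The following is a rough sketch rather than a tested FreeNAS tool: `build_zpool_create` is a hypothetical helper, the `disk0`...`disk10` names are placeholders for your real device names, and it only builds the command string. It does not run zpool.

```python
def build_zpool_create(pool_name, devices, group_size, vdev_type="raidz"):
    """Build a multi-line 'zpool create' command; leftover devices become spares."""
    full = len(devices) // group_size * group_size
    groups = [devices[i:i + group_size] for i in range(0, full, group_size)]
    spares = devices[full:]
    lines = [f"zpool create {pool_name}"]
    lines += [f"{vdev_type} " + " ".join(group) for group in groups]
    if spares:
        lines.append("spare " + " ".join(spares))
    # Every line except the last ends with a space and a backslash.
    return " \\\n".join(lines)


# Placeholder device names; substitute the list from your own system.
devices = [f"disk{i}" for i in range(11)]
print(build_zpool_create("data", devices, group_size=5))
```

Review the printed command before pasting it into a root shell, since creating a pool overwrites whatever is on those disks.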

Copy and paste this into the command line as root. This will instruct ZFS to create your Z-pool as a stripe of many RAID'd volumes.

Now, the trick. We need to "export" the pool, which essentially means taking it out of the OS. This may seem counterintuitive, but do it anyway.

"zpool export data"

Now, if you run "zpool status" you shouldn't see any Z-pools.

In the web interface, go to Storage->Volumes->View all Volumes
Click on "Auto import all volumes"
In a few minutes your carefully crafted ZFS pool should appear!

Idea credit: http://blogs.oracle.com/roch/entry/when_to_and_not_to
 

jpaetzel

Guest
Another useful piece of information:

RAIDZ, RAIDZ2, and RAIDZ3 have the best performance when there is a power of two number of data drives. In keeping the arrays under 12 drives, you end up with 2, 4 or 8 data drives. Adding the parity drives to that you get:

RAIDZ 3, 5, or 9 drives
RAIDZ2 4, 6, or 10 drives
RAIDZ3 5, 7, 11 drives
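
The list above follows mechanically from the rule "power-of-two data drives plus parity drives, under 12 total". A small sketch, with a made-up function name, that regenerates it:

```python
def power_of_two_widths(parity, max_width=12):
    """Total drive counts (data + parity) with 2, 4, 8, ... data drives, below max_width."""
    widths = []
    data = 2
    while data + parity < max_width:
        widths.append(data + parity)
        data *= 2
    return widths


for name, parity in (("RAIDZ", 1), ("RAIDZ2", 2), ("RAIDZ3", 3)):
    print(name, power_of_two_widths(parity))
```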
 

esamett

Senior Member
Joined
May 28, 2011
Messages
343

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
9,039
Another useful piece of information:

RAIDZ, RAIDZ2, and RAIDZ3 have the best performance when there is a power of two number of data drives. In keeping the arrays under 12 drives, you end up with 2, 4 or 8 data drives. Adding the parity drives to that you get:

RAIDZ 3, 5, or 9 drives
RAIDZ2 4, 6, or 10 drives
RAIDZ3 5, 7, 11 drives
Okay, the way you worded it confused me, not that I'm easily confused :eek:

So, for a 4-drive setup, that would be 2 data drives and 2 parity drives (RaidZ2)? I ask because I'm still playing around with FreeNAS 8.x (build 6553 as of a few minutes ago) and I get some fantastic transfer rates using all 4 drives selected as a single RaidZ volume. The typical transfer rate over a single Gbit NIC is 95MB/sec. I'm maxing out the Ethernet connection and I have no need for a second connection at this time.

I guess I don't need a "faster" setup but one that was efficient or more resilient would be nice.

Thanks,
Mark
 

Tekkie

Senior Member
Joined
May 31, 2011
Messages
344
RAIDZ, RAIDZ2, and RAIDZ3 have the best performance when there is a power of two number of data drives. In keeping the arrays under 12 drives, you end up with 2, 4 or 8 data drives. Adding the parity drives to that you get:

RAIDZ 3, 5, or 9 drives
RAIDZ2 4, 6, or 10 drives
RAIDZ3 5, 7, 11 drives
Where did you get this information? I've read the ZFS wiki and not found any reference to ^2 data drives and performance?
 

StigOfTheDump

Newbie
Joined
Jun 11, 2011
Messages
1
List disk devices?

First, you need a list of disk devices. I don't remember how I did this in FreeNAS, but I'll update this when I remember. Or, someone can reply with the answer.
Does anyone know how to do this... would prefer not to have to spend the remainder of the day/week/month googling ....

basically I have 5 x 500gb in one raid 5 and 3 x 1tb in another raid 5. I'd like to combine these into a raid 50, as much as anything because I only want one mount point - seems odd to me that freenas doesn't support this through the gui?
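
One possible way to get the disk list, offered as an assumption rather than the original poster's method: FreeNAS 8 is built on FreeBSD, where `sysctl -n kern.disks` prints the kernel's disk names on one line. The sketch below wraps that call; the helper names are invented, and `list_disks()` will only work on a FreeBSD-based system.

```python
import subprocess


def parse_disk_list(sysctl_output):
    """Turn the space-separated output of 'sysctl -n kern.disks' into a sorted list."""
    return sorted(sysctl_output.split())


def list_disks():
    """Ask the FreeBSD kernel for its disk names (assumes sysctl is available)."""
    result = subprocess.run(
        ["sysctl", "-n", "kern.disks"],
        capture_output=True, text=True, check=True,
    )
    return parse_disk_list(result.stdout)


# On the NAS itself you would call: print(list_disks())
```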
 

TravisT

Senior Member
Joined
May 29, 2011
Messages
292
Where did you get this information? I've read the ZFS wiki and not found any reference to ^2 data drives and performance?
(N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equals 2, 4, or 8

Link

See the section labeled "RAIDZ Configuration Requirements and Recommendations" for more details...
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
9,039
Now that's a good link to read! Thanks for posting it.
 

TravisT

Senior Member
Joined
May 29, 2011
Messages
292
Np, hope it helps out!
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
9,039
Right now, and I know I'm not the only one out there, I'm trying to figure out the best implementation of my four drives for the uses I have planned, and whether I should purchase a fifth drive. I will be placing important (to me) data on the NAS, which will be periodically backed up to DVD media. This includes photos, financial documents, full system backups of the 4 computers in the house, etc., so I want these as safe as possible. I will also be storing digital music and movies on the NAS. I don't care if I lose the movies, as I can rip those again as needed, but the music will also need to be protected. So my thought is to create two separate drive sets, one for all my important-to-me data and one for the movies and anything else.

So I don't mind setting up 1 hard drive for movies, but what about the last three, RAIDZ or should I just create two ZFS mirrors, or...

My point is I'm still trying to figure out the best way to implement four identical drives versus an odd number of drives. As for speed, I max out my 1Gb connection with a RAIDZ during testing, so speed isn't a real issue.

If there is some easy advice, I'll take it but I'm sure I'll settle on something reasonable. I still have a question about two mirrored drives I'm mulling over, just not sure how to word it so it makes sense.

-Mark
 

Milhouse

Neophyte Sage
Joined
Jun 1, 2011
Messages
564
Does anyone know how to do this... would prefer not to have to spend the remainder of the day/week/month googling ....

basically I have 5 x 500gb in one raid 5 and 3 x 1tb in another raid 5. I'd like to combine these into a raid 50, as much as anything because I only want one mount point - seems odd to me that freenas doesn't support this through the gui?
It is supported.

In the GUI you create your volume - let's call it "data" - by selecting the 5x 500GB drives with RAIDZ1 (RAID5) redundancy, then once the volume is created you go to create a second volume and as long as you use the same volume name, "data", the remaining disks (3x1TB) will be added to your "data" volume as an additional vdev. You keep repeating until you have added all your disks as separate vdevs combined in a single volume/zpool.

You can mix and match redundancy levels with each vdev, but if you choose RAIDZ1 for both your vdevs, the data in the zpool will be striped across both vdevs, giving you RAIDZ1+0 (aka RAID5+0, or RAID50) for the whole zpool.
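
To see what that combined volume would roughly hold, here is a back-of-the-envelope sketch for the 5x 500GB plus 3x 1TB case. It assumes each RAIDZ vdev gives up `parity` disks' worth of space and sizes every member to its smallest disk, and it ignores filesystem overhead; the function name is made up.

```python
def raidz_usable_gb(disk_sizes_gb, parity):
    """Approximate usable space of one RAIDZ vdev, ignoring ZFS overhead."""
    # Every member counts as being only as large as the smallest disk in the vdev.
    return (len(disk_sizes_gb) - parity) * min(disk_sizes_gb)


# Two RAIDZ1 vdevs striped together in one zpool.
vdevs = [([500] * 5, 1), ([1000] * 3, 1)]
total_gb = sum(raidz_usable_gb(sizes, parity) for sizes, parity in vdevs)
print(total_gb)
```

Both vdevs contribute about 2000GB here, so the single mount point ends up with roughly 4000GB.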
 

TravisT

Senior Member
Joined
May 29, 2011
Messages
292
@joeschmuck

I know where you're coming from because I did the same thing about a year ago. I'm using my NAS for almost the same things you are using yours for, except I'm not doing system backups.

I went with a RAIDZ pool of 3 x 2TB disks. I use this for an electronic file cabinet, which I keep all of my manuals, bills, etc. on. I also have a section that I use for my media. I only have one copy of a large portion of it, so it is more important to me. I also have 4 x 1TB drives that I will be migrating over from another server and that I haven't decided what to do with yet.

The most important thing you will need to consider is the demand you will place on these disks. If you're streaming media from them, it may put more of a demand on it than would saving or accessing a document. If you want very good redundancy, go with a RAIDZ2 (RAID6) which would give you two disks of redundancy, which you can do with 4 disks.

If it works for your situation, you could also do a RAIDZ (3 disk) for the important data, and a single disk (or mirror/stripe if you wanted to purchase another disk) for the other data.

Hope this helps.

You are doing pretty good if you can fully saturate a 1GB connection - I even have trouble with that, although my file server is virtual, so it's sharing some of the resources with other virtual machines.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
9,039
@joeschmuck

You are doing pretty good if you can fully saturate a 1GB connection - I even have trouble with that, although my file server is virtual, so it's sharing some of the resources with other virtual machines.
I can only saturate it because I'm using SSDs on my main computer. My normal hard drives are not capable of keeping up, and I was only saturating it for testing to see if ZFS bombed out, something I have had too much experience with before. I can state that my system ran for about 9 hours fully saturating the 1Gb network. If I only had a 10Gb network :)

As for limited bandwidth, my current home NAS is slow, topping out at 15MB/sec, and it's able to stream raw BluRay files without issue to my PS3. I have since realized that while BluRay is very nice and clear, I don't see a significant difference from DVD at my normal viewing distance. That is not saying I can't see a difference, because I can, but it doesn't bother me. When I buy my next plasma TV it will likely be a 60+ inch and then maybe DVD just won't be enough. I think I've got another 5 years before then anyways. Yea, that was a bit off topic.

So as for configuration, I think I'll weigh two drives mirrored against three drives in RAIDZ. I understand capacity will be different but survivability and rebuilding are in my mind right now. I can use a single drive for movies if I need to.

Thanks for the advice,
Mark
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
9,039
I guess the forum failed to protect against those two spam postings :eek:
 

djiceman

Neophyte
Joined
Jan 1, 2012
Messages
11
@jpaetzel

Hey man, thx for sharing that information. But the link you posted states:

(N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equals 2, 4, or 6

which would make a raidz 3, 5 or 7 drives, wouldn't it?

Regards,

Ice
 

TravisT

Senior Member
Joined
May 29, 2011
Messages
292
Total drives, yes. As I understand it, you can have a virtually unlimited number of drives in a raidz pool, but you lose one disk for parity. If you have 30 disks in raidz you will only have 29 drives of usable space. raidz2 uses the same concept, except that you have 2 disks for parity. The 30 drive example would give you 28 drives of usable space.

As for reliability, the first example would have data loss if any two drives were lost. For the second, you would have to lose any three drives IIRC (the parity is spread across all of the disks rather than kept on dedicated parity drives). If I'm off in my logic here, please correct me - it's been a while since I've dug into this stuff!
 

djiceman

Neophyte
Joined
Jan 1, 2012
Messages
11
@TravisT

Thx for your reply. No offense, but my question wasn't about how raidz/2 work, but about the recommended setup of such. Your example of 30 drives in a single raidz is clearly NOT recommended!

Regards,

Ice
 

TravisT

Senior Member
Joined
May 29, 2011
Messages
292
djiceman said:
Hey man, thx for sharing that information. But the link you posted states:

(N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equals 2, 4, or 6

which would make a raidz 3, 5 or 7 drives, wouldn't it?

Regards,

Ice
No offense taken, but maybe you should re-read the post as well. Both I and hpux735 posted links to the solaris docs on ZFS. It lists all of the best practices. The link I posted answers your question.

A RAIDZ configuration with N disks of size X with P parity disks can hold approximately (N-P)*X bytes and can withstand P device(s) failing before data integrity is compromised. (N=# of disks in group, P=Parity disks)

It also says:

Start a single-parity RAIDZ (raidz) configuration at 3 disks (2+1)
Start a double-parity RAIDZ (raidz2) configuration at 6 disks (4+2)
Start a triple-parity RAIDZ (raidz3) configuration at 9 disks (6+3)

This means that a raidz array could have any number of disks (N) = 3, 4, 5...,30. You will have N-1 times the size of the disks (since it's raidz and you only have one parity disk) of usable space.

Same deal for raidz2 or raidz3, except that you will have (N-2) or (N-3) times the size of the disk of usable space, respectively.
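
As a quick check of the guide's (N-P)*X formula, here is a small sketch with invented helper names. It counts usable space in whole disks, ignores ZFS overhead, and also tests a group against the guide's recommended range of 3 to 9 disks.

```python
def usable_disks(total, parity):
    """Disks' worth of usable space in one RAIDZ group: N - P."""
    return total - parity


def in_recommended_range(total, low=3, high=9):
    """True if the group size falls in the guide's recommended 3 to 9 disks."""
    return low <= total <= high


for parity in (1, 2, 3):
    print(parity, usable_disks(30, parity), in_recommended_range(30))
```

A 30-disk group gives 29, 28 or 27 disks of space depending on the parity level, and falls outside the recommended range in every case.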

You didn't ask what was recommended... obviously since I've researched this and have run ZFS both on solaris and freenas, I realize that a 30 disk array is possible but not recommended. The best practices for ZFS are to use groups of 3 to 9 disks. If that's the answer you wanted... it sure wasn't clear!
 

djiceman

Neophyte
Joined
Jan 1, 2012
Messages
11
@TravisT

OK, sorry for the misunderstanding. My bad. Let me try to rephrase my question. I was confused because it has been stated in this post
N equals 2, 4, or 8
while the ZFS documentation says
Code:
N equals 2, 4, or 6

That's all I've been trying to clarify. That would make the max recommended number of drives in a raidz 3, 5, or 7. Thus jpaetzel's statement
Another useful piece of information:

RAIDZ, RAIDZ2, and RAIDZ3 have the best performance when there is a power of two number of data drives. In keeping the arrays under 12 drives, you end up with 2, 4 or 8 data drives. Adding the parity drives to that you get:

RAIDZ 3, 5, or 9 drives
RAIDZ2 4, 6, or 10 drives
RAIDZ3 5, 7, 11 drives
is not correct. It would have to be
Code:
Another useful piece of information:

RAIDZ, RAIDZ2, and RAIDZ3 have the best performance when there is a power of two number of data drives. In keeping the arrays under 12 drives, you end up with 2, 4 or 6 data drives. Adding the parity drives to that you get:

RAIDZ 3, 5, or 7 drives
RAIDZ2 4, 6, or 8 drives
RAIDZ3 5, 7,  9 drives


Or am I mistaken?

Regards,

Ice
 

TravisT

Senior Member
Joined
May 29, 2011
Messages
292
This statement is confusing to me.

(N+P) with P = 1 (raidz), 2 (raidz2), or 3 (raidz3) and N equals 2, 4, or 6
It has this formula in the ZFS guide right after this:

A RAIDZ configuration with N disks of size X with P parity disks can hold approximately (N-P)*X bytes and can withstand P device(s) failing before data integrity is compromised.
They kind of contradict each other. I think the point to remember is the second formula. If you are concerned with raidz only, then following the best practices and using (N-1)*X, starting with 3 drives (3-1) and going up to 9 drives (9-1) would be the recommended solution. This would give you 2-8 drives of data storage and you could withstand one drive failure without data loss.

Another useful piece of information:

RAIDZ, RAIDZ2, and RAIDZ3 have the best performance when there is a power of two number of data drives. In keeping the arrays under 12 drives, you end up with 2, 4 or 8 data drives. Adding the parity drives to that you get...
I'm really not sure where this info came from, but it seems to contradict the zfs guide. The guide says it's recommended to keep groups at 9 drives or fewer:

The recommended number of disks per group is between 3 and 9. If you have more disks, use multiple groups.
I read this as total disks, not "usable disks". In a raidz array, this would amount to between 2 and 8 "data" drives and one parity drive. I can't remember reading anything in the guide saying drives should be a "power of two data drives" for best performance. Not saying it's not correct, but I'd stick with the zfs guide over comments unless they can back them up with a reference.

With that said, there is no "right" number as long as the TOTAL number of disks in the array is between 3 and 9.
 