zpools, raidz/mirror, data duplication, parity

Status
Not open for further replies.

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
Okay, I've read the manual cover to cover and about thirty links from it and some more branching off of it. All the vids from it and a few others. I should've probably stuck to the guide and come and asked my questions, because now I think I've merged together a bunch of Solaris/ZFS specific stuff into the FreeNAS jumble of questions I already had. If something doesn't pertain to FreeNAS, please let me know.
So, I think I'm finally ready to try to set up my system, but I have a few questions. Please bear with me, I am after all a n00b lol.

I want to see if what I'm planning makes any sense or not. I was initially thinking of using 3x3T drives, each separate from one another. Then I slowly moved on to Raidz1, as I only read about one pool in the manual (it might've said something about more at one point, but I didn't fully grasp what was being said), which everyone seemed at the time to consider a good idea. But today I'm seeing that Raidz1 seems to be a terrible idea? (See the thread "Z1 Pool in Faulted state, any hope?")
I don't have a good enough system (see signature) for Raidz2, so I started looking into multiple pools to prevent the whole thing collapsing on me once one drive fails. And I do seem to see a few snippets here and there indicating multiple pools; but nothing concrete. And it's all from links coming out of FreeNAS. No clue if it's a solaris thing or something else.

So my questions are:

1) Can one create multiple pools ? Is a root pool what we create before creating more pools or is that something entirely different ? I saw that term thrown around a lot but not sure if that pertains to anything here or not.
2) I want to set atime off. Should I do that in the vdev options instead of doing it in the datasets? Just have those inherit said attribute, right ?
3) What is parity ? I've tried looking for info on it but I just don't get it. To me it sounds like an extra bit to tell if a transfer is successful or not but the way it's talked about here it sounds like this extra bit would allow you to recover a whole hard drive crash as long as there wasn't any bad sectors on the parity drive.
4) In the n00b powerpoint presentation it says there aren't a lot of "recovery tools" because ZFS is enterprise-class software, and no enterprise would waste time with recovery tools. But from all I've read here, I've gathered that if one vdev dies, everything in a pool dies with it.
So say a Raidz1 x 100 pool (from the numbers I've been reading about from Oracle and various other sources) and the off chance that no one is able to get to it in time (if it happened overnight, during Christmas/New Year's, funeral, a batch of failing hard drives, insert an unexpected event here) the only thing that matters is that one drive in a vdev fails and then a second one in it fails. Now 300 drives are down ? Instead of just 2 or 3 ?
Now, I'm not thinking this would cause 300 drives to be taken out instead of just three, right ? But if they did have to be taken out that would be hell. Please tell me it won't destroy every single drive in the pool and it'll just make you have to re-write the data back. Although again, seems troublesome instead of swapping in two drives, buying two new drives and using them as back up and being done with it; now you have to transfer 300x1T of data. Or is the data retained on the drives till the dead vdev is recovered ?
I'm sure I've got something really mixed up here, so please clarify if you can! I know I'm a really confusing creature but if you can help, it would be much appreciated!


This is only relevant if multiple pools can be created:
5) There is some attribute called copy or copies that reproduces files written to disk n amount of times. Is it possible in FreeNAS (again kicking myself for reading other documentation too....) ?

6) Is it possible to install a game (or any program) via CIFS/NFS and will multicore be a major factor here ? Or Hz ? Or is it just the speed of the network (NIC/Switch/"router") ?

There are many more questions.... but these are the ones I would really love to have answered as soon as possible. ( http://wp.me/p3KhYb-2 my blog post with all my wonderings if you truly dare tackle the beast which is confusion)

Thanks for reading and hope you can make sense of this and help me out in some way!
 

MadsRC

Dabbler
Joined
Jul 14, 2013
Messages
20
I'll try and answer your questions as precisely as I can do.

1) Can one create multiple pools ? Is a root pool what we create before creating more pools or is that something entirely different ? I saw that term thrown around a lot but not sure if that pertains to anything here or not.
Yes, you can create more zpools / volumes. But instead of creating several zpools, you can create one and then create datasets on top of it. A dataset is essentially a "folder" in the zpool that you can assign permissions to, set quotas on, and share out.
I, myself, run one zpool with 14 x 2TB drives in raidz3 and 8 x 1TB drives in raidz2, which is then divided into 6 datasets, where some of those datasets are shared using CIFS/SAMBA and NFS.

2) I want to set atime off. Should I do that in the vdev options instead of doing it in the datasets? Just have those inherit said attribute, right ?
Sorry, no idea :(

3) What is parity ? I've tried looking for info on it but I just don't get it. To me it sounds like an extra bit to tell if a transfer is successful or not but the way it's talked about here it sounds like this extra bit would allow you to recover a whole hard drive crash as long as there wasn't any bad sectors on the parity drive.
Parity is essentially a section of each hard drive being dedicated to storing "parity information".
That parity information is then used to rebuild/resilver your disks in case a drive fails.

To be honest, it's a pretty big topic. I can recommend Wikipedia's article on RAID.

4) In the n00b powerpoint presentation it says there aren't a lot of "recovery tools" because ZFS is enterprise-class software, and no enterprise would waste time with recovery tools. But from all I've read here, I've gathered that if one vdev dies, everything in a pool dies with it.
So say a Raidz1 x 100 pool (from the numbers I've been reading about from Oracle and various other sources) and the off chance that no one is able to get to it in time (if it happened overnight, during Christmas/New Year's, funeral, a batch of failing hard drives, insert an unexpected event here) the only thing that matters is that one drive in a vdev fails and then a second one in it fails. Now 300 drives are down ? Instead of just 2 or 3 ?
Now, I'm not thinking this would cause 300 drives to be taken out instead of just three, right ? But if they did have to be taken out that would be hell. Please tell me it won't destroy every single drive in the pool and it'll just make you have to re-write the data back. Although again, seems troublesome instead of swapping in two drives, buying two new drives and using them as back up and being done with it; now you have to transfer 300x1T of data. Or is the data retained on the drives till the dead vdev is recovered ?
I'm sure I've got something really mixed up here, so please clarify if you can! I know I'm a really confusing creature but if you can help, it would be much appreciated!

In your example, with a Raidz1 consisting of 100 drives, 2 disks failing would be catastrophic. It would kill ALL the data on ALL 100 drives. RaidZ1 can survive one disk dying, RaidZ2 can take 2 and RaidZ3 can take 3.

This is also the reason why RaidZ1 isn't very safe. Say one disk dies and you plug in a new drive: all your remaining drives would now be put to work resilvering the new drive (essentially using their data and parity information to recreate the failed drive). During this resilver, a lot of strain is put on the drives. This can lead to another disk dying, and voila! The entire RaidZ dies (and as a consequence, your entire zpool dies!)

The reason the entire zpool dies (remember, a zpool can consist of several vdevs/raidz's) is that the vdevs in your zpool are striped together. In a striped configuration, if one disk (or VirtualDevice/VirtualDisk) dies, the entire stripe is lost. Say you have 2 RaidZ1's in a zpool, and 2 disks die at the SAME time in the SAME vdev: you have lost ALL data in that zpool.

Remember my setup? 14 x 2TB in RaidZ3 and 8 x 1TB in RaidZ2. If 4 of my 2TB drives die, I lose EVERYTHING.
But since I'd have to lose either 4 disks in the RaidZ3 or 3 disks in the RaidZ2, I don't worry too much.

If I had used RaidZ1, I wouldn't be able to sleep at night.

Please, don't use RaidZ1 ;)
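
The rule described above can be sketched in a few lines of Python. This is only an illustration of the counting argument, not anything ZFS itself runs: the `Vdev` class and `pool_survives` function are made-up names, and the model assumes every vdev is a RAIDZ vdev and that a "failure" means a whole disk is gone.

```python
from dataclasses import dataclass


@dataclass
class Vdev:
    """One RAIDZ vdev: total disk count, parity level, and failed disks."""
    disks: int
    parity: int      # 1 for RAIDZ1, 2 for RAIDZ2, 3 for RAIDZ3
    failed: int = 0

    def survives(self) -> bool:
        # A RAIDZ vdev tolerates at most `parity` failed disks.
        return self.failed <= self.parity


def pool_survives(vdevs: list[Vdev]) -> bool:
    # Data is striped across all vdevs, so losing any one vdev loses the pool.
    return all(v.survives() for v in vdevs)


# The layout described above: 14 disks in RAIDZ3 plus 8 disks in RAIDZ2.
print(pool_survives([Vdev(14, 3, failed=3), Vdev(8, 2, failed=2)]))  # True
print(pool_survives([Vdev(14, 3, failed=4), Vdev(8, 2, failed=0)]))  # False
print(pool_survives([Vdev(14, 3, failed=0), Vdev(8, 2, failed=3)]))  # False
```

Note that the total number of dead disks is not what matters; what matters is whether any single vdev exceeds its own parity level.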

5) There is some attribute called copy or copies that reproduces files written to disk n amount of times. Is it possible in FreeNAS (again kicking myself for reading other documentation too....) ?
Again, sorry, can't answer that :(
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
1) Yeah, after all this time I ran into a thread talking about multiple pools but no one responded to the person.
3) Crud! Forgot to say I wanted something with a bit of substance. I read the wiki article on "Parity bit" and it left a lot to be desired then found another article that didn't do it for me either. I'll check out that Raid article.
What I can't fathom is this: if I have, say, 3x3TB in Raidz1 (don't worry, I'll try to avoid that), which means I have two drives to use and one for parity, and ~5.58TB of usable space - 1/64th reserved for ZFS...~5.49TB, and I have, say, 5.21TB of data and one drive dies, it would be able to recover ~2.6TB? No matter which of the drives it is? How the...wait, I should read the article first and see if that is explained there.. lol..
4) Yeah, I know Raidz1 is bad; I just threw that in there. Disaster could happen in Raidz2 or Raidz3 too, just a lot less commonly, but I don't have that many 3TB drives and would rather not use them as 500GB drives lol.
Aha! I somehow missed that it was in a striped configuration, even though now that I think about it, I might've read and heard that quite a few times. x'D
Well, I'm limited on hardware; I was just thinking of buying two extra 3TB drives, but then that means that for the drives alone I'd need 2 extra GB of RAM and I'm already maxed out.
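
For the capacity arithmetic, here is a rough sketch. It is an idealised estimate only: it assumes each "3TB" drive holds exactly 3 x 10^12 bytes, uses the 1/64 reservation mentioned above, and ignores RAIDZ padding and metadata overhead. The function name `raidz_usable_bytes` is made up for this example.

```python
TB = 10**12   # decimal terabyte, as printed on the drive label
TIB = 2**40   # binary tebibyte, as most tools report it


def raidz_usable_bytes(disks: int, parity: int, disk_bytes: int,
                       reserved_fraction: float = 1 / 64) -> float:
    """Idealised usable space: data disks minus the reserved fraction."""
    data_disks = disks - parity
    raw = data_disks * disk_bytes
    return raw * (1 - reserved_fraction)


usable = raidz_usable_bytes(disks=3, parity=1, disk_bytes=3 * TB)
print(f"{usable / TB:.2f} TB")    # decimal units
print(f"{usable / TIB:.2f} TiB")  # binary units
```

Depending on whether a tool reports decimal or binary units, the result will come out somewhat different from the figures quoted above, so treat any single number as approximate.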

For these reasons I'm thinking that it would be better to have separate pools. I'd have enough RAM. And when one drive dies the others would (should?) still work, then I could just buy a new one and copy over my backup files. It's not a heavy production system, just two users actually lol. With tremendously light load. I am just not sure how to go about creating multiple pools. I read that if you use only one disk for a vdev, and I think mirrors (?), you will have individual pools, but I have no clue if that is accurate or not.

2/5) No problem, you've already helped me out a lot. :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Think of parity like this..

Remember geometry from grade school? The Pythagorean theorem?

A^2 + B^2 = C^2

A and B are your data for a given "stripe". C is your parity data. Remember how if you had any 2 letters you could calculate the 3rd? Parity works EXACTLY like that. In this case, A and B would be your data and C would be your parity. This is my example for RAID5/RAIDZ1 because you lose 1 whole disk (C) worth of available storage space.

It doesn't use the Pythagorean theorem of course, but I think you understand now. :)

RAIDZ2 is the exact same way, but it uses 2 equations that share some variables to get 2 separate (C) values. RAIDZ3 simply has 3 equations that give you 3 different parity values.
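
To make the "any two values give you the third" idea concrete, here is a small sketch using XOR, which is the operation commonly used to illustrate single parity in RAID5-style schemes. It is a toy: the block contents and the helper name `xor_blocks` are invented for the example, and the real RAIDZ on-disk layout is more involved than three fixed blocks.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Two data blocks (A and B) and the parity block (C) computed from them.
a = b"hello wo"
b = b"rld 1234"
c = xor_blocks(a, b)

# Pretend the disk holding A died: rebuild it from the survivors B and C.
rebuilt_a = xor_blocks(b, c)
print(rebuilt_a == a)  # True

# The same works if B is the one that is lost.
print(xor_blocks(a, c) == b)  # True
```

Lose any one of the three blocks and the other two are enough to rebuild it; lose two and there is nothing left to solve the "equation" with.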
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
Wait I thought that was for a right angle not equilateral triangles. Or did I miss something ? I haven't touched geometry in...ever and I took geometry and got a B x'D
No, but seriously, I get "how it works"; it just doesn't add up in my head. But I guess it's either a simple "it works" or about twenty-eight five-hundred-page papers on some algorithms that I'll never comprehend in my life, so I'll just leave it at "it works".
Well, either way that's pretty cool, lol.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Hehe. The gist is that you have known values (your data) and calculated values (derived from your data). If any part of the equation is missing, the remaining pieces can be used to calculate it. There is a "law of conservation of data" that says that parity data can never repair more characters than the number of characters in the parity data. In short, X disks of parity protects from X disks of failure. Anything more than that and you lose data. A bad sector counts as a disk failure in that given "stripe" of data (normally 128KB). So a whole disk failed plus some bad sectors on another disk = trouble for RAIDZ1. Because the statistical likelihood of having 1 disk fail plus bad sectors on another disk is so high these days, RAIDZ1 is not considered to be "protective" anymore.

http://www.zdnet.com/blog/storage/why-raid-5-stops-working-in-2009/162

RAIDZ2 will eventually have the exact same problem. I believe current estimates are 2019 or something like that.
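
A back-of-the-envelope version of that statistical argument can be sketched as follows. The error rate of one unrecoverable read error (URE) per 10^14 bits is an assumption (a figure often quoted on consumer drive spec sheets), errors are treated as independent, and the pool is assumed to be full so that both surviving disks are read end to end. The function name `p_read_error` is made up for this example.

```python
import math


def p_read_error(bytes_read: float, ure_rate_bits: float = 1e14) -> float:
    """Probability of at least one unrecoverable read error (URE)."""
    bits = bytes_read * 8
    # 1 - (1 - 1/rate)^bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-1 / ure_rate_bits))


# Resilvering a 3 x 3 TB RAIDZ1 means reading the two surviving disks in full
# (worst case: a full pool).
p = p_read_error(2 * 3e12)
print(f"{p:.0%}")
```

Real drives may do better or worse than their spec sheet, so this is a rough way to see why bigger disks make single parity riskier rather than a prediction for any particular pool.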
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
I think I sort of get the gist of it now. Seeing as I have only three disks and Raidz1 is not an option, my only other option for the amount of data I have is to do three individual pools. It would be nice to have it all linked up, but this seems the safest route that I can think of.
Actually I linked to a post where you posted that link. x'D

So far it's: 1) Use multiple pools. 2) Screw it, I'll disable atime on each vdev and hope that's the right way to do it. 3) Get it, but sadly can't use it. 4) Stay away from big pools. Can't swim, so no problem there.
Thanks for getting me two thirds there, guys! Totally hoping I can try it out later today; I should be sleeping. o.o;

5) Just need to find out if I can find this option in the GUI hopefully it's somewhere in there.
Oh and I forgot to add
6) Is it possible to install a game (or any program) via CIFS/NFS and will multicore be a major factor here ? Or Hz ? Or is it just the speed of the network (NIC/Switch/"router") ?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Some games/programs won't let you install to network shares. Some games/programs won't run well over network shares either.

In Windows, one problem I was able to identify was that the drive letter mappings used your Windows login (mael, for example), but some programs needed to use admin rights and gave you that UAC popup. Those programs won't "see" the drive mappings because they're not available on the account it's running on. So the executable will load, but immediately crash because it can't find the rest of its installation. I don't know of any fix for this.

Also, many programs run things during system startup and login. Your network share won't exist until you actually log in. In those cases, all of those programs will have issues. My Linux machine runs Plex and accesses my movies via a network share. Unfortunately, Plex loads before the network share is mounted, so Plex always has an error when it first loads and I have to either wait 2 hours for the automated scan or force it to rescan since it thinks my entire library has been deleted.

There will be some added latency with running programs over CIFS/NFS because of network limits. It's one of those things that sounds great in theory, but in practice it doesn't do so well. The best advice I can give is to not try it. But you are welcome to try it at your own peril. Just don't be surprised if you trash your Windows installation. ;)
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
I should seriously be shot for wasting everyone's time on badly worded questions. I won't even try to make an excuse that it was past midnight when I wrote that. Although you did probably save me from making some mistake in the future and for that I am grateful.

What I meant is can I run the installer over the CIFS/NFS.

However, I'm guessing the answer is probably going to be no ? At least for Windows. I want to run some scripts I wrote on Linux most of them probably under fifty lines.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Uhh.. what installer? Are you talking about something like a Microsoft Office CD copied to a network share? Often you can. Sometimes you can't. Depends on the program.

For Linux, I'm not 100% sure. Everything I've ever tried to run that was on a network share ran fine as long as it was a standalone application or script.

*bang* (you're dead)
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
Say, Assassin's Creed or The Witcher 2. Those are the two biggest games that we own that I can think of off the top of my head.

And just to run the installer from the network share. The actual install location will be C: or some other local drive.

*dies* D;
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'd say give it a try. It may work, it may not. Some games won't let you copy a disk due to DRM.
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
I'm afraid I don't follow you. A disk? I almost exclusively (damn you, games I bought off of Amazon with your DRM!) buy games from gog and humble bundle, which both provide DRM-free games, and they only offer them as executable files, not discs. I probably only own like 20 games that are on CD/DVD, and those I'll just install from the disc.
I just find no point in putting a bunch of .exe's that'll just be used to install the game on the Windows machine. I worry about fragmentation. And figure this might help a bit.

Would NFS make a difference or would it be the same in CIFS ?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I thought you had your games on CD/DVD.

I don't see how the protocol itself would matter. I stick with CIFS in Windows though because that's what the Windows world uses, almost exclusively.
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
Well, my initial concern was whether it would be intolerably slow, but I guess with gigabit ethernet it should be fine.
I also didn't know if multicore would affect it. I guess I misunderstood someone else, and multicore is only important for media or whatnot, and installing a game, I guess, is just heavy on transfer speed.
And then I started thinking about what you mentioned about permissions and didn't know if that'd be an issue when the UAC popup appears, but I guess you're saying it's not? No...if I understood you right, that would only be an issue with an installed program.
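
As a rough sanity check on the "gigabit should be fine" guess, the time to pull an installer across the network can be estimated like this. The 15 GB installer size and the 90% link efficiency are made-up illustrative numbers, `transfer_seconds` is a hypothetical helper, and in practice the disks or the protocol may be the bottleneck rather than the wire.

```python
def transfer_seconds(size_bytes: float, link_bits_per_s: float,
                     efficiency: float = 0.9) -> float:
    """Time to copy size_bytes over a link at a given usable efficiency."""
    usable_bytes_per_s = link_bits_per_s / 8 * efficiency
    return size_bytes / usable_bytes_per_s


GB = 10**9
installer = 15 * GB  # hypothetical installer size

for name, bits_per_s in [("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)]:
    seconds = transfer_seconds(installer, bits_per_s)
    print(f"{name}: about {seconds / 60:.1f} minutes")
```

Under these assumptions a gigabit link moves the whole installer in a couple of minutes, while a 100 Mbit/s link takes roughly ten times as long.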

Well, big thanks! I think I'll go ahead and set it all up now! Woot! I'm quite excited, heh. Seriously appreciate everyone that helped me out. Tried reading as much as I could and googling till it hurt but without the first hand experience I can never be sure.
 