BUILD Haswell build.


FREAKJAM

Dabbler
Joined
Oct 10, 2013
Messages
20
Hi,

For the last 6 weeks I've been reading around the internet after deciding that I want to replace the setup I'm currently using as my "NAS". Basically I'm running an HTPC with XBMC on it and 2 hard drives, giving me a total of 4TB of storage. I'm not running RAID or anything like that. My current setup is about 3 years old and it actually works just fine, but as stated, I'm running out of space.

I'm a systems engineer and I work with Windows Server and VMware on a daily basis. Combined with the "problem" mentioned above, I decided I want to build a new home server. I want to set up an ESXi server combined with FreeNAS running Plex Media Server, and I want to use the home server to educate myself and to play with VMware. I'm aware that running FreeNAS in a virtualized environment is not advised.

After 6 weeks of reading about ZFS (which is totally new to me), ESXi 5, FreeNAS, OpenMediaVault, ZFSGuru, etc., I think it's time to make a choice, order my parts and set up my new server. Before ordering, I would like to post the setup I have in mind and ask you guys to critique it and maybe give me some pointers. (Many people have already assisted me on some Dutch forums, but I also want your opinions.) As stated, I'm totally new to ZFS and FreeNAS, so correct me if/when I'm making wrong statements.

CPU: Intel Xeon E3-1230V3
MOBO: Supermicro X10SL7-F
MEMORY: Kingston ValueRAM KVR16E11/8 (2x, giving me 16GB of unbuffered ECC memory).
HDD: WD Red WD40EFRX, 4TB (3x)
SSD: Crucial 2.5" M500 120GB (2x: one as a datastore for ESXi, the other for ZIL/L2ARC on the same SSD).
PSU: Seasonic G-Series 360W
USB: Sandisk Extreme USB 3.0 Flash Drive

Setup explained:
CPU: the cheapest Xeon with Hyper-Threading. No integrated GPU, but the SM board has IPMI. It supports VT-d/VT-x, and the board is based on the C222 chipset, so good to go for ESXi.
MOBO: dual NICs, LSI 2308 onboard. The NICs might not work with ESXi out of the box, but there's a fix for that. Supports ECC memory, the best option when running ZFS.
MEMORY: ECC memory. I will probably only run FreeNAS, with maybe pfSense, for now, so 16GB will be just fine. I'm aware of the rule of thumb of 1GB of RAM per TB of storage, and since I want to give my home server 12TB of storage, 16GB will do. I will have 2 slots left on the SM board, so I can upgrade in the future.
HDD: not really sure about these, but I think Reds are the way to go: suitable for a 24/7 environment and fairly priced. I wanted to use Seagate Desktop HDD.15 ST4000DM000 drives at first, but someone on Newegg stated that the drives do not always work with LSI cards:
This drive does not work with some controller cards! When connected to LSI host bus adapters and connected to LSI backplanes the drive will not spin up. Additionally the bay that the drive was inserted into is permanently disabled until the entire computer is powered off and cold-booted!

This is a hardware/firmware flaw either in one of:
* LSI SAS9207-8i (contains an LSISAS2308 controller)
* Supermicro BPN-SAS-826EL2 (contains an LSI SASX28 expander)
* Seagate ST4000DM000

LSI refuses to address the problem since this drive is a "desktop" drive. That's stupid because the drive is in my _desktop_ -- not a server! Seagate has clearly done something different to this drive since their "enterprise" class drives work just fine -- and cost double the price of this one. I'm not even using RAID.

This drive model is defective and I will not be purchasing any more unless either Seagate or LSI provide me with a working firmware to fix this problem. I tested 6 different drives with two different firmwares, from two different stores, manufactured in two different countries, on two different controller cards. This is a design flaw!

If anyone thinks I should use other drives, please tell me so.
SSD: has power-loss protection capacitors, so ideal for the ZIL.
USB: supports very good random write speeds, an ideal USB stick to run an OS from (ESXi boot).

I'm aware that running 3 disks of 4TB is not the best/safest option. I want to run RAIDZ. My case is a BitFenix Phenom Micro ATX, so I'm not able to stuff the case with tons of hard disks (the Phenom Micro ATX can be outfitted with up to five 3.5" HDDs or six 2.5" SSDs). I will not be storing critical data on the NAS; most of it will be media, so running everything in a RAIDZ setup (meaning that if 2 disks fail I will lose everything) is a risk I'm willing to take.
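
To make the trade-off concrete for myself, I threw together a quick back-of-the-envelope sketch (the parity counts are just the standard RAIDZ1/RAIDZ2 values; real usable space will come out a bit lower after ZFS overhead and recommended free space):

```python
# Compare the drive layouts I'm considering: raw usable space vs. how many
# disk failures each one survives. Numbers are raw capacity only; ZFS
# metadata and recommended free space will reduce what's actually usable.
DRIVE_TB = 4  # WD Red WD40EFRX

layouts = [
    ("3x4TB RAIDZ1", 3, 1),
    ("4x4TB RAIDZ1", 4, 1),
    ("4x4TB RAIDZ2", 4, 2),
    ("5x4TB RAIDZ2", 5, 2),
]

for name, disks, parity in layouts:
    usable = (disks - parity) * DRIVE_TB
    print(f"{name}: ~{usable} TB usable, survives {parity} disk failure(s)")
```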

I will keep my old HTPC and remove its 2x2TB hard drives. A new SSD will be placed in the HTPC, which will run Windows + the Plex client (so basically I want to stream the media from the NAS to my HTPC). Maybe I can reuse the 2x2TB drives (Samsung Spinpoint F4 EG HD204UI) in some way in my new setup? When going for a 3x4TB setup, I will have room left for 2 more HDDs. Maybe I can use one of the old 2TB drives to store ESXi snapshots.

Questions:
1) Critique my setup (keep in mind that I am new to FreeNAS, ZFS, etc., so please be gentle :))
2) If 3x4TB in RAIDZ is really not the way to go, what setup should I use? Maybe 4x4TB in RAIDZ2? Again, I won't be storing critical data on my drives (still, nobody wants to lose data).
3) How many NICs would I actually need for my setup? Two is just fine, right?
4) I'm reading mixed stories about running the ZIL and L2ARC on the same SSD. Is it safe to run both on the same SSD?
5) Is a 360W PSU enough for this setup?


Thanks in advance.
 

FREAKJAM

Dabbler
Joined
Oct 10, 2013
Messages
20
Received my case today. I want to order the rest really soon. I'm still mostly unsure about how many HDDs I should use and what kind of RAID I should use. My case is too small to add a lot of HDDs, so adding an extra vdev later is a no-go.
 

JimPhreak

Contributor
Joined
Sep 28, 2013
Messages
132
Received my case today. I want to order the rest really soon. I'm still mostly unsure about how many HDDs I should use and what kind of RAID I should use. My case is too small to add a lot of HDDs, so adding an extra vdev later is a no-go.

Yeah, that case is not really meant to house 3.5" drives. I've used one for an SFF gaming build, which is really what they are meant for. You'd have been better off with something like the Arc Mini or Midi (fits 6-8 3.5" drives natively).
 

FREAKJAM

Dabbler
Joined
Oct 10, 2013
Messages
20
Yeah, but the case got delivered today and I really like the looks of it. I want to keep it small anyway. It can probably fit 3 or 4 3.5" drives max (and 2 SSDs). RAIDZ with 4 drives is not really the way to go, it seems. Perhaps I should just skip ZFS and go with OMV. I could start with 3 drives in RAID5 and add one when needed.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
For the same reasons that RAIDZ with 4 drives is "not really the way to go", RAID5 is just as bad. The link in my sig explains why UREs make rebuilding without data loss almost impossible.
 

Gonzo

Dabbler
Joined
Sep 11, 2013
Messages
18
For the same reasons that RAIDZ with 4 drives is "not really the way to go", RAID5 is just as bad. The link in my sig explains why UREs make rebuilding without data loss almost impossible.


I am building a home NAS with 3x2TB HDDs and I am planning to use RAIDZ. Does this mean it will not be a good setup for data protection?
 

JimPhreak

Contributor
Joined
Sep 28, 2013
Messages
132
I am building a home NAS with 3x2TB HDDs and I am planning to use RAIDZ. Does this mean it will not be a good setup for data protection?

What it means is that if one of your drives fails and you have to rebuild your array, there is a good chance another drive will fail (or your array will see a URE) during that rebuild process, in which case you will lose all your data.
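
To put a rough number on that risk, here's a quick sketch (it just takes the spec-sheet 1-in-10^14 URE rate and treats every bit read as an independent trial, which is a simplification; the 3x2TB layout is the one you mentioned):

```python
# Rough odds of hitting at least one URE while rebuilding a degraded RAIDZ1:
# the resilver has to read every surviving disk end to end, and consumer
# drives are typically spec'd at 1 unrecoverable error per 1e14 bits read.
URE_PER_BIT = 1e-14        # vendor worst-case spec, not a measured value
SURVIVING_DISKS = 2        # 3x2TB RAIDZ1 with one disk already failed
DISK_TB = 2

bits_read = SURVIVING_DISKS * DISK_TB * 1e12 * 8      # decimal TB -> bits
p_clean_rebuild = (1 - URE_PER_BIT) ** bits_read
print(f"Chance of at least one URE during the rebuild: {1 - p_clean_rebuild:.0%}")
# Prints roughly 27% for this layout -- not a certainty, but not rare either.
```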
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I am building a home NAS with 3x2TB HDDs and I am planning to use RAIDZ. Does this mean it will not be a good setup for data protection?

It means you'd better keep religious backups as RAIDZ1 is not very reliable with today's hard drives.
 

enemy85

Guru
Joined
Jun 10, 2011
Messages
757
It means you'd better keep religious backups as RAIDZ1 is not very reliable with today's hard drives.

But this is just because of the size of the drives, right? I mean, if I do a RAIDZ1 with 4x2TB or 4x1TB drives, a URE is less likely to appear than with 4x4TB, right?
Sorry to bother you with this question, but I want to be sure I've understood the point.
 

FREAKJAM

Dabbler
Joined
Oct 10, 2013
Messages
20
Well, the larger the disk, the longer it takes to rebuild, so I guess a URE is more likely to happen with larger disks.
I checked my case yesterday and I can store 5 3.5" drives max. I'm not sure whether I should buy 4x3TB or 4x4TB (I want to go for RAIDZ2 after reading the article cyberjock mentioned earlier). Placing a 5th HDD at the top of the case isn't really practical.

The case isn't really HDD-friendly, but it fits 5 hard disks (and 2 SSDs).

 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
But this is just because of the size of the drives, right? I mean, if I do a RAIDZ1 with 4x2TB or 4x1TB drives, a URE is less likely to appear than with 4x4TB, right?
Sorry to bother you with this question, but I want to be sure I've understood the point.

Maybe. It's a relationship between the URE rate and the size of the disk. The only way to answer that question is to do the calculation for error rates across the disk(s) based on the vendor-provided values. I will say that about 90% of our users who lose their data had a RAIDZ1.
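
For what it's worth, here's what that calculation looks like if you just plug in the vendor numbers for the 4-disk RAIDZ1 case asked about above (a sketch only: it assumes the worst-case 10^14 spec and treats every bit read as an independent trial, which real drives don't exactly follow):

```python
# Chance of at least one URE while resilvering a 4-disk RAIDZ1 (the three
# surviving disks must be read in full). Swap ure_per_bit to 1e-15 to see
# why enterprise-rated drives change the picture.
def p_ure_during_rebuild(disk_tb, surviving_disks, ure_per_bit=1e-14):
    bits_read = surviving_disks * disk_tb * 1e12 * 8   # decimal TB -> bits
    return 1 - (1 - ure_per_bit) ** bits_read

for tb in (1, 2, 4):
    p = p_ure_during_rebuild(disk_tb=tb, surviving_disks=3)
    print(f"4x{tb}TB RAIDZ1, one disk down: ~{p:.0%} chance of a URE on rebuild")
```

Same maths either way: the bigger the surviving disks, the more bits you have to read without an error, which is why drive size matters as much as drive count.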
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
Well, the larger the disk, the longer it takes to rebuild, so I guess a URE is more likely to happen with larger disks.


Sure, and if a URE does occur (and it likely will), then just like with any read error, the drive will simply retry a few more times as necessary. It's not going to kill your RAID the way that article states.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, that all depends, doesn't it? It's true to say "it MIGHT not"...
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
I fully concur that rebuilding/resilvering is a very hardware-intensive task, and if one of the disks in the array failed due to old age, the chances of another disk (often of the same age) failing increase significantly. But a URE itself will *not* make an array fail, even if a retry fails (a URE is fundamentally a random error and, being probabilistic, is unlikely to occur in the exact same spot again). So at worst you will lose something, but not the whole array.
I wonder where that nonsense came from...

I've had cases where I tried to rebuild an array using a shiny new disk that happened to be a dud... You see the rebuild either failing or taking an awfully long time. In all cases it was just a matter of putting a new disk in (while praying no other disk would fail in the meantime) and life goes on...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Really? Have you seen how many users have had corrupted pools that wouldn't mount because ZFS's metadata got trashed by a few UREs? Do what you want... I'm not going to try to sell people on it or argue it, because I don't care. If you think it's okay, then do it. Don't come crying to us when you end up with a zpool that won't mount and nobody has the solution.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Some of us don't like praying when a technological fix is available.
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
Really? Have you seen how many users have had corrupted pools that wouldn't mount because ZFS's metadata got trashed by a few UREs? Do what you want...


Except that ZFS doesn't have a central location for metadata; just like any modern file system, the days when there would be a single point of failure like this are long gone. If not, ZFS would simply be a totally unreliable file system and no one would use it.

So yes, I'd love to see the list of all those users who lost all their data with ZFS due to a single URE :) If there are so many of them, they can't be hard to find.

Interestingly, a Google search for "ZFS URE data loss" always links to the same articles or *your* own doom-and-gloom answers, yet not a single case of someone actually losing data (and I've looked at the first 3 pages of results).

I've been using ZFS since it was introduced in FreeBSD 7 (and for months before that while it was being tested); I've built a lot of arrays and done an enormous amount of trials and testing over the years. Not once has any failure amounted to a total loss of data (assuming, of course, that the hardware redundancy was respected).

I'm not saying you won't suffer data loss; I'm arguing against the "you'll lose your whole array" argument; that's FUD in my opinion...

When "metadata" or a stripe actually gets damaged due to a URE on the remaining disks: it will lead to the loss of a file; ZFS is actually smart enough to tell you which file it is, but the rebuild continues, it never stops and loose it all. The only common point between RAIDZ1 and RAID5 is the number of disks involved; and that pretty much stop there. While with RAID5 you could end in a whole world of troubles under the same circumstances, ZFS will sail through.
A URE like what occurs in the article you have in you signature, is a random event; occurring on average every 10^14 bits read. When an error occur, if detected (yes: that's not always the case with a URE) like with most errors: there's a retry. For the a block of data to fail to read again, at that exact moment, on the exact same location being a one in a 10^14 event: the maths state that's rather unlikely (of course not impossible, but damn low)
To loose all your array, you need a complete hard drive failure at a time when there's no more redundancy left.
I'm not going to sell people or argue it because I don't care. If you think its okay, then do it. Don't come crying to use when you end up with a zpool that won't mount and nobody has the solution.

When you get into that kind of argument, it usually means there's no rationality left behind it, and it turns into a religious debate.
 

jyavenard

Patron
Joined
Oct 16, 2013
Messages
361
Some of us don't like praying when a technological fix is available.


Well, that's the key, I guess... You can either pray out of an irrational fear, or look at the facts. Unfortunately, you can never, *ever* cater for all scenarios. Ultimately, there's always a chance you'll lose it all at once, including all your backups, no matter how many you've made...

Or you could simply die and not care either way whether your data is safe :)

Before my first daughter was born, I was told that there was a one-in-1500 chance the child would have Down syndrome. Those are actually pretty great odds (interestingly, my wife felt better after I told her that was 0.06%... people and numbers: always weird), yet people still have healthy kids... You don't stop having kids because there's a chance things go wrong...
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
For those of us who do this professionally, things working the way you'd like or expect are the exception rather than the rule. It isn't an irrational fear. Thirty years in the business teaches that, with the air of indifference you're suggesting, the bad blocks will conspire to render a pool unrecoverable in the worst possible way... because fate picks the unprepared to victimize. The annoying corollary is that if you have a RAIDZ3, you might never have a disk fail at all (fsck it all!).

You've basically rephrased the problem by suggesting that
a URE itself will *not* make an array fail,
which is true, but the question is what else is going on. Does the drive freeze because of bad firmware? Is there a large chunk of blocks with UREs? Does the drive lack TLER and basically appear to lock up from trying to read a bunch of bad blocks? Have the drives all been baked by environmental problems? And remember, this is all in the context of a single drive already having failed and being out of the array. How many more errors do you expect ZFS to handle? Metadata is typically redundant, yes, but...

This is the real world. We're not really lucky enough to merely smack into a single URE most of the time.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Except that ZFS doesn't have a central location for metadata; just like any modern file system, the days when there would be a single point of failure like this are long gone. If not, ZFS would simply be a totally unreliable file system and no one would use it.

So yes, I'd love to see the list of all those users who lost all their data with ZFS due to a single URE :) If there are so many of them, they can't be hard to find.

You know what's so great... I don't have to do JACK SQUAT. I don't have to prove anything to you. NOTHING. I'm not going looking for a list. I've provided the information, and if YOU want to do the research on the topic, go have a ball. I have thoroughly explained everything I needed to, in plenty of detail. If you don't like our advice, you don't have to take it. But don't expect me to try to "sell" you on it, because you aren't paying me to do what should be your OWN homework. It's YOUR server, YOUR data, and YOUR risk vs. reward.

Interestingly, a Google search for "ZFS URE data loss" always links to the same articles or *your* own doom-and-gloom answers, yet not a single case of someone actually losing data (and I've looked at the first 3 pages of results).

So 3 pages of results out of 73,300... thank freakin' god you did your "homework", I'm glad you are so thorough! And I saw a single FreeNAS post on the first page, with the rest being other websites from all over the place. So be a little more honest about "always links to the same articles or your own answers". Far from the truth.

I've been using ZFS since it was introduced in FreeBSD 7 (and for months before that while it was being tested); I've built a lot of arrays and done an enormous amount of trials and testing over the years. Not once has any failure amounted to a total loss of data (assuming, of course, that the hardware redundancy was respected).

I'm not saying you won't suffer data loss; I'm arguing against the "you'll lose your whole array" argument; that's FUD in my opinion...

Congrats.. would you like a cookie?

You know what? ZFS isn't supposed to become unmountable from a loss of power. ZFS is supposed to smartly roll back a partially complete transaction instead of the pool becoming unmountable no matter what you do. But you know what? We've had a dozen-plus of those this year. I can't explain everything, but I do see trends, and I provide that information. I don't have any reason to "sell" you on it. I just want readers who are new to ZFS to understand potential pitfalls they might not be aware of. Don't like it? Don't read the forums! It's as simple as that.

When "metadata" or a stripe actually gets damaged due to a URE on the remaining disks: it will lead to the loss of a file; ZFS is actually smart enough to tell you which file it is, but the rebuild continues, it never stops and loose it all. The only common point between RAIDZ1 and RAID5 is the number of disks involved; and that pretty much stop there. While with RAID5 you could end in a whole world of troubles under the same circumstances, ZFS will sail through. A URE like what occurs in the article you have in you signature, is a random event; occurring on average every 10^14 bits read. When an error occur, if detected (yes: that's not always the case with a URE) like with most errors: there's a retry. For the a block of data to fail to read again, at that exact moment, on the exact same location being a one in a 10^14 event: the maths state that's rather unlikely (of course not impossible, but damn low)
To loose all your array, you need a complete hard drive failure at a time when there's no more redundancy left. when you get into that kind of argument, it usually means that there's no more rationality behind them, and it falls into the religious debate

First, UREs are what again? Oh, that's right... unrecoverable. So how the red text could happen is pretty amazing. The WHOLE definition of URE is "unrecoverable read error". That means the drive is unable to recover. It can retry once or it can retry 1 billion times; it is unrecoverable. But you are right: you know what that "damn low" value is? 1 in 10^14 (or whatever your hard drive lists). Believe it or not, the average hard drive has read errors pretty regularly, especially drives doing crazy amounts of random seeks. Most of the time it retries and all is better.

Google some Seagate SMART data some time. They actually deviate from the "norm" by reporting real read error rates. All the other manufacturers have attributes like "Raw_Read_Error_Rate" and "Multi_Zone_Error_Rate", but they kind of lie and say zero until an error is unrecoverable even with retries. Then, to make things worse, it's not retries like you and I would think; the value only changes after an internally calculated rate exceeds some really small threshold. WD might use a "Raw_Read_Error_Rate" of zero to signify anything smaller than 10^13 and 1 for 10^12. We don't know, as we're not privy to what "0" and "1" mean. Rates aren't just raw numbers; they're "errors per unit X", where X might be time, MB, blocks read, sectors, etc. The only thing you and I can really derive from zero and one is that zero is "better" than one. That's it.

If you have a Seagate, you can actually make those read error rate values go way up if you force certain types of workloads that cause read errors due to seek errors. You just have to be REALLY smart about it, because the hard drive will use NCQ and TCQ to try to reorganize the requests into order. A custom program has worked best for me in the past. Also keep in mind that those rates are supposed to be "worst case" for a properly functioning disk. Your brand spanking new disk might be 10^16. But you have no way to easily make that distinction without firmware monitoring via a serial connection to the controller. Hard drives are fun to play with via serial connections directly to the controllers. ;)
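
If you want to keep an eye on those attributes on your own drives, here's a minimal sketch (it assumes smartmontools is installed and a FreeBSD-style device name like /dev/ada0, so adjust the path; attribute names and raw-value meanings vary by vendor, as discussed above):

```python
# Pull the read-error-related SMART attributes from `smartctl -A` output.
# Interpreting the raw values is vendor-specific; this only surfaces them.
import subprocess

WATCH = {"Raw_Read_Error_Rate", "Seek_Error_Rate", "Multi_Zone_Error_Rate"}

out = subprocess.run(["smartctl", "-A", "/dev/ada0"],
                     capture_output=True, text=True).stdout

for line in out.splitlines():
    fields = line.split()
    # Attribute rows start with the numeric attribute ID; the raw value is last.
    if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCH:
        print(f"{fields[1]}: normalized={fields[3]} worst={fields[4]} raw={fields[-1]}")
```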

Actually, I can deliberately induce (and have induced) failures that force ZFS into a repeating loop that makes it scrub forever. It'll get to that one file, then loop forever, never actually finishing the scrub or scrubbing the remaining files. So talk all you want, but it's pretty obvious that you don't actually have the knowledge you think you do. So yes, it's completely possible to end up with metadata that is corrupted in a way that makes the pool do very weird things. There have been plenty of people whose scrubs lasted for days when they should have taken a few hours. Some people have screenshots showing that the scrub was 100% complete for 3+ straight days. So saying it doesn't happen is simply wrong. It's quite possible, and for some it happens unexpectedly. Most people wouldn't put much faith in a pool that they could never scrub to completion again. The really crappy part is that you have no easy way of identifying the actual file it's looping on. Whoops!

The bottom line is that ZFS is designed to do 3 things to protect your data:

1. Prevent any transaction from being partially written. It's supposed to detect and roll back any partial transaction due to a loss of power, etc. Notice I say "supposed to" because of my comment above about people ending up with an unmountable pool after a loss of power. Journaling file systems use a similar approach to solve this problem; NTFS does this in the background. This is also why those old-school Win98 boxes would want to do a chkdsk on bootup if the previous shutdown was not clean. Microsoft did this on purpose because there was no way to determine whether transactions were incomplete except with a file system check.
2. Use checksums and/or parity to verify and correct errors from the storage subsystem. (Notice I didn't say RAM)
3. Use scrubs to ensure that all of your disks are in sync with each other.

When you do something that breaks one or more of these, you have a problem (#1 is pretty hard to break because you don't really have control over it, but #2 can be broken by bad non-ECC RAM or by corruption with insufficient parity to fix it). ZFS has no choice but to accept whatever input it receives from the disks, even if it's garbage. It's just like that FCC warning you see on everything that says "this device must accept any interference received, including interference that may cause undesired operation". If it can't figure out what should have been there, it just takes it as it is. Hopefully it'll work well enough not to cause a kernel panic. But there are plenty of users here who have had systems that boot up and kernel panic when they try to mount their pool. And it's not limited to FreeNAS or even FreeBSD.
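
To make #2 concrete, here's a toy sketch of the checksum-plus-parity idea (deliberately simplified single-parity XOR, not ZFS's actual on-disk format; it's only meant to show why corruption plus insufficient parity leaves nothing to repair from):

```python
# Toy illustration: checksums catch the bad copy, parity supplies the data
# to rebuild it. With the parity (or the other block) also gone, there is
# nothing left to reconstruct from -- the "insufficient parity" case above.
import hashlib

def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

# Two data blocks plus one XOR parity block, standing in for a RAIDZ1 stripe.
d0, d1 = b"metadata....", b"file data..."
parity = bytes(a ^ b for a, b in zip(d0, d1))
stored_sums = [checksum(d0), checksum(d1)]

# Simulate an unrecoverable read error corrupting d1 on disk.
d1_read = b"file dXta..."

if checksum(d1_read) != stored_sums[1]:
    # Checksum mismatch: reconstruct d1 from the other block and the parity.
    repaired = bytes(a ^ b for a, b in zip(d0, parity))
    assert checksum(repaired) == stored_sums[1]
    print("corruption detected and repaired from parity")
```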
 