UFS non ECC vs ZFS non ECC risk numbers

Status
Not open for further replies.

BobJ

Dabbler
Joined
Jun 3, 2015
Messages
32
I read through the locked ECC discussion but found people skating around the question. They usually just say "use ECC, end of story."

I would like to see real-world numbers on the risks. Many people have old, perfectly good non-ECC computers lying around that would make a nice home NAS. So rather than dump them in the trash, they can make them useful. I fully realize ECC is the best way to go.

Are you saying that if you do not have ECC RAM and want to set up a RAID 1 FreeNAS box, you should only use UFS?

1. Single SATA hard drive: risk of corruption?
2. RAID 1, non-ECC, ZFS: risk of corruption?
3. RAID 1, non-ECC, UFS?
4. RAID 1 on a Dell PERC 6i?
5. RAID 1, ECC, ZFS?

I have 5 NAS boxes that I built.
Two of them have ECC and quad Xeons with 8 GB (Dell 490s).
Two of them are non-ECC ZFS RAID 1 with one 8 GB stick.
One is UFS, non-ECC.

They are running 24/7 with no UPS for 4 years or so. They have had power outages, abrupt shutdowns and all sorts of bad things.

So far: zero bad scrubs, zero bad files, and they are all working like new.

I use them all for backups but I don't copy to one then take that copy and move to the others. The copy is from my PC. I also have USB3 drives I keep in a fireproof box as a 6th backup. Again I don't copy from one NAS to another so no chance for them to spread corruption unless my original was bad to begin with.

Maybe RAID 5 uses more RAM and thus might induce more errors than RAID 1 on ZFS.

How does one prove that a memory issue caused a non-mounting pool, as opposed to some other cause?

Should I change my non-ECC ZFS NAS boxes to UFS?

Is there any real proof that UFS + non-ECC is some factor X more reliable than ZFS + non-ECC?

Of course Sun would recommend ECC in enterprise situations; almost every computer in business environments uses ECC, no matter whether it runs ZFS or not.

Does ZFS + non-ECC offer anything over UFS + non-ECC?

If not, then just make it simple. In the ZFS setup of NAS4Free or FreeNAS, put up a big RED warning pop-up when someone attempts to make a ZFS pool: "DO YOU HAVE ECC RAM? If not, please select the UFS file system for your RAID."

There just seems to be a bit of religious fear mongering going on here on the topic of ECC.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
If you value your data, then you will invest in ECC RAM. Period. Degrees of safety and probabilities of corruption are irrelevant in my mind. Either it's important or it's not. And to quote Matt Ahrens from the link you posted (which referred me to the Ars Technica comments): "I would simply say: if you love your data, use ECC RAM. Additionally, use a filesystem that checksums your data, such as ZFS."
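
To make the two halves of that quote concrete, here is a minimal Python sketch. It is not ZFS code: SHA-256 from hashlib stands in for the pool's block checksum, and write_block, read_block, and flip_bit are hypothetical helper names made up for illustration. Under those assumptions, it shows that a checksum stored at write time catches a bit that flips afterwards (for example on disk), but cannot catch a bit that flipped in RAM before the checksum was computed, which is the gap ECC is meant to cover.

```python
import hashlib
from typing import Tuple


def checksum(data: bytes) -> str:
    """Stand-in for a filesystem block checksum (SHA-256 here, an assumption)."""
    return hashlib.sha256(data).hexdigest()


def write_block(data: bytes) -> Tuple[bytes, str]:
    """Simulate a write: keep the data together with its checksum."""
    return data, checksum(data)


def read_block(stored: bytes, stored_sum: str) -> bytes:
    """Simulate a read: verify the data against the stored checksum."""
    if checksum(stored) != stored_sum:
        raise IOError("checksum mismatch: corruption detected")
    return stored


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with a single bit inverted."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


original = b"family photos"

# Case 1: the bit flips AFTER the checksum was stored (e.g. on disk).
stored, stored_sum = write_block(original)
try:
    read_block(flip_bit(stored, 3), stored_sum)
    detected_on_disk = False
except IOError:
    detected_on_disk = True

# Case 2: the bit flips in RAM BEFORE the checksum is computed,
# so the checksum faithfully describes the already-bad data.
stored, stored_sum = write_block(flip_bit(original, 3))
silently_wrong = read_block(stored, stored_sum) != original

print(detected_on_disk, silently_wrong)
```

Both flags come out true: the later flip is caught, the earlier one is not. That says nothing about how often either case happens in practice, only that the two protections cover different windows.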

Very interesting report on the topic of correctable memory errors, which should address some of your probability questions:
http://www.cs.toronto.edu/~bianca/papers/sigmetrics09.pdf
Still think it's a bit of religious fear mongering?
 

BobJ

Dabbler
Joined
Jun 3, 2015
Messages
32
It all depends on the situation. Using blanket statements does not answer the questions. Did you read that link I gave?

I have ECC in my 3 servers and in 2 of my NAS boxes. But ZFS + non-ECC is no worse than any other RAID file system. Many commercial NAS boxes do not have ECC. It's not about my data; it's about reality, the truth, and the real stats.

Sure, ECC is good in most ways (except speed), and where data integrity matters ECC wins. But for some people with a non-ECC box lying around, buying a new motherboard, CPU, RAM, power supply, etc. just to get ECC might be a waste of money, depending on their needs. Maybe they just want to stream movies and don't care about the slightly higher risk. It appears that ZFS + non-ECC is no worse than any other file system without ECC.

If you read that link, the risk is not as large as some people say, which coincides with my 30+ years of experience.

And yes, there is no question that fear mongering is going on. In an enterprise environment, by all means use ECC. But I would not tell people to throw away a computer and spend 800.00 on a NAS box just to get ECC. There are RAM errors, but in my experience they do not amount to tons of lost data, unbootable hard drives, etc. It can happen, just not at the rate some would have you believe.

Some data has more value than other data; some can be trashed and you do not care. There is a place for RAID 0, just as there is a place for a used computer converted into a RAID 1 NAS with no ECC.

People act like the world will end if you don't get ECC.

Like I said, I have many, many servers and computers (40+), some ECC, some not. I can't think of the last time I lost anything important. I keep that backed up in several places; I don't rely on one ECC NAS to keep my important backups.

For a new NAS build with no existing computer, then yes, the added cost of ECC is not that great, since they need a motherboard, RAM, hard drives, case, power supply, etc. anyway.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
You don't seem to have a good grasp of how ECC works or how a filesystem works. Until you get more familiar with it, just know that ECC memory is a must for ZFS and is helpful with other, non-checksumming filesystems, but memory probably won't be what causes corruption. A bad disk will probably cause corruption via bit rot before bad memory does, unless you just have a bad stick to start with.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Clearly you did not read my post. I quoted from the link you provided.
 

BobJ

Dabbler
Joined
Jun 3, 2015
Messages
32
depasseg said: "Clearly you did not read my post. I quoted from the link you provided."

I read it, and yes, I already know there are errors with non-ECC; that was not my question. And that article can be taken out of context too. One could say 92 percent of the DIMMs had zero errors. And it did not correlate those errors with how many files would be expected to be corrupted by them.
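
As a rough way to put a number on that figure, here is a small Python sketch. It takes the 92 percent quoted above at face value, over whatever observation window the paper used, and it assumes DIMMs behave independently of each other, which is a simplification and not something the paper claims. The function name is made up for illustration.

```python
def p_any_dimm_with_errors(n_dimms: int, p_clean: float = 0.92) -> float:
    """Chance that at least one of n_dimms sees an error.

    p_clean is the fraction of DIMMs with zero errors (0.92 as quoted in
    the post). Independence between DIMMs is an assumption of this sketch.
    """
    if n_dimms < 0:
        raise ValueError("n_dimms must be non-negative")
    if not 0.0 <= p_clean <= 1.0:
        raise ValueError("p_clean must be a probability")
    # All n DIMMs stay clean with probability p_clean ** n_dimms.
    return 1.0 - p_clean ** n_dimms


for n in (1, 2, 4):
    print(n, round(p_any_dimm_with_errors(n), 4))
```

Under those assumptions a single stick comes out at 8 percent, two sticks at about 15 percent, and four sticks at about 28 percent. Note this counts DIMMs that logged any error at all; it does not say how many of those errors would ever land in a file.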

The fear was that ZFS + non-ECC was much worse than, say, UFS + non-ECC. I know how ECC works. I am not saying ECC is worse; that is not the point. I see people parroting misinformation.

Also, in my 30+ years of experience I have not seen the type of corruption where my data is gone and never coming back.
I could write a paper that says 99 out of 100 computers are broken when you buy them. You might say that is not your experience.

The link I gave showed that ZFS + non-ECC is not worse than, say, UFS + non-ECC.

Again, I am not saying ECC is bad; it's good. So can you get past that and stop with the "I don't think you understand, etc."?
I've been using ECC since before you were 10 (unless you are 50+ years old). I know how it works.

Scrub of Death was one fear-mongering topic. It appears to be overblown, if you read the link I gave. If you do not agree with the link's suggestion, let me know. I think it put unfounded fear into people, who thought ZFS + non-ECC was much worse than UFS + non-ECC. Again, this is non-ECC vs non-ECC, not a comparison to ECC.

It's like you've got blinders on: ECC or bust. "I will answer any and all questions with YOU MUST USE ECC, read my ECC article,
you will be assimilated, repeat..." Yes, I get the merits of ECC, I've had ECC since

Again, I think ECC is the way to go. I am just saying that non-ECC + ZFS is not as bad as it was made out to be.
 

BobJ

Dabbler
Joined
Jun 3, 2015
Messages
32
SweetAndLow said: "You don't seem to have a good grasp of how ECC works or how a filesystem works. Until you get more familiar with it, just know that ECC memory is a must for ZFS and is helpful with other, non-checksumming filesystems, but memory probably won't be what causes corruption. A bad disk will probably cause corruption via bit rot before bad memory does, unless you just have a bad stick to start with."

Not sure where you got that, unless you are just being a troll.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I think we are talking about 2 different things.

I look at ECC like a seatbelt, and the OS is like the car. The probabilities of injury without a seatbelt, regardless of the type of car I'm in, are irrelevant to me. I could be in a Yugo or an 18-wheeler; it doesn't matter. I'm wearing my seatbelt.

You are trying to somehow quantify the risk of injury from not wearing a seatbelt in a Yugo, a Camry, an F-150, and a dump truck.

I'm saying all of those risks are higher than they would be with a seatbelt, and are therefore, in my opinion, not worth trying to quantify.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
So it seems to me that you want an admission that this board tends to be conservative with data. Yep. We are.
Are there some folks that are quite zealous? Yep. Every system and forum has them.
Are there some incredibly experienced high level folks around that tend to be conservative? Yep. There are.
Can you ignore their advice? Sure. Just own your choice.

The only statistically relevant studies I've seen are linked all over. You have Google's study. You have the one linked earlier. Both suggest that memory errors are far more common than we thought they were. Almost everything else is VERY subjective and has HUGE sample-size problems. So in my mind it is all BS until proven otherwise.

So there is no good math that can be done to accurately evaluate risk, imho. We only know that memory errors and problems occur and that they can be prevented easily and cheaply. ECC was not created as a conspiracy or money sink; IT SOLVES A KNOWN PROBLEM. In the case of ZFS we are adopting an enterprise-level file system, so hardware assumptions are stricter than for your run-of-the-mill file system.

Frankly, I have the same 30 years of experience. I can't remember EVER losing data specifically to an event ECC would have remedied. That is across hundreds of servers and workstations, many of which were under-specced and not on 'server' hardware. I even tend towards Matt Ahrens's assessment philosophically, that 'ZFS is no more dangerous than other file systems without ECC' (with caveats). BUT I'll never run a ZFS system containing meaningful data without ECC. EVER.

FreeNAS IS NOT about recycling old gear. It never will be. ZFS ain't that great when under-resourced either. There are lots of systems to address limited hardware. Pretty much any Linux distro can get you there. But here we DO lean towards 'DO IT RIGHT' or just don't do it.

Yes much of the ECC or GTFO is circular, and frankly I find the Scrub of Death logic kinda meh. I also see beyond black and white.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
[Image: xK1wGft.jpg]
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I've closed this thread. The topic duplicates a recent one. I summarized that previously:

https://forums.freenas.org/index.php?threads/risk-of-using-non-ecc-ram.32753/page-2#post-203960

We're kind of conservative here, because most people who don't give a frak about their data don't bother with a heavyweight ZFS implementation and all the expense incurred. It's a no-brainer to set up a FreeBSD or Linux box running vanilla or a lightweight NASware of some sort to serve files. Most of the people who end up here are looking for something a little more in terms of data protection.

IF YOU ARE NOT ONE OF THOSE PEOPLE, BY ALL MEANS, BUILD YOUR SYSTEM OUT OF WHATEVER CRAP YOUR LITTLE HEART DESIRES AND GET ON WITH YOUR LIFE.

We just ask that you don't blame ZFS or FreeNAS if things go south. For the rest of the new users, though, who do care about protecting their data, ECC is part of the foundation of the protection strategy, as is redundancy in the pool. You can do without ECC and you can do without redundancy in the pool! Of course! But then your chance of data loss goes up.

What's the chance that non-ECC could contribute more heavily to losing a ZFS pool than a more traditional filesystem? The real problem is that we're talking small differences in risk percentage. Opinions are like arses, everyone has one. I don't really give a crap what a software developer thinks of his own awesomeness and how well he thinks his crap will hold up under duress.

The lack of filesystem repair utilities for ZFS introduces additional considerations for ZFS that do not apply to many other filesystems. You are welcome to go run a stack of a thousand ZFS versus a thousand non-ZFS servers and see what the observed reliability is like; all I can say for sure is that I would be forever nervous about the state of a pool that had some sort of error introduced into it, because there's nothing there to repair it. It is potentially damaged forever.

You seem to be interested in trying to quantify what exactly that risk is. I don't give a frak. The additional cost of ECC isn't onerous when you look at the cost incurred to get into the ZFS game to begin with.

So let me summarize this for you:


Boring conversation anyways.
 