ZFS and ECC RAM question

Status
Not open for further replies.

Djg

Cadet
Joined
Apr 7, 2014
Messages
3
Hi

Everything I read about FreeNAS / ZFS says ECC RAM is pretty much a hard requirement.

I know that bad RAM can cause pool corruption, which is why ECC is needed, but you don't see all these warnings / horror stories about Linux mdadm.
What makes ZFS so much more sensitive to RAM corruption?

I'm not trolling here: I'm just evaluating whether FreeNAS would meet my needs, as I have an existing server (with ECC) that runs an old Linux version with mdadm and needs a ground-up rebuild…
Although I already use ECC, the warnings jumped out at me, and I don't recall seeing the same for mdadm, even though it does a (broadly) similar job.

Thanks for any info :)
 

alexg

Contributor
Joined
Nov 29, 2013
Messages
197
Google "ZFS vs mdadm". I spent several months researching where to move from Windows Home Server and settled on ZFS (FreeNAS specifically). I had to buy a new server and ECC memory, but at least I can be sure that my data will not go bad over time.
 

Djg

Cadet
Joined
Apr 7, 2014
Messages
3
Thanks Alex.
I had looked on Google (regarding ZFS and BTRFS), but all I could find was that it's easier to recover from a bad mdadm array with BTRFS, whereas recovery for ZFS is very difficult.
I'm sure I'm missing something though!

While I'm happy to accept the conventional wisdom that ZFS needs ECC, the bit I'm missing is why the same issues don't happen to mdadm.

I'm not being argumentative or anything; I'm a security guy by trade, so my PC hardware knowledge is quite limited, which is why I'm looking at FreeNAS rather than fiddling with upgrading Linux.
 

alexg

Contributor
Joined
Nov 29, 2013
Messages
197
Bitrot protection, no RAID write hole, snapshots, and replication via snapshots are my primary reasons for picking ZFS.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The most basic reason is that ZFS checksums everything. That extra layer is both a plus and a minus. It helps you in the event your disk is corrupt, but it hurts you in the event your RAM is corrupt. Read the ECC vs non-ECC RAM thread for an example...
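
A toy sketch of the checksum point above, not ZFS's actual code. It assumes a hypothetical in-memory "disk" and made-up `write_block` / `read_block` helpers, and uses SHA-256 purely for illustration. The idea is that a checksum only vouches for the bytes it was computed over: corruption on disk after a clean write is caught, while corruption in RAM before the checksum is computed gets a valid checksum.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest for a block (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit flipped, simulating corruption."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


def write_block(ram_buffer: bytes) -> dict:
    """Hypothetical write path: checksum whatever is in RAM and store both."""
    return {"data": ram_buffer, "sum": checksum(ram_buffer)}


def read_block(block: dict) -> bool:
    """Hypothetical read path: True if stored data still matches its checksum."""
    return checksum(block["data"]) == block["sum"]


original = b"important family photos"

# Case 1: the disk corrupts the data after a clean write.
on_disk = write_block(original)
on_disk["data"] = flip_bit(on_disk["data"], 3)
disk_corruption_detected = not read_block(on_disk)

# Case 2: RAM corrupts the buffer before the checksum is computed.
bad_ram_buffer = flip_bit(original, 3)
on_disk = write_block(bad_ram_buffer)
ram_corruption_detected = not read_block(on_disk)
stored_matches_original = on_disk["data"] == original

print("disk corruption detected:", disk_corruption_detected)
print("RAM corruption detected:", ram_corruption_detected)
print("stored data matches original:", stored_matches_original)
```

In this model the first case is detected and the second is not: the block written from the bad buffer passes verification even though it no longer matches the original data. It only illustrates one failure mode and says nothing about how often it occurs in practice.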
 

alexg

Contributor
Joined
Nov 29, 2013
Messages
197
Djg, I'll be honest with you: the FreeNAS web GUI makes it easy for a non-techie to configure and manage, but I'm not sure I would recommend it for someone without good hardware/OS knowledge. Others may disagree with me :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
alexg said:
Djg, I'll be honest with you: the FreeNAS web GUI makes it easy for a non-techie to configure and manage, but I'm not sure I would recommend it for someone without good hardware/OS knowledge. Others may disagree with me :)

I agree whole-heartedly.
 

memario

Cadet
Joined
Jul 29, 2012
Messages
2
I find it weird that there are so many opinions on the consequences of using non-ECC RAM.

Reading this thread and the "ECC vs non-ECC RAM and ZFS" post from cyberjock, I get the impression that ECC RAM is mandatory when using ZFS, and that choosing non-ECC RAM with ZFS will even hurt you.

cyberjock said:
The most basic reason is that ZFS checksums everything. That extra layer is both a plus and a minus. It helps you in the event your disk is corrupt. But, it hurts you in the event your RAM is corrupt. Read the ECC vs non-ECC RAM for an example...

In contrast, in this fragment of the BSD Now podcast, Allan Jude (quoting Matt Ahrens, a developer of OpenZFS) says that using non-ECC RAM combined with ZFS will at least not hurt you.

Although this still leaves me somewhat puzzled about the consequences of using non-ECC RAM, I have made my decision: I spent the extra euros and went with ECC RAM. This is mainly because my FreeNAS system is used for backups, and I demand that everything is stored with the utmost certainty of not being corrupted.
 

alexg

Contributor
Joined
Nov 29, 2013
Messages
197
If you want industrial-strength protection for your data, you go with ECC regardless of which OS or filesystem you use. There is no need to discuss it.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
We discussed this in the ECC vs non-ECC thread, memario. Check out the last 10 posts or so in that thread.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
memario said:
using non-ECC ram combined with ZFS will at least not hurt you

Well THAT'S probably true: it is unlikely that the non-ECC RAM is going to reach out from your screen and strangle you to death. But use of non-ECC RAM could hurt your DATA. And our basic theory is that people select ZFS to protect their data, so design decisions that compromise the ZFS protection strategy are something we discourage.

Feel free to use non-ECC RAM. Also feel free to slam your hard drives on a table several times before installing them; obviously if they still work after that then you can trust them, right? I also highly recommend rubbing your feet on the carpet and not grounding yourself when you handle the components in your computer. No need to shut down your computer nicely either, just power that bad boy off by yanking out the power cord and it'll take care of itself when you plug it back in. Also it is perfectly fine to move your computer around while it is running, don't worry if you bang or jar it.

You can absolutely do every one of the above things and there is a good chance it'll still be working after you've done it. The question is .... do you actually want to?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
My biggest complaints with the whole ECC versus non-ECC debate are:

1. Most people don't realize that non-ECC RAM can spell certain disaster.
2. Most people don't realize that non-ECC RAM can spell certain disaster for your backups.
3. Most people are trying to reuse some desktop hardware that is old (for various definitions of old).
4. Most people are not informed enough to realize that ECC RAM really makes a difference for reliability.

For #1 and #2, there are "ifs" attached. IF your RAM goes bad. IF you don't catch it the moment the RAM fails (which is impossible). IF your backups are "online" (that is, they aren't in a safety deposit box or equivalent).

If your RAM never fails, great. You can use non-ECC RAM and all will probably be just fine. But how are you going to guarantee that your RAM never fails? Some manufacturer came out and said that statistically, 5% of RAM sticks out there fail every year. So if you have 4 RAM sticks, you have roughly a 20% chance that one stick will fail in a given year. Those are pretty nasty odds when you are trying to use RAIDZ2, which has a statistical chance of total pool loss of <1% per year.

So when people claim they did RAIDZ3 with non-ECC RAM I stop and ask myself what the f*ck they were smoking. You just took a pool that has something like a <1% chance of being lost in a year, and made it 20% because of the RAM. Remember, it's about the weakest link in the chain, which means 20% (technically it's an equation that comes out higher than the weakest link alone, but let's keep this simple). RAIDZ1 is something like 18%, and we don't recommend that stuff! So why the hell would we recommend non-ECC RAM?

So, strictly by the numbers, RAIDZ1 = 18%, and 4 sticks of non-ECC RAM are 20%. So if you take these numbers as fact and have a choice between going with RAIDZ2 with non-ECC RAM or RAIDZ1 with ECC RAM, RAIDZ1 with ECC RAM is technically safer.
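
A rough sketch of the arithmetic behind those figures, under loudly stated assumptions: failures are independent, a failed stick is counted as a lost pool (the post's own simplification), ECC is treated as removing the RAM risk entirely, and the "<1%" RAIDZ2 figure is taken at its upper bound of 1%. The function names are made up for illustration, and the input rates are the ones quoted in the post, not measured data.

```python
def any_failure(p_single: float, count: int) -> float:
    """Chance that at least one of `count` independent parts fails."""
    return 1.0 - (1.0 - p_single) ** count


def combined_loss(*risks: float) -> float:
    """Chance that at least one of several independent risks occurs."""
    survive = 1.0
    for r in risks:
        survive *= 1.0 - r
    return 1.0 - survive


P_STICK = 0.05   # per-stick yearly failure rate quoted in the post
P_RAIDZ1 = 0.18  # yearly pool-loss figure quoted for RAIDZ1
P_RAIDZ2 = 0.01  # upper bound; the post says "<1%"

p_ram = any_failure(P_STICK, 4)
raidz2_non_ecc = combined_loss(P_RAIDZ2, p_ram)
raidz1_ecc = combined_loss(P_RAIDZ1)  # assumes ECC removes the RAM risk

print(f"4 non-ECC sticks: {p_ram:.1%}")
print(f"RAIDZ2 + non-ECC: {raidz2_non_ecc:.1%}")
print(f"RAIDZ1 + ECC:     {raidz1_ecc:.1%}")
```

With these inputs the exact figure for four sticks is about 18.5% rather than 20%, and RAIDZ2 with non-ECC RAM comes out at about 19.4% against 18.0% for RAIDZ1 with ECC, so the ordering in the post holds under these assumptions. Note that the combined figure is higher than the weakest link alone, which is the "equation" the post alludes to.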

Statistically, there will always be people who "beat the odds". So freakin' what. Does that mean you should go gambling too? But these people that beat the odds will never stop telling the world how awesome they are because they beat the odds. And they'll have no problem telling you that they did exactly what PersonX said not to do and that it worked for them (and therefore it MUST work for you). But when it doesn't work for you, your data is lost. They'll shrug their shoulders, walk to the next guy, and start telling him how awesome they are because they beat the odds.

So if you wanna go with non-ECC RAM, do it. I double dog dare you. It's your win or your loss. The stupid shall be punished, and I won't lose any sleep when someone makes the wrong choice and loses their data. The only thing I *want* is for people to *know* about these potential pitfalls. If they know and still do it, that's fine. Their gamble and their data. My concern is for the people that *don't* know.

I'm through giving a sh*t about people that want to argue it. I'm not here to argue it. I'm here to inform. If you don't want to be informed, then don't read about it and keep doing whatever it is that you are doing. If you want to dismiss me as an uninformed idiot, that's fine too. I still won't care. But I will gladly laugh at your demise when you lose your pool because of bad RAM. Not because I'm a jerk (I am a jerk, but for other reasons), but because I told you so and you didn't listen.
 

joelmusicman

Patron
Joined
Feb 20, 2014
Messages
249
Cyber, I'm surprised that you admin types don't do the following anytime an ECC thread comes up (which seems to happen at least twice a day!):
  1. Provide a link to the main ECC thread and tell people they're welcome to discuss there.
  2. Lock the thread.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Oh, trust me.. I've started that after the conversation I had last night.

And on that note.. thread LOCKED.
 