Blog article advising AGAINST ECC RAM

Status
Not open for further replies.

bigphil

Patron
Joined
Jan 30, 2014
Messages
486
That's funny to say the least. "hey...shit only breaks from the factory, so why spend the money!?" Not my experience. Just recently I had a new DL 380 G8 blow out a stick of memory that had previously been burnt in and tested perfect.
 

briancmoses

Dabbler
Joined
Apr 19, 2014
Messages
30
Hey folks,

Just noticed this very recent article on the interwebz and thought I should bring it to your attention. I guess he was too chicken to post the article here...

http://blog.brianmoses.net/2014/03/why-i-chose-non-ecc-ram-for-my-freenas.html

I'm the blog's author. I've written a few FreeNAS articles and have considered sharing them here however the forum rules state:
5. Advertising of any websites, services and/or products is prohibited in the forums. Pornography, drugs, warez, hacking and/or links to websites of this type are also prohibited.

Promoting my own blog in the FreeNAS forums seems to violate the spirit of that rule a bit.

If you'd really take some time and re-read the article, you should find the only thing that I'm advising is that people think before they spend the money and weigh all their options in order to make sure they get the most use out of the money they spend. I just happen to be of the opinion that there's more utility to be had by spending that same money in other ways.

I'm not a chicken, but I do have one set as my forum avatar.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Not a bad blog. It is his opinion, but I just added a comment on the blog myself, trying to point out the importance of Valuable Data and that the extra money spent is worth it to ensure your data remains intact.
 

trey22

Dabbler
Joined
Apr 11, 2013
Messages
28
Not a bad blog. It is his opinion, but I just added a comment on the blog myself, trying to point out the importance of Valuable Data and that the extra money spent is worth it to ensure your data remains intact.

Exactly. Many people will derive from this what they "want to hear" vs "what they need to hear". Give people a choice between recommended (ECC) and minimum specs (non-ECC), and guess what, they'll choose the min spec route then complain when something goes wrong. I know people are sheep ( :-0 ), but sometimes it's best not to give them a choice.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
neckhole, We're not against advertising services, within reason. If you want to start a discussion about a post(even a blog you write) you can always do that. We generally just ask for a copy of the text on the forums as blogs tend to just disappear, which makes for poor conversation in the future.

Now if you showed up and started in about how your brand of memory was superior or something like that, we'd probably have a problem and we'd take it up with you in PMs. Even when people show up and just start cussing out everyone they can find I have a hard time banning them. Most of us mods are very conservative with bans and moderating. We don't feel that someone should be banned just because we think they're an idiot or they don't make good arguments; that's not the best way to have a conversation. Note that I don't think you fit into either of those categories, I'm just making a case for the worst kind of posters.

But, you have witnessed firsthand why we admonish comments about non-ECC RAM here. People are stupid, and suddenly things are twisted into a meaning the opposite of what you said or tried to explain. In short, the average user here is an idiot, and we know it. In this case, your article seems to basically say "yeah.. there's serious real risks. I'm okay with those risks though". But then it changed a few days ago when someone mentioned your blog to mean something like "non-ecc clearly doesn't have the negatives that people like me talk about despite the overwhelming objective evidence proving the situation for other users". Now it's "someone recommends against ECC RAM". Welcome to my pathetic life and dealing with people that clearly aren't working in IT, or shouldn't be.

Now, in my defense(and yours), 99% of people that come here are looking for a rock solid reliable file server that won't eat their data suddenly one day. They're usually willing to pay a reasonable extra price for hardware if there is a verifiable meaningful benefit for their intended application. For most people, that definition means ECC RAM. Where I am(the USA) ECC RAM is only modestly more expensive than non-ECC RAM, but it means you likely won't be reusing RAM from your old desktops either. My whole argument for ECC RAM is not that everyone should use it. Yes, I think everyone should, but that's not why I made the post. I made the post because 99% of people that come here have no idea just how destructive a bad stick of RAM can be.

If you already know and you still go with non-ECC that's your business. Your server will run fine until it doesn't. And when it doesn't there's not enough money in the world to fix it. It's just gone. If it's going to be nothing more than some temp area for a video service that needs a huge playground for temp files non-ECC RAM may be what you want. But, I won't pretend to think that my opinion is all that matters. It's your data, and I simply want people to be informed of the potential consequences.

Now, after I read your blog post a few weeks ago, three things went through my mind:

1. Great.. someone who actually *should* work in IT. He's got a brain and can assess risk for himself.
2. WTF did that a**hole just do to me!? Now everyone will argue with me forever that non-ECC RAM is safe. (I don't really think you are an a**hole, it was more the shock of what I can expect in the forums.. like this thread)
3. How many people are going to decide I'm an idiot and that all of my advice should now be ignored because one person has a slightly different(but knowledgeable) perspective.

As the poor soul that will now have to field those questions and arguments forever you get my sarcastic "thank you". (I am joking because I'm fairly sure you didn't think it was going to get twisted around and shoved in my face like it has/will be).

And now, I have to sit back and ask myself how do I deal with this... do I delete any and all comments related to your blog? do I just choose never to answer them? do I go and ask you to remove your blog? do I just let people be stupid and buy non-ecc RAM and laugh at them when their only copy of their wedding pictures and family album turns to random bits? do I spend the rest of my days trying to explain myself in extreme detail how you came to your decision and how your decision was right for you but may not be right for me or someone else?

If it's the last question I'm really not looking forward to that. As an unpaid volunteer there's a certain amount of BS that comes through the forums I can deal with. But knowing that forever, even 10 years from now, people will argue against me because "some blog recommended against ECC RAM!" is another matter.
 

joelmusicman

Patron
Joined
Feb 20, 2014
Messages
249
Cyber: A novel, more like!

Back to topic: Even as a "home" user, I liked the benefits of bit rot protection offered by ZFS, and read about the dangers of non-ECC ram. I decided that I'd do an ECC build because ripping or "acquiring" movies takes a long time, and I don't want to try to play a movie one day and find out that my entire library of 1700ish movies and 500 TV episodes is trashed, not to mention DSLR images, documents, etc. So for me, saving a potential 200+ hours of my VALUABLE time was worth an extra $100 or so on the build.
 

joelmusicman

Patron
Joined
Feb 20, 2014
Messages
249
I have no idea what happened, but that was well laid out, but then got turned into a giant paragraph. forum failure FTW.


When I posted my reply just now, the carriage returns didn't go through as well. It worked when I edited it though. FYI...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Back to topic: Even as a "home" user, I liked the benefits of bit rot protection offered by ZFS, and read about the dangers of non-ECC ram. I decided that I'd do an ECC build because ripping or "acquiring" movies takes a long time, and I don't want to try to play a movie one day and find out that my entire library of 1700ish movies and 500 TV episodes is trashed, not to mention DSLR images, documents, etc. So for me, saving a potential 200+ hours of my VALUABLE time was worth an extra $100 or so on the build.

And that's a very responsible way to look at it! For some people, they wouldn't care. For others, they'd pay $500 for that insurance. Still, for others, it's not worth $20.

All I want is for people to be able to make that decision for themselves. There is no "right or wrong" as it is totally their choice. It's their data that gets lost. I just don't want them to lose their data and then be in shock when they find out that if they had spent an extra $100 on ECC RAM they'd never have lost their data. And believe me.. I had been here 18-20 months before I decided to write that ECC vs non-ECC thread.
 

briancmoses

Dabbler
Joined
Apr 19, 2014
Messages
30
As a reader of many forums in the past, it always troubled me when the guy who never had anything to contribute in the forums was pimping his latest blog in the forums. I hated to see a guy who had post count right around the same number of blogs that he'd promoted in the forum. I tend to take these kinds of rules pretty strictly, but thanks for clarifying. I didn't feel like it was unwelcomed to share my blogs but I also didn't want to be that guy that I don't care very much for.

I apologize for singling your thread out and using it as an example; I wasn't attempting to single you out or make you look bad. My intent was to point out that there is a pervasive anti-non-ECC vibe within the FreeNAS forums. I understand the justification and agree it probably benefits the uninformed masses. If there's some verbiage from my blog that you think is particularly offsides, just private message or email me your thoughts/suggestions and I'd be happy to take a stab at revising what I wrote to better express what I intended.

ECC ram is only modestly more expensive, but there are no modestly-priced low-power motherboards with a healthy number of SATA ports that support ECC. They exist, but they are quite expensive. The minute a "budget" mini-ITX board exists and supports ECC RAM, I'll be both blogging about it and upgrading to it.

For those people who might doubt and point to my blog and say that "see, this guy says Non-ECC is okay", I can state unequivocally that ECC is the better choice. I'm pretty certain that I state as much within my blog but I'm happy to repeat it here, since people may have glossed over it.

If as a potential FreeNAS user you can't be bothered to understand the dangers that ECC RAM saves you from, then you definitely should be buying ECC RAM.

Edit: Wow, forum formatting failures abound!
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I don't feel like you were singling me out or anything. Your blog was, in my opinion, very professional and very rational. The problem is not with you, your logic for using non-ECC RAM, or anything like that. The problem is there are too many people that should not be working in IT at all, but have for years, and they'll happily justify themselves with "I got 20 years in IT". It doesn't matter what you argue, you are wrong because they have 20 years in IT. As soon as someone drops the "I've worked in IT for XX years" I know they are an ignorant idiot.

Well, I technically have zero years in the IT field, but many people look up to me like some kind of IT god. I'm no god, and I've made plenty of extremely stupid mistakes. I've lost data that I'll never get back because of backups that were automated, but failed at an inopportune time along with me making a very poor decision with a RAID controller. I just make sure I always learn from my mistakes, even if they are the most expensive ones I've ever made in my life.

As soon as someone is willing to admit they are not a genius in their field and that no matter how much they know there's a cubic butt-load of information in their field they don't know, they are better off than 98% of the people in their field. But for the ignorant people that think they can do no wrong, think they have learned all there is to learn because "they've worked in IT for XX years", those people deserve what's coming.
 

DJ9

Contributor
Joined
Sep 20, 2013
Messages
183
My dad has a saying, "Experience teaches slowly and at the cost of mistakes".
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
My dad has a saying, "Experience teaches slowly and at the cost of mistakes".

Yep.. that's why I like to know why people lose their data. Let everyone else lose their data and you get the benefit of their experience. ;)

Not that I hope people lose data, but it's gonna happen no matter what I do.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
ECC ram is only modestly more expensive, but there are no modestly-priced low-power motherboards with a healthy number of SATA ports that support ECC. They exist, but they are quite expensive. The minute a "budget" mini-ITX board exists and supports ECC RAM, I'll be both blogging about it and upgrading to it.

What do you consider a budget price for a MB? Mine cost $65 US, but it is a uATX board. The CPU was $99 US, which brings the cost up to $164 US for MB+CPU. All the other costs are the same between components in general. My MB isn't fancy like all those Supermicro boards but it's been getting the job done for almost a year now without a single issue.
 

D4nthr4x

Explorer
Joined
Feb 28, 2014
Messages
95
My opinion on this is that if you want to cut corners and that data isn't "that" important you shouldn't be wasting resources on ZFS anyway since the entire file system is designed to have overhead for protecting important data. If you just want to host stuff you torrented you are much better off with mdadm on linux. It is still affected by bad non-ecc ram but to a much lesser degree (no scrubs or checksumming etc.).
 

aufalien

Patron
Joined
Jul 25, 2013
Messages
374
A real simple way to put it is that ZFS uses RAM as a sort of storage extension so why wouldn't one want to ensure that the RAM used is as error free as possible?

Just because one hasn't been hit by a car crossing the street on a red light, doesn't mean that it can't happen.
 

joelmusicman

Patron
Joined
Feb 20, 2014
Messages
249
A real simple way to put it is that ZFS uses RAM as a sort of storage extension so why wouldn't one want to ensure that the RAM used is as error free as possible?

Just because one hasn't been hit by a car crossing the street on a red light, doesn't mean that it can't happen.

Not to mention that hardware RAID controllers almost without exception use internal ECC ram for buffering.
 

aufalien

Patron
Joined
Jul 25, 2013
Messages
374
Not to mention that hardware RAID controllers almost without exception use internal ECC ram for buffering.

OMG right, how could I have overlooked this one!

Man, blogs are no more than opinions and those are like a#$holes, everyone has one.

It would be more interesting to have him list his experience in tech. How many years has he been where the rubber meets the road, so to speak?

Oh never mind, I just read more of it, yeah he lists it. Just goes to show you that anyone can do tech and get paid to do it.

I'm sorta wondering what the point was? We all have an experience, should we all blog those? Just weird.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
That argument regarding RAID controllers has been made many times here. The analogy is something like "Your RAID controller uses ECC RAM for its playground, while ZFS uses your system RAM for its playground. So why would you think it's illogical to then expect ECC RAM to be the expected norm for ZFS?"

The blog does ignore some basic problems with using non-ECC RAM. Based on feedback from dozens of users and some simple understanding of how all this technology works you can expect:

1. If your RAM goes bad, there's a significant chance(>90%) that you will have no clue that the RAM has failed for some indeterminate period of time.
2. If you do things like rsync which touch all of your data at regular intervals there's an extremely high chance you will destroy a significant amount of data on your entire backup location. Why? Because when your RAM fails you have no clue it's failed for that "indeterminate period of time". For people that have had a very good solid backup routine they've had no logs indicating errors, no clue whatsoever anything was wrong until the primary server and backup were damaged beyond usability. In almost every case they realize the RAM is bad, think this is a simple "do a restore from backup" only to find that the backup files don't open, are garbled, etc.
3. Running memtest pretty much validates the RAM is still good. When RAM goes bad you won't likely know until you run the test and get errors. Well, that's a little too late for your primary server(and almost certainly your backups). The whole point of ECC RAM is to prevent errors or halt the system to prevent data corruption in RAM. In other words, you've deemed the server's job to be of high enough importance that you'd rather stop the server from functioning than let it do what it probably shouldn't do. A good example is banking. They'd rather stop all incoming and outgoing transactions for account processing than take the risk that the server might decide that instead of depositing your paycheck for the last 2 weeks you deserve $24 million. You might like it, but your bank won't. ;)
4. Bitflips are a real problem. I don't consider every one of them a "significant" problem, though: a bitflip in user data that is either in the read cache(and may later be written out corrupted) or about to be written to the zpool usually damages a single file. Statistically the quantity of user data written far far exceeds the amount of metadata(the pool is supposed to store user data.. not tons of metadata on user data), so user data is what a random flip is most likely to hit. If you lose one file you'll be upset, but it probably won't be the end of the world. But, corrupt metadata that crashes the server and makes the pool unmountable and unrepairable... that'll definitely ruin your day/weekend.
5. ZFS has real problems handling corruption of metadata if it can't immediately repair it *properly* with parity data(that may not exist in a non-corrupt state).
6. Many users here just don't do backups. Yes, this is pretty stupid. People have done it. People will continue to do it. But, for many spending that extra $100 is a damn good insurance policy compared to building a whole second server to handle the real possibility of losing their data.
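
To make the points about silent corruption concrete, here is a toy sketch in Python. It is not how ZFS is implemented: SHA-256 stands in for whatever checksum the filesystem uses, and write_block, verify_block and flip_bit are made-up names for illustration only. The idea it shows is that a checksum catches a bit that flips after the data was written, but if the bit flips in RAM before the checksum is computed, the corrupt data gets a perfectly valid checksum and nothing ever complains.

```python
import hashlib
from typing import Tuple


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted (simulated bad RAM or bad disk)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def write_block(data_in_ram: bytes) -> Tuple[bytes, str]:
    """Hypothetical 'write': store the block plus a checksum of whatever is in RAM right now."""
    return data_in_ram, hashlib.sha256(data_in_ram).hexdigest()


def verify_block(stored: bytes, checksum: str) -> bool:
    """Hypothetical 'scrub': recompute the checksum and compare it to the stored one."""
    return hashlib.sha256(stored).hexdigest() == checksum


original = b"wedding photo, frame 0001"

# Case 1: the bit flips on disk AFTER a clean write. The checksum catches it.
stored, checksum = write_block(original)
caught_on_disk = not verify_block(flip_bit(stored, 3), checksum)

# Case 2: the bit flips in RAM BEFORE the checksum is computed.
# The corrupt block is checksummed and written as if it were good data.
stored_bad, checksum_bad = write_block(flip_bit(original, 3))
passes_verification = verify_block(stored_bad, checksum_bad)
matches_original = stored_bad == original

print("disk flip detected:", caught_on_disk)
print("RAM flip passes verification:", passes_verification)
print("RAM flip data still matches original:", matches_original)
```

In this toy model the second case is the one that hurts: the block verifies fine forever, and any backup job that copies it will faithfully copy the damage too.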

Just like what neckhole wrote up pretty expertly, it's very much about your risk assessment. If you think your RAM is good enough that you are willing to risk your pool for your situation, feel free. You'll get no sympathy from me if your life falls apart when your data is lost forever.

The one forum user I'll never forget: A guy from Oklahoma messaged me and begged me to help him. He had lost his two children in a house fire a few years before. He created a FreeNAS server, set up ZFS and had done everything right except for two things.. he was using non-ECC RAM and he had no backups. Both were going to be rectified about 2 weeks later, he was just waiting for parts to arrive in the mail. Well, his RAM failed in some period of time that was less than about 4 days, and in those 4 days he lost all of his data. All of the video and digital photo collection he had of his kids was lost that day. He and I talked on Skype. Both he and his wife were on the other end, crying hysterically because they had some pictures of their kids, but not nearly enough to keep them calm during this disaster. I tried to get something back, but there was nothing I could do. The metadata was scrambled beyond recovery. There just wasn't enough useful information to get anything from it.

I know two things though:

1. I will never have a problem with bad RAM killing my pool.
2. I will still see people that don't have a clue of the risks showing up in this forum with lost data and no backup.
 

aufalien

Patron
Joined
Jul 25, 2013
Messages
374
The one forum user I'll never forget: A guy from Oklahoma messaged me and begged me to help him. He had lost his two children in a house fire a few years before. He created a FreeNAS server, set up ZFS and had done everything right except for two things.. he was using non-ECC RAM and he had no backups. Both were going to be rectified about 2 weeks later, he was just waiting for parts to arrive in the mail. Well, his RAM failed in some period of time that was less than about 4 days, and in those 4 days he lost all of his data. All of the video and digital photo collection he had of his kids was lost that day. He and I talked on Skype. Both he and his wife were on the other end, crying hysterically because they had some pictures of their kids, but not nearly enough to keep them calm during this disaster. I tried to get something back, but there was nothing I could do. The metadata was scrambled beyond recovery. There just wasn't enough useful information to get anything from it.

That's very sad, sorta reminds me of a similar issue I had with a neighbor who backed up pics on CD back in the day. I mentioned that CDs you burn can chemically change color over time rendering them useless.

He doubted me but started to print some photos. Later on, some of his most valued family CDs were no longer readable. They looked pristine, tried many readers, few diff OSes etc... He was distraught to say the least. When ppl ask me what I do with personal data, I make 3 copies where 1 is offsite, say nothing else and let them stew over that.
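
For anyone keeping several copies like that, a rough way to check that they still agree is to hash every file in each copy and compare. The sketch below is only an illustration under assumptions: the directory names (local, second_disk, offsite) and the helper names are made up, and it uses a temporary directory so it can run anywhere. Note what it does not prove: if a corrupt file was already synced to every copy, all the hashes will match and this check will stay quiet.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large videos do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def disagreements(copies: List[Path]) -> List[str]:
    """Return files that are missing from some copy or whose contents differ between copies."""
    manifests = [manifest(c) for c in copies]
    names = set().union(*manifests)
    return sorted(n for n in names if len({m.get(n) for m in manifests}) > 1)


# Demo with three throwaway "copies"; the third one has a silently damaged file.
with tempfile.TemporaryDirectory() as tmp:
    copies = [Path(tmp) / name for name in ("local", "second_disk", "offsite")]
    for c in copies:
        c.mkdir()
        (c / "photo.jpg").write_bytes(b"family photo")
        (c / "video.mp4").write_bytes(b"family video")
    (copies[2] / "photo.jpg").write_bytes(b"family phot0")
    bad = disagreements(copies)

print("files that differ between copies:", bad)
```

Running something like this on a schedule at least tells you a copy has drifted while you still have a good one to restore from.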

And thanks for the share, I'm sure that will open some eyes.
 