ECC Memory + AMD

Status
Not open for further replies.

DJABE

Contributor
Joined
Jan 28, 2014
Messages
154
Right said joeschmuck.
I'm running ESXi on another machine - 990X chipset (http://www.gigabyte.com/products/product-page.aspx?pid=4434#ov) + FX 8320 (O/C to 4.7 GHz ;)) with 24 GB non-ECC RAM... since damn Gigabyte doesn't have ECC support on any prosumer mobos... xD
I just wonder whether, with the ASUS/AMD board and CPU combo, we really have ECC capability or not... ASUS specifies other boards as non-ECC (Intel chipsets from the same mid range), so I guess they are really serious when they state that both ECC and non-ECC DIMMs are supported for AM3+ socket boards...
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
The ASUS MB I have does have ECC support; the only thing I can't say is how the MB notifies the user of an ECC error. I suspect it halts the computer, and maybe the internal speaker beeps, but I don't know for certain. I'm working on faith that something will happen.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
You can simulate a failure with a particular command.
 

DJABE

Contributor
Joined
Jan 28, 2014
Messages
154

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Okay, I'll bite. I didn't know it could be simulated at all, especially by running a command. Tell me more, or if this has been discussed elsewhere, hopefully I will be able to find it. Hopefully the simulation isn't just for Supermicro and works on Asus MBs as well.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I think the command is "ipmitool event 3".
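
For reference, a minimal sketch of how that test could be driven from a script, assuming a board with a BMC and `ipmitool` installed. As far as the ipmitool manual describes it, `event 3` only writes a predefined "Memory: Correctable ECC" test entry into the System Event Log; it does not flip a bit in RAM, so it exercises the logging and alerting path rather than the ECC hardware itself. The helper names below are made up for illustration.

```python
import shutil
import subprocess


def build_event_command(event_number=3):
    # Per the ipmitool manual, predefined test event 3 is "Memory: Correctable ECC".
    return ["ipmitool", "event", str(event_number)]


def memory_events(sel_text):
    """Return the SEL lines that mention memory or ECC (simple text match)."""
    hits = []
    for line in sel_text.splitlines():
        lowered = line.lower()
        if "memory" in lowered or "ecc" in lowered:
            hits.append(line.strip())
    return hits


def send_test_event_and_list():
    """Send the test event, then return matching SEL lines (needs a BMC and root)."""
    if shutil.which("ipmitool") is None:
        raise RuntimeError("ipmitool not found; this board may have no IPMI at all")
    subprocess.run(build_event_command(3), check=True)
    sel = subprocess.run(["ipmitool", "sel", "list"], check=True,
                         capture_output=True, text=True)
    return memory_events(sel.stdout)
```

On a board without IPMI, `send_test_event_and_list()` would simply fail, so this would not help in deciding how such a board reacts to a real ECC error.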
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
The Asus MBs in question here do not have IPMI so I don't know if that would work.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Yeah. I don't know. :(
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
That's okay. I'm hoping that the computer just stops when it runs into an ECC problem it cannot fix. There are no logs but this is the unfortunate risk I take in using this Asus product. I wish Asus would give me an answer on how it would fail but so far the person who answers the questions is clueless.
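
To make "an ECC problem it cannot fix" concrete: ECC memory typically uses a SECDED code (single error correction, double error detection), so one flipped bit in a word is repaired silently, while two flipped bits can only be flagged. Below is a toy sketch in Python, assuming a tiny Hamming(8,4) code instead of the 72-bit words real ECC DIMMs use. The function names are illustrative, and what a given board does in the uncorrectable case (halt, log, beep) is exactly the open question here.

```python
def encode(nibble):
    """Encode 4 data bits into 8 bits: index 0 is overall parity, 1..7 is Hamming(7,4)."""
    code = [0] * 8
    code[3] = nibble & 1
    code[5] = (nibble >> 1) & 1
    code[6] = (nibble >> 2) & 1
    code[7] = (nibble >> 3) & 1
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # overall parity over the other 7 bits
    return code


def flip(code, position):
    """Return a copy of the codeword with one bit toggled."""
    damaged = list(code)
    damaged[position] ^= 1
    return damaged


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected', or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treat as a single error and repair it.
        # A syndrome of 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: detected, not fixable.
        status = "uncorrectable"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
print(decode(word))                       # clean word
print(decode(flip(word, 5)))              # one flipped bit
print(decode(flip(flip(word, 2), 6))[0])  # two flipped bits
```

In the two-bit case the returned data cannot be trusted, which is why the system has to do something drastic rather than carry on.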
 

DJABE

Contributor
Joined
Jan 28, 2014
Messages
154
There's a detailed ZFS case study from some US university; they intentionally 'attacked' non-ECC memory modules to demonstrate, as a proof of concept, how vulnerable ZFS is if a RAM error occurs. So there must be a way (buffer overflow LOL) to inject a bad bit into RAM...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Link please! This is relevant to my interests!
 

DJABE

Contributor
Joined
Jan 28, 2014
Messages
154
Hang on. I downloaded that PDF from my workplace, I need to recall the URL/search parameters :D
I'll post a link here ASAP or upload it if I can VPN to my work machine :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I actually looked at inducing errors in non-ECC RAM via various methods just to observe the consequences with ZFS. But they are all going to be expensive to implement or extremely time-consuming. As a volunteer on this project, that's a little hard to justify. :P
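
The core problem can be shown with a toy model, without touching real hardware. The sketch below assumes a made-up `write_block`/`verify_block` pair that checksums a buffer with SHA-256 (one of the checksums ZFS offers); it is not ZFS code. A flip that happens after the checksum is computed is caught on read, while a flip in RAM before the checksum is computed gets checksummed as if it were good data.

```python
import hashlib


def flip_bit(data, bit_index):
    """Return a copy of data with a single bit toggled."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def write_block(data):
    """Toy 'write': store the buffer together with a checksum of what is in RAM."""
    return data, hashlib.sha256(data).digest()


def verify_block(data, checksum):
    """Toy 'read': recompute the checksum and compare."""
    return hashlib.sha256(data).digest() == checksum


original = b"important file contents"

# Case 1: the bit flips on disk, after the checksum was computed.
stored, checksum = write_block(original)
damaged_on_disk = flip_bit(stored, 5)
detected_on_disk = not verify_block(damaged_on_disk, checksum)

# Case 2: the bit flips in RAM, before the checksum was computed.
damaged_in_ram = flip_bit(original, 5)
stored_bad, checksum_bad = write_block(damaged_in_ram)
detected_in_ram = not verify_block(stored_bad, checksum_bad)

print("flip on disk detected:", detected_on_disk)
print("flip in RAM detected:", detected_in_ram)
```

The second case is the one ECC is meant to cover: the checksum faithfully protects data that was already wrong.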
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Ah. I have read that one before. I was thinking it was something else. Thanks for the link.
 

DJABE

Contributor
Joined
Jan 28, 2014
Messages
154
You're welcome. It's a nice && detailed article for understanding how important ECC RAM really is with ZFS...
 

DJABE

Contributor
Joined
Jan 28, 2014
Messages
154
http://hardforum.com/showpost.php?p=1039700874&postcount=30

That guy says ECC on the M5A99FX Pro R2.0 does not appear to be confirmed as working when checked with the well-known C program "ecc_check".

Also, I found a shocking truth with that program too - I compiled it on my workplace machine (a brand-name, full-blown Fujitsu TX300 S2 server) and guess what - I got all zeros!
5004-5007h: ff ff ff ff
5008-500Bh: ff ff ff ff
So I dug into how that could be, since I would have put my hand in the fire that we are running ECC memory!
And the shocking news - mot***fu**ers from Intel && Fujitsu built a server with ECC memory and a MOBO with ECC support, BUT A CPU WITHOUT ECC SUPPORT. That's the nowadays-old Xeon 3.40 GHz: http://ark.intel.com/products/27086/64-bit-Intel-Xeon-Processor-3_40E-GHz-2M-Cache-800-MHz-FSB

I just can't believe it. Even worse, we always upgraded this and other identical servers with ECC RAM; no one would ever have thought we are actually not running ECC at all! But that's not my problem, it's the company's.

Among other reasons, this kind of thing makes me hate 'enterprise' and similar stories. You pay extra for the MB, you pay extra for the RAM, but you choose a CPU that can't handle all that! So cyberjock, the big $$$ companies (Fujitsu in this example) are also to blame for bad choices and bad setups, not only home users here on the forum! ;)
I still can't believe this is true... gonna inform my CIO tomorrow morning! :D
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Ok, you need to realize something. That ecc_check only works for certain generations of processors. I believe it is Nehalem, Sandy Bridge, and Ivy Bridge only. I'm not sure it's been validated for Haswell. In short, if you don't have a CPU that is one of those 3 designs, then ecc_check is not guaranteed to work. It won't work on AMD CPUs at all.
 

DJABE

Contributor
Joined
Jan 28, 2014
Messages
154
OH gee... !
What a mistake-a-maken! :D
Is there any way to check other CPUs, both AMD and Intel?
If I order an SM board and ECC RAM with a Haswell CPU, is there any way to verify ECC support, or do I have to blindly take the vendor's word for it?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
There is no way I know of to check. Surely, if you dig into the CPU documentation, Intel (and AMD) may have provided a way to check the ECC features. But good luck figuring it out, or even coding a program that works.

Supermicro is a very good name, and if they say that they use ECC, you can be sure it works. Nothing would tarnish a server-grade manufacturer's name faster than a post with testable evidence that the ECC function is broken.
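
One partial hint that is available without CPU-specific code is the SMBIOS/DMI table, which `dmidecode -t memory` prints. Below is a rough sketch, assuming the usual `Error Correction Type`, `Total Width`, and `Data Width` fields; the sample text is hand-written for illustration, not captured from a real machine. This only shows what the firmware reports and that the DIMMs carry the extra 8 bits, so it is a hint rather than proof that ECC is actually active.

```python
import re

# Hand-written sample in the style of `dmidecode -t memory` output (illustrative only).
SAMPLE = """\
Physical Memory Array
    Error Correction Type: Multi-bit ECC
Memory Device
    Total Width: 72 bits
    Data Width: 64 bits
"""


def parse_dmi_memory(text):
    """Collect reported ECC types and (total, data) width pairs per DIMM."""
    info = {"ecc_types": [], "widths": []}
    total = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Error Correction Type:\s*(.+)", line)
        if match:
            info["ecc_types"].append(match.group(1))
            continue
        match = re.match(r"(Total|Data) Width:\s*(\d+) bits", line)
        if match:
            if match.group(1) == "Total":
                total = int(match.group(2))
            elif total is not None:
                info["widths"].append((total, int(match.group(2))))
                total = None
    return info


def looks_like_ecc(info):
    """True if the firmware reports an ECC mode and some DIMM is wider than its data bus."""
    reports_ecc = any("ECC" in kind for kind in info["ecc_types"])
    wide_dimms = any(total > data for total, data in info["widths"])
    return reports_ecc and wide_dimms


print(looks_like_ecc(parse_dmi_memory(SAMPLE)))
```

On a real system the text would come from running `dmidecode -t memory` as root and passing its output to `parse_dmi_memory`.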
 