How does ECC memory work?

Status
Not open for further replies.

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
If it makes a difference to my question, I am using an N54L Microserver. My understanding is that ECC memory allows the correction of some bit errors. Where I am confused is:


1. Is any record kept (without IPMI or any other non-OS access to the system) of error corrections? Presumably if they became at all frequent there would be a risk of multiple errors being falsely corrected.

2. What happens if an uncorrectable error occurs? Does the CPU shut down, or report the error to the application in some way?

In summary, ECC is obviously a good thing, but how do I know it is working properly?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
A lot of it is faith, since there is no standard way of reporting ECC events.

ECC (typically a SECDED code) can correct any single-bit error in a memory word and detect any double-bit error. Errors of three or more bits in the same word can slip through or be miscorrected, but they are far rarer.
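A toy sketch of the idea, assuming an extended Hamming (8,4) code rather than the wider (72,64) code that ECC DIMMs normally use. The function names `encode`, `decode` and `flip` are made up for illustration, and real hardware does this in the memory controller, not in software. It shows a single flipped bit being corrected, a double flip being flagged, and a triple flip being "corrected" to the wrong value, which is the false-correction risk raised in the first question.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity, 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2         # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    odd_parity = sum(code) % 2          # 1 if an odd number of bits flipped
    if syndrome == 0 and not odd_parity:
        status = "ok"
    elif odd_parity:
        # Treated as a single-bit error; syndrome 0 means the parity bit itself.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a non-zero syndrome: detected, not fixable.
        return "uncorrectable", None
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return status, nibble


def flip(code, *positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for p in positions:
        out[p] ^= 1
    return out


word = encode(0b1011)
print(decode(word))                     # no error
print(decode(flip(word, 5)))            # one flipped bit
print(decode(flip(word, 2, 6)))         # two flipped bits
print(decode(flip(word, 1, 2, 3)))      # three flipped bits: silently wrong data
```

The last call reports "corrected" but returns a different value from the one stored, which is why a rising count of corrected errors is worth acting on before multi-bit errors become likely.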

At least some BIOSes keep track of ECC errors.

Typically, the system should be halted if an uncorrectable error is detected. I assume this is done at the BIOS level.
 

rogerh

Guru
Joined
Apr 18, 2014
Messages
1,111
So if I've got a BIOS log, it will be somewhere in BIOS setup? Does IPMI enable access to this information? Apparently I can get an IPMI card for about the same price I paid for the N54L!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It depends a lot on your system...
 