BUILD Small form factor home NAS build, hardware feedback and some questions

Status
Not open for further replies.

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi guys,

I've based my belief that the i3-2100 supports ECC when paired with a C200 chipset mostly on this presentation from Intel:

http://cache-www.intel.com/cd/00/00/46/78/467819_467819.pdf

On page 5 there is no differentiation in the "level" of support (is supported vs. is actually used) between the E3's & the i3's. Intel has enabled ECC on certain i3 & Pentium processors when paired with the C200 chipset so they could offer systems with ECC at a lower price point than what they would charge for a proper Xeon. If nothing else, the whole point of the presentation I linked above is to promote the value of a system with ECC; it simply doesn't make sense that they would include the i3's if those didn't support it.

Dusan:

It would make sense that the 24:25 bits of the MAD_DIMM_ch0 register would be listed as reserved in the desktop cpu datasheet because ECC simply isn't an option on the desktop chipsets. When you pair an i3 with a server chipset an un-reserved value should be assigned there.

All that said, I don't think my experiment has brought any additional clarity to the whole question of i3's & ECC support. Personally, I think the fact that the script returned a value in line with what is expected if ECC is actually working helps to confirm what the presentation from Intel says, but the difficulty of finding documentation that expressly says ECC works with this combination of board and proc is annoying.

So I got to thinking....

I have an ESXi box that runs nearly the same combo of board and proc (X9SCL+-F instead of the X9SCL-F)...maybe there's a tool in the ESXi CLI that can provide the information I'm looking for. So I stuffed "esxi confirm ecc" into the google and was directed to "dmidump" which isn't in 5.1, but "smbiosDump" (which doesn't decode the dmi info) is. That command returned this information for each of my DIMMs:

Code:
 
  Memory Device: #9
    Location: "DIMM_1A"
    Bank: "BANK 0"
    Manufacturer: "Kingston"
    Serial: "45239058"
    Asset Tag: "9876543210"
    Part Number: "9965525-008.A00LF"
    Memory Array: #7
    Error Info: #18
    Form Factor: 0x09 (DIMM)
    Type: 0x18 (Other)
    Type Detail: 0x0080 (Synchronous)
    Data Width: 64 bits (+64 ECC bits)
    Size: 4 GB
    Speed: 1333 MHz
 


Next I tried dmidecode on my filer and found this in the output:

Code:
 
Handle 0x0007, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Single-bit ECC
        Maximum Capacity: 32 GB
        Error Information Handle: 0x000F
        Number Of Devices: 4
 


Given this information I am fairly comfortable with my belief that an i3 processor, when paired with the proper motherboard, will indeed support ECC memory.
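For anyone who wants to check the same field without reading the whole dump, here is a minimal Python sketch that pulls "Error Correction Type" out of saved `dmidecode -t 16` output. The function name `error_correction_type` is made up for illustration, and the sample text is copied from the output above. It only reports what the BIOS claims, so treat the result as a hint rather than proof.

```python
import re

def error_correction_type(dmidecode_text):
    """Return the 'Error Correction Type' value from `dmidecode -t 16` text, or None."""
    match = re.search(
        r"^\s*Error Correction Type:\s*(.+?)\s*$",
        dmidecode_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None

# Sample copied from the dmidecode output posted above.
sample = """\
Handle 0x0007, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Single-bit ECC
        Maximum Capacity: 32 GB
"""

print(error_correction_type(sample))
```

To use it on a live box you could save the output first (for example `dmidecode -t 16 > dmi16.txt` as root) and pass the file contents to the function.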

-Will
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I agree with your analysis. I'd love to see if using non ecc ram in that machine changes the output of the ecc_check.

I just prefer not to jump to the conclusion that it is correct without some solid evidence. I wouldn't call it solid (IMO) unless it changes as we expect with non-ECC RAM.
 

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi cyberjock,

Heh....the fatal flaw with that plan is the system simply won't boot with plain-old DIMMs....it will just beep at you a few times. You have to have the proper combo of proc & RAM to even get one to turn on.

-Will
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
You have dmidecode directly in FreeNAS, it was even already mentioned in this thread.
The problem is that the SMBIOS can often report funny data. Even in your case:
Code:
Data Width: 64 bits (+64 ECC bits)

ECC modules are 72 bits wide -- 64 data bits + 8 bits for ECC. 64+64 is nonsense, that would be more like RAM mirroring than checksumming :).
However, my hardware is officially ECC capable (Supermicro X9SCL + Xeon E3 + Kingston ECC modules) and the X9SCL SMBIOS reports the same -- Single bit ECC and the strange 64 + 64 bits for memory. It's hard to trust the BIOS when you see that part of the data is wrong.
So, we are back to square one: is the ECC really working with your CPU (or mine :) )? Hard to tell, short of pulling a module, masking one data pin and seeing how the system behaves.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Weird... I have 2 systems where I can choose to use ECC or non-ECC. I was under the impression (apparently wrongly) that you could choose to use non-ECC at any time (although it wouldn't be smart to use non-ECC if you can use ECC).
 

iostream

Dabbler
Joined
Aug 20, 2013
Messages
13
I agree with all that's been said here. The fact that both dmidecode and the script seem to indicate that my motherboard/CPU/RAM combination is utilizing ECC makes me think I'm good. I think we should be very clear that the script will not detect ECC RAM for every system, but it would be interesting to have a bunch of people run it on their systems and report its output. I ran it on my old Dell Centrino laptop and I got nothing but ff values, like you did on your VM system, cyberjock.

By the way, I decided to run dmidecode -t 17 on my system to see what width it reported. The width values look right in my case:

[root@freenas] ~# dmidecode -t 17
# dmidecode 2.11
SMBIOS 2.7 present.

Handle 0x0029, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0026
Error Information Handle: No Error
Total Width: 72 bits
Data Width: 64 bits
Size: 8192 MB
Form Factor: DIMM
Set: None
Locator: DIMM1
Bank Locator: CHANNEL A DIMM0
Type: DDR3
Type Detail: Synchronous
Speed: 1333 MHz
Manufacturer: Kingston
Serial Number: 3212C976
Asset Tag: A1_AssetTagNum0
Part Number: 9965525-100.A00LF
Rank: 2
Configured Clock Speed: Unknown

Handle 0x002C, DMI type 17, 34 bytes
Memory Device
Array Handle: 0x0026
Error Information Handle: No Error
Total Width: 72 bits
Data Width: 64 bits
Size: 8192 MB
Form Factor: DIMM
Set: None
Locator: DIMM2
Bank Locator: CHANNEL A DIMM1
Type: DDR3
Type Detail: Synchronous
Speed: 1333 MHz
Manufacturer: Kingston
Serial Number: 3212E076
Asset Tag: A1_AssetTagNum1
Part Number: 9965525-100.A00LF
Rank: 2
Configured Clock Speed: Unknown
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I have a few hardware configurations with ECC and non-ECC RAM. I plan to compile a list. Maybe we should start a thread on this when we have more information.
 

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi guys,

Just for completeness here's a post from the ESXi 5 forums over at vmware.com:

https://communities.vmware.com/message/2139752#2139752

That's about the only thing I find in google when I search for "Data Width: 64 bits (+64 ECC bits)".

Anyway, the guy is running a Dell T110 server which looks like Dell's version of our Supermicro system. He's got a Xeon in there that returns the same odd value for data width.

Could simply be something silly\poorly worded VMware did with smbiosDump.

-Will
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
Could simply be something silly\poorly worded VMware did with smbiosDump.
Nope, it's not the tool, the SuperMicro SMBIOS is reporting those funny values. The tool just displays what the BIOS reports. When I run dmidecode -t 17 on my SuperMicro (X9SCL + Xeon E3 + Kingston ECC) FreeNAS box I get:
Code:
        Total Width: 128 bits
        Data Width: 64 bits

Which is just a different way of displaying the silly 64 + 64 information reported by the BIOS. Also, apparently many more people are using dmidecode than smbiosDump, so when you google for "dmidecode Total Width: 128 bits" you will get tons of results -- many broken SMBIOSes out there :(.
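A minimal Python sketch of that sanity check, assuming you have saved `dmidecode -t 17` output as text. It pairs each "Total Width" with the following "Data Width" and flags anything other than the usual 64 data bits + 8 check bits. The function names `width_pairs` and `describe` are hypothetical, and the sample combines the two width readings posted in this thread.

```python
import re

def width_pairs(text):
    """Return [(total_bits, data_bits), ...] for each memory device found in the text."""
    totals = [int(n) for n in re.findall(r"Total Width:\s*(\d+) bits", text)]
    datas = [int(n) for n in re.findall(r"Data Width:\s*(\d+) bits", text)]
    return list(zip(totals, datas))

def describe(total, data):
    """Classify the difference between total width and data width."""
    extra = total - data
    if extra == 0:
        return "no check bits reported"
    if extra == 8 and data == 64:
        return "64 data + 8 check bits, the usual ECC layout"
    return f"{extra} extra bits reported, which looks implausible"

# Two readings from this thread: a sane one and the odd Supermicro one.
sample = """\
        Total Width: 72 bits
        Data Width: 64 bits
        Total Width: 128 bits
        Data Width: 64 bits
"""

for total, data in width_pairs(sample):
    print(f"{total}/{data}: {describe(total, data)}")
```

An implausible width only tells you the SMBIOS table is sloppy; it says nothing either way about whether ECC is actually active.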
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
Just for fun:
Googling 'dmidecode "Total Width: 128 bits"': About 2,520 results
Googling 'dmidecode "Total Width: 72 bits"': About 1,160,000 results

So, most BIOSes get it right :).
 

Krutet

Dabbler
Joined
Jul 19, 2013
Messages
37
n00b here; How do I run this python script to (maybe) check if I'm using ecc or not on my freenas?
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
I can confirm this properly detects ecc ram NOT in use on my system (i5-3570) and consumer MB with non-ecc ram:

Code:
5004-5007h: 20 20 66 0
5008-500Bh: 20 20 66 0


As expected, non-useful results on an AMD system (x2-6000+ I think):

Code:
5004-5007h: ff ff ff ff
5008-500Bh: ff ff ff ff
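For reading these lines at a glance, here is a rough Python sketch that interprets one line of ecc_check.py output. It is not part of the script itself. It assumes the four bytes are printed lowest address first, so the last byte holds bits 31:24 of the register and the ECC field (bits 25:24) is its low two bits. The labels for 0 and 3 follow the results reported in this thread; the middle values are left as "check the datasheet". The function name `ecc_mode` is made up for illustration.

```python
# Assumed meaning of the two-bit ECC field; 0 and 3 match results in this thread.
MODES = {
    0: "no ECC",
    1: "intermediate mode, check the CPU datasheet",
    2: "intermediate mode, check the CPU datasheet",
    3: "ECC active",
}

def ecc_mode(line):
    """Return the 2-bit ECC field from a line like '5004-5007h: 20 20 66 0'.

    Returns None when all four bytes read back as ff, which in this thread
    meant the register is not available on that platform.
    """
    _, _, hex_bytes = line.partition(":")
    values = [int(b, 16) for b in hex_bytes.split()]
    if len(values) != 4:
        raise ValueError("expected four hex bytes after the colon")
    if all(v == 0xFF for v in values):
        return None
    # Assumption: last printed byte is bits 31:24, so bits 25:24 are its low two bits.
    return values[3] & 0x3

for line in ("5004-5007h: 20 20 66 0", "5004-5007h: ff ff ff ff"):
    mode = ecc_mode(line)
    print(line, "->", "register not readable" if mode is None else MODES[mode])
```

If the byte-order assumption is wrong for your output, the field would be in a different byte, so compare against a known ECC and a known non-ECC machine before trusting it.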
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
n00b here; How do I run this python script to (maybe) check if I'm using ecc or not on my freenas?

Put the script on some dataset (your home directory would do).
Then go to the shell and change directory to wherever you decided to put the script in.
Then execute these commands:
Code:
chmod o+x ecc_check.py
./ecc_check.py

Make sure you're logged in as root before executing those commands.
 

Krutet

Dabbler
Joined
Jul 19, 2013
Messages
37
Thanks! I ran the script but it gives me no output whatsoever..?

Edit: I've got it, it worked. Thanks.

I've got two 3's at the end. Running s1200kpr with the g2020 and ecc ram. So all ecc gear.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Glad it worked out for you. :cool:
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
WOW. Here's a fun twist.

Hardware: Gigabyte GA-X58-UD5
CPU: E5606 @ 2.13Ghz
ECC RAM installed

Code:
[root@freenas /data]# python ecc_check.py                                   
5004-5007h: ff ff ff ff                                                     
5008-500Bh: ff ff ff ff                                                     
[root@freenas /data]# dmidecode -t 16                                       
 
# dmidecode 2.11                                                           
SMBIOS 2.4 present.                                                       
 
Handle 0x0017, DMI type 16, 15 bytes                                       
Physical Memory Array                                                     
        Location: System Board Or Motherboard                             
        Use: System Memory                                                 
        Error Correction Type: None                                       
        Maximum Capacity: 6 GB                                             
        Error Information Handle: Not Provided                           
        Number Of Devices: 6          


Naturally I wasn't expecting ecc_check.py to work, but I figured I'd run it just to see what the output was. But the dmidecode output is interesting. I bought and installed ECC RAM because allegedly the board will use ECC if you install ECC RAM. The CPU clearly supports ECC. So where am I mistaken? Is dmidecode not the correct method to verify ECC for X58 chipsets? I did boot up memtest86+ 4.20 and it does say "ECC: Detect/Correct".

With non-ECC RAM, memtest86+ 4.20 says "ECC: Disabled". The output of ecc_check.py and dmidecode is the same as with ECC RAM.

So presumably memtest86+ validates ECC is working for x58, but ecc_check.py and dmidecode don't. ecc_check.py is a "duh" for not working, but dmidecode.. anyone have info on this?
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
WOW. Here's a fun twist.
ecc_check.py: Nehalem/Westmere is a different microarchitecture, the registers used by ecc_check are not there (I checked the data sheet).

dmidecode: Broken BIOS. Many vendors do not care much about SMBIOS on desktop grade boards. :( And as we already saw even Supermicro returns silly module widths (128 bit).

memtest86: There are two forks of the original memtest86 codebase.
memtest86+: http://www.memtest.org/
memtest86 (without the plus): http://www.memtest86.com/

memtest can usually detect ECC properly if it was updated to support the specific CPU architecture.
memtest86 is currently the one being regularly updated (last update: August 2013), but the source code is no longer available. memtest86+ had its last version released in January 2011.

As Nehalem/Westmere is an older architecture, both memtests support it. So, I'd say in this case memtest gives you the correct answer.
(memtest86+, controller.c, function setup_nhm() detects ECC in Nehalem and it's completely different to what ecc_check does.)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I use the one at memtest.org. I was thinking of putting together a thread with all this ECC stuff. If I have all of my info straight in my head:

1. Nehalem/Westmere seems to be best tested with memtest86+ (memtest.org).
2. Sandy Bridge / Ivy Bridge is best tested with ecc_check and/or dmidecode. Ignore the potentially silly widths and just look for the line that says ECC is on (or not). ecc_check is probably better because it gives you the 4 possible outcomes.
3. Haswell is a big question mark right now, but is quite possibly compatible with ecc_check and/or dmidecode.

If I put all this junk together, I'll probably PM you, Dusan (if you don't mind), to verify I'm technically accurate before I post it.

This has been pretty enlightening though. I've had a lot of questions regarding ECC vs non-ECC over the years and this thread cleared up all of my questions.
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
dmidecode is hit & miss, you are at the mercy of the BIOS, so I would not recommend it
ecc_check only works with "few" CPUs
memtest86(+) would be my choice, there is now RC1 of memtest86+ 5.0 available (http://forum.canardpc.com/threads/68001-NEW-!!-Memtest86-5.00-RC1-available-!-Need-betatesters-!), that claims support for Ivy/Sandy bridge and preliminary support for Haswell.
memtest86 has also a promising 5.0 Beta (with UEFI support, see: http://www.memtest86.com/support/ver_history.htm), but I can't find a list of supported CPUs.
I think it would make sense to have a thread that keeps track of which CPUs are supported by which memtest86.
 

Dusan

Guru
Joined
Jan 29, 2013
Messages
1,165
Hmm, reading the memtest86+ 5.0 RC thread. It seems it no longer displays the ECC status :(.
 