Over the last couple of days I have seen the following error message appear in the console.
Apr 25 03:27:25 freenas MCA: Bank 5, Status 0xd400008000910091
Apr 25 03:27:25 freenas MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
Apr 25 03:27:25 freenas MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
Apr 25 03:27:25 freenas MCA: CPU 0 COR OVER RD channel 1 memory error
Apr 25 03:27:25 freenas MCA: Address 0x5aa320380
Having searched the forums, I found the following thread: HOWTO: Troubleshooting faulty RAM
I need someone's second opinion as to which of my DDR3 ECC memory modules needs replacing.
mcelog:
Code:
root@freenas[~]# mcelog
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 5 TSC 7ddec99c5a40
ADDR 5aa320380
TIME 1619336966 Sun Apr 25 09:49:26 2021
MCG status:
STATUS d400008000910091 MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 77 Step 8
Hardware event. This is not a software error.
MCE 1
CPU 0 BANK 5 TSC 85ba92baf978
ADDR 5aa320380
TIME 1619336966 Sun Apr 25 09:49:26 2021
MCG status:
STATUS 9400004000910091 MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 77 Step 8
Hardware event. This is not a software error.
MCE 2
CPU 0 BANK 5 TSC 13a7a07bf6650
ADDR 5aa320380
TIME 1619336966 Sun Apr 25 09:49:26 2021
MCG status:
STATUS d400008000910091 MCGSTATUS 0
MCGCAP 806 APICID 0 SOCKETID 0
CPUID Vendor Intel Family 6 Model 77 Step 8
MCA error messages:
Code:
root@freenas[~]# cat /var/log/messages | grep MCA
Apr 23 11:29:54 freenas Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Apr 23 11:29:55 freenas Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Apr 24 03:27:24 freenas MCA: Bank 5, Status 0xd400008000910091
Apr 24 03:27:24 freenas MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
Apr 24 03:27:24 freenas MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
Apr 24 03:27:24 freenas MCA: CPU 0 COR OVER RD channel 1 memory error
Apr 24 03:27:24 freenas MCA: Address 0x5aa320380
Apr 24 04:27:24 freenas MCA: Bank 5, Status 0x9400004000910091
Apr 24 04:27:24 freenas MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
Apr 24 04:27:24 freenas MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
Apr 24 04:27:24 freenas MCA: CPU 0 COR RD channel 1 memory error
Apr 24 04:27:24 freenas MCA: Address 0x5aa320380
Apr 25 03:27:25 freenas MCA: Bank 5, Status 0xd400008000910091
Apr 25 03:27:25 freenas MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
Apr 25 03:27:25 freenas MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
Apr 25 03:27:25 freenas MCA: CPU 0 COR OVER RD channel 1 memory error
Apr 25 03:27:25 freenas MCA: Address 0x5aa320380
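For anyone wanting to cross-check the "COR OVER RD channel 1" text against the raw Status values, here is a minimal Python sketch of the decode. It assumes the standard Intel IA32_MCi_STATUS layout (flag bits 63 to 57, MCA error code in bits 15:0) and the memory-controller error code format 0000 0000 1MMM CCCC. The names FLAGS, MEM_OPS and decode_status are made up for this illustration and are not part of mcelog or FreeBSD.

```python
# Flag bits in the upper part of IA32_MCi_STATUS (assumed standard Intel layout).
FLAGS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN", 59: "MISCV", 58: "ADDRV", 57: "PCC"}

# MMM field of a memory-controller error code (0000 0000 1MMM CCCC).
MEM_OPS = {0: "generic", 1: "RD", 2: "WR", 3: "address/command", 4: "scrub"}


def decode_status(status):
    """Return the set flags and, for memory-controller codes, operation and channel."""
    flags = [name for bit, name in FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF  # architectural MCA error code
    result = {"flags": flags, "mca_code": code}
    if (code & 0xFF80) == 0x0080:  # matches the memory-controller pattern
        result["op"] = MEM_OPS.get((code >> 4) & 0x7, "reserved")
        result["channel"] = code & 0xF
    return result


if __name__ == "__main__":
    # The two Status values that appear in the log above.
    for status in (0xD400008000910091, 0x9400004000910091):
        print(hex(status), decode_status(status))
```

UC is clear in both values, which is consistent with the "COR" (corrected) label in the log, and OVER is only set in the 0xd4... entries, meaning a further error was logged in the bank before the previous one had been read out.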
dmidecode:
Code:
root@freenas[~]# dmidecode -t 20
# dmidecode 3.2
# SMBIOS entry point at 0x000f0560
Found SMBIOS entry point in EFI, reading table from /dev/mem.
SMBIOS 2.8 present.

Handle 0x002E, DMI type 20, 35 bytes
Memory Device Mapped Address
    Starting Address: 0x00000000000
    Ending Address: 0x001FFFFFFFF
    Range Size: 8 GB
    Physical Device Handle: 0x002D
    Memory Array Mapped Address Handle: 0x002C
    Partition Row Position: Unknown

Handle 0x0030, DMI type 20, 35 bytes
Memory Device Mapped Address
    Starting Address: 0x00200000000
    Ending Address: 0x003FFFFFFFF
    Range Size: 8 GB
    Physical Device Handle: 0x002F
    Memory Array Mapped Address Handle: 0x002C
    Partition Row Position: Unknown

Handle 0x0032, DMI type 20, 35 bytes
Memory Device Mapped Address
    Starting Address: 0x00400000000
    Ending Address: 0x005FFFFFFFF
    Range Size: 8 GB
    Physical Device Handle: 0x0031
    Memory Array Mapped Address Handle: 0x002C
    Partition Row Position: Unknown

Handle 0x0034, DMI type 20, 35 bytes
Memory Device Mapped Address
    Starting Address: 0x00600000000
    Ending Address: 0x007FFFFFFFF
    Range Size: 8 GB
    Physical Device Handle: 0x0033
    Memory Array Mapped Address Handle: 0x002C
    Partition Row Position: Unknown
Memory modules:
Code:
root@freenas[~]# dmidecode -t memory | grep -A23 "0x002"
Handle 0x002B, DMI type 16, 23 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: Single-bit ECC
    Maximum Capacity: 64 GB
    Error Information Handle: Not Provided
    Number Of Devices: 4

Handle 0x002D, DMI type 17, 34 bytes
Memory Device
    Array Handle: 0x002B
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 8192 MB
    Form Factor: SODIMM
    Set: None
    Locator: DIMMA1
    Bank Locator: BANK 0
    Type: DDR3
    Type Detail: Synchronous Unbuffered (Unregistered)
    Speed: 1600 MT/s
    Manufacturer: Hynix
    Serial Number: 14254030
    Asset Tag: BANK 0 DIMMA1 AssetTag
    Part Number: HMT41GA7AFR8A-PB
    Rank: 2
    Configured Memory Speed: 1600 MT/s

Handle 0x002F, DMI type 17, 34 bytes
Memory Device
    Array Handle: 0x002B
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 8192 MB
    Form Factor: SODIMM
    Set: None
    Locator: DIMMA2
    Bank Locator: BANK 0
    Type: DDR3
    Type Detail: Synchronous Unbuffered (Unregistered)
    Speed: 1600 MT/s
    Manufacturer: Hynix
    Serial Number: 14254021
    Asset Tag: BANK 0 DIMMA2 AssetTag
    Part Number: HMT41GA7AFR8A-PB
    Rank: 2
    Configured Memory Speed: 1600 MT/s

Handle 0x0031, DMI type 17, 34 bytes
Memory Device
    Array Handle: 0x002B
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 8192 MB
    Form Factor: SODIMM
    Set: None
    Locator: DIMMB1
    Bank Locator: BANK 0
    Type: DDR3
    Type Detail: Synchronous Unbuffered (Unregistered)
    Speed: 1600 MT/s
    Manufacturer: Hynix
    Serial Number: 11231852
    Asset Tag: BANK 0 DIMMB1 AssetTag
    Part Number: HMT41GA7AFR8A-PB
    Rank: 2
    Configured Memory Speed: 1600 MT/s

Handle 0x0033, DMI type 17, 34 bytes
Memory Device
    Array Handle: 0x002B
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 8192 MB
    Form Factor: SODIMM
    Set: None
    Locator: DIMMB2
    Bank Locator: BANK 0
    Type: DDR3
    Type Detail: Synchronous Unbuffered (Unregistered)
    Speed: 1600 MT/s
    Manufacturer: Hynix
    Serial Number: 14253919
    Asset Tag: BANK 0 DIMMB2 AssetTag
    Part Number: HMT41GA7AFR8A-PB
    Rank: 2
    Configured Memory Speed: 1600 MT/s
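The lookup implied by the outputs above can be sketched in a few lines of Python: take the reported address, find the dmidecode type 20 range that contains it, then follow the Physical Device Handle to a Locator. The ranges and handles are copied from the output above; RANGES, LOCATORS and locate are hypothetical names for this illustration. A big assumption here is that the SMBIOS ranges describe the real physical mapping. If the memory controller interleaves across channels or ranks, the firmware table may not reflect where an address actually lives, so the result is a hint rather than proof.

```python
# (start, end, physical device handle) from "dmidecode -t 20" above.
RANGES = [
    (0x00000000000, 0x001FFFFFFFF, "0x002D"),
    (0x00200000000, 0x003FFFFFFFF, "0x002F"),
    (0x00400000000, 0x005FFFFFFFF, "0x0031"),
    (0x00600000000, 0x007FFFFFFFF, "0x0033"),
]

# Handle -> Locator from "dmidecode -t memory" above.
LOCATORS = {
    "0x002D": "DIMMA1",
    "0x002F": "DIMMA2",
    "0x0031": "DIMMB1",
    "0x0033": "DIMMB2",
}


def locate(addr):
    """Return the Locator whose SMBIOS range contains addr, or None if no range matches."""
    for start, end, handle in RANGES:
        if start <= addr <= end:
            return LOCATORS[handle]
    return None


if __name__ == "__main__":
    # Address reported by both mcelog and the MCA console messages.
    print(locate(0x5AA320380))
```

Under that assumption the address 0x5aa320380 falls in the 0x00400000000 to 0x005FFFFFFFF range, whose device handle 0x0031 is DIMMB1.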
Is DIMMB1 failing/dying? Or have I misread the logs/information?