Ryzen 7 2700 w/ ECC RAM creating errors

edge-case

Dabbler
Joined
Nov 2, 2019
Messages
28
Hi, I've been experiencing the following errors on my main FreeNAS system; once booted, they start within a few hours, and then repeat every hour after that. I've done lots of searching online, but haven't found anything that matches the exact error message.
System is the "Main" entry in my sig, with the X470D4U motherboard [also listed at the bottom of this post].

Posting this here to see if anyone else has any thoughts/insight/advice, and in the hope it may prove useful to somebody else in the future...

I've done lots of component swapping - CPU, motherboard, PSU, single sticks of memory - and it looks like the errors always follow the Ryzen 7 2700 CPU.
So, I'm assuming it's a CPU issue, and it's on its way back to NewEgg for a replacement. It will be interesting to see if the replacement is error-free... [or whether my theory/interpretation of the errors was wrong].

CPUs: Ryzen 7 2700 & Ryzen 7 1700
Motherboards: ASRock Rack X470D4U & ASRock B450M-Pro4 [ECC support enabled]
PSUs: Seasonic Prime Gold 550 W and PowerSpec 650 W
[CPUs are running cool, around 32 to 34 deg C; the case has multiple fans and the cooler is way overspecced for the CPU's TDP. I also removed the CPU cooler multiple times (checked, cleaned, re-applied thermal paste) and re-used it on the Ryzen 7 1700 CPU too...]

I get the FreeNAS error messages, and PassMark's MemTest86 fails on both motherboards with the 2700 CPU and ECC RAM; it's 100% repeatable/consistent every time [MemTest86 screen dump image attached].
- No errors / failures on either motherboard with the Ryzen 7 1700 and the ECC memory.
- If I remove a single memory stick and run with just one, I can still reproduce the errors - it doesn't matter which memory stick I remove or leave in the system.
- No errors / issues on any combination of CPU / motherboard if I use plain, non-ECC DIMMs** (unfortunately I don't have any other ECC memory to test it with).
[**which means I would be totally unaware of the "issue" if I wasn't using ECC RAM, which raises a totally different set of questions...].

Example of the FreeNAS errors [they start an hour or two after boot, and then repeat consistently every hour]:
Code:
Dec 31 12:24:06 freenas-test MCA: Bank 15, Status 0xd42040000000011b
Dec 31 12:24:06 freenas-test MCA: Global Cap 0x0000000000000117, Status 0x0000000000000000
Dec 31 12:24:06 freenas-test MCA: Vendor "AuthenticAMD", ID 0x800f82, APIC ID 0
Dec 31 12:24:06 freenas-test MCA: CPU 0 COR OVER GCACHE LG RD error
Dec 31 12:24:06 freenas-test MCA: Address 0x40000032cf3f340
Dec 31 12:24:06 freenas-test MCA: Bank 16, Status 0xd42040000000011b
Dec 31 12:24:06 freenas-test MCA: Global Cap 0x0000000000000117, Status 0x0000000000000000
Dec 31 12:24:06 freenas-test MCA: Vendor "AuthenticAMD", ID 0x800f82, APIC ID 0
Dec 31 12:24:06 freenas-test MCA: CPU 0 COR OVER GCACHE LG RD error
Dec 31 12:24:06 freenas-test MCA: Address 0x40000032af923c0
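
For anyone who wants to read the Status value by hand, here is a rough Python sketch that decodes only the architectural x86 MCA fields (the top flag bits and the low 16-bit error code). The function and table names are my own, and the bit layout is my understanding of the generic MCi_STATUS format; the vendor-specific bits in between are deliberately left undecoded, so treat this as an illustration rather than a reference decoder.

```python
# Sketch: decode the architectural part of an x86 MCi_STATUS value.
# Assumption: generic MCA layout; AMD-specific bits are not interpreted here.

STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # overflow: an earlier error was overwritten
    61: "UC",     # uncorrected error (clear means corrected, shown as "COR")
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MISC register valid
    58: "ADDRV",  # ADDR register valid
    57: "PCC",    # processor context corrupt
}

REQUESTS = ["ERR", "RD", "WR", "DRD", "DWR", "IRD", "PREFETCH", "EVICT", "SNOOP"]
TYPES = ["I", "D", "G"]            # instruction, data, generic
LEVELS = ["L0", "L1", "L2", "LG"]  # LG = generic level


def decode_status(status):
    """Return flags, corrected/uncorrected state, and cache error details."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF
    result = {"flags": flags, "corrected": "UC" not in flags, "error_code": code}

    # Cache hierarchy error codes have the form 000F 0001 RRRR TTLL.
    if (code & 0xEF00) == 0x0100:
        rrrr = (code >> 4) & 0xF
        tt = (code >> 2) & 0x3
        ll = code & 0x3
        if rrrr < len(REQUESTS) and tt < len(TYPES):
            result["cache"] = f"{TYPES[tt]}CACHE {LEVELS[ll]} {REQUESTS[rrrr]}"
    return result


if __name__ == "__main__":
    info = decode_status(0xD42040000000011B)
    print(info["flags"], info["corrected"], info.get("cache"))
```

Run against the status above, this gives a corrected error with the overflow flag set and a generic cache read, which lines up with the "COR OVER GCACHE LG RD" text in the log.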


and the output of "mcelog":
Code:
root@freenas-test[~]# mcelog
mcelog: Unknown CPU type vendor 2 family 23 model 8
Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 15 TSC a8eaffdf2a0
ADDR 40000032cf3f340
TIME 1577814630 Tue Dec 31 12:50:30 2019
STATUS d42040000000011b MCGSTATUS 0
MCGCAP 117 APICID 0 SOCKETID 0
CPUID Vendor AMD Family 23 Model 8
mcelog: Unknown CPU type vendor 2 family 23 model 8
Hardware event. This is not a software error.
MCE 1
CPU 0 BANK 16 TSC a8eaffe0dc0
ADDR 40000032af923c0
TIME 1577814630 Tue Dec 31 12:50:30 2019
STATUS d42040000000011b MCGSTATUS 0
MCGCAP 117 APICID 0 SOCKETID 0
CPUID Vendor AMD Family 23 Model 8


System:
ASRock Rack X470D4U
Ryzen 7 2700 [now replaced by a 1700]
32 GiB ECC RAM : 2x Kingston KSM26ED8/16ME 16GB DDR4 2666
LSI 9211-8i 8-port 6Gb/s (IT-MODE)
Antec P101 Silent Case
Noctua NH-D15S CPU cooler
Seasonic Prime Gold 550W ATX PSU
6x WD Red (White Label) 8TiB drives
 

Attachments

  • IMG_2787.jpeg