Same MCA errors on two different machines

Status
Not open for further replies.

rs@joy

Cadet
Joined
Jan 12, 2018
Messages
1
We have two Supermicro X9DRi-F Storage Servers running FreeNAS with FreeBSD 11.1-Stable.
Since upgrading from FreeBSD 9 to 11, we have encountered strange behavior quite similar to this thread.
The systems run fine for a few hours to a few days (first machine) or a few weeks (second machine, the backup), then, without an obvious trigger, start spamming MCA memory errors.
Unfortunately, the system then becomes unresponsive and can only be restarted with a power reset.
I followed the troubleshooting process in the thread above, and the memory addresses reported in the errors are not allocated to any physical memory. They simply don't exist.
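
The check described above, whether a reported address falls inside any range of installed memory, could be sketched roughly as follows. This is only an illustration: the function name `address_is_backed` and the ranges in `example_ranges` are hypothetical and not taken from these machines. The real ranges would have to come from the system's own memory map.

```python
from typing import Iterable, Tuple

# A memory range as (base address, length in bytes).
Range = Tuple[int, int]


def address_is_backed(addr: int, ranges: Iterable[Range]) -> bool:
    """Return True if addr lies inside any (base, length) range."""
    return any(base <= addr < base + length for base, length in ranges)


# Hypothetical usable-RAM ranges, for illustration only.
example_ranges = [
    (0x0000_0000, 0x0009_F000),      # low memory
    (0x0010_0000, 0x7FF0_0000),      # 1 MiB up to 2 GiB
    (0x1_0000_0000, 0x8_0000_0000),  # 4 GiB up to 36 GiB
]

if __name__ == "__main__":
    # Hypothetical addresses, not values from the actual MCA log.
    for addr in (0x0020_0000, 0x9_0000_0000):
        state = "backed" if address_is_backed(addr, example_ranges) else "not backed"
        print(f"{addr:#x}: {state}")
```

An address that comes back as not backed for every range matches what is described here: the error points at memory that is not installed.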

How is this possible?
Is there a workaround?
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
can only be restarted with a power reset.
Have you tried an ACPI shutdown from the power button or IPMI? I wonder if the SMBus/ACPI memory mapping is getting fudged. I don't know a lot about this topic, but it may be something to look into.

Check the BIOS and IPMI firmware to make sure they're up to date. Also check any HBAs and other PCI devices, as they can all use the SMBus.
 