Mini XL with Alzheimers

Status
Not open for further replies.

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Well, looks like my Mini XL is unhappy with me. At 3:01 in the morning, I got the following kernel message:

Code:
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
SMP: AP CPU #5 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #6 Launched!
Timecounter "TSC-low" frequency 1208365649 Hz quality 1000
igb0: link state changed to UP
MCA: Bank 5, Status 0x9400004000900090
MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
MCA: CPU 0 COR RD channel 0 memory error
MCA: Address 0x5fcb9dc00
MCA: Bank 5, Status 0xd400018000900090
MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
MCA: CPU 0 COR OVER RD channel 0 memory error
MCA: Address 0x3f55b640
MCA: Bank 5, Status 0xd400020000900090
MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
MCA: CPU 0 COR OVER RD channel 0 memory error
MCA: Address 0xc0d44d000
MCA: Bank 5, Status 0x9400004000900090
MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
MCA: CPU 0 COR RD channel 0 memory error
MCA: Address 0x75c248c00
MCA: Bank 5, Status 0xd400008000900090
MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x406d8, APIC ID 0
MCA: CPU 0 COR OVER RD channel 0 memory error
MCA: Address 0x17d074040


The Mini XL subsequently crashed and will not boot beyond a certain point, whereupon it reports a general protection fault and stops. I'll reboot it when I return home, follow along as best I can, and report the results here. The only recent change to the system was the swap of the OEM-supplied 32GB for 64GB two weeks ago.
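
For anyone reading the log above, here is a minimal Python sketch that decodes the MCA "Status" words by hand. It assumes the architectural IA32_MCi_STATUS layout from the Intel SDM (flag bits 63 down to 57, and the low 16 bits as the MCA error code, where memory controller errors have the form 0000 0000 1MMM CCCC). The function name `decode_mci_status` and the dictionary keys are hypothetical and are not part of FreeBSD or any tool mentioned in this thread.

```python
# Hypothetical helper for illustration only; not a FreeBSD or MemTest86 API.
# Assumes the architectural IA32_MCi_STATUS layout described in the Intel SDM.

# MMM field of a memory controller error code (0000 0000 1MMM CCCC)
MEMORY_OPS = {
    0b000: "generic",
    0b001: "read",
    0b010: "write",
    0b011: "address/command",
    0b100: "scrub",
}


def decode_mci_status(status: int) -> dict:
    """Decode the flag bits and the low 16-bit MCA error code of a status word."""

    def bit(n: int) -> bool:
        return bool((status >> n) & 1)

    code = status & 0xFFFF
    result = {
        "valid": bit(63),            # VAL: register holds a logged error
        "overflow": bit(62),         # OVER: another error arrived before this one was cleared
        "uncorrected": bit(61),      # UC: clear means the error was corrected
        "enabled": bit(60),          # EN
        "misc_valid": bit(59),       # MISCV
        "addr_valid": bit(58),       # ADDRV: the "Address" line is meaningful
        "context_corrupt": bit(57),  # PCC
        "mca_code": code,
        "memory_op": None,
        "channel": None,
    }
    # Memory controller errors match the pattern 0000 0000 1MMM CCCC.
    if (code & 0xFF80) == 0x0080:
        result["memory_op"] = MEMORY_OPS.get((code >> 4) & 0x7, "unknown")
        channel = code & 0xF
        result["channel"] = None if channel == 0xF else channel  # 1111 = not specified
    return result


if __name__ == "__main__":
    # The five status words copied from the kernel messages above.
    for word in (
        0x9400004000900090,
        0xD400018000900090,
        0xD400020000900090,
        0x9400004000900090,
        0xD400008000900090,
    ):
        d = decode_mci_status(word)
        kind = "uncorrected" if d["uncorrected"] else "corrected"
        over = " (overflow)" if d["overflow"] else ""
        print(f"{word:#018x}: {kind}{over} {d['memory_op']} error, channel {d['channel']}")
```

Read this way, every entry is a corrected read error on memory channel 0, and the entries marked OVER indicate that further errors arrived before the previous one had been cleared.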

I presume the first step is to go back to the OEM-supplied memory and see if it boots?
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
You could start by running a memtest for 24 hours. If you don't see any errors, re-install the OEM memory and see if it will boot successfully.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
When I read the error message, my first thought was a RAM issue, and then I read on and saw that you had swapped out the RAM. Exactly what memory did you install? Did you run Memtest86 on the new RAM right after installing it?

I suspect it will fail when you run Memtest86, so do nothing except run the test; do not even reseat the RAM first. Once the test fails, reseat the RAM and test again. If it then passes, keep testing for 3 days at a minimum. If it fails again, reinstall the original RAM and run Memtest86.

If you fix it and you are still running the new RAM, run a CPU stress test for a while for good measure; 30 minutes should be more than enough to verify it all works.

Report your results.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Thanks to both of you. Currently hunting down a USB stick for Memtest+ install. Will report back once I have a run in.

Curiously, when I rebooted the thing (to capture error messages on the console), the Mini XL came up without issues. Weird!

Thanks again!
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Use Memtest86, not Memtest86+. They are different and, to my knowledge, Memtest86 works better. I think the + version may throw false positives.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Paid the $39 for the pro version. Still working on getting it on a USB stick and booted.

As for the sticks, the seller claimed they were tested clean.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
As for the sticks, the seller claimed they were tested clean.
Are they on the Qualified Vendors List for RAM for your motherboard?
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
[Attached image: 71C5D3C2-0FE5-4DDD-B26E-9985EE9868DB.jpeg]
Are they on the Qualified Vendors List for RAM for your motherboard?
Yup. Precisely!

Test running now. Should take a while given that it’s only running one core at a time.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
Three in the morning is when the nightly status reports are generated. Some of that includes disk activity. You might be near the limits of the power supply, and it might not show on the memory tests because the disk drives aren't being used at the same time.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
That's not the screen display from the Pro test ($39). You need to boot in UEFI mode to get it to run the latest version - 7.4 Pro - it'll look like this:
[Attached image: screenshot of the test screen]

Allegedly you will also be able to set it up to inject ECC errors. I have been in discussion with Passmark because, while I saw screen messages suggesting injection was happening (see the last 1.5 lines above), I never saw any system response or other indication that it actually took place. Passmark now has a debug file from me to examine.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Three in the morning is when the nightly status reports are generated. Some of that includes disk activity. You might be near the limits of the power supply, and it might not show on the memory tests because the disk drives aren't being used at the same time.
@wblock, yes, when I was in the middle of my Mini memory testing saga, I realized that my drives were out of the chassis and that constituted a potentially significant difference. I did probe my PSU outputs (with a Fluke 787, not a $5 Home Depot job) after restarting with drives in place and all checked out OK, and I had the high CPU temperature issues also, but I'm keeping the PSU in mind for possible replacement.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Thanks! I will restart the process and report back when it’s completed a run or two. Part of the problem is how quickly the setup screens flash by. It’s relatively easy to set the memtest flash drive as a boot drive via the ASRock BIOS. How is it done via UEFI?

My planned case upgrade is going to include a PSU upgrade to 650W. Should hopefully cover the capacitor aging issue, etc.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
You should also be able to toggle to use multiple CPUs at the same time to speed up the memory testing. In my mind it will place a larger stress on the system so I prefer it.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
How is done via UEFI?
F11 at the Asrock screen with 4(?) options.
Select your USB.

Edit - you have to have enabled UEFI boot in the main BIOS BOOT screen.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Memtest 7.4 Pro has been running for three hours now - thank you! All 8 cores are doing their thing without issues thus far, despite ECC error injections and so on.

If the PS is at fault, there is no way to stress it with memtest, however. The HDDs are spinning idly.

With every pass taking about 4 hours and 4 passes, I guess I’ll know tomorrow afternoon what memtest thinks of my rig.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Passmark responded on the review of my debug file:
"Yes, it does appear that your chipset supports ECC injection. However, ECC injection has been disabled by your BIOS. Once it is disabled, it cannot be re-enabled until the next system reset.
You may want to check your BIOS setup to see if there is an option to enable ECC injection. Otherwise, you would need to flash a custom BIOS to prevent the ECC injection from being disabled."
There isn't such an option in my BIOS, nor I imagine in yours as they are the same MB. $39 blown.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
[Attached image: 62469A35-666B-42F3-BB45-4CE23B4ABBBC.jpeg]
Can’t remember if there was such a setting. Would have to review all the settings that Asrock allows you to change...

On a related note, does the RAM I’m using look right to you? The ASRock QVL for my board is here. As best I can tell, my memory is listed third from the bottom.

Thank you for following up with Passmark. That’s good info to know.
 