memtest86 locks up during Test 1 on X9SCM-F iff using 4 DIMMs

sugar scoop

Cadet
Joined
Nov 26, 2020
Messages
7
Hi all, I have a box on which I recently upgraded the motherboard & RAM via eBay, as my old gaming ITX mobo capped out at 16GB. Unfortunately, right after all my jails came up the box would panic and die. So I did the obvious thing and ran MemTest86 (v7.5). Funnily enough, it freezes almost instantly (in Test #1) once the test starts. Over the last few days I've been testing the DIMMs individually to try and determine what's going on. It appears the crashes only happen when all four DIMMs are installed. Every DIMM individually, and every pair, passes the test suite without issue.

The memory in use is listed on Supermicro's list of compatible memory. The only other fact I have been able to determine is it seems the test suite fails in different spots depending on the order of the DIMMs. I will be performing further test iterations to see if this yields anything useful.

At this point I'm not sure how to proceed. I'm open to suggestions, as I will just have to return both if I cannot get this figured out.

Hardware:
Supermicro X9SCM-F (2.3a BIOS)
4x 8GB DDR3-1600 ECC UDIMM (HMT41GU7AFR8A-PB)
Intel Xeon E3-1225 v2

Thank you!
 

Attachments

  • crash.jpg
  • memtest crash.jpg

QonoS

Explorer
Joined
Apr 1, 2021
Messages
87
The following helped me in the past when I had similar problems:
  • taking mobo, CPU, RAM apart > properly cleaning off the dust > cleaning the contacts gently with isopropyl alcohol > putting everything back together
  • trying a different PSU
  • complete power off with power cables removed and BIOS battery removed for min. 8h
and as a last option
  • raising RAM voltage. Your modules are specced for 1.35V; increase in 0.05V steps up to max 1.50V if the BIOS supports it.
edit:

An Intel document on memory validation for E5-2600 v2 CPUs suggests raising the voltage from 1.35V to 1.50V when using 2 of your DIMM modules "HMT41GU7AFR8A-PB" per channel. This is a strong indicator that you should do the same. ;)
See attachment too.
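As a rough sketch of the stepping described above (1.35V up to 1.50V in 0.05V increments), the values to try can be listed like this. The function name and the use of millivolts are assumptions for illustration only; the actual setting has to be made by hand in the BIOS, if the board exposes one.

```python
def voltage_steps_mv(start_mv=1350, stop_mv=1500, step_mv=50):
    """List DRAM voltages to try, in millivolts, from start to stop inclusive."""
    # Integers are used so the steps are not affected by float rounding.
    return list(range(start_mv, stop_mv + 1, step_mv))

steps = voltage_steps_mv()
for mv in steps:
    # Re-run the memory test at each step and stop at the first stable one.
    print(f"{mv / 1000:.2f} V")
```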
 

Attachments

  • 2021-08-27 00_13_41-DDR3 UDIMM ECC E5 V2 Family Memory List.png

sugar scoop

Cadet
Joined
Nov 26, 2020
Messages
7
Okay, didn't mean to reply like that. The inability to edit posts is ... annoying. Anyway.

Thanks, I've been cycling through checking and cleaning the sockets and sticks. I haven't tried a reset of the motherboard like you are describing and will try that.

Thank you for the information QonoS, that's really interesting. I dug around in the BIOS and was unable to find any voltage configuration, so I might be stuck there. That makes me worry that it's not a motherboard issue per se but an incompatibility with the ostensibly compatible sticks. That's frustrating.

My only memory settings are the Frequency (Auto, 1066, 1333, 1600), which is self-explanatory, and the RefreshRate (Disabled, 1x, 2x), though that latter one I'm unclear on and it is undocumented in the motherboard docs. Regardless, neither relates to voltage :/.
 

QonoS

Explorer
Joined
Apr 1, 2021
Messages
87
Is there a "Force SPD" switch?

Otherwise, Supermicro X10 Series boards usually have
"
DDR Voltage Level
Select Force to 1.50V to force all DDR3 memory modules to operate at 1.50V. Select Force to 1.35V to force all DDR3 memory modules to operate at 1.35V. The options are Auto, Force to 1.50V, and Force to 1.35V.
"
 

sugar scoop

Cadet
Joined
Nov 26, 2020
Messages
7
Just checked, neither setting appears to exist in the motherboard. I did try setting the Frequency (it was already auto-detected correctly). It had no effect.
 

sugar scoop

Cadet
Joined
Nov 26, 2020
Messages
7
So, that's interesting. I had tried resetting the BIOS settings when I initially encountered the issue. However, last night I tried your idea of pulling the battery and power and letting it drain & reset. This morning, the first memory tests passed. I'm going to let it run the rest of the day but it definitely is behaving differently. Maybe it'll work!

Awkwardly, I've already requested returns and purchased 1.5V memory. So... yeah.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The Xeon E5 v1/v2 stuff should be old enough to work decently well with memtest86+; maybe you can give that a try, too. These things are not as reliable as they should be (same goes for both the non-plus and the plus version).

> Awkwardly, I've already requested returns and purchased 1.5V memory. So... yeah.
1.35V DDR3 is also rated to run at 1.5V, so there's no advantage to buying the older 1.5V stuff. Of course, that's not to say every single DIMM will work perfectly (defects and incompatibilities and all).
 

sugar scoop

Cadet
Joined
Nov 26, 2020
Messages
7
Okay, so after re-examining all the BIOS settings that I flip (datetime, boot, AES-NI, and VT-d), the issue appears to be VT-d. Enabling that causes the lockup in memtest.
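A minimal sketch of how that kind of isolation can be laid out: a baseline run with defaults, then one run per changed setting. The setting names are just labels for the four items listed above, and the function is hypothetical bookkeeping, not anything that talks to the BIOS.

```python
# Labels for the settings changed from BIOS defaults (names are illustrative).
SETTINGS = ["datetime", "boot", "aes-ni", "vt-d"]

def one_at_a_time(settings):
    """Return a baseline run plus one run per single changed setting."""
    runs = [frozenset()]  # baseline: everything left at defaults
    runs += [frozenset([name]) for name in settings]
    return runs

runs = one_at_a_time(SETTINGS)
for changed in runs:
    label = ", ".join(sorted(changed)) or "defaults only"
    print(f"memtest run with: {label}")
```

If only one of the single-setting runs fails while the baseline passes, that setting is the likely trigger.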

Taking Ericloewe's advice, I also tried memtest86+ v5.31. It fails with and without VT-d enabled. It's a bit harder to capture since it crashes and reboots, but I've attached roughly the last visible frame from the run. So this didn't give any more useful information.

> 1.35V DDR3 is also rated to run at 1.5V
I didn't know that, that's good to know! In this case it may not matter even if voltage were the issue, since the X9 BIOS doesn't allow one to tweak voltage.


At this point, the question is: Why might enabling VT-d in my motherboard cause memtest/kernel panics?
 

Attachments

  • memtest86+ with vtd_Moment.jpg

QonoS

Explorer
Joined
Apr 1, 2021
Messages
87
Glad you found the culprit. ;)

Xeon E3 v2 and its chipsets date from the time when VT-d was still "new technology". Lots of bugs have been resolved since then, and I remember Sandy Bridge generally had issues with VT-d. But this is speculation...

Lots of things could be the root cause: BIOS, CPU, incompatibility of some kind, ...

If you do not plan to do PCIe passthrough just leave it disabled and be happy with it. :)
 

sugar scoop

Cadet
Joined
Nov 26, 2020
Messages
7
Yeah, it does seem like a strange confluence of factors. Only happens when I have >2 sticks in (or 2 sticks in mismatched channels) and I've got VT-d toggled on. The RAM configuration causing the incompatibility is the oddest thing to me.

Since the Host has the SAS card, I'm not gonna need to pass that through to any Guests. But it does mean I won't be able to swap it out with a hypervisor. Meh.

I'm going to see if I can do a test with 1.5V RAM and I'll be marking the thread resolved soon regardless of the outcome.

If anything, I now have an excuse to upgrade the hardware in a couple of years when it's really ancient.

Thank you both for your help in diagnosing what was going on. I was gonna start pulling out my hair as it didn't make any sense!
 