Supermicro H11DSi only sees 768GB RAM

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Do I understand correctly that the host OSes all have no problem using all available memory? Or are you using dmidecode to inspect the DIMMs directly and the OS only recognizes some of the RAM?

If the former, this is a simple IPMI bug of little consequence.
 

Psynapsx

Dabbler
Joined
Oct 31, 2020
Messages
28
Do I understand correctly that the host OSes all have no problem using all available memory? Or are you using dmidecode to inspect the DIMMs directly and the OS only recognizes some of the RAM?

If the former, this is a simple IPMI bug of little consequence.

The OS does NOT see all available memory. Ubuntu reports the same as the Supermicro IPMI: 6 of 16 DIMMs detected, i.e. 6 x 128GB = 768GB instead of 16 x 128GB = 2TB.
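
For anyone who wants to check the same thing from the OS side, here is a minimal Python sketch that totals the DIMMs listed in `dmidecode --type 17` style output. The sample text is made up for illustration (the slot names follow the P1-DIMMxx pattern this board uses, everything else is hypothetical), and the field layout is assumed to match what dmidecode prints; sizes given in MB, GB or TB are accepted.

```python
import re

# Hypothetical excerpt in the style of `dmidecode --type 17` output.
SAMPLE = """\
Memory Device
    Size: 128 GB
    Locator: P1-DIMMA1

Memory Device
    Size: No Module Installed
    Locator: P1-DIMMB1

Memory Device
    Size: 131072 MB
    Locator: P1-DIMMC1
"""

UNIT_TO_GB = {"MB": 1 / 1024, "GB": 1, "TB": 1024}


def parse_dimms(text):
    """Return {locator: size_in_GB}; empty slots map to 0."""
    dimms = {}
    for block in text.split("Memory Device")[1:]:
        # Anchor at line start so "Bank Locator:" is not picked up.
        loc = re.search(r"^[ \t]*Locator:[ \t]*(\S+)", block, re.MULTILINE)
        size = re.search(r"^[ \t]*Size:[ \t]*(\d+)[ \t]*(MB|GB|TB)", block, re.MULTILINE)
        if loc is None:
            continue
        gb = int(size.group(1)) * UNIT_TO_GB[size.group(2)] if size else 0
        dimms[loc.group(1)] = gb
    return dimms


dimms = parse_dimms(SAMPLE)
populated = [slot for slot, gb in dimms.items() if gb > 0]
total_gb = sum(dimms.values())
print(f"{len(populated)} of {len(dimms)} slots populated, {total_gb:g} GB total")

# The arithmetic from this post: 6 of 16 modules at 128 GB each.
print(6 * 128, "GB detected of", 16 * 128, "GB installed")
```

On a real system you would feed it the output of `dmidecode --type 17` (run as root) instead of SAMPLE and compare the total against what the OS reports.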
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,175
Alright, I got switched around and thought the 768 GB were the full amount.

That being the case, my suggestion would be to probe Supermicro support for a beta system firmware image with an updated IMC initialization. They always seem to have beta firmwares lying around.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
I'm not reporting to anyone, except myself:)
My first question would be: why does Supermicro list FAKE specifications on their site and user manual? Why list 128GB DIMMs as supported when they did not validate any?

Moderator hat on, that's out of bounds, please don't make accusations of FAKE specifications unless you have proof. And you already disproved it, since you said six modules did work. I understand your frustration, but let's be realistic and fair.

It is quite common for processor manufacturers to create support for DIMM sizes that do not yet exist.

Board manufacturers generally pass this information on verbatim. Since memory controllers these days are integrated into the CPU, "support" on a motherboard mostly means that you run the copper traces from the CPU socket to the RAM socket using the CPU manufacturer's specifications as a guide for maximum length and other technical issues. If a 64GB DIMM is the largest thing out there, but the CPU theoretically supports 128GB, it's totally reasonable for a manufacturer to list the hypothetical 128GB DIMM as supported on the basis that the CPU is capable of it, and the traces are all run in the same way, so it's sorta expected that it'll work.

Because this doesn't ACTUALLY work out 100% of the time in practice, the X9 Ivy Elpida debacle being the poster child for this, Supermicro also does test memory modules to identify specific ones that have passed. Quite often, these are the same ones that they resell under the Supermicro brand.

Saying that 128GB modules are supported does not mean EVERY 128GB module is supported, even if the specs seem to match.

System integrators often spend a lot of time and effort messing around with this kind of thing to get working configurations. If you don't have the resources to "swing-and-miss" at some of these configurations, please consider whether you might be better off buying a prebuilt system from a vendor who has validated a configuration.
 

Psynapsx

Dabbler
Joined
Oct 31, 2020
Messages
28
Moderator hat on, that's out of bounds, please don't make accusations of FAKE specifications unless you have proof. And you already disproved it, since you said six modules did work. I understand your frustration, but let's be realistic and fair.

My claim is not that it does not support 128GB modules. My claim is that it does not support 2TB total memory. Yes, 6 modules work at the same time, which is 768GB, not 2TB.

I do not expect that all 128GB modules are supported. However, claiming that it supports 128GB modules while having zero validated modules makes no sense.
How can you claim 128GB support if you haven't validated a single one?
The specifications are misleading and some could easily characterize this as being fake specifications.
When buying a motherboard, I'm interested in the motherboard's specifications, not my CPU's, as I already know those. The manufacturer should tell me accurately if the board cannot use some of the CPU's capabilities, or at least not claim support for something they did not validate.
BTW, in my ASRock Rack motherboard the same CPU detects all DIMMs correctly. The problem is Supermicro related.

I will continue to build my own systems, whether or not I have 20k USD for memory, because I like building systems, and I choose to hold manufacturers accountable because I feel that's the right thing to do.
I also could not get away with selling a car with heated seats and then saying, well, maybe they will not work when you are cold, oops.
 

Psynapsx

Dabbler
Joined
Oct 31, 2020
Messages
28
The site and user manual claim 128GB DIMM support.
When asked about which 128GB DIMM to buy: well we don't know, haven't validated any:)

Honestly, do you think it's fair? I don't think so.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
How can you claim 128GB support if you haven't validated a single one?
The specifications are misleading and some could easily characterize this as being fake specifications.

Because the CPU supports it, and nothing in the board design precludes it. Just like I said a few messages above.

Whether or not YOUR 128GB modules work is another matter. They seem to, perhaps not fully, perhaps needing some BIOS update, it's hard to say.

Honestly, do you think it's fair? I don't think so.

I classify this as wishful thinking on your part. But I've spent decades in this industry. We used to have to worry about all sorts of stupid stuff with memory compatibility. These days, I am constantly amazed it works as well as it does.
 

Psynapsx

Dabbler
Joined
Oct 31, 2020
Messages
28
I did some more testing since then. I pulled 8x Hynix 64GB ECC LRDIMM 2666 HMAA8GL7AMR4N-VK modules and another EPYC 7551 from a perfectly working H11SSL system of mine to test the H11DSi with a single CPU configuration. These modules are on the validated parts list for the H11DSi.

The board detected only 4 modules out of 8:
P1-DIMMC1
P1-DIMMD1
P1-DIMMG1
P1-DIMMH1

Not detected:
P1-DIMMA1
P1-DIMMB1
P1-DIMME1
P1-DIMMF1

Looks like the motherboard is faulty and it needs to be replaced.
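
For reference, the detected/missing split above can be reproduced with a small Python sketch. The slot names are the ones listed in this post; grouping them into adjacent-letter pairs is only an assumption made for display and says nothing definite about how the board wires its memory channels.

```python
# The eight slots populated in the single-CPU test (A1 through H1).
expected = [f"P1-DIMM{ch}1" for ch in "ABCDEFGH"]

# Slots the board actually reported.
detected = {"P1-DIMMC1", "P1-DIMMD1", "P1-DIMMG1", "P1-DIMMH1"}

missing = [slot for slot in expected if slot not in detected]
print("missing:", ", ".join(missing))

# Show the slots in adjacent-letter pairs (a display choice, see note above).
for i in range(0, len(expected), 2):
    pair = expected[i:i + 2]
    status = ["ok" if slot in detected else "MISSING" for slot in pair]
    print(pair, status)
```

Running the same comparison after swapping modules between slots would show whether the failure follows the slots or the DIMMs.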
 