Dying PSU?

Status
Not open for further replies.

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Good afternoon,
I have been trying to diagnose a PSU that seems to be dying. My TrueNAS was operating happily throughout the PSU's life span, but when I left for school vacation, I also decided to power down the NAS (via GUI). When I returned and tried to restart the NAS, I got nothing. No lights on the motherboard (some of which usually are steady green or flashing, even if the unit is "off" as long as the PSU power switch is still on). One of the no-boot issues listed in the SuperMicro manual was a bad coin cell on the motherboard, so I changed it.

I extracted the PSU. Hooking up a PSU "tester" indicated good voltages. So I reinstalled the PSU. Without the SATA drives, the system booted OK. I shut it down, then attached the SATA power lines. This time the system booted but the pool was degraded, with three drives missing. Then it rebooted itself spontaneously w/o a command.

Given the symptoms, I presume this PSU is toast?
 
Joined
Oct 22, 2019
Messages
3,641
Without the SATA drives, the system booted OK. I shut it down, then attached the SATA power lines. This time the system booted but the pool was degraded, with three drives missing.
Would this not also implicate the motherboard as well?

It could be that at a certain load, which the drives just barely push over the threshold, the PSU fails. But I'd imagine it as a sudden hard cut, not a "reboot". (Unless you mean the power suddenly goes off and the system boots up again, rather than a normal reboot cycle?)
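To make the load-threshold idea concrete, here is a rough sketch of a 12 V rail budget. Every number in it (drive count, spin-up and idle currents, base load, rail capacity) is a made-up placeholder, not a measurement from this system; real values would have to come from the drive datasheets and the PSU label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    """12 V current draw of one drive. Values are hypothetical placeholders."""
    spinup_amps_12v: float  # peak current while the platters spin up
    idle_amps_12v: float    # steady-state current once spinning


def rail_load_amps(drives, base_amps, spinning_up=True):
    """Total 12 V load: base system load plus every drive's draw."""
    per_drive = [
        d.spinup_amps_12v if spinning_up else d.idle_amps_12v
        for d in drives
    ]
    return base_amps + sum(per_drive)


def headroom_amps(rail_capacity_amps, load_amps):
    """Positive means spare capacity; negative means the rail is over budget."""
    return rail_capacity_amps - load_amps


# Hypothetical example: 8 drives, 5 A base load, 20 A usable on the 12 V rail.
drives = [Drive(spinup_amps_12v=2.0, idle_amps_12v=0.5)] * 8
spinup_load = rail_load_amps(drives, base_amps=5.0, spinning_up=True)
idle_load = rail_load_amps(drives, base_amps=5.0, spinning_up=False)

print("spin-up headroom (A):", headroom_amps(20.0, spinup_load))
print("idle headroom (A):", headroom_amps(20.0, idle_load))
```

With these placeholder numbers the rail is over budget only while all drives spin up at once, which would be consistent with a failure that appears only once the drives are connected. It does not prove that this is what is happening here.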

What if you have the drives and everything plugged in normally, but don't let it boot beyond the motherboard's setup screen? Does it show all drives available in all ports?
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Would this not also implicate the motherboard as well?
That was my first thought until I was able to stably start the system w/o any SATA drive power lines attached. Perhaps changing the coin cell helped (it's been there since 2019). However, the system seemed to start/run fine with just the motherboard plugged in.
It could be that at a certain load, which the drives just barely push over the threshold, the PSU fails. But I'd imagine it as a sudden hard cut, not a "reboot". (Unless you mean the power suddenly goes off and the system boots up again, rather than a normal reboot cycle?)
I could have phrased that better. To the user, it looks like a reboot (i.e. the GUI becomes non-responsive), but the unit never announced a commanded reboot; it simply rebooted. Presumably, a power cut is what precipitated the reboot, and I am loath to run an expensive motherboard with a flaky PSU for fear of damaging it.
What if you have the drives and everything plugged in normally, but don't let it boot beyond the motherboard's setup screen? Does it show all drives available in all ports?
That's a bit hard to say because not all drives are visible to the boot manager, only the SATA ports on the motherboard. The stuff on the LSI HBA is handled in a separate, later screen AFAICT.

On the hardware side, the only thing that comes to mind is that I perhaps have a bad connection between the SATA power lines and the Lian Li SATA drive backplanes and/or between the drives and the backplanes. That seems really remote. But, I am going to have a look at the backplanes and the drives to see if there is anything odd going on.

Plus, take some measurements with my DMM to see if there are any shorts associated with the SATA power lines. Anything else I missed?

Since Seasonic requires me to send the dead PSU in for repair, I've bought a replacement Seasonic PSU (TX series) for the time being, which will hopefully allow me to re-use the already-installed cabling.
 
Joined
Oct 22, 2019
Messages
3,641
Plus, take some measurements with my DMM to see if there are any shorts associated with the SATA power lines. Anything else I missed?
I'm no expert by any means, but I don't put much weight into using multimeters to test the reliability and quality of a product.* Things might report the correct values and voltages when you test them with a multimeter, yet fail (or come close to it) when put under a load.

* Of course, yes, if you find a short or continuity where there shouldn't be any, then that's an outright "failure" regardless.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
When it comes to the DMM, I was concerned about an outright short between 12VDC and GND or 5VDC and GND on the SATA power lines, but found none. Only GND-GND on both buses came up as 0.2 Ohms, which is expected, as connecting the leads directly shows 0.2 Ohms as well. So it does not appear that the SATA power lines are causing a problem.
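The lead-compensation reasoning above can be written out as a small sketch. The 0.2 Ohm figures are the ones quoted in this post; the 1 Ohm short threshold and the 5000 Ohm rail-to-ground reading are assumptions made up for illustration, not values I measured.

```python
def net_resistance_ohms(measured_ohms, lead_ohms):
    """Subtract the meter leads' own resistance from a reading (floored at 0)."""
    return max(measured_ohms - lead_ohms, 0.0)


def looks_shorted(measured_ohms, lead_ohms, threshold_ohms=1.0):
    """Treat anything below threshold_ohms (an assumed cutoff) as a dead short."""
    return net_resistance_ohms(measured_ohms, lead_ohms) < threshold_ohms


LEAD_OHMS = 0.2  # reading with the two probes touched together (from the post)

# GND to GND: 0.2 Ohms measured, so effectively 0 Ohms once the leads are
# subtracted. Continuity is expected here, so this is not a fault.
print("GND-GND net ohms:", net_resistance_ohms(0.2, LEAD_OHMS))

# 12 V to GND: hypothetical reading; a healthy line should be far from 0 Ohms.
print("12V-GND shorted?", looks_shorted(5000.0, LEAD_OHMS))
```

This only catches a hard short with the system powered off; it says nothing about how the lines behave under load.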
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,829
Life just got more interesting. With a 450W Corsair PSU installed, the pool first showed 2 drives UNAVAILABLE, followed by the whole pool going kaploink and the only GUI option being to detach or export said pool.

Given my past experiences with TrueNAS, I will take the usual exhortations re: pool is dead, you need to rebuild, etc. with a grain of salt. First, I will replace the SFF-SATA wiring that went from the HBA to the two drives that are unhappy.

I am happy to report that Seasonic offers an advance replacement program for only $75, which is considerably less expensive than a new PSU.

I’ll start a new thread re: recovering the pool. Any updates re the PSU will appear here.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Possible AI-response post?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
If you even suspect the PSU is failing, get it out ASAP. A failing PSU can kill other components.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
@Constantin
Your last post was almost 2 months ago, not sure why someone revived it. I would assume you have fixed this problem by now. Did you ever fix the problem? If yes, I will close this thread.
 
