TrueNAS-12.0-U8.1 unscheduled reboots

berrick

Explorer
Joined
Mar 19, 2013
Messages
78
This is well-tested hardware and, when running FreeNAS 9.10, it had ZERO issues.

Since doing a fresh install of TrueNAS-12.0-U8 earlier this year, this NAS has been plagued by random reboots.

The system comprises:
  • Supermicro X9SCM
  • Intel(R) Xeon(R) CPU E31270 @ 3.40GHz
  • Dual Intel NICs
  • Dual PSUs
  • 16GB ECC RAM
  • 4 x 2Gb SATA disks
  • 1 x SSD boot disk

What has been done so far:

No, it's not dusty.
No, it's not in an environment prone to condensation.

Looked at:
/var/log/messages - nothing
/var/log/console.log - nothing
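
When /var/log/messages appears to contain nothing, the lines written just before each boot are usually the ones worth reading closely. Below is a minimal Python sketch of that idea. It assumes the log contains the `---<<BOOT>>---` marker that FreeBSD 12 kernels write at the start of each boot; that is an assumption about this system, and the function names `lines_before_boots` and `report` are made up for illustration.

```python
from collections import deque

# Assumption: FreeBSD 12+ kernels log this marker at the start of every boot.
BOOT_MARKER = "---<<BOOT>>---"


def lines_before_boots(lines, context=5):
    """Return, for each boot marker, the `context` lines logged just before it."""
    recent = deque(maxlen=context)  # rolling window of the latest non-marker lines
    results = []
    for line in lines:
        line = line.rstrip("\n")
        if BOOT_MARKER in line:
            results.append(list(recent))
            recent.clear()
        else:
            recent.append(line)
    return results


def report(path="/var/log/messages", context=5):
    """Print the last lines before every boot found in the log at `path`."""
    with open(path, errors="replace") as log:
        for i, block in enumerate(lines_before_boots(log, context), start=1):
            print(f"--- last lines before boot #{i} ---")
            print("\n".join(block))
```

Calling `report()` as root would print a short block per boot. If the lines before an unscheduled reboot are ordinary housekeeping entries with no shutdown messages, that suggests the system went down abruptly rather than being shut down cleanly.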

Created debug file

CPU usage, system load and CPU temperature - the NAS is doing nothing
CPU mean temp = 36.55
CPU mean usage = idle 86.64, everything else approx. 0
System load mean = all less than 1, i.e. 0.something

Looked for IPMI errors in the logs - none found

At the weekend I ran memtest v9 for 8 hours - no fault found

Brought the system back online and upgraded to TrueNAS-12.0-U8.1. Within hours the system had another unscheduled reboot. The operating system successfully came back online at Mon Aug 22 02:26:36 2022.
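
If more of these alerts arrive, it may be worth checking how far apart the reboots are. Evenly spaced reboots, or reboots at the same time of day, would hint at something scheduled rather than a random hardware fault. The rough Python sketch below parses timestamps in the format shown in the alert above; the function names are hypothetical.

```python
from datetime import datetime

# Matches the timestamp format in the alert, e.g. "Mon Aug 22 02:26:36 2022".
ALERT_FORMAT = "%a %b %d %H:%M:%S %Y"


def parse_alert_time(text):
    """Parse one alert timestamp, tolerating a trailing full stop."""
    return datetime.strptime(text.strip().rstrip("."), ALERT_FORMAT)


def hours_between_reboots(timestamps):
    """Return the gap in hours between consecutive reboot alerts."""
    times = sorted(parse_alert_time(t) for t in timestamps)
    return [(b - a).total_seconds() / 3600 for a, b in zip(times, times[1:])]
```

Passing the list of "came back online" timestamps copied from the alert emails gives the gaps between reboots in hours.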


Is there anything else I can check? Maybe I have overlooked something in the logs. I wouldn't class myself as an expert.
 

berrick

Forgot to mention: I have also checked the CMOS battery and it's good.
 

berrick

SMART is also disabled on the SSD boot disk.

Does anyone have any other suggestions?
 

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
In the IPMI web interface there is a Health Event Log that records hardware failures.

I can't tell for the X9 motherboards, but for X10 and X11 it works very well.
 

berrick

@blanchet
Thanks for the reply. Everything is OK; nothing is being logged in IPMI to say there is a problem.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
A "random, unexpected reboot" is sometimes the result of the "watchdog timer", but that would normally leave an entry in the System Event Log (SEL) identifying the watchdog as the cause of the reboot. Does ipmitool sel elist show any detailed events?
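
For anyone who wants to filter the SEL automatically, here is a small Python sketch that runs the command above and keeps only the watchdog-related entries. It assumes that `ipmitool sel elist` prints one event per line with fields separated by `|`; the function names and the sample events in the usage note are invented for illustration and are not taken from this system.

```python
import subprocess


def parse_sel(text):
    """Split SEL output into lists of stripped fields, one list per event."""
    events = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) >= 4:  # skip blank or unexpected lines
            events.append(fields)
    return events


def watchdog_events(text):
    """Return only the events where any field mentions a watchdog."""
    return [e for e in parse_sel(text) if any("watchdog" in f.lower() for f in e)]


def read_sel():
    """Run the command from the post; needs ipmitool installed and usually root."""
    result = subprocess.run(
        ["ipmitool", "sel", "elist"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On the NAS itself, `watchdog_events(read_sel())` would return the matching entries. An empty list means the SEL holds no watchdog entry; it does not rule out other causes.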
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
berrick said:
"This is well tested hardware and when running FreeNAS 9.10 had ZERO issues.

Since doing a fresh install of TrueNAS-12.0-U8, earlier this year, this NAS has been plagued with these random reboots."

Can you return to the version of FreeNAS/TrueNAS you were running before you went to 12.0-U8? That would be a good first step to see whether this is a software or driver issue. If you have upgraded your pool, though, I think you are stuck with the current version.

I'm assuming you have a good UPS.

As for troubleshooting a hardware issue, I'd remove one of the dual power supplies and run Prime95 or similar for several hours. If that passes, reinstall that power supply, remove the other one, and retest. I'm looking for a power supply that is causing problems. Prime95 will also stress the CPU and the motherboard voltage regulators.

While on the subject of power supplies: normally the dual power supplies connect to a semi-intelligent board that then distributes power inside the chassis. This board could be failing. I have seen that before, though not as an intermittent fault, but the fact that I haven't seen it doesn't make it impossible. I'm not suggesting you replace the board; rather, obtain a standard power supply that can be plugged into the motherboard and the rest of your system, to take the entire power system out of the picture. It may not be a pretty sight while testing, but it can be done.

Good luck and don't pull your hair out. Troubleshooting these kinds of problems can take a long time to solve. I really hope @HoneyBadger provided you with the key that provides a quick solution.
 