SOLVED Spontaneous reboots - need ideas urgently!

Status
Not open for further replies.

GrahamBB

Explorer
Joined
Sep 6, 2014
Messages
77
We upgraded from 9.10.2-U6 to 11.1-U1 (details here: https://forums.freenas.org/index.ph...o-11-1-u1-periodic-crashes.61358/#post-436415) and, following recommendations, filed a bug here: https://redmine.ixsystems.com/issues/28249#change-164553.

We cannot roll back as the previous environments have been deleted (https://redmine.ixsystems.com/issues/28476).

Over the weekend a long copy was started by a staff member and ran overnight successfully. The system did NOT reboot whilst this activity was underway. However, after the file copy completed, the system spontaneously rebooted several times, with gaps of about 1 to 2 hours between reboots.

The recommendation in the bug report is:
"There can be two ways: either try to catch what's going on a physical (or may be serial, if you set it up) console when the server crashes, or try to figure out why crash does not leave the dumps. The second would be more reasonable but require to find developer time.

Could you also check your motherboard's event log for any events correlating with reboot? Just to be sure that those are indeed a software crashes."
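One rough way to do the correlation asked for above is to line up the reboot times against the entries exported from the motherboard (or iDRAC) event log. The following is only a minimal sketch in Python: the function name, the 10-minute window, and all sample timestamps and messages are made up for illustration, and the real times would have to come from your own logs.

```python
from datetime import datetime, timedelta


def events_near_reboots(reboots, events, window=timedelta(minutes=10)):
    """Return {reboot_time: [(event_time, message), ...]} for events
    logged within `window` of each reboot (before or after)."""
    matches = {}
    for reboot in reboots:
        matches[reboot] = [
            (ts, msg) for ts, msg in events if abs(ts - reboot) <= window
        ]
    return matches


# Hypothetical sample data, NOT taken from the system in this thread.
reboots = [datetime(2018, 1, 1, 2, 0), datetime(2018, 1, 1, 3, 30)]
events = [
    (datetime(2018, 1, 1, 1, 58), "hypothetical: watchdog timer expired"),
    (datetime(2018, 1, 1, 12, 0), "hypothetical: fan speed normal"),
]

for reboot, hits in events_near_reboots(reboots, events).items():
    print(reboot.isoformat(), hits if hits else "no nearby hardware events")
```

If a reboot has no hardware event anywhere near it, that would lean towards a software crash; a matching entry (for example a watchdog or power event) would point the other way. The window size is a guess and may need adjusting if the clocks on the BMC and the OS differ.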

Any thoughts on how to go about this, or any other suggestions, would be most welcome!

Cheers
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
What about the iDRAC "last screen" crash log and the system event viewer?
 

GrahamBB

Explorer
Joined
Sep 6, 2014
Messages
77
Redcoat said: What about the iDRAC "last screen" crash log and the system event viewer?

We've never used that; I'll go and do some digging/learning.
Ta!
 

GrahamBB

Explorer
Joined
Sep 6, 2014
Messages
77
Just an update:
We rolled back to 9.10.2-U6 and the problems have gone away; the system is now functioning as before, with no issues.
I have updated the bug report (28249) to reflect that. No response from the team so far on the watchdog issue or on the rollback behavior.

The rollback issue (28476) was worked around by removing a number of BEs (boot environments) from the FreeNAS boot menu in the GUI. This did NOT show more BEs in GRUB (as was expected). However, I had previously selected the 9.10.2-U6 BE in the GUI menu, but it had no effect (the system still booted into 11). During the restarts associated with the above testing, somehow the previous BE was loaded and run.

I noticed that in the emailed security report from the system, and when I checked, the GUI settings now seem to be honored.

C'est la vie!

Cheers
 