Updated to 11.2 and now have random reboots

Status
Not open for further replies.

kdbaumann

Explorer
Joined
Mar 19, 2013
Messages
50
I know it's related to the update since I updated one server and this started to happen. I waited 4 weeks and updated a second server and now that one is also having random reboots.

Thoughts? I first thought that it had to do with NTP, but I changed all of the NTP servers I was using and it still happens, though at a somewhat lower rate.

I thought it was just something wrong with the one server until I updated the second one and now am having the same issues.

Kurt
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
It could be memory related, i.e. out of memory. I would update to BETA 3 and see if it's better.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Oh well you didn't say that. If that's the case I would move back to 11.1U6.
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,377
What specs for the machine?
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
"random reboots" is lacking some detail. What is the hardware, exactly? Does the machine show errors either before or after the reboot?
 

kdbaumann

Explorer
Joined
Mar 19, 2013
Messages
50
Both of these servers have been running like sewing machines for years. The 16 bay server has been running for roughly 10 years with one version or another of FreeNAS on it. Not one random reboot.


This server is the one I upgraded first, to Beta2, and it started rebooting randomly. I still believe it is tied to NTP, though.

OS Version: FreeNAS-11.2-BETA3 (Build Date: Sep 10, 2018 1:18)
Processor: Intel(R) Xeon(R) CPU X5560 @ 2.80GHz (16 cores)
Memory: 12 GiB
Drives: 12 x 5 TB


This server has been in use for over 3 years with zero issues. Issues only started with Beta3.

OS Version: FreeNAS-11.2-BETA3 (Build Date: Sep 10, 2018 1:18)
Processor: AMD FX(tm)-8320 Eight-Core Processor (8 cores)
Memory: 32 GiB
Drives: 16 x 8 TB
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
What motherboard does it have? Do you have an HBA? What network cards are you using? Are you using a SAS expander? If so what model? What is your power supply MODEL?
 

kdbaumann

Explorer
Joined
Mar 19, 2013
Messages
50
I will gather this info. But here's my question.

Given that two servers are showing the same symptoms, and those symptoms didn't start until after the upgrade, are you suggesting that there is a driver issue?

The power supplies are enterprise server grade by EMACS, and I swap them out when they complain. They aren't complaining, and all of them are new (last 6 months). The mobo of one is a Supermicro (I will get the exact details later); the other is a desktop grade AMD board, I think ASUS. CPUs: dual Intel in one and a single AMD in the other.

The network cards are 2 Intel X520-DA2 10Gb PCI 10Gbps Dual Server Adapter E10G42BTDAG1P5, about a year old. The HBAs are LSI Internal SAS SATA 9211-8i.

Hope this helps.
 

Ixian

Patron
Joined
May 11, 2015
Messages
218
I don't know how anyone could possibly answer that question without a lot more detail. It could be a thousand things.

Also why would you update a server that has been "running like a sewing machine" for years to a beta? A quick perusal of these forums would tell you that the 11.2 betas aren't production ready, including beta 3.

What you should do is downgrade to 11.1U6 and wait.
 

kdbaumann

Explorer
Joined
Mar 19, 2013
Messages
50
Note: I am not complaining, but rather trying to figure out what's going on so that perhaps a fix can be implemented, or whatever is going on can be noted for others. :) These are my personal internal servers, not ones I run for customers; that I would never do. But my desire to play with a beta has nothing to do with my question, other than that I am using a beta. A beta's whole purpose is to find out where the rough edges are and fix them. I am just trying to figure this out so that either I learn I have something wrong in my setup, or a bug is found whose fix helps the beta become production ready.

That being said, what more info is needed? I will get the actual mobo model numbers for everyone later today. But I think so far I have answered the questions that were asked. If not, point that out and I will get more data.

Thanks. Just doing my part to help move beta from beta to production.

PS: I think there is a software or config issue, since this hardware has been running for a long time with little issue. The problem first showed up on one server; I waited for a new update before adding the other server into the mix, and I am still getting the problem, which is why I believe it's something with the software and not the hardware. Logically, if I had two servers on this version and only one was acting up, I would suspect hardware. But obviously we need to rule hardware out before moving on to other things.

Also note that this issue was reported on the bug tracking site and was marked as fixed, yet it is still occurring, at least for me.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
We appreciate your willingness to test the beta. I know I don't have the time or ability to put my data's accessibility at risk.
A software update that uses devices differently can definitely reveal hardware issues that have been there the whole time. For example, if a device's power-save mode is no longer supported, you could run into power issues that you were just barely avoiding. You mentioned you have "enterprise server grade" power supplies, but you said nothing about the model or wattage.

Please provide the full detailed specs of the system. Also, have you reviewed any logs? Does your motherboard support logging of memory errors or similar?
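
For the log review, here is a minimal Python sketch of one way to pull out whatever was logged just before each boot. It is not FreeNAS tooling. The function name `lines_before_boots` and the `BOOT_MARKER` string are assumptions made for illustration: FreeBSD kernels print a copyright banner early in boot, but check the exact text in your own log before relying on it.

```python
from collections import deque
from typing import Iterable, List

# Assumption: this substring appears once at the start of every boot in the log.
# Verify against your own log and adjust if the banner text differs.
BOOT_MARKER = "The FreeBSD Project"


def lines_before_boots(lines: Iterable[str], marker: str = BOOT_MARKER,
                       context: int = 5) -> List[List[str]]:
    """For each boot marker found, return the `context` lines logged just before it."""
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    found = []
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            # Whatever is in the window is the last thing logged before this boot.
            found.append(list(recent))
            recent.clear()
        else:
            recent.append(line)
    return found
```

To try it, open the system log (commonly /var/log/messages on FreeBSD-based systems, which is also an assumption here) and pass the file object to `lines_before_boots`. If the lines before each unexpected boot show nothing at all, that points more toward a hard reset or power problem than a clean software shutdown.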
 