HARD locking (9.3 -> 11.1-U4)

Status
Not open for further replies.

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
So a while ago (over a year ago), I tried upgrading my FreeNAS box from 9.3 to 10.x. I can't recall the exact version I upgraded to, but after about 2 days it hard locked and printed nothing to the local console. The same thing happened 2 days later. I tried to reach out to the devs, but the debug data that got uploaded was blank/not helpful. So I rolled back to 9.3.

TODAY I upgraded to 11.1-U4 from 9.3. And guess what happened about 6ish hours later?... HARD locking again.

What do I mean by hard locking? The system does not respond over the network, and the local console does not even show that a keyboard is connected, nor does it respond to keyboard input. As far as I can tell the system has hard locked.

I really have no idea how to begin troubleshooting this, as normally I would expect some form of barf output to the console. And I don't understand what could have happened in NEWER versions to make the system unstable.

When running 9.3 I have never had this HARD locking before, and it has been completely stable for months on end (I have not seen any stability issues at 9.3).

Help! Please!
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
The Forum Rules list the hardware information that is important when diagnosing problems. Please include that information in a follow-up posting (and not in your signature, as is also suggested there).

A screenshot of the last console output might also be useful.
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
I know for a fact my hardware is compatible with FreeBSD, since it's been running FreeNAS for going on 5 years now, and has only been unstable after upgrading to 9.10 or 11.x. Plus I've already confirmed it matches the FreeBSD HCLs.

Also, did you completely miss the part where I said the console outputted nothing?
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
Yeah I figured I should probably submit it. Do you know if the stuff it attached is of any use? I'm really not sure where to begin getting useful info for this, since the normal means seem to have failed me. :(

Note that the bug is at https://redmine.ixsystems.com/issues/32091, though it is currently hidden due to the attached debug.
 

dlavigne

Guest
It will get the developer started on diagnosing the issue and he will leave a comment on the ticket if he needs more info to figure it out.
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
If there's anything more I can do to help with this, please let me know. :) But otherwise, duly noted! :D

 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
This just happened AGAIN. :(
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
Well, happened again. D:
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
Woke up to it being locked again...
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
And again.
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
As mentioned earlier I've already filed a bug report from the FreeNAS system, pretty sure that includes hardware specs.

I'm posting each time it happens so that I can accurately reflect the frequency of it.

And also, I don't take kindly to threats, and I don't believe you're a forum mod either.
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
It just occurred to me that since this is probably going to get indexed by Google, I should do a better job of listing hardware, in the vein of the relevant xkcd comic ( https://xkcd.com/979/ ).

I have personally and professionally been bacon-saved by obscure posts indexed on the internet, so while I faltered in upholding that, I am now going to correct that as best I can...

Hardware Specs:
  • Motherboard: Intel S1200BTSR
  • CPU: Intel Core i3-2100
  • RAM: 16GB (4x 4GB) Kingston ECC, 1333 MHz (part number 9905413-019.A01LF, as listed by dmidecode)
  • NICs: 2x 1GbE onboard + 2x 1GbE expansion (82571EB) + 4x 1GbE expansion (82575GB), all in lagg0 with LACP (switch configured for LACP too)
  • SAS HBA: SAS1068E (SAS3801E, FW 1.33.00.00-IT), which connects via 1x SFF-8088 cable to...
  • SAS Expander/Enclosure: SGI Rackable SE3016 bay with a bunch of 2TB and 500GB disks in not impressive configurations
  • Additional: some Intel SSDs plugged directly into SATA ports on the mobo for an SSD zpool
  • OS SSD: Intel 60GB 330 series

PSU and Chassis are inconsequential IMO, so not included.

Just to reiterate, this server has been reliable for going on 5 years now on FreeNAS 9.3 and prior. I have previously attempted to upgrade it to 9.10, and saw the same stability issues I'm seeing now that I'm on 11.1-U4, which suggests to me the issue is software. I'm at the point where I'm taking the bullet (downtime) to get this fixed, since waiting has not resolved the issue, whatever it may be.
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
Woke up to it locked up again. Well, looks like it's going to do this at least once a day now. Yay :/
 

dir_d

Explorer
Joined
Nov 9, 2013
Messages
55
I have a 9.3 server as well. It's using an H200 flashed to IT mode with firmware 16 on it. I upgraded from 9.3 a couple of times and it would hard lock and crash, due to FreeNAS switching to a newer driver version than the firmware on my RAID card. I rolled back and just kept it at 9.3. I just got another server that I'm installing 11.1-U4 on, and it has an LSI 9211 in IT mode on firmware 20. Make sure that your RAID card has the latest firmware on it if you want to update from 9.3 to 11.1.
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
I actually double-checked this with ericlowe before doing this. He informed me that SAS1 HBAs (which mine is) have not had a firmware update in years, and I didn't need to do it.

While either you, or he, may be right, I am unsure if that is what is going on here.

One thing I did note is that I have no idea where to get firmware for it anyway. I tried looking beforehand and came up empty, hence me jumping on IRC to ask.

A Dell H200 is SAS2 btw, so its firmware is completely different from mine.

 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
To summarize my forum reading experience from the last 40 months: everyone here has been advising against using (almost any) SAS-1 components for several years now. For me it's no big surprise that driver support for SAS-1 hardware is fading out in FreeBSD/FreeNAS.

Please keep us informed if your problem can be tracked down to a hardware error that never showed up before upgrading from 9.3 to 11.1 or indeed a software/driver error.
 

BloodyIron

Contributor
Joined
Feb 28, 2013
Messages
133
If it were a real concern, ericlowe would have advised me not to use it.

That being said, I plan to keep this thread apprised, should the solution be identified. So far, I have not heard anything from iX...

 

Agi

Dabbler
Joined
Feb 26, 2016
Messages
14
If this is happening once a day, I'm sure you can easily identify where the problem lies. Take some downtime on the NAS for a while (you can use your backup if needed ;) ). Then do the very obvious thing of pulling the HBA. Boot it up and see if it locks after a day without the HBA attached. Yes, it will not directly diagnose the HBA itself, but as many things are pointing in this direction it's an ideal starting point.

Whilst it also won't directly address the issue at hand, if I got this annoyed (which you are, based on your tone in the above posts) I'd have saved my config, wiped my install, installed from scratch and restored the config. This clearly doesn't resolve the underlying issue, but at least it will help remove any nasty conflicts from incompatible packages which may have slipped through the net.
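
Since the console shows nothing when it locks, it may also help to know roughly when each lock happens during a test like the one above. Below is a minimal sketch of a heartbeat logger, assuming Python 3 is available on the box. The function names, the log path and the 60 second interval are all hypothetical choices for illustration, not anything FreeNAS provides. It appends a UTC timestamp to a file and forces it to disk, so after a forced reboot the last line gives an approximate time of the hang.

```python
import os
import time
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def write_heartbeat(log_path: Path) -> str:
    """Append one UTC timestamp line and force it out to disk."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(stamp + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to commit the line before a possible hang
    return stamp


def last_heartbeat(log_path: Path) -> Optional[str]:
    """Return the last non-empty line of the log, or None if there is none."""
    if not log_path.exists():
        return None
    lines = [ln.strip() for ln in log_path.read_text(encoding="utf-8").splitlines()]
    lines = [ln for ln in lines if ln]
    return lines[-1] if lines else None


def run(log_path: Path, interval_s: float = 60.0,
        max_beats: Optional[int] = None) -> None:
    """Write a heartbeat every interval_s seconds; loop forever if max_beats is None."""
    beats = 0
    while max_beats is None or beats < max_beats:
        write_heartbeat(log_path)
        beats += 1
        time.sleep(interval_s)


# Hypothetical usage (the path is an assumption, pick one that survives a reboot):
#   run(Path("/var/tmp/heartbeat.log"))
# After the next forced reboot:
#   print(last_heartbeat(Path("/var/tmp/heartbeat.log")))
```

Comparing the last heartbeat across several lock-ups, with and without the HBA attached, would show whether they cluster around a particular time of day or a scheduled task. It only narrows down the timing; it does not say what caused the lock.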
 