Disks/pool gone after reboot/shutdown, can get it back.. but.. help

mirkywoody

Cadet
Joined
Nov 19, 2019
Messages
5
Hello there awesome people.
I have now spent 20 hours trying to troubleshoot something, and I think it's time to ask for help. This is part of my own experiment/learning at a school.

Hardware
Server: HP DL360G7 - dual x5660 - 96gb ram ...
Controller: LSI SAS 3801E (chip: c1068e) ... Flashed to IT mode.

Diskshell: 16bay disk shell with... 2x controller/expander: Promise J630S
Disks: 16x Seagate ST33000650SS A003 ... 3TB disks.

FreeNAS
version: 11.2-U6
installed on hardware raid1, onboard controller. (also had on usb stick before).

Problem: the disks and the pool I created disappear after a reboot or shutdown, and will not appear again until I disconnect and reconnect the SAS cable to the disk shell. Everything actually works: I can make a share, transfer files, etc. But it is all gone after a reboot/shutdown, and I can only get the pool back by re-plugging the cable.
Of course I don't want to reconnect the cable on every start.


Further story...
On a fresh install of FreeNAS, with the disk shell connected, the disks do not appear! However, if I unplug the cable to the disk shell and connect it again, the disks will show. (One important thing I found out: you have to wait a sufficient time before plugging it back in, or it won't work. The disks won't come up; it needs the time.)
I can then make my pool, RAID-Z3 on all 16 disks, and make my shares, etc. Everything works. ...

But then after a reboot, the disks are not seen under "Disks" or with the command "mptutil show drives". There is also nothing from "sesutil map" or "camcontrol devlist", but the controller itself can still be seen with, for example, "mptutil show adapter".
I can then do the re-connect trick, and everything can be gotten back into order.
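
The check described above can be scripted. What follows is only a minimal sketch, not a known fix: it counts the `daN` disks listed by `camcontrol devlist` and, if fewer than the expected 16 show up, asks CAM to rescan with `camcontrol rescan all`. Whether a software rescan can stand in for re-plugging the cable on this HBA is an assumption that would need testing. The function names are made up for illustration.

```python
import re
import subprocess
from typing import List

# camcontrol devlist lines end with a parenthesised list of device names,
# for example "(pass0,da0)". Only the daN entries are disks.
DEVICE_LIST = re.compile(r"\(([^)]*)\)\s*$")


def disk_names(devlist_output: str) -> List[str]:
    """Return the daN device names found in `camcontrol devlist` output."""
    names = []
    for line in devlist_output.splitlines():
        match = DEVICE_LIST.search(line)
        if not match:
            continue
        for name in match.group(1).split(","):
            name = name.strip()
            if re.fullmatch(r"da\d+", name):
                names.append(name)
    return names


def missing_disks(devlist_output: str, expected: int = 16) -> int:
    """Return how many of the expected disks are not visible."""
    return max(0, expected - len(disk_names(devlist_output)))


def check_and_rescan(expected: int = 16) -> int:
    """Run on the FreeNAS box: list devices and rescan if disks are missing."""
    result = subprocess.run(
        ["camcontrol", "devlist"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    short = missing_disks(result.stdout, expected)
    if short:
        print("%d of %d disks missing, requesting a rescan" % (short, expected))
        # Assumption: a CAM rescan might rediscover the shell; untested here.
        subprocess.run(["camcontrol", "rescan", "all"], check=False)
    return short
```

Calling `check_and_rescan()` from a root shell right after boot would show whether the disks can be brought back without touching the cable.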

I also tried with the FreeNAS nightly dev, just to see, but it was same.

I think I read something about my controller maybe not supporting disks larger than 2 TB? But I can see them as 2.7 TiB or so (which is what a 3 TB disk works out to in binary units), and everything works, when it works.
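
The two numbers in that paragraph can be checked with a little arithmetic. This sketch assumes 512-byte sectors and assumes the "2 TB" limit in question is the usual 32-bit LBA limit; neither is confirmed in this thread.

```python
SECTOR_BYTES = 512   # assumption: 512-byte sectors
TIB = 2 ** 40        # one tebibyte in bytes

# Nominal capacity of a "3 TB" disk, in decimal bytes.
NOMINAL_BYTES = 3 * 10 ** 12

# Largest disk addressable with a 32-bit sector number.
LBA32_LIMIT_BYTES = 2 ** 32 * SECTOR_BYTES


def to_tib(size_bytes: int) -> float:
    """Convert a byte count to tebibytes."""
    return size_bytes / TIB


print("3 TB disk      = %.2f TiB" % to_tib(NOMINAL_BYTES))      # 2.73 TiB
print("32-bit LBA cap = %.2f TiB" % to_tib(LBA32_LIMIT_BYTES))  # 2.00 TiB
print("disk exceeds cap:", NOMINAL_BYTES > LBA32_LIMIT_BYTES)   # True
```

So a reported size of about 2.7 is the full disk expressed in TiB, not a disk truncated at the 32-bit limit, which would show as roughly 2.0 TiB (2.2 TB) or less.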
...

So what's the problem here, guys?
To me it feels like some timeout issue, or something not re-initializing? That is the only thing I can say in general; I am not super experienced here.
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hey mirki,

According to your post, your disk shell boots up too slowly. It is not ready to present itself to FreeNAS by the time FreeNAS needs it and is ready to read and access your drives. I do not know if FreeNAS can delay some parts of its own boot process to wait for your too-slow disk shell... Or maybe you can boot your disk shell first and wait long enough before booting up FreeNAS?

If indeed you are in a race condition, know that FreeNAS is winning that race very clearly. So you need to slow FreeNAS down, speed up the disk shell, delay FreeNAS, or anything like that, so that FreeNAS does not ask for the disks before the shell is ready to present them.
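
One way to test the race-condition idea is to poll until the disks show up instead of checking once. The sketch below is a generic wait-with-timeout loop; `wait_for_disks` and its parameters are hypothetical names, and the probe function that counts disks is supplied by the caller. If the disks never appear no matter how long the loop waits, a simple startup race becomes less likely.

```python
import time
from typing import Callable


def wait_for_disks(
    count_disks: Callable[[], int],
    expected: int,
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll count_disks() until it reports at least `expected` disks.

    Returns True as soon as enough disks are visible, or False once
    timeout_s seconds have passed without that happening.
    """
    deadline = clock() + timeout_s
    while True:
        if count_disks() >= expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example with a fake probe that "finds" the disks on the third poll.
readings = iter([0, 0, 16])
found = wait_for_disks(lambda: next(readings), expected=16,
                       sleep=lambda seconds: None)
print("disks appeared:", found)
```

On the real system the probe would count the disks the OS can see, and the timeout would be set well beyond the time the shell needs to come up.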

Good luck with that one,
 

mirkywoody

thanks :).
but but... hmm... the disk shell is always powered on. It has been powered on for a long time before FreeNAS starts in all my testing.
But I am not sure how it all works: does FreeNAS or the controller send a signal for it to power on in another sense, to initialize with it? Is that what you mean?

I could look into delaying the boot of FreeNAS just to try it; someone must have done that before.
Also, in the SAS controller utility (the one accessed at boot), there are various timeout and delay options.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
mirkywoody said: installed on hardware raid1, onboard controller.
This is not a good thing. No hardware RAID for FreeNAS if you want support. (using the onboard controller is fine, just not in Hardware RAID mode)

It looks like your disk shell is something a little special and perhaps not suited to FreeNAS.

Does that shell behave normally with any other OS?

You may want to do some tinkering with the delay options in the SAS controller utility as it will be the part most likely to deliver success.
 

Fredda

Guru
Joined
Jul 9, 2019
Messages
608
What happens if you enter the BIOS of the HBA card upon reboot? Usually RAID or HBA cards have kind of a "rescan drives" function.
Is there a way you can make them reappear via the HBA-BIOS?
 

mirkywoody

I did know that hardware RAID is a no-go with FreeNAS :D, but did it anyway, and it worked... but yes, I changed it to non-hardware RAID before I left for home. There was also a weird error message on the boot drive at times, oddly after it adds the disk shell drives. I don't see how it could be causing trouble though; everything seemed to work.

Maybe it is a special snowflake disk shell. It is a Symantec Norton branded one.
Will try another OS once I have exhausted some more ideas.
I did try messing around with the settings in the controller utility, seemingly without changing anything so far. Will post my options there later.

Fredda said: What happens if you enter the BIOS of the HBA card upon reboot? Usually RAID or HBA cards have kind of a "rescan drives" function. Is there a way you can make them reappear via the HBA-BIOS?

The drives have always appeared in the HBA BIOS/utility, no problem there it seems. I look there before FreeNAS starts, of course, and they are there, but not in FreeNAS until I re-plug the cable to the disk shell.
 

sretalla

mirkywoody said: I did know that hardware RAID is a no-go with FreeNAS :D, but did it anyway, and it worked... but yes, I changed it to non-hardware RAID before I left for home. There was also a weird error message on the boot drive at times, oddly after it adds the disk shell drives. I don't see how it could be causing trouble though; everything seemed to work.
OK, well, you were lucky not to have problems and I'm glad to see you're out of harm's way now.

I wonder if we could be looking at something more like firmware on the HBA or in the backplane of the shell... perhaps see if you can find something in that direction to work with.
 

mirkywoody

Here are some logs finally, taken from /var/log/messages. Maybe there is a better place to look?
I put them on pastebin, since I reached the maximum post length.


Freenas started, no disks found.
https://pastebin.com/7snMdC9Y

Unplugged/plugged cable back to diskshell, disks were found.
https://pastebin.com/dcV62y9x

Pool made, all 16 in z3.
https://pastebin.com/A0quDugr

System rebooted, no disks found again.
https://pastebin.com/K8iNPzDr

Re-connected the cable to the disk shell again. Pool exported/disconnected and imported. (Should it have brought the pool back up automatically?)
https://pastebin.com/BGk9QKsr
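
To compare logs like these, it can help to strip them down to the storage-related lines first. The sketch below keeps only lines that mention an `mptN`, `daN` or `sesN` device and lists which devices appear in one log but not in another. The device-name pattern is an assumption about what the kernel messages contain, and the helper names are hypothetical.

```python
import re
from typing import List

# Assumption: storage messages name devices such as mpt0, da3 or ses0.
STORAGE_DEVICE = re.compile(r"\b(?:mpt|da|ses)\d+\b")


def storage_lines(log_text: str) -> List[str]:
    """Return only the log lines that mention a storage device."""
    return [line for line in log_text.splitlines()
            if STORAGE_DEVICE.search(line)]


def devices_seen(log_text: str) -> List[str]:
    """Return the sorted set of storage device names mentioned in a log."""
    return sorted(set(STORAGE_DEVICE.findall(log_text)))


def new_devices(before: str, after: str) -> List[str]:
    """Return devices mentioned in `after` that never appear in `before`."""
    return sorted(set(devices_seen(after)) - set(devices_seen(before)))


def read_log(path: str = "/var/log/messages") -> str:
    """Read a log file, replacing any bytes that are not valid UTF-8."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

Running `new_devices()` on the log from a boot with no disks and the log taken after re-plugging the cable would list exactly the devices that only the re-plug brings in.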


LSI Config utility - settings possibilities..

Boot support - Enabled BIOS & OS // Enabled BIOS // Enabled OS // None
and....

[Attached screenshots of the LSI config utility: 2019-11-22 11_08_44-.jpg, 2019-11-22 11_09_00-.jpg, 2019-11-22 11_09_45-.jpg, 2019-11-22 11_07_17-.jpg]

Disks are always found here.

[Attached screenshot: 2019-11-22 11_08_15-.jpg]
 

subhuman

Contributor
Joined
Nov 21, 2019
Messages
121
First, don't change a thing based on what I'm gonna say. I have zero experience with this HBA, which means I could be saying something totally wrong that could cost you everything. Maybe someone who is familiar with this can chime in however.
Two things stick out. The first probably doesn't matter, but max int13 devices=24?
The manual https://www.supermicro.com/manuals/other/LSI_HostRAID_2308.pdf
says that it does accept a value of 0. int13 means bootable, and you say you boot from drives connected to the motherboard, not from these disks. There's no need for any of them, let alone more than actually exist, to be bootable.
But the spin-up delays, those may be problematic. It looks like a non-logical combination: "Every two seconds, spin up zero more drives."
I haven't seen this with SAS, but I did years ago with SCSI. A drive set to delay spin-up, if the controller wasn't set to issue that command, wouldn't spin up until the first time you tried to access it, which of course took many seconds, which of course meant it timed out. Sounds a lot like your current situation.
mirkywoody said: This is part of my own experiment/learning at a school.
Keep working on it! :)
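
The spin-up point can be put into numbers. This is a simplified model, assumed here purely for illustration: the controller starts a fixed number of drives, waits the configured delay, then starts the next group. How the LSI BIOS really treats a value of zero is not known from this thread; it may well mean "use a default".

```python
import math
from typing import Optional


def stagger_time(drives: int, per_step: int, delay_s: float) -> Optional[float]:
    """Seconds until the last group of drives has been told to spin up.

    Returns None when per_step is zero or negative, because in this
    simplified model no drive would ever be started.
    """
    if per_step <= 0:
        return None
    groups = math.ceil(drives / per_step)
    # The first group starts at t=0, each later group one delay after the last.
    return (groups - 1) * delay_s


print(stagger_time(16, 2, 2))   # 14: eight groups of two, 2 s apart
print(stagger_time(16, 16, 2))  # 0: everything starts at once
print(stagger_time(16, 0, 2))   # None: "spin up zero more drives"
```

Under this model a setting of zero drives per step never finishes, which is the non-logical combination pointed out above.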
 

mirkywoody

There is absolutely nothing to be lost in this setup :) no critical data or anything, so no worries. Of course I don't want it to catch fire though, and I do want it to work :).

The settings in my screenshots are just from playing around. I had hoped to reset all values to default, but could only find how to do that for part of them.
I was thinking the spin-up delay was about avoiding a big power spike when you first plug in the disk shell, but that wouldn't make sense, since the shell can also be powered without the controller.
I will play around with it more today.
 