Drive not Recognized

Status
Not open for further replies.
Joined
Dec 31, 2012
Messages
8
Hello everyone. I have a FreeNAS box with the following configuration:
  • FreeNAS-9.2.1.7-RELEASE-x64 (fdbe9a0)
  • SuperMicro X9DRi-LN4F+ Motherboard
  • Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz
  • 32GB RAM
  • Kingston 8GB Thumbdrive for booting
  • LSI SAS 9300-8i HBA (Flashed to IT Firmware)
  • 16 - 3TB WD Green drives
  • 2 - RAIDZ2 pools striped to form the main pool.
I built this box about 2 years ago, originally with an Adaptec RAID controller set to JBOD mode. (Due to company restrictions, I had to use a RAID card rather than an HBA.) All was well and good until drive 11 failed. No problem. We RMA'd the drive and popped the new one in. The replacement procedure went fine, and I thought everything was back to normal. About a month later, the drive dropped out of the array again. So we RMA'd it again, and this time the OS wouldn't recognize the replacement. The drive showed up in the BIOS of the Adaptec controller, but not in the OS. `camcontrol devlist` only showed 15 drives, the controller, and the thumbdrive.
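For anyone who wants to run the same check without counting lines by eye, here is a minimal Python sketch that tallies the `da` devices in saved `camcontrol devlist` output. The sample listing is made up for illustration (model strings, bus and target numbers are not from this system), and the helper names `list_da_devices` and `drives_short` are hypothetical. Note that a USB boot stick also shows up as a `da` device, so it is included in the expected total.

```python
import re

# Illustrative sample only: laid out the way `camcontrol devlist` prints
# devices on FreeBSD, but the models and numbers are invented.
SAMPLE = """\
<ATA WDC WD30EZRX-00D 0A80>        at scbus0 target 8 lun 0 (pass0,da0)
<ATA WDC WD30EZRX-00D 0A80>        at scbus0 target 9 lun 0 (pass1,da1)
<ATA WDC WD30EZRX-00D 0A80>        at scbus0 target 11 lun 0 (pass2,da2)
<LSI SAS2X36 0e0b>                 at scbus0 target 24 lun 0 (ses0,pass3)
<Kingston DataTraveler 2.0 PMAP>   at scbus7 target 0 lun 0 (pass4,da3)
"""

def list_da_devices(devlist_text):
    """Return the names of all daN peripherals found in the listing."""
    names = []
    for line in devlist_text.splitlines():
        # The peripheral names sit in the trailing parentheses, e.g. (pass0,da0).
        match = re.search(r"\(([^)]*)\)\s*$", line)
        if not match:
            continue
        for name in match.group(1).split(","):
            name = name.strip()
            if re.fullmatch(r"da\d+", name):
                names.append(name)
    return names

def drives_short(devlist_text, expected_total):
    """How many da devices are missing compared with what should be attached."""
    return expected_total - len(list_da_devices(devlist_text))

if __name__ == "__main__":
    # Pretend four data drives plus the boot stick should be visible.
    print(list_da_devices(SAMPLE))
    print("missing:", drives_short(SAMPLE, expected_total=5))
```

On the real box you would feed it the saved output of `camcontrol devlist` and pass 17 as the expected total (16 data drives plus the thumbdrive). It only counts devices; it cannot tell you which physical bay is the one that went missing.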

I thought that was really odd, so I tried replacing the drive with a new 3TB Seagate drive we had at the office here. No luck. Again, you could see the drive in the controller BIOS, but not in the OS. Using this as leverage, I finally was allowed to buy the LSI HBA. Put that in the server, booted up, and boom, drive 11 was there again. YAY! Wait. Now drive 0 is missing. I went ahead and did a zpool replace on drive 11, and pool 1 is now online instead of degraded, but pool 0 is degraded due to disk 0 (da0) missing. Same exact behavior. New drive, no go. Different brand drive? No go.

In desperation, I flashed the IT firmware onto the new HBA, without any change. I can see drive 0 in the HBA BIOS, but not in the OS. Also interesting: after I installed the FreeBSD `sas3ircu` LSI command line tool and issued `sas3ircu 0 DISPLAY`, I cannot see the drive. Slot 0 is filled with the controller itself.

I have posted the output of camcontrol and the sas3ircu utility here: https://gist.github.com/russianryebread/b08238898e449a28e03a

I am really at a loss where to go from here, and am a little concerned, since there is another disk that is showing some SMART errors in pool0. I know I can lose 2 disks in each pool, but I'd rather not tempt fate. The data is backed up, but since the backup is remote, it isn't easily or quickly restorable. So, on that same note, reformatting everything and starting from scratch isn't really a great option either.

Does anyone here have any idea if I am missing something stupidly simple? I replaced the SAS cables when I put in the new controller, since the HBA uses the newer MiniSAS connectors, so I don't think it is a cabling issue. I suppose it could be a backplane / SAS Expander issue, but why is the missing drive "moving"?

Any help would be GREATLY appreciated. If you need more info, just ask.

Thanks!

P.S. Admins, it sure would be nice to have a way to do inline code objects for accenting bash commands and such. The block-level `code` element breaks up the flow for anything other than large pastes. Just a thought.

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
Having the same problem with a SuperMicro X9DRi-LN4F+ motherboard and an LSI 9300-8i flashed to the latest IT firmware.
It's not a FreeNAS problem. I'm having the same problem in Windows Server 2012 R2 with the Supermicro 36-bay enclosure.
The 24 drives in front seem fine. The 12 in back all work except slot 0.

From what I can tell it has something to do with an "Enclosure Services Device" taking enclosure 2 / slot 0.
Slot 0 on the front backplane works because it sees that as enclosure 3.
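To see at a glance what is sitting in each enclosure/slot pair, something like the following Python sketch could be run over saved `sas3ircu 0 DISPLAY` output. It assumes the per-device layout I have seen from the LSI `ircu` tools ("Device is a ..." followed by "Enclosure #" and "Slot #" lines); the sample text is invented, and `slot_map` and `non_disk_slots` are hypothetical helper names, so adjust the patterns if your output differs.

```python
import re

# Invented sample in the assumed DISPLAY layout; not taken from a real system.
SAMPLE = """\
Device is a Enclosure services device
  Enclosure #                             : 2
  Slot #                                  : 0
Device is a Hard disk
  Enclosure #                             : 2
  Slot #                                  : 1
Device is a Hard disk
  Enclosure #                             : 3
  Slot #                                  : 0
"""

def slot_map(display_text):
    """Map (enclosure, slot) -> device type as reported in the listing."""
    result = {}
    kind = None
    enclosure = None
    for raw in display_text.splitlines():
        line = raw.strip()
        m = re.match(r"Device is an? (.+)", line)
        if m:
            # Start of a new device record.
            kind, enclosure = m.group(1).strip(), None
            continue
        m = re.match(r"Enclosure #\s*:\s*(\d+)", line)
        if m and kind is not None:
            enclosure = int(m.group(1))
            continue
        m = re.match(r"Slot #\s*:\s*(\d+)", line)
        if m and kind is not None and enclosure is not None:
            result[(enclosure, int(m.group(1)))] = kind
            kind = None  # record complete; ignore lines until the next device
    return result

def non_disk_slots(mapping):
    """Slots occupied by something other than a hard disk."""
    return sorted(key for key, kind in mapping.items() if kind != "Hard disk")

if __name__ == "__main__":
    for key in non_disk_slots(slot_map(SAMPLE)):
        print("enclosure %d slot %d is taken by a non-disk device" % key)
```

If an enclosure services device shows up on the same enclosure/slot where a drive physically sits, that matches the collision described above.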

If you can help me figure out how to start the drive count on a backplane at 1 instead of 0, or how to move the enclosure services device to a slot number that's out of the way, we can both win and I don't get FIRED! YAY!!!!

EDIT: forgot to mention, none of the following matter: whether a SAS3-SAS2 converter cable is used, which port on the LSI card is used, the drive brand, or the drive capacity. Your slot 11 is most likely a coincidence and is leading you off track. I also booted this machine into FreeNAS and the exact same slot was missing.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, I see several problems...

1. Support for the 93xx series is dubious at best. The driver is in there, but it's more of an experimental driver and the 93xx is not recommended for anyone, especially in production at the present time. It wasn't a mistake that the card isn't listed as a favorite in our hardware recommendations. Someday it probably will be recommended. But not right now.
2. Your output shows the firmware is v6. Not sure what driver version FreeNAS is using but I'd recommend you find out and make them match. I can't provide any assistance with how to find out because I'm one of those guys that is still avoiding the 9300 series until better support is confirmed.

If I were you I'd pull the 9300 and go with an M1015 or equivalent, reflashed to P16 IT firmware. They are less than $100 on ebay and are amazing cards.

I hate to be the guy that says "I told ya so", especially when things aren't looking so great. But to be honest, the second you were forced to go with a RAID controller I would have instantly avoided ZFS like the plague. Mixing the two seems to be even worse than choosing one and sticking with it to the end. The fact that your higher-ups couldn't be bothered to actually understand the tech before handing down requirements should have been your cue to 'gtfo'. I'm glad I don't have your job and if I did I would probably be looking elsewhere. When (not if) things go bad you'll be the guy getting hung. That's not fun and it's better to leave *before* you are fired for "losing the company's critical data".

You are probably the 5th person in the last 2 weeks to have major problems with Adaptec. They are basically the worst possible choice because they give the illusion that they work, so you depend on them and then they stop working when you need them. I'd prefer that they just didn't work at all. Then people couldn't use them even if they wanted to. :P

Other than that, all I can say is "good luck". :(
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
O.P. I have something that might help you.
After the box sat cold all night, whatever flash stores the drive mapping info for the HBA seemed to reset itself.
The "Enclosure Services Device" jumped to Enclosure 3, Slot 24, which made Enclosure 2, Slot 0 available for the drive.

LSI tech says it may be possible to hurry this process up using the command
sas3flash.exe -o -e 3
-o = advanced mode
-e = erase
3 = region 3

reboot

Link to PDF on sasflash usage
http://www.lsi.com/sep/Documents/oracle/files/SAS2_Flash_Utility_Software_Ref_Guide.pdf
The correct version of sas3flash can be found with the firmware updating tools in the 9300-8i area of LSI's website.

Please note that this guide refers to the older SAS2 flash utility, and that I have no idea what region 3 means, so be careful.
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
More info about SAS mapping and persistence to drive you crazy. Apparently the LSI HBA can map by physical location or SAS address.
I'm trying to attach their doc here.
In the end, your problem may not be solved permanently without contacting LSI support for a custom "burned firmware" that will add/remove the kind of persistence FreeNAS will want to see.

Good luck with that though. When I called them about this last night, the guy on the phone said, "look, it's your cables, or your backplane, or your HBA needs to be RMA'd. Pick one of those things, otherwise I don't know how to help you" Gotta love it when support is wrong, frustrated, and rude about it :)
 

Attachments

  • S11209_v1.0_SAS-2EnclosureMapping.pdf
    187.7 KB
Joined
Dec 31, 2012
Messages
8
Thanks for all your help, Robert. I have been in contact with LSI support, and at first they told me that there is a known, outstanding issue with the v6 firmware. The tech's solution was to reflash it to v4. I did so, with no apparent difference, and then in rereading the case notes, I noticed that the tech misunderstood me to have said the drive was missing at POST. My controller sees all the drives at POST and in the BIOS; I just have no access to the missing drive after the OS boots.

I will try and ask them what they can do to map my devices statically...

Thanks again, and I will try to post back if I find a solution!
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
ugh, was he called Joseph? I've been dealing with one of their best guys in channel support that we the proletariat don't normally get access to.

You may not want the side effects of slot-based mapping. Remember, if you have to move your drives to a new enclosure in an emergency, FreeNAS may refuse to believe that they are the same drives.
If you're going for hardware/enclosure agnostic, or want to be able to pull drives out when the server is cold and rearrange them, I don't recommend asking for a persistent slot-based firmware.
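To make the trade-off concrete, here is a toy Python model of the two mapping styles. It is only an illustration of the idea, not how the LSI firmware actually works: the `Drive` class, both key functions, and the SAS address are made up for the example.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Drive:
    sas_address: str  # made-up address, for illustration only
    enclosure: int
    slot: int

def slot_based_key(drive):
    """Identity under slot-based mapping: where the drive physically sits."""
    return (drive.enclosure, drive.slot)

def address_based_key(drive):
    """Identity under address-based mapping: which drive it is."""
    return drive.sas_address

# The same physical drive before and after an emergency move to another chassis.
before = Drive(sas_address="5000c50000000001", enclosure=2, slot=5)
after = replace(before, enclosure=4, slot=0)

slot_identity_kept = slot_based_key(before) == slot_based_key(after)
address_identity_kept = address_based_key(before) == address_based_key(after)

if __name__ == "__main__":
    print("slot-based mapping sees the same device:", slot_identity_kept)
    print("address-based mapping sees the same device:", address_identity_kept)
```

In this toy model the slot-based identity changes as soon as the drive moves, while the address-based identity follows the drive, which is the side effect being warned about here.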

LSI knows there is a defect in the V6 firmware and they need to fix it. I've tried their hardware RAID cards and the older SAS2-based HBA 9207-8i, and every other card sees and reports slot 0 just fine.

Your quickest surest fix will be to locate another SAS HBA. Sorry that's not exactly "free" advice, but they really screwed us both on this one.
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
Oh wait, there's more. Sorry, almost forgot. Try rewiring your backplanes so that one backplane's input runs to the expander port on the other backplane. Then run only 1 cable to the 9300-8i. See if that makes any improvement.
That was the only way for me to get even the 9207-8i up and running correctly; otherwise it would hang on boot, or refuse to see one of the backplanes. It did not help with my V6 9300, but I did not try with the V4 firmware. It seems like there is some huge issue with the way the backplanes announce themselves to the HBA, or the way the HBA is deciding to number the backplanes.
 

Robert Marley

Dabbler
Joined
Sep 11, 2014
Messages
29
LSI was bought out by a company called Avago. At Joseph's level there are currently probably 3-4 people.
And I think one of them is in Berlin for the night shift, so actually the chances we would get the same tech are probably about 50:50.
 

souporman

Explorer
Joined
Feb 3, 2015
Messages
57
In desperation, I flashed the IT firmware onto the new HBA, without any change. I can see drive0 in the HBA BIOS, but not in the OS. Also interesting is after I installed the FreeBSD sas3ircu LSI command line tool, and issued sas3ircu 0 DISPLAY, I cannot see the drive. Slot 0 is filled with the controller itself.

I have posted the output of camcontrol and the sas3ircu utility here: https://gist.github.com/russianryebread/b08238898e449a28e03a

Sorry for the necro... Did you do anything special to get sas3ircu to recognize your SAS3008 card? You're the only person I've encountered on these boards who is able to use that tool in FreeNAS and have it recognize your controller.
 