Supermicro SC846 and LSI 9211-8i

Status
Not open for further replies.

BDMcGrew

Dabbler
Joined
Sep 22, 2015
Messages
49
Good evening... I'm facing an interestingly characterized set of problems at the moment. I realize this isn't specific to FreeNAS per se, but hopefully someone can shed a bit of light on my misfortune before I just lose my mind! Over in the thread "Confused about that LSI Card? Join the crowd ..." located here, jgreco specifically recommends the Supermicro SC846 with IBM M1015 (LSI 9211-8i) controllers. Needing to move a hardware platform, I bought 3 of these off of eBay from a reputable seller that's moved hundreds of them, as well as 3 LSI Logic 9211-8i cards from another reputable US seller. The Supermicro boxes came with some blah blah Adaptec card which I had no intention of using, and all arrived on Monday.

I tore the first box apart and dropped in an i340-T4 and the LSI cards. Booted FreeDOS from a flash drive and cross-flashed the LSI to IT mode and all was good; FreeNAS booted and installed just fine. While watching /var/log/messages I started installing hard drives one at a time, making sure each was seen, and all was good until I hit a bad slot on the backplane. Fast forward and machine #1 appears to have 2 bad slots on the backplane. WTF, can we say untested? Nope - not it!
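For anyone repeating the one-drive-at-a-time check, here is a minimal Python sketch of tallying which disks the kernel has reported so far. It assumes FreeBSD-style attach lines of the form "da0 at mps0 bus 0 scbus0 target 8 lun 0". The helper name attached_disks and the sample lines are made up for illustration and are not taken from these machines' logs.

```python
import re

# Matches FreeBSD CAM attach messages such as
# "da3 at mps0 bus 0 scbus0 target 11 lun 0" (format assumed, see above).
ATTACH_RE = re.compile(r"\b(da\d+) at ([a-z]+\d+) bus \d+")


def attached_disks(lines):
    """Return {disk_name: controller_name} for every attach line seen."""
    found = {}
    for line in lines:
        match = ATTACH_RE.search(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found


if __name__ == "__main__":
    # Illustrative lines only; on a real box, iterate over /var/log/messages.
    sample = [
        "Feb 23 23:30:01 freenas kernel: da0 at mps0 bus 0 scbus0 target 8 lun 0",
        "Feb 23 23:31:40 freenas kernel: da1 at mps0 bus 0 scbus0 target 9 lun 0",
    ]
    for disk, controller in sorted(attached_disks(sample).items()):
        print(disk, "on", controller)
```

Running it again after each drive goes in and comparing the result with the previous run shows whether the newly populated bay produced an attach message at all.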

Lather, rinse, repeat for machine #2 and find 3 bad slots on the backplane. I contacted the vendor and 2 replacement backplanes are being sent as I type but I was assured that out of the hundreds of machines he'd sold there had only been a "couple" bad backplanes. Odd.

I decided to try things a bit different with machine #3. Boot up stock out of the box to check integrity and it looks good. Add hard drives 6 at a time and check for tasty goodness until all 24 drives are installed and seen by the Adaptec controller. Boot FreeDOS and I see 24 hard drives. Boot up a sandbox copy of FreeNAS and yup, I see 24 hard drives. Guess the backplane is good.

Shut down, drop in LSI card, with cables unhooked, cross-flash to IT mode and all looks good. Connect hard drives and try to boot and bam, no good.

1) The controller sat at the Initializing stage for about an hour (ok, I didn't time it).
2) When it did get past the Initializing phase and showed connected devices, it only showed itself.
3) Booted FreeNAS and it hung at "run_interrupt_driven_hooks: still waiting after 300 seconds for mps_startup" and I'm locked hard.

(Machine #1 is down but on Machine #2 and 3):

4) Pull all the drives out, boot FreeNAS and add drives one at a time and all the machines exhibit 2 or 3 bad bays on the backplane. Populate those bays and I'm dead on reboot. Leave them empty and life is good.

5) Swap the LSI controller out and put the Adaptec back in and all 24 bays/drives are detected by the controller and by FreeNAS.

I don't get it.

I find it hard to believe I have 3 bad backplanes but, I suppose, it's possible???
Could it be 3 bad LSI controller cards??? Eh, maybe? I can't seem to find any diagnostics for the cards.

I doubt it's the hard drives or carriers, I swapped them all around and I know the disks are good - they are new but were all previously tested on known good working hardware.

I'm going on 36 hours of bench time and have nothing to show for it save for high blood pressure and frustrations. I thought I was doing the right thing buying recommended hardware (which happened to suit my needs perfectly) and I read everything I could find on the backplane, LSI cross-flashing, etc. Did I maybe miss something? If so, what am I missing?

Do I dare even attempt to bring up a box with the Adaptec cards? I've read so many bad things about them that I'm reluctant to even ask such a question.

I'm just at a total loss right now and have no idea what to do next or where to go next...

I'm totally open to ideas if anyone has any and, like I said, I know it's not specific to FreeNAS but kind of specific to hardware preferred by FreeNAS, no?

Thanks!

-b
 

BDMcGrew

Dabbler
Joined
Sep 22, 2015
Messages
49
So a quick update, with a couple of screenshots attached. While I was writing this post, machine #2, which I thought was hung, finally booted. There are loads of messages on the console about disk errors and the GUI only sees 7 of the 24 drives in the system. See the attached pics and remember, if I put the Adaptec card back in it comes up right away and sees all 24 drives.



Screen Shot 2016-02-23 at 11.33.13 PM.png
Screen Shot 2016-02-23 at 11.33.04 PM.png
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Which model SC846? I'm wondering if you might have the "older" backplane.
 

BDMcGrew

Dabbler
Joined
Sep 22, 2015
Messages
49
Yeah, I'm wondering the same myself. I bought this model:

SUPERMICRO 846E1-R900B X8DTE-F 2x E5530 QUAD CORE 48GB MEM 2x 30GB SSD 24x TRAYS


And I'm pretty sure it is the "older" model. I guess I could've dug a lot deeper before I bought it perhaps, but oh well. After hours of reading here, yeah, the older models don't support >2TB drives and have questionable functionality with the 9211-8i cards, etc. I'm reading where many others have had similar issues with OpenSolaris/ZFS with this configuration. I'll contact the vendor in the morning.

I just confirmed it is the older SAS-846EL1 model, not the newer EL2 model.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874

BDMcGrew

Dabbler
Joined
Sep 22, 2015
Messages
49
So... an update and hopefully some more user input.

First I'll say that I'm wondering if my problems might be of my own making because I'm using SATA drives and not SAS drives???

1) There are definitely bad slots on the backplane. Through the time consuming process of elimination (with both the LSI card and Adaptec cards) I've weeded out the bad slots. If I put a drive in one of the bad slots it takes the whole bus down on cold boot, everything times out and other bad things happen. As long as those 'bad' slots are clear, everything is good and both the LSI and the Adaptec controllers see all the installed drives and FreeNAS-9.3-CURRENT boots with no errors. Could this be a SATA vs. SAS issue? The vendor still maintains surprise at bad slots across 3 separate backplanes --- one of them has 5 bad, one has 4 bad and one has 3 bad. IDK!?!?!?

2) (This is on all 3 machines, not unique to just one). With the Adaptec card installed the system boots from my trusty SanDisk 16GB USB Flash drives with no problems. Next in line please. With the LSI card installed it will try to boot from the USB drive but comes up missing O/S, press any key. I don't have an any key. Repeatably, I can hit the BIOS F11 to get the boot menu _and_ then CTRL-C to get into the LSI BIOS and then (and only then) will I get the SuperMicro boot menu with USB or SAS BIOS and if I pick USB the system will boot just fine. Freaking strange?!?!?! I'm thinking either my flash drives are possessed or the LSI card is fighting with the flash over int13h or, I just have some configuration option wrong on all the boxes but I've no idea what it would be; I've looked at everything I can find?

Anyway, two of the boxes are up and one of those is running badblocks -ws /dev/da{0-20} and I'm still working on the third. I'm asking the vendor to find replacement backplanes and test them with an LSI card (or other SAS2008 BIOS???) before sending them to me but I have this deep underlying concern that I'll see the same kind of behavior.

At any rate, I'm running badblocks on one just to kind of stress test the disks in a low-impact sort of fashion. I'm more than a bit nervous and apprehensive about putting any of them in production until I figure out what exactly is causing the backplane issue and boot issue.
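The da{0-20} in the badblocks line above reads as shorthand for disks da0 through da20. Since badblocks takes a single device per invocation, the run presumably amounts to one command per disk. A dry-run sketch in Python, assuming that device range; the helper name badblocks_commands is hypothetical, and the script only prints the commands rather than running them, because -w overwrites every block on the target disk.

```python
def badblocks_commands(first=0, last=20, flags="-ws"):
    """Build one badblocks command (as an argument list) per disk da<first>..da<last>."""
    return [["badblocks", flags, f"/dev/da{n}"] for n in range(first, last + 1)]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it. The -w flag is a
    # destructive write test and -s shows progress, so check the device
    # names carefully before running any of these by hand.
    for cmd in badblocks_commands():
        print(" ".join(cmd))
```

Whether the disks are tested one after another or side by side is a separate choice; the sketch only spells out the per-device commands.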
 

adamgoldberg

Explorer
Joined
Dec 12, 2015
Messages
60
Oh, please keep us up-to-date on this. I'm planning a near-future SC846 upgrade...
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
BDMcGrew said: "First I'll say that I'm wondering if my problems might be of my own making because I'm using SATA drives and not SAS drives???"
Nope, SATA drives work perfectly fine with a SAS backplane. All my drives are SATA and connected to SAS backplanes. You just can't do it the other way around (SAS to SATA).

Maybe you can convince the vendor to swap everything for a different model (not E1) or even get a refund? Seems to me from what you have been through you may end up still having more issues down the line since the E1 is known to be problematic.

Just to note jgreco recommended the E26 (SC846BE26-R920B) so if you could get that from the vendor then things would definitely look more in your favor.
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
I'm also using 20-some SATA drives in my SAS enclosure without issue.

As for the E26 variant, that only is beneficial if you are using dual-port SAS disks. SATA is only a single port, so it's irrelevant.
 

BDMcGrew

Dabbler
Joined
Sep 22, 2015
Messages
49
In what I hope is my final update to this saga... the vendor replaced the SAS846E1 backplane with a SAS2846E1 backplane and all 24 slots are working now!
 

adamgoldberg

Explorer
Joined
Dec 12, 2015
Messages
60
BDMcGrew said: "In what I hope is my final update to this saga... the vendor replaced the SAS846E1 backplane with a SAS2846E1 backplane and all 24 slots are working now!"
Is the conclusion here that we should avoid the SAS-846-E1 backplane in favor of the SAS2-846-E1, or that there was something wrong with your particular backplane(s)?
 

BDMcGrew

Dabbler
Joined
Sep 22, 2015
Messages
49
Sadly, I cannot give a clear and specific answer to that question, but as I move forward I will in fact make sure all the boxes I get have the SAS2-846-E1 backplane.

What I can say as fact is that I bought 3 boxes that all came with the older SAS-846-E1 and all of them had various numbers of bad slots on the backplane that manifested themselves both with the included Adaptec controller and the LSI 9211-8i that I swapped in. I put the SAS2-846-E1 backplane in one server and all of the slots came up.

I will be replacing the backplanes in the other two servers shortly and will post the status here.
 