LSI MegaRAID 9460-16i with physical drives in JBOD mode

geronimo

Cadet
Joined
Jul 20, 2012
Messages
6
Hello Gents,

I am running FreeNAS with an LSI MegaRAID 9460-16i. I know you will say you should not run FreeNAS with a hardware RAID controller, but in my case all the drives are set to JBOD mode. When I boot the same controller in various flavors of Linux and in ESXi 6.7U2, all the drives are recognized correctly and have direct access.

In FreeNAS I can see the 9460-16i card is recognized, the proper driver is already compiled into the kernel, and I can see it loaded. The problem is that even though the driver is loaded, the card is recognized, and all the drives are configured in JBOD mode, FreeNAS sees no drives at all.
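For reference, these are the FreeBSD-side checks that narrow down where the chain breaks (standard FreeBSD commands; the grep patterns and device names are illustrative):

```shell
# Did the kernel probe the card and attach the mrsas driver?
dmesg | grep -i mrsas

# Is the driver present? (it may be compiled into the kernel rather than a module)
kldstat -v | grep -i mrsas

# What has CAM enumerated? JBOD drives should show up here as da* devices
camcontrol devlist

# Which driver, if any, claimed the PCI device? (an unclaimed card shows as none*@pci...)
pciconf -lv | grep -B 3 -i sas
```

If the card shows up in pciconf with a driver name attached but camcontrol lists nothing, the problem is between the driver and the firmware rather than at the PCI level.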

Any ideas would be highly appreciated.

As I mentioned, I tested with various flavors of Linux and the devices are visible with direct access. I also tested with ESXi 6.7U2, which sees all the drives as well.

This is the output of StorCLI on both ESXi and Linux:
1573044341520.png


StorCLI on FreeNAS fails to recognize the controller, although the OS has the driver loaded and the device is recognized:
1573044421414.png


Another thing that came to my attention using StorCLI on both ESXi and Linux: even though each drive shows as JBOD and is recognized by the OS, StorCLI reports that the controller does not support JBOD:
1573044629386.png


Also, according to the official product brief, all controllers in the 9400 series except the 9440-8i support JBOD mode:
1573044909910.png


P.S. Last but not least, I also upgraded to the latest firmware available on the Broadcom website. It didn't change anything.

Please, ideas!
 

geronimo

Cadet
Joined
Jul 20, 2012
Messages
6
Digging further shows that my controller is set to the JBOD personality, which presents all drives as JBOD. As of the most recent firmware versions, the controller also supports an HBA personality.

1573090198546.png


I decided to give it a try and switch the controller to HBA mode to see if that changes anything, so I deleted the current config, set the personality to HBA, and rebooted the system into Linux to verify the settings:

1573098879873.png


1573098891827.png
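The sequence above maps to roughly these StorCLI commands (command names per Broadcom's StorCLI documentation; exact syntax varies between StorCLI releases, so treat this as a sketch):

```shell
# Show the currently active personality (RAID / HBA / JBOD)
storcli64 /c0 show personality

# Drop the existing drive-group configuration on controller 0
storcli64 /c0 delete config

# Request the HBA personality; takes effect after a reboot
storcli64 /c0 set personality=HBA
```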


After a reboot the controller now reports JBOD is enabled, the drives are again in JBOD mode, and the personality reports RAID even though HBA was selected (my take is that when you select HBA it just switches back to RAID and enables the JBOD property).

1573099372418.png


Which still brings me back to the issue. The controller is visible in the PCI device list. The driver is properly loaded. StorCLI in FreeNAS does not recognize the controller. No drives are discovered. At the same time, both ESXi and Linux see the drives normally as JBOD. Trying to manually load the latest FreeBSD driver from the Broadcom website fails because the driver is already compiled into the kernel:

1573100550598.png
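For completeness: if mrsas were shipped as a module instead of compiled in, it would normally be enabled with /boot/loader.conf entries like the ones below. A statically compiled driver always wins, which is why the kldload above fails. The hw.mfi.mrsas_enable tunable exists in stock FreeBSD to steer MegaRAID cards to mrsas rather than the legacy mfi driver (a config sketch, not a fix for this case):

```shell
# /boot/loader.conf fragment (only meaningful when mrsas is a loadable module)
mrsas_load="YES"           # load mrsas.ko at boot
hw.mfi.mrsas_enable="1"    # have mrsas, not the legacy mfi driver, claim MegaRAID cards
```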


Here is the output of pciconf -lv

1573100778489.png


I did a lot of troubleshooting on this one and I don't see any particular reason why it wouldn't work, but maybe I am missing something?
 

geronimo

Cadet
Joined
Jul 20, 2012
Messages
6
My conclusion is that the mrsas driver compiled into the kernel is just outdated. This will require me to compile a new kernel with the updated driver to get this controller rolling. That's a bummer; these chips have been out for quite some time and things should be looking good by now.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,974
Is there IT firmware available for your card? If so, flash it with the IT firmware and try again.
 

geronimo

Cadet
Joined
Jul 20, 2012
Messages
6
These controllers ship with a single firmware. You can switch the personality of the controller to make it act as an HBA, as it supports HBA/JBOD/RAID modes. That's exactly what I did. The drives are presented as JBOD and have direct access. There is no way to flash the ROM from, for example, the 9400-16i, which is the HBA-only card. I believe the problem is caused by the outdated version of the mrsas driver compiled into the current kernel. The solution would be to recompile the kernel without this driver and then use the latest module driver for FreeBSD 11.2 provided by Broadcom. I don't know how long it would take to have this updated upstream.
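For anyone curious what that would involve, the rough shape on a plain FreeBSD 11.2 box is sketched below. This is illustrative only: FreeNAS is an appliance, rebuilding its kernel is unsupported, and the kernel config name here is made up.

```shell
# Illustrative only - not a supported FreeNAS procedure
cd /usr/src/sys/amd64/conf
cp GENERIC NO_MRSAS
# Drop the built-in driver from the new kernel config (BSD sed syntax)
sed -i '' '/^device[[:space:]]*mrsas/d' NO_MRSAS

cd /usr/src
make buildkernel KERNCONF=NO_MRSAS && make installkernel KERNCONF=NO_MRSAS

# Install Broadcom's mrsas.ko and load it at boot instead
cp mrsas.ko /boot/modules/
echo 'mrsas_load="YES"' >> /boot/loader.conf
```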
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
You can switch the personality of the controller in order to make it act as HBA as it supports HBA/JBOD/RAID modes.

No, you cannot switch the personality of the controller to make it act as a true HBA - not interested in arguing the point, it's just a fact. It might be an "HBA" but it isn't the HBA we need. It is a hardware RAID controller, and there is no option to omit the hardware RAID data path. Even if there were, there is still the issue of driver support. There are billions of driver-hours of FreeNAS on the LSI 6Gbps HBAs running 20.00.07.00 IT-mode firmware, and all the bugs have been shaken out. There's a lesser number of hours on the 12Gbps HBA products, but they're known to work with only a caveat or two. These things do have drivers for FreeBSD, but those are really oriented towards typical "run a program, write a datafile" style usage. ZFS can pound continuously on your controller for days doing a scrub or resilver. The driver can't be 99.999% reliable; it really needs to be as close to 100.000000% perfect as possible. The 92xx and 93xx RAID controllers are definitely known to exhibit issues under ZFS-class loads.

You really need the true HBA product and it needs to be in the software-RAID-bludgeoned-out IT mode. If "storcli" runs you have the wrong product. I am sorry that Avago has chosen to mislead people by trying to market this card as a multimode card. It's quite possible that at some point the driver support will actually be up to snuff, but ZFS is *incredibly* demanding. It will pummel your typical RAID controller CPU and cache. It will stress the driver to the breaking point with massive I/O operations.

It's not that this might not be able to work at some point some day. Ideally it *ought* to. However, you really do not want to be the guinea pig for a problematic new controller that no one else is using on FreeNAS at this time. It is hazardous to be the guy with the unusual controller. Lots of users here have learned that the hard way. :-(

Interesting tidbit, I actually arrived at this thread because I was doing some Google research on the 9460 for use with ESXi. I would totally buy one of these, just not for use with FreeNAS.
 

Scharbag

Guru
Joined
Feb 1, 2012
Messages
620
9211-8i plus SAS expander for the win!!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
9211-8i plus SAS expander for the win!!

Given the limited speed of HDDs, that's not horribly far from the truth. From a cabling perspective, I am guessing one of the 12Gbps 93xx HBA products would be the way to go, but I've mostly stayed away from the 12Gbps stuff because it doesn't make a lot of sense for FreeNAS with hard drives.

I've got some nice 12Gbps cards under ESXi that I got exclusively for the larger cache size -- mostly we stick to 9271CV's. I really do want to be able to do NVMe at scale, but the types of systems I run are generally "must try super hard not to fail" so RAID1 is usually mandated for VM datastores unless it's something like image build scratch space. It's amazing to see NVMe smash images around at 1-2GBytes/sec, and the price drops in the last year have made that really attractive.

I can see someone wanting to be able to support a bunch of NVMe flash in a FreeNAS system, and I'm not really sure what the ideal solution is there. The big issue with FreeNAS is always the driver reliability thing. I don't know what the state of affairs is for the 9400-8i in terms of FreeBSD driver support; I know there was a bit of a push to unify their device driver stack at one point, and I thought I heard it happened. I'm not sure I want to be the guinea pig on that one. Plus it isn't really clear how soon it'll actually be practical to use NVMe at scale on a non-prebuilt system. It looks like you really need a midplane with a PCIe switch on it to take real advantage of the potential. It seems likely to be harder than working with SAS once you get past a few devices. If you wanted to use RAID, though, I guess you need to go that route.
 
Joined
Dec 18, 2020
Messages
4
Because 9400-8i8e cards are cheap these days, I am just wondering if there is any news regarding this topic?

If there is no way to use these cards as a true HBA, is there a modern HBA with 8e and 8i ports, 12G SAS, and NVMe support?

If not, the only solution would be to buy two cards (8i + 8e)
or
one 16i card (9500-16i?) and use one of its ports with some kind of 8i-to-8e adapter.

Right?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Because 9400-8i8e cards are cheap these days, I am just wondering if there is any news regarding this topic?

It's a bit hard to know without laying hands on one. My read is that it is based on the SAS 3408 HBA controller, and if so, the mpr(4) manpage claims it is a supported controller under mpr, which is the 12Gbps HBA driver. It might be worth trawling around the Broadcom support website to see if firmware 16.00.12.00 is available for it, as this is what TrueNAS is going to expect to be running on the card. If you can do that, then I would generally expect SATA and SAS to work fine; no idea about the NVMe.
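If anyone wants to check what firmware a card is actually running, mpr(4) prints it in its probe line at boot. The sample line below is mocked up for illustration (I don't have this hardware in front of me); on a live system you would grep dmesg instead:

```shell
# Illustrative mpr(4) probe line; on real hardware: dmesg | grep -i 'mpr0: Firmware'
sample='mpr0: Firmware: 16.00.12.00, Driver: 23.00.00.00-fbsd'

# Pull out just the firmware version
fw=$(printf '%s\n' "$sample" | sed -n 's/.*Firmware: \([0-9.]*\),.*/\1/p')
echo "$fw"   # prints 16.00.12.00
```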

Also, just noting that we've strayed from the topic of the 9460 with JBOD mode, which is covered by the mrsas driver and not appropriate for use with TrueNAS. Please see

 
Joined
Dec 18, 2020
Messages
4
Yes, thanks, I totally understand the HBA vs RAID topic.

And I think the mistake is on my side. It's because there are two different types of 9400 cards.

There is a Broadcom 9400 MegaRAID series, which is described in this PDF (also shown in the OP's screenshots):

And there is a Broadcom 9400 Tri-Mode Storage HBA series, which is described in this PDF:

Both types of 9400 series cards seem to be the second-to-last generation, with PCIe 3.1 x8,
but only the Tri-Mode cards can act as an HBA.

The current generation is the 9500, which is also divided into HBA and MegaRAID series; it is the latest version, with PCIe 4.0 x8.

9500 HBAs:

9500 MegaRAID:
 
Joined
Dec 18, 2020
Messages
4
Edit:

The 9400 series is the third-newest generation today; the 9500 is the second, and the 9600 is the latest generation.

9600 HBAs and MegaRAID in one document:
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
jgreco, does "RAID controller in JBOD mode = bad, unpredictable" still hold true with TrueNAS SCALE (Debian/Linux)?
Specifically the 9361-8i.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,737
Does "RAID controller in JBOD mode = bad, unpredictable" still hold true with TrueNAS SCALE (Debian/Linux)?
Yes. Raw, unfiltered disk access is a requirement of ZFS, not of FreeBSD in particular.
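One practical litmus test of raw access (smartctl from smartmontools; the device names are illustrative): with a true HBA, SMART data comes straight off each disk, while MegaRAID-class controllers typically require a vendor passthrough device type, which is itself a hint that the controller sits in the data path:

```shell
# True HBA / raw access: SMART passes straight through
smartctl -a /dev/da0

# Behind a MegaRAID-class controller you usually need vendor passthrough
# (the number is the controller's internal device index)
smartctl -a -d megaraid,0 /dev/da0
```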
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
jgreco, does "RAID controller in JBOD mode = bad, unpredictable" still hold true with TrueNAS SCALE (Debian/Linux)?
Specifically the 9361-8i.

Yes, why would you think otherwise? The reasons for this are explained in


I truly would like to understand the thought process here, because it's clear to me that there are a bunch of people who think "oh it's just a FreeBSD problem." It's not. Proxmox, IIRC, is also Debian-derived and they have similar warnings, as does Nexenta, which is based on I forget what. If there's something that I'm saying that's adding confusion, please let me know; my goal in these things is to communicate clearly and unambiguously, making life easier for newcomers to the project.
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
I would have expected that a RAID controller/driver in JBOD mode would make the disk passthrough transparent to most of the kernel and applications, so applications and/or the kernel would see a raw disk.

Not having a detailed description of where in the sequence this breaks down in FreeNAS, one can hope that the part that breaks my assumption has changed over the course of many years.

Why would it make sense for JBOD mode to make the disks only almost appear as real disks? Obviously some limitation that could be addressed over time. Since lots of time has passed, I was hoping things had improved.

Vendors often try to address all use cases with one SKU, i.e. why JBOD mode exists.

An HBA and a RAID controller are probably both implemented in an FPGA during development, so the only difference is code. If volumes are small, it may actually ship as an FPGA. In wireless protocols, RAID is called forward error correcting codes, and in wireless these are often swapped for different levels of protection, down to zero. So it's easy to think of an HBA as a RAID controller with the FEC turned off.

No offense intended.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,398
I would have expected that a RAID controller/driver in JBOD mode would make the disk passthrough transparent to most of the kernel and applications, so applications and/or the kernel would see a raw disk.

Yes, that's a reasonable expectation. The problem is, there's no telling whether the RAID firmware properly implements this expectation in JBOD mode. Oftentimes the RAID controller injects its cache between the host system and the disks, because the quick-and-dirty way of implementing this is as a large RAID-0 stripe over all the disks instead of a true JBOD mode. The controller at that point is free to reorder blocks, wait until its cache is full before flushing to disk, or commit other sorts of silliness.

Now you're silently dependent upon the RAID controller doing the right thing, as ZFS has no knowledge of what the RAID controller is doing.

What we're telling you has been learned in the school of hard knocks and lost data. Feel free or not to take our advice, but it's advice that's been earned by being bit multiple times across multiple systems.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
I would have expected that a RAID controller/driver in JBOD mode would make the disk passthrough transparent to most of the kernel and applications.

If your expectation were actually true, then perhaps it wouldn't be a problem.

Not having a detailed description of where in the sequence this breaks down in FreeNAS, one can hope that the part that breaks my assumption has changed over the course of many years.

It's not a breakdown in FreeNAS. LSI did its shit right, and wrote Initiator-Target firmware for its HBA to make it a real HBA. These cards are essentially identical to LSI RAID controllers, except that the firmware onboard is different, and the host-side driver is different as well. This means that these cards have more than one host driver that can run them (for example, mfi the RAID driver vs mps the IT driver) and card firmware to go along with that. It makes them work very differently. The LSI RAID driver is written to assume certain things about how to handle failures, timeouts, and errors, being relatively aggressive in trying to maintain ongoing operations at almost any cost, hoping for disk redundancy to cover the gaps. RAID firmware generally includes caching (on cards with cache), possibly write reordering, and unknown handling of error conditions in response to read failures or dead disks. Meanwhile, proper IT firmware, offered by LSI, is basically using the controller CPU primarily as a proxy to send commands generated by the driver back and forth to the drives, in a manner that looks a lot more like classic SATA AHCI or SCSI. The IT firmware controller CPU does not "interfere" or "process" the data, and is merely engaged in a game of telephone, passing messages up and down the SAS lanes from the host driver.

Why would it make sense for JBOD mode to make the disks only almost appear as real disks?

You answer your own question:

Vendors often try to address all use cases with one SKU, i.e. why JBOD mode exists.

The thing is, in many cases, you do just add more disks to a system where the system is keeping independent filesystems on each drive, and classically, filesystems have an fsck or chkdsk utility when something goes wrong. So it's no big deal to have a JBOD array of EXT3 filesystems to hold data. If something goes wrong, you fsck the disk and move on with life. But ZFS isn't that way. Errors, once introduced into the pool, tend to hang around and fester for the life of the pool. And I'll note that even the name "JBOD" is idiotic; IT mode is true "JBOD", it is just a bunch of disks attached to a host. RAID-card "JBOD" generally includes having to configure the disks as "JBOD", and maybe having to put up with other RAID-ish affectations such as caching and quirky drivers; it is certainly not "just" disks.
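That split is visible right in the device name that claims the card. The pciconf lines below are mocked up for illustration (real IDs elided); on a live FreeBSD box you would just run pciconf -l:

```shell
# Mocked-up pciconf -l style output: the prefix before '@' is the attached driver
# (mfi = MegaRAID RAID path, class 0x0104; mps = 6Gbps IT/HBA path, class 0x0107)
sample='mfi0@pci0:3:0:0: class=0x010400 card=0x00000000 chip=0x00000000
mps0@pci0:4:0:0: class=0x010700 card=0x00000000 chip=0x00000000'

# Extract the driver instance names
printf '%s\n' "$sample" | cut -d@ -f1   # prints mfi0 then mps0
```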

No offense intended.

How can one take offense at what you said? It is okay not to have the information. It could be offensive if you then tried to explain to me how I am so wrong and how your favored most awesome RAID controller was a perfect gift from heaven and would never do anything wrong. Observationally, we know RAID controllers to be potentially problematic, and TrueNAS isn't the only ZFS project to hold that position. So if you were to argue the point, then that's offensive and a waste of my time. If instead you have learned some new stuff and I've reorganized some misconceptions you had, then we've both had an enjoyable discussion and the world is a slightly better place as a result. I'm up for that any time I have some free time. :smile:
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
I think I understand some of this:
- HBAs are RAID cards if running the RAID firmware; HBAs are IT-mode cards if running the IT firmware.
- There is no technical reason the JBOD firmware can't be well implemented.
- Historically, HBA cards did not really implement a truly transparent passthrough JBOD and misbehaved by leaking some RAID or cache functionality, causing issues with ZFS while appearing OK at first glance, i.e. the school of hard knocks.

I have worked with Broadcom wireless software. Their software is extremely professional. It feels improbable that they would let buggy production software persist for years like this, so I am missing something.
Is it possible that JBOD means something different than IT mode? In that case, their JBOD firmware could be bug-free; it is just not what ZFS needs.

Are there test suites available to validate new versions of the HBA IT firmware, or new cards running the existing HBA IT firmware?
Does iXsystems have these test suites?

Are there new versions of the HBA IT firmware, or is it a frozen thing?

Are there new cards that run the HBA IT firmware or are these old cards only?

Is HBA IT mode a mainstream and supported product, or are HBAs mainly used as RAID controllers, with IT mode done as an experiment?

I am not trying to debate, just trying to understand.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
- There is no technical reason the JBOD firmware can't be well implemented.

Sorry, this sounds like you're unable to come to terms with the fact that RAID and HBA are implemented somewhat differently. I'm not interested in speculating that "JBOD could be implemented more like IT mode". We have to work with what is actually available in the real world.

I have worked with Broadcom wireless software. Their software is extremely professional. It feels improbable that they would let buggy production software persist for years like this, so I am missing something.

Yeah, apparently you've missed the crapfest that is Broadcom Ethernet; your opinion of their products might be different if you had suffered it. Additionally, Broadcom didn't author the HBA IT-mode stuff; LSI did, ten or fifteen years ago. That's why we refer to these as "LSI HBAs". Different company.

Are there test suites available to validate new versions of the HBA IT firmware, or new cards running the existing HBA IT firmware?
Does iXsystems have these test suites?

You're welcome to ask Broadcom support. From external appearances, they do not appear to be releasing significant new versions and primarily focus on bugfixes. My suspicion is that they have some key members of the original firmware/driver team on retainer.

Are there new cards that run the HBA IT firmware or are these old cards only?

It's hard to know; we're not seeing new products roll out every few years anymore. My personal feeling is that we are in the long tail of SAS product evolution. You can see a recent discussion of this at


Broadcom seems more focused on MegaRAID and it seems likely that we will not be seeing new HBA products moving forward, but that's probably okay since there's a healthy supply of SAS9300 (12Gbps) and earlier 6Gbps products already out there.

Is HBA IT mode a mainstream and supported product, or are HBAs mainly used as RAID controllers, with IT mode done as an experiment?

HBA was definitely a mainstream product for LSI, but in the channel a lot of this was done by selling the 9240-8i retail package and then allowing it to be firmware-downgraded in the field from a low-end RAID controller to an HBA. It is one of the few products that allowed a host to act as a SAS target, among other things - something the RAID firmware cannot do.
 