LSI MegaRAID 9460-16i with physical drives in JBOD mode

john60

Explorer
Joined
Nov 22, 2021
Messages
85
I do not take issue with RAID SW being different from HBA IT mode SW.

I do not dispute that the JBOD mode of RAID software did a poor job and caused issues in the past, some of them many years ago.

I am somewhat surprised that there is no test suite, developed by iXsystems or this open-source community, to
validate the required functionality of this mode of operation of these cards that convert SATA/SAS to PCI. Without
a test suite and a failed test case, there is nothing for Broadcom to target fixes at and validate new SW for new cards.

While the hobby community could be happy with "these old cards work when reflashed with this old SW", I thought iXsystems was trying to sell to big enterprises. If the core technology of ZFS only works with this old HW using old SW, that can't help their image as a serious company.

This exchange started with me asking if things were still the same.
You responded, "why would you think otherwise?" and "I truly would like to understand the thought process here".

Well, I expected SW to evolve driven by failed test cases.

Can anyone offer specific things that can be checked, i.e., the beginnings of a test case?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I am somewhat surprised that there is not a test suite developed by IX system or this open-source community to
validate the required functionality of this mode of operation of these cards that convert SATA/SAS to PCI. Without
a test suite and a failed test case, there is nothing for Broadcom to target fixes and validate new SW for new cards.

You have a seriously weird idea of how this sort of thing works. What really happens is that you have two engineering teams, one at (let's say) Seagate and one at (let's say) LSI. Both write their software to a third party's specification, in this case the SAS spec, and do their best to implement it as they believe it should be. Then implementation meets reality, and some edge case fails to work correctly. Users report the problem, quite possibly to both manufacturers, and then engineering triages the issue and someone comes to a conclusion as to what the appropriate fix is.

There is no magic test suite that tests for "required functionality" and I can also guarantee you that there are numerous hacks in both device and controller firmware to work around quirks that each manufacturer has identified in the opposite kind of devices. Required functionality is that it actually works with existing devices and that it attempts to be standards compliant so that it hopefully works with future devices.

Without
a test suite and a failed test case, there is nothing for Broadcom to target fixes and validate new SW for new cards.

I can pretty much guarantee that LSI had a shelf full of hard drives from a large variety of manufacturers that also had various firmware versions, and that there was some testing that would go on. It is likely that this was later transferred to Avago and then again to Broadcom. However, since a lot of this is no longer being actively developed, it is quite possible that there are only one or two firmware developers from the original team being kept on retainer to address problems that may show up. As the edge cases were mostly sanded off years ago, testing today probably consists of checking to see if whatever bugfix is being attempted actually fixes the bug in question, which is most likely a weird interaction with the firmware on another device. Also, thanks to the collapse of the HDD market, there are fewer target firmware stacks out there, so there are fewer landmines to find.

New cards. *snort*. This isn't 2010.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Huh? This is one of those posts that truly stump me. It's just a quote with no original content. Is this meant to be sarcasm, cause I don't get it... someone help me!!!

Also, @jgreco I made an official logo for my uh... trademarked name... Are you proud of me?
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
The video "Forbidden Arts of ZFS | committing the greatest sin & getting away with it [Hardware RAID with ZFS]" is very detailed and suggests that my innocent initial inquiry is not only valid, but that this expert in ZFS says ZFS works with RAID 0 as long as you turn off the cache. So JBOD should work even more transparently. He even demonstrates moving his data from RAID 0 on a RAID card to an HBA IT mode card.

So to answer the question "I truly would like to understand the thought process here, because it's clear to me that there are a bunch of people who think ...": not only is it logical for me to expect that a card that goes from SATA to PCI and claims transparent passthrough would do what it claims, but now I also see others who claim to be experts demonstrating this capability.

I understand your warning about past issues with RAID cards, thank you for that. I am soliciting help from the ZFS developer to better understand what the gotchas are and what flaws may exist in the YouTube video with respect to using a RAID card with ZFS.
Also, I have a system with a 9361-8i arriving soon, that I can experiment with.

If there is something specific that does not work, I would be happy to try to raise a trouble ticket with Broadcom.
Without specifics, all I have is "it did not work in the past", "the SW today sucks as much as it did in the past and will never improve".
I'm a fan of Broadcom, so it's hard for me to accept this last statement.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
The video "Forbidden Arts of ZFS | committing the greatest sin & getting away with it [Hardware RAID with ZFS]" is very detailed and suggests that my innocent initial inquiry is not only valid, but that this expert in ZFS says ZFS works with RAID 0 as long as you turn off the cache. So JBOD should work even more transparently. He even demonstrates moving his data from RAID 0 on a RAID card to an HBA IT mode card.

Huh, learn something new every day. However, this is definitely not always guaranteed to work, and I suspect it's not viable long-term, as some of the comments below that video attest.

Your fanboyism for Broadcom is a dangerous trait, as it's a tech blind spot you've deliberately embraced. Running TrueNAS, let alone some other server operating system, should be an exercise in risk management, as data loss in the long run is inevitable. Being conservative helps postpone that day and shorten the recovery and restore.
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
@Samuel Tai Agreed, I should not be running anyone's data center. My whole career has been on the cutting edge of high-tech R&D. I like pushing boundaries. From this forum, I did learn about 2 local + Backblaze. So hopefully, my data is protected.

I have managed a lot of developments where forward error-correcting (FEC) code is a module that plugs in or is stubbed out. In all these cases, the FEC logic is small compared to the overall system. All my development instincts say the RAID (RAID is just a simple FEC) is not that complicated, and Broadcom can't be so incompetent that they can't bypass it. Yes, I am a fan of their work. So I will be careful.
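
For reference, the "RAID is just a simple FEC" view boils down to something like single-parity XOR over a stripe. The following is only a minimal sketch of that arithmetic; the function names `xor_blocks` and `rebuild_missing` are made up for illustration, and it says nothing about how controller firmware handles caching, write ordering, or concurrent I/O.

```python
from functools import reduce
from operator import xor
from typing import Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte (single-parity math)."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_missing(surviving: Sequence[bytes], parity: bytes) -> bytes:
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks([*surviving, parity])
```

With three data blocks and their parity, XOR-ing any two surviving data blocks with the parity block returns the third.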

So far, what I am learning from the developers is that ZFS is happy with a proper clean JBOD passthrough and even a disk RAID 0 (not so clean JBOD passthrough). The latter may pose issues for TrueNAS functionality. Still exploring.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
All my development instincts say the RAID (RAID is just a simple FEC) is not that complicated

Your instincts are incorrect in this case. Two words: Race conditions.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
ZFS says ZFS works with Raid 0

Of course ZFS works with RAID 0. How else do you think people hook up SATA AHCI disks? Single disk RAID 0. You can do all sorts of janky stupid stuff and it might even work, just as you might drive around for years without a seatbelt in your car without being killed. But if you put a 3Ware JBOD disk array on a ZFS system, and the controller fails, you MUST replace it with another 3Ware RAID controller, because it places a nifty little partition table on the disk that denotes the RAID configuration (even though it is "JBOD"), and then puts the virtual JBOD disk INSIDE that partition. This is just one example of why you really want to use raw disk.
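
To make the raw-disk point concrete, here is a rough sketch of where ZFS expects its four vdev labels (two at the front, two at the back, 256 KiB each) and where they end up on the physical platters if a controller tucks the "virtual" disk inside its own wrapper. The `wrapper_start` offset is purely hypothetical; the actual on-disk layout of any given controller is not something this sketch claims to know.

```python
from typing import List

LABEL_SIZE = 256 * 1024  # each ZFS vdev label occupies 256 KiB


def label_offsets(device_bytes: int) -> List[int]:
    """Byte offsets of labels L0..L3 on a device of the given size."""
    usable = device_bytes - device_bytes % LABEL_SIZE  # align down to label size
    return [0, LABEL_SIZE, usable - 2 * LABEL_SIZE, usable - LABEL_SIZE]


def raw_disk_offsets(virtual_bytes: int, wrapper_start: int) -> List[int]:
    """Where those labels physically land if the controller places the
    virtual disk `wrapper_start` bytes into the raw disk (hypothetical)."""
    return [wrapper_start + offset for offset in label_offsets(virtual_bytes)]
```

If `wrapper_start` is anything other than zero, a plain HBA that exposes the raw disk will not find the labels where ZFS looks for them unless something on the host understands the controller's wrapper.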

but that this expert in ZFS

Art of Server is not an "expert in ZFS" or "developer of ZFS" that I am aware of. Just sells lots of reflashed HBA's on eBay.

Any idjit can get ZFS up and running on a RAID card. That's not the mark of success, however. We judge success as demonstrated by billions of problem-free run-hours. IT mode HBA's do that. RAID controllers do not. This is a dumb debate. If you value your data, use the equipment known to work correctly in as many conditions as possible. This discussion has all the dreariness of a debate about ECC. Smart people use it.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
WHY DOES IT LOOK LIKE BUTTOCKS?
Kinda looks like a cloven hoofprint to me. Mark of the Devil, perhaps? :cool:

 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
WHY IS YOUR LOGO YELLING AT ME IN ALL CAPS? AND WHY DOES IT LOOK LIKE BUTTOCKS?
Dang... and I thought I did a great job at it, too.... now you're going to make me dream of buttocks.... I guess back to the drawing board...
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
But if you put a 3Ware JBOD disk array on a ZFS system, and the controller fails, you MUST replace it with another 3Ware RAID controller, because it places a nifty little partition table on the disk that denotes the RAID configuration (even though it is "JBOD"), and then puts the virtual JBOD disk INSIDE that partition. This is just one example of why you really want to use raw disk.

Art of Server demonstrates in his YouTube video creating a RAID 0 using a RAID card, then swapping out the RAID card for an HBA IT mode card, and the disk still works and ZFS sees all the data. He also demonstrates that ZFS works on multiple partitions on a single disk, but cautions against this for the specific reason that redundancy may be compromised if you inadvertently allocate multiple partitions on the same disk to a single vdev.

The ZFS developer says that RAID 0 is fine. The only caveat is that TrueNAS management may run into issues.
In one of his many videos, he talked about hot-insert limitations and workarounds. I will need to track this down.

In any event, I am not sure if the JBOD mode on the 9361-8i passes the disk as a raw disk or as a raid 0 disk. Unit arrives in a few days.

Art of Server is not an "expert in ZFS" or "developer of ZFS" that I am aware of. Just sells lots of reflashed HBA's on eBay.
He has lots of details in his numerous videos. In one video he addresses misconceptions about ZFS point by point.
While this is not a qualification test, the specific details do strongly imply he knows something.

Any idjit can get ZFS up and running on a RAID card.
I am only trying to understand, no need for this adjective.

That's not the mark of success, however. We judge success as demonstrated by billions of problem-free run-hours. IT mode HBA's do that. RAID controllers do not. This is a dumb debate. If you value your data, use the equipment known to work correctly in as many conditions as possible. This discussion has all the dreariness of a debate about ECC. Smart people use it.
The open question is whether JBOD mode on a RAID card is OK to use or not.

You say no, it is better to stay with what you have experience with.
Will you advocate staying on hard drives forever? Why have sync-match CPUs disappeared (i.e., the ECC equivalent for CPUs)? I worked on these in the 1980s. Why are analog amplifiers no longer in use? Why are vinyl records gone?

The natural order for technology is that it advances.

Let's recap the specific potential issues with using a raid card in JBOD mode:
- Cache ... Art of Server gives instructions to turn it off. I am not sure if this is necessary, but turning it off avoids the issue.
- The JBOD implementation may range from presenting a disk as a raw disk, to a RAID 0 disk, to maybe something else. The ZFS developer says RAID 0 is OK. Checking device names and serial numbers will confirm whether JBOD is in raw mode, which would make it even less risky.
- Swapping RAID cards does not cause an issue, according to the YouTube video's demonstration of an extreme case: swapping out a RAID card for an HBA card.
- If the RAID card implements JBOD as RAID 0, then TrueNAS management may have some issues. I would like to understand what these are.
- Turned-off code is not an issue unless the memory on the card is insufficient to fit the combination of JBOD (= IT mode) code and RAID code.
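
The device-name and serial-number check from the list above could be sketched roughly like this. It is a starting point for a test case, not a validated one: `DiskIdentity`, `looks_like_passthrough`, and the marker strings are all hypothetical, and the underlying assumption (that a controller-presented virtual disk reports the controller's identity instead of the drive's) needs to be confirmed on the actual card.

```python
from dataclasses import dataclass

# Hypothetical substrings; assumption: a RAID virtual disk shows the
# controller's name in its model string. Adjust for the card under test.
CONTROLLER_MARKERS = ("MEGARAID", "MR9361", "VIRTUAL")


@dataclass(frozen=True)
class DiskIdentity:
    """Model and serial strings for one device."""
    model: str
    serial: str


def looks_like_passthrough(seen: DiskIdentity, label: DiskIdentity) -> bool:
    """Compare what the OS reports (`seen`) with the drive's printed label.

    Returns False if the model looks like a controller-made virtual disk,
    or if model/serial differ from the physical drive's label.
    """
    model = seen.model.upper()
    if any(marker in model for marker in CONTROLLER_MARKERS):
        return False
    return (seen.model.strip() == label.model.strip()
            and seen.serial.strip() == label.serial.strip())
```

Passing this check for every drive would only show that identity strings come through unchanged; it says nothing about error handling, SMART access, or hot swap.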
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
The open question is JBOD mode on RAID card ok to use or not.
No.

You are right. Technology evolves. There are affordable Atom based mainboards with 12 SATA ports.
Everything with higher density and higher performance requirements is driving towards all NVMe. Case closed.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Artoftheserver demonstrates on his youtube video creating a raid 0 using a raid card, then swapping out the raid card for an HBA IT mode card and the disk still works and zfs sees all the data.

So if I make a video showing you where this doesn't work for a 3Ware RAID controller, then you'll believe me?

Let's recap the specific potential issues with using a raid card in JBOD mode:

You skipped the biggest one, which is that the OS drivers for RAID cards are designed for RAID failure modes and do not always handle disk error modes correctly.

- cache ... Artoftheserver gives instructions to turn off. Not sure if this is necessary, but turning it off avoids the issue.

It's definitely necessary, especially if it is not properly backed up or if there is any chance of write reordering.

The ZFS developer say raid 0 ok

Name the ZFS developer who said this. If you cannot name a developer, then I am going to ask you to stop spreading misinformation in the guise of official-sounding claims.

- Swapping RAID cards does not cause an issue according to the youtube video demonstration of an extreme case of swapping out a raid card to an HBA card.

I can show you cases where this is not true. This is part of the genesis of the set of documentation here on the forums pointing people away from RAID controllers.

- If the RAID card implements JBOD as RAID 0, then TrueNAS management may have some issues. I would like to understand what these are.

This would certainly include issues with error detection and SMART passthru.

Will you advocate staying on hard drives forever?

No. And I don't advocate using ZFS in many storage scenarios either. Flash and Optane are both newer technologies, and the primary reason we use hard drives comes down to cost-per-byte. One of the nice things about NVMe is that contemporary usage almost always has these being direct access devices without any RAID-type controller in between the host and the storage device.

For that matter, I actually quite like the LSI RAID controllers as they are the only RAID product that has generally been design-stable for more than a decade. I do advocate the right tool for the job.

Why are vinyl records gone.

Probably because it's really difficult to carry around a few thousand hours of music with you using vinyl.

While this is not a qualification test, the specific details do strongly imply he knows something.

Okay, so, here, let me point this out and let you puzzle this out:

Why do you think that LSI has separate IT replacement firmware for their low-end RAID cards if a RAID card like the 9361 is just as good?

The manufacturer's actions in designing their product line do strongly imply they know something, and that you're just wrong. I've tried to clue you in here. You're welcome to believe that you can substitute an LSI RAID controller and yes it will do basic read/write operations fine, but 99% is really not good enough for ZFS. Once you get into disk errors, drive swapouts, controller swaps, and other more esoteric issues, the product behaviour diverges. We don't want users to have to figure out how to mark a replaced drive as a new JBOD array member when trying to replace a failed disk. It should be no more complicated than "pull old disk, insert new, select Replace option".
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
You say no, it is better to stay with what you have experience with.
Will you advocate staying on hard drives forever? Why has sync-match CPU disappeared (ie: ecc equivalent for CPU), I worked on these in the 1980. Why are analog amplifiers no longer in use? Why are vinyl records gone.

The natural order for technology is that it advances.
This isn't really a good analogy. You think it holds, but that's only because either you're cherry-picking or you're just not aware of counterexamples; that doesn't mean they don't exist.

As an example, the C language was designed in 1972 by Dennis Ritchie. Yes, 51 years ago. There have been some revisions, but the core of the language stays mostly the same. I've watched countless presentations on C#, Swift, <insert some trendy new language here> by some fresh junior developer and I laugh whenever they say the reason to switch is because C is "old/ancient". I guess they missed the memo that C is the core for literally every major modern OS out there (iOS, MacOS, FreeBSD, Linux, Windows, etc etc..). People keep using C, for mission-critical use mind you, because of its simplicity, power, and speed. Not very many languages out there could compete with that... Maybe Rust, but the jury is still out on that. Other examples include Rambus trying to replace DDR, Intel Itanium trying to replace x86. Neither of those things came to fruition obviously.

Bottom line, you don't just use "new shiny" things blindly. There has to be a good rationale and justification behind it.
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
C++ is an evolution of C, and it is used in extreme real-time environments, and it is newer and shinier than C. Earlier compilers would generate C code from C++. The bottom line, you can screw things up in C++ or C, but more modern systems tend to move to C++ (my experience). Now it is easy to mess things up in C++ with a bunch of virtual function calls and real-time memory allocation, but this is also true in C with its equivalent of these. This thread is probably not the place to compare assembly from C and C++.

FYI, the ZFS developer question thread is here.

My 9361-8i system arrives in 3 hours. Either I eat pie, or the disks from my HBA system load without issue.
If the disks load, then I need to address the potential issues you raised of swapping out the 9361-8i with another 9361-8i or an HBA
and do some failure scenario testing. My concern with failure testing is that my HBA system did act weird when I did failure testing with the
cheapest eBay hard disks I could find.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
C++ is an evolution of C, and it is used in extreme real-time environments, and it is newer and shinier than C. Earlier compilers would generate C code from C++. The bottom line, you can screw things up in C++ or C, but more modern systems tend to move to C++ (my experience). Now it is easy to mess things up in C++ with a bunch of virtual function calls and real-time memory allocation, but this is also true in C with its equivalent of these. This thread is probably not the place to compare assembly from C and C++.
C++ is a new & separate language. I wouldn't consider it an "evolution" of C the same as I wouldn't consider Objective-C or C# an evolution of C. They're all separate languages that just happen to have a common subset. Linus Torvalds, himself, even ranted about C++ as a language when someone suggested that the kernel use some C++ code.

C is elegant and simple. The reason why it is so easy to mess things up in C++ is because it is so big and complicated that most developers sometimes don't even understand entirely its features and write convoluted code with it. And you know, considering that nearly ALL of modern operating systems use C (NOT C++) and even Linus Torvalds himself thinks C++ is "garbage" (his words, not mine), I would really highly reconsider the notion that C++ is somehow an "evolution" of C. Regardless of whatever opinions you may hold for Linux or Linus Torvalds, the undeniable fact is, when it comes to the world's most used OS's, the code is all C. Hell, even Python interpreter is written in C. If C++ is truly an "evolution" of C, then you'd think C would die and C++ would be everywhere.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
My 9361-8i system arrives in 3 hours. Either I eat pie, or the disks from my HBA system load without issue.

They'll likely load without issue. That was never in serious doubt; it's what happens during exceptional conditions that's really the problem. It is unlikely that you will reach the year mark and still have it be problem-free (possibly sooner). Lots of people have had problems with the MFI firmware and driver. If you're lucky, you'll be on the MRSAS driver and MFI firmware, but that combination still suffers from some of the same underlying issues.


or search for "mfi0" and "timeout".
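
If you want to run the same kind of search against your own system logs, something along these lines would do. This is a minimal sketch: the function name `find_timeouts` is made up, the default device name is simply the "mfi0" mentioned above, and where your system keeps its kernel messages is for you to fill in.

```python
import re
from typing import Iterable, List


def find_timeouts(lines: Iterable[str], device: str = "mfi0") -> List[str]:
    """Return log lines that mention `device` as a whole word and 'timeout'."""
    device_re = re.compile(r"\b" + re.escape(device) + r"\b")
    return [
        line.rstrip("\n")
        for line in lines
        if device_re.search(line) and "timeout" in line.lower()
    ]
```

It accepts any iterable of lines, so an open log file handle can be passed in directly. An empty result only means nothing has been logged yet, not that the controller is problem-free.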
 

john60

Explorer
Joined
Nov 22, 2021
Messages
85
C++ is a new & separate language. I wouldn't consider it an "evolution" of C the same as I wouldn't consider Objective-C or C# an evolution of C. They're all separate languages that just happen to have a common subset. Linus Torvalds, himself, even ranted about C++ as a language when someone suggested that the kernel use some C++ code.
Yes, lots of people ranted about assembly language being the best when I first started programming.

C is elegant and simple. The reason why it is so easy to mess things up in C++ is because it is so big and complicated that most developers sometimes don't even understand entirely its features and write convoluted code with it.
A developer who can't master a simple programming language (C++) is probably not the best example of a developer.

And you know, considering that nearly ALL of modern operating systems use C (NOT C++) and even Linus Torvalds himself thinks C++ is "garbage" (his words, not mine), I would really highly reconsider the notion that C++ is somehow an "evolution" of C. Regardless of whatever opinions you may hold for Linux or Linus Torvalds, the undeniable fact is, when it comes to the world's most used OS's, the code is all C.

Hell, even Python interpreter is written in C. If C++ is truly an "evolution" of C, then you'd think C would die and C++ would be everywhere.
To quote this thread "CPython 1.0 was released in 1989. At that time, C was just recently standardized. C++ was almost unknown".

In any event, this thread was about HBA good, RAID JBOD bad.
 