SOLVED Terrible performance on HP DL380 g9 / H240 smartarray

lc10239

Dabbler
Joined
Jan 5, 2022
Messages
12
Hello,
I've done a fresh install of TrueNAS SCALE, and both interacting with the system and fio testing show truly awful performance numbers-- IOPS in the double digits whether on SSD or HDD. The IOPS / bandwidth reported by fio do not seem to depend on whether I test a single disk, a single mirror vdev of the SSDs, or a stripe of six mirror vdevs on the HDDs. The only thing that makes a difference is adding a SLOG to the stripe of mirrors, which consistently adds ~50% to the IOPS (so 70-100 --> 120-150). Record size also seems to have no impact on IOPS until I go above 256.

Hardware is an HP ProLiant DL380 Gen9
CPU: 2x Xeon E5-2660 v3 (Haswell, 20 cores)
RAM: 24x 16GB
HBA: HP Smart HBA H240, in "HBA mode" with an HP 12G SAS expander
Storage / SSD: Samsung 860 Pro 1TB
Storage / HDD: HP MB3000GCWDB (3 TB SATA @ 7200)
Storage / Boot: Internal HP 8GB microSD (have also tried booting off of the HDD)

I've been using the following fio settings:
fio --filename=test --ioengine=io_uring --sync=1 --bs=128k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=2G --runtime=60 --rw=randwrite && rm test
Single disk:
fio --filename=/dev/sdo --direct=1 --sync=1 --bs=4k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=5G --runtime=60 --rw=randrw

Turning sync off makes the IOPS substantially better, but alarm bells are going off because there is *no* difference in performance between a six-vdev stripe of mirrors, a single HDD, and a single SSD.
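(If anyone wants to reproduce the "sync off" comparison, the sanitized version is roughly the following-- "tank/fiotest" is just a placeholder dataset name:)
zfs create tank/fiotest
zfs set sync=disabled tank/fiotest    # bypasses the sync-write / SLOG path entirely
fio --filename=/mnt/tank/fiotest/test --ioengine=io_uring --bs=128k --numjobs=1 --iodepth=4 --group_reporting --name=test --filesize=2G --runtime=60 --rw=randwrite
zfs destroy tank/fiotest              # clean up; never leave sync=disabled on real data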

Interestingly, I also encountered this on Proxmox, which is likewise Debian/OpenZFS-based, so I am wondering if there is an issue somewhere in that stack. I had seen another thread suggesting that a high core count can be a detriment with OpenZFS, and a number of posts suggesting that the H240 is an excellent, high-performance adapter, so I'm at a loss. The server is in good condition-- it's old decommissioned stock from our datacenter that was supporting some high-speed databases just fine, so I'm not expecting to see hardware issues.

I am planning to switch the adapter back to RAID mode just to check IO performance on native RAID, but I really would prefer to use TrueNAS SCALE as it hits every design spec for my homelab other than the terrible performance.

Any thoughts on tracing the root cause would be appreciated.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Interestingly

(mutters)

I'm not expecting to see hardware issues.

Well, here's a late holiday surprise, then.

The hardware guides for ZFS on TrueNAS and Proxmox both tell you not to use a RAID controller. For some reason, this mistake has become MUCH more common in the last several years, so I wrote a summary of why:


The actual story is somewhat more complex than the article suggests, but it basically works out like this: while the H240 may be a good controller for Windows applications, it's a custom PMC-Sierra jobbie (IIRC), and the drivers for it under FreeBSD and Linux aren't that great. Combined with the immense stresses ZFS dishes out as part of ordinary business, it simply doesn't work well, because the H240 wasn't designed for this sort of traffic.

The thread count shouldn't be a dealbreaker. That's a 2.6/3.3GHz CPU. I might pick a higher-performance part with a lower core count, but it isn't going to be SUBSTANTIALLY faster. Avoid the low-end parts with low core clocks and weedy turbo speeds.

So just to be clear, the H240 may be called an "HBA" by HP marketing, but it is actually running as a stripped-down RAID controller, with a RAID controller driver, and this doesn't work out too well. It's just as bad on FreeBSD as it is on Linux, so it's clearly the device, not the OS. On FreeBSD, these typically show up under the catastrophically bad-for-ZFS CISS driver, and I don't recall what they show up as on Linux.

Replace the controller with an LSI HBA crossflashed to IT mode. This is the best path to success. Some searching of the forums may yield more specific recommendations about what exactly is compatible with your specific HP model.
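For reference, the usual crossflash dance with LSI's sas2flash utility looks roughly like this-- the firmware file name below is a placeholder, so follow a proper crossflashing guide for your exact card before erasing anything:
sas2flash -listall                          # note the controller number and its current SAS address
sas2flash -o -e 6 -c 0                      # erase the existing IR/RAID flash on controller 0
sas2flash -o -f <IT_firmware>.bin -b mptsas2.rom -c 0    # flash IT firmware plus the (optional) boot ROM
sas2flash -o -sasadd 500605bxxxxxxxxx -c 0  # restore the SAS address you noted earlier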
 

lc10239

Dabbler
Joined
Jan 5, 2022
Messages
12
Just to be clear, the H240 supports a "RAID mode" which does abstract things away, and an "HBA mode" which I understand is the same as "IT mode" for LSI chips. See for instance this STH thread. The H240 is currently in HBA mode, but it's possible it's still acting as a RAID card. Is there a way to verify that with e.g. systool?

Performance-wise, this box had been in production running databases on CentOS 7, using the hpsa driver (a SCSI driver, not a RAID driver), which has been mainlined for a long time. I can see that this is the driver in use in TrueNAS SCALE with lsmod and systool. To be clear as well, I have not tested it on FreeBSD, since TrueNAS CORE would not fit my use case.
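(In case it helps anyone else checking the same thing, this is roughly what I looked at-- host numbers will differ from system to system:)
lspci -nnk | grep -iA3 'raid\|sas'          # shows the controller and the "Kernel driver in use:" line
lsmod | grep hpsa                           # confirms the hpsa module is loaded
systool -c scsi_host -v                     # lists SCSI hosts and their attributes, including proc_name
cat /sys/class/scsi_host/host0/proc_name    # driver bound to that SCSI host (host number varies)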

I appreciate the feedback, and I'm not trying to be stubborn in the face of good advice, but it is not reasonable to suggest that the H240 simply chokes on IOPS above 75 when running zero RAID calculations. Others are reporting that this controller should be capable of far more than that, and I see reports of Linux users hitting a few thousand IOPS (~2000) with software arrays.

Short of installing e.g. Fedora and running fio, is there another way to pin down where exactly the hangup is? I'm not looking for you to make a bad controller work with software that isn't designed for it, but I would like to nail down the actual cause here, if for no other reason than so I can avoid wasting y'all's time in the future.

Thanks!
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Others are reporting that this controller
I dunno, my quick takeaway from that thread is that it's less than reliable.

It's hard to be sure about this without more data, but the experience that does exist is as follows (more or less equally valid for Linux and FreeBSD):
  • Adaptec's non-Windows drivers were typically disastrous, and unacceptably unreliable at best. At one point, they had someone try to write and maintain a FreeBSD driver for their then-shiny-and-new HBA (Rocket 750?). It was their best effort ever, and yet it couldn't operate without erroring out way too frequently. Their RAID stuff often didn't even have drivers.
  • LSI RAID controllers are weird. Mixing RAID and non-RAID, at least on some models, absolutely tanks performance. Some have special non-RAID modes. With the right setup, I don't think anybody's complained about performance or reliability - but it takes fiddling and it's an open question whether the mrsas driver really is good enough.
  • LSI HBAs work very well, with updated firmware. The first ~7 revisions always seem to be experimental, at best.
So, if what you want is to get things done ASAP, go with LSI HBAs. If you want to tinker, please do so and let us know how it went. LSI's monopoly isn't great, but it was earned through technical superiority.

Sidenote: With the recent talk of NVMe spinning rust, I hope that ends up happening. PCIe 4.0 x1 to 36 3.5" bays is trivial with today's platforms and would free us from SAS controllers and expanders, while adding bandwidth. Or rather, it should be trivial, but I don't think anybody has support for bifurcating that far down, which would leave us back at PCIe switches made by... *checks notes* LSI/Broadcom/Avago or Microchip? D'oh.
 

lc10239

Dabbler
Joined
Jan 5, 2022
Messages
12
Makes sense. Followup question: The H240 has 2 SAS connectors going into an expander. The Mobo also has 2 unused SAS connectors. Is it reasonable to just connect the expander to the mobo and expect "reasonable" performance?

As for NVMe-- I mentioned wanting to use TrueNAS in the datacenter, and we are in that world. When we built things out, we did not have the money for 100TB of flash from someone like Nimble (neighborhood of $500k?). But NVMe is super cheap, and RAID cards aren't much of a thing there-- when each NVMe drive uses 4 PCIe lanes, what RAID card is going to handle 10 of them?

Our current software solution is incredibly complex to manage and the performance doesn't justify it, so I'm hoping to give TrueNAS CORE a shot on a spare node to see how it does.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The Mobo also has 2 unused SAS connectors. Is it reasonable to just connect the expander to the mobo and expect "reasonable" performance?
I expect those are actually SATA, not SAS, so it wouldn't work at all with an expander.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
an "HBA mode" which I understand is the same as "IT mode" for LSI chips.

You misunderstand. It wants to be the same, but does not have the same performance and reliability characteristics.

This is evidenced by your coming here with these problems.

See for instance this STH thread. The H240 is currently in HBA mode, but it's possible it's still acting as a RAID card. Is there a way to verify that with e.g. systool?

It's basically irrelevant. It's known not to work well. It may be trying, trying hard even, to be an HBA, but the results are underwhelming. Regardless of whether it is in "HBA mode" or "RAID mode", it is probably using the same OS driver, and as previously explained in the article I linked above, this does not bode well. Most HBAs are not highly performant; they are sold as the cheapest option for connecting drives, and often very little has been done to tune them for performance. The PMC-Sierra chips are the Realteks of the SAS world. They work, but they're nothing to write home about. We advocate stuff like LSI HBAs and Intel ethernets because these companies have spent large amounts of engineering time authoring drivers and fine-tuning for performance. These qualities are very helpful in making a NAS perform well.

not reasonable to suggest that the H240 simply chokes on IOPS above 75 when running zero RAID calculations.

Yes, fine. By that same argument... it's not reasonable to suggest that an LSI 9271CV-8i would choke on a few thousand IOPS either, but we know from experience that the driver layer and other issues with this RAID card make it undesirable for use with ZFS, even though it is my preferred card for running local ESXi storage.

What you're asking for is an in-depth analysis of what the actual issue is. There may well be hobbyists out there interested in doing this, and yay if they show up and tell you that you just need to twiddle thing X and it magically works better. Some of us do this professionally, but I have exactly zero customers who insist on using an H240 instead of a cheap LSI HBA. Doing this kind of deep dive generally takes a good bit of analysis and profiling to figure out where the problem lies, so no one is willing to pay me a thousand or more for the time and hardware to do this, when simply ordering an appropriate HBA is, by comparison, practically a zero-cost fix. Likewise, iXsystems has not been big on resolving issues with random non-LSI HBAs. They've picked LSI HBAs to sell in their TrueNAS hardware, and have worked with LSI to address a few known issues.

The pragmatic fix is just to use the correct recommended hardware. We all wish PC hardware wasn't so $#!+ty like this.

is there another way to pin down where exactly the hangup is? I'm not looking for you to make a bad controller work with software that isn't designed for it, but I would like to nail down the actual cause here, if for no other reason than so I can avoid wasting y'all's time in the future.

Absolutely. You can become an expert in the device driver involved, the firmware on the card, and how it all interacts with ZFS. Doing this at the kernel source code level may require several dozen hours of your time, even if you already have some familiarity with this sort of thing. I've done that kind of work; that was the genesis of the oft-discussed bug 1530 of days long past. It's tedious, annoying work to understand all the interactions of the relatively complex stack of software involved in making ZFS work.

My dislike for this sort of work makes me very happy to buy used LSI 2008/2308s for $30 a pop, used LSI 3008s for $90 a pop, or even pay for brand new controllers. My hourly rate is close enough to the cost of a new controller that no one is interested in paying me to do the work either; it's cheaper just to buy the right thing in the first place.

As @Ericloewe commented, LSI has a bit of a monopoly here. I am absolutely fine with that edge vanishing, but I see it as unlikely. Few of the other vendors have put the time and effort into their drivers. The only other one I trust is (on FreeBSD) isci for the C600-style Intel SCU.

Sometimes you have to play the hand you are dealt.
 

lc10239

Dabbler
Joined
Jan 5, 2022
Messages
12
Thanks for the followup.
What you're asking for is an in-depth analysis of what the actual issue is.
What I was looking for was "if I saw something like this happen in prod, how would I ID the adapter as the issue". But I think what I'm hearing is twofold. First, the combination of no ZFS errors, lots of free RAM, low CPU usage, and a bad `fio` report does point at the controller, especially given that it is non-LSI. And second, that funky hardware / drivers are not going to be worth troubleshooting.
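(Concretely, the quick checks I mean are nothing fancy-- roughly:)
zpool status -v    # no read / write / checksum errors on any vdev
free -h            # plenty of RAM still free
top                # CPUs mostly idle while fio is running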

I'm going to hang with the RAID function + ESXi on this box until I get a chance to get an LSI adapter. From what I'm reading, this sounds like it should be an LSI 9207-8i hanging off of the expander-- I'm just not clear whether that will support the full 16 SATA drives in the system.

Thanks!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
if I saw something like this happen in prod, how would I ID the adapter as the issue

Well, it's like any IT issue. You could:

1) Talk to your field engineer, and have them run the issue up the tree within their organization. This is the classic answer, but it doesn't work for TrueNAS unless you've purchased a system from iXsystems. It will, however, reliably get you an answer of some sort where it is available.

2) Read the fine manual and see if you can identify the problem on your own. This often requires you to become something of an SME on the issue. Some of us do this on a semi-regular basis. My ZFS knowledge jumped significantly a decade ago when I jumped in the deep end trying to use ZFS for block storage.

3) See if you can avail yourself of Internet resources to identify the problem. There are forums like this one, or Reddit, but a large percentage of the users tend to be SOHO or hobbyist types. There are some of us who do this professionally, but speaking for myself, I tend to make pragmatic choices, because the CTO who signs off on purchases is a cheapskate and a tightwad, and also has this uncanny awareness of where corners can be cut. (That CTO cheapskate is ME, by the way.) I would rather find ways to exploit cheaper IT options.

The problem, as previously outlined, is that no one really has a motivation to support a second or third HBA chipset/firmware/driver combo. So when you come to these forums, the only real answer is "try this thing we know to work well", and for those who resist, the answer becomes "k, great, good luck with that, let us know how it goes".

So I feel like you're right:

But I think what I'm hearing is twofold. First, the combination of no ZFS errors, lots of free RAM, low CPU usage, and a bad `fio` report does point at the controller, especially given that it is non-LSI. And second, that funky hardware / drivers are not going to be worth troubleshooting.

So I'm not blowing off your issue just to be rude. I'm happy to try to help people, within reason, and so are all the other posters here. We happily adapt to new realities, but I'm not paid to do this, and even the paid people over at iXsystems aren't desperate to solve these issues.

It's not necessarily pleasant to write off hardware that could/should be better supported. On an unrelated issue, we're currently going through the ESXi 6.7->7 transition here and having to trash lots of storage and network controllers that no longer work because they lack VMware native drivers. It's unfortunate, but stuff happens. Card manufacturers do not want to go back and create new drivers for hardware they no longer sell. No profit in it. That sucks. Sigh. So I understand your frustration.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
On the plus side, cheap cards for those of us who run OSes that are less aggressive about dropping support for perfectly-serviceable hardware.
 

lc10239

Dabbler
Joined
Jan 5, 2022
Messages
12
A little update:

Not sure how often y'all are told you're correct, but I reconfigured the card for RAID (RAID 1 of 2x Samsung 860 Pro, RAID 10 of 12x HP 3TB SATA) and installed ESXi. The performance is still *abysmal*, and we know it isn't the drivers / OS, because the card is on the VMware HCL and the drivers are provided by HP (OEM-modified VMware installation). This card is, indeed, a complete dog. But upside! It's apparently not ZFS or the hpsa driver that is the problem: 75 IOPS might simply be correct.

It's sort of incredible that the devs who were using this box never bothered to look into it, and I made the mistake of assuming that a team of developers who were given a big budget wouldn't have gone to prod with something so terrible.

So shame on me, and thanks again for the help. I don't take it as a brush-off at all; I totally get not wanting to troubleshoot uncountable crappy HBAs.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It's sort of incredible that the devs who were using this box never bothered to look into it
It could be worse. They could've "looked into it". I inherited a bunch of Xeon E5 v3 servers with an excess of RAM because "the app gets really slow if not using a RAM disk". While reverse-engineering how the app was supposed to work (because documentation is for losers), at one point I accidentally ran it on a server's boot pool (mirrored 250 GB Samsung 860 Evos, nothing fancy at all). Did I notice when performance tanked? Of course not; it ran the same. I only noticed when I saw pool usage increasing as crud accumulated during my experiments.
 