SOLVED Slow writes and "Device X is causing slow I/O" with Scale on Intel Cougar Point controller, HP Microserver Gen8, but works fine on Core

aklibisz (Dabbler, joined Aug 23, 2023, 13 messages)
I have an interesting issue and solution regarding TrueNAS Scale performance on my HP Microserver Gen8.

I first reported it in this thread: https://www.truenas.com/community/t...ow-i-o-on-pool-couple-questions.109419/page-3 It is also similar to this post, which never came to a conclusion: https://www.truenas.com/community/t...th-scale-22-02-on-hp-microserver-gen-8.99884/

I'm reporting it as a standalone issue/post, in case anyone else runs into it.

My initial system:
  • HP Microserver Gen8
  • Intel Xeon e3-1220lv2
  • 16GB memory
  • Intel Cougar Point SATA600 controller, integrated into the motherboard. This is technically a "RAID controller", but I configured it in SATA AHCI mode (i.e., RAID disabled), with write cache disabled
  • 4x 4TB Seagate Ironwolf drives, brand new, verified that these are CMR, connected to the Intel controller
  • TrueNAS OS installed on a 240GB PNY 2.5" SSD, connected to a separate SATA controller
  • Running TrueNAS Scale 22.12.3.3
When I set up this system, I ran some simple performance tests using cp from the OS drive to the pool, cp over Samba, and scp to copy the TrueNAS ISO back and forth over a 1Gbps switch.

When copying data into the pool, I found:
  • scp speeds would start off fast, ~700Mbps, and then decrease to ~20Mbps. The scp times also varied drastically between runs, from ~35 seconds to over 7 minutes to copy the file.
  • cp over samba doesn't display the exact transfer speeds while it's running, but the total transfer duration was similar.
  • copying the 1.6GB ISO from /tmp (on the boot SSD) to /mnt/<pool> took far too long, on the order of minutes. I gave up waiting for it.
After running just ~5 of these copy tests, TrueNAS showed a warning for all four drives: "Device /dev/disk/by-partuuid/<uuid> is causing slow I/O on pool <pool>".
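
For anyone who wants to reproduce the local write test in a more measurable way than cp, here is a minimal sketch in Python. It is not what I ran; the function names, chunk size, and window size are my own assumptions. It writes random data in chunks, forces each window to disk with fsync, and records the throughput per window, so a drop from fast to slow writes shows up as falling numbers.

```python
import os
import time


def to_mbps(num_bytes, seconds):
    """Convert a byte count and a duration into megabits per second."""
    return num_bytes * 8 / max(seconds, 1e-9) / 1_000_000


def timed_write(path, total_bytes, chunk_bytes=1 << 20, window_bytes=64 << 20):
    """Write total_bytes to path and return [(bytes_written_so_far, Mbps), ...].

    One sample is recorded per window; each window is flushed and fsync'ed
    so the timing reflects the disks, not just the page cache.
    """
    chunk = os.urandom(chunk_bytes)  # random data, so compression cannot mask slow disks
    samples = []
    written = 0
    window_written = 0
    window_start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
            window_written += n
            if window_written >= window_bytes or written == total_bytes:
                f.flush()
                os.fsync(f.fileno())
                elapsed = time.perf_counter() - window_start
                samples.append((written, to_mbps(window_written, elapsed)))
                window_written = 0
                window_start = time.perf_counter()
    return samples
```

A call might look like timed_write("/mnt/<pool>/testfile.bin", 1_600_000_000), where the file name is hypothetical and <pool> is your pool name. Printing the samples should show whether throughput degrades as the write progresses. Remember to delete the test file afterwards.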

The read speeds were actually good, around 900Mbps, and they didn't obviously fluctuate or degrade like the writes.
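
To put those rates in perspective, here is a rough back-of-the-envelope calculation of how long the ISO should take at each speed. It assumes "1.6GB" means 1.6 decimal gigabytes and that the rate stays constant for the whole transfer, which was clearly not the case for my writes.

```python
def transfer_seconds(size_bytes, rate_mbps):
    """Seconds needed to move size_bytes at a constant rate in megabits per second."""
    return size_bytes * 8 / (rate_mbps * 1_000_000)


ISO_BYTES = 1_600_000_000  # assumption: "1.6GB" read as decimal gigabytes

for label, rate in [
    ("read, ~900Mbps", 900),
    ("write, initial ~700Mbps", 700),
    ("write, degraded ~20Mbps", 20),
]:
    print(f"{label}: {transfer_seconds(ISO_BYTES, rate):.0f} s")
```

Under those assumptions this works out to roughly 14 s at 900Mbps, 18 s at 700Mbps, and 640 s (over 10 minutes) at 20Mbps. Even my best scp run of ~35 seconds was about twice the 700Mbps figure, which fits with the speed dropping partway through the transfer.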

I tried disabling encryption, disabling synchronization, and some other things, but nothing helped.

Then I decided to try Core, which immediately and completely resolved the write speed issues. No messing with drivers or configurations; it just worked out of the box.

Then I was advised to try a better-known LSI HBA card, so I purchased one on eBay titled "IBM H1110 SAS-2 6Gbps HBA LSI 9211-4i P20 IT Mode for ZFS FreeNAS unRAID US". The exact listing is here: https://www.ebay.com/itm/133308217106

I got the card a few days later and tried it with Core. Everything still works well. Switching between the on-board Intel RAID controller and the HBA card, between reboots, also works.

I then tried Scale with the card, and everything also worked well. I tried Scale again with the on-board Intel RAID controller, and writes were still terribly slow.

So, in the end, I think I'm going to stick with Core and continue using the HBA. I don't see any obvious must-have features in Scale, and I'm a bit spooked by how poorly it performed. I also like that if the HBA ever fails, I can temporarily switch over to the on-board Intel until I get a new HBA.

I took a debug dump from Scale shortly after performing a few scp tests. If anyone is interested, I can share it.

TLDR: If writes are extremely slow on your Microserver Gen8 running Scale, either switch to Core, or install one of the recommended LSI HBA cards (search LSI 9211 Truenas on eBay).
 