Device is causing slow I/O on pool, couple questions

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
What is said boot drive? How are you connecting those drives to the motherboard?

Please look at this post.
Thanks for the quick response here.

The boot drive is a brand new 250GB PNY SSD. Boot drive is connected via SATA directly to motherboard. The 4-drive array is also SATA. The BIOS sees two sata controllers. One for the boot drive, one for the 4-drive array.

I've seen that post, but I'll have to re-read it later.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Output of camcontrol devlist please. Won't work in SCALE. Define SATA controllers... how are you connecting your 4 drives "array"?
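For anyone following along on SCALE, where `camcontrol` isn't available: `lsblk` is a rough stand-in on Linux. Below is a minimal sketch that picks the listing command by OS. The helper name `device_list_command` is made up for illustration, and the `lsblk` column selection is just one reasonable choice, not anything TrueNAS prescribes.

```python
import platform
import subprocess


def device_list_command(system_name):
    """Return a disk-listing command for the given OS name (hypothetical helper)."""
    if system_name == "FreeBSD":
        # TrueNAS CORE is FreeBSD-based, so camcontrol is available.
        return ["camcontrol", "devlist"]
    if system_name == "Linux":
        # TrueNAS SCALE is Linux-based; lsblk is the closest stand-in.
        return ["lsblk", "-o", "NAME,MODEL,SERIAL,TRAN,SIZE"]
    raise ValueError(f"no known device-listing command for {system_name!r}")


def list_devices():
    """Run the command for the current OS and return its text output."""
    cmd = device_list_command(platform.system())
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
```

Calling `print(list_devices())` from a shell on the NAS should give output you can paste into the thread.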

Please read the following resources.
 

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
Define SATA controllers... how are you connecting your 4 drives "array"?
I’m sliding them into the slots in the HP Proliant gen8 microserver. I’m not sure how else to describe it. This isn’t some sketchy/cheap sata multiplexer off eBay or aliexpress if that’s what you’re getting at.

I’ve read the resources linked but I don’t see/understand anything actionable there. The only thing I can think to try is to install TrueNAS core and see if that works better.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
There’s a bunch of semi-interesting things here. Can you PM me a debug from both source and destination systems, and provide a brief summary of where you are now, what you tried recently, that kind of thing?

I need some more context here but I’m happy to help
 

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
Thanks!
Can you PM me a debug from both source and destination systems ? and provide a brief summary of where you are now?
What is a "debug" in this case? Once I know that, I can DM it to you.

I've actually just installed core, and I'm planning to re-configure and try these tests:
  1. cp the TrueNAS ISO from the boot SSD to the dataset
  2. scp the TrueNAS ISO from macos to the boot SSD
  3. scp the TrueNAS ISO from macos to the dataset
  4. cp the TrueNAS ISO from macos to the dataset, mounted via SMB
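The local copy in test 1 can be timed with something like the sketch below. It only covers the local `cp` case; the scp and SMB tests would still need to be timed from the macOS side. The function name `timed_copy` and any paths you pass in are assumptions for illustration, and the fsync is there so the number reflects a write to disk and not just a cache hit.

```python
import os
import shutil
import time


def timed_copy(src, dst):
    """Copy src to dst and return (elapsed_seconds, gigabits_per_second)."""
    size_bytes = os.path.getsize(src)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    # Ask the OS to flush the destination file to stable storage
    # before stopping the clock.
    with open(dst, "rb+") as f:
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        # Guard against a zero reading on very small files.
        return elapsed, float("inf")
    gbps = (size_bytes * 8) / elapsed / 1e9
    return elapsed, gbps
```

For example, `timed_copy("/tmp/truenas.iso", "/mnt/<pool>/<dataset>/truenas.iso")` with the real pool and dataset names filled in.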
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Thanks!

What is a "debug" in this case? Once I know that, I can DM it to you.

I've actually just installed core, and I'm planning to re-configure and try these tests:
  1. cp the TrueNAS ISO from the boot SSD to the dataset
  2. scp the TrueNAS ISO from macos to the boot SSD
  3. scp the TrueNAS ISO from macos to the dataset
  4. cp the TrueNAS ISO from macos to the dataset, mounted via SMB
Oh sure, sorry for not linking it.
Basically a dump of all your logs and stuff so I can take a look without asking 1739 questions.

The problem is that you just reinstalled so we’ve lost all of the evidence in your logs.

This actually might be a good thing. It’ll make it easier to read. Get another job kicked off and let it run a bit. Come back here tomorrow with a fresh debug.
Try not to make too many changes or introduce any new variables until then.

Then just wait until you see the device is slow messages again.

 

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
Thanks @NickF

I have to call it a day, but I just finished a bit of quick testing on TrueNAS Core, and it looks like TrueNAS Core is working much better. Here's a quick braindump of what I've found:
  1. Copying the 1GB TrueNAS Core ISO from /tmp to /mnt/<pool>/<dataset> is instantaneous. The `time` command literally returns 0.00. I'm pretty sure /tmp is on the boot SSD. `df /tmp` and `df /mnt` return different capacities.
  2. SCPing the 1GB ISO from my macbook to the dataset over a 1Gbps connection runs at 0.9Gbps
  3. SCPing the 1GB ISO from the dataset back to my macbook over same 1Gbps connection runs at 0.9Gbps
  4. I couldn't get SMB permissions figured out, so I'm kicking that down the road

The only problem is that Core didn't recognize my 2.5Gb NIC at all (Scale recognized it out of the box). So I switched back to the built-in 1Gbps NIC for now.

I don't have debug logs yet. I'll try to get those tomorrow, but it's a busy day, and then I'm traveling on the weekend.

If I can get the 2.5Gb NIC working, I might just stick with Core. I'm new to TrueNAS entirely, so I don't have a particular affinity for core vs. scale, and I don't need Docker images or VMs running on this thing. Just need reliable storage. LMK if it sounds crazy to stick with Core in this situation.
 

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
One addition: I got SMB working, and transferring the 1GB ISO from my macbook to the mounted SMB share runs in about 10 seconds. Transferring from SMB back to macbook also runs in about 10 seconds. (This is > 1Gbps, which doesn't really make sense, as I have a 1Gb NIC on the TrueNAS server. I do have a 2.5Gbps USB-C adapter and a 2.5Gbps switch.)

In any case, SMB is working well, so this further supports that there was something wrong with Scale on my system.
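A quick sanity check on the arithmetic, as a sketch that assumes the ISO is exactly 1 GB (decimal), takes the "about 10 seconds" figure at face value, and ignores protocol overhead:

```python
def throughput_gbps(size_gb, seconds):
    """Convert size_gb gigabytes transferred in `seconds` to gigabits per second."""
    # 1 byte = 8 bits, so gigabytes * 8 = gigabits.
    return size_gb * 8 / seconds


print(throughput_gbps(1.0, 10.0))  # 0.8
print(throughput_gbps(1.0, 8.0))   # 1.0
```

With those rounded figures the rate comes out near 0.8 Gbps, which would still fit within a 1Gb NIC. The transfer would have to finish in under 8 seconds, or the ISO would have to be noticeably larger than 1 GB, for the rate to exceed 1 Gbps.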
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I’m sliding them into the slots in the HP Proliant gen8 microserver. I’m not sure how else to describe it. This isn’t some sketchy/cheap sata multiplexer off eBay or aliexpress if that’s what you’re getting at.

I’ve read the resources linked but I don’t see/understand anything actionable there. The only thing I can think to try is to install TrueNAS core and see if that works better.

It uses a storage controller, namely the B120i. This is a RAID controller, which should not be used with ZFS, as explained in the resource I linked.
 

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13

It uses a storage controller, namely the B120i. This is a RAID controller, which should not be used with ZFS, as explained in the resource I linked.
Ah, thanks. An important detail might be: I’ve disabled the RAID functionality in BIOS and enabled “SATA AHCI Mode” instead. So I was disregarding the “don’t use RAID” advice in that post. The card defaulted to RAID mode, and TrueNAS did not recognize the drives. I disabled RAID mode before installing Scale, and kept it disabled for Core. So it shouldn’t be a variable in the Core vs Scale discussion. Is this configuration still problematic?

Btw, I was also able to get the 2.5Gbps NIC working and measured 1.5Gbps transfer over my 2.5Gbps switch, which is as fast as I’ve seen over this switch. The switch is pretty cheap, so I’m going to try with a more advanced/expensive one to see if I can get closer to 2.5Gbps.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
You cannot use a RAID card with ZFS; you have to use an HBA in IT mode. Some RAID cards can be crossflashed to HBA firmware.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
Interesting discovery @Davvo. I feel like I remember folks using these in the past tho...hmmm...
Maybe worth digging more into the forum here for other people that had issues with this server.
In any case, a cheapo eBay HBA is an easy enough fix?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It uses a storage controller, namely the B120i. This is a RAID controller, which should not be used with ZFS, as explained in the resource I linked.

I think we've previously explored this and determined it is the crappiest of all possible things, a tacky big lie by HPE. I am *fairly* certain that it is little more than some SATA AHCI ports and then if you use it with Windows it has a software RAID driver. Plus a BIOS component that allows it to "boot" from the RAID even if a drive has failed. If it is in AHCI mode, it quite possibly works fine. I believe it is NOT an Intel controller but rather something like PMC/Sierra though. Could be wrong.
 

NickF

Guru
Joined
Jun 12, 2014
Messages
763
I think we've previously explored this and determined it is the crappiest of all possible things, a tacky big lie by HPE. I am *fairly* certain that it is little more than some SATA AHCI ports and then if you use it with Windows it has a software RAID driver. Plus a BIOS component that allows it to "boot" from the RAID even if a drive has failed. If it is in AHCI mode, it quite possibly works fine. I believe it is NOT an Intel controller but rather something like PMC/Sierra though. Could be wrong.
Fascinating. I had a Gen10 which was NOT this way. Gotta love HP.
 

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
Ok, catching up: I re-read the "What's all the noise about HBAs, and why can't I use a RAID controller?" post.

So, I'm using the "HPE Dynamic Smart Array B120i", which is not an LSI HBA controller flashed to IT/IR. I was surprised that this is problematic, as I've seen many people online using the Gen8 Microserver for TrueNAS. Maybe we are all ignorant of the nuances here. I actually don't know what IT/IR means yet, but I did verify in the BIOS that this controller has write caching disabled.

I'm still not sure how to weigh the specific risk of continuing to use this controller. The most concrete thing I've understood from the post is that it is less reliable than LSI controllers, and there is some long-tail risk that it stops working. I don't really understand the precise failure mode, and so I don't know how to weigh the risk against a bunch of other risks like the motherboard dying, or two disks failing in a RAIDZ1, or even my server getting stolen.

It sounds like I could buy a more reliable controller for pretty cheap, but there is only 1 PCI expansion slot, and I would rather use it for the upgraded NIC.

This leaves me wondering: where is the existing controller and can I upgrade the controller without using the PCI slot? I don't see any discrete controller card. The drives connect to the motherboard via a mini SAS connector that fans out to four SATA connectors, each connected to a drive. So is the controller this cable rig? Or is the controller integrated into the motherboard? I've attached some photos for reference. If the controller is not integrated, maybe there's a better controller that connects to this mini SAS slot?
 

Attachments

  • fit_2560 (1).jpeg (163 KB)
  • fit_2560.jpeg (197.6 KB)
  • fit_2560 (2).jpeg (174.8 KB)

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I'm still not sure how to weigh the specific risk of continuing to use this controller. The most concrete thing I've understood from the post is that it is less reliable than LSI controllers, and there is some long-tail risk that it stops working.

The question here is what the hell is the thing. I have not personally seen one of these newer Microservers, the one here is an older AMD based N40L, but some of the newer ones use Marvell controllers too. You can look in the dmesg for a line like

ahci0: <Marvell 88SE9230 AHCI SATA controller> port 0xe050-0xe057,0xe040-0xe043,0xe030-0xe037,0xe020-0xe023,0xe000-0xe01f mem 0xfea40000-0xfea407ff irq 32 at device 0.0 on pci1

or

ahci0: <AMD SB7x0/SB8x0/SB9x0 AHCI SATA controller> port 0xd000-0xd007,0xc000-0xc003,0xb000-0xb007,0xa000-0xa003,0x9000-0x900f mem 0xfe6ffc00-0xfe6fffff irq 19 at device 17.0 on pci0

or

ahci0: <Intel Cougar Point AHCI SATA controller> port 0x1000-0x1007,0x1008-0x100b,0x1010-0x1017,0x1018-0x101b,0x1020-0x103f mem 0xfacd0000-0xfacd07ff irq 17 at device 31.2 on pci0

Basically all the Intel AHCI SATA controllers are rock solid, and many others work fine too. However, not all of them do. The Marvells have been flaky in the past (see in particular the debacle with the ASRock Rack storage board popular about a decade ago that used two different Marvell chipsets; this got fixed in software, thankfully). There's stuff that's known to work well (LSI HBA, Intel SATA AHCI, Intel SCU, etc.) and then there's stuff that is more dodgy. Dodgy doesn't always mean it won't work out, but it's harder to know for certain.
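If the dmesg is long, the controller lines can be pulled out with a few lines of Python. This is a minimal sketch that assumes the FreeBSD-style `ahciN: <description>` format shown in the examples above (Linux dmesg on SCALE is formatted differently). The function name `ahci_controllers` is made up for illustration, and the sample line is an abridged copy of the Intel example.

```python
import re

# Matches the start of lines such as:
#   ahci0: <Intel Cougar Point AHCI SATA controller> port ...
AHCI_LINE = re.compile(r"^(ahci\d+): <(.+?)>")


def ahci_controllers(dmesg_text):
    """Map each ahciN device name to the description inside the angle brackets."""
    found = {}
    for line in dmesg_text.splitlines():
        match = AHCI_LINE.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return found


sample = (
    "ahci0: <Intel Cougar Point AHCI SATA controller> "
    "port 0x1000-0x1007 mem 0xfacd0000-0xfacd07ff irq 17 at device 31.2 on pci0"
)
print(ahci_controllers(sample))
# {'ahci0': 'Intel Cougar Point AHCI SATA controller'}
```

Feeding it the full text of a saved dmesg file would list every AHCI controller the kernel attached, which makes it easy to see whether the ports are Intel, Marvell, or something else.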
 

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
It looks like it's an Intel controller. I attached screenshots of the BIOS and dmesg output and a text file of the full dmesg.

Thanks for all this help, btw.
 

Attachments

  • Screenshot 2023-08-26 at 9.13.23 AM.png (148.9 KB)
  • Screenshot 2023-08-24 at 7.44.48 PM.png (214.4 KB)
  • dmesg.txt (13.1 KB)

aklibisz

Dabbler
Joined
Aug 23, 2023
Messages
13
I went ahead and purchased a proper HBA: https://www.ebay.com/itm/133308217106

I figure I’d rather have reliable storage at 1Gbps than headaches/surprises at 2.5Gbps.

I’ll test it with both Scale and Core and post some results. At this point it might be best for me to start a new thread? I’ve totally hijacked this one.
 