I/O Errors Galore

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
As a preliminary to upgrading all my systems to TrueNAS CORE, I installed it on a Dell PowerEdge R440 (no RAID/software RAID, set to AHCI, 8 x 2.5-inch SATA) with half the backplane (4 x 2.5-inch SATA) on an LSI HBA. The same system was stellar with FreeNAS. This version gives me thousands of CAM errors and other oddness, and keeps saying the device is reset. I feel like I am missing something obvious here. I've never had so much trouble. First I brought up an old pool and it said a disk was degraded. I replaced that disk, and then it said the other two were degraded. I tried a pool with just the new disk; it ran fine for a few hours, then the pool was suspended for I/O errors. I replaced the HBA and moved it to a new slot, and replaced the cables and all disks, with the same result. I can create a pool, but it only works for a little while. Then, only after several reboots, can I destroy/export it and create it again. I notice that auto trim can now be manually turned on, and I tried that for a couple of iterations, but it didn't seem to work. Any insights here?

Sorry, I don't have any logs right now. I booted back to legacy FreeNAS just to make sure that still works and I'm not going crazy. If anyone wants more info, I'd be happy to boot back to TrueNAS and collect some logs.
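For anyone following along, here is a minimal sketch of what could be collected after booting back into TrueNAS. It assumes a FreeBSD-based TrueNAS CORE shell, and that `sas2flash` is installed for the SAS2 HBA. The output path `OUT` and the helper `run_if_present` are hypothetical names picked for this example, not anything TrueNAS provides.

```shell
#!/bin/sh
# Sketch only: gather basic storage diagnostics into one file.
# OUT and run_if_present are illustrative names, not TrueNAS tooling.
OUT="${OUT:-/tmp/truenas-diag.txt}"

# Run a command only if it exists, so a missing tool does not abort the run.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "### $*"
        "$@" 2>&1 || echo "### $1 exited with status $?"
    else
        echo "### $1 not found, skipped"
    fi
}

{
    run_if_present zpool status -v     # pool state and per-device error counters
    run_if_present camcontrol devlist  # disks as seen by the CAM layer
    run_if_present dmesg               # kernel messages, including CAM errors and resets
    run_if_present sas2flash -listall  # firmware versions of LSI SAS2 HBAs (assumes the tool is present)
} > "$OUT"

echo "wrote $OUT"
```

Attaching the resulting file, taken right after the pool is suspended, should show whether the resets come from one disk, one port, or the whole controller.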

All disks are SATA SSDs: 3 Western Digital Red 2TB (WDS200T1R0A), plus 1 Intel S3700 that I was trying to use as a control during experimentation. (I also use them for write protection in production on SATA backplanes.)

Current Train: TrueNAS-12.0-STABLE
No updates available.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Did you try going back to your previous version, to rule out an unfortunate coincidence?
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
Did you check your PSU?
It could be dying, with TrueNAS CORE 12 being a little more aggressive with power and triggering latent issues...

What's the LSI HBA?
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
ornias said:
Did you check your PSU?
It could be dying, with TrueNAS CORE 12 being a little more aggressive with power and triggering latent issues...

Nope, sure didn't. Thanks, though; I will certainly check "TrueNAS CORE 12 being a little more aggressive with power and triggering latent issues". I had no idea, thank you! That is certainly a possibility; given the spaces I am often dealing with, I have had power issues before. I didn't realize there could be a difference here.
 

ornias

Wizard
Joined
Mar 6, 2020
Messages
1,458
inman.turbo said:
Nope, sure didn't. Thanks, though; I will certainly check "TrueNAS CORE 12 being a little more aggressive with power and triggering latent issues". I had no idea, thank you! That is certainly a possibility; given the spaces I am often dealing with, I have had power issues before. I didn't realize there could be a difference here.
It's very unlikely, but so is an upgrade from 11 to 12 giving the signs you are getting...
 

inman.turbo

Contributor
Joined
Aug 27, 2019
Messages
149
ornias said:
What's the LSI HBA?

So far I've tried with this one:
☁ ~ lspci | grep LSI
65:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)

And this one:

➜ ~ lspci | grep LSI
03:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
 