Errors with HBA when going from 9.10.2-U6 to 11.1-U2&U3

Status
Not open for further replies.

ATCP

Cadet
Joined
Apr 6, 2018
Messages
5
I think I have some errors with my LSI HBA. Everything is fine when I run the system on releases before 11.1, but everything after gives me errors. It looks like my controller randomly disappears and then comes back, and the drives resilver. Here is part of the logs I received under U2 and U3.

Code:
ix0: link state changed to UP
ix0: link state changed to DOWN
ix2: link state changed to DOWN
ix2: link state changed to UP
ix2: link state changed to DOWN
ix2: link state changed to UP
ix2: link state changed to DOWN
ix2: link state changed to UP
ix2: link state changed to DOWN
ix2: link state changed to UP
mps1: IOC Fault 0x40000d04, Resetting
mps1: Reinitializing controller,
mps1: Firmware: 20.00.02.00, Driver: 21.02.00.00-fbsd
mps1: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
mps1: mps_reinit finished sc 0xfffffe0001e84000 post 4 free 3
mps1: SAS Address for SATA device = 859e4e46c8a0d186
mps1: SAS Address from SATA device = 859e4e46c8a0d186
mps1: SAS Address for SATA device = 85984e2fc2c3c484
mps1: SAS Address from SATA device = 85984e2fc2c3c484
mps1: SAS Address for SATA device = 859c5140bdb9ad95
mps1: SAS Address from SATA device = 859c5140bdb9ad95
mps1: SAS Address for SATA device = 859d5f40b2c5af75
mps1: SAS Address from SATA device = 859d5f40b2c5af75
mps1: SAS Address for SATA device = 859e5642a5c3c472
mps1: SAS Address from SATA device = 859e5642a5c3c472
mps1: SAS Address for SATA device = 85985f40bda2d38f
mps1: SAS Address from SATA device = 85985f40bda2d38f
mps1: SAS Address for SATA device = 859a4739a2bfcf92
mps1: SAS Address from SATA device = 859a4739a2bfcf92
mps1: SAS Address for SATA device = 85984e2fc2a2cf6c
mps1: SAS Address from SATA device = 85984e2fc2a2cf6c
mps1: SAS Address for SATA device = 859d5b32a3c3d172
mps1: SAS Address from SATA device = 859d5b32a3c3d172
mps1: SAS Address for SATA device = 859d5b32a3a1c587
mps1: SAS Address from SATA device = 859d5b32a3a1c587
mps1: SAS Address for SATA device = 859d54439fa5b17d
mps1: SAS Address from SATA device = 859d54439fa5b17d
mps1: SAS Address for SATA device = 859d5b32a3b7bd90
mps1: SAS Address from SATA device = 859d5b32a3b7bd90
mps1: SAS Address for SATA device = 859e5b44a5bbaa6f
mps1: SAS Address from SATA device = 859e5b44a5bbaa6f
mps1: SAS Address for SATA device = 859a4739a2c2cd6f
mps1: SAS Address from SATA device = 859a4739a2c2cd6f
mps1: IOC Fault 0x40000d04, Resetting
mps1: Reinitializing controller,
mps1: Firmware: 20.00.02.00, Driver: 21.02.00.00-fbsd
mps1: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
mps1: mps_reinit finished sc 0xfffffe0001e84000 post 4 free 3
(da22:mps1:0:21:0): Invalidating pack
da22 at mps1 bus 0 scbus11 target 21 lun 0
da22: <ATA WDC WD4000FYYZ-0 1K03> s/n WD-WCC136HXY3XJ detached
GEOM_MIRROR: Request failed (error=6). da22p1[READ(offset=45031424, length=12288)]
GEOM_MIRROR: Device swap4: provider da22p1 disconnected.
GEOM_ELI: g_eli_read_done() failed (error=6)g_access(918): provider da22 has error 6 set
mirror/swap4.eli[READ(offset=74575872, length=40960)]
swap_pager: I/O error - pagein failed; blkno 2115363,size 40960, error 6
vm_fault: pager read error, pid 4776 (zfsd)
g_access(918): provider da22 has error 6 set
g_access(918): provider da22 has error 6 set
g_access(918): provider da22 has error 6 set
g_access(918): provider da22 has error 6 set
g_access(918): provider da22 has error 6 set
g_access(918): provider da22 has error 6 set
g_access(918): provider da22 has error 6 set
mps1: SAS Address for SATA device = 859e4e46c8a0d186
mps1: SAS Address from SATA device = 859e4e46c8a0d186
mps1: SAS Address for SATA device = 85984e2fc2c3c484
mps1: SAS Address from SATA device = 85984e2fc2c3c484
mps1: SAS Address for SATA device = 859d5f40b2c5af75
mps1: SAS Address from SATA device = 859d5f40b2c5af75
mps1: SAS Address for SATA device = 859e5642a5c3c472
mps1: SAS Address from SATA device = 859e5642a5c3c472
mps1: SAS Address for SATA device = 859c5140bdb9ad95
mps1: SAS Address from SATA device = 859c5140bdb9ad95
mps1: SAS Address for SATA device = 85985f40bda2d38f
mps1: SAS Address from SATA device = 85985f40bda2d38f
mps1: SAS Address for SATA device = 859a4739a2bfcf92
mps1: SAS Address from SATA device = 859a4739a2bfcf92
mps1: SAS Address for SATA device = 85984e2fc2a2cf6c
mps1: SAS Address from SATA device = 85984e2fc2a2cf6c
mps1: SAS Address for SATA device = 859d5b32a3c3d172
mps1: SAS Address from SATA device = 859d5b32a3c3d172
mps1: SAS Address for SATA device = 859d5b32a3a1c587
mps1: SAS Address from SATA device = 859d5b32a3a1c587
mps1: SAS Address for SATA device = 859d54439fa5b17d
mps1: SAS Address from SATA device = 859d54439fa5b17d
mps1: SAS Address for SATA device = 859d5b32a3b7bd90
mps1: SAS Address from SATA device = 859d5b32a3b7bd90
mps1: SAS Address for SATA device = 859e5b44a5bbaa6f
mps1: SAS Address from SATA device = 859e5b44a5bbaa6f
mps1: SAS Address for SATA device = 859a4739a2c2cd6f
mps1: SAS Address from SATA device = 859a4739a2c2cd6f
(da22:mps1:0:21:0): Periph destroyed
da22 at mps1 bus 0 scbus11 target 21 lun 0
da22: <ATA WDC WD4000FYYZ-0 1K03> Fixed Direct Access SPC-4 SCSI device
da22: Serial Number WD-WCC136HXY3XJ
da22: 600.000MB/s transfers
da22: Command Queueing enabled
da22: 3815447MB (7814037168 512 byte sectors)
mps1: IOC Fault 0x40000d04, Resetting
mps1: Reinitializing controller,
mps1: Firmware: 20.00.02.00, Driver: 21.02.00.00-fbsd
mps1: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
mps1: mps_reinit finished sc 0xfffffe0001e84000 post 4 free 3
mps1: SAS Address for SATA device = 859e4e46c8a0d186
mps1: SAS Address from SATA device = 859e4e46c8a0d186
mps1: SAS Address for SATA device = 85984e2fc2c3c484
mps1: SAS Address from SATA device = 85984e2fc2c3c484
mps1: SAS Address for SATA device = 859c5140bdb9ad95
mps1: SAS Address from SATA device = 859c5140bdb9ad95
mps1: SAS Address for SATA device = 859d5f40b2c5af75
mps1: SAS Address from SATA device = 859d5f40b2c5af75
mps1: SAS Address for SATA device = 859e5642a5c3c472
mps1: SAS Address from SATA device = 859e5642a5c3c472
mps1: SAS Address for SATA device = 85985f40bda2d38f
mps1: SAS Address from SATA device = 85985f40bda2d38f
mps1: SAS Address for SATA device = 859a4739a2bfcf92
mps1: SAS Address from SATA device = 859a4739a2bfcf92
mps1: SAS Address for SATA device = 85984e2fc2a2cf6c
mps1: SAS Address from SATA device = 85984e2fc2a2cf6c
mps1: SAS Address for SATA device = 859d5b32a3c3d172
mps1: SAS Address from SATA device = 859d5b32a3c3d172
mps1: SAS Address for SATA device = 859d5b32a3a1c587
mps1: SAS Address from SATA device = 859d5b32a3a1c587
mps1: SAS Address for SATA device = 859d54439fa5b17d
mps1: SAS Address from SATA device = 859d54439fa5b17d
mps1: SAS Address for SATA device = 859d5b32a3b7bd90
mps1: SAS Address from SATA device = 859d5b32a3b7bd90
mps1: SAS Address for SATA device = 859e5b44a5bbaa6f
mps1: SAS Address from SATA device = 859e5b44a5bbaa6f
mps1: SAS Address for SATA device = 859a4739a2c2cd6f
mps1: SAS Address from SATA device = 859a4739a2c2cd6f


-- End of security output --
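
As a rough aid for reading output like the excerpt above, the controller resets can be tallied per device. This is only a sketch: the function name `count_ioc_faults` is made up for illustration, and it assumes the kernel messages keep the `mpsN: IOC Fault ...` form shown in the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeNAS): count "IOC Fault" resets per
# mps/mpr controller in kernel log text read from stdin.
# Prints one line per controller: "<controller> <number of resets>".
count_ioc_faults() {
    awk '/^mp[rs][0-9]+: IOC Fault/ {
             sub(":", "", $1)          # "mps1:" -> "mps1"
             faults[$1]++
         }
         END { for (c in faults) print c, faults[c] }' | sort
}
```

It can be fed from `dmesg` or from a saved log file through a pipe. Run against the excerpt above, it would report three resets for mps1.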


Do you think I need new firmware on the controller?
I installed 11.1-U2 on March 16th and U3 on March 20th. I went back to 9.10.2-U6 on March 22nd. Since then I have not received any errors; all my logs and scrubs are clean, with no resilvering.

Any other ideas? Try U4? I would like to be on 11.1. (I am planning on getting an additional FreeNAS box and upgrading my network to a 10GBase-T switch.) I am also looking to use some of the new features in 11.1, most notably the Backblaze B2 integration.

(I have been a long time reader of this forum but this is my first post)
Thanks
Michael
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
no system specs, so it's fairly hard to say much (these are considered minimum details by many on these forums)
do you have enough power for the drives/hardware? insufficient power can cause all kinds of weird things
do you have another controller to swap in? bad controller seems the next most likely
 

ATCP

Cadet
Joined
Apr 6, 2018
Messages
5
It doesn't make a lot of sense that it would be a hardware issue seeing as it's a 2 year old box and has run perfectly on older trains (and still does). It just gives me errors on 11.1

The system is a Q30 by 45Drives.
128GB ECC ram
2 Xeon E5 processors
2 LSI controllers (don't know the model) - both are throwing up errors
30 4TB WD Enterprise Drives
750 Watt power supply

If it's not a firmware issue, could it be that 11.1 is making the controller work harder, causing it to overheat?
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
I'm hardly an expert, but plugging a guess at your hardware into a PSU calculator along with 30 10k SAS, I get 731W (https://outervision.com/power-supply-calculator)
perhaps what you are seeing is errors that weren't being reported in the older versions?
if your PSU is just enough for the min power requirements, then things like drive spinups or higher than average CPU usage could easily cause random resets when the power spikes over what the PSU can supply reliably
i had similar errors when I had a smaller PSU, haven't seen them since I both replaced it and reduced the number of drives.
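
To put the two figures above side by side (the 731W calculator estimate against the 750W supply), the headroom can be worked out directly. A minimal sketch; `psu_headroom` is a hypothetical name, and the 731W input is itself only a guess at the hardware:

```shell
#!/bin/sh
# Hypothetical helper: psu_headroom SUPPLY_WATTS LOAD_WATTS
# Prints the spare capacity in watts and as a percentage of the supply rating.
psu_headroom() {
    awk -v s="$1" -v l="$2" 'BEGIN {
        printf "%d W spare (%.1f%% of supply)\n", s - l, (s - l) * 100 / s
    }'
}

psu_headroom 750 731    # prints "19 W spare (2.5% of supply)"
```

That margin leaves very little room for spin-up or CPU load spikes, which is the concern raised here.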
getting the firmware to match the driver is recommended, however you may have issues trying to use older freenas
what driver version does the 9.10 you are using have?

you can see what the adapters are with the driver utils (mptutil, mprutil, mpsutil):
(sudo) mpsutil show adapter
(according to 45drives, you probably have LSI 9305)
should be able to get the driver version with
dmesg | grep mps | grep Driver
or (if you get a lot of errors dmesg might be blank)
(sudo) cat /var/log/dmesg.yesterday | grep Driver
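
The firmware/driver comparison can also be scripted. This is a sketch under assumptions: `check_fw_driver` is a made-up name, it relies on the `Firmware: X, Driver: Y` line format seen in the log earlier in this thread, and it treats only the first version field as the one that should match.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and, for each
# "Firmware: X, Driver: Y" line from an mps/mpr controller, print the
# controller, both versions, and whether the first version fields agree.
check_fw_driver() {
    awk '/^mp[rs][0-9]+: Firmware: / {
             ctrl = $1; sub(":", "", ctrl)     # "mps1:" -> "mps1"
             fw = $3;   sub(",", "", fw)       # drop the trailing comma
             drv = $5
             split(fw, f, "."); split(drv, d, ".")
             status = (f[1] == d[1]) ? "match" : "MISMATCH"
             print ctrl, "firmware=" fw, "driver=" drv, status
         }' | sort -u                          # collapse repeated reset messages
}
```

Piping `dmesg` or /var/log/dmesg.yesterday into it gives one line per controller.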
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
ATCP said:
Do you think I need new firmware on the controller?
Definitely. Versions of P20 prior to 20.00.04 were incredibly buggy.
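
For checking a reported firmware version against that cutoff, a dotted-version comparison is enough. A minimal sketch: `fw_older_than` is a hypothetical name, and writing the cutoff in full as 20.00.04.00 is an assumption about the intended release.

```shell
#!/bin/sh
# Hypothetical helper: fw_older_than A B
# Exit status 0 if dotted version A is older than B, 1 otherwise
# (equal versions count as "not older"). Fields are compared numerically.
fw_older_than() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        for (i = 1; i <= (n > m ? n : m); i++) {
            if (x[i] + 0 < y[i] + 0) exit 0
            if (x[i] + 0 > y[i] + 0) exit 1
        }
        exit 1
    }'
}

# 20.00.02.00 is the firmware reported in the log earlier in the thread.
if fw_older_than 20.00.02.00 20.00.04.00; then
    echo "firmware predates the cutoff"
fi
```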
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
sas2flash -listall
 

ATCP

Cadet
Joined
Apr 6, 2018
Messages
5
Code:
LSI Corporation SAS2 Flash Utility
Version 16.00.00.00 (2013.03.01)
Copyright (c) 2008-2013 LSI Corporation. All rights reserved

        Adapter Selected is a LSI SAS: SAS2116_1(B1)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS       PCI Addr
----------------------------------------------------------------------------

0     SAS2116_1(B1)   20.00.02.00   14.01.00.06   07.39.00.00    00:04:00:00
1     SAS2116_1(B1)   20.00.02.00   14.01.00.06   07.39.00.00    00:82:00:00

        Finished Processing Commands Successfully.
        Exiting SAS2Flash.


I have two SAS9201-16i HBAs, which unfortunately don't appear to be currently supported, and that is making it difficult to find the newest firmware.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194