Hard drive really faulty or just cable?

Status
Not open for further replies.

Joshuah

Cadet
Joined
Dec 19, 2017
Messages
6
I've got an HP ML10 v2 server set up with two volumes (tank + tankbackup).

Tank = 6x 4TB WD Reds in RAIDZ2.

Tankbackup = 4x 4TB WD Reds in RAIDZ1.

The OS is on 2x Samsung 850 SSDs connected to an EST11B PCIe 4-port SATA controller.

Screenshot: http://prntscr.com/hvyf4y

I have tank connected to the mainboard (no problems there at all). The issue is with tankbackup. I am using an LSI RAID card with an external breakout cable to the 4 hard drives, and I am getting a lot of errors. I'm not sure if the issue is with the card, the cable, or the disks.

What is the best way to tell for certain whether the issue is with the disks, the cables, or the card? I have no other machines available to test with. My thought was to unplug the disks and plug them into the 2 spare ports on the EST11B (same controller as the OS disks). However, this did not work: on boot it would detect the two SSDs and the two Red drives and then stop with "system halt".

The data on "tankbackup" is not important right now, so I can wipe it if needed.

Edit: I am running FreeNAS 11.1-RELEASE.

My dmesg output is often flooded with:

Code:
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 ae 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Retrying command
	(da0:mps0:0:2:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 90 00 00 00 10 00 00 length 8192 SMID 259 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
	(da0:mps0:0:2:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 90 00 00 00 10 00 00 length 8192 SMID 427 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 90 00 00 00 10 00 00
	(da0:mps0:0:2:0): READ(10). CDB: 28 00 00 40 02 90 00 00 10 00 length 8192 SMID 260 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b8 80 00 00 10 00 length 8192 SMID 324 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b7 80 00 01 00 00 length 131072 SMID 182 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 ba 90 00 00 00 10 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b6 80 00 01 00 00 length 131072 SMID 180 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): READ(10). CDB: 28 00 00 40 02 90 00 00 10 00
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b5 98 00 00 e8 00 length 118784 SMID 472 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b4 98 00 01 00 00 length 131072 SMID 958 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b8 80 00 00 10 00
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b3 98 00 01 00 00 length 131072 SMID 570 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b2 98 00 01 00 00 length 131072 SMID 178 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b1 98 00 01 00 00 length 131072 SMID 294 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b7 80 00 01 00 00
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b0 98 00 01 00 00 length 131072 SMID 451 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 af 98 00 01 00 00 length 131072 SMID 741 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
	(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 ae 98 00 01 00 00 length 131072 SMID 201 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b6 80 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b5 98 00 00 e8 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b4 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b3 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b2 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b1 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 b0 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 af 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 ae 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
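
A log flood like this is easier to reason about once it is counted. The following is only a rough sketch, not something posted in the thread: it tallies which device the "Retries exhausted" failures name, which mps loginfo codes appear, and the READ/WRITE mix. `SAMPLE_DMESG` and `summarize_cam_errors` are made-up names for illustration, and the sample is a shortened copy of the excerpt above. If every failure names the same device, that would hint at one disk or one lane of the breakout cable rather than the whole card, but it is not proof.

```python
import re
from collections import Counter

# Hypothetical, shortened sample in the same shape as the dmesg excerpt above.
SAMPLE_DMESG = """\
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 ae 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Retrying command
(da0:mps0:0:2:0): READ(16). CDB: 88 00 00 00 00 01 d1 c0 bc 90 00 00 00 10 00 00 length 8192 SMID 259 terminated ioc 804b loginfo 31111000 scsi 0 state c xfer 0
(da0:mps0:0:2:0): WRITE(10). CDB: 2a 00 01 d2 ae 98 00 01 00 00
(da0:mps0:0:2:0): CAM status: CCB request completed with an error
(da0:mps0:0:2:0): Error 5, Retries exhausted
"""


def summarize_cam_errors(text):
    """Return (failed commands per device, loginfo codes, READ/WRITE counts)."""
    # Commands that CAM gave up on, keyed by device name (da0, da1, ...).
    failed = Counter(
        re.findall(r"\((da\d+):mps\d+:[\d:]+\): Error 5, Retries exhausted", text)
    )
    # Controller-side loginfo codes reported by the mps driver.
    loginfo = Counter(re.findall(r"loginfo ([0-9a-f]{8})", text))
    # Every logged READ/WRITE line, from both the CAM and the mps messages.
    ops = Counter(re.findall(r"\): (READ|WRITE)\(\d+\)\. CDB:", text))
    return failed, loginfo, ops


if __name__ == "__main__":
    failed, loginfo, ops = summarize_cam_errors(SAMPLE_DMESG)
    print("failed commands per device:", dict(failed))
    print("loginfo codes:", dict(loginfo))
    print("operations logged:", dict(ops))
```

To use it on a real system, the text could come from the saved output of `dmesg` instead of the sample string.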
 

Joshuah

Cadet
Joined
Dec 19, 2017
Messages
6
Here is what zpool status shows:

Code:
pool: tankbackup
 state: DEGRADED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-JQ
  scan: none requested
config:

	NAME                                            STATE     READ WRITE CKSUM
	tankbackup                                      DEGRADED     0 15.6K     0
	  raidz1-0                                      DEGRADED     0   200     0
	    gptid/e5aaa700-f14a-11e7-b76e-3ca82a4c06b0  ONLINE       0   205     0
	    gptid/e769c48c-f14a-11e7-b76e-3ca82a4c06b0  ONLINE       0     0     0
	    gptid/e8ab58b4-f14a-11e7-b76e-3ca82a4c06b0  ONLINE       0     0     0
	    gptid/ea3e5977-f14a-11e7-b76e-3ca82a4c06b0  FAULTED      0 8.00K     0  too many errors

errors: 15939 data errors, use '-v' for a list
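
To see at a glance which members are actually accumulating errors, the config table can be parsed. This is a minimal sketch under some assumptions: `SAMPLE_STATUS`, `parse_count` and `devices_with_errors` are hypothetical names, the sample repeats the table above, and the `K` suffix is treated as a multiple of 1024 (an assumption that fits the 8.00K figure here). Because values like 15.6K are rounded in the output, the converted numbers are approximate.

```python
# Hypothetical sample repeating the config table from the zpool status output.
SAMPLE_STATUS = """\
  pool: tankbackup
 state: DEGRADED
config:

    NAME                                            STATE     READ WRITE CKSUM
    tankbackup                                      DEGRADED     0 15.6K     0
      raidz1-0                                      DEGRADED     0   200     0
        gptid/e5aaa700-f14a-11e7-b76e-3ca82a4c06b0  ONLINE       0   205     0
        gptid/e769c48c-f14a-11e7-b76e-3ca82a4c06b0  ONLINE       0     0     0
        gptid/e8ab58b4-f14a-11e7-b76e-3ca82a4c06b0  ONLINE       0     0     0
        gptid/ea3e5977-f14a-11e7-b76e-3ca82a4c06b0  FAULTED      0 8.00K     0  too many errors
"""

STATES = {"ONLINE", "DEGRADED", "FAULTED", "OFFLINE", "UNAVAIL", "REMOVED"}
# Assumption: suffixes are powers of 1024.
MULTIPLIERS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_count(token):
    """Convert a counter such as '205' or '8.00K' to an (approximate) integer."""
    if token[-1] in MULTIPLIERS:
        return int(float(token[:-1]) * MULTIPLIERS[token[-1]])
    return int(token)


def devices_with_errors(status_text):
    """Map name -> (state, read, write, cksum) for rows with a non-zero counter."""
    result = {}
    for line in status_text.splitlines():
        parts = line.split()
        # Device rows look like: NAME STATE READ WRITE CKSUM [note...]
        if len(parts) >= 5 and parts[1] in STATES:
            read, write, cksum = (parse_count(p) for p in parts[2:5])
            if read or write or cksum:
                result[parts[0]] = (parts[1], read, write, cksum)
    return result


if __name__ == "__main__":
    for name, row in devices_with_errors(SAMPLE_STATUS).items():
        print(name, row)
```

In this sample only write counters are non-zero, and they sit on two of the four members.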
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Have you checked the firmware on the LSI card? My guess is that you'll have to change the cable, and if the problem remains it will be the card or power.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
What LSI card is it and what firmware is on it?
 

Joshuah

Cadet
Joined
Dec 19, 2017
Messages
6
Thanks for the replies so far.

Code:
07:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 01)


MPT2BIOS-7.17.03.00 (2011.04.25)

http://prntscr.com/hw77h2
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
No, that's the card's PCI option ROM version, though it does suggest old firmware. You want the output of sas2flash -listall.
 

Joshuah

Cadet
Joined
Dec 19, 2017
Messages
6
Thanks for that.

Code:
root@freenas:~ # sas2flash -listall
LSI Corporation SAS2 Flash Utility
Version 16.00.00.00 (2013.03.01)
Copyright (c) 2008-2013 LSI Corporation. All rights reserved

	Adapter Selected is a LSI SAS: SAS2308_2(B0)

Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2308_2(B0)   09.00.02.00    09.01.00.03    07.17.03.00     00:07:00:00

	Finished Processing Commands Successfully.
	Exiting SAS2Flash.
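
For anyone checking several cards, the "FW Ver" column can be pulled out and compared programmatically. This is a hedged sketch only: `SAMPLE_LISTALL`, `firmware_versions` and `TARGET_FW` are hypothetical names, and the target of 20.00.07.00 is an assumed example value, since the thread does not say which version to flash.

```python
import re

# Hypothetical sample repeating the adapter table from the output above.
SAMPLE_LISTALL = """\
Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
----------------------------------------------------------------------------

0  SAS2308_2(B0)   09.00.02.00    09.01.00.03    07.17.03.00     00:07:00:00
"""

# Assumed example target; the thread itself does not name a version.
TARGET_FW = (20, 0, 7, 0)

# Adapter rows start with an index, then the controller name, then FW Ver.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\d+\.\d+\.\d+\.\d+)\s", re.MULTILINE)


def firmware_versions(text):
    """Map controller name -> firmware version as a tuple of ints."""
    return {
        m.group(2): tuple(int(part) for part in m.group(3).split("."))
        for m in ROW.finditer(text)
    }


if __name__ == "__main__":
    for ctlr, fw in firmware_versions(SAMPLE_LISTALL).items():
        verdict = "older than target" if fw < TARGET_FW else "at or above target"
        print(ctlr, ".".join(f"{n:02d}" for n in fw), verdict)
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.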
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Yeah, definitely update that firmware to the latest.
 