SOLVED SCSI Errors & unable to Wipe Drives (9240/9211) w/ IBM/Seagate Disks

Status
Not open for further replies.

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
My New Build:
SUPERMICRO 846E1-R900B CSE-846
X8DTE-F,
2x L5520,
64GB RAM,
24x 3TB SAS,
Chelsio 10GB 2-Port PCI-e OPT Adapter Card 110-1088-30
MegaRAID SAS 9240-8i (Flashed to 9211 IT mode)
2x 8GB USB Thumb drives

FreeNAS-9.10-STABLE-201605021851

Preface:
Please point it out plainly if you feel I have missed something obvious; my ego would rather just fix this.
I have partly resolved an issue with booting from USB devices: I was unable to boot from USB at all until I disabled the Option ROM for every slot other than the SAS controller.
Now I can boot from them, but ONLY AFTER I attempt to boot, get an error about no operating system being found, reseat the thumb drives, see the message "GRUB", and reset the system. I have to repeat this process whenever I reboot at the moment.

Problems:
During Boot-up I see the following errors:

(da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 01 5d 50 a3 71 00 00 00 04 00 00
(da0:mps0:0:8:0): CAM status: SCSI Status Error
(da0:mps0:0:8:0): SCSI status: Check Condition
(da0:mps0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
(da0:mps0:0:8:0): Field Replaceable Unit: 0
(da0:mps0:0:8:0): Descriptor 0x80: 00 00 05 20 00 00 ff ff ff ff ff ff 00 00
(da0:mps0:0:8:0): Error 22, Unretryable error
(da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 01 5d 50 a3 af 00 00 00 01 00 00
(da0:mps0:0:8:0): CAM status: SCSI Status Error
(da0:mps0:0:8:0): SCSI status: Check Condition
(da0:mps0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
(da0:mps0:0:8:0): Field Replaceable Unit: 0
(da0:mps0:0:8:0): Descriptor 0x80: 00 00 05 20 00 00 ff ff ff ff ff ff 00 00
(da0:mps0:0:8:0): Error 22, Unretryable error
(da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 01 5d 50 a3 ae 00 00 00 01 00 00
(da0:mps0:0:8:0): CAM status: SCSI Status Error
(da0:mps0:0:8:0): SCSI status: Check Condition
(da0:mps0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
(da0:mps0:0:8:0): Field Replaceable Unit: 0
(da0:mps0:0:8:0): Descriptor 0x80: 00 00 05 20 00 00 ff ff ff ff ff ff 00 00
(da0:mps0:0:8:0): Error 22, Unretryable error
(da0:mps0:0:8:0): READ(16). CDB: 88 00 00 00 00 01 5d 50 a3 ae 00 00 00 01 00 00
(da0:mps0:0:8:0): CAM status: SCSI Status Error
(da0:mps0:0:8:0): SCSI status: Check Condition
(da0:mps0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
(da0:mps0:0:8:0): Field Replaceable Unit: 0
(da0:mps0:0:8:0): Descriptor 0x80: 00 00 05 20 00 00 ff ff ff ff ff ff 00 00
(da0:mps0:0:8:0): Error 22, Unretryable error

(These would lead me to believe this is a driver issue.)
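
For reference, the failing commands can be decoded by hand. The sketch below is not from the thread: it assumes the standard READ(16) CDB layout (opcode 0x88 in byte 0, an 8-byte big-endian LBA in bytes 2-9, a 4-byte transfer length in bytes 10-13) and a shell with 64-bit arithmetic. The function name decode_read16 is hypothetical.

```shell
#!/bin/sh
# Decode a READ(16) CDB as printed in the kernel log.
# Assumed layout: byte 0 = opcode (0x88), bytes 2-9 = LBA (big-endian),
# bytes 10-13 = transfer length in blocks.
decode_read16() {
    # Split the space-separated hex bytes into positional parameters.
    set -- $1
    if [ "$#" -ne 16 ] || [ "$1" != "88" ]; then
        echo "not a READ(16) CDB" >&2
        return 1
    fi
    lba=$((0x${3}${4}${5}${6}${7}${8}${9}${10}))
    len=$((0x${11}${12}${13}${14}))
    echo "lba=$lba blocks=$len"
}

# The first two CDBs from the boot log above.
decode_read16 "88 00 00 00 00 01 5d 50 a3 71 00 00 00 04 00 00"
decode_read16 "88 00 00 00 00 01 5d 50 a3 af 00 00 00 01 00 00"
```

This prints lba=5860533105 blocks=4 and lba=5860533167 blocks=1. The second address is the last logical block of the disk as reported by sg_readcap later in the thread, so these look like reads near the end of the disk rather than reads of random data blocks.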

All 24 of the drives are showing as having 600.00MB/s throughput.
(This would lead me to believe this is a drive issue.)

I am unable to wipe the drives; I get a permissions error.
I've attempted to run the following command after reading some old posts, to no avail.
dd if=/dev/zero of=/dev/da0 bs=512 count=1
1+0 records in
1+0 records out
512 bytes transferred in 1.689685 secs (303 bytes/sec)
(SCSI Errors Followed.)
Here are some command results to maybe shed some light on this...

LSI Corporation SAS2 IR Configuration Utility.
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved.

Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
Controller type : SAS2008
BIOS version : 7.39.02.00
Firmware version : 20.00.07.00
Channel description : 1 Serial Attached SCSI
Initiator ID : 0
Maximum physical devices : 255
Concurrent commands supported : 3432
Slot : 6
Segment : 0
Bus : 8
Device : 0
Function : 0
RAID Support : No
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #0

Device is a Hard disk
Enclosure # : 2
Slot # : 0
SAS Address : 5000c50-0-83e3-0fc5
State : Ready (RDY)
Size (in MB)/(in sectors) : 2861588/5860533167
Manufacturer : IBM-ESXS
Model Number : ST3000NM0023
Firmware Revision : BC5E
Serial No : Z1Z9201A0211BC5E
GUID : N/A
Protocol : SAS
Drive Type : SAS_HDD
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I'm far from knowledgeable on SAS, but that type of error often indicates a cable or connector problem.
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
CAM errors usually indicate either a bad drive, connection or cable. Look at that first. Can you post the SMART output from da0?
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
Robert Trevellyan said: "I'm far from knowledgeable on SAS, but that type of error often indicates a cable or connector problem."
I was kinda afraid of this; I lack any formal training when it comes to the standards for these things. Now to try and find a "High Quality Cable".

m0nkey_ said: "CAM errors usually indicate either a bad drive, connection or cable. Look at that first. Can you post the SMART output from da0?"
I will also do this tomorrow when I am in front of it again.
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
I've gone back and reseated the cables. I've also taken pictures of the cables and identified them as a pair of Supermicro CBL-0281L cables; I'm not sure whether other people have reported issues with these in the past.

Reseating the cables did not resolve the issue.

I doubt these photos are going to progress this any further but "when in doubt document!"
[Attached photos: oKF3yYh.jpg, T8FH50s.jpg]
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Have you tried booting FreeNAS 9.3-STABLE instead of 9.10?
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
I have not, I can try it to see if I get different results... but it feels like a failure to identify the issue at hand and something I could stumble into if I upgrade at some point.
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
Seggr said: "I have not, I can try it to see if I get different results... but it feels like a failure to identify the issue at hand and something I could stumble into if I upgrade at some point."
Yes, but it wouldn't hurt to try the older version. Some folks are having problems with 9.10. Also, you refer to motherboard X8DTE-F above, but the link is to the X8DT6-F board which includes a built-in LSI SAS controller. Which motherboard are you actually working with?
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
Sorry, I have the X8DTE-F; the two boards share the same page/manual/drivers.
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
Unfortunately, after installing 9.3 I am still getting the same errors, so I've gone ahead and ordered replacement cables; thankfully they should be in tomorrow for me to try.

[Attached image: o0tnzZD.png]
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
So I also attempted to wipe the drives again.
[root@freenas ~]# dd if=/dev/zero of=/dev/da0 bs=512 count=1
dd: /dev/da0: Operation not permitted
[root@freenas ~]#

m0nkey_ said: "CAM errors usually indicate either a bad drive, connection or cable. Look at that first. Can you post the SMART output from da0?"

Sorry! I forgot to collect this earlier, but here it is:
[root@freenas ~]# smartctl -H /dev/da0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p5 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
[root@freenas ~]#
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
OK, so this is a problem independent of FreeNAS.

I have attempted to wipe the drives with third-party tools, and they failed. (I tried KillDisk and it failed; I am in the process of loading another tool.)
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
Can you provide a full output: smartctl -a /dev/da0
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
m0nkey_ said: "Can you provide a full output: smartctl -a /dev/da0"

smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: IBM-ESXS
Product: ST3000NM0023
Revision: BC5E
Compliance: SPC-4
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Logical block size: 512 bytes
Formatted with type 2 protection
LB provisioning type: unreported, LBPME=0, LBPRZ=0
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c50083e30fc7
Serial number: Z1Z9201A0000R547VMEA
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Fri May 20 13:41:59 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature: 63 C
Drive Trip Temperature: 65 C

Elements in grown defect list: 0

Vendor (Seagate) cache information
Blocks sent to initiator = 0

Vendor (Seagate/Hitachi) factory information
number of hours powered up = 222.57
number of minutes until next internal SMART test = 58

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   1087995026        0         0  1087995026          0       2363.465           0
write:           0        0         0           0          0        541.667           0
verify:    7371927        0         0     7371927          0          2.230           0

Non-medium error count: 15

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -       2                 - [-   -    -]

Long (extended) Self Test duration: 26000 seconds [433.3 minutes]



******* I found this after doing some googling in relation to the mention of "Protection"*******

http://talesinit.blogspot.com/2015/11/formatted-with-type-2-protection-huh.html
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
"You are very close to voiding your warranty. That drive should not be 63C."
I don't think that is an accurate reading, but I will check. (I am already adding another cooling unit to the data closet; it comes in the mail later this week.)
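
As a rough way to check every drive rather than one reading, the two temperature lines can be pulled out of the smartctl output. This is a minimal sketch, not something posted in the thread: it assumes the "Current Drive Temperature" and "Drive Trip Temperature" lines appear as in the SAS output above, and the function name temp_margin is hypothetical.

```shell
#!/bin/sh
# Read `smartctl -a` output for a SAS disk on stdin and report how far the
# current temperature is below the drive's trip temperature.
temp_margin() {
    awk '
        /^Current Drive Temperature:/ { cur = $4 }
        /^Drive Trip Temperature:/    { trip = $4 }
        END {
            if (cur == "" || trip == "") {
                print "temperature not reported"
                exit 1
            }
            printf "current=%dC trip=%dC margin=%dC\n", cur, trip, trip - cur
        }'
}

# Sample values taken from the smartctl output posted above.
printf 'Current Drive Temperature: 63 C\nDrive Trip Temperature: 65 C\n' | temp_margin
```

For the values posted above this prints current=63C trip=65C margin=2C. On a live system the same function could be fed with smartctl -a /dev/da0 | temp_margin, repeated for each disk.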

That said, I have been successful in formatting a pair of disks after following the steps listed in the link I mentioned in my last post.

STUPID DISK:
[root@freenas ~]# sg_readcap -l /dev/da2
Read Capacity results:
Protection: prot_en=1, p_type=1, p_i_exponent=0 [type 2 protection]
Logical block provisioning: lbpme=0, lbprz=0
Last logical block address=5860533167 (0x15d50a3af), Number of logical blocks=5860533168
Logical block length=512 bytes
Logical blocks per physical block exponent=0
Lowest aligned logical block address=0
Hence:
Device size: 3000592982016 bytes, 2861588.5 MiB, 3000.59 GB

FIXED DISK:
[root@freenas ~]# sg_readcap -l /dev/da0
Read Capacity results:
Protection: prot_en=0, p_type=0, p_i_exponent=0
Logical block provisioning: lbpme=0, lbprz=0
Last logical block address=5860533167 (0x15d50a3af), Number of logical blocks=5860533168
Logical block length=512 bytes
Logical blocks per physical block exponent=0
Lowest aligned logical block address=0
Hence:
Device size: 3000592982016 bytes, 2861588.5 MiB, 3000.59 GB

HOW I FIXED IT:
[root@freenas ~]# sg_format --format --fmtpinfo=0 /dev/da2
IBM-ESXS ST3000NM0023 BC5E peripheral_type: disk [0x0]
<< supports protection information>>
Unit serial number: Z1Z8JAZK0000R536KZEF
LU name: 5000c500837495bf
Mode Sense (block descriptor) data, prior to changes:
<<< longlba flag set (64 bit lba) >>>
Number of blocks=5860533168 [0x15d50a3b0]
Block size=512 [0x200]

A FORMAT will commence in 15 seconds
ALL data on /dev/da2 will be DESTROYED
Press control-C to abort

WHAT STILL IS UPSETTING:
I can only do one disk at a time, and this takes freaking forever.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
Seggr said: "I don't think that is an accurate reading..."
It's from the drive's own sensor.
Seggr said: "I can only do one disk at a time and this take for freaking ever."
Either log in with multiple SSH sessions, or use tmux, or go to Storage | View Disks and select a disk, then click the Wipe button.
 

Seggr

Dabbler
Joined
May 16, 2016
Messages
12
Robert Trevellyan said: "It's from the drive's own sensor. Either login with multiple SSH sessions, or use tmux, or go to Storage | View Disks and select a disk, then click the Wipe button."
I am compelled to respond to this message but I would like you to know that you should probably work on your manners or at least evaluate if your comments are constructive before hitting the post button.

I am probably going to write a script to run the commands one after another, as running them at the same time results in each consecutive wipe taking longer and longer to complete. I would also like to mention that if the Wipe button worked, this thread would not exist.
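
A script along those lines might look like the sketch below. It is not the script the poster actually wrote: it reuses only the sg_readcap -l and sg_format --format --fmtpinfo=0 invocations shown earlier in the thread, and it assumes sg_format does not return until the format has finished (believed to be its default behaviour unless an early-exit option is given, but verify on your system). The function names has_protection and format_all and the device list in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: reformat, one at a time, every listed disk that still reports
# protection information. THIS DESTROYS ALL DATA on the disks it formats.

# Succeeds if `sg_readcap -l` output on stdin shows protection enabled.
has_protection() {
    grep -q 'prot_en=1'
}

# Format each device given as an argument, strictly one after another.
format_all() {
    for dev in "$@"; do
        if sg_readcap -l "$dev" | has_protection; then
            echo "formatting $dev"
            sg_format --format --fmtpinfo=0 "$dev" || echo "format failed: $dev" >&2
        else
            echo "skipping $dev (no protection information)"
        fi
    done
}

# Example (hypothetical device list, adjust before use):
# format_all /dev/da0 /dev/da1 /dev/da2
```

Disks that already report prot_en=0 are skipped, so the script can be rerun after an interruption without reformatting the disks that are already fixed. sg_format's own 15-second abort countdown still applies to each disk.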
 