LSI 9201-16e errors writing disks in JBOD

Status
Not open for further replies.

MasterTacoChief

Explorer
Joined
Feb 20, 2017
Messages
67
I'm trying to use an LSI 9201-16e with a Xyratex HB-1235 JBOD enclosure. I currently have 3x Seagate Constellation ES.2 3TB drives in the enclosure and a single SAS cable between the LSI card and the enclosure (I've tried up to 3x cables). In Windows 7 I can detect all drives, create partitions, and read and write the disks with this hardware.
FreeNAS detects the hardware, but anything that attempts to write to the disks produces a number of errors on the console. Here are some diagnostics:

[root@freenas ~]# camcontrol devlist
<SEAGATE ST33000651SS MS01> at scbus0 target 16 lun 0 (da0,pass0)
<SEAGATE ST33000651SS MS01> at scbus0 target 17 lun 0 (da1,pass1)
<SEAGATE ST33000651SS MS01> at scbus0 target 18 lun 0 (da2,pass2)
<XYRATEX HB-1235-E6EBD 2005> at scbus0 target 19 lun 0 (ses0,pass3)
<HL-DT-ST DVD+-RW GH70N A101> at scbus3 target 0 lun 0 (pass4,cd0)
<Kingston DataTraveler 2.0 PMAP> at scbus8 target 0 lun 0 (pass5,da3)


[root@freenas ~]# sysctl -a | grep mps
device mps
dev.mps.0.encl_table_dump:
dev.mps.0.mapping_table_dump:
dev.mps.0.spinup_wait_time: 3
dev.mps.0.chain_alloc_fail: 0
dev.mps.0.enable_ssu: 1
dev.mps.0.max_io_pages: -1
dev.mps.0.max_chains: 2048
dev.mps.0.chain_free_lowwater: 2047
dev.mps.0.chain_free: 2048
dev.mps.0.io_cmds_highwater: 5
dev.mps.0.io_cmds_active: 0
dev.mps.0.driver_version: 21.01.00.00-fbsd
dev.mps.0.firmware_version: 20.00.07.00
dev.mps.0.disable_msi: 0
dev.mps.0.disable_msix: 0
dev.mps.0.debug_level: 3
dev.mps.0.%parent: pci3
dev.mps.0.%pnpinfo: vendor=0x1000 device=0x0064 subvendor=0x1000 subdevice=0x30d0 class=0x010700
dev.mps.0.%location: slot=0 function=0 dbsf=pci0:3:0:0
dev.mps.0.%driver: mps
dev.mps.0.%desc: Avago Technologies (LSI) SAS2116
dev.mps.%parent:
kstat.zfs.misc.zcompstats.skipped_insufficient_gain: 1
kstat.zfs.misc.zcompstats.empty: 0
kstat.zfs.misc.zcompstats.attempts: 521


smartctl also returns data for each disk:
[root@freenas ~]# smartctl -a /dev/da2 | more
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST33000651SS
Revision: MS01
Compliance: SPC-4
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Logical block size: 512 bytes
Formatted with type 2 protection
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c50034a1d653
Serial number: Z290QZ8G00009129XWST
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Tue Apr 25 19:21:54 2017 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK


Trying to create a zpool, partition the drives manually, etc. results in error messages on the console:

Apr 25 19:41:53 freenas (da2:mps0:0:18:0): WRITE(16). CDB: 8a 00 00 00 00 01 5d 50 a3 8f 00 00 00 20 00 00
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): CAM status: SCSI Status Error
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): SCSI status: Check Condition
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): Field Replaceable Unit: 0
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): Descriptor 0x80: 00 00 00 00 ff 29 00 00 00 00 00 00 00 00
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): Error 22, Unretryable error
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): READ(16). CDB: 88 00 00 00 00 01 5d 50 a3 8e 00 00 00 01 00 00
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): CAM status: SCSI Status Error
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): SCSI status: Check Condition
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): Field Replaceable Unit: 0
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): Descriptor 0x80: 00 00 00 00 ff 29 00 00 00 00 00 00 00 00
Apr 25 19:41:53 freenas (da2:mps0:0:18:0): Error 22, Unretryable error
<read error returned numerous times>
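
For anyone reading the log above, the CDB bytes can be decoded by hand: byte 0 is the opcode, the top three bits of byte 1 are the protect field, bytes 2-9 are the LBA, and bytes 10-13 are the transfer length in blocks. Here is a minimal Python sketch of that decoding. The helper name `decode_rw16` is hypothetical and the function only handles READ(16) and WRITE(16).

```python
def decode_rw16(cdb_hex):
    """Decode a READ(16)/WRITE(16) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] not in (0x88, 0x8A):
        raise ValueError("not a READ(16)/WRITE(16) CDB")
    return {
        "op": "READ(16)" if cdb[0] == 0x88 else "WRITE(16)",
        # RDPROTECT/WRPROTECT field: top 3 bits of byte 1
        "protect": cdb[1] >> 5,
        # 8-byte big-endian logical block address
        "lba": int.from_bytes(cdb[2:10], "big"),
        # 4-byte big-endian transfer length, in blocks
        "blocks": int.from_bytes(cdb[10:14], "big"),
    }


# CDBs copied from the console messages in this post.
write_cmd = decode_rw16("8a 00 00 00 00 01 5d 50 a3 8f 00 00 00 20 00 00")
read_cmd = decode_rw16("88 00 00 00 00 01 5d 50 a3 8e 00 00 00 01 00 00")
print(write_cmd)
print(read_cmd)
```

With the 512-byte logical blocks and 3,000,592,982,016-byte capacity that smartctl reports, the disk has 5,860,533,168 sectors, so both commands target the last few dozen sectors of the disk.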


As far as I can tell from other posts, the P20 firmware is still the latest, the P21 driver should work, and the LSI 9201 (SAS2116) chipset should work fine. The Xyratex enclosure seems like a possible unknown, but it's detected correctly and at least some communication is working, since I can see the drives and read SMART status. I've tried 1x, 2x, and 3x SAS cables and both controllers in the enclosure with the same results.

What am I missing here?
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
The issue is almost certainly not P20 vs P21.

I would say there's almost a 100% chance it's this SAS enclosure doohickey you're using. Can you let us know exactly what hardware you're using (mobo, CPU, amount and type of RAM, etc.) for reference? Is there any way these drives (I realize they're SAS drives) can be hooked up without the enclosure, directly to your HBA? That would confirm or rule out the enclosure as the issue.
 

MasterTacoChief

A bit more searching turned up this article about "type 2 protection", which does show up in the smartctl output in my first post:
http://talesinit.blogspot.cl/2015/11/formatted-with-type-2-protection-huh.html
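
For checking several disks at once, the same line can be picked out of saved smartctl output with a few lines of Python. This is only a sketch: it assumes the exact wording "Formatted with type N protection" seen in the first post, and `protection_type` is a hypothetical helper name.

```python
import re

# Sample lines copied from the smartctl output in the first post.
SAMPLE = """\
Logical block size: 512 bytes
Formatted with type 2 protection
Rotation Rate: 7200 rpm
"""


def protection_type(smartctl_text):
    """Return the protection type number smartctl reports, or 0 if no such line exists."""
    match = re.search(r"Formatted with type (\d) protection", smartctl_text)
    return int(match.group(1)) if match else 0


print(protection_type(SAMPLE))
```

A result of 0 only means the line was not found in the text, which is what one would expect from a disk formatted without protection information.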

I've started the format of one of the drives, which unfortunately will take multiple hours to complete, so I won't have an answer until tomorrow. Looking like a promising solution though. I'll report back on success/failure.

FYI, I have a temporary setup with a Dell T3500, Xeon W3570, 6GB DDR3 ECC RAM (yes I know that is low, but like I said it was only to test the SAS hardware). If all works, this array is moving to my Dell T420 with 56GB RAM (my active FreeNAS server). If this doesn't work, the T420 has a PERC H710 with a few empty bays I can try using with these disks as a sanity check, but it's annoyingly not capable of HBA mode.

Thanks for the help!
 

DrKK

I'm glad you have likely figured it out. I was going to ask what "type 2" protection was.
 

MasterTacoChief

After 16 hours of waiting for the low-level format to complete, I am now able to create a ZFS pool with all three disks. Success!
 

DrKK

One thing, though: the post you linked has little information about what "type 2" protection actually is. I think the thread is now well enough populated with key phrases that anyone else with the problem will find our discussion and solve it, but in the interest of academic honesty, do you have a quick description of what "type 2" protection actually does/did, and why the disks came with it enabled?

I know I'd like to know, as this is the first I've heard of this.
 

MasterTacoChief

I found this whitepaper from Seagate which explains more:
https://www.seagate.com/files/stati...-from-corruption-technology-paper-tp621us.pdf

In essence, each sector on the disk grows by 8 bytes, and those extra 8 bytes hold data used to verify the contents of the sector. This gives what Seagate claims is end-to-end error checking (from the physical medium to the OS). When it is enabled, the disk expects those 8 extra bytes to be sent along with the data. Since something in the data path (LSI driver? FreeBSD?) doesn't support this, the disk doesn't get the data it expects and apparently throws an error. A read would likewise return the extra 8 bytes, which the driver/OS needs to expect and know how to handle.
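
To make the 8 extra bytes concrete: in the T10 protection information layout they are a 2-byte guard tag (a CRC-16 of the sector data, polynomial 0x8BB7), a 2-byte application tag, and a 4-byte reference tag. The Python sketch below builds such a tuple for one 512-byte sector. It is an illustration only, not what the drive or HBA firmware runs. Using the low 32 bits of the LBA as the reference tag and an application tag of 0 are assumptions for the example, and the function names are made up.

```python
import struct


def crc16_t10dif(data):
    """CRC-16 with polynomial 0x8BB7, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protection_info(sector, lba, app_tag=0):
    """Pack guard tag, application tag, and reference tag into 8 big-endian bytes."""
    # Assumption for this sketch: reference tag = low 32 bits of the LBA.
    return struct.pack(">HHI", crc16_t10dif(sector), app_tag, lba & 0xFFFFFFFF)


sector = bytes(512)  # one all-zero 512-byte logical block
pi = protection_info(sector, lba=0x15D50A38F)
print(len(sector) + len(pi))  # bytes stored per sector when protection is enabled
```

A host stack that knows nothing about these fields sends plain 512-byte blocks with no tags, which fits the behaviour seen earlier in the thread.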

Fortunately the protection type can be changed, but only by a low-level format (which takes forever on large drives). Without it, you simply have one less method of error checking (which apparently most drives don't have anyway).

Apparently Windows supports it, since I didn't get any errors there. As far as I can tell this is a Seagate-only feature, but I'm not sure whether other vendors have adopted it. I have a friend who works in enterprise storage hardware design that I'm going to check with.
 

DrKK

Fascinating. I am sure you just saved someone else in the future a big hassle. Well done sir.
 