First Time W/ JBOD

webdawg

Contributor
Joined
May 25, 2016
Messages
112
I purchased a used SuperMicro CSE-836E16-R92JBD 3U LFF 16 BAY 3.5 JBOD BPN-SAS2-836EL1

I think the backplane is bad in my JBOD. I checked the cables, the card, and the backplane jumpers. I have just purchased another card and another JBOD, and I have a new spare cable. This is what happens:

------------

I am doing some testing/burn-in. I boot into SystemRescueCd 7.01 and run nwipe. I have done this 3-4 times. After about 12 hours I get this:

[Tue Apr 20 06:40:52 2021] sd 5:0:5:0: [sdf] tag#436 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Tue Apr 20 06:40:52 2021] sd 5:0:5:0: [sdf] tag#436 Sense Key : Illegal Request [current] [descriptor]
[Tue Apr 20 06:40:52 2021] sd 5:0:5:0: [sdf] tag#436 Add. Sense: Invalid field in cdb
[Tue Apr 20 06:40:52 2021] sd 5:0:5:0: [sdf] tag#436 CDB: Write(32)
[Tue Apr 20 06:40:52 2021] sd 5:0:5:0: [sdf] tag#436 CDB[00]: 7f 00 00 00 00 00 00 18 00 0b 20 00 00 00 00 01
[Tue Apr 20 06:40:52 2021] sd 5:0:5:0: [sdf] tag#436 CDB[10]: ce b3 7b 00 ce b3 7b 00 00 00 00 00 00 00 04 00
[Tue Apr 20 06:40:52 2021] blk_update_request: critical target error, dev sdf, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354528, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354529, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354530, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354531, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354532, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354533, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354534, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354535, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354536, lost async page write
[Tue Apr 20 06:40:52 2021] Buffer I/O error on dev sdf, logical block 970354537, lost async page write
[Tue Apr 20 06:43:26 2021] sd 5:0:2:0: [sdc] tag#1605 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Tue Apr 20 06:43:26 2021] sd 5:0:2:0: [sdc] tag#1605 Sense Key : Illegal Request [current] [descriptor]
[Tue Apr 20 06:43:26 2021] sd 5:0:2:0: [sdc] tag#1605 Add. Sense: Invalid field in cdb
[Tue Apr 20 06:43:26 2021] sd 5:0:2:0: [sdc] tag#1605 CDB: Write(32)
[Tue Apr 20 06:43:26 2021] sd 5:0:2:0: [sdc] tag#1605 CDB[00]: 7f 00 00 00 00 00 00 18 00 0b 20 00 00 00 00 01
[Tue Apr 20 06:43:26 2021] sd 5:0:2:0: [sdc] tag#1605 CDB[10]: ce b3 7b 00 ce b3 7b 00 00 00 00 00 00 00 04 00
[Tue Apr 20 06:43:26 2021] blk_update_request: critical target error, dev sdc, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0
[Tue Apr 20 06:43:26 2021] buffer_io_error: 118 callbacks suppressed
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354528, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354529, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354530, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354531, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354532, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354533, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354534, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354535, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354536, lost async page write
[Tue Apr 20 06:43:26 2021] Buffer I/O error on dev sdc, logical block 970354537, lost async page write
[Tue Apr 20 06:48:42 2021] sd 5:0:1:0: [sdb] tag#1023 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Tue Apr 20 06:48:42 2021] sd 5:0:1:0: [sdb] tag#1023 Sense Key : Illegal Request [current] [descriptor]
[Tue Apr 20 06:48:42 2021] sd 5:0:1:0: [sdb] tag#1023 Add. Sense: Invalid field in cdb
[Tue Apr 20 06:48:42 2021] sd 5:0:1:0: [sdb] tag#1023 CDB: Write(32)
[Tue Apr 20 06:48:42 2021] sd 5:0:1:0: [sdb] tag#1023 CDB[00]: 7f 00 00 00 00 00 00 18 00 0b 20 00 00 00 00 01
[Tue Apr 20 06:48:42 2021] sd 5:0:1:0: [sdb] tag#1023 CDB[10]: ce b3 7b 00 ce b3 7b 00 00 00 00 00 00 00 04 00
[Tue Apr 20 06:48:42 2021] blk_update_request: critical target error, dev sdb, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0
[Tue Apr 20 06:48:42 2021] buffer_io_error: 118 callbacks suppressed
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354528, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354529, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354530, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354531, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354532, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354533, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354534, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354535, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354536, lost async page write
[Tue Apr 20 06:48:42 2021] Buffer I/O error on dev sdb, logical block 970354537, lost async page write
[Tue Apr 20 06:49:04 2021] sd 5:0:4:0: [sde] tag#1414 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Tue Apr 20 06:49:04 2021] sd 5:0:4:0: [sde] tag#1414 Sense Key : Illegal Request [current] [descriptor]
[Tue Apr 20 06:49:04 2021] sd 5:0:4:0: [sde] tag#1414 Add. Sense: Invalid field in cdb
[Tue Apr 20 06:49:04 2021] sd 5:0:4:0: [sde] tag#1414 CDB: Write(32)
[Tue Apr 20 06:49:04 2021] sd 5:0:4:0: [sde] tag#1414 CDB[00]: 7f 00 00 00 00 00 00 18 00 0b 20 00 00 00 00 01
[Tue Apr 20 06:49:04 2021] sd 5:0:4:0: [sde] tag#1414 CDB[10]: ce b3 7b 00 ce b3 7b 00 00 00 00 00 00 00 04 00
[Tue Apr 20 06:49:04 2021] blk_update_request: critical target error, dev sde, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0
[Tue Apr 20 06:49:04 2021] buffer_io_error: 118 callbacks suppressed
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354528, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354529, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354530, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354531, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354532, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354533, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354534, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354535, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354536, lost async page write
[Tue Apr 20 06:49:04 2021] Buffer I/O error on dev sde, logical block 970354537, lost async page write
[Tue Apr 20 06:52:08 2021] sd 5:0:3:0: [sdd] tag#157 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Tue Apr 20 06:52:08 2021] sd 5:0:3:0: [sdd] tag#157 Sense Key : Illegal Request [current] [descriptor]
[Tue Apr 20 06:52:08 2021] sd 5:0:3:0: [sdd] tag#157 Add. Sense: Invalid field in cdb
[Tue Apr 20 06:52:08 2021] sd 5:0:3:0: [sdd] tag#157 CDB: Write(32)
[Tue Apr 20 06:52:08 2021] sd 5:0:3:0: [sdd] tag#157 CDB[00]: 7f 00 00 00 00 00 00 18 00 0b 20 00 00 00 00 01
[Tue Apr 20 06:52:08 2021] sd 5:0:3:0: [sdd] tag#157 CDB[10]: ce b3 7b 00 ce b3 7b 00 00 00 00 00 00 00 04 00
[Tue Apr 20 06:52:08 2021] blk_update_request: critical target error, dev sdd, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0
[Tue Apr 20 06:52:08 2021] buffer_io_error: 118 callbacks suppressed
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354528, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354529, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354530, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354531, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354532, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354533, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354534, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354535, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354536, lost async page write
[Tue Apr 20 06:52:08 2021] Buffer I/O error on dev sdd, logical block 970354537, lost async page write
[Tue Apr 20 06:53:23 2021] sd 5:0:0:0: [sda] tag#1619 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Tue Apr 20 06:53:23 2021] sd 5:0:0:0: [sda] tag#1619 Sense Key : Illegal Request [current] [descriptor]
[Tue Apr 20 06:53:23 2021] sd 5:0:0:0: [sda] tag#1619 Add. Sense: Invalid field in cdb
[Tue Apr 20 06:53:23 2021] sd 5:0:0:0: [sda] tag#1619 CDB: Write(32)
[Tue Apr 20 06:53:23 2021] sd 5:0:0:0: [sda] tag#1619 CDB[00]: 7f 00 00 00 00 00 00 18 00 0b 20 00 00 00 00 01
[Tue Apr 20 06:53:23 2021] sd 5:0:0:0: [sda] tag#1619 CDB[10]: ce b3 7b 00 ce b3 7b 00 00 00 00 00 00 00 04 00
[Tue Apr 20 06:53:23 2021] blk_update_request: critical target error, dev sda, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0
[Tue Apr 20 06:53:23 2021] buffer_io_error: 118 callbacks suppressed
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354528, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354529, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354530, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354531, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354532, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354533, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354534, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354535, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354536, lost async page write
[Tue Apr 20 06:53:23 2021] Buffer I/O error on dev sda, logical block 970354537, lost async page write


I also did some testing after the failure:

[root@sysrescue ~]# dd if=/dev/sdb of=/dev/sda bs=64M status=progress oflag=direct
dd: error writing '/dev/sda': Remote I/O error
1+0 records in
0+0 records out
0 bytes copied, 0.473331 s, 0.0 kB/s

and

dd if=/dev/zero of=/dev/sda bs=64M status=progress oflag=direct
dd: error writing '/dev/sda': Remote I/O error
1+0 records in
0+0 records out
0 bytes copied, 0.447416 s, 0.0 kB/s


While the system was running, I just switched my 6G external mini-SAS (SFF-8088 to SFF-8088) cable to a different port on the back of the JBOD. It re-detected the drives, and I am going to leave it running to see if it errors out again. I wonder if I have a bad SAS port on the backplane, or a bad SFF-8087 to SFF-8088 cable inside the unit.

[Tue Apr 20 14:09:20 2021] ses 5:0:13:0: Attached Enclosure device
[Tue Apr 20 14:09:20 2021] ses 5:0:13:0: Attached scsi generic sg6 type 13
[Tue Apr 20 14:09:20 2021] sd 5:0:7:0: [sdj] Attached SCSI disk
[Tue Apr 20 14:09:20 2021] sd 5:0:8:0: [sdk] Attached SCSI disk
[Tue Apr 20 14:09:20 2021] sd 5:0:11:0: [sdn] Attached SCSI disk
[Tue Apr 20 14:09:20 2021] sd 5:0:10:0: [sdm] Attached SCSI disk
[Tue Apr 20 14:09:20 2021] sd 5:0:9:0: [sdl] Attached SCSI disk
[Tue Apr 20 14:09:20 2021] sd 5:0:12:0: [sdo] Attached SCSI disk
[Tue Apr 20 14:09:46 2021] sdi: sdi1 sdi2 sdi3
[Tue Apr 20 14:09:46 2021] sdg: sdg1 sdg2



I am going to keep testing. This is my first time with any external SAS JBOD, and I wanted to ask the FreeNAS/TrueNAS folks whether you have any experience you can lend me. Everything is on an APC enterprise UPS that lasts practically forever; it's all pretty enterprise-grade. The thing that struck me is that I was able to move to a different SAS port on the backplane, and the backplane is still working. I have had issues with bad backplanes before, where all the LEDs go red and so on, but this backplane keeps working.

This is my first time working with these drives and this type of enclosure. These are IBM drives, model ST4000NM0043. There is an error counter log at the end of each drive's smartctl -a report. It looks like this:

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/      errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   3122255856        0         0  3122255856          0     739683.910           0
write:           0        0         0           0          0     705806.176           0
verify: 1675063088        0         0  1675063088          0      14960.613           0

Non-medium error count: 32


I don't have a clue what it means. I was going to log the current values for each drive at the beginning of this test and see if they change at the end. They look like internal statistics for the drive's enterprise-grade error handling, as opposed to the cabling-type errors you might see logged on another drive; I don't remember exactly what those look like.
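To track whether those counters move during a burn-in, one approach (a sketch, assuming the drives are sda..sdf and that smartctl prints the section headers shown above) is to snapshot just the error-counter section per drive before and after the run, then diff the pairs:

```shell
# Pull just the "Error counter log" section out of a full `smartctl -a`
# report, so per-drive snapshots can be diffed before/after a test run.
extract_counters() {
    sed -n '/^Error counter log:/,/^Non-medium error count/p'
}

# Usage sketch (device names are an assumption; adjust to your system):
#   for d in /dev/sd{a..f}; do
#       smartctl -a "$d" | extract_counters > "$(basename "$d")-before.txt"
#   done
#   ...run nwipe, repeat with -after.txt, then diff each before/after pair.
```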

Let me know what you think if you get a chance.

PS, I also checked the LEDs on the backplane, and did not see anything out of the ordinary.
 

firesyde424

Contributor
Joined
Mar 5, 2019
Messages
155
That looks very much like something in the storage path is bad or possibly not connected properly. The "IBM" drives you mention are actually 4TB Seagate Constellation ES.3 6Gb/s SAS drives. I have 300 of those drives running inside 5 different 60-bay MD3060e JBODs, all attached to servers running FreeNAS or TrueNAS. When we've seen issues like this in the past, it has usually been a backplane; in one case we did have a bad cable.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
Thanks for the advice. I just reseated all the cables and used a different port on the backplane and on the card.

I thought the model number looked like a Seagate. I have a ton of these drives, but I did not Google the model before posting.

I have another card and an entire new JBOD unit coming.

Should there be any reason a SAS9200-8E and a BPN-SAS2-836EL1 would not be compatible?

This is the cable I am using: https://www.amazon.com/gp/product/B01KH9OPMM/ - SFF-8088 to SFF-8088

I use 10Gtek products all the time. As I diagnose, I will post more information; I was just checking for some weird incompatibility I would not know about.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
So I started another 'test', and got the same results.

Last time, to reset the test, I just removed the SFF-8088 cable from the card and put it in the card's second port. It redetected all the drives (with different drive letters), and I started nwipe again. All the drives froze again.

Next (this morning), while everything was still frozen, I put another HD in the JBOD; Linux detected it, and I can read/write from it.

My next trick is to try the newest version of SystemRescue, then a new cable, then a new JBOD, and then a new card.

I wish I had some more logs to look at.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
One thing I noticed this morning is that the test always fails at the same sector:

blk_update_request: critical target error, dev sde, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0

The last set of tests did too.
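For reference, the dmesg "sector" is in 512-byte units, while the "Buffer I/O error" lines in these traces count 4 KiB logical blocks, so the two numbers describe the same spot on the disk. A quick sanity check of that arithmetic:

```shell
# dmesg reports sectors in 512-byte units; the "Buffer I/O error" lines here
# count 4 KiB logical blocks, so logical block = sector / 8.
sector=7762836224
echo "byte offset:   $(( sector * 512 ))"   # 3974572146688 (~3.6 TiB in)
echo "logical block: $(( sector / 8 ))"     # 970354528, matching the traces
```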
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
I installed a new JBOD (same model) and am getting the same error. I am double-checking firmware this morning, and then going to try different drives.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
I think I figured it out, I will update after I attempt to fix.

=== START OF INFORMATION SECTION ===
Vendor: IBM-XIV
Product: ST4000NM0043 C1
Revision: EC5C
Compliance: SPC-4
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Logical block size: 512 bytes
Formatted with type 2 protection
8 bytes of protection information per logical block
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c50058f38a63
Serial number: xxx
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Sat Apr 24 13:33:49 2021 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled


The key here:

Formatted with type 2 protection

*http://talesinit.blogspot.com/2015/11/formatted-with-type-2-protection-huh.html
*https://www.seagate.com/files/stati...-from-corruption-technology-paper-tp621us.pdf
*https://redmine.ixsystems.com/issues/26746
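As a quick check of whether protection information is currently enabled on a drive, READ CAPACITY (16) reports the prot_en and p_type fields. A small helper to interpret them (a sketch, assuming sg3_utils is installed; note the reported p_type is the protection type minus one):

```shell
# Interpret the prot_en / p_type fields printed by `sg_readcap --long`.
# The reported p_type is the protection type minus one, so
# "prot_en=1, p_type=1" means Type 2 protection is active.
pi_status() {
    grep -o 'prot_en=[01], p_type=[0-9]' |
    awk -F'[=,]' '{ if ($2 == 0) print "PI disabled";
                    else print "Type " $4 + 1 " protection" }'
}

# Usage: sg_readcap --long /dev/sda | pi_status
```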

I am going to bulk wipe some SATA drives, and see what happens.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
I just completed a test with 4x 4TB WD Reds, and everything appears to be working fine.

I am going to look into what Type 2 protection does, and why it would cause failures in a drive test under Linux.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
I disabled Type 2 protection, but it did not help. I am trying a new card (same model) this morning.
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
I think I figured it out, I will update after I attempt to fix.

=== START OF INFORMATION SECTION ===
Vendor: IBM-XIV
Product: ST4000NM0043 C1
Revision: EC5C
Compliance: SPC-4
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Logical block size: 512 bytes
Formatted with type 2 protection
8 bytes of protection information per logical block
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c50058f38a63
Serial number: xxx
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Sat Apr 24 13:33:49 2021 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled

The drive(s) you listed here are formatted with 520-byte sectors. You need to reformat all of them to 512-byte sectors to be usable in your TrueNAS system.

You'll need to run the following command on each of those drives. Replace 'daX' with whatever the drive label is on your system:
Code:
sg_format -v --format --size=512 /dev/daX
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
I actually ran

Code:
sg_format --verbose --format --ffmt=1 --fmtpinfo=0 --quick --wait /dev/sda


I was trying to see if a fast format would work, but these drives do not support it. The command did complete on each drive, hours later.

Then I opened nwipe and started it, and it failed again. The error messages look different:

Code:
[325889.208541] sd 5:0:25:0: [sdb] tag#1564 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[325889.208574] sd 5:0:25:0: [sdb] tag#1564 Sense Key : Illegal Request [current] [descriptor]
[325889.208582] sd 5:0:25:0: [sdb] tag#1564 Add. Sense: Invalid field in cdb
[325889.208591] sd 5:0:25:0: [sdb] tag#1564 CDB: Write(16) 8a 00 00 00 00 01 ce b3 7b 00 00 00 04 00 00 00
[325889.208599] blk_update_request: critical target error, dev sdb, sector 7762836224 op 0x1:(WRITE) flags 0x4800 phys_seg 128 prio class 0


I did not reboot the box after the format, so I am wondering if it just ran out of write space.

This morning I put in a different card of the same model, hard reset everything, and switched to SystemRescue 8.02 (so an upgraded Linux kernel).

I have a feeling it may work this time, but who knows.

The next thing I want to figure out is what the kernel detects as the number of sectors versus what the drive is configured for, just to check whether anything else is strange.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
So far so good:
Code:
sg_readcap /dev/sda
READ CAPACITY (10) indicates device capacity too large
  now trying 16 byte cdb variant
Read Capacity results:
   Protection: prot_en=0, p_type=0, p_i_exponent=0
   Logical block provisioning: lbpme=0, lbprz=0
   Last LBA=7814037167 (0x1d1c0beaf), Number of logical blocks=7814037168
   Logical block length=512 bytes
   Logical blocks per physical block exponent=0
   Lowest aligned LBA=0


and

Code:
blockdev --getsz /dev/sda
7814037168
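Those two numbers also tie out against the earlier smartctl report: 7814037168 logical blocks at 512 bytes each is exactly the 4,000,787,030,016-byte "User Capacity" shown there. A one-liner to verify:

```shell
# The logical block count (from sg_readcap / blockdev --getsz) times the
# 512-byte logical block size should equal smartctl's "User Capacity".
blocks=7814037168
echo $(( blocks * 512 ))   # 4000787030016
```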
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
Another day, same issues. The drive sector type has been fixed for a while. I have new drives on order; they should be here May 4th.

For the heck of it, I just checked whether Active State Power Management was disabled in the BIOS for PCI, and it has been.

I have an LSI 9300-8e on order, and some new cables for it. I am getting those today.

The 4TB SATA drives worked with no issues. I am leaning towards the drives at this point. I am going to work on drive firmware/config today and see if I can find any other crazy settings that would cause this issue.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
At this point I am wondering if I have been hit with this:


I just tried flashing firmware too, and nothing:

[screenshot attached: 1619709296851.png]


Checking the drives, it looks like the IBM label says 4TB.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
The next thing I am going to try is 64-bit SystemRescue, because both times I used the i686 build. I am also booting SeaTools right now to take a look at the drives with it.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
SeaTools did an LBA/drive resize, and it reported some crazy, centillion-sized LBA number. No effect; the drive size is the same.

The new card and cable came today. It is a 9300-8e. I am going to flash it, and test with that card if the latest test fails.

I was able to confirm with Supermicro that the backplane firmware was up to date.

Everything is pointing to an incompatibility, or bad drives. I shall know soon!

What a nightmare. I needed to start using this box about a week ago.

PS (edit)

I have a Dell server here, newer, 12Gbps, and I am about to boot it and run nwipe that way.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
I have given up. I spent a ton of time on this, and I am returning these drives. Removing Type 2 protection did not help at all, and things have gotten even stranger.

I can 'nwipe' the drives, but I can't dd them at all. I am not crazy here.

It has to be the firmware of the drive/the drive itself.

I have another 4TB SAS Seagate that I am running a test on right now, and I have a bunch of HGST drives that should be here next Tuesday.

I do not feel like I have wasted time here. It was a training/learning experience. Learning sg3_utils, working with SCSI commands, and doing this testing let me update my existing knowledge and add much to it.

I am glad that some people are somehow using these drives. There are a TON of posts about the IBM-XIV drives and how simple it is to use them. I am wondering if something has changed recently. You may start to see new posts from new people having issues with this.

I wanted to contact IBM and see what they did to the firmware to bork these drives. What a waste of hardware; they were only in service for four years. Even if I did, I bet I would never get an answer, but I sent them an email anyway. If anyone knows someone that works at IBM on these things...

I think the key would be to force-flash the OEM Seagate firmware back onto them, but sg3_utils will not allow it, and neither will the Seagate utils.

PS (edit)

Just for the heck of it, I checked whether any PSID/Opal features are active, and no.

I just threw these in a PowerEdge R530, and I am running nwipe on them now to see what happens.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
Today is the final day here. What a mess, but the knowledge that I gained was worth it.

It stopped at a different sector this time, but failed in the same manner in the Dell PowerEdge R530.

IBM must have done something else to these drives, and if someone knows someone at IBM (the ES 3.5-inch team) where I can pose just a quick question, I would like to finish my research. I am about to send these drives back, but I feel there is something different besides Type 2 protection. The last thing I am going to do before sending them back is run a long SMART test on just one of the drives, to make sure that I did not somehow get six bad drives in a row.

My other box came back good with the OEM 4TB Seagate in nwipe.

[screenshot attached: 1620050655096.png]


I will update after the smartctl -t long /dev/dev comes back.
 

webdawg

Contributor
Joined
May 25, 2016
Messages
112
The self test came back fine:

[screenshot attached: 1620155282054.png]


At this point I would warn anyone who was part of the great IBM 4TB purchase (as discussed in numerous forum posts: the IBM xTB drives on eBay) to take a look at their drives, and possibly do a stress test. Something about this firmware caused the issues that I reported here.

The tests were done on 6 drives purchased on eBay, all with the same hours as above.
 