LSI (Avago) 9207-8i with Seagate 10TB Enterprise (ST10000NM0016)

leveche

Cadet
Joined
Oct 4, 2017
Messages
4
Hi,

I have had issues since replacing the WD disks in my FreeNAS 11.0-U4 system with Seagate ST10000NM0016 drives. The geometry (RAIDZ2 with 8 disks) hasn't changed; I simply replaced the disks with newer ones. Ever since then, I periodically get errors like
Code:
(da6:mps2:0:45:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 365 command timeout cm 0xfffffe0001006f10 ccb 0xfffff801668ce800
		(noperiph:mps2:0:4294967295:0): SMID 2 Aborting command 0xfffffe0001006f10
mps2: Sending reset from mpssas_send_abort for target ID 45
		(da6:mps2:0:45:0): WRITE(16). CDB: 8a 00 00 00 00 01 8c 17 f6 68 00 00 00 08 00 00 length 4096 SMID 783 terminated ioc 804b scsi 0 state c xfer 0
mps2: Unfreezing devq for target ID 45
(da6:mps2:0:45:0): WRITE(16). CDB: 8a 00 00 00 00 01 8c 17 f6 68 00 00 00 08 00 00
(da6:mps2:0:45:0): CAM status: CCB request completed with an error
(da6:mps2:0:45:0): Retrying command
(da6:mps2:0:45:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da6:mps2:0:45:0): CAM status: Command timeout
(da6:mps2:0:45:0): Retrying command
(da6:mps2:0:45:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da6:mps2:0:45:0): CAM status: SCSI Status Error
(da6:mps2:0:45:0): SCSI status: Check Condition
(da6:mps2:0:45:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
(da6:mps2:0:45:0): Error 6, Retries exhausted
(da6:mps2:0:45:0): Invalidating pack

and soon enough the disk in question is thrown out of the pool. After a reboot it resumes working 'fine', until it or some other disk goes through the same ordeal.
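
For anyone who wants to tally these events per disk, here is a minimal Python sketch. It assumes the dmesg text is already available as a string. The function name `summarize_cam_errors` and the event labels are made up for illustration; only the message fragments are taken from the log excerpt above.

```python
import re
from collections import Counter, defaultdict

# CAM prefix as it appears in the excerpt above, e.g. "(da6:mps2:0:45:0): ".
CAM_PREFIX = re.compile(r"^\s*\((?P<dev>\w+):\w+:\d+:\d+:\d+\):\s*(?P<msg>.*)$")

# Hypothetical labels mapped to message fragments copied from the log above.
EVENTS = {
    "timeout": "CAM status: Command timeout",
    "retries_exhausted": "Retries exhausted",
    "pack_invalidated": "Invalidating pack",
}

def summarize_cam_errors(log_text):
    """Count the events listed in EVENTS per device (da6, da7, ...)."""
    counts = defaultdict(Counter)
    for line in log_text.splitlines():
        match = CAM_PREFIX.match(line)
        if match is None or match.group("dev") == "noperiph":
            continue  # not a per-disk CAM line
        for label, fragment in EVENTS.items():
            if fragment in match.group("msg"):
                counts[match.group("dev")][label] += 1
    return {dev: dict(events) for dev, events in counts.items()}

# A few lines from the excerpt above, used as sample input.
sample = """\
(da6:mps2:0:45:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
(da6:mps2:0:45:0): CAM status: Command timeout
(da6:mps2:0:45:0): Error 6, Retries exhausted
(da6:mps2:0:45:0): Invalidating pack
"""
print(summarize_cam_errors(sample))
```

Counting per device makes it easier to see whether the timeouts follow one disk or one port, or are spread across the whole pool.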

I noticed that the firmware on my HBAs is P17, behind FreeBSD's mps version, and not the latest available from Broadcom - that would be P20 from April 2016. However, a quick search shows a number of people having problems with P20, and even deliberately downgrading to P16. Can anyone on this forum share their experiences with P20 firmware on 9207-8i?

I also note that Broadcom (LSI) offers its own version of the FreeBSD driver. Is it recommended - or even possible - to use it with FreeNAS?

And of course, can anyone offer any additional counsel as to how to fix my disk problem?

Thanks.
 
Joined
May 10, 2017
Messages
838
There were issues with the first P20 firmware releases, but AFAIK there are no problems with the current one, P20.00.07. I have a 9207-8i on that firmware and have never had any issues.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I noticed that the firmware on my HBAs is P17, behind FreeBSD's mps version
You should update your firmware and see if that corrects the problem.
I also note that Broadcom (LSI) offers its own version of the FreeBSD driver. Is it recommended - or even possible - to use it with FreeNAS?
The integrated driver should be the latest available.
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
I have the same issues with these HDDs on a Fujitsu D2607-8i flashed to P20 firmware (LSI 9211-8i). I already replaced the cables, but that did not help; then I found this thread.

Did you have any luck resolving this issue?
 

leveche

Cadet
Joined
Oct 4, 2017
Messages
4
I have the same issues with these HDDs on a Fujitsu D2607-8i flashed to P20 firmware (LSI 9211-8i). I already replaced the cables, but that did not help; then I found this thread.

Did you have any luck resolving this issue?

I only reflashed the firmware (to 20.00.07.00) yesterday - let's give it a few days before I make assertions with any degree of confidence. I certainly hope it works.
 

leveche

Cadet
Joined
Oct 4, 2017
Messages
4
I only reflashed the firmware (to 20.00.07.00) yesterday - let's give it a few days before I make assertions with any degree of confidence. I certainly hope it works.

Alas, I still see the failed CDB warnings in dmesg. They do seem much reduced in number, at roughly 10 per 2 TB written, but this is still not acceptable in production. I'm not sure how to debug further.

On the practical side, I could try moving the disks to different controllers - I have a 3ware 9650SE and an Areca ARC-1882 available, although I gather the FreeNAS community disapproves of these even in JBOD mode.
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
I tried contacting Seagate about the issue, but they only told me to check the drive with their tool and report back. Maybe if more users contacted them, they would start investigating this.

I also thought about getting a newer 3008 series controller.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I have a 3ware 9650SE and an Areca ARC-1882 available, although I gather the FreeNAS community disapproves of these even in JBOD mode.
Don't try those controllers with FreeNAS.
I also thought about getting a newer 3008 series controller.
If it is a hardware limitation of some kind, a newer controller might solve the problems. It might be worth a try, but you might not need to go that much newer. There may be a hardware revision on the HBA that you are using. Can you give more details on the exact (details are important) model/revision of card you are using?
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
If it is a hardware limitation of some kind, a newer controller might solve the problems. It might be worth a try, but you might not need to go that much newer. There may be a hardware revision on the HBA that you are using. Can you give more details on the exact (details are important) model/revision of card you are using?

Alright, my current controller is a Fujitsu D2607-A21. It is flashed to IT mode, and the firmware version is 20.00.07.00, the same as leveche's controller, but his is an LSI SAS2308 whereas mine is an LSI SAS2008. Both controllers use the mps driver in FreeBSD. As both controllers show the same symptoms, this could also be a driver issue in FreeBSD. A SAS3008 controller would use the mpr driver, so switching to one would eliminate one variable.
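
The reasoning about shared variables can be sketched in a few lines of Python. The table below contains only the chip-to-driver pairs mentioned in this thread and is not an exhaustive list; the names `DRIVER_BY_CHIP` and `shared_drivers` are made up for illustration.

```python
# Chip-to-driver pairs as stated in this thread; not an exhaustive list.
DRIVER_BY_CHIP = {
    "SAS2008": "mps",
    "SAS2308": "mps",
    "SAS3008": "mpr",
}

def shared_drivers(chips):
    """Return the set of FreeBSD drivers used by the given controller chips."""
    return {DRIVER_BY_CHIP[chip] for chip in chips}

# The two controllers showing the symptoms in this thread.
affected = ["SAS2008", "SAS2308"]
print(shared_drivers(affected))                             # {'mps'}
print(DRIVER_BY_CHIP["SAS3008"] in shared_drivers(affected))  # False
```

Since both affected chips map to the same driver, the symptoms alone cannot separate a driver problem from a chip or firmware problem; a card on a different driver changes that.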

It might be worth noting that I did not have any issues this week. Before, I rebooted the whole system to get the drive back in order; this time I only rebooted the VM FreeNAS is running in. But I have not stress-tested the system in that time either, just normal use. I am using VMware ESXi 6.5 to virtualize. The controller is under direct control of the FreeNAS VM. Let me know if you need more details.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Alright, my current controller is a Fujitsu D2607-A21. It is flashed to IT mode, and the firmware version is 20.00.07.00, the same as leveche's controller, but his is an LSI SAS2308 whereas mine is an LSI SAS2008.
That is strange. I have not tried to use any 10TB drives, but I would have expected them to work. I did check on the controller on my main storage server at work and the documentation says it only supports up to 8TB drives. You may have to go to a newer model controller that stipulates that it will work with 10TB disks. I don't expect the problem to be in the driver, but I have no insight about that. It is certainly not the fault of the drive manufacturer. You might get some results by submitting a ticket with Broadcom (the current owner of LSI/Avago). They may be able to release an update to both the firmware and the driver as the BSD driver comes from them to begin with.
 
Joined
May 10, 2017
Messages
838
I would expect that if there's an issue, it's related to that specific Seagate model, possibly its firmware. I would be very surprised if it had anything to do with the disk capacity.
 

leveche

Cadet
Joined
Oct 4, 2017
Messages
4
Don't try those controllers with FreeNAS.

Not even in IT/JBOD mode?

If it is a hardware limitation of some kind, a newer controller might solve the problems. It might be worth a try, but you might not need to go that much newer. There may be a hardware revision on the HBA that you are using. Can you give more details on the exact (details are important) model/revision of card you are using?

Here are the details of my controller:
Code:
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type  : SAS2308_2
  BIOS version  : 7.39.02.00
  Firmware version  : 20.00.07.00
  Channel description  : 1 Serial Attached SCSI
  Initiator ID  : 0
  Maximum physical devices  : 1023
  Concurrent commands supported  : 10240
  Slot  : 4
  Segment  : 0
  Bus  : 4
  Device  : 0
  Function  : 0
  RAID Support  : No
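
A listing like the one above is easy to check by script. The following is a minimal Python sketch that assumes the "Key : Value" layout shown here; the helper names `parse_controller_info` and `version_tuple` are hypothetical, and the comparison target is simply the 20.00.07.00 release mentioned earlier in the thread.

```python
def parse_controller_info(text):
    """Turn 'Key  : Value' lines into a dict; other lines are ignored."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info

def version_tuple(version):
    """'20.00.07.00' -> (20, 0, 7, 0), so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

# A few lines from the listing above, used as sample input.
listing = """\
  Controller type  : SAS2308_2
  BIOS version  : 7.39.02.00
  Firmware version  : 20.00.07.00
"""
info = parse_controller_info(listing)
# 20.00.07.00 is the release reported as trouble-free earlier in the thread.
print(version_tuple(info["Firmware version"]) >= version_tuple("20.00.07.00"))
```

Comparing tuples rather than strings avoids surprises with zero-padded fields such as "07".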


Code:
01:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2308 PCI-Express Fusion-MPT SAS-2 (rev 05)
  Subsystem: LSI Logic / Symbios Logic 9207-8i SAS2.1 HBA
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
  Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Latency: 0, Cache Line Size: 32 bytes
  Interrupt: pin A routed to IRQ 16
  Region 0: I/O ports at 9000
  Region 1: Memory at faeb0000 (64-bit, non-prefetchable)
  Region 3: Memory at faec0000 (64-bit, non-prefetchable)
  Expansion ROM at faf00000 [disabled]
  Capabilities: [50] Power Management version 3
  Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
  Capabilities: [68] Express (v2) Endpoint, MSI 00
  DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
  ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
  RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
  MaxPayload 256 bytes, MaxReadReq 512 bytes
  DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
  LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns, L1 <1us
  ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
  ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
  LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
  DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported
  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
  LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
  Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
  Compliance De-emphasis: -6dB
  LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
  EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
  Capabilities: [d0] Vital Product Data
  Not readable
  Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
  Address: 0000000000000000  Data: 0000
  Capabilities: [c0] MSI-X: Enable+ Count=16 Masked-
  Vector table: BAR=1 offset=0000e000
  PBA: BAR=1 offset=0000f000
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
On the practical side, I could try moving the disks to different controllers - I have a 3ware 9650SE and an Areca ARC-1882 available, although I gather the FreeNAS community disapproves of these even in JBOD mode.
You are already aware of the problem with the kind of controller I was referring to.
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
I upgraded to a 3008 controller (IBM M1215) on the weekend. No issues so far.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Well, this might be worth keeping an eye on for the Hardware Recommendations

@Ericloewe
If 2008 based controllers only support <= 8TB drives, that basically means we need to recommend 3008+ controllers (or perhaps the 21xx controllers?) for > 8TB support.

I only have 8TB drives currently.

Reminds me of the SAS1 2TB limit
 
Joined
May 10, 2017
Messages
838
If 2008 based controllers only support <= 8TB drives, that basically means we need to recommend 3008+ controllers (or perhaps the 21xx controllers?) for > 8TB support.

I'd be very surprised if this issue had anything to do with the disk capacity; it's probably related to that specific Seagate model.

Reminds me of the SAS1 2TB limit

There's no general SAS1 2TB limit. LSI SAS1 controllers and some SAS1 expanders have a 2TB limit, but I have enclosures with SAS1 expanders that work with 8TB disks.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
This isn't the first oddity I've heard of involving recent, large drives and early LSI SAS2 controllers. I'll keep my eyes open, but I don't want to jump to any conclusions.
 

saltmaster

Cadet
Joined
Nov 2, 2017
Messages
3
I've just set up 5 of the exact same hard drives on a SAS2008-based controller (re-flashed Dell PERC H200E) and I'm encountering exactly the same issue. Based on a few forum posts elsewhere, I don't think the Seagate Enterprise 10TB drives work very well with SAS2008-based controllers. I've ordered a SAS3008-based controller, which should arrive in a few weeks, to see if that solves the problem.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I've ordered a SAS3008-based controller, which should arrive in a few weeks, to see if that solves the problem.
It should. It is a much newer chipset.
 

miip

Dabbler
Joined
Oct 7, 2017
Messages
15
This could also be a simple driver issue. LSI 2008 and 2308 use the same driver in FreeBSD.

So far no errors with the LSI 3008 controller.

BTW, where did you guys see that the LSI 2008 only supports drives up to 8TB? I cannot find anything official about an 8TB limit. There were no issues for several weeks, so I don't think this is a compatibility issue.
 