Hot unplugging SAS JBOD

jixam

Dabbler
Joined
May 1, 2015
Messages
47
I recently added a Supermicro CSE-836BE1C-R1K03JBOD to our FreeNAS 11.2 system. This JBOD is chained using two SFF-8644 cables to an identical JBOD which has been working flawlessly for 3 years. The older JBOD in turn connects to an LSI SAS 9300-8e HBA, also with two SFF-8644 cables. The sas3flash -list output is included below.

My problem is that I am getting errors on this new JBOD, like this:

(da16:mpr0:0:49:0): WRITE(10). CDB: 2a 00 60 52 4f 10 00 00 40 00 length 32768 SMID 503 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da16:mpr0:0:49:0): WRITE(10). CDB: 2a 00 60 52 4f 10 00 00 40 00
(da16:mpr0:0:49:0): CAM status: CCB request completed with an error
(da16:mpr0:0:49:0): Retrying command
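
To see whether the errors cluster on one disk or spread across the whole enclosure (which would point more toward a cable or port than a single drive), the messages can be tallied per device. The following is only a rough sketch: the regular expressions are an assumption based on the four lines above, and other mpr/CAM message variants may not match.

```python
import re
from collections import Counter

# Device prefix of a CAM message, e.g. "(da16:mpr0:0:49:0): ..."
# Pattern is an assumption derived from the log lines quoted in this post.
CAM_LINE = re.compile(
    r"^\((?P<dev>da\d+):(?P<ctrl>[a-z]+\d+):\d+:(?P<target>\d+):\d+\):\s*(?P<msg>.*)$"
)
# Controller status fields that appear on the "terminated" line
IOC_INFO = re.compile(r"ioc (?P<ioc>[0-9a-f]+) loginfo (?P<loginfo>[0-9a-f]+)")


def summarize(lines):
    """Count retries per (device, target) and loginfo codes per device."""
    retries = Counter()
    loginfos = Counter()
    for line in lines:
        m = CAM_LINE.match(line.strip())
        if not m:
            continue  # not a CAM device message
        if "Retrying command" in m["msg"]:
            retries[(m["dev"], m["target"])] += 1
        info = IOC_INFO.search(m["msg"])
        if info:
            loginfos[(m["dev"], info["loginfo"])] += 1
    return retries, loginfos


sample = """\
(da16:mpr0:0:49:0): WRITE(10). CDB: 2a 00 60 52 4f 10 00 00 40 00 length 32768 SMID 503 terminated ioc 804b loginfo 31120303 scsi 0 state c xfer 0
(da16:mpr0:0:49:0): WRITE(10). CDB: 2a 00 60 52 4f 10 00 00 40 00
(da16:mpr0:0:49:0): CAM status: CCB request completed with an error
(da16:mpr0:0:49:0): Retrying command
"""

retries, loginfos = summarize(sample.splitlines())
for (dev, target), count in retries.items():
    print(f"{dev} (target {target}): {count} retried command(s)")
for (dev, code), count in loginfos.items():
    print(f"{dev}: loginfo {code} seen {count} time(s)")
```

In practice the input would be the full system log rather than the four-line sample; if many different da devices behind the same JBOD show up, a shared component is the more likely suspect.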


There appear to be no SMART errors, so I suspect a cabling issue and have ordered new cables to test this theory.

Since everything has worked well until now, it occurs to me that I have no experience in downsizing. Hence my questions: can I unplug the new JBOD without disrupting other FreeNAS services? Do I need to export pools residing on the JBOD first? Can I swap one cable at a time with no downtime at all?

If you think it is not a cabling issue, other ideas are very welcome.

Thanks!


The sas3flash output:

Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.

Adapter Selected is a Avago SAS: SAS3008(C0)

Controller Number : 0
Controller : SAS3008(C0)
PCI Address : 00:af:00:00
SAS Address : 500605b-0-0dc3-cb20
NVDATA Version (Default) : 05.00.00.07
NVDATA Version (Persistent) : 05.00.00.07
Firmware Product ID : 0x2221 (IT)
Firmware Version : 05.00.00.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9300-8e
BIOS Version : 08.11.00.00
UEFI BSD Version : 06.00.00.00
FCODE Version : N/A
Board Name : SAS9300-8e
Board Assembly : H3-25460-02H
Board Tracer Number : SP80919xxx

Finished Processing Commands Successfully.
Exiting SAS3Flash.
 

jixam

Dabbler
Joined
May 1, 2015
Messages
47
Oh, after posting this I just noticed that the firmware version is old. We must have missed updating that when installing this system. Could that be causing my errors?
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
I believe the guidance around here is that firmware 16 for these HBAs isn’t a suggestion, it’s pretty much a must.
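
A quick way to check this is to pull the "Firmware Version" field out of the sas3flash -list output and compare its major number against 16. This is just a sketch: the threshold comes from the advice in this thread, not from any official compatibility table, and the parsing assumes the field layout shown in the first post.

```python
import re

# Assumption: minimum major version taken from the "firmware 16" advice in this thread
MIN_MAJOR = 16


def firmware_version(sas3flash_output):
    """Return the 'Firmware Version' field as a tuple of ints, or None if absent."""
    m = re.search(
        r"^\s*Firmware Version\s*:\s*([\d.]+)\s*$",
        sas3flash_output,
        re.MULTILINE,
    )
    if m is None:
        return None
    return tuple(int(part) for part in m.group(1).split("."))


# Excerpt of the output posted above
sample = """\
Firmware Product ID : 0x2221 (IT)
Firmware Version : 05.00.00.00
NVDATA Vendor : LSI
"""

version = firmware_version(sample)
if version is None:
    print("no firmware version found")
elif version[0] < MIN_MAJOR:
    print("firmware", ".".join(f"{p:02d}" for p in version), "is older than recommended")
else:
    print("firmware looks recent enough")
```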

Swapping cables sounds like a good test as well. Exporting the pool makes it “static”, so if you need to take the JBOD offline, that’s a good precaution.
 

jixam

Dabbler
Joined
May 1, 2015
Messages
47
This is a production system, so diagnosing the problem has been difficult with only a little downtime available. Upgrading the firmware didn't help, and the error persisted when attaching the JBOD to a different server (with TrueNAS 12.0). However, I eventually figured out that leaving one of the cables disconnected resolves the CAM errors. My conclusion is that the JBOD must have a faulty port, and we are simply going to leave that port unused.

To answer my original questions: 1) hot unplugging a JBOD is fine with TrueNAS, 2) pools must be exported/detached first, 3) dual SFF-8644 cables can be switched one at a time with no issues.
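
For anyone wanting to script the order of operations, here is a minimal dry-run sketch. It only prints the steps and runs nothing. The pool name is hypothetical, and on FreeNAS/TrueNAS the web UI's export/disconnect action may be the better way to do the export so that the middleware stays in sync; the zpool commands are shown only to make the sequence explicit.

```python
def unplug_plan(pools):
    """Return the ordered steps for detaching a JBOD that holds the given pools."""
    # 1) export every pool that lives on the JBOD
    steps = [f"zpool export {name}" for name in pools]
    # 2) manual step, represented as a comment line
    steps.append("# physically disconnect the JBOD, do the hardware work, reconnect")
    # 3) bring the pools back
    steps += [f"zpool import {name}" for name in pools]
    return steps


# "jbod2pool" is a hypothetical pool name used only for illustration
for step in unplug_plan(["jbod2pool"]):
    print(step)
```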
 