Hi all,
I am new to FreeNAS but have some knowledge of ZFS. I have a SuperMicro 847E16-R1K28LPB chassis with an X9SRL-F board (Intel Xeon E5-2620 V, 64GB ECC Registered) and an LSI 9207-8i 8-port HBA (IT mode, firmware version 19). The HBA is connected to a 36-port SAS expander (LSI SAS2X36). I am running FreeNAS-9.3-STABLE-201503071634.
I have 11x2TB SATA disks (two 5-disk RAIDZ vdevs plus one hot spare) making up one ZFS pool, and another 8 SAS drives in a second ZFS pool. During scrubs I see these messages in the logs:
Code:
freenasbox (da4:mps0:0:12:0): READ(10). CDB: 28 00 05 6d 10 d8 00 01 00 00 length 131072 SMID 613 command timeout cm 0xffffff8000f73168 ccb 0xfffffe00161b0800
freenasbox (noperiph:mps0:0:4294967295:0): SMID 3 Aborting command 0xffffff8000f73168
freenasbox (da4:mps0:0:12:0): READ(10). CDB: 28 00 05 6d 11 d8 00 00 98 00 length 77824 SMID 270 command timeout cm 0xffffff8000f579f0 ccb 0xfffffe0dc435c800
freenasbox (da4:mps0:0:12:0): READ(10). CDB: 28 00 05 6d 11 d8 00 00 98 00 length 77824 SMID 270 terminated ioc 804b scsi 0 state c xfer 0
freenasbox (da4:mps0:0:12:0): READ(10). CDB: 28 00 05 6d 10 d8 00 01 00 00
freenasbox (da4:mps0:0:12:0): CAM status: Command timeout
freenasbox (da4:mps0:0:12:0): Retrying command
freenasbox (da4:mps0:0:12:0): READ(10). CDB: 28 00 05 6d 11 d8 00 00 98 00
freenasbox (da4:mps0:0:12:0): CAM status: SCSI Status Error
freenasbox (da4:mps0:0:12:0): SCSI status: Check Condition
freenasbox (da4:mps0:0:12:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
freenasbox (da4:mps0:0:12:0): Retrying command (per sense data)
freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 49 60 00 00 40 00 length 32768 SMID 347 command timeout cm 0xffffff8000f5dc98 ccb 0xfffffe09291eb800
freenasbox (noperiph:mps0:0:4294967295:0): SMID 4 Aborting command 0xffffff8000f5dc98
freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 49 20 00 00 40 00 length 32768 SMID 659 command timeout cm 0xffffff8000f76c58 ccb 0xfffffe001617f000
freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 4f 80 00 00 08 00 length 4096 SMID 609 command timeout cm 0xffffff8000f72c48 ccb 0xfffffe00161a7800
freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 4f 80 00 00 08 00 length 4096 SMID 609 terminated ioc 804b scsi 0 state c xfer 0
freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 49 20 00 00 40 00 length 32768 SMID 659 terminated ioc 804b scsi 0 state c xfer 0
freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 49 60 00 00 40 00
freenasbox (da10:mps0:0:18:0): CAM status: Command timeout
freenasbox (da10:mps0:0:18:0): Retrying command
freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 4f 80 00 00 08 00
freenasbox (da10:mps0:0:18:0): CAM status: SCSI Status Error
freenasbox (da10:mps0:0:18:0): SCSI status: Check Condition
freenasbox (da10:mps0:0:18:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
freenasbox (da10:mps0:0:18:0): Retrying command (per sense data)
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 46 b8 00 00 88 00 length 69632 SMID 820 command timeout cm 0xffffff8000f83aa0 ccb 0xfffffe001617f000
freenasbox (noperiph:mps0:0:4294967295:0): SMID 5 Aborting command 0xffffff8000f83aa0
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 44 38 00 00 c0 00 length 98304 SMID 797 command timeout cm 0xffffff8000f81d28 ccb 0xfffffe004de2e800
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 5a e0 00 00 08 00 length 4096 SMID 778 command timeout cm 0xffffff8000f804d0 ccb 0xfffffe0db10ac000
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 5b f0 00 00 08 00 length 4096 SMID 817 command timeout cm 0xffffff8000f836c8 ccb 0xfffffe0aaffa0800
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 64 40 00 00 08 00 length 4096 SMID 136 command timeout cm 0xffffff8000f4ce40 ccb 0xfffffe02d5cdf000
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 64 40 00 00 08 00 length 4096 SMID 136 terminated ioc 804b scsi 0 state c xfer 0
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 5b f0 00 00 08 00 length 4096 SMID 817 terminated ioc 804b scsi 0 state c xfer 0
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 5a e0 00 00 08 00 length 4096 SMID 778 terminated ioc 804b scsi 0 state c xfer 0
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 44 38 00 00 c0 00 length 98304 SMID 797 terminated ioc 804b scsi 0 state c xfer 0
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 46 b8 00 00 88 00
freenasbox (da1:mps0:0:9:0): CAM status: Command timeout
freenasbox (da1:mps0:0:9:0): Retrying command
freenasbox (da1:mps0:0:9:0): READ(10). CDB: 28 00 bb 5a 64 40 00 00 08 00
freenasbox (da1:mps0:0:9:0): CAM status: SCSI Status Error
freenasbox (da1:mps0:0:9:0): SCSI status: Check Condition
freenasbox (da1:mps0:0:9:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
freenasbox (da1:mps0:0:9:0): Retrying command (per sense data)
As you can see, this does not boil down to one specific drive, but as far as I can tell only the SATA disks are affected. It happens only while the ZFS pool with the SATA disks is being scrubbed. The scrub finishes with no errors.
Disks in the pool are 10xWD + 1xSeagate (the Hot-Spare is a WD drive, the Seagate is in use).
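In case it helps anyone check the per-disk picture, here is a rough Python sketch of how the timeouts in a log like the one above could be tallied per device, and how a READ(10) CDB could be decoded into a start LBA and block count. The names TIMEOUT_RE, decode_read10 and tally_timeouts are made up for illustration, the regular expression only targets the line layout shown in the excerpt above, and the 512-byte sector size is an assumption.

```python
import re
from collections import Counter

# Matches "command timeout" lines in the layout seen in the excerpt above.
# Other driver versions may format these lines differently.
TIMEOUT_RE = re.compile(
    r"\((?P<dev>da\d+):mps\d+:\d+:\d+:\d+\): READ\(10\)\. "
    r"CDB: (?P<cdb>(?:[0-9a-f]{2} ){9}[0-9a-f]{2}) "
    r"length (?P<length>\d+) SMID \d+ command timeout"
)

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def decode_read10(cdb_hex):
    """Return (lba, blocks) for a 10-byte READ(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


def tally_timeouts(lines):
    """Count 'command timeout' lines per device name (da4, da10, ...)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


# Three lines copied from the log excerpt above.
SAMPLE = [
    "freenasbox (da4:mps0:0:12:0): READ(10). CDB: 28 00 05 6d 10 d8 00 01 00 00 "
    "length 131072 SMID 613 command timeout cm 0xffffff8000f73168 ccb 0xfffffe00161b0800",
    "freenasbox (noperiph:mps0:0:4294967295:0): SMID 3 Aborting command 0xffffff8000f73168",
    "freenasbox (da10:mps0:0:18:0): READ(10). CDB: 28 00 05 6d 49 60 00 00 40 00 "
    "length 32768 SMID 347 command timeout cm 0xffffff8000f5dc98 ccb 0xfffffe09291eb800",
]

for dev, count in sorted(tally_timeouts(SAMPLE).items()):
    print(dev, count)

lba, blocks = decode_read10("28 00 05 6d 10 d8 00 01 00 00")
print(hex(lba), blocks, blocks * SECTOR_SIZE)
```

For the first CDB in the log the block count decodes to 256, and 256 x 512 bytes gives the 131072 reported in the same line. To run the tally over a whole log, the lines of the file can be passed to tally_timeouts instead of SAMPLE.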
My question is: what is most likely causing these errors? Could this be a general problem with running one or more SATA disks behind a SAS expander? I have heard rumors that this is not the best setup one could think of.
What could I do to debug this?
Thanks a lot for any hints...