Hi all,
I am hoping someone can help me understand this problem. This is the second time I have gotten this error, and this time it is on a new vdev. So the problem has happened twice, each time on a different drive.
I suspect this is just a failing drive or some bad sectors, since this system ran for about a year on Unraid before I started migrating to FreeNAS a week ago. (The HDDs were all stress tested at one point or another before being added to the system, and memtests all pass.)
So this morning I got this message:
Code:
kernel log messages:
> mps2: SAS Address for SATA device = 4874463cffdcbe94
> mps2: SAS Address from SATA device = 4874463cffdcbe94
> da20 at mps2 bus 0 scbus2 target 11 lun 0
> da20: <ATA WDC WD50EFRX-68M 0A82> Fixed Direct Access SPC-4 SCSI device
> da20: Serial Number WD-xxxxxxxxx
> da20: 600.000MB/s transfers
> da20: Command Queueing enabled
> da20: 4769307MB (9767541168 512 byte sectors)
> da20: quirks=0x8<4K>
> ahcich1: Timeout on slot 8 port 0
> ahcich1: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd c0 serr 00000000 cmd 0004c817
> (ada1:ahcich1:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
> (ada1:ahcich1:0:0:0): CAM status: Command timeout
> (ada1:ahcich1:0:0:0): Retrying command
> ahcich0: Timeout on slot 7 port 0
> ahcich0: is 00000000 cs 00000080 ss 00000000 rs 00000080 tfd c0 serr 00000000 cmd 0004c717
> (ada0:ahcich0:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: Command timeout
> (ada0:ahcich0:0:0:0): Retrying command
> (da18:mps0:0:3:0): WRITE(10). CDB: 2a 00 7b b7 df e8 00 00 08 00
> (da18:mps0:0:3:0): CAM status: SCSI Status Error
> (da18:mps0:0:3:0): SCSI status: Check Condition
> (da18:mps0:0:3:0): SCSI sense: ILLEGAL REQUEST asc:21,0 (Logical block address out of range)
> (da18:mps0:0:3:0): Info: 0x7bb7dfe8
> (da18:mps0:0:3:0): Error 22, Unretryable error
-- End of security output --
Status of the pool
Code:
Checking status of zfs pools:
NAME           SIZE  ALLOC   FREE  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
freenas-boot  14.9G  1.09G  13.8G         -     -   7%  1.00x  ONLINE  -
zfast          460G   195G   265G         -   27%  42%  1.00x  ONLINE  /mnt
zroot         81.8T  38.3T  43.5T         -   24%  46%  1.00x  ONLINE  /mnt

  pool: zroot
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 68K in 0h0m with 0 errors on Sat Jul 2 02:53:31 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot                                           ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/775bcae9-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/7861ae25-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/79076331-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/79b53996-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/7ac4a958-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/7bdf71d9-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/7cf57b75-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/7e037123-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/7f1e9fd8-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/80358d0f-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/814a11ec-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
            gptid/81fe2a3b-38d8-11e6-8455-0cc47a6b6816  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/063d5bbf-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     0     0
            gptid/0751dcd6-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     0     0
            gptid/080d0dc1-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     0     0
            gptid/08b9ac7e-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     0     0
            gptid/09630467-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     1     0
            gptid/0a0c005f-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     0     0

errors: No known data errors
-- End of daily output --
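With 18 members in the pool, the single nonzero WRITE counter is easy to miss by eye. A minimal Python sketch for picking it out of the config table follows; the sample text is a two-member excerpt of the output above, and the function name `devices_with_errors` is made up for illustration. It assumes plain integer counters and would skip rows where zpool abbreviates a large count (for example with a K suffix).

```python
import re

# Excerpt of the `zpool status` config table shown above
# (only two raidz2-2 members kept for brevity).
SAMPLE = """\
NAME                                            STATE     READ WRITE CKSUM
zroot                                           ONLINE       0     0     0
  raidz2-2                                      ONLINE       0     0     0
    gptid/08b9ac7e-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     0     0
    gptid/09630467-3ebe-11e6-8f50-0cc47a6b6816  ONLINE       0     1     0
"""

# name, state, then three integer counters; the header row does not match
# because READ/WRITE/CKSUM are not digits.
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for every row with a nonzero counter."""
    hits = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header or any non-table line
        name, _state, read, write, cksum = match.groups()
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            hits.append((name, *counts))
    return hits


print(devices_with_errors(SAMPLE))
```

On the excerpt this reports only the `gptid/09630467-...` member with one write error, matching the table.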
I got a similar error on another vdev the day before, on a different drive, but I dismissed it as a bad sector and ran zpool clear after running a SMART test.
This is today's status

The second highlighted rectangle had the same error as the first, a day ago.
smartctl -a /dev/da18
Code:
smartctl -a /dev/da18
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD50EFRX-68MYMN1
Serial Number:    WD-xxxx
LU WWN Device Id: 5 0014ee 260b5e9b4
Firmware Version: 82.00A82
User Capacity:    5,000,981,078,016 bytes [5.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jul 2 08:59:33 2016 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 248) Self-test routine in progress...
                                        80% of test remaining.
Total time to complete Offline
data collection:                (57960) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 579) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   204   202   021    Pre-fail  Always       -       8791
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       310
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       10207
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       21
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       9
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       381
194 Temperature_Celsius     0x0022   118   112   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      5569         -
# 2  Short offline       Completed without error       00%      5554         -
# 3  Extended offline    Completed without error       00%      5542         -
# 4  Short offline       Completed without error       00%      5530         -
# 5  Short offline       Completed without error       00%      5506         -
# 6  Short offline       Completed without error       00%      5472         -
# 7  Short offline       Completed without error       00%      5448         -
# 8  Short offline       Completed without error       00%      5424         -
# 9  Short offline       Completed without error       00%      5400         -
#10  Short offline       Completed without error       00%      5376         -
#11  Short offline       Completed without error       00%      5353         -
#12  Short offline       Completed without error       00%      5329         -
#13  Short offline       Completed without error       00%      5305         -
#14  Short offline       Completed without error       00%      5281         -
#15  Short offline       Completed without error       00%      5257         -
#16  Extended offline    Completed without error       00%      5245         -
#17  Short offline       Completed without error       00%      5233         -
#18  Short offline       Completed without error       00%      5232         -
#19  Short offline       Completed without error       00%      5208         -
#20  Short offline       Completed without error       00%      5184         -
#21  Short offline       Completed without error       00%      5160         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
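For comparing drives, it can help to reduce a report like this to a handful of raw values. The sketch below is one way to do that in Python, under the assumption that attributes 5, 197, 198 and 199 are the ones worth watching; that choice is a common rule of thumb, not something smartctl prescribes. The sample rows are copied from the da18 attribute table above, and `watched_raw_values` is a hypothetical helper name.

```python
# Attribute IDs often watched for media or cabling trouble. Which IDs matter
# most is a judgment call (an assumption of this sketch).
WATCH = {5, 197, 198, 199}

# Rows copied from the da18 attribute table above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       10207
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""


def watched_raw_values(attr_text):
    """Map attribute name -> raw value for the watched attribute IDs."""
    result = {}
    for line in attr_text.splitlines():
        fields = line.split()
        # A table row has ten columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        if int(fields[0]) in WATCH:
            # RAW_VALUE is the tenth column; take its leading token as an integer.
            result[fields[1]] = int(fields[9])
    return result


print(watched_raw_values(SAMPLE))
```

For da18 all four watched raw values come out as 0, which agrees with the table: the drive itself is not reporting reallocated, pending, or uncorrectable sectors, nor CRC errors.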
This is the error that happened the other day with da11. I also added a new vdev that day, so I am assuming the corrupt GPT messages were OK.
Code:
anomaly.m3ki.net kernel log messages:
> mps2: SAS Address for SATA device = 4873463ef7c2c595
> mps2: SAS Address from SATA device = 4873463ef7c2c595
> da19 at mps2 bus 0 scbus2 target 2 lun 0
> da19: <ATA WDC WD50EFRX-68M 0A82> Fixed Direct Access SPC-4 SCSI device
> da19: Serial Number WD----------
> da19: 600.000MB/s transfers
> da19: Command Queueing enabled
> da19: 4769307MB (9767541168 512 byte sectors)
> da19: quirks=0x8<4K>
> GEOM_ELI: Device da13p1.eli destroyed.
> GEOM_ELI: Detached da13p1.eli on last close.
> GEOM_ELI: Device da14p1.eli destroyed.
> GEOM_ELI: Detached da14p1.eli on last close.
> GEOM_ELI: Device da0p1.eli destroyed.
> GEOM_ELI: Detached da0p1.eli on last close.
> GEOM_ELI: Device da1p1.eli destroyed.
> GEOM_ELI: Detached da1p1.eli on last close.
> GEOM_ELI: Device da2p1.eli destroyed.
> GEOM_ELI: Detached da2p1.eli on last close.
> GEOM_ELI: Device da3p1.eli destroyed.
> GEOM_ELI: Detached da3p1.eli on last close.
> GEOM_ELI: Device da4p1.eli destroyed.
> GEOM_ELI: Detached da4p1.eli on last close.
> GEOM_ELI: Device da5p1.eli destroyed.
> GEOM_ELI: Detached da5p1.eli on last close.
> GEOM_ELI: Device da7p1.eli destroyed.
> GEOM_ELI: Detached da7p1.eli on last close.
> GEOM_ELI: Device da8p1.eli destroyed.
> GEOM_ELI: Detached da8p1.eli on last close.
> GEOM_ELI: Device da9p1.eli destroyed.
> GEOM_ELI: Detached da9p1.eli on last close.
> GEOM_ELI: Device da11p1.eli destroyed.
> GEOM_ELI: Detached da11p1.eli on last close.
> GEOM_ELI: Device ada0p1.eli destroyed.
> GEOM_ELI: Detached ada0p1.eli on last close.
> GEOM_ELI: Device ada1p1.eli destroyed.
> GEOM_ELI: Detached ada1p1.eli on last close.
> GEOM: da6: the primary GPT table is corrupt or invalid.
> GEOM: da6: using the secondary instead -- recovery strongly advised.
> GEOM: da10: the primary GPT table is corrupt or invalid.
> GEOM: da10: using the secondary instead -- recovery strongly advised.
> GEOM: da12: the primary GPT table is corrupt or invalid.
> GEOM: da12: using the secondary instead -- recovery strongly advised.
> GEOM: da15: the primary GPT table is corrupt or invalid.
> GEOM: da15: using the secondary instead -- recovery strongly advised.
> GEOM: da18: the primary GPT table is corrupt or invalid.
> GEOM: da18: using the secondary instead -- recovery strongly advised.
> GEOM: da19: the primary GPT table is corrupt or invalid.
> GEOM: da19: using the secondary instead -- recovery strongly advised.
> GEOM_ELI: Device da13p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da14p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da0p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da1p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da2p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da3p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da4p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da5p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da7p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da8p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da9p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da11p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da6p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da10p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da12p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da15p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da18p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device da19p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device ada0p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> GEOM_ELI: Device ada1p1.eli created.
> GEOM_ELI: Encryption: AES-XTS 128
> GEOM_ELI: Crypto: hardware
> (da11:mps2:0:6:0): READ(10). CDB: 28 00 40 56 04 58 00 00 40 00 length 32768 SMID 965 terminated ioc 804b scsi 0 state 0 xfer 0
> (da11:mps2:0:6:0): READ(10). CDB: 28 00 40 56 05 18 00 00 40 00 length 32768 SMID 744 terminated ioc 804b scsi 0 state 0 xfer(da11:mps2:0:6:0): READ(10). CDB: 28 00 40 56 04 58 00 00 40 00
> 0
> (da11:mps2:0:6:0): CAM status: CCB request completed with an error
> (da11:mps2:0:6:0): READ(10). CDB: 28 00 40 56 04 98 00 00 40 00 length 32768 SMID 973 terminated ioc 804b scsi 0 state 0 xfer(da11: 0
> mps2:0:6:0): Retrying command
> (da11:mps2:0:6:0): READ(10). CDB: 28 00 40 56 05 18 00 00 40 00
> (da11:mps2:0:6:0): CAM status: CCB request completed with an error
> (da11:mps2:0:6:0): Retrying command
> (da11:mps2:0:6:0): READ(10). CDB: 28 00 40 56 04 98 00 00 40 00
> (da11:mps2:0:6:0): CAM status: CCB request completed with an error
> (da11:mps2:0:6:0): Retrying command
> (da11:mps2:0:6:0): WRITE(16). CDB: 8a 00 00 00 00 01 2c 93 cd 60 00 00 00 40 00 00
> (da11:mps2:0:6:0): CAM status: SCSI Status Error
> (da11:mps2:0:6:0): SCSI status: Check Condition
> (da11:mps2:0:6:0): SCSI sense: ILLEGAL REQUEST asc:21,0 (Logical block address out of range)
> (da11:mps2:0:6:0): Info: 0x12c93cd60
> (da11:mps2:0:6:0): Error 22, Unretryable error
-- End of security output --
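Both "Logical block address out of range" errors (da18 this morning, da11 the other day) include the failing command's CDB, so the requested LBA can be checked against the drive size. Below is a small Python sketch of that arithmetic. It assumes the standard READ/WRITE(10) and READ/WRITE(16) CDB layouts and takes the sector count from the 5,000,981,078,016-byte "User Capacity" smartctl reports for these drives, divided by the 512-byte logical sector size; `decode_rw_cdb` is a name made up for the example.

```python
def decode_rw_cdb(cdb_hex):
    """Return (lba, block_count) for a READ/WRITE(10) or READ/WRITE(16) CDB."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) == 10:
        # 10-byte CDB: LBA in bytes 2-5, transfer length in bytes 7-8.
        return int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")
    if len(cdb) == 16:
        # 16-byte CDB: LBA in bytes 2-9, transfer length in bytes 10-13.
        return int.from_bytes(cdb[2:10], "big"), int.from_bytes(cdb[10:14], "big")
    raise ValueError("unsupported CDB length: %d" % len(cdb))


# User Capacity from smartctl divided by the 512-byte logical sector size.
TOTAL_SECTORS = 5_000_981_078_016 // 512

FAILED = [
    ("da18", "2a 00 7b b7 df e8 00 00 08 00"),                    # WRITE(10)
    ("da11", "8a 00 00 00 00 01 2c 93 cd 60 00 00 00 40 00 00"),  # WRITE(16)
]

for dev, cdb in FAILED:
    lba, count = decode_rw_cdb(cdb)
    in_range = lba + count <= TOTAL_SECTORS
    print(dev, hex(lba), count, "within capacity" if in_range else "beyond capacity")
```

The decoded LBAs match the `Info:` fields in the logs (0x7bb7dfe8 and 0x12c93cd60), and both writes end below the 9,767,541,168-sector capacity. This only checks the numbers; it does not by itself say whether the drive, the cabling, or the controller rejected the request.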
smartctl -a /dev/da11
Code:
smartctl -a /dev/da11
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD50EFRX-68MYMN1
Serial Number:    WD-xxx
LU WWN Device Id: 5 0014ee 260b5abef
Firmware Version: 82.00A82
User Capacity:    5,000,981,078,016 bytes [5.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jul 2 08:52:53 2016 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (57660) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 576) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   203   202   021    Pre-fail  Always       -       8816
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       205
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       7325
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       25
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       6
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       324
194 Temperature_Celsius     0x0022   116   109   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      7303         -
# 2  Short offline       Completed without error       00%      7291         -
# 3  Short offline       Completed without error       00%      7255         -
# 4  Short offline       Completed without error       00%      7159         -
# 5  Short offline       Completed without error       00%      6475         -
# 6  Short offline       Completed without error       00%      6451         -
# 7  Short offline       Completed without error       00%      6427         -
# 8  Extended offline    Completed without error       00%      6404         -
# 9  Short offline       Completed without error       00%      6390         -
#10  Short offline       Completed without error       00%      6366         -
#11  Short offline       Completed without error       00%      6343         -
#12  Short offline       Completed without error       00%      6324         -
#13  Short offline       Completed without error       00%      6300         -
#14  Short offline       Completed without error       00%      6267         -
#15  Short offline       Completed without error       00%      6243         -
#16  Short offline       Completed without error       00%      6219         -
#17  Short offline       Completed without error       00%      6195         -
#18  Short offline       Completed without error       00%      6171         -
#19  Short offline       Completed without error       00%      6146         -
#20  Short offline       Completed without error       00%      6122         -
#21  Short offline       Completed without error       00%      6098         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Here are my firmware versions; I am assuming all is OK there as well:
Code:
> mps0: Firmware: 20.00.04.00, Driver: 20.00.00.00-fbsd
> mps0: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 1.1 on pci0
> pci2: <ACPI PCI bus> on pcib2
> mps1: <Avago Technologies (LSI) SAS2308> port 0xd000-0xd0ff mem 0xf7240000-0xf724ffff,0xf7200000-0xf723ffff irq 17 at device 0.0 on pci2
> mps1: Firmware: 20.00.04.00, Driver: 20.00.00.00-fbsd
> mps1: IOCCapabilities: 5285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
> xhci0: <Intel Lynx Point USB 3.0 controller> mem 0xf7700000-0xf770ffff irq 16 at device 20.0 on pci0
> xhci0: 32 bytes context size, 64-bit DMA
> xhci0: Port routing mask set to 0xffffffff
> usbus0 on xhci0
> ehci0: <Intel Lynx Point USB 2.0 controller USB-B> mem 0xf7714000-0xf77143ff irq 16 at device 26.0 on pci0
> usbus1: EHCI version 1.0
> usbus1 on ehci0
> pcib3: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pcib4: <ACPI PCI-PCI bridge> at device 0.0 on pci3
> pci4: <ACPI PCI bus> on pcib4
> vgapci0: <VGA-compatible display> port 0xc000-0xc07f mem 0xf6000000-0xf6ffffff,0xf7000000-0xf701ffff irq 16 at device 0.0 on pci4
> vgapci0: Boot video device
> pcib5: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
> pci5: <ACPI PCI bus> on pcib5
> igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xb000-0xb01f mem 0xf7500000-0xf757ffff,0xf7580000-0xf7583fff irq 18 at device 0.0 on pci5
> igb0: Using MSIX interrupts with 5 vectors
> igb0: Ethernet address: 0c:c4:7a:6b:68:16
> igb0: Bound queue 0 to cpu 0
> igb0: Bound queue 1 to cpu 1
> igb0: Bound queue 2 to cpu 2
> igb0: Bound queue 3 to cpu 3
> pcib6: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
> pci6: <ACPI PCI bus> on pcib6
> igb1: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xa000-0xa01f mem 0xf7400000-0xf747ffff,0xf7480000-0xf7483fff irq 19 at device 0.0 on pci6
> igb1: Using MSIX interrupts with 5 vectors
> igb1: Ethernet address: 0c:c4:7a:6b:68:17
> igb1: Bound queue 0 to cpu 0
> igb1: Bound queue 1 to cpu 1
> igb1: Bound queue 2 to cpu 2
> igb1: Bound queue 3 to cpu 3
> pcib7: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
> pci7: <ACPI PCI bus> on pcib7
> mps2: <Avago Technologies (LSI) SAS2008> port 0x9000-0x90ff mem 0xf73c0000-0xf73c3fff,0xf7380000-0xf73bffff irq 16 at device 0.0 on pci7
> mps2: Firmware: 20.00.04.00, Driver: 20.00.00.00-fbsd
> mps2: IOCCapabilities: 1285c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,EventReplay,HostDisc>
> ehci1: <Intel Lynx Point USB 2.0 controller USB-A> mem 0xf7713000-0xf77133ff irq 22 at device 29.0 on pci0
> usbus2: EHCI version 1.0
> usbus2 on ehci1
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
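The three mps lines are buried in the rest of the boot output, so here is a rough Python sketch that pulls out just the firmware and driver versions per controller. The sample lines are copied from the output above. Comparing only the leading version component is an assumption of this sketch about what "matching" means, and `firmware_report` is a hypothetical name.

```python
import re

# The three mps lines from the boot messages above.
SAMPLE = """\
> mps0: Firmware: 20.00.04.00, Driver: 20.00.00.00-fbsd
> mps1: Firmware: 20.00.04.00, Driver: 20.00.00.00-fbsd
> mps2: Firmware: 20.00.04.00, Driver: 20.00.00.00-fbsd
"""

# Captures the controller name plus the dotted numeric part of each version;
# the "-fbsd" suffix on the driver version is left out of the capture.
LINE = re.compile(r"(mps\d+): Firmware: ([\d.]+), Driver: ([\d.]+)")


def firmware_report(dmesg_text):
    """Map controller -> (firmware, driver, leading components equal?)."""
    report = {}
    for ctrl, firmware, driver in LINE.findall(dmesg_text):
        # Assumption: only the first dotted component is compared.
        same_major = firmware.split(".")[0] == driver.split(".")[0]
        report[ctrl] = (firmware, driver, same_major)
    return report


for ctrl, info in sorted(firmware_report(SAMPLE).items()):
    print(ctrl, *info)
```

All three controllers report firmware 20.00.04.00 with driver 20.00.00.00, so under that assumption the leading components agree on every HBA.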
I guess my questions are:
- Am I right to assume the issue is with HDDs that are about to fail? Is there anything else I can do to test?
- Is the normal procedure to just ignore these errors for now, since it's only one error each, and do a zpool clear?
- And obviously, if the errors persist, replace the drive?
Edit: posted the correct SMART test.