TheBlueRaja
Dabbler
- Joined
- Sep 22, 2017
- Messages
- 15
Hi All,
I'm recieveing the following errors on my drive pool:-
Running the smartctl command i see the following for all drives in my pool.
All drives "Pass" SMART but i see read errors on ada1 and ada2 (ada2 is the only drive mentioned in the reporting as having issues.
So daft questions time, should i be replacing both of these ASAP? Why are they showing as Passed? Should i remove these from the pool for now and order new drives or just order new drives then replace?
Thanks
I'm recieveing the following errors on my drive pool:-
Code:
CRITICAL: Dec. 23, 2017, 10:49 p.m. - Device: /dev/ada2, FAILED SMART self-check. BACK UP DATA NOW! CRITICAL: Dec. 23, 2017, 10:49 p.m. - Device: /dev/ada2, Failed SMART usage Attribute: 1 Raw_Read_Error_Rate.
Running the smartctl command i see the following for all drives in my pool.
Code:
root@freenas:/nonexistent # smartctl -a /dev/ada0 smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Hitachi Deskstar 7K3000 Device Model: Hitachi HDS723030ALA640 Serial Number: MK0351YHK1DTDA LU WWN Device Id: 5 000cca 225eaeae2 Firmware Version: MKAOAA10 User Capacity: 3,000,592,982,016 bytes [3.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate: 7200 rpm Form Factor: 3.5 inches Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Tue Dec 26 19:00:13 2017 GMT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 24) seconds. Offline data collection capabilities: (0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 461) minutes. SCT capabilities: (0x003d) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0 2 Throughput_Performance 0x0005 135 135 054 Pre-fail Offline - 84 3 Spin_Up_Time 0x0007 160 160 024 Pre-fail Always - 506 (Average 461) 4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 30 5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0 7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0 8 Seek_Time_Performance 0x0005 123 123 020 Pre-fail Offline - 31 9 Power_On_Hours 0x0012 097 097 000 Old_age Always - 21027 10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 22 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 328 193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 328 194 Temperature_Celsius 0x0002 117 117 000 Old_age Always - 51 (Min/Max 22/58) 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 18984 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. root@freenas:/nonexistent # smartctl -a /dev/ada1 smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Hitachi Deskstar 7K3000 Device Model: Hitachi HDS723030ALA640 Serial Number: MK0371YHHEGXTA LU WWN Device Id: 5 000cca 225d4385e Firmware Version: MKAOAA10 User Capacity: 3,000,592,982,016 bytes [3.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate: 7200 rpm Form Factor: 3.5 inches Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Tue Dec 26 19:00:17 2017 GMT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 24) seconds. Offline data collection capabilities: (0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 476) minutes. SCT capabilities: (0x003d) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 65536 2 Throughput_Performance 0x0005 135 135 054 Pre-fail Offline - 86 3 Spin_Up_Time 0x0007 132 132 024 Pre-fail Always - 558 (Average 615) 4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 24 5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0 7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0 8 Seek_Time_Performance 0x0005 123 123 020 Pre-fail Offline - 31 9 Power_On_Hours 0x0012 097 097 000 Old_age Always - 21367 10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 24 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 104 193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 104 194 Temperature_Celsius 0x0002 142 142 000 Old_age Always - 42 (Min/Max 19/45) 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 21356 - # 2 Short offline Completed without error 00% 21332 - # 3 Short offline Completed without error 00% 21308 - # 4 Short offline Completed without error 00% 21284 - # 5 Short offline Completed without error 00% 21260 - # 6 Short offline Completed without error 00% 21236 - # 7 Short offline Completed without error 00% 21212 - # 8 Short offline Completed without error 00% 21188 - # 9 Short offline Completed without error 00% 21164 - #10 Short offline Completed without error 00% 21140 - #11 Short offline Completed without error 00% 21116 - #12 Short offline Completed without error 00% 21092 - #13 Short offline Completed without error 00% 21068 - #14 Short offline Completed without error 00% 21044 - #15 Short offline Completed without error 00% 21020 - #16 Short offline Completed without error 00% 20996 - #17 Short offline Completed without error 00% 20972 - #18 Short offline Completed without error 00% 20948 - #19 Short offline Completed without error 00% 20924 - #20 Short offline Completed without error 00% 20900 - #21 Short offline Completed without error 00% 20876 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. root@freenas:/nonexistent # smartctl -a /dev/ada2 smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Hitachi Deskstar 7K3000 Device Model: Hitachi HDS723030ALA640 Serial Number: MK0371YHHV9N0A LU WWN Device Id: 5 000cca 225da0d5c Firmware Version: MKAOAA10 User Capacity: 3,000,592,982,016 bytes [3.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate: 7200 rpm Form Factor: 3.5 inches Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Tue Dec 26 19:00:20 2017 GMT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 24) seconds. Offline data collection capabilities: (0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 480) minutes. SCT capabilities: (0x003d) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000b 032 032 016 Pre-fail Always - 196607 2 Throughput_Performance 0x0005 135 135 054 Pre-fail Offline - 84 3 Spin_Up_Time 0x0007 136 136 024 Pre-fail Always - 509 (Average 629) 4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 28 5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0 7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0 8 Seek_Time_Performance 0x0005 123 123 020 Pre-fail Offline - 31 9 Power_On_Hours 0x0012 097 097 000 Old_age Always - 21367 10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 24 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 133 193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 133 194 Temperature_Celsius 0x0002 120 120 000 Old_age Always - 50 (Min/Max 19/50) 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 19436 - # 2 Short offline Completed without error 00% 19412 - # 3 Short offline Completed without error 00% 19388 - # 4 Short offline Completed without error 00% 19364 - # 5 Short offline Completed without error 00% 19325 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. root@freenas:/nonexistent # smartctl -a /dev/ada3 smartctl 6.5 2016-05-07 r4318 [FreeBSD 11.0-STABLE amd64] (local build) Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org === START OF INFORMATION SECTION === Model Family: Hitachi Deskstar 7K3000 Device Model: Hitachi HDS723030ALA640 Serial Number: MK0371YVJD3Z7A LU WWN Device Id: 5 000cca 234e1b262 Firmware Version: MKAOAA10 User Capacity: 3,000,592,982,016 bytes [3.00 TB] Sector Size: 512 bytes logical/physical Rotation Rate: 7200 rpm Form Factor: 3.5 inches Device is: In smartctl database [for details use: -P show] ATA Version is: ATA8-ACS T13/1699-D revision 4 SATA Version is: SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Tue Dec 26 19:00:24 2017 GMT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 24) seconds. Offline data collection capabilities: (0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 457) minutes. SCT capabilities: (0x003d) SCT Status supported. SCT Error Recovery Control supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0 2 Throughput_Performance 0x0005 135 135 054 Pre-fail Offline - 84 3 Spin_Up_Time 0x0007 133 133 024 Pre-fail Always - 517 (Average 645) 4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 22 5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 17 7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0 8 Seek_Time_Performance 0x0005 123 123 020 Pre-fail Offline - 31 9 Power_On_Hours 0x0012 097 097 000 Old_age Always - 21027 10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 22 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 312 193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 312 194 Temperature_Celsius 0x0002 113 113 000 Old_age Always - 53 (Min/Max 23/61) 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 17 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 18984 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. root@freenas:/nonexistent #
All drives "Pass" SMART but i see read errors on ada1 and ada2 (ada2 is the only drive mentioned in the reporting as having issues.
So daft questions time, should i be replacing both of these ASAP? Why are they showing as Passed? Should i remove these from the pool for now and order new drives or just order new drives then replace?
Thanks