trey22
Dabbler
- Joined
- Apr 11, 2013
- Messages
- 28
System specs:
FreeNAS 11.3 / Intel G3220 3.0GHz / Kingston 2 x 8GB ECC / ASRock E3C224D2I / 4 x Toshiba 2TB (RAIDZ2), A-Data 32GB USB / Corsair CX430 / Fractal Design Node 304 / APC BR1000G Pro UPS
Drives are from 2013/2014.
According to the alert reports, two of my four drives have issues:
1. Device: /dev/ada0, 1 Offline uncorrectable sectors
Device: /dev/ada0, 7 Offline uncorrectable sectors
Device: /dev/ada0, 25 Offline uncorrectable sectors
2. Device: /dev/ada2, ATA error count increased from 178 to 179
Device: /dev/ada2, Read SMART Error Log Failed.
Device: /dev/ada2, not capable of SMART self-check.
SMART tests are set up as follows:
Output of drives w/ errors:
Code:
Warning: settings changed through the CLI are not written to
the configuration database and will be reset on reboot.

root@freenas:~ # smartctl -a /dev/ada0
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" DT01ACA... Desktop HDD
Device Model:     TOSHIBA DT01ACA200
Serial Number:    Y2Q4WR6AS
LU WWN Device Id: 5 000039 ff3c23987
Firmware Version: MX4OABB0
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Jun 11 07:25:22 2021 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (15683) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 262) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   077   077   016    Pre-fail  Always       -       11732063
  2 Throughput_Performance  0x0005   139   139   054    Pre-fail  Offline      -       72
  3 Spin_Up_Time            0x0007   126   126   024    Pre-fail  Always       -       299 (Average 300)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       279
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       2
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
  9 Power_On_Hours          0x0012   091   091   000    Old_age   Always       -       64326
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       279
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       737
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       737
194 Temperature_Celsius     0x0002   162   162   000    Old_age   Always       -       37 (Min/Max 12/42)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       2
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       96
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 70 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 70 occurred at disk power-on lifetime: 64085 hours (2670 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 88 08 9d 10 06  Error: UNC at LBA = 0x06109d08 = 101752072

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 80 90 a8 10 40 00   4d+15:01:34.809  READ FPDMA QUEUED
  60 00 78 90 a7 10 40 00   4d+15:01:34.809  READ FPDMA QUEUED
  60 00 70 90 a6 10 40 00   4d+15:01:34.809  READ FPDMA QUEUED
  60 00 68 90 a5 10 40 00   4d+15:01:34.809  READ FPDMA QUEUED
  60 00 60 90 a4 10 40 00   4d+15:01:34.809  READ FPDMA QUEUED

Error 69 occurred at disk power-on lifetime: 64085 hours (2670 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 88 08 9d 10 06  Error: UNC at LBA = 0x06109d08 = 101752072

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 10 90 a8 10 40 00   4d+15:01:30.966  READ FPDMA QUEUED
  60 00 08 90 a7 10 40 00   4d+15:01:30.966  READ FPDMA QUEUED
  60 00 00 90 a6 10 40 00   4d+15:01:30.966  READ FPDMA QUEUED
  60 00 f8 90 a5 10 40 00   4d+15:01:30.966  READ FPDMA QUEUED
  60 00 f0 90 a4 10 40 00   4d+15:01:30.966  READ FPDMA QUEUED

Error 68 occurred at disk power-on lifetime: 64085 hours (2670 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 88 08 9d 10 06  Error: UNC at LBA = 0x06109d08 = 101752072

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 a0 90 a8 10 40 00   4d+15:01:27.123  READ FPDMA QUEUED
  60 00 98 90 a7 10 40 00   4d+15:01:27.123  READ FPDMA QUEUED
  60 00 90 90 a6 10 40 00   4d+15:01:27.123  READ FPDMA QUEUED
  60 00 88 90 a5 10 40 00   4d+15:01:27.123  READ FPDMA QUEUED
  60 00 80 90 a4 10 40 00   4d+15:01:27.123  READ FPDMA QUEUED

Error 67 occurred at disk power-on lifetime: 64085 hours (2670 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 88 08 9d 10 06  Error: UNC at LBA = 0x06109d08 = 101752072

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 30 90 a8 10 40 00   4d+15:01:23.280  READ FPDMA QUEUED
  60 00 28 90 a7 10 40 00   4d+15:01:23.280  READ FPDMA QUEUED
  60 00 20 90 a6 10 40 00   4d+15:01:23.280  READ FPDMA QUEUED
  60 00 18 90 a5 10 40 00   4d+15:01:23.280  READ FPDMA QUEUED
  60 00 10 90 a4 10 40 00   4d+15:01:23.280  READ FPDMA QUEUED

Error 66 occurred at disk power-on lifetime: 64085 hours (2670 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 88 08 9d 10 06  Error: UNC at LBA = 0x06109d08 = 101752072

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 c0 90 a8 10 40 00   4d+15:01:19.452  READ FPDMA QUEUED
  60 00 b8 90 a7 10 40 00   4d+15:01:19.452  READ FPDMA QUEUED
  60 00 b0 90 a6 10 40 00   4d+15:01:19.452  READ FPDMA QUEUED
  60 00 a8 90 a5 10 40 00   4d+15:01:19.451  READ FPDMA QUEUED
  60 00 a0 90 a4 10 40 00   4d+15:01:19.451  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     57640         -
# 2  Short offline       Completed without error       00%     57634         -
# 3  Short offline       Completed without error       00%     57466         -
# 4  Extended offline    Completed without error       00%     57304         -
# 5  Short offline       Completed without error       00%     57298         -
# 6  Short offline       Completed without error       00%     57131         -
# 7  Short offline       Completed without error       00%     57058         -
# 8  Extended offline    Completed without error       00%     56895         -
# 9  Short offline       Completed without error       00%     56890         -
#10  Short offline       Completed without error       00%     56758         -
#11  Extended offline    Completed without error       00%     56593         -
#12  Short offline       Completed without error       00%     56588         -
#13  Short offline       Completed without error       00%     56422         -
#14  Short offline       Completed without error       00%     56373         -
#15  Extended offline    Completed without error       00%     56211         -
#16  Short offline       Completed without error       00%     56205         -
#17  Short offline       Completed without error       00%     56038         -
#18  Extended offline    Completed without error       00%     55875         -
#19  Short offline       Completed without error       00%     55869         -
#20  Short offline       Completed without error       00%     55712         -
#21  Short offline       Completed without error       00%     55639         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
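As a cross-check on the error log, the "UNC at LBA" value in each entry appears to be assembled from the register bytes printed next to it. A minimal Python sketch, assuming the 28-bit layout (low nibble of DH, then CH, CL, SN); the function name is made up for illustration and is not part of smartctl:

```python
def lba_from_registers(dh: str, ch: str, cl: str, sn: str) -> int:
    """Rebuild a 28-bit LBA from the hex register bytes in a SMART error entry.

    Assumed layout (not taken from the post): bits 27-24 come from the low
    nibble of DH, followed by CH, CL and SN as the lower three bytes.
    """
    return (
        ((int(dh, 16) & 0x0F) << 24)
        | (int(ch, 16) << 16)
        | (int(cl, 16) << 8)
        | int(sn, 16)
    )


# Registers from the ada0 errors: SN=08 CL=9d CH=10 DH=06
lba = lba_from_registers("06", "10", "9d", "08")
print(f"0x{lba:08x} = {lba}")  # 0x06109d08 = 101752072
```

The result matches the LBA that smartctl reports for errors 66 through 70, so all five logged errors on ada0 are reads of the same sector.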
Code:
root@freenas:~ # smartctl -a /dev/ada2
smartctl 7.0 2018-12-30 r4883 [FreeBSD 11.3-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 3.5" DT01ACA... Desktop HDD
Device Model:     TOSHIBA DT01ACA200
Serial Number:    23ON4HAGS
LU WWN Device Id: 5 000039 ff3c9284b
Firmware Version: MX4OABB0
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Jun 11 07:42:25 2021 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
See vendor-specific Attribute list for failed Attributes.

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (15204) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 254) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   076   076   016    Pre-fail  Always       -       10683924
  2 Throughput_Performance  0x0005   139   139   054    Pre-fail  Offline      -       72
  3 Spin_Up_Time            0x0007   128   128   024    Pre-fail  Always       -       296 (Average 296)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       109
  5 Reallocated_Sector_Ct   0x0033   001   001   005    Pre-fail  Always   FAILING_NOW 2004
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
  9 Power_On_Hours          0x0012   092   092   000    Old_age   Always       -       60106
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       109
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       533
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       533
194 Temperature_Celsius     0x0002   157   157   000    Old_age   Always       -       38 (Min/Max 11/47)
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       2257
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 302 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 302 occurred at disk power-on lifetime: 59867 hours (2494 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 98 70 b7 42 06  Error: UNC at LBA = 0x0642b770 = 105035632

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 f8 80 10 bf 42 40 00   4d+16:49:42.622  READ FPDMA QUEUED
  60 00 78 10 be 42 40 00   4d+16:49:42.622  READ FPDMA QUEUED
  60 00 70 10 bd 42 40 00   4d+16:49:42.622  READ FPDMA QUEUED
  60 00 68 10 bc 42 40 00   4d+16:49:42.622  READ FPDMA QUEUED
  60 00 60 10 bb 42 40 00   4d+16:49:42.622  READ FPDMA QUEUED

Error 301 occurred at disk power-on lifetime: 59867 hours (2494 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 88 18 38 41 06  Error: UNC at LBA = 0x06413818 = 104937496

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 f8 30 98 41 41 40 00   4d+16:49:12.434  READ FPDMA QUEUED
  60 00 28 98 40 41 40 00   4d+16:49:12.434  READ FPDMA QUEUED
  60 00 20 98 3f 41 40 00   4d+16:49:12.434  READ FPDMA QUEUED
  60 00 18 98 3e 41 40 00   4d+16:49:12.434  READ FPDMA QUEUED
  60 00 10 98 3d 41 40 00   4d+16:49:12.434  READ FPDMA QUEUED

Error 300 occurred at disk power-on lifetime: 59867 hours (2494 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 88 18 38 41 06  Error: UNC at LBA = 0x06413818 = 104937496

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 f8 d0 98 41 41 40 00   4d+16:49:08.698  READ FPDMA QUEUED
  60 00 c8 98 40 41 40 00   4d+16:49:08.698  READ FPDMA QUEUED
  60 00 c0 98 3f 41 40 00   4d+16:49:08.698  READ FPDMA QUEUED
  60 00 b8 98 3e 41 40 00   4d+16:49:08.698  READ FPDMA QUEUED
  60 00 b0 98 3d 41 40 00   4d+16:49:08.698  READ FPDMA QUEUED

Error 299 occurred at disk power-on lifetime: 59465 hours (2477 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 40 78 06 b2 08  Error: UNC at LBA = 0x08b20678 = 145884792

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 70 80 48 06 b2 40 00      10:09:27.740  READ FPDMA QUEUED
  61 18 78 38 ff af 40 00      10:09:27.740  WRITE FPDMA QUEUED
  2f 00 01 10 00 00 00 00      10:09:27.739  READ LOG EXT
  60 70 68 48 06 b2 40 00      10:09:15.538  READ FPDMA QUEUED
  61 18 60 38 ff af 40 00      10:09:15.538  WRITE FPDMA QUEUED

Error 298 occurred at disk power-on lifetime: 59465 hours (2477 days + 17 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 40 78 06 b2 08  Error: UNC at LBA = 0x08b20678 = 145884792

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 70 68 48 06 b2 40 00      10:09:15.538  READ FPDMA QUEUED
  61 18 60 38 ff af 40 00      10:09:15.538  WRITE FPDMA QUEUED
  ef 02 00 00 00 00 40 00      10:09:15.538  SET FEATURES [Enable write cache]
  ef aa 00 00 00 00 40 00      10:09:15.537  SET FEATURES [Enable read look-ahead]
  c6 00 10 00 00 00 40 00      10:09:15.537  SET MULTIPLE MODE

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      80%     53417         -
# 2  Short offline       Completed without error       00%     53415         -
# 3  Short offline       Interrupted (host reset)      50%     53248         -
# 4  Extended offline    Interrupted (host reset)      80%     53082         -
# 5  Short offline       Completed without error       00%     53079         -
# 6  Short offline       Completed without error       00%     52912         -
# 7  Short offline       Completed without error       00%     52839         -
# 8  Extended offline    Completed without error       00%     52676         -
# 9  Short offline       Completed without error       00%     52671         -
#10  Short offline       Completed without error       00%     52538         -
#11  Extended offline    Completed without error       00%     52374         -
#12  Short offline       Completed without error       00%     52369         -
#13  Short offline       Completed without error       00%     52203         -
#14  Short offline       Completed without error       00%     52154         -
#15  Extended offline    Completed without error       00%     51991         -
#16  Short offline       Completed without error       00%     51986         -
#17  Short offline       Completed without error       00%     51819         -
#18  Extended offline    Completed without error       00%     51655         -
#19  Short offline       Completed without error       00%     51650         -
#20  Short offline       Completed without error       00%     51492         -
#21  Short offline       Completed without error       00%     51419         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
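To make the relevant attribute rows easier to compare between the two drives, here is a rough Python sketch that scans smartctl attribute rows and reports anything marked in the WHEN_FAILED column, plus non-zero raw values for the reallocation and pending-sector counters. The watched IDs (5, 196, 197, 198) and the function name are my own assumptions for illustration, not a FreeNAS or smartmontools convention; the sample rows are copied from the two outputs above.

```python
from typing import List

# Attribute IDs assumed worth watching (illustrative choice, not an official list)
WATCHED_IDS = {5, 196, 197, 198}


def flag_attributes(table: str) -> List[str]:
    """Return findings for rows of a smartctl attribute table.

    A row is flagged if WHEN_FAILED is anything other than "-", or if it is
    one of the watched IDs and its raw value is a non-zero integer.
    """
    findings = []
    for line in table.strip().splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least 10 columns
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name = int(fields[0]), fields[1]
        when_failed, raw = fields[8], fields[9]
        if when_failed != "-":
            findings.append(f"{name}: {when_failed}")
        elif attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            findings.append(f"{name}: raw value {raw}")
    return findings


ADA0_ROWS = """
  5 Reallocated_Sector_Ct   0x0033 100 100 005 Pre-fail Always  - 2
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age  Always  - 2
197 Current_Pending_Sector  0x0022 100 100 000 Old_age  Always  - 96
198 Offline_Uncorrectable   0x0008 100 100 000 Old_age  Offline - 0
"""

ADA2_ROWS = """
  5 Reallocated_Sector_Ct   0x0033 001 001 005 Pre-fail Always  FAILING_NOW 2004
196 Reallocated_Event_Count 0x0032 001 001 000 Old_age  Always  - 2257
197 Current_Pending_Sector  0x0022 100 100 000 Old_age  Always  - 0
198 Offline_Uncorrectable   0x0008 100 100 000 Old_age  Offline - 0
"""

for drive, rows in (("ada0", ADA0_ROWS), ("ada2", ADA2_ROWS)):
    print(drive, flag_attributes(rows))
```

For these rows the sketch lists the 2 reallocated sectors, 2 reallocation events and 96 pending sectors on ada0, and the FAILING_NOW reallocated-sector count plus 2257 reallocation events on ada2.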
What other info do you need to help troubleshoot? Or is this simply a case of two bad drives that need to be replaced?