FreeNAS critical errors on /dev/da4

risho

Dabbler
Joined
May 21, 2016
Messages
18
  • CRITICAL: May 21, 2016, 7:20 a.m. - Device: /dev/da1 [SAT], 1 Currently unreadable (pending) sectors
  • CRITICAL: May 21, 2016, 7:20 a.m. - Device: /dev/da1 [SAT], Self-Test Log error count increased from 0 to 1
  • CRITICAL: May 21, 2016, 12:20 a.m. - Device: /dev/da4 [SAT], ATA error count increased from 4 to 5
  • CRITICAL: May 21, 2016, 8:32 p.m. - Device: /dev/da4 [SAT], ATA error count increased from 533 to 543
So I've gotten those errors from my FreeNAS dropdown menu. I just set up FreeNAS. These drives are from 2011 and have been very active in my NAS since I got them. The NAS was running RAID 6 on Linux until a few days ago, when I set up FreeNAS. There are 7 drives: 5 of them are old Hitachis from 2011 and 2 of them are WD Reds I got in the past couple of months.


da1
Code:
smartctl -a /dev/da1
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Hitachi Deskstar 5K3000
Device Model:  Hitachi HDS5C3030ALA630
Serial Number:  MJ1311YNG35Y3A
LU WWN Device Id: 5 000cca 228c17368
Firmware Version: MEAOA5C0
User Capacity:  3,000,592,982,016 bytes [3.00 TB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  5700 rpm
Form Factor:  3.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Sun May 22 13:54:11 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)   Offline data collection activity
           was suspended by an interrupting command from host.
           Auto Offline Data Collection: Enabled.
Self-test execution status:  ( 118)   The previous self-test completed having
           the read element of the test failed.
Total time to complete Offline
data collection:      (37866) seconds.
Offline data collection
capabilities:         (0x5b) SMART execute Offline immediate.
           Auto Offline data collection on/off support.
           Suspend Offline collection upon new
           command.
           Offline surface scan supported.
           Self-test supported.
           No Conveyance Self-test supported.
           Selective Self-test supported.
SMART capabilities:  (0x0003)   Saves SMART data before entering
           power-saving mode.
           Supports SMART auto save timer.
Error logging capability:  (0x01)   Error logging supported.
           General Purpose Logging supported.
Short self-test routine
recommended polling time:     (  1) minutes.
Extended self-test routine
recommended polling time:     ( 631) minutes.
SCT capabilities:     (0x003d)   SCT Status supported.
           SCT Error Recovery Control supported.
           SCT Feature Control supported.
           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  1
  2 Throughput_Performance  0x0005  135  135  054  Pre-fail  Offline  -  106
  3 Spin_Up_Time  0x0007  147  147  024  Pre-fail  Always  -  557 (Average 395)
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  320
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  130  130  020  Pre-fail  Offline  -  33
  9 Power_On_Hours  0x0012  095  095  000  Old_age  Always  -  39825
 10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  320
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  945
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  945
194 Temperature_Celsius  0x0002  166  166  000  Old_age  Always  -  36 (Min/Max 15/62)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  1
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed: read failure  60%  39794  2147605335
# 2  Extended offline  Completed without error  00%  2615  -
# 3  Short offline  Completed without error  00%  2578  -
# 4  Short offline  Completed without error  00%  1262  -
# 5  Short offline  Completed without error  00%  0  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


da2
Code:
smartctl -a /dev/da2
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Hitachi Deskstar 5K3000
Device Model:  Hitachi HDS5C3030ALA630
Serial Number:  MJ1311YNG3TWYA
LU WWN Device Id: 5 000cca 228c1ba97
Firmware Version: MEAOA5C0
User Capacity:  3,000,592,982,016 bytes [3.00 TB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  5700 rpm
Form Factor:  3.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Sun May 22 13:54:38 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)   Offline data collection activity
           was suspended by an interrupting command from host.
           Auto Offline Data Collection: Enabled.
Self-test execution status:  (  0)   The previous self-test routine completed
           without error or no self-test has ever
           been run.
Total time to complete Offline
data collection:      (36967) seconds.
Offline data collection
capabilities:         (0x5b) SMART execute Offline immediate.
           Auto Offline data collection on/off support.
           Suspend Offline collection upon new
           command.
           Offline surface scan supported.
           Self-test supported.
           No Conveyance Self-test supported.
           Selective Self-test supported.
SMART capabilities:  (0x0003)   Saves SMART data before entering
           power-saving mode.
           Supports SMART auto save timer.
Error logging capability:  (0x01)   Error logging supported.
           General Purpose Logging supported.
Short self-test routine
recommended polling time:     (  1) minutes.
Extended self-test routine
recommended polling time:     ( 616) minutes.
SCT capabilities:     (0x003d)   SCT Status supported.
           SCT Error Recovery Control supported.
           SCT Feature Control supported.
           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  0
  2 Throughput_Performance  0x0005  134  134  054  Pre-fail  Offline  -  112
  3 Spin_Up_Time  0x0007  146  146  024  Pre-fail  Always  -  397 (Average 561)
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  322
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  132  132  020  Pre-fail  Offline  -  32
  9 Power_On_Hours  0x0012  095  095  000  Old_age  Always  -  39818
 10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  322
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  934
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  934
194 Temperature_Celsius  0x0002  181  181  000  Old_age  Always  -  33 (Min/Max 15/61)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  39795  -
# 2  Extended offline  Completed without error  00%  2615  -
# 3  Short offline  Completed without error  00%  2577  -
# 4  Short offline  Completed without error  00%  1261  -
# 5  Short offline  Completed without error  00%  8  -
# 6  Short offline  Completed without error  00%  8  -
# 7  Short offline  Completed without error  00%  0  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


da4

Code:
smartctl -a /dev/da4
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Hitachi Deskstar 5K3000
Device Model:  Hitachi HDS5C3030ALA630
Serial Number:  MJ1311YNG3VG3A
LU WWN Device Id: 5 000cca 228c1c06c
Firmware Version: MEAOA5C0
User Capacity:  3,000,592,982,016 bytes [3.00 TB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  5700 rpm
Form Factor:  3.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Sun May 22 13:55:44 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)   Offline data collection activity
           was suspended by an interrupting command from host.
           Auto Offline Data Collection: Enabled.
Self-test execution status:  (  41)   The self-test routine was interrupted
           by the host with a hard or soft reset.
Total time to complete Offline
data collection:      (36368) seconds.
Offline data collection
capabilities:         (0x5b) SMART execute Offline immediate.
           Auto Offline data collection on/off support.
           Suspend Offline collection upon new
           command.
           Offline surface scan supported.
           Self-test supported.
           No Conveyance Self-test supported.
           Selective Self-test supported.
SMART capabilities:  (0x0003)   Saves SMART data before entering
           power-saving mode.
           Supports SMART auto save timer.
Error logging capability:  (0x01)   Error logging supported.
           General Purpose Logging supported.
Short self-test routine
recommended polling time:     (  1) minutes.
Extended self-test routine
recommended polling time:     ( 606) minutes.
SCT capabilities:     (0x003d)   SCT Status supported.
           SCT Error Recovery Control supported.
           SCT Feature Control supported.
           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  0
  2 Throughput_Performance  0x0005  136  136  054  Pre-fail  Offline  -  104
  3 Spin_Up_Time  0x0007  147  147  024  Pre-fail  Always  -  392 (Average 556)
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  329
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  132  132  020  Pre-fail  Offline  -  32
  9 Power_On_Hours  0x0012  095  095  000  Old_age  Always  -  39856
 10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  329
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  965
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  965
194 Temperature_Celsius  0x0002  187  187  000  Old_age  Always  -  32 (Min/Max 16/62)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  765

SMART Error Log Version: 1
ATA Error Count: 766 (device log contains only the most recent five errors)
   CR = Command Register [HEX]
   FR = Features Register [HEX]
   SC = Sector Count Register [HEX]
   SN = Sector Number Register [HEX]
   CL = Cylinder Low Register [HEX]
   CH = Cylinder High Register [HEX]
   DH = Device/Head Register [HEX]
   DC = Device Command Register [HEX]
   ER = Error register [HEX]
   ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 766 occurred at disk power-on lifetime: 39856 hours (1660 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 01 f7 6e f7 04  Error: ICRC, ABRT at LBA = 0x04f76ef7 = 83324663

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 20 00 d8 6e f7 40 00  2d+07:23:29.694  WRITE FPDMA QUEUED
  61 10 00 b8 6e f7 40 00  2d+07:23:29.693  WRITE FPDMA QUEUED
  61 20 00 b0 ab 63 40 00  2d+07:23:29.693  WRITE FPDMA QUEUED
  61 10 00 90 ab 63 40 00  2d+07:23:29.693  WRITE FPDMA QUEUED
  61 20 00 b0 03 18 40 00  2d+07:23:29.693  WRITE FPDMA QUEUED

Error 765 occurred at disk power-on lifetime: 39856 hours (1660 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 08 18 a0 63 07  Error: ICRC, ABRT at LBA = 0x0763a018 = 123969560

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 38 00 e8 9f 63 40 00  2d+07:21:29.881  WRITE FPDMA QUEUED
  ef 10 02 00 00 00 00 00  2d+07:21:29.775  SET FEATURES [Enable SATA feature]
  ef 02 00 00 00 00 00 00  2d+07:21:29.775  SET FEATURES [Enable write cache]
  ef aa 00 00 00 00 00 00  2d+07:21:29.775  SET FEATURES [Enable read look-ahead]
  ef 03 46 00 00 00 00 00  2d+07:21:29.775  SET FEATURES [Set transfer mode]

Error 764 occurred at disk power-on lifetime: 39856 hours (1660 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 08 18 a0 63 07  Error: ICRC, ABRT at LBA = 0x0763a018 = 123969560

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 38 00 e8 9f 63 40 00  2d+07:21:29.661  WRITE FPDMA QUEUED
  61 08 00 80 a0 2b 40 00  2d+07:21:29.660  WRITE FPDMA QUEUED
  61 18 00 a0 f7 17 40 00  2d+07:21:29.660  WRITE FPDMA QUEUED
  61 08 00 98 f7 17 40 00  2d+07:21:29.660  WRITE FPDMA QUEUED
  61 10 00 a0 68 f7 40 00  2d+07:21:29.660  WRITE FPDMA QUEUED

Error 763 occurred at disk power-on lifetime: 39855 hours (1660 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 e0 c0 2b 63 07  Error: ICRC, ABRT at LBA = 0x07632bc0 = 123939776

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 00 a0 2b 63 40 00  2d+07:00:52.041  WRITE FPDMA QUEUED
  61 40 00 98 96 2b 40 00  2d+07:00:52.040  WRITE FPDMA QUEUED
  61 58 00 b8 7f 17 40 00  2d+07:00:52.040  WRITE FPDMA QUEUED
  61 20 00 98 7f 17 40 00  2d+07:00:52.040  WRITE FPDMA QUEUED
  61 28 00 70 7f 17 40 00  2d+07:00:52.040  WRITE FPDMA QUEUED

Error 762 occurred at disk power-on lifetime: 39855 hours (1660 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 08 b8 0e f7 04  Error: ICRC, ABRT at LBA = 0x04f70eb8 = 83300024

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 18 00 a8 0e f7 40 00  2d+06:50:58.353  WRITE FPDMA QUEUED
  61 08 00 98 0e f7 40 00  2d+06:50:58.353  WRITE FPDMA QUEUED
  61 08 00 88 0e f7 40 00  2d+06:50:58.353  WRITE FPDMA QUEUED
  61 08 00 78 0e f7 40 00  2d+06:50:58.352  WRITE FPDMA QUEUED
  61 08 00 80 af d4 40 00  2d+06:50:58.352  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Interrupted (host reset)  90%  39823  -
# 2  Extended offline  Completed without error  00%  2628  -
# 3  Short offline  Completed without error  00%  2618  -
# 4  Short offline  Completed without error  00%  2618  -
# 5  Short offline  Completed without error  00%  2597  -
# 6  Short offline  Completed without error  00%  1282  -
# 7  Short offline  Completed without error  00%  21  -
# 8  Short offline  Aborted by host  50%  21  -
# 9  Short offline  Completed without error  00%  12  -
#10  Short offline  Completed without error  00%  12  -
#11  Short offline  Completed without error  00%  0  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

da5

Code:
smartctl -a /dev/da5
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Hitachi Deskstar 5K3000
Device Model:  Hitachi HDS5C3030ALA630
Serial Number:  MJ1311YNG3V0SA
LU WWN Device Id: 5 000cca 228c1bece
Firmware Version: MEAOA5C0
User Capacity:  3,000,592,982,016 bytes [3.00 TB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  5700 rpm
Form Factor:  3.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Sun May 22 13:59:54 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)   Offline data collection activity
           was suspended by an interrupting command from host.
           Auto Offline Data Collection: Enabled.
Self-test execution status:  (  0)   The previous self-test routine completed
           without error or no self-test has ever
           been run.
Total time to complete Offline
data collection:      (38469) seconds.
Offline data collection
capabilities:         (0x5b) SMART execute Offline immediate.
           Auto Offline data collection on/off support.
           Suspend Offline collection upon new
           command.
           Offline surface scan supported.
           Self-test supported.
           No Conveyance Self-test supported.
           Selective Self-test supported.
SMART capabilities:  (0x0003)   Saves SMART data before entering
           power-saving mode.
           Supports SMART auto save timer.
Error logging capability:  (0x01)   Error logging supported.
           General Purpose Logging supported.
Short self-test routine
recommended polling time:     (  1) minutes.
Extended self-test routine
recommended polling time:     ( 641) minutes.
SCT capabilities:     (0x003d)   SCT Status supported.
           SCT Error Recovery Control supported.
           SCT Feature Control supported.
           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  0
  2 Throughput_Performance  0x0005  133  133  054  Pre-fail  Offline  -  113
  3 Spin_Up_Time  0x0007  144  144  024  Pre-fail  Always  -  403 (Average 568)
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  329
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  132  132  020  Pre-fail  Offline  -  32
  9 Power_On_Hours  0x0012  095  095  000  Old_age  Always  -  39844
 10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  329
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  943
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  943
194 Temperature_Celsius  0x0002  181  181  000  Old_age  Always  -  33 (Min/Max 15/62)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  39822  -
# 2  Extended offline  Completed without error  00%  2627  -
# 3  Short offline  Completed without error  00%  2617  -
# 4  Short offline  Completed without error  00%  2588  -
# 5  Short offline  Aborted by host  90%  2588  -
# 6  Short offline  Completed without error  00%  1272  -
# 7  Short offline  Completed without error  00%  20  -
# 8  Short offline  Completed without error  00%  12  -
# 9  Short offline  Completed without error  00%  0  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

da6

Code:
smartctl -a /dev/da6
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Hitachi Deskstar 5K3000
Device Model:  Hitachi HDS5C3030ALA630
Serial Number:  MJ1311YNG3757A
LU WWN Device Id: 5 000cca 228c17806
Firmware Version: MEAOA5C0
User Capacity:  3,000,592,982,016 bytes [3.00 TB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  5700 rpm
Form Factor:  3.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Sun May 22 14:00:40 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)   Offline data collection activity
           was suspended by an interrupting command from host.
           Auto Offline Data Collection: Enabled.
Self-test execution status:  (  0)   The previous self-test routine completed
           without error or no self-test has ever
           been run.
Total time to complete Offline
data collection:      (36667) seconds.
Offline data collection
capabilities:         (0x5b) SMART execute Offline immediate.
           Auto Offline data collection on/off support.
           Suspend Offline collection upon new
           command.
           Offline surface scan supported.
           Self-test supported.
           No Conveyance Self-test supported.
           Selective Self-test supported.
SMART capabilities:  (0x0003)   Saves SMART data before entering
           power-saving mode.
           Supports SMART auto save timer.
Error logging capability:  (0x01)   Error logging supported.
           General Purpose Logging supported.
Short self-test routine
recommended polling time:     (  1) minutes.
Extended self-test routine
recommended polling time:     ( 611) minutes.
SCT capabilities:     (0x003d)   SCT Status supported.
           SCT Error Recovery Control supported.
           SCT Feature Control supported.
           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  0
  2 Throughput_Performance  0x0005  136  136  054  Pre-fail  Offline  -  103
  3 Spin_Up_Time  0x0007  146  146  024  Pre-fail  Always  -  395 (Average 561)
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  322
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  130  130  020  Pre-fail  Offline  -  33
  9 Power_On_Hours  0x0012  095  095  000  Old_age  Always  -  39853
 10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  322
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  962
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  962
194 Temperature_Celsius  0x0002  181  181  000  Old_age  Always  -  33 (Min/Max 15/61)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  39829  -
# 2  Extended offline  Completed without error  00%  2632  -
# 3  Short offline  Completed without error  00%  2595  -
# 4  Short offline  Aborted by host  90%  2595  -
# 5  Short offline  Completed without error  00%  2595  -
# 6  Short offline  Completed without error  00%  1279  -
# 7  Short offline  Completed without error  00%  17  -
# 8  Short offline  Completed without error  00%  0  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


I have no idea how to parse this. From what I can tell, it says everything passes, but what about those errors? Should I be worried?
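
As a rough aid for reading the da4 error entries above, here is a minimal Python sketch that decodes one register line from the log. It assumes the column order shown in the log (ER ST SC SN CL CH DH) and the 28-bit LBA layout that matches the LBA smartctl prints next to each entry. The `decode` helper and the `ERROR_BITS` table are illustrative names, not part of smartmontools, and the table lists only a few of the error-register bits.

```python
# Register bytes copied from "Error 766" in the da4 report (ER ST SC SN CL CH DH).
ERROR_766 = "84 51 01 f7 6e f7 04"

# A few ATA error-register bits; this table is illustrative and not exhaustive.
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode(register_line):
    """Return (error flag names, LBA) for one 'ER ST SC SN CL CH DH' line."""
    er, st, sc, sn, cl, ch, dh = (int(b, 16) for b in register_line.split())
    flags = [name for bit, name in ERROR_BITS.items() if er & bit]
    # 28-bit LBA: low nibble of DH, then CH, CL and SN from high to low byte.
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    return flags, lba


flags, lba = decode(ERROR_766)
print(flags, hex(lba), lba)
```

ICRC means the data was corrupted in transit over the SATA link, and ABRT means the drive aborted the command as a result. Neither flag reports a defect on the platters.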
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Waaaaaay too few long tests on those drives. Definitely fix that ASAP. In any case:
  • HOLY FSCKING CRAP, those drives have seen 60+ degrees Celsius! That is insane and immensely dangerous. Consider replacing them all ASAP. Or at least have replacements on hand and be prepared to quickly replace several drives.
  • da1 is failing SMART tests. This is generally viewed as "drive is a goner, replace ASAP".
  • da2 looks fine, other than the temperature history.
  • da4 probably has a bad cable or other interface issue. Start by replacing the cable, narrow it down from there.
  • da5 and da6 look as good as da2.
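
The checks in the list above can be sketched in Python, for anyone who wants to repeat the triage on other drives. This is only an illustration: it parses attribute rows pasted from `smartctl -a` output, the 60 C limit is an assumed threshold taken from the "60+ degrees" remark, and `triage`, `ROW` and `MAX_TEMP_LIMIT` are hypothetical names.

```python
import re

# Attribute rows copied from the da1 and da4 reports earlier in the thread.
SAMPLE_DA1 = """\
194 Temperature_Celsius  0x0002  166  166  000  Old_age  Always  -  36 (Min/Max 15/62)
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  1
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  0
"""
SAMPLE_DA4 = """\
194 Temperature_Celsius  0x0002  187  187  000  Old_age  Always  -  32 (Min/Max 16/62)
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  765
"""

# Columns: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(.*)$"
)

MAX_TEMP_LIMIT = 60  # assumed limit in Celsius, not a vendor specification


def triage(report):
    """Return a list of human-readable findings for one pasted attribute table."""
    findings = []
    for line in report.splitlines():
        row = ROW.match(line)
        if not row:
            continue
        attr_id, raw = int(row.group(1)), row.group(3).strip()
        number = re.match(r"\d+", raw)
        if not number:
            continue
        count = int(number.group(0))
        if attr_id == 197 and count > 0:
            findings.append(f"{count} pending sector(s): suspect the drive itself")
        elif attr_id == 199 and count > 0:
            findings.append(f"{count} CRC error(s): suspect the cable or interface")
        elif attr_id == 194:
            peak = re.search(r"Min/Max \d+/(\d+)", raw)
            if peak and int(peak.group(1)) >= MAX_TEMP_LIMIT:
                findings.append(f"peak temperature {peak.group(1)} C on record")
    return findings


for name, report in (("da1", SAMPLE_DA1), ("da4", SAMPLE_DA4)):
    print(name, triage(report))
```

The sketch looks only at the attribute table. The failed extended self-test on da1 appears in the self-test log, which it does not parse.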
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
I agree with Ericloewe,
AND these drives all have 4.5 years of spin time on them.
That, along with the high temperature history, makes them unreliable, IMHO.
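
For reference, the 4.5-year figure follows from the Power_On_Hours raw values in the reports above. Here is a small Python sketch of the arithmetic, assuming an average year of 365.25 days. The `hours_to_years` helper is just an illustrative name.

```python
HOURS_PER_YEAR = 24 * 365.25  # assumed average year length, including leap days

# Power_On_Hours raw values copied from the five Hitachi reports in this thread.
POWER_ON_HOURS = {"da1": 39825, "da2": 39818, "da4": 39856, "da5": 39844, "da6": 39853}


def hours_to_years(hours):
    """Convert a Power_On_Hours raw value to years of spin time."""
    return hours / HOURS_PER_YEAR


for disk, hours in POWER_ON_HOURS.items():
    days, rest = divmod(hours, 24)
    print(f"{disk}: {hours} h = {days} days + {rest} h = {hours_to_years(hours):.1f} years")
```

The day count agrees with what smartctl itself prints for da4: 39856 hours is 1660 days + 16 hours.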
 

risho

Dabbler
Joined
May 21, 2016
Messages
18
Heh, there was a period of a few months when the NAS was running in my closet without much circulation or many fans. I didn't really realize the implications of that decision at the time. It was probably hot like that for a while. If the drives other than da1 aren't showing any problems, and they are now at a proper operating temperature, are they still at high risk? I built a new NAS with new hardware, a new case, and far more fans and airflow.


These are my new WD drives (I have 2, for a total of 7 drives in RAID 6).

Code:
 smartctl -a /dev/da0
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Western Digital Red
Device Model:  WDC WD40EFRX-68WT0N0
Serial Number:  WD-WCC4E7XS0NZ6
LU WWN Device Id: 5 0014ee 26291d96f
Firmware Version: 82.00A82
User Capacity:  4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:  512 bytes logical, 4096 bytes physical
Rotation Rate:  5400 rpm
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Sun May 22 16:02:47 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)   Offline data collection activity
           was never started.
           Auto Offline Data Collection: Disabled.
Self-test execution status:  (  0)   The previous self-test routine completed
           without error or no self-test has ever
           been run.
Total time to complete Offline
data collection:      (53280) seconds.
Offline data collection
capabilities:         (0x7b) SMART execute Offline immediate.
           Auto Offline data collection on/off support.
           Suspend Offline collection upon new
           command.
           Offline surface scan supported.
           Self-test supported.
           Conveyance Self-test supported.
           Selective Self-test supported.
SMART capabilities:  (0x0003)   Saves SMART data before entering
           power-saving mode.
           Supports SMART auto save timer.
Error logging capability:  (0x01)   Error logging supported.
           General Purpose Logging supported.
Short self-test routine
recommended polling time:     (  2) minutes.
Extended self-test routine
recommended polling time:     ( 532) minutes.
Conveyance self-test routine
recommended polling time:     (  5) minutes.
SCT capabilities:     (0x703d)   SCT Status supported.
           SCT Error Recovery Control supported.
           SCT Feature Control supported.
           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x002f  200  200  051  Pre-fail  Always  -  0
  3 Spin_Up_Time  0x0027  100  253  021  Pre-fail  Always  -  0
  4 Start_Stop_Count  0x0032  100  100  000  Old_age  Always  -  5
  5 Reallocated_Sector_Ct  0x0033  200  200  140  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x002e  200  200  000  Old_age  Always  -  0
  9 Power_On_Hours  0x0032  100  100  000  Old_age  Always  -  451
 10 Spin_Retry_Count  0x0032  100  253  000  Old_age  Always  -  0
 11 Calibration_Retry_Count 0x0032  100  253  000  Old_age  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  5
192 Power-Off_Retract_Count 0x0032  200  200  000  Old_age  Always  -  1
193 Load_Cycle_Count  0x0032  200  200  000  Old_age  Always  -  29
194 Temperature_Celsius  0x0022  120  107  000  Old_age  Always  -  32
196 Reallocated_Event_Count 0x0032  200  200  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0032  200  200  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0030  100  253  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x0032  200  200  000  Old_age  Always  -  0
200 Multi_Zone_Error_Rate  0x0008  200  200  000  Old_age  Offline  -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  425  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Code:
smartctl -a /dev/da3
smartctl 6.4 2015-06-04 r4109 [FreeBSD 10.3-RELEASE amd64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Western Digital Red
Device Model:  WDC WD40EFRX-68WT0N0
Serial Number:  WD-WCC4E6CTKK98
LU WWN Device Id: 5 0014ee 2b75d20e5
Firmware Version: 82.00A82
User Capacity:  4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:  512 bytes logical, 4096 bytes physical
Rotation Rate:  5400 rpm
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Sun May 22 16:11:10 2016 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)   Offline data collection activity
           was never started.
           Auto Offline Data Collection: Disabled.
Self-test execution status:  (  0)   The previous self-test routine completed
           without error or no self-test has ever
           been run.
Total time to complete Offline
data collection:      (52320) seconds.
Offline data collection
capabilities:         (0x7b) SMART execute Offline immediate.
           Auto Offline data collection on/off support.
           Suspend Offline collection upon new
           command.
           Offline surface scan supported.
           Self-test supported.
           Conveyance Self-test supported.
           Selective Self-test supported.
SMART capabilities:  (0x0003)   Saves SMART data before entering
           power-saving mode.
           Supports SMART auto save timer.
Error logging capability:  (0x01)   Error logging supported.
           General Purpose Logging supported.
Short self-test routine
recommended polling time:     (  2) minutes.
Extended self-test routine
recommended polling time:     ( 523) minutes.
Conveyance self-test routine
recommended polling time:     (  5) minutes.
SCT capabilities:     (0x703d)   SCT Status supported.
           SCT Error Recovery Control supported.
           SCT Feature Control supported.
           SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x002f  200  200  051  Pre-fail  Always  -  0
  3 Spin_Up_Time  0x0027  197  195  021  Pre-fail  Always  -  7141
  4 Start_Stop_Count  0x0032  100  100  000  Old_age  Always  -  12
  5 Reallocated_Sector_Ct  0x0033  200  200  140  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x002e  200  200  000  Old_age  Always  -  0
  9 Power_On_Hours  0x0032  100  100  000  Old_age  Always  -  619
 10 Spin_Retry_Count  0x0032  100  253  000  Old_age  Always  -  0
 11 Calibration_Retry_Count 0x0032  100  253  000  Old_age  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  12
192 Power-Off_Retract_Count 0x0032  200  200  000  Old_age  Always  -  4
193 Load_Cycle_Count  0x0032  200  200  000  Old_age  Always  -  79
194 Temperature_Celsius  0x0022  115  101  000  Old_age  Always  -  37
196 Reallocated_Event_Count 0x0032  200  200  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0032  200  200  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0030  100  253  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x0032  200  189  000  Old_age  Always  -  28
200 Multi_Zone_Error_Rate  0x0008  200  200  000  Old_age  Offline  -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline  Completed without error  00%  593  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


do these seem to be operating properly? they are really new. and i am much more responsible with my cooling now...

i plan to replace the hitachi drives with wd reds over time. if i were to start picking up more 4tb reds to replace the 3tb hitachis, can i expand the pool to utilize the additional space once they are all 4tb drives? it's raidz2, for what it's worth
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
"if the ones other than da1 aren't showing any problems, and they are at a proper operating temperature, are they still at a high risk?"
High, but not immediate. Scrubs and SMART tests should keep things under control, provided you can quickly replace drives when they start getting flaky.
Being a bit proactive and replacing some of the drives in advance while they still work (using an in-place resilver) is recommended.

"do these seem to be operating properly? they are really new. and i am much more responsible with my cooling now..."
Yes, but they're rather new and don't seem to have been burned in. Definitely burn in any new drives before putting them into production.

"i plan to replace the hitachi drives with wd reds over time. if i were to start picking up more 4tb reds to replace the 3tb hitachis, can i expand the pool to utilize the additional space once they are all 4tb drives? it's raidz2, for what it's worth"
That should happen automagically once the last 3 TB drive has been replaced and resilvered.
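
The arithmetic behind that, as a rough sketch: a RAIDZ2 vdev treats every member as being as large as its smallest drive and spends two drives' worth of space on parity. The helper name `raidz2_usable_tb` is made up for this example. It works in drive-label TB and ignores ZFS metadata, padding and the TB/TiB difference, so real numbers come out lower. It also assumes the pool's `autoexpand` property is on.

```python
def raidz2_usable_tb(drive_sizes_tb):
    """Approximate usable space of one RAIDZ2 vdev, in drive-label TB."""
    if len(drive_sizes_tb) <= 2:
        raise ValueError("need more than two drives for any usable space")
    # Smallest member sets the per-drive size; two drives go to parity.
    return (len(drive_sizes_tb) - 2) * min(drive_sizes_tb)


current = [3, 3, 3, 3, 3, 4, 4]        # five Hitachis, two WD Reds
one_left = [3, 4, 4, 4, 4, 4, 4]       # a single 3 TB drive still in place
all_reds = [4] * 7

print(raidz2_usable_tb(current))       # 15
print(raidz2_usable_tb(one_left))      # 15
print(raidz2_usable_tb(all_reds))      # 20
```

The extra space only shows up after the last small drive is gone, which is why replacing six of the seven changes nothing.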
 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527
It looks like you may have a cable issue on da3 as well.
199 UDMA_CRC_Error_Count 0x0032 200 189 000 Old_age Always - 28
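
For anyone wanting to pull that number out of `smartctl -a` output with a script, here is a minimal sketch. `parse_attr_row` is a hypothetical helper, and it assumes the ten-column attribute layout shown in the output above (ID#, name, flag, value, worst, thresh, type, updated, when_failed, raw). The raw value is a lifetime counter, so what matters after swapping the cable is whether it keeps climbing.

```python
def parse_attr_row(line):
    """Split one row of the smartctl attribute table into named fields."""
    parts = line.split()
    if len(parts) < 10:
        raise ValueError("not a SMART attribute row: %r" % line)
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "value": int(parts[3]),
        "worst": int(parts[4]),
        "thresh": int(parts[5]),
        # Raw values can carry extra text on some drives, so keep a string.
        "raw": " ".join(parts[9:]),
    }


row = parse_attr_row(
    "199 UDMA_CRC_Error_Count 0x0032 200 189 000 Old_age Always - 28"
)
if row["name"] == "UDMA_CRC_Error_Count" and int(row["raw"]) > 0:
    print("CRC errors logged:", row["raw"], "(check cable, then re-check later)")
```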
 