ATA Error Count suddenly increased

Joined
Dec 22, 2018
Messages
4
Hello,

HW:
KALEA-INFORMATIQUE PCI Express card - 4x SATA 6Gb/s, with miniSAS cable. PCIe 2.0. Marvell 88SE9215
and SilverStone SST-FS304B

5 drives in a ZFS RAIDZ2 array (all 6TB drives)
3xST6000VX001-2BD186
1xWDC WD6002FRYZ-01WD5B1
1xWDC WD60EFZX-68B3FN0

I check the status of my NAS weekly, and this time the array status was degraded.
One Seagate ST6000VX001-2BD186 shows an ATA Error Count that has risen to 167:

Code:
root@freenas[~]# smartctl -a /dev/ada8
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Skyhawk
Device Model:     ST6000VX001-2BD186
Serial Number:    XXXXXXXXXXXXXX
LU WWN Device Id: 5 000c50 0dbaa881f
Firmware Version: CV12
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5425 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Oct  3 17:09:22 2021 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 724) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x70bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   076   064   006    Pre-fail  Always       -       35548372
  3 Spin_Up_Time            0x0003   091   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       107
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   070   060   045    Pre-fail  Always       -       9856939
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1779h+00m+00.000s
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       3
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       167
188 Command_Timeout         0x0032   098   098   000    Old_age   Always       -       4295032834
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   069   067   040    Old_age   Always       -       31 (Min/Max 23/33)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       530
194 Temperature_Celsius     0x0022   031   040   000    Old_age   Always       -       31 (0 23 0 0 0)
195 Hardware_ECC_Recovered  0x001a   076   064   000    Old_age   Always       -       35548372
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       746h+54m+56.288s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       11714087042
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       7192183930

SMART Error Log Version: 1
ATA Error Count: 167 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 167 occurred at disk power-on lifetime: 1759 hours (73 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 b0 ff ff ff 4f 00  23d+14:02:57.719  READ DMA EXT
  61 00 00 ff ff ff 4f 00  23d+14:02:57.717  WRITE FPDMA QUEUED
  b0 d5 01 09 4f c2 40 00  23d+14:02:57.689  SMART READ LOG
  25 00 00 ff ff ff 4f 00  23d+14:02:54.022  READ DMA EXT
  b0 d5 01 06 4f c2 40 00  23d+14:02:53.886  SMART READ LOG

Error 166 occurred at disk power-on lifetime: 1759 hours (73 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff 4f 00  23d+14:02:54.022  READ DMA EXT
  b0 d5 01 06 4f c2 40 00  23d+14:02:53.886  SMART READ LOG
  25 00 00 ff ff ff 4f 00  23d+14:02:50.223  READ DMA EXT
  b0 d5 01 01 4f c2 40 00  23d+14:02:50.183  SMART READ LOG
  25 00 00 ff ff ff 4f 00  23d+14:02:46.515  READ DMA EXT

Error 165 occurred at disk power-on lifetime: 1759 hours (73 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff 4f 00  23d+14:02:50.223  READ DMA EXT
  b0 d5 01 01 4f c2 40 00  23d+14:02:50.183  SMART READ LOG
  25 00 00 ff ff ff 4f 00  23d+14:02:46.515  READ DMA EXT
  b0 d5 01 00 4f c2 40 00  23d+14:02:46.513  SMART READ LOG
  25 00 00 ff ff ff 4f 00  23d+14:02:42.801  READ DMA EXT

Error 164 occurred at disk power-on lifetime: 1759 hours (73 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff 4f 00  23d+14:02:46.515  READ DMA EXT
  b0 d5 01 00 4f c2 40 00  23d+14:02:46.513  SMART READ LOG
  25 00 00 ff ff ff 4f 00  23d+14:02:42.801  READ DMA EXT
  b0 da 00 00 4f c2 40 00  23d+14:02:42.494  SMART RETURN STATUS
  25 00 00 ff ff ff 4f 00  23d+14:02:38.202  READ DMA EXT

Error 163 occurred at disk power-on lifetime: 1759 hours (73 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 00 ff ff ff 4f 00  23d+14:02:42.801  READ DMA EXT
  b0 da 00 00 4f c2 40 00  23d+14:02:42.494  SMART RETURN STATUS
  25 00 00 ff ff ff 4f 00  23d+14:02:38.202  READ DMA EXT
  b0 d1 01 01 4f c2 40 00  23d+14:02:38.188  SMART READ ATTRIBUTE THRESHOLDS [OBS-4]
  2f 00 01 10 00 00 00 00  23d+14:02:38.188  READ LOG EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Self-test routine in progress 90%      1779         -
# 2  Short offline       Completed without error       00%      1762         -
# 3  Short offline       Completed without error       00%      1594         -
# 4  Short offline       Completed without error       00%      1426         -
# 5  Short offline       Completed without error       00%      1258         -
# 6  Short offline       Completed without error       00%      1098         -
# 7  Short offline       Completed without error       00%       922         -
# 8  Short offline       Completed without error       00%       754         -
# 9  Short offline       Completed without error       00%       586         -
#10  Short offline       Completed without error       00%       418         -
#11  Short offline       Completed without error       00%       250         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@freenas[~]# smartctl -l selftest /dev/ada8
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Self-test routine in progress 90%      1779         -
# 2  Short offline       Completed without error       00%      1762         -
# 3  Short offline       Completed without error       00%      1594         -
# 4  Short offline       Completed without error       00%      1426         -
# 5  Short offline       Completed without error       00%      1258         -
# 6  Short offline       Completed without error       00%      1098         -
# 7  Short offline       Completed without error       00%       922         -
# 8  Short offline       Completed without error       00%       754         -
# 9  Short offline       Completed without error       00%       586         -
#10  Short offline       Completed without error       00%       418         -
#11  Short offline       Completed without error       00%       250         -


I then started an extended offline self-test, but I am not sure whether the drive is failing or the SAS cable is faulty. I have never had any problems with SAS cables, so I'm not sure how to approach this kind of error. Should I order a new SAS cable and see whether the ATA error count stays stable, or should I run more SMART tests on the drive?
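One rough way to approach the cable-versus-drive question is to save the attribute table from two smartctl runs and compare which counters move. The sketch below is only an illustration: the grouping of attributes into "media" (5, 187, 197, 198) and "link" (199) is a common rule of thumb and not something smartctl defines, and the function names and the `SAMPLE` text (copied from the output above) are made up for this example.

```python
import re

# Rule-of-thumb grouping (an assumption, not defined by smartctl):
# media-related counters vs. link/cable-related counters.
MEDIA_IDS = {5, 187, 197, 198}
LINK_IDS = {199}

# A few attribute rows copied from the smartctl output in this post.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       167
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       1
"""


def parse_attributes(text):
    """Return {id: (name, value, worst, thresh, raw)} from attribute table rows."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have a hex FLAG in column 3.
        if len(parts) < 10 or not parts[0].isdigit() or not parts[2].startswith("0x"):
            continue
        raw = re.match(r"\d+", parts[9])  # leading integer of RAW_VALUE
        if raw is None:
            continue
        attrs[int(parts[0])] = (
            parts[1], int(parts[3]), int(parts[4]), int(parts[5]), int(raw.group())
        )
    return attrs


def summarize(attrs):
    """Sum the raw counters of each group that is present in attrs."""
    return {
        "media": sum(attrs[i][4] for i in MEDIA_IDS if i in attrs),
        "link": sum(attrs[i][4] for i in LINK_IDS if i in attrs),
    }


print(summarize(parse_attributes(SAMPLE)))  # {'media': 167, 'link': 1}
```

If the "link" total keeps climbing between runs, the cable or controller is the more likely suspect; if only the "media" total grows, the drive itself is. With the values above, nearly all of the errors fall on the media side.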

best regards
hardwarejunky
 

Spearfoot

He of the long foot
Moderator
Joined
May 13, 2015
Messages
2,478
The ATA errors with command dumps are not necessarily a sign of failure. I have a few drives that show those types of errors from years ago and they still work fine.

What is worrying is SMART attribute 187, "Reported Uncorrectable Errors" (Reported_Uncorrect):
Code:
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       167

My advice is to either replace this disk or have a replacement drive on-hand, because it is probably going to fail soon.
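Two other numbers in that output are easy to misread, so here is a small sketch of how they can be decoded. It rests on two assumptions that are not stated anywhere in this thread: that the raw value of Command_Timeout packs three 16-bit counters (as is commonly reported for Seagate drives), and that an LBA of 0x0fffffff in the summary error log is just the ceiling of a 28-bit field, since the failing commands are 48-bit READ DMA EXT. The helper names are invented for this example.

```python
LBA28_MAX = 0x0FFFFFFF  # largest value a 28-bit LBA field can hold


def split_raw16(raw):
    """Split a 48-bit SMART raw value into three 16-bit words (high, mid, low)."""
    return ((raw >> 32) & 0xFFFF, (raw >> 16) & 0xFFFF, raw & 0xFFFF)


def is_lba28_ceiling(lba):
    """True if the logged LBA is the 28-bit maximum, i.e. possibly a placeholder."""
    return lba == LBA28_MAX


# Command_Timeout raw value from the attribute table above.
print(split_raw16(4295032834))      # (1, 1, 2)

# LBA reported for errors 163 to 167 in the error log above.
print(is_lba28_ceiling(268435455))  # True
```

Read this way, Command_Timeout is three small counters rather than four billion timeouts, and the repeated LBA 268435455 probably does not point at one specific bad sector. The extended error log (`smartctl -l xerror /dev/ada8`) records 48-bit LBAs and may show the real addresses.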
 
Joined
Dec 22, 2018
Messages
4
Thanks for the advice. I contacted Seagate support; hopefully I can RMA the drive. The warranty runs until the end of 2024, so there is enough time. But I will play it safe and purchase a new 6TB drive this week. Better safe than sorry ;)
 