ATA status: 41 (DRDY ERR), error: 40 (UNC )

Status
Not open for further replies.

esamett

Patron
Joined
May 28, 2011
Messages
345
I received this in today's FreeNAS security run email:

Code:
freenas.domain kernel log messages:
> (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 20 20 84 74 40 9b 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: ATA Status Error
> (ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada0:ahcich0:0:0:0): RES: 41 40 20 84 74 40 9b 00 00 00 00
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 20 20 84 74 40 9b 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: ATA Status Error
> (ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada0:ahcich0:0:0:0): RES: 41 40 20 84 74 40 9b 00 00 00 00
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 20 20 84 74 40 9b 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: ATA Status Error
> (ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada0:ahcich0:0:0:0): RES: 41 40 20 84 74 40 9b 00 00 00 00
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 20 20 84 74 40 9b 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: ATA Status Error
> (ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada0:ahcich0:0:0:0): RES: 41 40 20 84 74 40 9b 00 00 00 00
> (ada0:ahcich0:0:0:0): Retrying command
> (ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 20 20 84 74 40 9b 00 00 00 00 00
> (ada0:ahcich0:0:0:0): CAM status: ATA Status Error
> (ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
> (ada0:ahcich0:0:0:0): RES: 41 40 20 84 74 40 9b 00 00 00 00
> (ada0:ahcich0:0:0:0): Error 5, Retries exhaust


I searched the forums and saw a few posts suggesting running a smartctl long test on the affected drive, but I can't tell from the error message which drive I should check. The GUI says both volumes and all 22 drives are online. A scrub was recently started. I have SMART tests scheduled per Cyberjock's advice.

As background, I replaced a few cables over the past few months per advice, and those errors have not recurred. Before that, I also wiped a drive to remove pending sectors, and those errors have not repeated either.

Please advise.

Thanks as always,

e
 

eldo

Explorer
Joined
Dec 18, 2014
Messages
99
esamett,

I believe your logs are pointing you to the disk /dev/ada0. You can verify which physical drive that is by running 'smartctl -i /dev/ada0' and checking the serial number, if you need to.
Have you tried running the SMART short and/or long tests?
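
For reference, the fields in those kernel log lines can be pulled apart with a few lines of Python. This is only a sketch: the helper names are made up for illustration, the bit names follow the classic ATA register definitions, and the ACB byte order (command, features, LBA low/mid/high, device, then the three extended LBA bytes) is an assumption about how FreeBSD's CAM layer prints the command block, so treat the decoded LBA as a hint rather than a confirmed sector number.

```python
import re

# ATA status and error register bits (classic ATA names), highest bit first.
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x10: "SERV",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "ILI"}


def device_of(line):
    """Return the device name from a '(ada0:ahcich0:0:0:0): ...' log line."""
    match = re.search(r"\((\w+):\w+:\d+:\d+:\d+\)", line)
    return match.group(1) if match else None


def decode_bits(value, names):
    """List the names of the bits that are set in a register value."""
    return [name for bit, name in names.items() if value & bit]


def lba_from_acb(acb_hex):
    """Rebuild the 48-bit LBA from the ACB bytes (ASSUMED byte layout)."""
    b = [int(x, 16) for x in acb_hex.split()]
    low, mid, high = b[2], b[3], b[4]            # LBA bits 0-23
    low_exp, mid_exp, high_exp = b[6], b[7], b[8]  # LBA bits 24-47
    return (high_exp << 40 | mid_exp << 32 | low_exp << 24
            | high << 16 | mid << 8 | low)


line = "(ada0:ahcich0:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )"
print(device_of(line))
print(decode_bits(0x41, STATUS_BITS), decode_bits(0x40, ERROR_BITS))
print(hex(lba_from_acb("60 20 20 84 74 40 9b 00 00 00 00 00")))
```

With the log lines from the first post, the device comes out as ada0, status 0x41 decodes to DRDY plus ERR, and error 0x40 decodes to UNC (uncorrectable data), which matches what the kernel printed in parentheses. Under the assumed layout the ACB works out to LBA 0x9b748420.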

I was recently doing initial testing on some new drives and received a very similar error. In my case it was ATA status 41, error: 50 (I think).
Looking through the output of smartctl -A /dev/ada0 (the disk indicated in your log above), do you see any nonzero counts under RAW_VALUE? In my case I had a brand new drive that was showing quite a few counts on attribute 197 (Current_Pending_Sector), and the drive failed the SMART short tests.
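
If it helps, here is a rough Python sketch for picking the sector-related counters out of saved `smartctl -A` output. It assumes each attribute row sits on a single line with the usual ten columns, as smartctl prints it. The function names and the choice of attributes 5, 197 and 198 as the ones to watch are my own illustration, not something smartctl defines.

```python
# Attribute IDs watched in this sketch (an assumption for illustration):
# 5 = Reallocated_Sector_Ct, 197 = Current_Pending_Sector,
# 198 = Offline_Uncorrectable.
WATCHED = (5, 197, 198)


def parse_smart_attributes(text):
    """Return {attribute_id: (name, raw_value)} from `smartctl -A` text."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; column 10 is RAW_VALUE.
        # Anything after the first raw number, e.g. "(Min/Max 16/57)", is ignored.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[int(fields[0])] = (fields[1], int(fields[9]))
    return attrs


def sectors_of_concern(attrs):
    """Return the watched attributes whose raw count is above zero."""
    return {i: attrs[i] for i in WATCHED if i in attrs and attrs[i][1] > 0}
```

Save the output with something like `smartctl -A /dev/ada0 > ada0.txt`, read the file into a string, and pass it to `parse_smart_attributes`. An empty result from `sectors_of_concern` only means those three counters are zero; it does not rule out other problems.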

In my case I called the disk manufacturer, and they required that I run their Windows-based SMART utility before they would issue an RMA. It failed with an uncorrectable error in the first 50 MB of the short SMART test.
 

esamett

Patron
Joined
May 28, 2011
Messages
345
It's an older drive:

Code:
Device Model:  SAMSUNG HD204UI  
Serial Number:  S2H7JD2ZB08331  
LU WWN Device Id: 5 0024e9 004508109  
Firmware Version: 1AQ10001  
User Capacity:  2,000,398,934,016 bytes [2.00 TB]  
Sector Size:  512 bytes logical/physical  
Rotation Rate:  5400 rpm  
Device is:  In smartctl database [for details use: -P show]  
ATA Version is:  ATA8-ACS T13/1699-D revision 6  
SATA Version is:  SATA 2.6, 3.0 Gb/s  
Local Time is:  Fri Dec 19 16:40:26 2014 PST  
  
==> WARNING: Using smartmontools or hdparm with this  
drive may result in data loss due to a firmware bug.  
****** THIS DRIVE MAY OR MAY NOT BE AFFECTED! ******  
Buggy and fixed firmware report same version number!  
See the following web pages for details:  
http://knowledge.seagate.com/articles/en_US/FAQ/223571en  
http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks  
  
SMART support is: Available - device has SMART capability.  
SMART support is: Enabled

Long test pending:
Code:
[root@freenas ~]# smartctl -t long /dev/ada0
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p10 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 343 minutes for test to complete.
Test will complete after Fri Dec 19 22:26:39 2014

Use smartctl -X to abort test.
[root@freenas ~]#

Thanks.

"It's an older code" - Return of the Jedi
 

esamett

Patron
Joined
May 28, 2011
Messages
345
smartctl -A /dev/ada0
Code:
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p10 amd64] (local build)   
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org   
   
=== START OF READ SMART DATA SECTION ===   
SMART Attributes Data Structure revision number: 16   
Vendor Specific SMART Attributes with Thresholds:   
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       4575
  2 Throughput_Performance  0x0026   049   049   000    Old_age   Always       -       21279
  3 Spin_Up_Time            0x0023   067   066   025    Pre-fail  Always       -       10201
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2648
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       20148
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       427
181 Program_Fail_Cnt_Total  0x0022   094   094   000    Old_age   Always       -       136493310
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       7508
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   043   000    Old_age   Always       -       31 (Min/Max 16/57)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       7626
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       2731
  
 

esamett

Patron
Joined
May 28, 2011
Messages
345
Old, wearing out but useable?

smartctl -a /dev/ada0
Code:
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p10 amd64] (local build)  
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org  
  
=== START OF INFORMATION SECTION ===  
Model Family:  SAMSUNG SpinPoint F4 EG (AF)  
Device Model:  SAMSUNG HD204UI  
Serial Number:  S2H7JD2ZB08331  
LU WWN Device Id: 5 0024e9 004508109  
Firmware Version: 1AQ10001  
User Capacity:  2,000,398,934,016 bytes [2.00 TB]  
Sector Size:  512 bytes logical/physical  
Rotation Rate:  5400 rpm  
Device is:  In smartctl database [for details use: -P show]  
ATA Version is:  ATA8-ACS T13/1699-D revision 6  
SATA Version is:  SATA 2.6, 3.0 Gb/s  
Local Time is:  Sat Dec 20 06:05:47 2014 PST  
  
==> WARNING: Using smartmontools or hdparm with this  
drive may result in data loss due to a firmware bug.  
****** THIS DRIVE MAY OR MAY NOT BE AFFECTED! ******  
Buggy and fixed firmware report same version number!  
See the following web pages for details:  
http://knowledge.seagate.com/articles/en_US/FAQ/223571en  
http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks  
  
SMART support is: Available - device has SMART capability.  
SMART support is: Enabled  
  
=== START OF READ SMART DATA SECTION ===  
SMART overall-health self-assessment test result: PASSED  
  
General SMART Values:  
Offline data collection status:  (0x00) Offline data collection activity  
  was never started.  
  Auto Offline Data Collection: Disabled.
Self-test execution status:  (  0) The previous self-test routine completed
  without error or no self-test has ever 
  been run.  
Total time to complete Offline  
data collection:  (20580) seconds.  
Offline data collection  
capabilities:  (0x5b) SMART execute Offline immediate.  
  Auto Offline data collection on/off support.
  Suspend Offline collection upon new
  command.
  Offline surface scan supported.  
  Self-test supported.  
  No Conveyance Self-test supported.  
  Selective Self-test supported.  
SMART capabilities:  (0x0003) Saves SMART data before entering  
  power-saving mode.  
  Supports SMART auto save timer.  
Error logging capability:  (0x01) Error logging supported.  
  General Purpose Logging supported.  
Short self-test routine  
recommended polling time:  (  2) minutes.  
Extended self-test routine  
recommended polling time:  ( 343) minutes.  
SCT capabilities:  (0x003f) SCT Status supported.  
  SCT Error Recovery Control supported.  
  SCT Feature Control supported.  
  SCT Data Table supported.  
  
SMART Attributes Data Structure revision number: 16  
Vendor Specific SMART Attributes with Thresholds:  
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       4575
  2 Throughput_Performance  0x0026   049   049   000    Old_age   Always       -       21279
  3 Spin_Up_Time            0x0023   067   066   025    Pre-fail  Always       -       10201
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2648
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       20149
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       427
181 Program_Fail_Cnt_Total  0x0022   094   094   000    Old_age   Always       -       136493310
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       7508
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   043   000    Old_age   Always       -       32 (Min/Max 16/57)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       7626
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       2731
  
SMART Error Log Version: 1  
No Errors Logged  
  
SMART Self-test log structure revision number 1  
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     20144         -
# 2  Extended offline    Completed without error       00%     18234         -
# 3  Short offline       Aborted by host               80%     18221         -
# 4  Extended offline    Completed without error       00%     18214         -
# 5  Extended offline    Completed: read failure       30%     18201         2614945456
# 6  Short offline       Completed without error       00%      1310         -
1 of 1 failed self-tests are outdated by newer successful extended offline self-test # 1
  
SMART Selective self-test log data structure revision number 0  
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed [00% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):  
  After scanning selected spans, do NOT read-scan remainder of disk.  
If Selective self-test is pending on power-up, resume after 0 minute delay.  
  
[root@freenas ~]#

"Broken, but still good" - Lilo & Stitch
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Depends on what the various error rates mean... That multi-zone error rate sounds ominous and rather high.
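
One way to put those rates in context: the raw numbers are vendor-specific, while the drive's own pass/fail criterion is the normalized VALUE compared against THRESH (smartctl reports an attribute as failed when VALUE is at or below THRESH). The sketch below uses VALUE and THRESH figures copied from the attribute table earlier in the thread. The `headroom` helper and its treatment of a THRESH of 000 as "no usable threshold" are my own conventions for illustration.

```python
def headroom(value, thresh):
    """Normalized VALUE minus THRESH; None when THRESH is 0.

    A THRESH of 0 is treated here as 'no usable threshold' (an assumption
    of this sketch), since the normalized value cannot meaningfully be
    judged against it.
    """
    if thresh == 0:
        return None
    return value - thresh


# (name, VALUE, THRESH) copied from the smartctl -A output in this thread.
rows = [
    ("Raw_Read_Error_Rate", 100, 51),
    ("Reallocated_Sector_Ct", 252, 10),
    ("Multi_Zone_Error_Rate", 100, 0),
]

for name, value, thresh in rows:
    margin = headroom(value, thresh)
    label = "no threshold" if margin is None else f"{margin} above threshold"
    print(f"{name}: {label}")
```

By this measure Multi_Zone_Error_Rate has no threshold to fail against, so the raw 7626 cannot be judged from the table alone and would need the vendor's definition of that counter.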
 