Hi all,
After the disaster with the DOA WD drives, I successfully replaced all disks in our small NAS.
Reference: https://forums.freenas.org/index.php?threads/wd-red-drives-not-spinning-up.25111/
Now, just one day later, I am seeing the errors below on ada2. Could someone please help me interpret the output? Should I suspect the connector in the enclosure, or the controller?
APM is set to 254 on all drives.
Thanks,
Patrick
Code:
(ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 50 c1 82 40 ba 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
(ada2:ahcich2:0:0:0): RES: 41 40 50 c1 82 00 ba 00 00 08 00
(ada2:ahcich2:0:0:0): Retrying command
(ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 10 40 5c 99 40 d8 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
(ada2:ahcich2:0:0:0): RES: 41 40 40 5c 99 00 d8 00 00 10 00
(ada2:ahcich2:0:0:0): Retrying command
...
(ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 10 40 5c 99 40 d8 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
(ada2:ahcich2:0:0:0): RES: 41 40 48 5c 99 00 d8 00 00 10 00
(ada2:ahcich2:0:0:0): Error 5, Retries exhausted
...
ahcich2: Timeout on slot 29 port 0
ahcich2: is 00000000 cs 20000000 ss 00000000 rs 20000000 tfd c0 serr 00000000 cmd 0000fd17
ahcich2: Error while READ LOG EXT
(ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 20 91 99 40 d8 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 00 ()
(ada2:ahcich2:0:0:0): RES: 00 00 00 00 00 00 00 00 00 00 00
(ada2:ahcich2:0:0:0): Retrying command
(ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 20 91 99 40 d8 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
(ada2:ahcich2:0:0:0): RES: 41 40 20 91 99 00 d8 00 00 08 00
(ada2:ahcich2:0:0:0): Retrying command
...
(ada2:ahcich2:0:0:0): READ_DMA48. ACB: 25 00 50 a5 99 40 d8 00 00 00 08 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
(ada2:ahcich2:0:0:0): RES: 51 40 50 a5 99 00 d8 00 00 00 00
(ada2:ahcich2:0:0:0): Error 5, Retries exhausted
(ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 20 c8 99 40 d8 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
(ada2:ahcich2:0:0:0): RES: 41 40 20 c8 99 00 d8 00 00 08 00
(ada2:ahcich2:0:0:0): Retrying command
...
(ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 08 20 c8 99 40 d8 00 00 00 00 00
(ada2:ahcich2:0:0:0): CAM status: ATA Status Error
(ada2:ahcich2:0:0:0): ATA status: 41 (DRDY ERR), error: 40 (UNC )
(ada2:ahcich2:0:0:0): RES: 41 40 20 c8 99 00 d8 00 00 08 00
(ada2:ahcich2:0:0:0): Error 5, Retries exhausted
...
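In case it helps with reading the kernel messages above, here is a minimal Python sketch of how the `ACB:` bytes could be turned into an LBA and a sector count. It assumes the byte order that FreeBSD's CAM layer appears to use when printing ATA commands (command, features, LBA low/mid/high, device, LBA low/mid/high exp, features exp, sector count, sector count exp), and that for the NCQ commands 0x60/0x61 the sector count travels in the features fields. The helper name `decode_acb` is made up for illustration and is not part of any tool.

```python
def decode_acb(acb: str) -> dict:
    """Decode the 12 hex bytes of a CAM 'ACB:' field.

    The byte layout is an assumption (see the text above), not taken
    from the post itself.
    """
    b = [int(x, 16) for x in acb.split()]
    if len(b) != 12:
        raise ValueError("expected 12 hex bytes, got %d" % len(b))

    command = b[0]
    # 48-bit LBA: low/mid/high in bytes 2..4, the 'exp' halves in bytes 6..8.
    # Byte 5 is the device register and is skipped.
    lba = b[2] | b[3] << 8 | b[4] << 16 | b[6] << 24 | b[7] << 32 | b[8] << 40

    if command in (0x60, 0x61):
        # READ/WRITE FPDMA QUEUED: count is in features / features exp.
        sectors = b[1] | b[9] << 8
    else:
        # Other 48-bit commands: count is in sector count / sector count exp.
        sectors = b[10] | b[11] << 8

    return {"command": command, "lba": lba, "sectors": sectors}


if __name__ == "__main__":
    # Two ACB strings copied from the log above.
    for acb in ("60 08 50 c1 82 40 ba 00 00 00 00 00",
                "25 00 50 a5 99 40 d8 00 00 00 08 00"):
        d = decode_acb(acb)
        print("cmd 0x%02x  lba 0x%x  sectors %d" % (d["command"], d["lba"], d["sectors"]))
```

Read this way, the failing requests are small reads (8 or 16 sectors) at several different LBAs rather than one single spot, but that interpretation depends on the assumed layout being right.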
Code:
[root@freenas-je] ~# zpool status
  pool: zfs
state: ONLINE
  scan: scrub in progress since Mon Dec  8 09:02:31 2014
        790G scanned out of 4.99T at 309M/s, 3h58m to go
        152K repaired, 15.45% done
config:
    NAME        STATE     READ WRITE CKSUM
    zfs         ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        ada0p2  ONLINE       0     0     0
        ada1p2  ONLINE       0     0     0
        ada2p2  ONLINE       0     0     0  (repairing)
        ada3p2  ONLINE       0     0     0
errors: No known data errors
Code:
[root@freenas-je] ~# smartctl -a /dev/ada2
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p12 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model:     ST4000VN000-1H4168
Serial Number:    Z302C6XN
LU WWN Device Id: 5 000c50 079481ce9
Firmware Version: SC44
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Dec  8 09:50:37 2014 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x82)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (  107) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 531) minutes.
Conveyance self-test routine
recommended polling time:      (   2) minutes.
SCT capabilities:            (0x10bd)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   092   006    Pre-fail  Always       -       90091760
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       4
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   065   060   030    Pre-fail  Always       -       3616389
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       89
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       4
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       139
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       1
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   066   065   045    Old_age   Always       -       34 (Min/Max 31/34)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       4
194 Temperature_Celsius     0x0022   034   040   000    Old_age   Always       -       34 (0 19 0 0 0)
197 Current_Pending_Sector  0x0012   100   099   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   099   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
SMART Error Log Version: 1
ATA Error Count: 139 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
Error 139 occurred at disk power-on lifetime: 88 hours (3 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      02:58:13.087  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00      02:58:13.086  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00      02:58:13.064  READ LOG EXT
  60 00 08 ff ff ff 4f 00      02:58:09.128  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00      02:58:09.127  READ FPDMA QUEUED
Error 138 occurred at disk power-on lifetime: 88 hours (3 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      02:58:09.128  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00      02:58:09.127  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00      02:58:09.093  READ LOG EXT
  60 00 08 ff ff ff 4f 00      02:58:05.167  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00      02:58:05.166  READ FPDMA QUEUED
Error 137 occurred at disk power-on lifetime: 88 hours (3 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      02:58:05.167  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00      02:58:05.166  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00      02:58:05.133  READ LOG EXT
  61 00 28 ff ff ff 4f 00      02:58:01.212  WRITE FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      02:58:01.212  READ FPDMA QUEUED
Error 136 occurred at disk power-on lifetime: 88 hours (3 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: WP at LBA = 0x0fffffff = 268435455
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 28 ff ff ff 4f 00      02:58:01.212  WRITE FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      02:58:01.212  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00      02:58:01.211  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00      02:58:01.134  READ LOG EXT
  60 00 08 ff ff ff 4f 00      02:57:55.804  READ FPDMA QUEUED
Error 135 occurred at disk power-on lifetime: 88 hours (3 days + 16 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455
  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      02:57:55.804  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00      02:57:55.796  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00      02:57:55.794  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      02:57:55.785  READ FPDMA QUEUED
  60 00 18 ff ff ff 4f 00      02:57:55.782  READ FPDMA QUEUED
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Conveyance offline  Completed without error       00%        41         -
SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
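For a quick overview of the attribute table in the smartctl output above, a rough Python sketch like the following could pull out the counters that are not zero. The choice of attribute IDs in `WATCH` is an assumption about which ones are worth a first look (reallocated, uncorrectable, timeout, pending and CRC counters), and `nonzero_watched` is a hypothetical helper name, not a smartmontools feature.

```python
import re

# Attribute IDs to look at first. This selection is an assumption,
# not something smartctl itself prescribes.
WATCH = {5, 187, 188, 197, 198, 199}

# One row of the "Vendor Specific SMART Attributes" table:
# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}"
    r"\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def nonzero_watched(smart_text: str) -> dict:
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in smart_text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) in WATCH and int(m.group(3)) > 0:
            found[m.group(2)] = int(m.group(3))
    return found


if __name__ == "__main__":
    # A few rows copied from the output above.
    sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       139
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""
    print(nonzero_watched(sample))
```

On the rows shown here, only Reported_Uncorrect (139) and Command_Timeout (1) come back, while UDMA_CRC_Error_Count stays at 0.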