I am having an issue where my freshly built array keeps degrading. Most of the time I get write errors, but sometimes read errors as well.
This is my first interaction with TrueNAS, but I have done some research and can't figure this one out.
The most similar problem is described in this thread, but it is still different.
Setup:
OS: TrueNAS-SCALE-22.02.4
CPU: Intel(R) Pentium(R) G4500 @3.5 GHz
MB: Gigabyte Z170N-Gaming 5
RAM: 8GB (2x) Corsair 4GB DDR4 2400MHz
HDDs:
- 3x 6TB WD Blue WD60EZRZ (from 2016)
- 1x 6TB WD Red WD60EFAX brand new
- 1x Kingston 120GB SSD SA400S37/120G
ZFS: RAIDZ1 with the above drives
PSU: PicoPSU-80-WI-32V
History:
This is my NAS setup that I have used since 2016, just with Windows and motherboard RAID. It might not be the perfect solution, but it worked for my needs. Let's put it that way ;)
My UPS failed (of course my cache was set to not wait for write completion), 1 of 4 WD drives failed, my array fell apart, and I spent the past month or so with data recovery software. I managed to recover everything, but I came very close to never seeing my data again.
Since then I have replaced the batteries, got a new drive, and run SMART and surface tests on the remaining old drives.
What happens now:
Fresh install of TrueNAS Scale. Then I create a default pool and import my 13TB of data from a backup drive. After a few hours or a day I come back and the array is degraded. Most often the degraded drive is the brand new WD Red.
What I have tried:
I have replaced the reported faulty drive with a spare 5TB drive.
I have replaced all SATA cables.
I have replaced the whole computer.
What appears to be helping:
Now I have TrueNAS Core installed and so far so good. It has only been 12 hours, though...
Interestingly, my speeds are half of what I got with TrueNAS Scale.
I am also thinking that some power saving feature might be messing things up here, but I have all of the power saving features disabled. That also would not explain why the same setup with just a different TrueNAS version would help.
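One way to double-check the power saving theory on the Linux-based Scale install is to read the SATA link power management policy that the kernel exposes per host under sysfs. The snippet below is only a minimal sketch: the function name `sata_link_policies` is made up for illustration, and it assumes the standard Linux sysfs location `/sys/class/scsi_host/host*/link_power_management_policy`, which does not exist on the FreeBSD-based Core.

```python
from pathlib import Path


def sata_link_policies(sysfs_root="/sys/class/scsi_host"):
    """Return {host_name: policy} for every host exposing a link power policy.

    Hypothetical helper for illustration; hosts without the policy file
    (for example non-AHCI controllers) are skipped.
    """
    policies = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return policies
    for host in sorted(root.glob("host*")):
        policy_file = host / "link_power_management_policy"
        if policy_file.is_file():
            policies[host.name] = policy_file.read_text().strip()
    return policies


if __name__ == "__main__":
    for host, policy in sata_link_policies().items():
        # "max_performance" means link power saving is off for that port.
        note = "" if policy == "max_performance" else "  <- link power saving may be active"
        print(f"{host}: {policy}{note}")
```

Any value other than `max_performance` would mean the kernel may still put the SATA link into a low-power state, regardless of what is set in the BIOS.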
My logs:
Code:
root@NAS[~]# smartctl -x /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red (SMR)
Device Model:     WDC WD60EFAX-68JH4N1
Serial Number:    WD-WXB2DA1R5TLE
LU WWN Device Id: 5 0014ee 2bf94f8ca
Firmware Version: 83.00A83
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Nov 11 07:33:57 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS  VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K 200   200   051    -    0
  3 Spin_Up_Time            POS--K 225   224   021    -    3741
  4 Start_Stop_Count        -O--CK 100   100   000    -    17
  5 Reallocated_Sector_Ct   PO--CK 200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K 200   200   000    -    0
  9 Power_On_Hours          -O--CK 100   100   000    -    170
 10 Spin_Retry_Count        -O--CK 100   253   000    -    0
 11 Calibration_Retry_Count -O--CK 100   253   000    -    0
 12 Power_Cycle_Count       -O--CK 100   100   000    -    15
192 Power-Off_Retract_Count -O--CK 200   200   000    -    6
193 Load_Cycle_Count        -O--CK 200   200   000    -    35
194 Temperature_Celsius     -O---K 117   109   000    -    33
196 Reallocated_Event_Count -O--CK 200   200   000    -    0
197 Current_Pending_Sector  -O--CK 200   200   000    -    0
198 Offline_Uncorrectable   ----CK 100   253   000    -    0
199 UDMA_CRC_Error_Count    -O--CK 200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R-- 200   200   000    -    0

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 103 (device log contains only the most recent 24 errors)

Error 103 [6] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 00 00 40 04 70 40 00  Error: IDNF at LBA = 0x00400470 = 4195440
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 08 00 70 00 00 fd 41 c9 d0 40 08  04:56:25.132  READ FPDMA QUEUED
  61 00 08 00 68 00 02 ba a0 f4 70 40 08  04:56:25.017  WRITE FPDMA QUEUED
  61 00 08 00 60 00 02 ba a0 f2 70 40 08  04:56:25.016  WRITE FPDMA QUEUED
  61 00 08 00 58 00 00 00 40 04 70 40 08  04:56:25.010  WRITE FPDMA QUEUED
  61 00 08 00 50 00 00 00 40 02 70 40 08  04:56:25.010  WRITE FPDMA QUEUED

Error 102 [5] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 00 fd 7c 8e 30 40 00  Error: IDNF at LBA = 0xfd7c8e30 = 4252798512
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 08 00 50 00 00 03 c0 2a 38 40 08  04:56:09.354  READ FPDMA QUEUED
  60 00 08 00 48 00 01 1e e6 ae 48 40 08  04:56:08.671  READ FPDMA QUEUED
  60 00 08 00 40 00 01 1d f9 2a 08 40 08  04:56:08.669  READ FPDMA QUEUED
  60 00 08 00 38 00 00 fd 41 cb e0 40 08  04:56:08.669  READ FPDMA QUEUED
  61 07 c0 00 30 00 00 fd 7c 95 e8 40 08  04:56:08.669  WRITE FPDMA QUEUED

Error 101 [4] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 00 fd 7c 85 68 40 00  Error: IDNF at LBA = 0xfd7c8568 = 4252796264
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 10 00 00 00 02 ba a0 f2 90 40 08  04:56:00.465  READ FPDMA QUEUED
  60 00 10 00 f8 00 02 ba a0 f0 90 40 08  04:56:00.465  READ FPDMA QUEUED
  60 00 10 00 b8 00 00 00 40 02 90 40 08  04:56:00.465  READ FPDMA QUEUED
  61 07 c0 00 b0 00 00 fd 7c 86 70 40 08  04:56:00.465  WRITE FPDMA QUEUED
  61 07 b8 00 90 00 00 fd 7c 8e 30 40 08  04:56:00.465  WRITE FPDMA QUEUED

Error 100 [3] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
  When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 00 fd 7c 82 50 40 00  Error: IDNF at LBA = 0xfd7c8250 = 4252795472
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 08 00 08 00 00 fd 41 ca 90 40 08  04:55:54.512  READ FPDMA QUEUED
  60 00 08 00 00 00 01 18 5f 3a c0 40 08  04:55:54.052  READ FPDMA QUEUED
  60 00 18 00 f8 00 00 fd 41 c2 68 40 08  04:55:54.040  READ FPDMA QUEUED
  61 00 80 00 a0 00 00 fd 7c 85 f0 40 08  04:55:52.265  WRITE FPDMA QUEUED
  61 00 80 00 98 00 00 fd 7c 85 68 40 08  04:55:52.263  WRITE FPDMA QUEUED

Error 99 [2] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  10 -- 51 00 00 00 00 00 00 00 84 40 00  Error: IDNF at LBA = 0x00000084 = 132
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  61 00 01 00 18 00 00 00 00 00 86 40 08  04:03:39.619  WRITE FPDMA QUEUED
  61 00 01 00 10 00 00 00 00 00 85 40 08  04:03:39.619  WRITE FPDMA QUEUED
  61 00 01 00 08 00 00 00 00 00 84 40 08  04:03:39.619  WRITE FPDMA QUEUED
  61 00 01 00 00 00 00 00 00 00 83 40 08  04:03:39.619  WRITE FPDMA QUEUED
  ef 00 10 00 02 00 00 00 00 00 00 a0 08  04:03:39.614  SET FEATURES [Enable SATA feature]

Error 98 [1] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
  10 -- 51 00 00 00 00 00 00 00 86 40 00  Error: IDNF at LBA = 0x00000086 = 134

Error 97 [0] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
  10 -- 51 00 00 00 00 00 00 00 83 40 00  Error: IDNF at LBA = 0x00000083 = 131

Error 96 [23] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
  10 -- 51 00 00 00 00 00 00 00 87 40 00  Error: IDNF at LBA = 0x00000087 = 135

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        160              -
# 2  Extended offline  Completed without error  00%        98               -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
   1        0        0  Not_testing
   2        0        0  Not_testing
   3        0        0  Not_testing
   4        0        0  Not_testing
   5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       258 (0x0102)
Device State:                        DST executing in background (3)
Current Temperature:                 33 Celsius
Power Cycle Min/Max Temperature:     32/34 Celsius
Lifetime Min/Max Temperature:        26/41 Celsius
Under/Over Temperature Limit Count:  0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

SCT Error Recovery Control:
  Read:  70 (7.0 seconds)
  Write: 70 (7.0 seconds)

Device Statistics (GP Log 0x04)
Page  Offset Size Value       Flags Description
0x01  =====  =    =           ===   == General Statistics (rev 1) ==
0x01  0x008  4    15          ---   Lifetime Power-On Resets
0x01  0x010  4    170         ---   Power-on Hours
0x01  0x018  6    26397621528 ---   Logical Sectors Written
0x01  0x020  6    97102774    ---   Number of Write Commands
0x01  0x028  6    1103767073  ---   Logical Sectors Read
0x01  0x030  6    2038117     ---   Number of Read Commands
0x01  0x038  6    612000000   ---   Date and Time TimeStamp
0x03  =====  =    =           ===   == Rotating Media Statistics (rev 1) ==
0x03  0x008  4    170         ---   Spindle Motor Power-on Hours
0x03  0x010  4    116         ---   Head Flying Hours
0x03  0x018  4    42          ---   Head Load Events
0x03  0x020  4    0           ---   Number of Reallocated Logical Sectors
0x03  0x028  4    0           ---   Read Recovery Attempts
0x03  0x030  4    0           ---   Number of Mechanical Start Failures
0x03  0x038  4    0           ---   Number of Realloc. Candidate Logical Sectors
0x03  0x040  4    6           ---   Number of High Priority Unload Events
0x04  =====  =    =           ===   == General Errors Statistics (rev 1) ==
0x04  0x008  4    103         ---   Number of Reported Uncorrectable Errors
0x04  0x010  4    0           ---   Resets Between Cmd Acceptance and Completion
0x05  =====  =    =           ===   == Temperature Statistics (rev 1) ==
0x05  0x008  1    33          ---   Current Temperature
0x05  0x010  1    34          ---   Average Short Term Temperature
0x05  0x018  1    -           ---   Average Long Term Temperature
0x05  0x020  1    41          ---   Highest Temperature
0x05  0x028  1    28          ---   Lowest Temperature
0x05  0x030  1    36          ---   Highest Average Short Term Temperature
0x05  0x038  1    32          ---   Lowest Average Short Term Temperature
0x05  0x040  1    -           ---   Highest Average Long Term Temperature
0x05  0x048  1    -           ---   Lowest Average Long Term Temperature
0x05  0x050  4    0           ---   Time in Over-Temperature
0x05  0x058  1    65          ---   Specified Maximum Operating Temperature
0x05  0x060  4    0           ---   Time in Under-Temperature
0x05  0x068  1    0           ---   Specified Minimum Operating Temperature
0x06  =====  =    =           ===   == Transport Statistics (rev 1) ==
0x06  0x008  4    55          ---   Number of Hardware Resets
0x06  0x010  4    28          ---   Number of ASR Events
0x06  0x018  4    0           ---   Number of Interface CRC Errors
0xff  =====  =    =           ===   == Vendor Specific Statistics (rev 1) ==
0xff  0x008  7    0           ---   Vendor Specific
0xff  0x010  7    0           ---   Vendor Specific
0xff  0x018  7    0           ---   Vendor Specific

Pending Defects log (GP Log 0x0c)
No Defects Logged

SATA Phy Event Counters (GP Log 0x11)
ID      Size  Value  Description
0x0001  2     0      Command failed due to ICRC error
0x0002  2     0      R_ERR response for data FIS
0x0003  2     0      R_ERR response for device-to-host data FIS
0x0004  2     0      R_ERR response for host-to-device data FIS
0x0005  2     0      R_ERR response for non-data FIS
0x0006  2     0      R_ERR response for device-to-host non-data FIS
0x0007  2     0      R_ERR response for host-to-device non-data FIS
0x0008  2     0      Device-to-host non-data FIS retries
0x0009  2     8      Transition from drive PhyRdy to drive PhyNRdy
0x000a  2     9      Device-to-host register FISes sent due to a COMRESET
0x000b  2     0      CRC errors within host-to-device FIS
0x000d  2     0      Non-CRC errors within host-to-device FIS
0x000f  2     0      R_ERR response for host-to-device data FIS, CRC
0x0012  2     0      R_ERR response for host-to-device non-data FIS, CRC
0x8000  4     43309  Vendor specific

root@NAS[~]#
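IDNF stands for "ID not found", which a drive would normally report for an address it cannot locate, so it seemed worth checking whether the failing LBAs in the log above are even outside the drive. The following is a minimal sketch, not a smartmontools feature: `summarize_errors` is a hypothetical helper that pulls the `Error: ... at LBA = ...` entries and the `User Capacity` line out of pasted `smartctl -x` text and compares them, assuming the 512-byte logical sector size reported above.

```python
import re

# Patterns for the two kinds of smartctl lines this sketch cares about.
ERROR_RE = re.compile(r"Error: (\w+)\s+at LBA = 0x([0-9a-fA-F]+) = (\d+)")
CAPACITY_RE = re.compile(r"User Capacity:\s+([\d,]+) bytes")


def summarize_errors(smart_text, logical_sector_size=512):
    """Return (total_sectors, [(error_kind, lba, lba_is_in_range), ...]).

    Hypothetical helper; total_sectors is None if no capacity line is found.
    """
    match = CAPACITY_RE.search(smart_text)
    total_sectors = None
    if match:
        total_sectors = int(match.group(1).replace(",", "")) // logical_sector_size
    entries = []
    for kind, hex_lba, _decimal in ERROR_RE.findall(smart_text):
        lba = int(hex_lba, 16)
        in_range = None if total_sectors is None else lba < total_sectors
        entries.append((kind, lba, in_range))
    return total_sectors, entries


if __name__ == "__main__":
    # A few lines copied from the /dev/sda output above.
    sample = (
        "User Capacity: 6,001,175,126,016 bytes [6.00 TB]\n"
        "Error: IDNF at LBA = 0x00400470 = 4195440\n"
        "Error: IDNF at LBA = 0xfd7c8e30 = 4252798512\n"
        "Error: IDNF at LBA = 0x00000084 = 132\n"
    )
    total, entries = summarize_errors(sample)
    print("total sectors:", total)
    for kind, lba, in_range in entries:
        print(kind, lba, "in range" if in_range else "OUT OF RANGE")
```

For the entries shown, every failing LBA is below the drive's sector count, so the drive is rejecting addresses that should be valid.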
Code:
root@NAS[~]# smartctl -x /dev/sdc
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Blue
Device Model:     WDC WD60EZRZ-00GZ5B1
Serial Number:    WD-WXJ1H26LX894
LU WWN Device Id: 5 0014ee 20dd74b72
Firmware Version: 80.00A80
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Nov 11 07:35:36 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS  VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K 200   200   051    -    0
  3 Spin_Up_Time            POS--K 186   186   021    -    9666
  4 Start_Stop_Count        -O--CK 084   084   000    -    16207
  5 Reallocated_Sector_Ct   PO--CK 200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K 200   200   000    -    0
  9 Power_On_Hours          -O--CK 068   068   000    -    23538
 10 Spin_Retry_Count        -O--CK 100   100   000    -    0
 11 Calibration_Retry_Count -O--CK 100   100   000    -    0
 12 Power_Cycle_Count       -O--CK 100   100   000    -    228
192 Power-Off_Retract_Count -O--CK 200   200   000    -    123
193 Load_Cycle_Count        -O--CK 120   120   000    -    240509
194 Temperature_Celsius     -O---K 112   095   000    -    40
196 Reallocated_Event_Count -O--CK 200   200   000    -    0
197 Current_Pending_Sector  -O--CK 200   200   000    -    0
198 Offline_Uncorrectable   ----CK 200   200   000    -    0
199 UDMA_CRC_Error_Count    -O--CK 200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R-- 200   200   000    -    0

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        23528            -
# 2  Extended offline  Completed without error  00%        23465            -
Code:
root@NAS[~]# smartctl -x /dev/sdd
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Blue
Device Model:     WDC WD60EZRZ-00GZ5B1
Serial Number:    WD-WXJ1H26SMCYJ
LU WWN Device Id: 5 0014ee 20dd767bd
Firmware Version: 80.00A80
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Nov 11 07:36:09 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS  VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K 200   200   051    -    0
  3 Spin_Up_Time            POS--K 197   196   021    -    9125
  4 Start_Stop_Count        -O--CK 084   084   000    -    16189
  5 Reallocated_Sector_Ct   PO--CK 200   200   140    -    0
  7 Seek_Error_Rate         -OSR-K 200   200   000    -    0
  9 Power_On_Hours          -O--CK 068   068   000    -    23545
 10 Spin_Retry_Count        -O--CK 100   100   000    -    0
 11 Calibration_Retry_Count -O--CK 100   100   000    -    0
 12 Power_Cycle_Count       -O--CK 100   100   000    -    222
192 Power-Off_Retract_Count -O--CK 200   200   000    -    119
193 Load_Cycle_Count        -O--CK 122   122   000    -    235715
194 Temperature_Celsius     -O---K 115   101   000    -    37
196 Reallocated_Event_Count -O--CK 200   200   000    -    0
197 Current_Pending_Sector  -O--CK 200   200   000    -    0
198 Offline_Uncorrectable   ----CK 200   200   000    -    0
199 UDMA_CRC_Error_Count    -O--CK 200   200   000    -    0
200 Multi_Zone_Error_Rate   ---R-- 200   200   000    -    0

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        23535            -
# 2  Extended offline  Completed without error  00%        23472            -
# 3  Extended offline  Aborted by host          90%        0                -
Code:
root@NAS[~]# smartctl -x /dev/sde
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Blue
Device Model:     WDC WD60EZRZ-00RWYB1
Serial Number:    WD-WX21DB5096DA
LU WWN Device Id: 5 0014ee 2631e3d7f
Firmware Version: 80.00A80
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Nov 11 07:36:50 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Unavailable
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS  VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K 200   199   051    -    1
  3 Spin_Up_Time            POS--K 249   199   021    -    6508
  4 Start_Stop_Count        -O--CK 084   084   000    -    16193
  5 Reallocated_Sector_Ct   PO--CK 195   195   140    -    170
  7 Seek_Error_Rate         -OSR-K 200   200   000    -    0
  9 Power_On_Hours          -O--CK 066   066   000    -    25245
 10 Spin_Retry_Count        -O--CK 100   100   000    -    0
 11 Calibration_Retry_Count -O--CK 100   100   000    -    0
 12 Power_Cycle_Count       -O--CK 100   100   000    -    231
192 Power-Off_Retract_Count -O--CK 200   200   000    -    124
193 Load_Cycle_Count        -O--CK 118   118   000    -    247562
194 Temperature_Celsius     -O---K 110   098   000    -    42
196 Reallocated_Event_Count -O--CK 196   058   000    -    4
197 Current_Pending_Sector  -O--CK 200   200   000    -    1
198 Offline_Uncorrectable   ----CK 200   200   000    -    0
199 UDMA_CRC_Error_Count    -O--CK 200   200   000    -    1
200 Multi_Zone_Error_Rate   ---R-- 200   200   000    -    20

SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 133 (device log contains only the most recent 24 errors)

Error 133 [12] occurred at disk power-on lifetime: 25222 hours (1050 days + 22 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 00 02 e5 f7 d8 40 00  Error: UNC at LBA = 0x02e5f7d8 = 48625624
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 07 e8 00 b8 00 00 02 e5 fe d8 40 08  1d+00:36:56.472  READ FPDMA QUEUED
  60 07 e8 00 b0 00 00 02 e5 f6 f0 40 08  1d+00:36:56.464  READ FPDMA QUEUED
  60 07 e8 00 a8 00 00 02 e5 ef 08 40 08  1d+00:36:56.447  READ FPDMA QUEUED
  60 07 e8 00 a0 00 00 02 e5 e7 20 40 08  1d+00:36:56.442  READ FPDMA QUEUED
  60 07 e8 00 98 00 00 02 e5 df 38 40 08  1d+00:36:56.425  READ FPDMA QUEUED

Error 132 [11] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 02 ba a0 f4 af 40 00  Error: UNC at LBA = 0x2baa0f4af = 11721045167
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 01 00 b8 00 02 ba a0 f4 af 40 00  1d+23:26:53.428  READ FPDMA QUEUED
  60 00 01 00 b0 00 02 ba a0 f4 ae 40 00  1d+23:26:53.427  READ FPDMA QUEUED
  60 00 01 00 a8 00 02 ba a0 f4 ad 40 00  1d+23:26:53.399  READ FPDMA QUEUED
  2f 00 00 00 01 00 00 00 00 00 10 68 00  1d+23:26:53.377  READ LOG EXT
  60 00 01 00 a0 00 02 ba a0 f4 ac 40 00  1d+23:26:53.178  READ FPDMA QUEUED

Error 131 [10] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
  When the command that caused the error occurred, the device was active or idle.
  After command completion occurred, registers were:
  ER -- ST COUNT LBA_48 LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 00 00 00 02 ba a0 f4 ac 40 00  Error: UNC at LBA = 0x2baa0f4ac = 11721045164
  Commands leading to the command that caused the error were:
  CR FEATR COUNT LBA_48 LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
  -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
  60 00 01 00 a0 00 02 ba a0 f4 ac 40 00  1d+23:26:53.178  READ FPDMA QUEUED
  2f 00 00 00 01 00 00 00 00 00 10 68 00  1d+23:26:53.169  READ LOG EXT
  60 00 01 00 98 00 02 ba a0 f4 ab 40 00  1d+23:26:52.961  READ FPDMA QUEUED
  2f 00 00 00 01 00 00 00 00 00 10 68 00  1d+23:26:52.950  READ LOG EXT
  60 00 01 00 90 00 02 ba a0 f4 aa 40 00  1d+23:26:52.745  READ FPDMA QUEUED

Error 130 [9] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
  40 -- 51 00 00 00 02 ba a0 f4 ab 40 00  Error: UNC at LBA = 0x2baa0f4ab = 11721045163

Error 129 [8] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
  40 -- 51 00 00 00 02 ba a0 f4 aa 40 00  Error: UNC at LBA = 0x2baa0f4aa = 11721045162

Error 128 [7] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
  40 -- 51 00 00 00 02 ba a0 f4 a9 40 00  Error: UNC at LBA = 0x2baa0f4a9 = 11721045161

Error 127 [6] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
  40 -- 51 00 00 00 02 ba a0 f4 a8 40 00  Error: UNC at LBA = 0x2baa0f4a8 = 11721045160

Error 126 [5] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
  40 -- 51 00 01 00 02 ba a0 f4 a8 40 00  Error: UNC at LBA = 0x2baa0f4a8 = 11721045160

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description  Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline     Completed without error  00%        25235            -
# 2  Extended offline  Completed without error  00%        25171            -

root@NAS[~]#
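Since every drive reports "PASSED" overall, the raw attribute values are more telling than the health summary. Below is a rough sketch for pulling the usual trouble indicators out of pasted `smartctl -x` text. The helper name `flag_attributes` and the choice of watched IDs (5, 196, 197, 198, 199) are my own assumptions, and the regex only handles the plain numeric raw values seen in these dumps.

```python
import re

# Attribute IDs commonly watched for media or cabling trouble (an assumption,
# not an official list).
WATCHED_IDS = {5, 196, 197, 198, 199}

# ID, name, 6-character flags, VALUE, WORST, THRESH, FAIL, RAW_VALUE.
ROW_RE = re.compile(
    r"\b(\d{1,3})\s+([A-Za-z][\w\-]*)\s+([A-Z\-]{6})\s+"
    r"(\d{3})\s+(\d{3})\s+(\d{3})\s+(\S+)\s+(\d+)"
)


def flag_attributes(smart_text, watched=WATCHED_IDS):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for attr_id, name, _flags, _value, _worst, _thresh, _fail, raw in ROW_RE.findall(smart_text):
        if int(attr_id) in watched and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged


if __name__ == "__main__":
    # Rows copied from the /dev/sde output above.
    sample = """
      5 Reallocated_Sector_Ct   PO--CK 195 195 140 - 170
      9 Power_On_Hours          -O--CK 066 066 000 - 25245
    196 Reallocated_Event_Count -O--CK 196 058 000 - 4
    197 Current_Pending_Sector  -O--CK 200 200 000 - 1
    198 Offline_Uncorrectable   ----CK 200 200 000 - 0
    199 UDMA_CRC_Error_Count    -O--CK 200 200 000 - 1
    """
    for name, raw in flag_attributes(sample).items():
        print(f"{name}: {raw}")
```

Run against the four dumps, only /dev/sde would be flagged (reallocated and pending sectors plus one CRC error), while the new WD Red shows zeros in all of these despite its 103 logged IDNF errors.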
Code:
Nov 11 00:53:12 NAS kernel: ipmi_si: Unable to find any System Interface(s)
Nov 11 00:53:12 NAS kernel: md: resync of RAID array md127
Nov 11 00:53:12 NAS kernel: Adding 2097084k swap on /dev/mapper/md127. Priority:-2 extents:1 across:2097084k FS
Nov 11 00:53:44 NAS kernel: md: md127: resync done.
Nov 11 00:54:19 NAS kernel: ata3.00: configured for UDMA/133
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=7s
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 Sense Key : Illegal Request [current]
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 Add. Sense: Logical block address out of range
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 CDB: Write(16) 8a 00 00 00 00 01 8e c4 8f 80 00 00 01 00 00 00
Nov 11 00:54:19 NAS kernel: zio pool=Pool vdev=/dev/disk/by-partuuid/d470d17d-88af-40b8-9f48-cfda45d58857 error=5 type=2 offset=3423241895936 size=131072 flags=40080caa
Nov 11 00:54:19 NAS kernel: ata3: EH complete
Nov 11 00:54:27 NAS kernel: ata3.00: configured for UDMA/133
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=15s
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 Sense Key : Illegal Request [current]
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 Add. Sense: Logical block address out of range
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 CDB: Write(16) 8a 00 00 00 00 01 8e c4 90 88 00 00 01 00 00 00
Nov 11 00:54:27 NAS kernel: zio pool=Pool vdev=/dev/disk/by-partuuid/d470d17d-88af-40b8-9f48-cfda45d58857 error=5 type=2 offset=3423242031104 size=131072 flags=40080caa
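To see whether "Logical block address out of range" can be taken literally, the failed Write(16) command can be decoded by hand: in a 16-byte WRITE(16) CDB, bytes 2 to 9 hold the LBA and bytes 10 to 13 the transfer length, both big-endian. The sketch below does that for the first failed command. `decode_write16` is a hypothetical helper, and it assumes that sdb is one of the 6 TB drives (its SMART output is not among the dumps above) with 512-byte logical sectors.

```python
def decode_write16(cdb_hex):
    """Decode a SCSI WRITE(16) CDB given as a hex string.

    Returns (lba, transfer_length_in_blocks). Hypothetical helper.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, blocks


SECTOR = 512  # logical sector size reported by smartctl
# Assumption: sdb has the same capacity as the 6 TB drives listed above.
DRIVE_SECTORS = 6_001_175_126_016 // SECTOR

if __name__ == "__main__":
    lba, blocks = decode_write16("8a 00 00 00 00 01 8e c4 8f 80 00 00 01 00 00 00")
    zio_offset = 3423241895936  # offset= from the matching zio line

    print("LBA:", lba, hex(lba))
    print("length:", blocks, "blocks =", blocks * SECTOR, "bytes")
    print("inside drive:", lba < DRIVE_SECTORS)
    # The gap between the on-disk byte address and the ZFS offset should be
    # where the data partition starts (my interpretation, not from the log).
    print("partition start (bytes):", lba * SECTOR - zio_offset)
```

The decoded length is 256 blocks, i.e. 131072 bytes, which matches `size=131072` in the zio line, and the LBA is well inside a 6 TB drive. So the drive appears to be rejecting a write to an address that should be valid, rather than ZFS writing past the end of the disk.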