I am having an issue where my freshly built array keeps degrading. Most of the time I get write errors, but sometimes read errors as well.
This is my first experience with TrueNAS. I have done some research but can't figure this one out.
The most similar problem is described in this thread, but it is still different.
Setup:
OS: TrueNAS-SCALE-22.02.4
CPU: Intel(R) Pentium(R) G4500 @3.5 GHz
MB: Gigabyte Z170N-Gaming 5
RAM: 8GB (2x) Corsair 4GB DDR4 2400MHz
HDDS:
- 3x 6TB WD Blue WD60EZRZ (from 2016)
- 1x 6TB WD Red WD60EFAX brand new
- 1x Kingston 120GB SSD SA400S37/120G
ZFS: RAIDZ1 with above drives
PSU: PicoPSU-80-WI-32V
History:
This is the NAS setup I have used since 2016, just with Windows and motherboard RAID. It might not be the perfect solution, but it worked for my needs. Let's put it that way ;)
My UPS failed (of course my cache was set to not wait for write completion), 1 of the 4 WD drives failed, my array fell apart, and I spent the past month or so with data recovery software. I managed to recover everything, but I came very close to never seeing my data again.
Since then I have replaced the batteries, got a new drive, and run SMART and surface tests on the remaining old drives.
What Happens now:
Fresh install of TrueNAS Scale. Then I create a default pool and import my 13TB of data from a backup drive. After a few hours or a day I come back and the array is degraded. Most often the degraded drive is the brand new WD Red.
What have I tried:
I have replaced the reported faulty drive with a spare 5TB drive.
I have replaced all SATA cables.
I have replaced the whole computer.
What appears to be helping:
Now I have TrueNAS Core installed and so far so good. It has only been 12 hours, though...
Interestingly, my speeds are half of what I got with TrueNAS Scale.
I am also thinking that some power saving feature might be messing things up, but I have all of the power saving features disabled. That also would not explain why the same setup with just a different TrueNAS version would help.
My logs:
Code:
root@NAS[~]# smartctl -x /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red (SMR)
Device Model: WDC WD60EFAX-68JH4N1
Serial Number: WD-WXB2DA1R5TLE
LU WWN Device Id: 5 0014ee 2bf94f8ca
Firmware Version: 83.00A83
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
TRIM Command: Available, deterministic, zeroed
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 11 07:33:57 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 0
3 Spin_Up_Time POS--K 225 224 021 - 3741
4 Start_Stop_Count -O--CK 100 100 000 - 17
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 100 100 000 - 170
10 Spin_Retry_Count -O--CK 100 253 000 - 0
11 Calibration_Retry_Count -O--CK 100 253 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 15
192 Power-Off_Retract_Count -O--CK 200 200 000 - 6
193 Load_Cycle_Count -O--CK 200 200 000 - 35
194 Temperature_Celsius -O---K 117 109 000 - 33
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 0
198 Offline_Uncorrectable ----CK 100 253 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 0
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 103 (device log contains only the most recent 24 errors)
Error 103 [6] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 00 00 40 04 70 40 00 Error: IDNF at LBA = 0x00400470 = 4195440
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 70 00 00 fd 41 c9 d0 40 08 04:56:25.132 READ FPDMA QUEUED
61 00 08 00 68 00 02 ba a0 f4 70 40 08 04:56:25.017 WRITE FPDMA QUEUED
61 00 08 00 60 00 02 ba a0 f2 70 40 08 04:56:25.016 WRITE FPDMA QUEUED
61 00 08 00 58 00 00 00 40 04 70 40 08 04:56:25.010 WRITE FPDMA QUEUED
61 00 08 00 50 00 00 00 40 02 70 40 08 04:56:25.010 WRITE FPDMA QUEUED
Error 102 [5] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 00 fd 7c 8e 30 40 00 Error: IDNF at LBA = 0xfd7c8e30 = 4252798512
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 50 00 00 03 c0 2a 38 40 08 04:56:09.354 READ FPDMA QUEUED
60 00 08 00 48 00 01 1e e6 ae 48 40 08 04:56:08.671 READ FPDMA QUEUED
60 00 08 00 40 00 01 1d f9 2a 08 40 08 04:56:08.669 READ FPDMA QUEUED
60 00 08 00 38 00 00 fd 41 cb e0 40 08 04:56:08.669 READ FPDMA QUEUED
61 07 c0 00 30 00 00 fd 7c 95 e8 40 08 04:56:08.669 WRITE FPDMA QUEUED
Error 101 [4] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 00 fd 7c 85 68 40 00 Error: IDNF at LBA = 0xfd7c8568 = 4252796264
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 10 00 00 00 02 ba a0 f2 90 40 08 04:56:00.465 READ FPDMA QUEUED
60 00 10 00 f8 00 02 ba a0 f0 90 40 08 04:56:00.465 READ FPDMA QUEUED
60 00 10 00 b8 00 00 00 40 02 90 40 08 04:56:00.465 READ FPDMA QUEUED
61 07 c0 00 b0 00 00 fd 7c 86 70 40 08 04:56:00.465 WRITE FPDMA QUEUED
61 07 b8 00 90 00 00 fd 7c 8e 30 40 08 04:56:00.465 WRITE FPDMA QUEUED
Error 100 [3] occurred at disk power-on lifetime: 163 hours (6 days + 19 hours)
When the command that caused the error occurred, the device was doing SMART Offline or Self-test.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 00 fd 7c 82 50 40 00 Error: IDNF at LBA = 0xfd7c8250 = 4252795472
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 08 00 08 00 00 fd 41 ca 90 40 08 04:55:54.512 READ FPDMA QUEUED
60 00 08 00 00 00 01 18 5f 3a c0 40 08 04:55:54.052 READ FPDMA QUEUED
60 00 18 00 f8 00 00 fd 41 c2 68 40 08 04:55:54.040 READ FPDMA QUEUED
61 00 80 00 a0 00 00 fd 7c 85 f0 40 08 04:55:52.265 WRITE FPDMA QUEUED
61 00 80 00 98 00 00 fd 7c 85 68 40 08 04:55:52.263 WRITE FPDMA QUEUED
Error 99 [2] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
10 -- 51 00 00 00 00 00 00 00 84 40 00 Error: IDNF at LBA = 0x00000084 = 132
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
61 00 01 00 18 00 00 00 00 00 86 40 08 04:03:39.619 WRITE FPDMA QUEUED
61 00 01 00 10 00 00 00 00 00 85 40 08 04:03:39.619 WRITE FPDMA QUEUED
61 00 01 00 08 00 00 00 00 00 84 40 08 04:03:39.619 WRITE FPDMA QUEUED
61 00 01 00 00 00 00 00 00 00 83 40 08 04:03:39.619 WRITE FPDMA QUEUED
ef 00 10 00 02 00 00 00 00 00 00 a0 08 04:03:39.614 SET FEATURES [Enable SATA feature]
Error 98 [1] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
10 -- 51 00 00 00 00 00 00 00 86 40 00 Error: IDNF at LBA = 0x00000086 = 134
Error 97 [0] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
10 -- 51 00 00 00 00 00 00 00 83 40 00 Error: IDNF at LBA = 0x00000083 = 131
Error 96 [23] occurred at disk power-on lifetime: 152 hours (6 days + 8 hours)
10 -- 51 00 00 00 00 00 00 00 87 40 00 Error: IDNF at LBA = 0x00000087 = 135
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 160 -
# 2 Extended offline Completed without error 00% 98 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
SCT Status Version: 3
SCT Version (vendor specific): 258 (0x0102)
Device State: DST executing in background (3)
Current Temperature: 33 Celsius
Power Cycle Min/Max Temperature: 32/34 Celsius
Lifetime Min/Max Temperature: 26/41 Celsius
Under/Over Temperature Limit Count: 0/0
Vendor specific:
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
Device Statistics (GP Log 0x04)
Page Offset Size Value Flags Description
0x01 ===== = = === == General Statistics (rev 1) ==
0x01 0x008 4 15 --- Lifetime Power-On Resets
0x01 0x010 4 170 --- Power-on Hours
0x01 0x018 6 26397621528 --- Logical Sectors Written
0x01 0x020 6 97102774 --- Number of Write Commands
0x01 0x028 6 1103767073 --- Logical Sectors Read
0x01 0x030 6 2038117 --- Number of Read Commands
0x01 0x038 6 612000000 --- Date and Time TimeStamp
0x03 ===== = = === == Rotating Media Statistics (rev 1) ==
0x03 0x008 4 170 --- Spindle Motor Power-on Hours
0x03 0x010 4 116 --- Head Flying Hours
0x03 0x018 4 42 --- Head Load Events
0x03 0x020 4 0 --- Number of Reallocated Logical Sectors
0x03 0x028 4 0 --- Read Recovery Attempts
0x03 0x030 4 0 --- Number of Mechanical Start Failures
0x03 0x038 4 0 --- Number of Realloc. Candidate Logical Sectors
0x03 0x040 4 6 --- Number of High Priority Unload Events
0x04 ===== = = === == General Errors Statistics (rev 1) ==
0x04 0x008 4 103 --- Number of Reported Uncorrectable Errors
0x04 0x010 4 0 --- Resets Between Cmd Acceptance and Completion
0x05 ===== = = === == Temperature Statistics (rev 1) ==
0x05 0x008 1 33 --- Current Temperature
0x05 0x010 1 34 --- Average Short Term Temperature
0x05 0x018 1 - --- Average Long Term Temperature
0x05 0x020 1 41 --- Highest Temperature
0x05 0x028 1 28 --- Lowest Temperature
0x05 0x030 1 36 --- Highest Average Short Term Temperature
0x05 0x038 1 32 --- Lowest Average Short Term Temperature
0x05 0x040 1 - --- Highest Average Long Term Temperature
0x05 0x048 1 - --- Lowest Average Long Term Temperature
0x05 0x050 4 0 --- Time in Over-Temperature
0x05 0x058 1 65 --- Specified Maximum Operating Temperature
0x05 0x060 4 0 --- Time in Under-Temperature
0x05 0x068 1 0 --- Specified Minimum Operating Temperature
0x06 ===== = = === == Transport Statistics (rev 1) ==
0x06 0x008 4 55 --- Number of Hardware Resets
0x06 0x010 4 28 --- Number of ASR Events
0x06 0x018 4 0 --- Number of Interface CRC Errors
0xff ===== = = === == Vendor Specific Statistics (rev 1) ==
0xff 0x008 7 0 --- Vendor Specific
0xff 0x010 7 0 --- Vendor Specific
0xff 0x018 7 0 --- Vendor Specific
Pending Defects log (GP Log 0x0c)
No Defects Logged
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0002 2 0 R_ERR response for data FIS
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0005 2 0 R_ERR response for non-data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 2 8 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 9 Device-to-host register FISes sent due to a COMRESET
0x000b 2 0 CRC errors within host-to-device FIS
0x000d 2 0 Non-CRC errors within host-to-device FIS
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x8000 4 43309 Vendor specific
#
Code:
root@NAS[~]# smartctl -x /dev/sdc
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD60EZRZ-00GZ5B1
Serial Number: WD-WXJ1H26LX894
LU WWN Device Id: 5 0014ee 20dd74b72
Firmware Version: 80.00A80
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5700 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 11 07:35:36 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 0
3 Spin_Up_Time POS--K 186 186 021 - 9666
4 Start_Stop_Count -O--CK 084 084 000 - 16207
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 068 068 000 - 23538
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 100 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 228
192 Power-Off_Retract_Count -O--CK 200 200 000 - 123
193 Load_Cycle_Count -O--CK 120 120 000 - 240509
194 Temperature_Celsius -O---K 112 095 000 - 40
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 0
198 Offline_Uncorrectable ----CK 200 200 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 0
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 23528 -
# 2 Extended offline Completed without error 00% 23465 -
Code:
root@NAS[~]# smartctl -x /dev/sdd
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD60EZRZ-00GZ5B1
Serial Number: WD-WXJ1H26SMCYJ
LU WWN Device Id: 5 0014ee 20dd767bd
Firmware Version: 80.00A80
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5700 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 11 07:36:09 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 200 051 - 0
3 Spin_Up_Time POS--K 197 196 021 - 9125
4 Start_Stop_Count -O--CK 084 084 000 - 16189
5 Reallocated_Sector_Ct PO--CK 200 200 140 - 0
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 068 068 000 - 23545
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 100 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 222
192 Power-Off_Retract_Count -O--CK 200 200 000 - 119
193 Load_Cycle_Count -O--CK 122 122 000 - 235715
194 Temperature_Celsius -O---K 115 101 000 - 37
196 Reallocated_Event_Count -O--CK 200 200 000 - 0
197 Current_Pending_Sector -O--CK 200 200 000 - 0
198 Offline_Uncorrectable ----CK 200 200 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 0
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 0
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 23535 -
# 2 Extended offline Completed without error 00% 23472 -
# 3 Extended offline Aborted by host 90% 0 -
Code:
root@NAS[~]# smartctl -x /dev/sde
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.142+truenas] (local build)
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Blue
Device Model: WDC WD60EZRZ-00RWYB1
Serial Number: WD-WX21DB5096DA
LU WWN Device Id: 5 0014ee 2631e3d7f
Firmware Version: 80.00A80
User Capacity: 6,001,175,126,016 bytes [6.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5700 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 11 07:36:50 2022 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
DSN feature is: Unavailable
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate POSR-K 200 199 051 - 1
3 Spin_Up_Time POS--K 249 199 021 - 6508
4 Start_Stop_Count -O--CK 084 084 000 - 16193
5 Reallocated_Sector_Ct PO--CK 195 195 140 - 170
7 Seek_Error_Rate -OSR-K 200 200 000 - 0
9 Power_On_Hours -O--CK 066 066 000 - 25245
10 Spin_Retry_Count -O--CK 100 100 000 - 0
11 Calibration_Retry_Count -O--CK 100 100 000 - 0
12 Power_Cycle_Count -O--CK 100 100 000 - 231
192 Power-Off_Retract_Count -O--CK 200 200 000 - 124
193 Load_Cycle_Count -O--CK 118 118 000 - 247562
194 Temperature_Celsius -O---K 110 098 000 - 42
196 Reallocated_Event_Count -O--CK 196 058 000 - 4
197 Current_Pending_Sector -O--CK 200 200 000 - 1
198 Offline_Uncorrectable ----CK 200 200 000 - 0
199 UDMA_CRC_Error_Count -O--CK 200 200 000 - 1
200 Multi_Zone_Error_Rate ---R-- 200 200 000 - 20
SMART Extended Comprehensive Error Log Version: 1 (6 sectors)
Device Error Count: 133 (device log contains only the most recent 24 errors)
Error 133 [12] occurred at disk power-on lifetime: 25222 hours (1050 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 00 02 e5 f7 d8 40 00 Error: UNC at LBA = 0x02e5f7d8 = 48625624
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 07 e8 00 b8 00 00 02 e5 fe d8 40 08 1d+00:36:56.472 READ FPDMA QUEUED
60 07 e8 00 b0 00 00 02 e5 f6 f0 40 08 1d+00:36:56.464 READ FPDMA QUEUED
60 07 e8 00 a8 00 00 02 e5 ef 08 40 08 1d+00:36:56.447 READ FPDMA QUEUED
60 07 e8 00 a0 00 00 02 e5 e7 20 40 08 1d+00:36:56.442 READ FPDMA QUEUED
60 07 e8 00 98 00 00 02 e5 df 38 40 08 1d+00:36:56.425 READ FPDMA QUEUED
Error 132 [11] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 02 ba a0 f4 af 40 00 Error: UNC at LBA = 0x2baa0f4af = 11721045167
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 01 00 b8 00 02 ba a0 f4 af 40 00 1d+23:26:53.428 READ FPDMA QUEUED
60 00 01 00 b0 00 02 ba a0 f4 ae 40 00 1d+23:26:53.427 READ FPDMA QUEUED
60 00 01 00 a8 00 02 ba a0 f4 ad 40 00 1d+23:26:53.399 READ FPDMA QUEUED
2f 00 00 00 01 00 00 00 00 00 10 68 00 1d+23:26:53.377 READ LOG EXT
60 00 01 00 a0 00 02 ba a0 f4 ac 40 00 1d+23:26:53.178 READ FPDMA QUEUED
Error 131 [10] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT LBA_48 LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 00 00 00 02 ba a0 f4 ac 40 00 Error: UNC at LBA = 0x2baa0f4ac = 11721045164
Commands leading to the command that caused the error were:
CR FEATR COUNT LBA_48 LH LM LL DV DC Powered_Up_Time Command/Feature_Name
-- == -- == -- == == == -- -- -- -- -- --------------- --------------------
60 00 01 00 a0 00 02 ba a0 f4 ac 40 00 1d+23:26:53.178 READ FPDMA QUEUED
2f 00 00 00 01 00 00 00 00 00 10 68 00 1d+23:26:53.169 READ LOG EXT
60 00 01 00 98 00 02 ba a0 f4 ab 40 00 1d+23:26:52.961 READ FPDMA QUEUED
2f 00 00 00 01 00 00 00 00 00 10 68 00 1d+23:26:52.950 READ LOG EXT
60 00 01 00 90 00 02 ba a0 f4 aa 40 00 1d+23:26:52.745 READ FPDMA QUEUED
Error 130 [9] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
40 -- 51 00 00 00 02 ba a0 f4 ab 40 00 Error: UNC at LBA = 0x2baa0f4ab = 11721045163
Error 129 [8] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
40 -- 51 00 00 00 02 ba a0 f4 aa 40 00 Error: UNC at LBA = 0x2baa0f4aa = 11721045162
Error 128 [7] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
40 -- 51 00 00 00 02 ba a0 f4 a9 40 00 Error: UNC at LBA = 0x2baa0f4a9 = 11721045161
Error 127 [6] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
40 -- 51 00 00 00 02 ba a0 f4 a8 40 00 Error: UNC at LBA = 0x2baa0f4a8 = 11721045160
Error 126 [5] occurred at disk power-on lifetime: 24144 hours (1006 days + 0 hours)
40 -- 51 00 01 00 02 ba a0 f4 a8 40 00 Error: UNC at LBA = 0x2baa0f4a8 = 11721045160
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 25235 -
# 2 Extended offline Completed without error 00% 25171 -
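As a quick sanity check on the error logs above, here is a minimal Python sketch that pulls the "Error: ... at LBA" lines out of smartctl text, tallies them by type, and compares each LBA against the drive size. The function name `parse_errors` and the `SAMPLE` excerpt are just for illustration; the sector count is derived from the reported "User Capacity" and the 512-byte logical sector size, and I am assuming that figure applies to every 6TB drive here.

```python
import re
from collections import Counter

# Matches smartctl lines such as:
#   "... Error: IDNF at LBA = 0x00400470 = 4195440"
ERROR_LINE = re.compile(r"Error: (\w+) at LBA = 0x([0-9a-fA-F]+) = (\d+)")

# Assumption: capacity and logical sector size as printed by smartctl above.
TOTAL_SECTORS = 6_001_175_126_016 // 512


def parse_errors(text):
    """Return a list of (error_type, lba) tuples found in smartctl output."""
    errors = []
    for kind, hex_lba, dec_lba in ERROR_LINE.findall(text):
        lba = int(hex_lba, 16)
        # smartctl prints the LBA in hex and decimal; both must agree.
        if lba != int(dec_lba):
            raise ValueError(f"hex/decimal LBA mismatch: {hex_lba} vs {dec_lba}")
        errors.append((kind, lba))
    return errors


# A few lines copied from the sda and sde logs above.
SAMPLE = """
10 -- 51 00 00 00 00 00 40 04 70 40 00 Error: IDNF at LBA = 0x00400470 = 4195440
10 -- 51 00 00 00 00 00 00 00 84 40 00 Error: IDNF at LBA = 0x00000084 = 132
40 -- 51 00 00 00 02 ba a0 f4 af 40 00 Error: UNC at LBA = 0x2baa0f4af = 11721045167
"""

errors = parse_errors(SAMPLE)
print(Counter(kind for kind, _ in errors))
for kind, lba in errors:
    status = "within capacity" if lba < TOTAL_SECTORS else "beyond capacity"
    print(f"{kind} at LBA {lba}: {status}")
```

With these sample lines every LBA falls inside the reported capacity, so at least for them the IDNF errors do not look like the host addressing past the end of the disk.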
Code:
Nov 11 00:53:12 NAS kernel: ipmi_si: Unable to find any System Interface(s)
Nov 11 00:53:12 NAS kernel: md: resync of RAID array md127
Nov 11 00:53:12 NAS kernel: Adding 2097084k swap on /dev/mapper/md127. Priority:-2 extents:1 across:2097084k FS
Nov 11 00:53:44 NAS kernel: md: md127: resync done.
Nov 11 00:54:19 NAS kernel: ata3.00: configured for UDMA/133
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=7s
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 Sense Key : Illegal Request [current]
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 Add. Sense: Logical block address out of range
Nov 11 00:54:19 NAS kernel: sd 2:0:0:0: [sdb] tag#18 CDB: Write(16) 8a 00 00 00 00 01 8e c4 8f 80 00 00 01 00 00 00
Nov 11 00:54:19 NAS kernel: zio pool=Pool vdev=/dev/disk/by-partuuid/d470d17d-88af-40b8-9f48-cfda45d58857 error=5 type=2 offset=3423241895936 size=131072 flags=40080caa
Nov 11 00:54:19 NAS kernel: ata3: EH complete
Nov 11 00:54:27 NAS kernel: ata3.00: configured for UDMA/133
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=15s
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 Sense Key : Illegal Request [current]
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 Add. Sense: Logical block address out of range
Nov 11 00:54:27 NAS kernel: sd 2:0:0:0: [sdb] tag#14 CDB: Write(16) 8a 00 00 00 00 01 8e c4 90 88 00 00 01 00 00 00
Nov 11 00:54:27 NAS kernel: zio pool=Pool vdev=/dev/disk/by-partuuid/d470d17d-88af-40b8-9f48-cfda45d58857 error=5 type=2 offset=3423242031104 size=131072 flags=40080caa
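To see what the kernel was actually asking the disk to do when it got "Logical block address out of range", the Write(16) CDB from the log can be decoded by hand. Below is a minimal Python sketch of that; `decode_write16` is a made-up helper name, it relies on the standard WRITE(16) layout (opcode 0x8a, 8-byte big-endian LBA in bytes 2 to 9, 4-byte transfer length in bytes 10 to 13), and the 512-byte logical sector size is an assumption for sdb, since no smartctl output for that device is shown.

```python
LOGICAL_SECTOR = 512  # assumption: sdb uses 512-byte logical sectors


def decode_write16(cdb_hex):
    """Decode a SCSI WRITE(16) CDB given as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of logical blocks to transfer.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("not a 16-byte WRITE(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")      # bytes 2..9
    blocks = int.from_bytes(cdb[10:14], "big")  # bytes 10..13
    return lba, blocks


# First failing command from the kernel log above.
lba, blocks = decode_write16("8a 00 00 00 00 01 8e c4 8f 80 00 00 01 00 00 00")
print(f"LBA {lba}, {blocks} blocks, {blocks * LOGICAL_SECTOR} bytes")
# prints "LBA 6690213760, 256 blocks, 131072 bytes"
```

The 131072-byte length matches the size=131072 in the zio line that follows it in the log. The LBA is also well below the 11,721,045,168 sectors a 6TB drive reports, so if sdb is one of those drives the write was not literally past the end of the disk; that part is only an inference from the numbers, not something the log states.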