Hello,
I've read all the posts in this forum (and there are more than a few), as well as quite a few posts on this topic in other forums, and I still can't figure out what I should be looking at.
I get warnings on 2 of my SSDs:
Device: /dev/sdi [SAT], 2 Currently unreadable (pending) sectors.
Device: /dev/sde [SAT], 1 Currently unreadable (pending) sectors.
In the web GUI, under S.M.A.R.T Test Results for the disks, I see Remaining: N/A; Lifetime: 894; Error: N/A (Lifetime is different for each disk).
Since I'm a simple home user, I do what the experts say:
1 - I run the SMART test manually:
smartctl -t long /dev/sdi and smartctl -t long /dev/sde
2 - look for errors in the output:
smartctl -a /dev/sdi and smartctl -a /dev/sde
Code:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.79+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: SPCC Solid State Disk
Serial Number: 30083939530
LU WWN Device Id: 5 000000 000003061
Firmware Version: 030fAA20
User Capacity: 512,110,190,592 bytes [512 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available, deterministic, zeroed
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-4 (minor revision not indicated)
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Feb 15 16:51:37 2023 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x02) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 33) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 2) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x0031) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 20
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0032 100 100 050 Old_age Always - 0
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 1405
12 Power_Cycle_Count 0x0012 100 100 000 Old_age Always - 92
167 Unknown_Attribute 0x0022 100 100 000 Old_age Always - 0
168 Unknown_Attribute 0x0012 100 100 000 Old_age Always - 0
169 Unknown_Attribute 0x0013 100 100 010 Pre-fail Always - 0
170 Unknown_Attribute 0x0033 100 100 010 Pre-fail Always - 114
171 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
172 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
174 Unknown_Attribute 0x0022 100 100 000 Old_age Always - 0
175 Program_Fail_Count_Chip 0x0033 100 100 010 Pre-fail Always - 0
177 Wear_Leveling_Count 0x0012 100 100 000 Old_age Always - 0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 100 100 000 Pre-fail Always - 114
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0033 100 100 090 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 000 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0012 100 100 000 Old_age Always - 92
194 Temperature_Celsius 0x0022 040 040 000 Old_age Always - 40 (Min/Max 40/40)
196 Reallocated_Event_Count 0x0012 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 100 100 000 Old_age Always - 0
206 Unknown_SSD_Attribute 0x0032 200 200 000 Old_age Always - 7
207 Unknown_SSD_Attribute 0x0032 200 200 000 Old_age Always - 72
208 Unknown_SSD_Attribute 0x0032 200 200 000 Old_age Always - 27
231 Unknown_SSD_Attribute 0x0023 098 098 005 Pre-fail Always - 2
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age Always - 11327
234 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 17780
241 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 1857
242 Total_LBAs_Read 0x0032 100 100 000 Old_age Always - 8147
243 Unknown_Attribute 0x0032 050 050 000 Old_age Always - 38
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1402 -
# 2 Short offline Completed without error 00% 1343 -
# 3 Extended offline Completed without error 00% 1295 -
# 4 Extended offline Completed without error 00% 1228 -
# 5 Short offline Completed without error 00% 1215 -
# 6 Extended offline Completed without error 00% 1108 -
# 7 Short offline Completed without error 00% 1060 -
# 8 Short offline Completed without error 00% 894 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
128 0 65535 Read_scanning was completed without error
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Code:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.79+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Wed Feb 15 16:57:47 2023 EET
Use smartctl -X to abort test.
root@truenas[~]# smartctl -a /dev/sde
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.79+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: SPCC Solid State Disk
Serial Number: 30083933959
LU WWN Device Id: 5 000000 000001202
Firmware Version: 030fAA20
User Capacity: 512,110,190,592 bytes [512 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
TRIM Command: Available, deterministic, zeroed
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-4 (minor revision not indicated)
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Feb 15 16:58:10 2023 EET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x02) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 33) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 2) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x0031) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 20
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0032 100 100 050 Old_age Always - 0
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 1393
12 Power_Cycle_Count 0x0012 100 100 000 Old_age Always - 90
167 Unknown_Attribute 0x0022 100 100 000 Old_age Always - 0
168 Unknown_Attribute 0x0012 100 100 000 Old_age Always - 0
169 Unknown_Attribute 0x0013 100 100 010 Pre-fail Always - 0
170 Unknown_Attribute 0x0033 100 100 010 Pre-fail Always - 114
171 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
172 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
174 Unknown_Attribute 0x0022 100 100 000 Old_age Always - 0
175 Program_Fail_Count_Chip 0x0033 100 100 010 Pre-fail Always - 0
177 Wear_Leveling_Count 0x0012 100 100 000 Old_age Always - 0
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033 100 100 000 Pre-fail Always - 114
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0033 100 100 090 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 000 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0012 100 100 000 Old_age Always - 90
194 Temperature_Celsius 0x0022 040 040 000 Old_age Always - 40 (Min/Max 40/40)
196 Reallocated_Event_Count 0x0012 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 100 100 000 Old_age Always - 0
206 Unknown_SSD_Attribute 0x0032 200 200 000 Old_age Always - 7
207 Unknown_SSD_Attribute 0x0032 200 200 000 Old_age Always - 63
208 Unknown_SSD_Attribute 0x0032 200 200 000 Old_age Always - 25
231 Unknown_SSD_Attribute 0x0023 098 098 005 Pre-fail Always - 2
233 Media_Wearout_Indicator 0x0032 100 100 000 Old_age Always - 8492
234 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 10539
241 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 1899
242 Total_LBAs_Read 0x0032 100 100 000 Old_age Always - 7407
243 Unknown_Attribute 0x0032 050 050 000 Old_age Always - 37
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1393 -
# 2 Short offline Completed without error 00% 1331 -
# 3 Extended offline Completed without error 00% 1283 -
# 4 Extended offline Completed without error 00% 1276 -
# 5 Short offline Completed without error 00% 1203 -
# 6 Extended offline Completed without error 00% 1096 -
# 7 Short offline Completed without error 00% 1048 -
# 8 Extended offline Completed without error 00% 1014 -
# 9 Extended offline Completed without error 00% 1014 -
#10 Short offline Completed without error 00% 882 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
128 0 65535 Read_scanning was completed without error
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
And here I was already very confused. Everywhere it says No Errors Logged and Completed without error.
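A quick way to double-check that across a long self-test log, rather than reading every row, is to count the rows whose status is anything else. This is only a minimal sketch: `failed_selftests` is a made-up helper name, not part of smartmontools, and it assumes the log rows look like the ones above (a leading `#`, the test number, then the status).

```shell
#!/bin/sh
# Hypothetical helper: count self-test log rows whose status is NOT
# "Completed without error". Reads `smartctl -l selftest` or `smartctl -a`
# output on stdin. A test that is still running is also counted, because
# its status text differs as well.
failed_selftests() {
    awk '/^# *[0-9]+ / && !/Completed without error/ { n++ } END { print n + 0 }'
}

# Usage on a live system (as root):
#   smartctl -l selftest /dev/sdi | failed_selftests
```

For the two logs posted above, every row reads Completed without error, so the count would be 0 for both drives.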
I built the server at Christmas:
ProLiant ML350e Gen8 v2
HP H220 (SAS2308_1(D1) 20.00.07.00 14.01.30.16 07.39.02.00)
2*Intel(R) Xeon(R) CPU E5-2450 v2 @ 2.50GHz
72GB RAM
8 brand new SSDs for data and 1 for boot
TrueNAS-SCALE-22.12.0
Now, on to my questions:
1 - What should I do about the Currently unreadable (pending) sectors problem?
2 - Which of all the information from the SMART test should I watch and follow over time?
3 - This is a suggestion to the developers: could the OS read all the information after a test is run as a task and show it under S.M.A.R.T Test Results in the web GUI, so that we simple users see a simplified result that still includes everything important? And, if possible, add a "fix" button for the problem.
Many users know how to use the shell, but many do not. I'm trying commands I found on the forums and hoping they work for me too. It took me a while to figure out that some of them can't be run as admin and that I have to log in as root.
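As an illustration of what such a simplified result might contain, here is a minimal sketch that condenses `smartctl -a` output into one line. `smart_summary` is a hypothetical helper name, and the choice of fields (overall health, attributes 5 and 197, the error log) is an assumption about what matters, not an official recommendation. On drives that are not in the smartctl database, like the ones above, attribute names and meanings may not be reliable.

```shell
#!/bin/sh
# Hypothetical helper: reduce `smartctl -a` output (on stdin) to one line.
# Attribute rows have 10 or more fields; the NF check keeps the selective
# self-test rows (which also start with small numbers) from being mistaken
# for attributes.
smart_summary() {
    awk '
        /overall-health self-assessment test result:/ { health = $NF }
        $1 == "5"   && NF >= 10 { realloc = $10 }   # Reallocated_Sector_Ct raw value
        $1 == "197" && NF >= 10 { pending = $10 }   # Current_Pending_Sector raw value
        /No Errors Logged/ { errlog = "none" }
        END {
            printf "health=%s reallocated=%s pending=%s error_log=%s\n",
                   (health  == "" ? "unknown" : health),
                   (realloc == "" ? "unknown" : realloc),
                   (pending == "" ? "unknown" : pending),
                   (errlog  == "" ? "check"   : errlog)
        }'
}

# Usage on a live system (as root):
#   smartctl -a /dev/sde | smart_summary
```

Saving that one line per disk after each scheduled test would make it easy to see whether the reallocated or pending counts change over time.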
Sorry for my English, and thanks for your time!