pinoli
Hello,
I have been receiving weird S.M.A.R.T. results on one of my disks.
I am running TrueNAS SCALE Bluefin 22.12.1 with a Broadcom 9405W-16i HBA (rest in signature) and have never had issues with my other 9 drives.
The disk in question is a Seagate X18 SAS 18TB (ST18000NM004J), part of a mirrored vdev.
TrueNAS SCALE has been randomly showing these errors in the UI.
When I run sudo smartctl -x /dev/sde I get this:
Code:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.79+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: SEAGATE
Product: ST18000NM004J
Revision: E001
Compliance: SPC-5
User Capacity: 18,000,207,937,536 bytes [18.0 TB]
Logical block size: 512 bytes
Physical block size: 4096 bytes
LU is fully provisioned
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000c500d85c582f
Serial number: ZR56JHQZ0000C216GKBL
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Thu Mar 16 04:05:39 2023 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
Read Cache is: Enabled
Writeback Cache is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature: 33 C
Drive Trip Temperature: 60 C
Manufactured in week 51 of year 2021
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 60
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 12008
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 3677215160
Blocks received from initiator = 3828203624
Blocks read from cache and sent to initiator = 469201710
Number of read and write commands whose size <= segment size = 133715517
Number of read and write commands whose size > segment size = 3781631
Vendor (Seagate/Hitachi) factory information
number of hours powered up = 6534.43
number of minutes until next internal SMART test = 49
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0     272153.845           0
write:         0        0         0         0          0      67948.050           0
verify:        0        0         0         0          0        208.749           0
Non-medium error count: 0
[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Failed in segment -->        -      6532    35143810608 [0x3 0x11 0x0]
# 2  Background short  Completed                    -      6508              - [- - -]
# 3  Background short  Completed                    -      6484              - [- - -]
# 4  Background short  Completed                    -      6460              - [- - -]
# 5  Background short  Failed in segment -->        -      6436    35145130168 [0x3 0x11 0x0]
# 6  Background short  Completed                    -      6412              - [- - -]
# 7  Background short  Failed in segment -->        -      6388    35145130168 [0x3 0x11 0x0]
# 8  Background short  Completed                    -      6364              - [- - -]
# 9  Background short  Completed                    -      6340              - [- - -]
#10  Background short  Failed in segment -->        -      6316    35143810608 [0x3 0x11 0x0]
#11  Background short  Completed                    -      6292              - [- - -]
#12  Background short  Failed in segment -->        -      6268    35145130168 [0x3 0x11 0x0]
#13  Background short  Completed                    -      6244              - [- - -]
#14  Background short  Completed                    -      6220              - [- - -]
#15  Background short  Completed                    -      6196              - [- - -]
#16  Background short  Failed in segment -->        -      6172    35143810608 [0x3 0x11 0x0]
#17  Background short  Failed in segment -->        -      6148    35143810608 [0x3 0x11 0x0]
#18  Background short  Failed in segment -->        -      6124    35145130168 [0x3 0x11 0x0]
#19  Background short  Completed                    -      6100              - [- - -]
#20  Background short  Failed in segment -->        -      6076    35143810608 [0x3 0x11 0x0]
Long (extended) Self-test duration: 65535 seconds [1092.2 minutes]
Background scan results log
Status: no scans active
Accumulated power on time, hours:minutes 6534:26 [392066 minutes]
Number of background scans performed: 0, scan progress: 0.00%
Number of background medium scans performed: 0
# when lba(hex) [sk,asc,ascq] reassign_status
1 843:45 0000000820a74870 [1,18,4] Recovered via rewrite in-place
2 844:01 000000082a91c0c0 [1,18,8] Recovered via rewrite in-place
3 844:01 000000082a973580 [3,11,0] Recovered via rewrite in-place
4 844:02 000000082ac4d178 [1,18,8] Recovered via rewrite in-place
5 844:08 000000082ec152d8 [3,11,0] Require Write or Reassign Blocks command
6 844:09 000000082ec15b78 [3,11,0] Require Write or Reassign Blocks command
7 844:09 000000082ed020b8 [3,11,0] Require Write or Reassign Blocks command
8 844:09 000000082ed03ab8 [3,16,0] Require Write or Reassign Blocks command
9 844:09 000000082ed03aa0 [3,11,0] Require Write or Reassign Blocks command
10 844:10 000000082ebbfe30 [3,11,0] Require Write or Reassign Blocks command
49152 844:01 0001000105523818 [1,18,8] Recovered via rewrite in-place
49153 844:02 0001000105589a2f [1,18,8] Recovered via rewrite in-place
Protocol Specific port log page for SAS SSP
relative target port id = 1
generation code = 2
number of phys = 1
phy identifier = 0
attached device type: SAS or SATA device
attached reason: unknown
reason: unknown
negotiated logical link rate: phy enabled; 12 Gbps
attached initiator port: ssp=1 stp=1 smp=1
attached target port: ssp=0 stp=0 smp=0
SAS address = 0x5000c500d85c582d
attached SAS address = 0x500605b00fe37e22
attached phy identifier = 2
Invalid DWORD count = 1
Running disparity error count = 1
Loss of DWORD synchronization = 126
Phy reset problem = 26
Phy event descriptors:
Invalid word count: 1
Running disparity error count: 1
Loss of dword synchronization count: 126
Phy reset problem count: 26
relative target port id = 2
generation code = 2
number of phys = 1
phy identifier = 1
attached device type: no device attached
attached reason: unknown
reason: unknown
negotiated logical link rate: phy enabled; unknown
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=0
SAS address = 0x5000c500d85c582e
attached SAS address = 0x0
attached phy identifier = 0
Invalid DWORD count = 0
Running disparity error count = 0
Loss of DWORD synchronization = 0
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 0
Running disparity error count: 0
Loss of dword synchronization count: 0
Phy reset problem count: 0
What puzzles me is that the drive looks perfect apart from this section:
Code:
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Failed in segment -->        -      6532    35143810608 [0x3 0x11 0x0]
# 2  Background short  Completed                    -      6508              - [- - -]
# 3  Background short  Completed                    -      6484              - [- - -]
# 4  Background short  Completed                    -      6460              - [- - -]
# 5  Background short  Failed in segment -->        -      6436    35145130168 [0x3 0x11 0x0]
# 6  Background short  Completed                    -      6412              - [- - -]
# 7  Background short  Failed in segment -->        -      6388    35145130168 [0x3 0x11 0x0]
# 8  Background short  Completed                    -      6364              - [- - -]
# 9  Background short  Completed                    -      6340              - [- - -]
#10  Background short  Failed in segment -->        -      6316    35143810608 [0x3 0x11 0x0]
#11  Background short  Completed                    -      6292              - [- - -]
#12  Background short  Failed in segment -->        -      6268    35145130168 [0x3 0x11 0x0]
#13  Background short  Completed                    -      6244              - [- - -]
#14  Background short  Completed                    -      6220              - [- - -]
#15  Background short  Completed                    -      6196              - [- - -]
#16  Background short  Failed in segment -->        -      6172    35143810608 [0x3 0x11 0x0]
#17  Background short  Failed in segment -->        -      6148    35143810608 [0x3 0x11 0x0]
#18  Background short  Failed in segment -->        -      6124    35145130168 [0x3 0x11 0x0]
#19  Background short  Completed                    -      6100              - [- - -]
#20  Background short  Failed in segment -->        -      6076    35143810608 [0x3 0x11 0x0]
When the S.M.A.R.T. test fails, it says "Failed in segment -->" but doesn't show the segment.
Also, the LBA_first_err sectors are just those two: every failed test involves one of the same two sectors (35143810608 or 35145130168).
This drive has been in use for almost a year and has never had any issue, weird S.M.A.R.T. alerts aside.
Code:
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0     272153.845           0
write:         0        0         0         0          0      67948.050           0
verify:        0        0         0         0          0        208.749           0
Non-medium error count: 0
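To make the pattern in the self-test log easier to see, here is a rough Python sketch that tallies the failed short tests per LBA and checks how far apart the tests are. The rows are copied from the log above. The regular expression and the names (SELF_TEST_LOG, ROW, parse_self_tests) are just assumptions for this illustration and are not part of smartctl.

```python
import re
from collections import Counter

# Rows copied from the SMART self-test log above (most recent first).
SELF_TEST_LOG = """\
# 1 Background short Failed in segment --> - 6532 35143810608 [0x3 0x11 0x0]
# 2 Background short Completed - 6508 - [- - -]
# 3 Background short Completed - 6484 - [- - -]
# 4 Background short Completed - 6460 - [- - -]
# 5 Background short Failed in segment --> - 6436 35145130168 [0x3 0x11 0x0]
# 6 Background short Completed - 6412 - [- - -]
# 7 Background short Failed in segment --> - 6388 35145130168 [0x3 0x11 0x0]
# 8 Background short Completed - 6364 - [- - -]
# 9 Background short Completed - 6340 - [- - -]
#10 Background short Failed in segment --> - 6316 35143810608 [0x3 0x11 0x0]
#11 Background short Completed - 6292 - [- - -]
#12 Background short Failed in segment --> - 6268 35145130168 [0x3 0x11 0x0]
#13 Background short Completed - 6244 - [- - -]
#14 Background short Completed - 6220 - [- - -]
#15 Background short Completed - 6196 - [- - -]
#16 Background short Failed in segment --> - 6172 35143810608 [0x3 0x11 0x0]
#17 Background short Failed in segment --> - 6148 35143810608 [0x3 0x11 0x0]
#18 Background short Failed in segment --> - 6124 35145130168 [0x3 0x11 0x0]
#19 Background short Completed - 6100 - [- - -]
#20 Background short Failed in segment --> - 6076 35143810608 [0x3 0x11 0x0]
"""

# Hypothetical pattern for the row layout shown above; whitespace-tolerant.
ROW = re.compile(
    r"#\s*(?P<num>\d+)\s+Background short\s+"
    r"(?P<status>Failed in segment -->|Completed)\s+"
    r"(?P<segment>\S+)\s+(?P<hours>\d+)\s+(?P<lba>\S+)\s+\[(?P<sense>[^\]]*)\]"
)


def parse_self_tests(text):
    """Return one dict per line that matches the self-test row layout."""
    return [m.groupdict() for m in map(ROW.match, text.splitlines()) if m]


rows = parse_self_tests(SELF_TEST_LOG)
failed = [r for r in rows if r["status"].startswith("Failed")]
failures_by_lba = Counter(int(r["lba"]) for r in failed)

# Power-on hours between consecutive tests (the log is newest first).
hours = [int(r["hours"]) for r in rows]
intervals = {newer - older for newer, older in zip(hours, hours[1:])}

print(f"{len(failed)} of {len(rows)} short tests failed")
for lba, count in sorted(failures_by_lba.items()):
    print(f"  LBA {lba}: {count} failures")
print(f"power-on hours between tests: {sorted(intervals)}")
```

With the 20 rows above this counts 9 failed tests out of 20, 5 at LBA 35143810608 and 4 at LBA 35145130168, with every test 24 power-on hours after the previous one.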
As you can see, I waited a while before opening a thread about this issue, but the behavior has stayed the same for almost 9 months.
I am not sure what to do. I can still RMA the disk since it's within the warranty period, but seeing no errors on the drive and having had zero issues with my data so far makes me think this might be something different. Maybe those two blocks are unreadable? But then the read error count should go up, and yet it is zero.
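One check that may help with the "are those two blocks unreadable?" question: the self-test log prints LBA_first_err in decimal, while the background scan results log prints its LBAs in hex. The sketch below converts one to the other and looks for matches among the scan entries marked "Require Write or Reassign Blocks command". All values are copied from the smartctl output above; the names (match_scan_entries and so on) are made up for this example. As far as I can tell from the SCSI sense code tables, sense key 0x3 with ASC 0x11 and ASCQ 0x0 means "Medium Error, unrecovered read error", but treat that reading as an assumption.

```python
# Failing LBAs from the self-test log (decimal).
FAILED_SELF_TEST_LBAS = [35143810608, 35145130168]

# Background scan entries 5-10 from the output above:
# entry number -> (power-on time "hours:minutes", LBA in hex).
PENDING_SCAN_ENTRIES = {
    5: ("844:08", "000000082ec152d8"),
    6: ("844:09", "000000082ec15b78"),
    7: ("844:09", "000000082ed020b8"),
    8: ("844:09", "000000082ed03ab8"),
    9: ("844:09", "000000082ed03aa0"),
    10: ("844:10", "000000082ebbfe30"),
}

# From the information section: user capacity and logical block size.
CAPACITY_BYTES = 18_000_207_937_536
LOGICAL_BLOCK_SIZE = 512
TOTAL_BLOCKS = CAPACITY_BYTES // LOGICAL_BLOCK_SIZE


def match_scan_entries(lbas, entries):
    """Map each decimal LBA to its (entry number, time) in the scan log, or None."""
    by_lba = {int(hex_lba, 16): (num, when) for num, (when, hex_lba) in entries.items()}
    return {lba: by_lba.get(lba) for lba in lbas}


matches = match_scan_entries(FAILED_SELF_TEST_LBAS, PENDING_SCAN_ENTRIES)

for lba, hit in matches.items():
    position = lba / TOTAL_BLOCKS
    if hit is None:
        print(f"LBA {lba} (0x{lba:x}): no matching scan entry, {position:.2%} into the disk")
    else:
        num, when = hit
        print(
            f"LBA {lba} (0x{lba:x}): scan entry {num} at {when} power-on time, "
            f"{position:.2%} into the disk"
        )
```

Run against the values above, 35143810608 is 0x82ebbfe30 (scan entry 10) and 35145130168 is 0x82ed020b8 (scan entry 7). Both are [3,11,0] entries still flagged "Require Write or Reassign Blocks command", and both sit in the last 0.1% of the LBA range. That would suggest the two sectors have been pending a rewrite or reassignment since roughly 844 power-on hours, though I can't say for sure that this is what makes the short test fail.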
Should I run some specific test? Any help would be greatly appreciated, thanks.