markshelbyperry
TrueNAS (13.0) has been sending me the following critical error every day for several days now: "Device: /dev/da2, Read SMART Self-Test Log Failed. [date and time from that morning]". HOWEVER, the TrueNAS GUI and
smartctl -x /dev/da2 both appear to read the self-test log just fine, and the disk appears to be OK. Does anyone know what this is about?

System: 4x16TB + 4x8TB raidz1 SAS devs + 2 2x2TB special SATA vdevs | TrueNAS Core 13 | Xeon D-1541 | Supermicro X10SDV-TLN4F | 128GB DDR4 2300 ECC RAM | HDs connected to LSI 9400 16i and SSDs use onboard SATA | 450W PSU. The affected disk "/dev/da2" is an HGST enterprise model HUH728080AL4200.
Smartctl -x results:
Code:
root@fileserver[~]# smartctl -x /dev/da2
smartctl 7.2 2021-09-14 r5236 [FreeBSD 13.1-RELEASE-p7 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              HUH728080AL4200
Revision:             A7D8
Compliance:           SPC-4
User Capacity:        8,001,563,222,016 bytes [8.00 TB]
Logical block size:   4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca23bc8a7c0
Serial number:        2EKKAYKV
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Sun Apr 30 14:40:42 2023 EDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Read Cache is:        Enabled
Writeback Cache is:   Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     46 C
Drive Trip Temperature:        85 C

Manufactured in week 03 of year 2016
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  3720
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  6117
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 20380896632766464

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        2         0         2    1979769     179502.542           0
write:         0        0         0         0    6196136      88111.822           0
verify:        0        0         0         0     253876          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                     segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                                  number   (hours)
# 1  Background short  Completed                       -     51403              - [-   -    -]
# 2  Background short  Completed                       -     51377              - [-   -    -]
# 3  Background long   Completed                       -     51323              - [-   -    -]
# 4  Background long   Self test in progress ...       -       NOW              - [-   -    -]
# 5  Background short  Self test in progress ...       -       NOW              - [-   -    -]
# 6  Background long   Completed                       -     50919              - [-   -    -]
# 7  Background short  Completed                       -     50805              - [-   -    -]
# 8  Background short  Completed                       -     50469              - [-   -    -]
# 9  Background short  Completed                       -     50134              - [-   -    -]
#10  Background short  Completed                       -     49798              - [-   -    -]
#11  Background short  Completed                       -     49416              - [-   -    -]
#12  Background short  Completed                       -     49080              - [-   -    -]
#13  Background long   Completed                       -     48786              - [-   -    -]
#14  Background short  Completed                       -     48674              - [-   -    -]
#15  Background short  Completed                       -     48337              - [-   -    -]
#16  Background short  Completed                       -     47953              - [-   -    -]
#17  Background short  Completed                       -     47617              - [-   -    -]
#18  Background short  Completed                       -     47222              - [-   -    -]
#19  Background short  Completed                       -     46886              - [-   -    -]
#20  Background long   Completed                       -     46592              - [-   -    -]

Long (extended) Self-test duration: 65535 seconds [1092.2 minutes]

Background scan results log
  Status: scan is active
    Accumulated power on time, hours:minutes 51409:46 [3084586 minutes]
    Number of background scans performed: 323,  scan progress: 0.00%
    Number of background medium scans performed: 323

Protocol Specific port log page for SAS SSP
relative target port id = 1
  generation code = 3
  number of phys = 1
  phy identifier = 0
    attached device type: SAS or SATA device
    attached reason: power on
    reason: unknown
    negotiated logical link rate: phy enabled; 12 Gbps
    attached initiator port: ssp=1 stp=1 smp=1
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000cca23bc8a7c1
    attached SAS address = 0x500605b00fdeac65
    attached phy identifier = 5
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization = 0
    Phy reset problem = 0
    Phy event descriptors:
     Invalid word count: 0
     Running disparity error count: 0
     Loss of dword synchronization count: 0
     Phy reset problem count: 0
relative target port id = 2
  generation code = 3
  number of phys = 1
  phy identifier = 1
    attached device type: no device attached
    attached reason: unknown
    reason: power on
    negotiated logical link rate: phy enabled; unknown
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000cca23bc8a7c2
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization = 0
    Phy reset problem = 0
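In case it helps anyone compare logs across disks, here is a minimal Python sketch that pulls the rows out of a SCSI-style "SMART Self-test log" section and lists the ones whose status is anything other than "Completed". In the output above, rows #4 and #5 are the only such rows (both show "Self test in progress ..." with a lifetime of NOW). The regex, the function names `parse_selftest_rows` and `unfinished`, and the `SAMPLE` text are my own illustration, and the row layout is assumed from this one smartctl 7.2 output; it is not how TrueNAS or smartd reads the log.

```python
import re

# One row of the SCSI self-test log as printed by smartctl, for example:
# "# 4  Background long   Self test in progress ...   -   NOW   - [-   -    -]"
# Layout is assumed from the smartctl 7.2 output in this post.
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<test>Background (?:short|long))\s+"
    r"(?P<status>.+?)\s+"
    r"(?P<segment>-|\d+)\s+"
    r"(?P<hours>NOW|\d+)\s+"
    r"(?P<lba>-|\S+)\s+\["
)


def parse_selftest_rows(text):
    """Return one dict per self-test log row found in smartctl output text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


def unfinished(rows):
    """Return the rows whose status is not exactly 'Completed'."""
    return [row for row in rows if row["status"] != "Completed"]


# Three rows copied from the log in this post (hypothetical variable name).
SAMPLE = """\
# 3  Background long   Completed                   -   51323   - [-   -    -]
# 4  Background long   Self test in progress ...   -     NOW   - [-   -    -]
#10  Background short  Completed                   -   49798   - [-   -    -]
"""

if __name__ == "__main__":
    for row in unfinished(parse_selftest_rows(SAMPLE)):
        print(row["num"], row["test"], row["status"], row["hours"])
```

To run it against a live disk, the text could come from saving the `smartctl -x /dev/da2` output to a file and passing the file contents to `parse_selftest_rows`.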
Note for context/history: I was plagued by read/write and data errors affecting various disks for a while, but I *may* have recently fixed them by changing out my HBA a week or so ago. I have scrubbed the pool and SMART tested all the disks with successful results.
The pool, which had faulted under the old controller, appeared to import automatically with the new controller, with no new file corruption. The pool has a few permanent errors from an old fault, but they appear to be just performance log files:
Code:
errors: Permanent errors have been detected in the following files:

Pool1/.system/rrd-1e9984bcf13340bcb68bc263ecb0a902@auto-20221226.1800-3m:/localhost/disk-da4/disk_time.rrd
Pool1/.system/rrd-1e9984bcf13340bcb68bc263ecb0a902@auto-20221226.1800-3m:/localhost/disk-da5/disk_time.rrd
Pool1/.system/rrd-1e9984bcf13340bcb68bc263ecb0a902@auto-20221226.1800-3m:/localhost/ctl-tpc/disk_time.rrd
Pool1/.system/rrd-1e9984bcf13340bcb68bc263ecb0a902@auto-20221107.0400-3m:/localhost/memory/memory-inactive.rrd
Pool1/.system/rrd-1e9984bcf13340bcb68bc263ecb0a902@auto-20221226.1830-3m:/localhost/geom_stat/geom_latency-ada1.rrd
Pool1/.system/rrd-1e9984bcf13340bcb68bc263ecb0a902@auto-20221226.1830-3m:/localhost/geom_stat
Thanks in advance.