Hi all,
I'm rather new to TrueNAS, but am trying to make sense of this. My host is a Dell T620 with 12x3.5" bays and an H710p flashed to IT mode. I have 12x3TB HGST drives in it and am running VMware 7.03 on the metal, passing the drives through to TrueNAS Core as a guest. S.M.A.R.T. is throwing errors on 10 out of 12 drives, stating that the error count has increased from 3 to 4. I've included the output of two drives as a sample, below. The self-test will occasionally complete but, as you can see, it usually reports that it failed in the second segment.
root@truenas[~]# smartctl -x /dev/da1
smartctl 7.2 2021-09-14 r5236 [FreeBSD 13.1-RELEASE amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HGST
Product: HUS724030ALS640
Revision: A124
Compliance: SPC-4
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Logical block size: 512 bytes
LU is resource provisioned, LBPRZ=0
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca027c64a90
Serial number: P8KJ1N4W
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Sun Jan 8 15:28:23 2023 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
Read Cache is: Enabled
Writeback Cache is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 35 C
Drive Trip Temperature: 85 C
Manufactured in week 13 of year 2014
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 66
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 1869
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 5169733250842624
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:     569346        3         0    569349    101716216    1397605.623           0
write:         0        0         0         0        76480      20048.961           0
verify:     3420        0         0      3420         1256        230.685           0
Non-medium error count: 0
Self-test execution status: 100% of test remaining
SMART Self-test log
Num  Test              Status                     segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                                  number   (hours)
# 1  Background short  Self test in progress ...     2       NOW                 - [-   -    -]
# 2  Background short  Failed in second segment      2     46322                 - [0x4 0x3e 0x3]
# 3  Background short  Failed in second segment      2     46322                 - [0x4 0x3e 0x3]
# 4  Background short  Failed in second segment      2     46321                 - [0x4 0x3e 0x3]
# 5  Background short  Failed in second segment      2     46320                 - [0x4 0x3e 0x3]
# 6  Background short  Failed in second segment      2         0                 - [0x4 0x3e 0x3]
Long (extended) Self-test duration: 29637 seconds [493.9 minutes]
Background scan results log
Status: scan is active
Accumulated power on time, hours:minutes 46323:56 [2779436 minutes]
Number of background scans performed: 252, scan progress: 33.04%
Number of background medium scans performed: 252
Protocol Specific port log page for SAS SSP
relative target port id = 1
generation code = 1
number of phys = 1
phy identifier = 0
attached device type: expander device
attached reason: SMP phy control function
reason: unknown
negotiated logical link rate: phy enabled; 6 Gbps
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=1
SAS address = 0x5000cca027c64a91
attached SAS address = 0x500056b37789abff
attached phy identifier = 4
Invalid DWORD count = 12
Running disparity error count = 12
Loss of DWORD synchronization = 2
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 12
Running disparity error count: 12
Loss of dword synchronization count: 2
Phy reset problem count: 0
relative target port id = 2
generation code = 1
number of phys = 1
phy identifier = 1
attached device type: no device attached
attached reason: unknown
reason: power on
negotiated logical link rate: phy enabled; unknown
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=0
SAS address = 0x5000cca027c64a92
attached SAS address = 0x0
attached phy identifier = 0
Invalid DWORD count = 0
Running disparity error count = 0
Loss of DWORD synchronization = 0
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 0
Running disparity error count: 0
Loss of dword synchronization count: 0
Phy reset problem count: 0
root@truenas[~]# smartctl -x /dev/da2
smartctl 7.2 2021-09-14 r5236 [FreeBSD 13.1-RELEASE amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Vendor: HGST
Product: HUS724030ALS640
Revision: A124
Compliance: SPC-4
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Logical block size: 512 bytes
LU is resource provisioned, LBPRZ=0
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Logical Unit id: 0x5000cca027c6b62c
Serial number: P8KJ8U9W
Device type: disk
Transport protocol: SAS (SPL-3)
Local Time is: Sun Jan 8 15:29:57 2023 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
Temperature Warning: Enabled
Read Cache is: Enabled
Writeback Cache is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK
Current Drive Temperature: 35 C
Drive Trip Temperature: 85 C
Manufactured in week 13 of year 2014
Specified cycle count over device lifetime: 50000
Accumulated start-stop cycles: 68
Specified load-unload count over device lifetime: 600000
Accumulated load-unload cycles: 1868
Elements in grown defect list: 0
Vendor (Seagate Cache) information
Blocks sent to initiator = 6048071880278016
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:     453038        0         0    453038     86568375    1452301.097           0
write:         0        0         0         0       104536      23293.369           0
verify:     1711        0         0      1711          584        178.038           0
Non-medium error count: 0
SMART Self-test log
Num  Test              Status                     segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                                  number   (hours)
# 1  Background short  Failed in second segment      2     46284                 - [0x4 0x3e 0x3]
# 2  Background short  Failed in second segment      2     46283                 - [0x4 0x3e 0x3]
# 3  Background short  Failed in second segment      2     46283                 - [0x4 0x3e 0x3]
# 4  Background short  Failed in second segment      2     46282                 - [0x4 0x3e 0x3]
# 5  Background short  Failed in second segment      2     46281                 - [0x4 0x3e 0x3]
# 6  Background short  Failed in second segment      2         0                 - [0x4 0x3e 0x3]
Long (extended) Self-test duration: 29637 seconds [493.9 minutes]
Background scan results log
Status: scan is active
Accumulated power on time, hours:minutes 46284:47 [2777087 minutes]
Number of background scans performed: 239, scan progress: 33.25%
Number of background medium scans performed: 239
Protocol Specific port log page for SAS SSP
relative target port id = 1
generation code = 1
number of phys = 1
phy identifier = 0
attached device type: expander device
attached reason: SMP phy control function
reason: unknown
negotiated logical link rate: phy enabled; 6 Gbps
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=1
SAS address = 0x5000cca027c6b62d
attached SAS address = 0x500056b37789abff
attached phy identifier = 5
Invalid DWORD count = 8
Running disparity error count = 8
Loss of DWORD synchronization = 2
Phy reset problem = 0
Phy event descriptors:
Invalid word count: 8
Running disparity error count: 8
Loss of dword synchronization count: 2
Phy reset problem count: 0
relative target port id = 2
generation code = 1
number of phys = 1
phy identifier = 1
attached device type: no device attached
attached reason: unknown
reason: power on
negotiated logical link rate: phy enabled; unknown
attached initiator port: ssp=0 stp=0 smp=0
attached target port: ssp=0 stp=0 smp=0
SAS address = 0x5000cca027c6b62e
attached SAS address = 0x0
attached phy identifier = 0
Invalid DWORD count = 0
Running disparity error count = 0
Loss of DWORD synchronization = 0
Phy reset problem = 0
Any ideas? I use this as cold storage only, and it runs maybe a few hours a month.