Unsure why I got ZFS errors.

Joined
Jan 29, 2024
Messages
6
My TrueNAS system reported a few errors on two hard drives on Saturday. For context, I have a pool with 4 hard drives: 1 is connected via SATA and the rest are on my HBA card through SAS-to-SATA breakout cables. My immediate thought was to write down the serial numbers of the drives that had errors, shut down the system and check the wiring. Like many of you probably would, I was expecting both drives to be connected to my HBA, but in fact one drive was on motherboard SATA and one drive was on the HBA. So this rules out a bad HBA, correct? I booted the system back up and ran a SMART long test on both affected drives, and they came back flawless. Ok, weird. I had also noticed the errors were erased from my dashboard, and they were not visible in zpool status. A scrub had run as scheduled that night too, and nothing came of it.

So now I'm confused: what just happened? Bad RAM? No, because a kernel panic would have occurred? Bad PSU, likely? I mean, maybe I DO in fact have a bad HBA, and the SATA ports on my motherboard are also bad or just incompatible with Linux, which I've heard is a thing from other posts. But that seems highly unlikely. I'm new to TrueNAS, and ZFS too. This is the first RAID pool I've ever set up, so please be kind about my troubleshooting steps; hopefully I missed something basic that would reveal my issue. Below I have listed my hardware.

Ryzen 3600
Asus Prime B550-PLUS (it is ECC compatible)
Crucial 16 GB ECC UDIMM DDR4 2666 MHz 2Rx8 (used)
Adaptec ASR-71605 HBA card (used, repaste + Noctua fan mod, runs cold on both sides of the card)
EVGA 600 W 100-W1-0600-K1 (I know, needs to be replaced, was only supposed to be temporary, but it is brand new)
APC UPS 600 VA/330 W with the TrueNAS UPS service set up
2x 16 TB Western Digital HC550
2x 18 TB Western Digital HC550
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Adaptec ASR-71605 HBA Card
Here it is.

Adaptec HBAs are poorly supported and you should be running a Broadcom/LSI HBA if you want reliability.
 
Joined
Jan 29, 2024
Messages
6
What would you recommend instead? And how did I get errors if one of the drives was connected to my motherboard's SATA?
 
Joined
Jan 29, 2024
Messages
6
I specced out a Dell H310; would this be a more viable option for TrueNAS? I thought those Adaptec cards were fine, but I think I got the recommendation from some Unraid guys...
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Your system is not something we'd recommend, but if it's working and ECC is working, then it's fine. You just need to replace the HBA with an LSI IT mode HBA, such as the SAS 9300.
That said, why do you even want an HBA? For four disks, why aren't you just using the SATA ports on the motherboard?
 
Joined
Jan 29, 2024
Messages
6
I did have two other drives in the system, but they unfortunately failed a few weeks back (RIP WD Red, they ran for 6 years). I believe this motherboard only has 4 ports available too, as I'm using an NVMe drive for the boot media and that disables 2 SATA ports. I'm planning on expanding this server even further soon.

My question wasn't answered, though: why do I need to replace this HBA when it isn't giving me issues and a Google search turns up recommendations for it? And why was I getting ZFS errors if one drive was on my motherboard's SATA and the other drive was on the HBA?
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
I'm a Core user, not Scale (about which I know very little), but I'll try to throw some light on basic issues you may be experiencing:

First, you've not provided nearly enough information for anyone here to answer your questions. For starters, the error messages you received would likely have indicated the type of errors, and suggestions might then have been made for getting additional pertinent information from the system to begin drilling down on the issues.

From your words you might be confusing the results of SMART testing with pool data errors. SMART results describe the drive only, not the data stored on it.

There have been multiple posts here about Ryzen tweaks for stability: no overclocking for one (never on servers), then disable Cool'n'Quiet and C6 states in your BIOS. These are the known stability tweaks for Ryzen boards from a quick search here.

The history of pseudo-HBAs and RAID controllers has been so bad here that they'll always be fingered first as problems, and the feedback from affected users about the experience of replacing them has supported this almost-blanket recommendation.

My suggestion to you is that you do replace the HBA per @Ericloewe 's suggestion, then come back here after you are up and running again with full details about any further issues you experience.

Good luck.

EDIT - please post your long test results on your drives and the results of zpool status -v
 
Joined
Jan 29, 2024
Messages
6
Like I said, the errors cleared, but this is what I got from my email service. I'm not sure how to pull up the history of errors.

New alerts:
  • Pool main state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
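
On pulling up the history: ZFS keeps a log of recent error reports that can be listed with `sudo zpool events` (add `-v` for full detail). That log appears to be held in kernel memory only, so it is probably empty after the shutdown described above. If it does contain anything, a small Python sketch like the one below can tally the error-report classes from saved output. The function name `summarize_zpool_events` and the sample text are made up for illustration; the sample is not output from this system.

```python
from collections import Counter


def summarize_zpool_events(text):
    """Count ZFS error-report classes in the output of `zpool events`."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Data rows end with the event class; skip blanks and the header row.
        if not fields or fields[-1] == "CLASS":
            continue
        event_class = fields[-1]
        # Only error reports are interesting here, not scrub/config events.
        if event_class.startswith("ereport."):
            counts[event_class] += 1
    return counts


# Illustrative sample only (assumed layout: timestamp columns, then class).
SAMPLE = """\
TIME                           CLASS
Jan 27 2024 02:10:11.123456789 ereport.fs.zfs.checksum
Jan 27 2024 02:10:11.223456789 ereport.fs.zfs.checksum
Jan 27 2024 02:10:12.323456789 ereport.fs.zfs.io
Jan 28 2024 00:00:03.423456789 sysevent.fs.zfs.scrub_start
"""

if __name__ == "__main__":
    for event_class, n in summarize_zpool_events(SAMPLE).most_common():
        print(f"{n:3d}  {event_class}")
```

To try it on real data, save the output with `sudo zpool events > events.txt` and pass the file contents to the function. The verbose form also names the affected vdev, which helps match reports to drives.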

And thank you for the info on my Ryzen motherboard, I believe I have it all set properly but I will double check as soon as I can. I will be ordering an LSI hba soon as per the recommendations, but I do want to see if this error pops up one more time to rule out if this was a weird, one time fluke. Below I have posted zpool status and long test result:
admin@truenas[~]$ sudo zpool status main
[sudo] password for admin:
  pool: main
 state: ONLINE
  scan: scrub repaired 0B in 23:58:50 with 0 errors on Sun Jan 21 23:58:52 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        main                                      ONLINE       0     0     0
          mirror-0                                ONLINE       0     0     0
            282f164c-2fe7-437c-a25c-e8f9a849270c  ONLINE       0     0     0
            ae32555d-57c4-48aa-a4ef-ac7c47ad5d50  ONLINE       0     0     0
          mirror-1                                ONLINE       0     0     0
            933d5cf4-524e-450d-9cc1-cf5ba6f420a2  ONLINE       0     0     0
            38f95d20-65f7-476a-946f-d45f38a6d6ee  ONLINE       0     0     0

errors: No known data errors
admin@truenas[~]$
/dev/sdd
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 5123 -
# 2 Extended offline Completed without error 00% 4374 -
# 3 Extended offline Interrupted (host reset) 10% 3654 -
# 4 Extended offline Completed without error 00% 2909 -
# 5 Extended offline Completed without error 00% 2192 -
# 6 Extended offline Completed without error 00% 1452 -
# 7 Extended offline Aborted by host 90% 688 -
# 8 Short offline Completed without error 00% 72 -
/dev/sda
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1233 -
# 2 Extended offline Completed without error 00% 609 -
# 3 Short offline Completed without error 00% 81 -
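
For longer self-test logs, the entries that did not finish cleanly can be picked out automatically. The following is a minimal Python sketch, assuming the log rows look like the ones pasted above; `incomplete_self_tests` is a hypothetical helper name, not part of smartmontools.

```python
import re

# One self-test log row, e.g.
# "# 3 Extended offline Interrupted (host reset) 10% 3654 -"
ROW = re.compile(r"^#\s*(\d+)\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$")


def incomplete_self_tests(log_text):
    """Return (num, description, remaining_percent, lifetime_hours) for every
    self-test that did not end with 'Completed without error'."""
    found = []
    for line in log_text.splitlines():
        m = ROW.match(line.strip())
        if m is None:
            continue  # header row or unrelated line
        num, description, remaining, hours, _lba = m.groups()
        if "Completed without error" not in description:
            found.append((int(num), description, int(remaining), int(hours)))
    return found


# The /dev/sdd self-test log as posted in this thread.
SDD_LOG = """\
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 5123 -
# 2 Extended offline Completed without error 00% 4374 -
# 3 Extended offline Interrupted (host reset) 10% 3654 -
# 4 Extended offline Completed without error 00% 2909 -
# 5 Extended offline Completed without error 00% 2192 -
# 6 Extended offline Completed without error 00% 1452 -
# 7 Extended offline Aborted by host 90% 688 -
# 8 Short offline Completed without error 00% 72 -
"""

if __name__ == "__main__":
    for entry in incomplete_self_tests(SDD_LOG):
        print(entry)
```

Both entries it flags here ("Interrupted (host reset)" and "Aborted by host") indicate that the test was stopped from the host side, for example by a reset or reboot, rather than that the drive found a fault.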
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Below I have posted zpool status and long test result:
Thanks for the follow-up.

Well, yes, but:

1. Using zpool status main instead of zpool status -v didn't give the status of your boot pool.

2. You did not post the complete output of the SMART test on the disks (e.g. you left out the individual attribute values), and

3. You only posted for two of your 4 hard disks.

If you want help here, you'll be expected to post the complete output of diagnostic commands. I suggest that you use SSH instead of the Shell as it's easier to copy and paste from.

I don't know where logs are in Scale so I can't help there.
 
Joined
Jan 29, 2024
Messages
6
My apologies! I snipped what I thought was useful, but I will provide everything you need below. I was trying not to spam the thread. You'll see an EightTB pool that is just a drive I have plugged in for quick smb shares across the house and can be ignored. And I redacted the S/N from my smart outputs.

  pool: Main
 state: ONLINE
  scan: scrub repaired 0B in 23:58:50 with 0 errors on Sun Jan 21 23:58:52 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        Main                                      ONLINE       0     0     0
          mirror-0                                ONLINE       0     0     0
            282f164c-2fe7-437c-a25c-e8f9a849270c  ONLINE       0     0     0
            ae32555d-57c4-48aa-a4ef-ac7c47ad5d50  ONLINE       0     0     0
          mirror-1                                ONLINE       0     0     0
            933d5cf4-524e-450d-9cc1-cf5ba6f420a2  ONLINE       0     0     0
            38f95d20-65f7-476a-946f-d45f38a6d6ee  ONLINE       0     0     0

errors: No known data errors

  pool: EightTB
 state: ONLINE
  scan: scrub repaired 0B in 08:17:17 with 0 errors on Sun Jan 21 08:17:19 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        EightTB                                   ONLINE       0     0     0
          fd97622d-24c3-4686-a99e-016d7e2c2941    ONLINE       0     0     0

errors: No known data errors

  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:01:34 with 0 errors on Wed Jan 24 03:46:35 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        boot-pool                                 ONLINE       0     0     0
          mirror-0                                ONLINE       0     0     0
            wwn-0x5ace42e0704992fa-part3          ONLINE       0     0     0
            sdf3                                  ONLINE       0     0     0

errors: No known data errors
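
The READ, WRITE and CKSUM columns are where per-device errors show up. These counters are reset by `zpool clear` and, as far as is known here, are also not kept across a reboot, which would fit the errors vanishing after the shutdown. A minimal Python sketch for scanning saved `zpool status` output for anything other than a clean row follows. `devices_with_errors` is a hypothetical helper name, and the second sample with a CKSUM count of 3 is invented purely to show what a hit looks like.

```python
import re

# A vdev row: NAME STATE READ WRITE CKSUM (any amount of whitespace between).
ROW = re.compile(
    r"^\s*(?P<name>\S+)\s+"
    r"(?P<state>ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)\s+"
    r"(?P<read>\S+)\s+(?P<write>\S+)\s+(?P<cksum>\S+)"
)


def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for rows that are not clean."""
    flagged = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # "pool:", "scan:", header and blank lines
        counters = (m["read"], m["write"], m["cksum"])
        # Counters are compared as text because large values print like "1.2K".
        if m["state"] != "ONLINE" or any(c != "0" for c in counters):
            flagged.append((m["name"], m["state"]) + counters)
    return flagged


# The Main pool exactly as posted in this thread (all counters zero).
CLEAN = """\
pool: Main
state: ONLINE
config:

NAME STATE READ WRITE CKSUM
Main ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
282f164c-2fe7-437c-a25c-e8f9a849270c ONLINE 0 0 0
ae32555d-57c4-48aa-a4ef-ac7c47ad5d50 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
933d5cf4-524e-450d-9cc1-cf5ba6f420a2 ONLINE 0 0 0
38f95d20-65f7-476a-946f-d45f38a6d6ee ONLINE 0 0 0

errors: No known data errors
"""

# Invented variant: pretend one disk had logged three checksum errors.
HYPOTHETICAL = CLEAN.replace(
    "ae32555d-57c4-48aa-a4ef-ac7c47ad5d50 ONLINE 0 0 0",
    "ae32555d-57c4-48aa-a4ef-ac7c47ad5d50 ONLINE 0 0 3",
)

if __name__ == "__main__":
    print(devices_with_errors(CLEAN))
    print(devices_with_errors(HYPOTHETICAL))
```

Running this from a scheduled job and keeping the output would preserve a record of the counters even if they are later cleared.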


admin@truenas[~]$ sudo smartctl -a /dev/sdb
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.131+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Ultrastar DC HC550
Device Model: WDC WUH721816ALE6L4
Serial Number:
LU WWN Device Id: 5 000cca 295f2edc2
Firmware Version: PCGNW680
User Capacity: 16,000,900,661,248 bytes [16.0 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-4 published, ANSI INCITS 529-2018
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Jan 30 23:13:21 2024 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 101) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: (1746) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
2 Throughput_Performance 0x0005 148 148 054 Pre-fail Offline - 48
3 Spin_Up_Time 0x0007 084 084 001 Pre-fail Always - 349 (Average 316)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 49
5 Reallocated_Sector_Ct 0x0033 100 100 001 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 140 140 020 Pre-fail Offline - 15
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 5777
10 Spin_Retry_Count 0x0013 100 100 001 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 49
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 291
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 291
194 Temperature_Celsius 0x0002 066 066 000 Old_age Always - 29 (Min/Max 24/50)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 5096 -
# 2 Extended offline Completed without error 00% 4348 -
# 3 Extended offline Interrupted (host reset) 10% 3628 -
# 4 Extended offline Completed without error 00% 2883 -
# 5 Extended offline Completed without error 00% 2166 -
# 6 Extended offline Completed without error 00% 1426 -
# 7 Extended offline Aborted by host 90% 662 -
# 8 Extended offline Completed without error 00% 90 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

admin@truenas[~]$ sudo smartctl -a /dev/sdd
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.131+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Ultrastar DC HC550
Device Model: WDC WUH721816ALE6L4
Serial Number:
LU WWN Device Id: 5 000cca 295f2ea9e
Firmware Version: PCGNW680
User Capacity: 16,000,900,661,248 bytes [16.0 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-4 published, ANSI INCITS 529-2018
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Jan 30 23:14:11 2024 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 101) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: (1697) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
2 Throughput_Performance 0x0005 148 148 054 Pre-fail Offline - 48
3 Spin_Up_Time 0x0007 084 084 001 Pre-fail Always - 331 (Average 303)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 52
5 Reallocated_Sector_Ct 0x0033 100 100 001 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 140 140 020 Pre-fail Offline - 15
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 5803
10 Spin_Retry_Count 0x0013 100 100 001 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 52
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 296
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 296
194 Temperature_Celsius 0x0002 066 066 000 Old_age Always - 29 (Min/Max 25/49)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 5123 -
# 2 Extended offline Completed without error 00% 4374 -
# 3 Extended offline Interrupted (host reset) 10% 3654 -
# 4 Extended offline Completed without error 00% 2909 -
# 5 Extended offline Completed without error 00% 2192 -
# 6 Extended offline Completed without error 00% 1452 -
# 7 Extended offline Aborted by host 90% 688 -
# 8 Short offline Completed without error 00% 72 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

admin@truenas[~]$ sudo smartctl -a /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.131+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Ultrastar DC HC550
Device Model: WDC WUH721818ALE604
Serial Number:
LU WWN Device Id: 5 000cca 2a9d42b5a
Firmware Version: PCGNW680
User Capacity: 18,000,207,937,536 bytes [18.0 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-4 published, ANSI INCITS 529-2018
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Jan 30 23:15:03 2024 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 101) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: (1977) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
2 Throughput_Performance 0x0005 147 147 054 Pre-fail Offline - 52
3 Spin_Up_Time 0x0007 094 094 001 Pre-fail Always - 270
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 4
5 Reallocated_Sector_Ct 0x0033 100 100 001 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 140 140 020 Pre-fail Offline - 15
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 1292
10 Spin_Retry_Count 0x0013 100 100 001 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 58
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 58
194 Temperature_Celsius 0x0002 066 066 000 Old_age Always - 29 (Min/Max 25/34)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1233 -
# 2 Extended offline Completed without error 00% 609 -
# 3 Short offline Completed without error 00% 81 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

admin@truenas[~]$ sudo smartctl -a /dev/sde
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.131+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Ultrastar DC HC550
Device Model: WDC WUH721818ALE604
Serial Number:
LU WWN Device Id: 5 000cca 2a9c93664
Firmware Version: PCGNW680
User Capacity: 18,000,207,937,536 bytes [18.0 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-4 published, ANSI INCITS 529-2018
SATA Version is: SATA 3.3, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Tue Jan 30 23:15:32 2024 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 101) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: (2022) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
2 Throughput_Performance 0x0005 149 149 054 Pre-fail Offline - 44
3 Spin_Up_Time 0x0007 093 093 001 Pre-fail Always - 283
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 4
5 Reallocated_Sector_Ct 0x0033 100 100 001 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 140 140 020 Pre-fail Offline - 15
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 1292
10 Spin_Retry_Count 0x0013 100 100 001 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 4
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 58
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 58
194 Temperature_Celsius 0x0002 065 065 000 Old_age Always - 30 (Min/Max 25/34)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1236 -
# 2 Extended offline Completed without error 00% 607 -
# 3 Short offline Completed without error 00% 81 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
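
With four full reports to read, it can help to reduce each one to the few attributes that usually matter for this kind of problem: 5 (reallocated sectors), 197 (pending sectors), 198 (offline uncorrectable) and 199 (UDMA CRC errors, which commonly point at cabling or the link rather than the platters). The Python sketch below does that under the assumption that the attribute rows look like the ones above. `nonzero_watched_attributes` and the choice of watched IDs are illustrative, and the row with a CRC count of 12 is invented.

```python
import re

# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW (first integer)
ATTR = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]{4}"
    r"\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)

# Attribute IDs to check; an illustrative selection, not an official list.
WATCHED = {5, 197, 198, 199}


def nonzero_watched_attributes(smart_text):
    """Map attribute name -> raw value for watched attributes that are > 0."""
    hits = {}
    for line in smart_text.splitlines():
        m = ATTR.match(line)
        if m is None:
            continue  # not an attribute row
        raw = int(m["raw"])
        if int(m["id"]) in WATCHED and raw > 0:
            hits[m["name"]] = raw
    return hits


# Rows copied from the /dev/sdb report in this thread.
SDB_ATTRS = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 100 100 001 Pre-fail Always - 0
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 5777
194 Temperature_Celsius 0x0002 066 066 000 Old_age Always - 29 (Min/Max 24/50)
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 0
"""

# Invented row, only to show what a non-zero result looks like.
HYPOTHETICAL_ROW = "199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 12"

if __name__ == "__main__":
    print(nonzero_watched_attributes(SDB_ATTRS))
    print(nonzero_watched_attributes(HYPOTHETICAL_ROW))
```

All four drives above come out empty on this check, which matches the reports reading as healthy: the drives themselves logged no media or link errors.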
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925
Thanks for the follow up.
Nothing here looks awry to me.
Let's see what happens with a real HBA...
 