z2pool gone crazy after stupid tinkering: no data-related issues, but inaccessible GUI options, eternal resilver loop and wrong disk identifiers

ElGusto

Explorer
Joined
Mar 19, 2015
Messages
72
"Not 8TB, anything, just to check each port and how it presents the drive"
Well, with two drives supposedly being flaky and only double redundancy, I would rather not remove a drive and increase the risk of data loss. And shouldn't checksum errors be reported if a port of the HBA were broken?
What I could do: just move all four drives to the other SFF-8087 connector, as the HBA has two of these, with four ports each.
 

ElGusto

"No, that is exactly what I meant. Could this be the first time the drives had to resilver?"
No, I rarely had heat problems earlier (once in each of the previous two summers) and could fix them by removing and re-adding the drives and resilvering them once, without problems, after the weather had cooled down. But this time I 'fixed' it the wrong way IMHO, as I didn't remember exactly how I had done it before.
 

homer27081990

Patron
Joined
Aug 9, 2022
Messages
321
"What I could do: Just move all four drives to the other SFF-8087 connector, as the HBA has two of these, with four ports each."
That is the next course of action IMO.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
This appears to be one of the problems related to SMR drives. They work until they don't work.

Also, getting a successful re-silver out of an SMR drive will likely depend on how much of it is used, and on the internal fragmentation of the drive.

Ideally, when you have to do a full re-silver of an SMR drive, there would be a way to tell the drive to start from scratch. Something like wiping it, so that the SMR drive wipes its block indirection table and any new writes after that start a new table. Then perform the ZFS replace & re-silver.
 

ElGusto

Hello people: Please don't put this down to SMR or drive/controller issues when it clearly is not.

The resilver has run through to 100% at least 20 times now, without a single error! Everything hardware-related is running perfectly fine! Also, the resilver always worked flawlessly in the years before.
There's no reason at all to think there is an issue with the hard drives, because then they would have to throw lots of errors, but there are zero.
The drives not all being shown with their GPTID in the pool, and the disk edit window not opening for da1, have nothing to do with the drives being SMR.
Nor does the resilver starting over after successfully running through to 100%.

This clearly is some configuration/administrative/meta problem, with the GPT table being wrong from all my experiments, or something like that.
I have now tried swapping the port and it didn't make a difference, as expected.
I just need help reconfiguring and re-adding the drives in a way that repairs/organizes their names correctly again.
 

homer27081990

My experience lies in general troubleshooting. In case this isn't about the SMR drives, you need a storage Guru here.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Disclaimer: don't take offense.
"I had to offline/online the disks and use the replace feature multiple times to make it work again."
You should describe each step you did as accurately as possible, but IMHO the pool is likely gone.

"Strangely enough, there's no da1 there, but da4 appears twice. Also, the partition numbers seem to be all over the place."
IMHO that's a sign of improper use of the replace feature.

Put in some CMR drives ASAP if you want to take advantage of the experience in this forum.

Also, did you use the GUI for all of this or the CLI?
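
For anyone who wants to cross-check the da1/da4 naming from the shell, here is a minimal sketch. It assumes `glabel status` prints its usual three columns (Name, Status, Components). The function name `gptid_map` is made up for this example, and it only reads text, so it changes nothing on the pool.

```shell
#!/bin/sh
# Print "partition gptid-label" pairs from `glabel status` output on stdin.
# gptid_map is a hypothetical helper name, not a system tool.
gptid_map() {
  # Column 1 is the label name, column 3 the partition it sits on.
  awk '$1 ~ /^gptid\// { print $3, $1 }' | sort
}

# On the NAS itself (read-only commands):
#   glabel status | gptid_map
#   zpool status -P z2pool    # -P prints full vdev paths
```

Comparing the pair list with the member names in the `zpool status` output should show which partition each pool member really sits on, and whether one disk is listed twice.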
 

ElGusto

Hello!
The pool is fully functional. All data is accessible and there are no errors reported. I did all the tinkering from the Web-GUI AFAIR.
Yes, I think I did some thing(s) wrong and now the storage organization is somewhat confused.
But as it's fully operational I think it should be possible to fix this with some commands.
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,925

ElGusto


Here are the details, as far as I can remember.
What I did before:
It has been some weeks since then.
It was a very hot period, and the server was in the old computer case (a PowerEdge T20, which is sadly renowned for being terribly ventilated).
I was copying some 250 GB from my notebook to the z2pool, and at about 150 GB the transfer stalled.
I checked the web UI and it told me the pool was degraded. Checking the drives showed that one drive had supposedly failed.
I wanted to get the drive running again, at least to finish the transfer I was in the middle of. So I think I first tried the replace feature and noticed it wouldn't let me replace the failed drive with itself. So I offlined the drive and onlined it again. I think something then didn't work as expected, like the web UI not reacting to my commands or so; I am not sure anymore. Anyway, something led me to try the replace feature again, and this time it actually did allow me to choose the drive that had failed before. After this I waited for some minutes and again nothing seemed to happen. I then did another offline/online and the pool ran again.

After this (the weather had cooled down somewhat) I restarted the data transfer from my notebook for the remaining 100 GB. And of course at some point two of the drives overheated again. I was stupid enough to keep trying, and each time there was an overheating event I performed many offline/online and replace actions in a row until the pool accepted the drives again.
In the end I gave up and moved all the hardware to a much better case that I had gotten in the meantime, where temperatures are about 20 °C lower overall.
I then had the idea of enabling acoustic management and power management for the four drives, as this might make the drives run even cooler.
After doing so I first noticed that I could no longer access the 'Edit Disk' options for da1. I also wondered why this one drive was shown as da1 while the other drives were shown by their GPTID. I am not even sure whether it was already like this before all the hassle.

And that basically is where we are now.
 

ElGusto

"Is the repeating resilver issue over?"
No, but there has not been a single error since I moved the hardware to the new enclosure with better ventilation.
That's the point: the resilver starts for no reproducible reason.
 

Redcoat

I'm surprised/confused that you (can) describe the pool as "fully functional" on a system that is constantly resilvering...
 

homer27081990

Ok, if you insist that the only issue is the resilvering itself, grab any logs and stats you can find and place them here.
 

Davvo

What a mess to troubleshoot... SMR on top of that is not helping.
Pray for someone with both the pool know-how and a heart big enough to help.

Did you run any SMART test?
 

ElGusto

Hello, I have been searching for hours for more log data on the resilvering process but could not find anything beyond the already known 'zpool status z2pool'. Could you point me in the right direction, please?
(Also, the Dashboard is telling me 'Logs: 0'. I don't know where to enable this.)
Bildschirmfoto_2022-08-13_19-46-09.jpg
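
On where to look beyond 'zpool status z2pool', here is a minimal sketch. It assumes the pool's internal history records scan start/stop events as lines containing `scan setup`, `scan done` or `scan aborted`. That is how OpenZFS has logged them as far as I know, but treat the exact wording as an assumption. `scan_lines` is a made-up helper name.

```shell
#!/bin/sh
# Keep only scan-related lines from `zpool history -i` output on stdin.
# The matched phrases are an assumption about the internal event wording.
scan_lines() {
  grep -E 'scan (setup|done|aborted)'
}

# On the NAS itself (read-only commands):
#   zpool history -i z2pool | scan_lines | tail -n 40
#   zpool events -v | less        # recent kernel-side pool events, if supported
#   less /var/log/messages        # system log, e.g. for da* device messages
```

If the history shows a new scan starting right after each completed one, the timestamps and transaction group numbers on those lines are probably the most useful thing to post.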




Hello!

Here is the SMART data for all four drives. I tried an extended offline test, but the estimated time was about 24 hours, so I cancelled this.

da1
root@truenas[/var/log]# smartctl -a /dev/da1
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Archive HDD (SMR)
Device Model: ST8000AS0002-1NA17Z
Serial Number: Z8400ZQV
LU WWN Device Id: 5 000c50 07a091d6a
Firmware Version: AR13
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5980 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Aug 13 19:24:52 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 953) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x30a5) SCT Status supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 102 099 006 Pre-fail Always - 4516096
3 Spin_Up_Time 0x0003 091 088 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2835
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 082 060 030 Pre-fail Always - 8980062877
9 Power_On_Hours 0x0032 085 085 000 Old_age Always - 13797
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1442
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 065 033 045 Old_age Always In_the_past 35 (1 32 35 24 0)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 1081
193 Load_Cycle_Count 0x0032 097 097 000 Old_age Always - 7000
194 Temperature_Celsius 0x0022 035 067 000 Old_age Always - 35 (0 17 0 0 0)
195 Hardware_ECC_Recovered 0x001a 102 099 000 Old_age Always - 4516096
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 7562 (29 58 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 35880619993
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 514972082147

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 13793 -
# 2 Extended offline Aborted by host 90% 13793 -
# 3 Extended offline Aborted by host 90% 13793 -
# 4 Short offline Completed without error 00% 13772 -
# 5 Short offline Completed without error 00% 13748 -
# 6 Short offline Completed without error 00% 13724 -
# 7 Short offline Completed without error 00% 13700 -
# 8 Short offline Completed without error 00% 13676 -
# 9 Short offline Completed without error 00% 13652 -
#10 Short offline Completed without error 00% 13630 -
#11 Short offline Completed without error 00% 13623 -
#12 Short offline Completed without error 00% 13599 -
#13 Short offline Completed without error 00% 13575 -
#14 Short offline Completed without error 00% 13551 -
#15 Short offline Completed without error 00% 13527 -
#16 Short offline Completed without error 00% 13503 -
#17 Short offline Completed without error 00% 13479 -
#18 Short offline Completed without error 00% 13455 -
#19 Short offline Completed without error 00% 13431 -
#20 Short offline Completed without error 00% 13407 -
#21 Short offline Completed without error 00% 13383 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

da2
root@truenas[/var/log]# smartctl -a /dev/da2
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Archive HDD (SMR)
Device Model: ST8000AS0002-1NA17Z
Serial Number: Z84011C7
LU WWN Device Id: 5 000c50 07a0850a6
Firmware Version: AR13
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5980 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Aug 13 19:24:55 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 957) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x30a5) SCT Status supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 119 099 006 Pre-fail Always - 205023720
3 Spin_Up_Time 0x0003 091 089 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2637
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 076 060 030 Pre-fail Always - 43364306016
9 Power_On_Hours 0x0032 084 084 000 Old_age Always - 14116
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1358
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 064 035 045 Old_age Always In_the_past 36 (1 161 36 24 0)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 995
193 Load_Cycle_Count 0x0032 097 097 000 Old_age Always - 7040
194 Temperature_Celsius 0x0022 036 065 000 Old_age Always - 36 (0 17 0 0 0)
195 Hardware_ECC_Recovered 0x001a 119 099 000 Old_age Always - 205023720
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 7738 (243 177 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 51851602427
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 477883986687

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 14111 -
# 2 Short offline Aborted by host 10% 14111 -
# 3 Extended offline Aborted by host 90% 14111 -
# 4 Short offline Completed without error 00% 14090 -
# 5 Short offline Completed without error 00% 14066 -
# 6 Short offline Completed without error 00% 14042 -
# 7 Short offline Completed without error 00% 14018 -
# 8 Short offline Completed without error 00% 13994 -
# 9 Short offline Completed without error 00% 13970 -
#10 Short offline Completed without error 00% 13948 -
#11 Short offline Completed without error 00% 13942 -
#12 Short offline Completed without error 00% 13918 -
#13 Short offline Completed without error 00% 13894 -
#14 Short offline Completed without error 00% 13870 -
#15 Short offline Completed without error 00% 13846 -
#16 Short offline Completed without error 00% 13822 -
#17 Short offline Completed without error 00% 13798 -
#18 Short offline Completed without error 00% 13774 -
#19 Short offline Completed without error 00% 13750 -
#20 Short offline Completed without error 00% 13726 -
#21 Short offline Completed without error 00% 13702 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

ElGusto

da3
root@truenas[/var/log]# smartctl -a /dev/da3
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Archive HDD (SMR)
Device Model: ST8000AS0002-1NA17Z
Serial Number: Z84010E2
LU WWN Device Id: 5 000c50 07a08ac8f
Firmware Version: AR13
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5980 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Aug 13 19:24:57 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 954) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x30a5) SCT Status supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 116 099 006 Pre-fail Always - 113138888
3 Spin_Up_Time 0x0003 091 089 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2566
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 074 060 030 Pre-fail Always - 56198755968
9 Power_On_Hours 0x0032 084 084 000 Old_age Always - 14047
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1344
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 066 036 045 Old_age Always In_the_past 34 (2 63 34 24 0)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 1150
193 Load_Cycle_Count 0x0032 097 097 000 Old_age Always - 6757
194 Temperature_Celsius 0x0022 034 064 000 Old_age Always - 34 (0 16 0 0 0)
195 Hardware_ECC_Recovered 0x001a 116 099 000 Old_age Always - 113138888
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 7991 (115 250 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 35009442500
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 587473543856

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 14042 -
# 2 Extended offline Aborted by host 90% 14042 -
# 3 Short offline Completed without error 00% 14021 -
# 4 Short offline Completed without error 00% 13997 -
# 5 Short offline Completed without error 00% 13973 -
# 6 Short offline Completed without error 00% 13949 -
# 7 Short offline Completed without error 00% 13925 -
# 8 Short offline Completed without error 00% 13901 -
# 9 Short offline Completed without error 00% 13879 -
#10 Short offline Completed without error 00% 13873 -
#11 Short offline Completed without error 00% 13849 -
#12 Short offline Completed without error 00% 13825 -
#13 Short offline Completed without error 00% 13801 -
#14 Short offline Completed without error 00% 13777 -
#15 Short offline Completed without error 00% 13753 -
#16 Short offline Completed without error 00% 13729 -
#17 Short offline Completed without error 00% 13705 -
#18 Short offline Completed without error 00% 13681 -
#19 Short offline Completed without error 00% 13657 -
#20 Short offline Completed without error 00% 13633 -
#21 Short offline Completed without error 00% 13592 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

da4
root@truenas[/var/log]# smartctl -a /dev/da4
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p14 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Seagate Archive HDD (SMR)
Device Model: ST8000AS0002-1NA17Z
Serial Number: Z8401NL6
LU WWN Device Id: 5 000c50 07a2ce8d9
Firmware Version: AR13
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5980 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sat Aug 13 19:24:59 2022 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 956) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x30a5) SCT Status supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 119 099 006 Pre-fail Always - 210745264
3 Spin_Up_Time 0x0003 091 090 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 098 098 020 Old_age Always - 2670
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 085 060 030 Pre-fail Always - 391606615
9 Power_On_Hours 0x0032 084 084 000 Old_age Always - 14067
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1340
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 065 039 045 Old_age Always In_the_past 35 (1 192 35 24 0)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 1194
193 Load_Cycle_Count 0x0032 097 097 000 Old_age Always - 6935
194 Temperature_Celsius 0x0022 035 061 000 Old_age Always - 35 (0 16 0 0 0)
195 Hardware_ECC_Recovered 0x001a 119 099 000 Old_age Always - 210745264
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 8179 (53 27 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 20885265204
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 590457441139

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 14062 -
# 2 Extended offline Aborted by host 90% 14062 -
# 3 Short offline Completed without error 00% 14041 -
# 4 Short offline Completed without error 00% 14017 -
# 5 Short offline Completed without error 00% 13993 -
# 6 Short offline Completed without error 00% 13969 -
# 7 Short offline Completed without error 00% 13945 -
# 8 Short offline Completed without error 00% 13921 -
# 9 Short offline Completed without error 00% 13899 -
#10 Short offline Completed without error 00% 13893 -
#11 Short offline Completed without error 00% 13869 -
#12 Short offline Completed without error 00% 13845 -
#13 Short offline Completed without error 00% 13821 -
#14 Short offline Completed without error 00% 13797 -
#15 Short offline Completed without error 00% 13773 -
#16 Short offline Completed without error 00% 13749 -
#17 Short offline Completed without error 00% 13725 -
#18 Short offline Completed without error 00% 13701 -
#19 Short offline Completed without error 00% 13677 -
#20 Short offline Completed without error 00% 13653 -
#21 Short offline Completed without error 00% 13612 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I had to split it in two, because the post was too long.
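
A note on reading the temperature lines above, as a minimal sketch. In the posted data, attribute 190 shows VALUE 065 next to a raw 35 (and 064 next to 36, 066 next to 34), which is consistent with normalized = 100 minus degrees Celsius. Assuming that relation holds for these drives, the WORST column gives the hottest recorded temperature. `peak_temp_c` is a made-up helper name.

```shell
#!/bin/sh
# Read `smartctl -A` (or -a) output on stdin and print the peak temperature
# implied by attribute 190, ASSUMING normalized value = 100 - degrees C.
# peak_temp_c is a hypothetical helper name.
peak_temp_c() {
  # Columns: ID# NAME FLAG VALUE WORST THRESH ...; WORST is field 5.
  awk '$1 == 190 { print 100 - $5; exit }'
}

# On the NAS itself:
#   smartctl -A /dev/da1 | peak_temp_c
```

Under that assumption the four drives peaked at 67, 65, 64 and 61 °C, which matches the WORST column of attribute 194, and the threshold of 045 corresponds to 55 °C. That would explain the "In_the_past" flag on all four drives.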
 

ElGusto

As in #37: I cancelled the SMART extended test I had initiated, because it was said to take 24 hours.
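
For reference, the drives' own estimate can be read from the posted output: the "Extended self-test routine recommended polling time" is 953 to 957 minutes, so roughly 16 hours rather than 24. A loaded drive may well take longer. Here is a minimal sketch that pulls that number out, with `extended_test_hours` as a made-up helper name:

```shell
#!/bin/sh
# Read `smartctl -a` output on stdin and print the drive's own estimate for
# an extended self-test, converted from minutes to hours.
# extended_test_hours is a hypothetical helper name.
extended_test_hours() {
  awk '
    /Extended self-test routine/ { want = 1; next }
    want && /polling time/ {
      gsub(/[^0-9]+/, " ")        # keep only the digits of "( 953) minutes."
      printf "%.1f\n", $1 / 60
      exit
    }'
}

# On the NAS itself:
#   smartctl -a /dev/da1 | extended_test_hours
```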
 