SMART Test Scheduling Problem... Scheduled but does not run.

Status
Not open for further replies.

r0b0ty

Dabbler
Joined
Jul 5, 2018
Messages
16
Hi, guys.

I'm having trouble understanding why the SMART tasks I've set up don't run. I'm new to this, so it is quite probable that I misinterpreted how the FreeNAS guide described the feature.

I scheduled the tests (see image) for the first time on Monday, so I was expecting one to run on Tuesday, yesterday, and tonight. Well none have run, and I see none scheduled for ada0 and a couple of sporadic, odd ones for ada1 and ada2. I've rebooted the machine, but no change. Can you please point to what I'm doing wrong?!?

vg1gI1W.png


rXED7NM.png


The guide suggests running the following shell commands for more information and verification. The tests below are old (part of the burn-in process I did a while back).

Output of smartctl -l selftest /dev/ada0

root@guardabarranco:~ # smartctl -l selftest /dev/ada0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 3 -



Output of smartctl -l selftest /dev/ada1

root@guardabarranco:~ # smartctl -l selftest /dev/ada1
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 81 -
# 2 Short offline Completed without error 00% 73 -
# 3 Extended offline Completed without error 00% 10 -
# 4 Short offline Completed without error 00% 3 -



Output of smartd -q showtests


root@guardabarranco:~ # smartd -q showtests
smartd 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

Opened configuration file /usr/local/etc/smartd.conf
Configuration file /usr/local/etc/smartd.conf parsed.
Device: /dev/ada2, opened
Device: /dev/ada2, WDC WD40EFRX-68N32N0, S/N:WD-WCC7K2XS6Y7K, WWN:5-0014ee-2650c066b, FW:82.00A82, 4.00 TB
Device: /dev/ada2, found in smartd database: Western Digital Red
Device: /dev/ada2, is SMART capable. Adding to "monitor" list.
Device: /dev/ada1, opened
Device: /dev/ada1, WDC WD40EFRX-68N32N0, S/N:WD-WCC7K2FFF87P, WWN:5-0014ee-26535b8db, FW:82.00A82, 4.00 TB
Device: /dev/ada1, found in smartd database: Western Digital Red
Device: /dev/ada1, is SMART capable. Adding to "monitor" list.
Device: /dev/ada0, opened
Device: /dev/ada0, SanDisk SDSA6GM-016G-1006, S/N:141792401188, WWN:5-001b44-c0f115324, FW:U221006, 16.0 GB
Device: /dev/ada0, not found in smartd database.
Device: /dev/ada0, can't monitor Current_Pending_Sector count - no Attribute 197
Device: /dev/ada0, is SMART capable. Adding to "monitor" list.
Monitoring 3 ATA/SATA, 0 SCSI/SAS and 0 NVMe devices

Next scheduled self tests (at most 5 of each type per device):
Device: /dev/ada1, will do test 1 of type L at Wed Aug 29 00:02:17 2018 EDT
Device: /dev/ada2, will do test 1 of type L at Thu Nov 1 00:02:17 2018 EDT
Device: /dev/ada2, will do test 2 of type L at Thu Nov 15 00:02:17 2018 EST

Totals [Thu Aug 23 16:32:17 2018 EDT - Wed Nov 21 15:32:17 2018 EST]:
Device: /dev/ada2, will do 2 tests of type L
Device: /dev/ada2, will do 0 tests of type S
Device: /dev/ada2, will do 0 tests of type C
Device: /dev/ada2, will do 0 tests of type O
Device: /dev/ada1, will do 1 test of type L
Device: /dev/ada1, will do 0 tests of type S
Device: /dev/ada1, will do 0 tests of type C
Device: /dev/ada1, will do 0 tests of type O
Device: /dev/ada0, will do 0 tests of type L
Device: /dev/ada0, will do 0 tests of type S
Device: /dev/ada0, will do 0 tests of type C
Device: /dev/ada0, will do 0 tests of type O
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,924
What's it say in "Storage" "View Disks"?

upload_2018-8-23_17-51-45.png
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
It looks correct: the test will run every 15 days on Wed for ada1 or Thursday for ada2. When it comes to CRON schedules, things are not the easiest to understand. But your output clearly indicates that on Wed Aug 29 ada1 will perform a Long test. Unfortunately you did not provide the full smartctl output for your drives, so I can't point specific things out, but it appears that you have already run an Extended test. So you have two thresholds: 14 days, and Wed or Thurs. You must pass 14 days since the last test and be on the next Wed or Thurs for the test to run, if memory serves me correctly.
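
One way to check a schedule against the showtests output is sketched below. smartd's -s directive matches a regular expression against a string of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday, hour), so a day-of-month choice and a day-of-week choice both have to match on the same date. The patterns here are assumptions, not taken from the actual smartd.conf: they guess that the day-of-month setting expanded to days 01, 15 and 29, with Tuesday for ada0, Wednesday for ada1 and Thursday for ada2, at hour 00. The helper name smartd_matches is made up for this illustration.

```python
import re
from datetime import date, timedelta

def smartd_matches(pattern, start, end, test_type="L", hour=0):
    """Return the dates in [start, end] whose T/MM/DD/d/HH string matches pattern."""
    regex = re.compile(pattern)
    hits = []
    day = start
    while day <= end:
        # smartd counts 1 = Monday ... 7 = Sunday, the same as isoweekday().
        stamp = f"{test_type}/{day.month:02d}/{day.day:02d}/{day.isoweekday()}/{hour:02d}"
        # The patterns below have the same fixed length as the stamp,
        # so a full match is the natural check in this sketch.
        if regex.fullmatch(stamp):
            hits.append(day)
        day += timedelta(days=1)
    return hits

# ASSUMPTION: day-of-month expanded to 01, 15, 29 (inferred, not confirmed).
DAYS = "(01|15|29)"
patterns = {
    "ada0": f"L/../{DAYS}/2/00",  # Tuesday
    "ada1": f"L/../{DAYS}/3/00",  # Wednesday
    "ada2": f"L/../{DAYS}/4/00",  # Thursday
}

# Same window that showtests reported in its "Totals" line.
window = (date(2018, 8, 23), date(2018, 11, 21))

for disk, pattern in patterns.items():
    print(disk, [d.isoformat() for d in smartd_matches(pattern, *window)])
```

Under those assumed patterns the sketch yields no dates for ada0, Aug 29 for ada1, and Nov 1 and Nov 15 for ada2, which lines up with the showtests listing above. That would suggest the sparse schedule comes from requiring both fields to match on the same date, though the real smartd.conf would be needed to confirm it.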

Hope this helps.
 

r0b0ty

Dabbler
Joined
Jul 5, 2018
Messages
16
Here you go, @Redcoat

yRPCR6J.png




@joeschmuck Is this the full smartctl output you mentioned?

ada0:

root@guardabarranco:~ # smartctl -a /dev/ada0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: SanDisk SDSA6GM-016G-1006
Serial Number: 141792401188
LU WWN Device Id: 5 001b44 c0f115324
Firmware Version: U221006
User Capacity: 16,013,942,784 bytes [16.0 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 1.8 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Thu Aug 23 18:46:10 2018 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===




data collection: ( 120) seconds.
Offline data collection
capabilities: (0x51) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 3) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 002 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0033 100 100 002 Pre-fail Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 4316
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 534
165 Unknown_Attribute 0x0002 100 100 000 Old_age Always - 15692
170 Unknown_Attribute 0x0033 100 100 005 Pre-fail Always - 0
171 Unknown_Attribute 0x0002 100 100 000 Old_age Always - 0
172 Unknown_Attribute 0x0002 100 100 000 Old_age Always - 0
173 Unknown_Attribute 0x0002 100 100 000 Old_age Always - 7
174 Unknown_Attribute 0x0002 100 100 000 Old_age Always - 137
184 End-to-End_Error 0x0033 100 100 097 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 071 029 000 Old_age Always - 29 (Min/Max -2/57)
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
230 Unknown_SSD_Attribute 0x0002 100 100 000 Old_age Always - 23
234 Unknown_Attribute 0x0002 100 100 000 Old_age Always - 13
241 Total_LBAs_Written 0x0002 100 100 000 Old_age Always - 178031731
242 Total_LBAs_Read 0x0002 100 100 000 Old_age Always - 1186540430

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 3 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



ada1:

root@guardabarranco:~ # smartctl -a /dev/ada1
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD40EFRX-68N32N0
Serial Number: WD-WCC7K2FFF87P
LU WWN Device Id: 5 0014ee 26535b8db
Firmware Version: 82.00A82
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Thu Aug 23 18:48:40 2018 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 477) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x303d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 172 171 021 Pre-fail Always - 6358
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 46
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 213
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 46
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 40
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 121
194 Temperature_Celsius 0x0022 111 104 000 Old_age Always - 39
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 81 -
# 2 Short offline Completed without error 00% 73 -
# 3 Extended offline Completed without error 00% 10 -
# 4 Short offline Completed without error 00% 3 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



ada2:


root@guardabarranco:~ # smartctl -a /dev/ada2
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD40EFRX-68N32N0
Serial Number: WD-WCC7K2XS6Y7K
LU WWN Device Id: 5 0014ee 2650c066b
Firmware Version: 82.00A82
User Capacity: 4,000,787,030,016 bytes [4.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 T13/2161-D revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Thu Aug 23 18:51:49 2018 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (44880) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 477) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x303d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 162 161 021 Pre-fail Always - 6883
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 46
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 213
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 46
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 40
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 119
194 Temperature_Celsius 0x0022 113 107 000 Old_age Always - 37
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 253 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 82 -
# 2 Short offline Completed without error 00% 74 -
# 3 Extended offline Completed without error 00% 11 -
# 4 Short offline Completed without error 00% 3 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

r0b0ty

Dabbler
Joined
Jul 5, 2018
Messages
16
What I don't get, @joeschmuck, is why the tests didn't run this week, since I scheduled them prior to Tuesday (the first test scheduled). Is the logic like this: "IF 14 days have elapsed since the last test AND it is Tuesday (for ada0), THEN perform test"? If so, it makes sense for ada1 (since the burn-in test completed on AUG 15). But things don't make sense for ada2 (burn-in test also completed on AUG 15, yet nothing scheduled until November?) and ada0 (don't know when it last ran, but never scheduled?).
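
As for when each disk last ran a test, a rough estimate can be sketched from the smartctl -a output above: the LifeTime(hours) column of the self-test log is the disk's power-on hour count when the test ran, so subtracting it from the current Power_On_Hours value gives the power-on hours elapsed since then. The numbers are copied from the output in this thread; the DiskLog class and function name are made up for this illustration. Because the counter only advances while the disk is powered, the result is a lower bound on calendar time.

```python
from dataclasses import dataclass

@dataclass
class DiskLog:
    name: str
    power_on_hours: int      # attribute 9 (Power_On_Hours) RAW_VALUE
    last_test_lifetime: int  # LifeTime(hours) of self-test log entry # 1

# Values taken from the smartctl -a output posted in this thread.
disks = [
    DiskLog("ada0", 4316, 3),
    DiskLog("ada1", 213, 81),
    DiskLog("ada2", 213, 82),
]

def hours_since_last_test(disk):
    """Power-on hours elapsed since the most recent logged self-test."""
    return disk.power_on_hours - disk.last_test_lifetime

for disk in disks:
    hours = hours_since_last_test(disk)
    print(f"{disk.name}: {hours} power-on hours (about {hours / 24:.1f} days) since last self-test")
```

By this estimate ada0 has gone 4313 power-on hours without a logged self-test, while ada1 and ada2 are at 132 and 131 hours.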
 

Redcoat

MVP
Joined
Feb 18, 2014
Messages
2,924

r0b0ty

Dabbler
Joined
Jul 5, 2018
Messages
16
Guys, as I don't want to waste anyone's precious time, please note that I changed my mind and chose to follow the proposed SMART test schedule mentioned here (among other places) - just a tad bit modified:

https://www.ceos3c.com/freenas/freenas-essentials-setting-up-smart-tests-and-scrubs/

After running smartd -q showtests it properly shows the next 5 scheduled tests for both drives through November. I say "both" drives, as I realized my mistake of originally including my SSD drive for testing - apparently it's unnecessary and not recommended, so I removed it.

I'm happy with the setup. I'm unhappy and bothered that I don't understand why my previous attempt didn't work. It seems straightforward, but it stumped me.

Anyway... if you have further insight, I'll gladly take it, but otherwise, thank you for your help!
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
There is more on how to schedule these tests and other CRON-related events; just search "BSD CRON" on the internet. You will find that CRON is not as intuitive as you would think. We have all requested a better GUI that conveys the intent of what a user desires and then generates the appropriate commands to make it happen, but that really hasn't happened yet. I think selecting specific days of the month and a day of the week is the best option for your situation. For example, you could select days 1 through 7 and 15 through 21 and Thursday, which would result in running it on the first and third Thursday of a calendar month. This is not 100% foolproof, but odds are you would get the same expected result practically every time.
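
The "days 1 through 7 and 15 through 21 and Thursday" suggestion can be checked with a short sketch. It assumes, as with the smartd schedule discussed earlier in the thread, that a date must satisfy both the day-of-month selection and the day-of-week selection; the function name is made up for this illustration.

```python
import calendar
from datetime import date

def thursdays_in_ranges(year, month):
    """Dates in the month that are Thursdays AND fall on days 1-7 or 15-21."""
    allowed_days = set(range(1, 8)) | set(range(15, 22))
    last_day = calendar.monthrange(year, month)[1]
    return [
        date(year, month, d)
        for d in range(1, last_day + 1)
        if d in allowed_days and date(year, month, d).isoweekday() == 4  # 4 = Thursday
    ]

for month in (9, 10, 11):
    print(month, [d.isoformat() for d in thursdays_in_ranges(2018, month)])
```

For September through November 2018 this gives Sep 6 and 20, Oct 4 and 18, and Nov 1 and 15, i.e. two Thursdays per month roughly two weeks apart.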

Also, just my two cents on how frequently I run my SMART Tests... I run the Short SMART Tests every night except Tuesday; they are short and do not interfere with any operations I have going on. I run my SMART Long tests once a week, at a time I know the system should not be very active. Yes, 9PM is when I am typically relaxing, and I am usually in bed by 10:30. That may sound early, but I am at work by 5AM each day, and many times I'm there earlier. So don't be shy about testing your hard drives.

Hope this helps.
 

r0b0ty

Dabbler
Joined
Jul 5, 2018
Messages
16
Thanks for your feedback, @joeschmuck. Based on your input, I think I'll increase the frequency of my tests. I appreciate the help!
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
Anytime. I do try to spread sound advice when I can. Please note that there are others here on the forums who prefer to test less and some who may test more. Since my hard drives run 24/7, I feel it's not a problem running tests this frequently, and I don't see them wearing out faster because of it. My WD Reds (6 drives) are well over 5 years of age and now in another computer doing other work, but they are still going strong. Well, time to head out to the porch and shoot the breeze with my father. He's still alive and kickin' even though he's been smoking for over 70 years. Isn't life funny.
 