SMART tests

Status
Not open for further replies.

Bill McCormick

Explorer
Joined
Oct 3, 2015
Messages
68
I have some SMART tests enabled but I'm not getting any feedback (email) that indicates that the tests are even running.

The SMART service is enabled, with check interval=30, mode=Sleep, and my email address entered. I do get other email reports. Is there a log file or something else I can look at to glean more info?

Any help appreciated and thanks much in advance!

I have 8 drives RAIDZ2:
[bill@freenas] /nonexistent# smartctl --scan
/dev/pass0 -d scsi # /dev/pass0, SCSI device
/dev/pass1 -d scsi # /dev/pass1, SCSI device
/dev/pass2 -d scsi # /dev/pass2, SCSI device
/dev/pass3 -d scsi # /dev/pass3, SCSI device
/dev/ada0 -d atacam # /dev/ada0, ATA device
/dev/ada1 -d atacam # /dev/ada1, ATA device
/dev/ada2 -d atacam # /dev/ada2, ATA device
/dev/ada3 -d atacam # /dev/ada3, ATA device

The drives and interfaces appear to be SMART capable. This is a drive connected to a motherboard SATA port (another drive, connected to an LSI controller in JBOD mode, follows):
[bill@freenas] /nonexistent# smartctl -a /dev/ada0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p28 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD10EFRX-68JCSN0
Serial Number: WD-WCC1U1202373
LU WWN Device Id: 5 0014ee 20810ee25
Firmware Version: 01.01A01
User Capacity: 1,000,204,886,016 bytes [1.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Nov 9 14:46:59 2015 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (13200) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 151) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x30bd) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   141   139   021    Pre-fail  Always       -       3925
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       183
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1313
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       170
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       106
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       76
194 Temperature_Celsius     0x0022   107   097   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1310         -
# 2  Short offline       Completed without error       00%      1306         -
# 3  Short offline       Completed without error       00%      1302         -
# 4  Short offline       Completed without error       00%      1298         -
# 5  Short offline       Completed without error       00%      1294         -
# 6  Short offline       Completed without error       00%      1290         -
# 7  Short offline       Completed without error       00%      1286         -
# 8  Short offline       Completed without error       00%      1283         -
# 9  Short offline       Completed without error       00%      1270         -
#10  Short offline       Completed without error       00%      1266         -
#11  Short offline       Completed without error       00%      1262         -
#12  Short offline       Completed without error       00%      1258         -
#13  Short offline       Completed without error       00%      1254         -
#14  Short offline       Completed without error       00%      1250         -
#15  Short offline       Completed without error       00%      1246         -
#16  Short offline       Completed without error       00%      1242         -
#17  Short offline       Completed without error       00%      1238         -
#18  Short offline       Completed without error       00%      1234         -
#19  Short offline       Completed without error       00%      1230         -
#20  Short offline       Completed without error       00%      1226         -
#21  Short offline       Completed without error       00%      1222         -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

[bill@freenas] /nonexistent# smartctl -a -d sat /dev/pass0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p28 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD30EFRX-68AX9N0
Serial Number: WD-WMC1T0884163
LU WWN Device Id: 5 0014ee 6ad873861
Firmware Version: 80.00A80
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2 (minor revision not indicated)
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Mon Nov 9 14:48:16 2015 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: ATA return descriptor not supported by controller firmware
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (40380) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 405) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x70bd) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   185   181   021    Pre-fail  Always       -       5741
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       57
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1753
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       57
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       54
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       2
194 Temperature_Celsius     0x0022   109   101   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1026         -
# 2  Short offline       Completed without error       00%      1018         -
# 3  Short offline       Completed without error       00%      1018         -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,466
Yes, your self-tests are running. In the case of ada0, you're running short tests every 4 hours, which is way too often (I run mine daily, and most people here would probably say that's too often). FreeNAS does not notify you when the tests run, only when they fail.

Edit: Well, the tests are running on ada0. Doesn't look like it on pass0.
 

Bill McCormick

Explorer
Joined
Oct 3, 2015
Messages
68
I really appreciate your attention and anything you can offer, but my question was about a log file or something else I can hang my hat on to verify that the tests are running.

I assume that you're looking at the number of entries below "SMART Self-test log structure revision number 1" to make your determination. In reality, those entries (21 and 3) are from when I ran tests from the command line; the count hasn't changed in weeks, a fact I should have pointed out in my original post ... sorry. So that's why I don't believe that the tests are running.

Thanks!
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,466
The smartctl results for /dev/ada0 show short tests every four power-on hours over the last 24+, with the most recent at 1310 hours. They also show that the disk has been powered on for a total of 1313 hours. Unless you know that you've consistently, manually, executed a short self-test with that frequency, that tells you that the scheduled tests have been running (though they're too frequent). My conclusion has nothing to do with the number of entries--that caps out at 21 (I don't know why that number was chosen), so if you have more than 21 self-tests, it will show only the last 21. It's based instead on the hours listed in that log.

To the best of my knowledge, the SMART self-test log is the only place where the self-tests are logged.
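
The reasoning above (compare the LifeTime(hours) column of the self-test log with the Power_On_Hours attribute) can be checked mechanically. What follows is only a rough sketch in Python, not anything FreeNAS ships: it assumes you paste in the text that smartctl -a prints, and the function names and the SAMPLE text are made up for illustration. The sample rows are copied from the ada0 output earlier in this thread.

```python
import re

# One row of the "SMART Self-test log" table, for example:
#   "# 1 Short offline Completed without error 00% 1310 -"
# Groups: number, description plus status, remaining %, lifetime hours, LBA.
ROW = re.compile(r"^#\s*(\d+)\s+(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$")


def self_test_hours(smartctl_output):
    """Return the LifeTime(hours) value of each logged self-test, newest first."""
    hours = []
    for line in smartctl_output.splitlines():
        match = ROW.match(line.strip())
        if match:
            hours.append(int(match.group(4)))
    return hours


def intervals(hours):
    """Power-on hours elapsed between consecutive tests (list is newest first)."""
    return [newer - older for newer, older in zip(hours, hours[1:])]


def power_on_hours(smartctl_output):
    """Raw value of attribute 9 (Power_On_Hours), or None if it is not present."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Power_On_Hours":
            # Assumption: the raw value starts with the hour count.
            digits = re.match(r"\d+", fields[-1])
            return int(digits.group()) if digits else None
    return None


# Hypothetical excerpt, reduced to the lines the sketch needs.
SAMPLE = """\
  9 Power_On_Hours 0x0032 099 099 000 Old_age Always - 1313
# 1 Short offline Completed without error 00% 1310 -
# 2 Short offline Completed without error 00% 1306 -
# 3 Short offline Completed without error 00% 1302 -
"""

hours = self_test_hours(SAMPLE)
print("hours between tests:", intervals(hours))
print("hours since last test:", power_on_hours(SAMPLE) - hours[0])
```

If the gaps between tests are regular and the newest entry is close to Power_On_Hours, the scheduled tests are very likely running. If the newest entry is hundreds of hours behind, as on pass0, they are probably not.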
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
To the best of my knowledge, the SMART self-test log is the only place where the self-tests are logged.
Agreed, that is the only place they are logged. You could just look at the logs periodically, like I do, to see how many hours are on the drives and whether the last test passed.

If you are looking to get the SMART test results automatically emailed to you each day, then you will need to set up a script. I have one in the How-To section (I believe) and it steps you through everything, but if you follow it the way it's written, you should remove SMART testing from the GUI schedule.

Also I agree with @danb35 , the frequency with which you are running the tests is a bit much. I myself run a short test on all my drives each day at 1AM. I then run a long test on all my drives on Sunday at 2AM. All of my drives are over the 3 year warranty period now and I've had zero failures. I actually expect them to continue working for at least 1 more year before a single failure occurs.
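
For anyone curious how a schedule like that (short test daily at 1AM, long test Sunday at 2AM) looks at the smartd level: smartd.conf takes a -s directive whose regular expression is matched against strings of the form T/MM/DD/d/HH, where T is the test type (S for short, L for long), d is the day of the week (1 = Monday, 7 = Sunday) and HH is the hour. The Python below is only a sketch to illustrate that pattern. The helper names are hypothetical, and the full-string match is a simplification of whatever smartd does internally. On FreeNAS the GUI normally generates smartd.conf, so hand edits may not survive.

```python
import re
from datetime import datetime


def smartd_schedule(short_hour, long_weekday, long_hour):
    """Build a smartd.conf '-s' regular expression.

    Short test every day at short_hour; long test on long_weekday
    (1 = Monday ... 7 = Sunday) at long_hour.
    """
    return "(S/../.././{:02d}|L/../../{:d}/{:02d})".format(
        short_hour, long_weekday, long_hour
    )


def due_tests(schedule, when):
    """Return the test types the schedule selects for the hour of `when`."""
    due = []
    for test_type in ("S", "L"):
        stamp = "{}/{:02d}/{:02d}/{}/{:02d}".format(
            test_type, when.month, when.day, when.isoweekday(), when.hour
        )
        # Simplification: require the whole stamp to match the pattern.
        if re.fullmatch(schedule, stamp):
            due.append(test_type)
    return due


schedule = smartd_schedule(short_hour=1, long_weekday=7, long_hour=2)
print("-s", schedule)
print(due_tests(schedule, datetime(2015, 11, 8, 2)))  # a Sunday, 2AM
```

In smartd.conf the resulting expression would go after -s on a device line.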
 

Bill McCormick

Explorer
Joined
Oct 3, 2015
Messages
68
Yeah, I guess I should probably look at the smartctl docs. But I think there is still an issue with running the (GUI-configured) test for my Avago (LSI) JBOD drives (/dev/pass0..3). I suppose if I run the aforementioned script from a cron job, I'll get it to work, plus some better indication that it actually ran, but I like to confine myself to the given functionality before adding on to it.

When running smartctl from the CLI for these drives, I need to include a -d sat switch. So I included that switch in S.M.A.R.T. extra options in the View Disks section. Although it works from the CLI, it seems to have no effect there. So perhaps this is something worth mentioning before I work around it. I'll be happy to formalize it in whatever tracking system is being used if it needs to be tracked.

[bill@freenas] /nonexistent# smartctl -a -d sat /dev/pass0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p28 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
...
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1754         -
# 2  Extended offline    Completed without error       00%      1026         -
# 3  Short offline       Completed without error       00%      1018         -
# 4  Short offline       Completed without error       00%      1018         -
...
[bill@freenas] /nonexistent# smartctl -t short -d sat /dev/pass0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p28 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Mon Nov 9 23:19:03 2015

Use smartctl -X to abort test.
[bill@freenas] /nonexistent# smartctl -a -d sat /dev/pass0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p28 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
...
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1762         -
# 2  Short offline       Completed without error       00%      1754         -
# 3  Extended offline    Completed without error       00%      1026         -
# 4  Short offline       Completed without error       00%      1018         -
# 5  Short offline       Completed without error       00%      1018         -
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,974
Maybe this thread can help you?
Thanks for posting that link.

The first link in that thread goes to my script and then the author has additional customization. It's all there for you to use, just do some reading and I'm sure you will find a reasonable outcome for your needs.
 

Bill McCormick

Explorer
Joined
Oct 3, 2015
Messages
68
OK. Thanks for that. I posted some additional questions in that thread.

So what's the deal with SMART Extra Options? Is it broken?
 

Bill McCormick

Explorer
Joined
Oct 3, 2015
Messages
68
I found that my drives connected through the Avago/LSI card ended up with NO config lines in /usr/local/etc/smartd.conf. So that was one problem. When I added them by hand, I did get some indication in /var/log/messages that the tests ran. The SMART Extra Options do make it into smartd.conf, so that feature is not broken. However, if there is no device line to attach the extra options to, as in my case, they will not make it. I'm not sure why my Avago/LSI-connected drives have this problem, so any insight on that would still be appreciated. There is also another issue that I'm still looking into, so this may not quite be over.
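
In case it helps anyone else track down the same symptom, here is a rough Python sketch of the check I did by eye: compare the devices that smartctl --scan reports with the device lines in smartd.conf. It works on pasted text only. The function names are made up, the CONF sample is hypothetical rather than my real file, and the comparison is by literal device name, so it will not notice a drive that is listed under a different device node. Continuation lines ending in a backslash are not handled.

```python
def first_fields(text):
    """First whitespace-separated field of each non-empty, non-comment line."""
    fields = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if line:
            fields.append(line.split()[0])
    return fields


def missing_from_config(scan_output, smartd_conf):
    """Devices listed by `smartctl --scan` that have no line in smartd.conf."""
    configured = set(first_fields(smartd_conf))
    if "DEVICESCAN" in configured:
        # DEVICESCAN tells smartd to find devices itself, so nothing is missing.
        return []
    return [dev for dev in first_fields(scan_output) if dev not in configured]


# Scan output as posted at the top of this thread.
SCAN = """\
/dev/pass0 -d scsi # /dev/pass0, SCSI device
/dev/pass1 -d scsi # /dev/pass1, SCSI device
/dev/pass2 -d scsi # /dev/pass2, SCSI device
/dev/pass3 -d scsi # /dev/pass3, SCSI device
/dev/ada0 -d atacam # /dev/ada0, ATA device
/dev/ada1 -d atacam # /dev/ada1, ATA device
/dev/ada2 -d atacam # /dev/ada2, ATA device
/dev/ada3 -d atacam # /dev/ada3, ATA device
"""

# Hypothetical smartd.conf contents, for illustration only.
CONF = """\
# device lines
/dev/ada0 -a -n standby -m root
/dev/ada1 -a -n standby -m root
/dev/ada2 -a -n standby -m root
/dev/ada3 -a -n standby -m root
"""

for device in missing_from_config(SCAN, CONF):
    print("no smartd.conf line for", device)
```

With the sample text above, the four /dev/pass devices are the ones reported as missing, which matches what I saw.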
 