Why is the GUI reporting failure, when none seems to exist?

Status
Not open for further replies.

LordKitsuna

Cadet
Joined
Nov 22, 2015
Messages
6
I logged into my FreeNAS box tonight and noticed an alert. It said that "/dev/ada1 failed smart self test BACKUP DATA NOW!" However, when I logged in via a terminal and ran smartctl to see what was failing, I found that nothing was wrong at all.
Code:
[root@freenas] ~# smartctl -a /dev/ada1
smartctl 6.5 2016-05-07 r4318 [FreeBSD 10.3-STABLE amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:  Hitachi Ultrastar 7K3000
Device Model:  Hitachi HUA723030ALA640
Serial Number:  MK0371YVHLE3NA
LU WWN Device Id: 5 000cca 234d6776d
Firmware Version: MKAOAA50
User Capacity:  3,000,592,982,016 bytes [3.00 TB]
Sector Size:  512 bytes logical/physical
Rotation Rate:  7200 rpm
Form Factor:  3.5 inches
Device is:  In smartctl database [for details use: -P show]
ATA Version is:  ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Mon Dec  5 20:36:25 2016 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
  was suspended by an interrupting command from host.
  Auto Offline Data Collection: Enabled.
Self-test execution status:  (  0) The previous self-test routine completed
  without error or no self-test has ever
  been run.
Total time to complete Offline
data collection:  (26658) seconds.
Offline data collection
capabilities:  (0x5b) SMART execute Offline immediate.
  Auto Offline data collection on/off support.
  Suspend Offline collection upon new
  command.
  Offline surface scan supported.
  Self-test supported.
  No Conveyance Self-test supported.
  Selective Self-test supported.
SMART capabilities:  (0x0003) Saves SMART data before entering
  power-saving mode.
  Supports SMART auto save timer.
Error logging capability:  (0x01) Error logging supported.
  General Purpose Logging supported.
Short self-test routine
recommended polling time:  (  1) minutes.
Extended self-test routine
recommended polling time:  ( 445) minutes.
SCT capabilities:  (0x003d) SCT Status supported.
  SCT Error Recovery Control supported.
  SCT Feature Control supported.
  SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  0
  2 Throughput_Performance  0x0005  137  137  054  Pre-fail  Offline  -  79
  3 Spin_Up_Time  0x0007  136  136  024  Pre-fail  Always  -  526 (Average 608)
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  15
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  123  123  020  Pre-fail  Offline  -  31
  9 Power_On_Hours  0x0012  099  099  000  Old_age  Always  -  9319
10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  15
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  71
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  71
194 Temperature_Celsius  0x0002  200  200  000  Old_age  Always  -  30 (Min/Max 15/40)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline  Completed without error  00%  9291  -
# 2  Short offline  Completed without error  00%  9283  -
# 3  Short offline  Completed without error  00%  9275  -
# 4  Short offline  Completed without error  00%  9123  -
# 5  Short offline  Completed without error  00%  9115  -
# 6  Short offline  Completed without error  00%  9107  -
# 7  Short offline  Completed without error  00%  8955  -
# 8  Short offline  Completed without error  00%  8947  -
# 9  Short offline  Completed without error  00%  8939  -
#10  Short offline  Completed without error  00%  8787  -
#11  Short offline  Completed without error  00%  8779  -
#12  Short offline  Completed without error  00%  8771  -
#13  Short offline  Completed without error  00%  8619  -
#14  Short offline  Completed without error  00%  8611  -
#15  Short offline  Completed without error  00%  8602  -
#16  Short offline  Completed without error  00%  8450  -
#17  Short offline  Completed without error  00%  8442  -
#18  Short offline  Completed without error  00%  8434  -
#19  Short offline  Completed without error  00%  8282  -
#20  Short offline  Completed without error  00%  8274  -
#21  Short offline  Completed without error  00%  8266  -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay



I checked all 8 of my drives manually to make sure they were all fine. The GUI also said "The boot volume state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected."
Now this has me on edge. As far as I can tell, the flash drive it uses for boot is fine. I am going to pull a backup of it with dd just in case, but now I am not sure I can trust what the GUI says, since it has already given me a false positive on SMART errors.
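
For anyone wanting to cross-check the boot volume alert from the shell, here is a minimal sketch. It assumes the boot pool uses the FreeNAS default name `freenas-boot`, and `devices_with_errors` is a hypothetical helper name, not a FreeNAS tool. It filters `zpool status` output down to the rows whose READ, WRITE, or CKSUM counters are non-zero.

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool status` output on stdin and prints the
# name of every pool, vdev, or device row with a non-zero error counter.
devices_with_errors() {
    awk '
        # Device rows look like: NAME STATE READ WRITE CKSUM
        NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ {
            if ($3 != 0 || $4 != 0 || $5 != 0) print $1
        }
    '
}

# Usage on the NAS (the pool name is an assumption, see above):
#   zpool status -v freenas-boot | devices_with_errors
```

If nothing is printed, no row in that output shows a non-zero counter, which would suggest the alert refers to an error that has since been cleared or that is described elsewhere in the `zpool status` text.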
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
dd is not a good backup tool. Back up the FreeNAS configuration with the built-in tools and save it on other media, not the NAS. That can be restored to a freshly-installed new memory stick.

I don't see why it would report a problem with ada1, though.
 

m0nkey_

MVP
Joined
Oct 27, 2015
Messages
2,739
According to the output of smartctl, you have not run a long SMART test. Recommend you do and let it run.
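
As a rough sketch of that suggestion, assuming the drive is still `/dev/ada1`: the commented commands start a long test and read back the self-test log, and `count_extended_tests` is a hypothetical helper (not part of smartmontools) that counts how many log entries are extended tests.

```shell
#!/bin/sh
# Hypothetical helper: reads `smartctl -l selftest` output on stdin and
# prints how many log entries are "Extended" (long) self-tests.
count_extended_tests() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '^# *[0-9][0-9]*  *Extended' || true
}

# Usage on the NAS (device name taken from the alert):
#   smartctl -t long /dev/ada1       # start the test; the drive runs it in the background
#   smartctl -l selftest /dev/ada1 | count_extended_tests
```

The output above lists a recommended polling time of 445 minutes for the extended routine, so the new log entry would not be expected before then.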
 

LordKitsuna

Cadet
Joined
Nov 22, 2015
Messages
6
m0nkey_ said:
According to the output of smartctl, you have not run a long SMART test. Recommend you do and let it run.
I never bothered with a long SMART test because when I first got the drives I ran them all through a full round of badblocks, which does a full write/read/verify pass across the entire disk 4 times. An extended SMART test is just a read across the entire disk, so badblocks is the more thorough test. Any problem on the disk would have been caught by SMART and logged at some point during the badblocks run. Although, since that was back when I first got them, I suppose a quick read test to make sure everything is still OK is never a bad thing.

wblock said:
dd is not a good backup tool. Back up the FreeNAS configuration with the built-in tools and save it on other media, not the NAS. That can be restored to a freshly-installed new memory stick.

I don't see why it would report a problem with ada1, though.
How so? I purchased 3 of the flash drives so I would have spares on hand. Since they have the same layout, I don't see how a clone of the entire disk is "not a good backup". I am all for being wrong, but backing up the config and installing fresh just seems like extra steps in the event of the USB stick failing.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
dd is stupid and blindly copies every block, writing every block on the destination. Writing every block on flash media is not good, messing with what it thinks it can use for spares and wear leveling.

If the destination stick is one block smaller, it fails. If it is even one block larger, the backup GPT table that is supposed to be at the end of the disk is no longer at the end of the disk.

"Unique" IDs on the original disk are copied, and no longer unique.

Times have changed since it was a simple MBR at the beginning of a spinning rust disk. dd still has a place, but it should be the last resort, not the first.
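
To make the size problem concrete, here is a small sketch. `clone_size_check` is a hypothetical helper, and the byte counts are assumed to be supplied by the caller (on FreeBSD, `diskinfo` is one place to look them up). It only classifies which of the cases above a raw dd clone would hit.

```shell
#!/bin/sh
# Hypothetical helper: compare source and destination sizes (in bytes)
# before a raw dd clone and print which case applies.
clone_size_check() {
    src_bytes=$1
    dst_bytes=$2
    if [ "$dst_bytes" -lt "$src_bytes" ]; then
        echo "too-small"   # dd runs out of room before the copy finishes
    elif [ "$dst_bytes" -gt "$src_bytes" ]; then
        echo "larger"      # backup GPT table lands short of the end of the disk
    else
        echo "same-size"   # copy fits exactly, but the "unique" IDs are still duplicated
    fi
}

# Example with made-up sizes: destination is one 512-byte block smaller.
#   clone_size_check 1024000 1023488
```

For the "larger" case, `gpart recover` on the destination is the usual FreeBSD way to rewrite the backup table at the true end of the disk, though that does nothing about the duplicated IDs.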
 

LordKitsuna

Cadet
Joined
Nov 22, 2015
Messages
6
wblock said:
dd is stupid and blindly copies every block, writing every block on the destination. Writing every block on flash media is not good, messing with what it thinks it can use for spares and wear leveling.

If the destination stick is one block smaller, it fails. If it is even one block larger, the backup GPT table that is supposed to be at the end of the disk is no longer at the end of the disk.

"Unique" IDs on the original disk are copied, and no longer unique.

Times have changed since it was a simple MBR at the beginning of a spinning rust disk. dd still has a place, but it should be the last resort, not the first.
Fair enough, config backup it is.
 