Hard drive SMART data reset to zero after power outage

freenas-supero · Dec 11, 2022

So yesterday I had a prolonged power outage that required shutting down my storage server running FreeNAS-11.3-U5. This server is on a UPS that is configured to shut the server down when the battery reaches a certain level, ensuring a graceful shutdown.

Today I got the report from the cron job for the SMART data and I was surprised to see that the SMART stats for an older drive (da1) had somehow been reset.

This is the table I get each Sunday at 5PM

2022-12-04:

Code:

+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|Device|Serial                  |Temp| Power|Start|Spin |ReAlloc|Current|Offline |Seek  |Total     |High  |    Command|Last|
|      |Number                  |    | On   |Stop |Retry|Sectors|Pending|Uncorrec|Errors|Seeks     |Fly   |    Timeout|Test|
|      |                        |    | Hours|Count|Count|       |Sectors|Sectors |      |          |Writes|    Count  |Age |
+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|da0 ? |ML0221F306AUSD          |24  | 94826|  289|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|2731*|
|da1 ? |S1E1RH1L                |26  | 78083|  112|    0|      0|      0|       0|     4| 662079809|   141|          0|2731*|
|da2 ? |ML0220F31B18RD          |26  | 78476|  136|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|2731*|
|da3 ? |ML4220F318UPDK          |25  | 78695|  151|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|2731*|
|da4   |PK2234P8JYHX5Y          |31  | 62815|   37|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
|da5   |PK1234P8K1GMLP          |29  | 49187|   32|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
|da6   |WD-WCC7K3KL0CV0         |25  | 19802|   22|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
|da7   |WD-WX32D70EDHDJ         |23  |  5279|    7|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+

2022-12-11 (today):

Code:

+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|Device|Serial                  |Temp| Power|Start|Spin |ReAlloc|Current|Offline |Seek  |Total     |High  |    Command|Last|
|      |Number                  |    | On   |Stop |Retry|Sectors|Pending|Uncorrec|Errors|Seeks     |Fly   |    Timeout|Test|
|      |                        |    | Hours|Count|Count|       |Sectors|Sectors |      |          |Writes|    Count  |Age |
+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+
|da0 ? |ML0221F306AUSD          |24  | 94993|  290|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|2731*|
|da1   |S1E1RH1L                |26  |    62|  113|    0|      0|      0|       0|     4| 663656672|   142|          0|   0|
|da2 ? |ML0220F31B18RD          |25  | 78643|  137|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|2731*|
|da3 ? |ML4220F318UPDK          |25  | 78862|  152|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|2731*|
|da4   |PK2234P8JYHX5Y          |30  | 62982|   38|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
|da5   |PK1234P8K1GMLP          |29  | 49354|   33|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
|da6   |WD-WCC7K3KL0CV0         |25  | 19969|   23|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
|da7   |WD-WX32D70EDHDJ         |22  |  5446|    8|    0|      0|      0|       0|   N/A|       N/A|   N/A|        N/A|   0|
+------+------------------------+----+------+-----+-----+-------+-------+--------+------+----------+------+-----------+----+

See how, for da1, "Power On Hours" dropped from 78083 to 62 and "Last Test Age" went from 2731 to zero. Note that I already posted on this forum about the high value of Last Test Age but never got a solution for it (see this thread). The SMART tests (long and short) are being run on a regular basis, as per the attached screenshots...

I have a feeling this is only a firmware reporting bug of some sort but I want to make sure..... Of course I have multiple backups... :) Should I worry about that specific drive (da1)?

joeschmuck · Dec 11, 2022

What I find odd is that all your drives were on for an additional 167 hours, yet the Last Test Age didn't change.
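The 167-hour figure can be checked against the two tables. Here is a minimal shell sketch, with the values copied from the reports above; the variable names are only illustrative:

```shell
#!/bin/sh
# Power-on hours copied from the 2022-12-04 and 2022-12-11 reports above.
da0_before=94826; da0_after=94993
da1_before=78083; da1_after=62

# Hours elapsed between the two reports, taken from a drive that kept counting.
elapsed=$((da0_after - da0_before))

# What da1 would have reported if its counter had simply kept counting.
da1_expected=$((da1_before + elapsed))

echo "elapsed hours between reports: $elapsed"
echo "da1 expected: $da1_expected, da1 reported: $da1_after"
```

The gap between the expected and reported values for da1 is the anomaly being discussed; the sketch only quantifies it and says nothing about the cause.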

I have seen drives have issues with recording very high numbers like these. The drive manufacturers didn't expect them to last this long. I have to say that I'm impressed: almost 11 years of power-on time for the first drive listed. I hope my current drives last until SSDs are $100 USD for 6TB (or more).

If you want, you can give my new script a try. It has come a long way from the version you are using now, which is simplistic and did what it needed to do. It's still in beta and has not yet been tested on Scale (Debian), but so far it works on FreeBSD. I have about another week's worth of testing to do. It takes so long mainly because I have a day job, so I only spend a few hours a week on the script, and I try to test everything that I can simulate. This particular change, adding customized alarm setpoints for up to 24 drives, was pretty involved. The setpoints are intended for drives which require something different than the normal thresholds.

To run it rename it to multi_report.sh and then run it the same as you had the smart_report.sh. But the first time you run it you will be asked to configure it and it will create a configuration file. Just answer the questions. This will make a simple setup and all "should" work fine. You can do more advanced setup using the -config switch. -help will provide you some additional information.

Cheers,
-Joe

freenas-supero · Dec 11, 2022

@joeschmuck

Thanks for the script. I copied it to /root/scripts and ran it with the "-config" switch, then provided my email and the locations for the CSV files. Then the script asked if I wanted to automatically set up some basic drive offsets. I didn't know what to enter, so I went the safe way and answered NO.

What are these offsets?

Then I saw this:

Creating the new file...
./multi_report.sh: line 5119: echo # Please look at the new Experimental Custom
Drive Settings under -config.: command not found

Humm, kind of slow...

Do you have enough RAM? Looks like about 8KB. Troubleshooting...

I found the problem, the system was identified as a Tandy TRS-80 Model 1
Wow! That was a fantastic consumer computer, in it's day.
Adjusting for the 1.774 MHz clock rate...
Success!

New clean configuration complete.

Path and Name if the configuration file: /root/scripts/multi_report_config.txt

It looks like the config wasn't successful (command not found), and I had a good chuckle at the Tandy stuff....

Should I rerun the setup?

EDIT: Here's the contents of the config TXT file in case you want to see what it generated on my system

# Multi-Report v1.6e dtd:2022-11-11 (TrueNAS Core FreeNAS-11.3-U5)
#
# This file is used exclusively to configure the multi_report version 1.6c or later.
#
# The configuration file will be created in the same directory as the script.
#
# This configuration file will override the default values coded into the script.

###### Email Address ######
# Enter your email address to send the report to. The from address does not need to be changed unless you experience
# an error sending the email. Some email servers only use the email address associated with the email server.

email="*************"
from="****************"

###### Custom Hack ######
# Custom Hacks are for users with generally very unsupported drives and the data must be manually manipulated.
# The goal is to not have any script customized so I will look for fixes where I can.
#
# Allowable custom hacks are: mistermanko
custom_hack="none"

###### Zpool Status Summary Table Settings

usedWarn=80 # Pool used percentage for CRITICAL color to be used.
scrubAgeWarn=37 # Maximum age (in days) of last pool scrub before CRITICAL color will be used (30 + 7 days for day of week). Default=37.

###### Temperature Settings
HDDtempWarn=45 # HDD Drive temp (in C) upper OK limit before a WARNING color/message will be used.
HDDtempCrit=50 # HDD Drive temp (in C) upper OK limit before a CRITICAL color/message will be used.
HDDmaxovrd="true" # HDD Max Drive Temp Override. This value when "true" will not alarm on any Current Power Cycle Max Temperature Limit.
SSDtempWarn=45 # SSD Drive temp (in C) upper OK limit before a WARNING color/message will be used.
SSDtempCrit=50 # SSD Drive temp (in C) upper OK limit before a CRITICAL color/message will be used.
SSDmaxovrd="true" # SSD Max Drive Temp Override. This value when "true" will not alarm on any Current Power Cycle Max Temperature Limit.
NVMtempWarn=50 # NVM Drive temp (in C) upper OK limit before a WARNING color/message will be used.
NVMtempCrit=60 # NVM Drive temp (in C) upper OK limit before a CRITICAL color/message will be used.
NVMmaxovrd="true" # NVM Max Drive Temp Override. This value when "true" will not alarm on any Current Power Cycle Max Temperature Limit.
# --- NOTE: NVMe drives currently do not report Min/Max temperatures so this is a future feature.

###### SSD/NVMe Specific Settings

wearLevelCrit=9 # Wear Level Alarm Setpoint lower OK limit before a WARNING color/message, 9% is the default.

###### General Settings
# Output Formats
powerTimeFormat="h" # Format for power-on hours string, valid options are "ymdh", "ymd", "ym", "y", or "h" (year month day hour).
tempdisplay="*C" # The format you desire the temperature to be displayed in. Common formats are: "*C", "^C", or "^c". Choose your own.
non_exist_value="---" # How do you desire non-existent data to be displayed. The Default is "---", popular options are "N/A" or " ".
pool_capacity="zfs" # Select "zfs" or "zpool" for Zpool Status Report - Pool Size and Free Space capacities. zfs is default.

# Ignore or Activate Alarms
ignoreUDMA="false" # Set to "true" to ignore all UltraDMA CRC Errors for the summary alarm (Email Header) only, errors will appear in the graphical chart.
ignoreSeekError="true" # Set to "true" to ignore all Seek Error Rate/Health errors. Default is true.
ignoreReadError="true" # Set to "true" to ignore all Raw Read Error Rate/Health errors. Default is true.
ignoreMultiZone="false" # Set to "true" to ignore all MultiZone Errors. Default is false.
disableWarranty="true" # Set to "true to disable email Subject line alerts for any expired warranty alert. The email body will still report the alert.

# Disable or Activate Input/Output File Settings
includeSSD="true" # Set to "true" will engage SSD Automatic Detection and Reporting, false = Disable SSD Automatic Detection and Reporting.
includeNVM="true" # Set to "true" will engage NVM Automatic Detection and Reporting, false = Disable NVM Automatic Detection and Reporting.
reportnonSMART="true" # Will force even non-SMART devices to be reported, "true" = normal operation to report non-SMART devices.
disableRAWdata="false" # Set to "true" to remove the smartctl -a data and non-smart data appended to the normal report. Default is false.
ata_auto_enable="false" # Set to "true" to automatically update Log Error count to only display a log error when a new one occurs.

# Media Alarms
sectorsWarn=1 # Number of sectors per drive to allow with errors before WARNING color/message will be used, this value should be less than sectorsCrit.
sectorsCrit=9 # Number of sectors per drive with errors before CRITICAL color/message will be used.
reAllocWarn=0 # Number of Reallocated sector events allowed. Over this amount is an alarm condition.
multiZoneWarn=0 # Number of MultiZone Errors to allow before a Warning color/message will be used. Default is 0.
multiZoneCrit=5 # Number of MultiZone Errors to allow before a Warning color/message will be used. Default is 5.
deviceRedFlag="true" # Set to "true" to have the Device Column indicate RED for ANY alarm condition. Default is true.
heliumAlarm="true" # Set to "true" to set for a critical alarm any He value below "heliumMin" value. Default is true.
heliumMin=100 # Set to 100 for a zero leak helium result. An alert will occur below this value.
rawReadWarn=5 # Number of read errors to allow before WARNING color/message will be used, this value should be less than rawReadCrit.
rawReadCrit=100 # Number of read errors to allow before CRITICAL color/message will be used.
seekErrorsWarn=5 # Number of seek errors to allow before WARNING color/message will be used, this value should be less than seekErrorsCrit.
seekErrorsCrit=100 # Number of seek errors to allow before CRITICAL color/message will be used.

# Time-Limited Error Recovery (TLER)
SCT_Drive_Enable="false" # Set to "true" to send a command to enable SCT on your drives for user defined timeout if the TLER state is Disabled.
SCT_Warning="TLER_No_Msg" # Set to "all" will generate a Warning Message for all devices not reporting SCT enabled. "TLER" reports only drive which support TLER.
# "TLER_No_Msg" will only report for TLER drives and not report a Warning Message if the drive can set TLER on.
SCT_Read_Timeout=70 # Set to the read threshold. Default = 70 = 7.0 seconds.
SCT_Write_Timeout=70 # Set to the write threshold. Default = 70 = 7.0 seconds.

# SMART Testing Alarm
testAgeWarn=2 # Maximum age (in days) of last SMART test before CRITICAL color/message will be used.

###### Statistical Data File
statistical_data_file="/root/scripts/statisticalsmartdata.csv"
expDataEnable="true" # Set to "true" will save all drive data into a CSV file defined by "statistical_data_file" below.
expDataEmail="true" # Set to "true" to have an attachment of the file emailed to you. Default is true.
expDataPurge=730 # Set to the number of day you wish to keep in the data. Older data will be purged. Default is 730 days (2 years). 0=Disable.
expDataEmailSend="Mon" # Set to the day of the week the statistical report is emailed. (All, Mon, Tue, Wed, Thu, Fri, Sat, Sun, Month)

###### FreeNAS config backup settings
configBackup="true" # Set to "true" to save config backup (which renders next two options operational); "false" to keep disable config backups.
configSendDay="Mon" # Set to the day of the week the config is emailed. (All, Mon, Tue, Wed, Thu, Fri, Sat, Sun, Month)
saveBackup="false" # Set to "false" to delete FreeNAS config backup after mail is sent; "true" to keep it in dir below.
backupLocation="/tmp/" # Directory in which to store the backup FreeNAS config files.

###### Attach multi_report_config.txt to Email ######
Config_Email_Enable="true" # Set to "true" to enable periodic email (which renders next two options operational).
Config_Changed_Email="true" # If "true" it will attach the updated/changed file to the email.
Config_Backup_Day="Mon" # Set to the day of the week the multi_report_config.txt is emailed. (All, Mon, Tue, Wed, Thu, Fri, Sat, Sun, Month, Never)

########## REPORT CHART CONFIGURATION ##############

###### REPORT HEADER TITLE ######
HDDreportTitle="Spinning Rust Summary Report" # This is the title of the HDD report, change as you desire.
SSDreportTitle="SSD Summary Report" # This is the title of the SSD report, change as you desire.
NVMreportTitle="NVMe Summary Report" # This is the title of the NVMe report, change as you desire.

### CUSTOM REPORT CONFIGURATION ###
# By default most items are selected. Change the item to false to have it not displayed in the graph, true to have it displayed.
# NOTE: Alarm setpoints are not affected by these settings, this is only what columns of data are to be displayed on the graph.
# I would recommend that you remove columns of data that you don't really care about to make the graph less busy.

# For Zpool Status Summary
Zpool_Pool_Name_Title="Pool Name"
Zpool_Status_Title="Status"
Zpool_Pool_Size_Title="Pool Size"
Zpool_Free_Space_Title="Free Space"
Zpool_Used_Space_Title="Used Space"
Zfs_Pool_Size_Title="^Pool Size"
Zfs_Free_Space_Title="^Free Space"
Zfs_Used_Space_Title="^Used Space"
Zpool_Read_Errors_Title="Read Errors"
Zpool_Write_Errors_Title="Write Errors"
Zpool_Checksum_Errors_Title="Cksum Errors"
Zpool_Scrub_Repaired_Title="Scrub Repaired Bytes"
Zpool_Scrub_Errors_Title="Scrub Errors"
Zpool_Scrub_Age_Title="Last Scrub Age"
Zpool_Scrub_Duration_Title="Last Scrub Duration"

# For Hard Drive Section
HDD_Device_ID="true"
HDD_Device_ID_Title="Device ID"
HDD_Serial_Number="true"
HDD_Serial_Number_Title="Serial Number"
HDD_Model_Number="true"
HDD_Model_Number_Title="Model Number"
HDD_Capacity="true"
HDD_Capacity_Title="HDD Capacity"
HDD_Rotational_Rate="true"
HDD_Rotational_Rate_Title="RPM"
HDD_SMART_Status="true"
HDD_SMART_Status_Title="SMART Status"
HDD_Warranty="true"
HDD_Warranty_Title="Warr- anty"
HDD_Raw_Read_Error_Rate="true"
HDD_Raw_Read_Error_Rate_Title="Read Error Rate"
HDD_Drive_Temp="true"
HDD_Drive_Temp_Title="Curr Temp"
HDD_Drive_Temp_Min="true"
HDD_Drive_Temp_Min_Title="Temp Min"
HDD_Drive_Temp_Max="true"
HDD_Drive_Temp_Max_Title="Temp Max"
HDD_Power_On_Hours="true"
HDD_Power_On_Hours_Title="Power On Time"
HDD_Start_Stop_Count="true"
HDD_Start_Stop_Count_Title="Start Stop Count"
HDD_Load_Cycle="true"
HDD_Load_Cycle_Title="Load Cycle Count"
HDD_Spin_Retry="true"
HDD_Spin_Retry_Title="Spin Retry Count"
HDD_Reallocated_Sectors="true"
HDD_Reallocated_Sectors_Title="Re-alloc Sects"
HDD_Reallocated_Events="true"
HDD_Reallocated_Events_Title="Re-alloc Evnt"
HDD_Pending_Sectors="true"
HDD_Pending_Sectors_Title="Curr Pend Sects"
HDD_Offline_Uncorrectable="true"
HDD_Offline_Uncorrectable_Title="Offl Unc Sects"
HDD_UDMA_CRC_Errors="true"
HDD_UDMA_CRC_Errors_Title="UDMA CRC Error"
HDD_Seek_Error_Rate="true"
HDD_Seek_Error_Rate_Title="Seek Error Rate"
HDD_MultiZone_Errors="true"
HDD_MultiZone_Errors_Title="Multi Zone Error"
HDD_Helium_Level="true"
HDD_Helium_Level_Title="He Level"
HDD_Last_Test_Age="true"
HDD_Last_Test_Age_Title="Last Test Age"
HDD_Last_Test_Type="true"
HDD_Last_Test_Type_Title="Last Test Type"

# For Solid State Drive Section
SSD_Device_ID="true"
SSD_Device_ID_Title="Device ID"
SSD_Serial_Number="true"
SSD_Serial_Number_Title="Serial Number"
SSD_Model_Number="true"
SSD_Model_Number_Title="Model Number"
SSD_Capacity="true"
SSD_Capacity_Title="HDD Capacity"
SSD_SMART_Status="true"
SSD_SMART_Status_Title="SMART Status"
SSD_Warranty="true"
SSD_Warranty_Title="Warr- anty"
SSD_Drive_Temp="true"
SSD_Drive_Temp_Title="Curr Temp"
SSD_Drive_Temp_Min="true"
SSD_Drive_Temp_Min_Title="Temp Min"
SSD_Drive_Temp_Max="true"
SSD_Drive_Temp_Max_Title="Temp Max"
SSD_Power_On_Hours="true"
SSD_Power_On_Hours_Title="Power On Time"
SSD_Wear_Level="true"
SSD_Wear_Level_Title="Wear Level"
SSD_Reallocated_Sectors="true"
SSD_Reallocated_Sectors_Title="Re-alloc Sects"
SSD_Reallocated_Events="true"
SSD_Reallocated_Events_Title="Re-alloc Evnt"
SSD_Pending_Sectors="true"
SSD_Pending_Sectors_Title="Curr Pend Sects"
SSD_Offline_Uncorrectable="true"
SSD_Offline_Uncorrectable_Title="Offl Unc Sects"
SSD_UDMA_CRC_Errors="true"
SSD_UDMA_CRC_Errors_Title="UDMA CRC Error"
SSD_Last_Test_Age="true"
SSD_Last_Test_Age_Title="Last Test Age"
SSD_Last_Test_Type="true"
SSD_Last_Test_Type_Title="Last Test Type"

# For NVMe Drive Section
NVM_Device_ID="true"
NVM_Device_ID_Title="Device ID"
NVM_Serial_Number="true"
NVM_Serial_Number_Title="Serial Number"
NVM_Model_Number="true"
NVM_Model_Number_Title="Model Number"
NVM_Capacity="true"
NVM_Capacity_Title="HDD Capacity"
NVM_SMART_Status="true"
NVM_SMART_Status_Title="SMART Status"
NVM_Warranty="true"
NVM_Warranty_Title="Warr- anty"
NVM_Critical_Warning="true"
NVM_Critical_Warning_Title="Critical Warning"
NVM_Drive_Temp="true"
NVM_Drive_Temp_Title="Curr Temp"
NVM_Drive_Temp_Min="false" # I have not found this on an NVMe drive yet, so set to false
NVM_Drive_Temp_Min_Title="Temp Min"
NVM_Drive_Temp_Max="false" # I have not found this on an NVMe drive yet, so set to false
NVM_Drive_Temp_Max_Title="Temp Max"
NVM_Power_On_Hours="true"
NVM_Power_On_Hours_Title="Power On Time"
NVM_Wear_Level="true"
NVM_Wear_Level_Title="Wear Level"

###### Drive Ignore List
# What does it do:
# Use this to list any drives to ignore and remove from the report. This is very useful for ignoring USB Flash Drives
# or other drives for which good data is not able to be collected (non-standard).
#
# How to use it:
# We are using a comma delimited file to identify the drive serial numbers. You MUST use the exact and full serial
# number smartctl reports, if there is no identical match then it will not match. Additionally you may list drives
# from other systems and they will not have any effect on a system where the drive does not exist. This is great
# to have one configuration file that can be used on several systems.
#
# Example: "VMWare,1JUMLBD,21HNSAFC21410E"

Ignore_Drives="none"

###### Drive UDMA_CRC_Error_Count List
# What does it do:
# If you have a drive which has an UDMA count other than 0 (zero), this setting will offset the
# value back to zero for the concerns of monitoring future increases of this specific error. Any match will
# subtract the given value to report a 0 (zero) value and highlight it in yellow to denote it was overridden.
# The Warning Title will not be flagged if this is zero'd out in this manner.
# NOTE: UDMA_CRC_Errors are typically permanently stored in the drive and cannot be reset to zero even though
# they are frequently caused by a data cable communications error.
#
# How to use it:
# List each drive by serial number and include the current UDMA_CRC_Error_Count value.
# The format is very specific and will not work if you wing it, use the Live EXAMPLE.
#
# Set the FLAG in the FLAGS Section ignoreUDMA to false (the default setting).
#
# If the error count exceeds the limit minus the offset then a warning message will be generated.
# On the Status Report the UDMA CRC Errors block will be YELLOW with a value of 0 for an overridden value.
# -- NOTE: We are using the colon : as the separator between the drive serial number and the value to change.
#
# Format: variable=Drive_Serial_Number:Current_UDMA_Error_Count and add a comma if you have more than one drive.
#
# The below example shows drive WD-WMC4N2578099 has 1 UDMA_CRC_Error, drive S2X1J90CA48799 has 2 errors.
#
# Example: CRC_Errors="WD-WMC4N2578099:1,S2X1J90CA48799:2,P02618119268:1"

CRC_Errors="none"

###### Multi_Zone_Errors List
# What does it do:
# This identifies drives with Multi_Zone_Errors which may be irritating people.
# Multi_Zone_Errors for some drives, not all drives are pretty much meaningless.
#
# How to use it:
# Use same format as CRC_Errors (see above).

Multi_Zone="none"

####### Reallocated Sectors Exceptions
# What does it do:
# This will offset any Reallocated Sectors count by the value provided.
#
# I do not recommend using this feature as I'm a believer in if you have over 5 bad sectors, odds are the drive will get worse.
# I'd recommend replacing the drive before complete failure. But that is your decision.
#
# Why is it even an option?
# I use it for testing purposes only but you may want to use it.
#
# How to use it:
# Use same format as CRC_Errors (see above).

Bad_Sectors="none"

######## ATA Error Log Silencing ##################
# What does it do:
# This will ignore error log messages equal to or less than the threshold.
# How to use:
# Same as the CRC_Errors, [drive serial number:error count]

ata_errors="none"

####### Warranty Expiration Date
# What does it do:
# This section is used to add warranty expirations for designated drives and to create an alert when they expire.
# The date format is YYYY-MM-DD.
#
# Below is an example for the format using my own drives, which yes, are expired.
# As previously stated above, drive serial numbers must be an exact match to what smartctl reports to function.
#
# If the drive does not exist, for example my drives are not on your system, then nothing will happen.
#
# How to use it:
# Use the format ="Drive_Serial_Number:YYYY-MM-DD" and add a comma if you have more than one drive.
#
# Example: Drive_Warranty="K1JUMLBD:2020-09-30,K1JRSWLD:2020-09-30,K1JUMW4D:2020-09-30,K1GVD84B:2020-10-12"

Drive_Warranty="none"

expiredWarrantyBoxColor="#000000" # "#000000" = normal box perimeter color.
WarrantyBoxPixels="1" # Box line thickness. 1 = normal, 2 = thick, 3 = Very Thick, used for expired drives only.
WarrantyBackgndColor="#f1ffad" # Hex code or "none" = normal background, Only for expired drives.

######## Enable-Disable Text Portion ########
enable_text="true" # This will display the Text Section when = "true" or remove it when not "true". Default="true"

###### Global table of colors
# The colors selected you can change but you will need to look up the proper HEX code for a color.

okColor="#b5fcb9" # Hex code for color to use in SMART Status column if drives pass (default is darker light green, #b5fcb9).
warnColor="#f765d0" # Hex code for WARN color (default is purple, #f765d0).
critColor="#ff0000" # Hex code for CRITICAL color (default is red, #ff0000).
altColor="#f4f4f4" # Table background alternates row colors between white and this color (default is light gray, #f4f4f4).
whtColor="#ffffff" # Hex for White background.
ovrdColor="#ffff66" # Hex code for Override Yellow.
blueColor="#87ceeb" # Hex code for Sky Blue, used for the SCRUB In Progress background.
yellowColor="#f1ffad" # Hex code for pale yellow.

joeschmuck · Dec 12, 2022

freenas-supero said:
What are these offsets?

These are adjustments for any values like UDMA_CRC_Errors to zero those out and only react to any increase in the value. It is safe to do. Anything that was overridden will be highlighted in a pale yellow color.
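As a rough illustration of the idea (not the code multi_report.sh actually uses), an offset is a stored baseline that gets subtracted from the current reading. The function name `apply_offset` is hypothetical; the list format follows the `CRC_Errors` example in the config file above:

```shell
#!/bin/sh
# Illustrative only: not the code multi_report.sh actually uses.
# Looks up a serial number in a "SERIAL:COUNT,SERIAL:COUNT" list and prints
# the current value minus the stored baseline (or the raw value if no match).
apply_offset() {
    serial=$1
    current=$2
    list=$3
    baseline=0
    old_ifs=$IFS
    IFS=,
    for entry in $list; do
        # Text before the colon is the serial, text after it is the baseline.
        if [ "${entry%%:*}" = "$serial" ]; then
            baseline=${entry##*:}
        fi
    done
    IFS=$old_ifs
    echo $((current - baseline))
}

CRC_Errors="WD-WMC4N2578099:1,S2X1J90CA48799:2"
apply_offset "S2X1J90CA48799" 2 "$CRC_Errors"   # reading equals the baseline
apply_offset "S2X1J90CA48799" 5 "$CRC_Errors"   # reading has grown past the baseline
apply_offset "UNLISTED" 4 "$CRC_Errors"         # no baseline stored for this serial
```

With a baseline stored, only growth beyond it shows up as a non-zero value, which is why the offsets are safe to use for errors that a drive never clears.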

It appears the multi_report_config.txt file was created with version v1.6e. There is a fix for the "date" issue in v1.6e (and earlier) that should help with non-English configurations, assuming that is the issue. If you want to use the v1.6e version, update the script and add LANG="en_US.UTF-8" as line 3 of the script; that should fix it.

I fixed the typo: echo" # Please look at the new Experimental Custom" is now echo "# Please look at the new Experimental Custom". Did you see the difference? That is why I said it was still beta and I have more testing to do. I didn't see the error message because my configuration file was already current; I made a few comment changes and haven't deleted and recreated the config file yet. I still have more testing to do, but in general it should be good. I also updated the date in the script to today, which is my way of tracking changes.
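For anyone wondering why one missing space produces "command not found", here is a small shell sketch that reproduces both forms in a child shell. The variable names are illustrative and the message text is copied from the error above:

```shell
#!/bin/sh
# Broken form: no space between echo and the opening quote, so the shell
# joins them into one word and looks for a command with that whole name.
broken_output=$(sh -c 'echo" # Please look at the new Experimental Custom"' 2>&1)
broken_status=$?

# Fixed form: the space makes echo the command and the quoted text its argument.
fixed_output=$(sh -c 'echo "# Please look at the new Experimental Custom"')

echo "broken exit status: $broken_status"
echo "fixed output: $fixed_output"
```

Exit status 127 is what a POSIX shell returns when it cannot find the command, which matches the message in the setup output.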

Please let me know if this works now, while I'm fairly certain it will run on FreeNAS 11, it's always good to get confirmation.

freenas-supero · Dec 12, 2022

Hello @joeschmuck

I downloaded 1.6f but unfortunately I am still having the date problem:

Multi-Report v1.6f-beta dtd:2022-12-12 (TrueNAS Core FreeNAS-11.3-U5)
Configuration File Version Date: 2022-12-12
date: illegal option -- I
usage: date [-jnRu] [-d dst] [-r seconds] [-t west] [-v[+|-]val[ymwdHMS]] ...
[-f fmt date | [[[[[cc]yy]mm]dd]HH]MM[.ss]] [+format]
^C

Just for my info, is the Tandy thing just an Easter egg, or is your script detecting something abnormal?

Humm, kind of slow...

Do you have enough RAM? Looks like about 8KB. Troubleshooting...

I found the problem, the system was identified as a Tandy TRS-80 Model 1
Wow! That was a fantastic consumer computer, in it's day.
Adjusting for the 1.774 MHz clock rate...
Success!

joeschmuck · Dec 12, 2022

The Tandy thing is a joke and for those old enough to know what it was.

As for the date problem, I'm at the hospital right now; my wife is having stuff done to her heart. When I have time later I can look into it more. Could you provide the output of locale in the meantime? I don't understand why I'm getting so many complaints about date format issues all of a sudden.

freenas-supero · Dec 12, 2022

joeschmuck said:
wife having stuff done to her heart

Please let go of this forum for now and come back when you can. IMO your wife is much more important than anything else.

"locale" for now...

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=

joeschmuck · Dec 12, 2022

Thanks for the info. I'm unsure as to why this date issue is happening. I will try to tackle it tomorrow. Does it tell you which line failed? I don't use the date command too often. I may need to send you a test version that will spit out a message just before each date command; then I could at least isolate the culprit.
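One way to get a message before each date call without editing every line is to shadow date with a shell function. This is a hypothetical debugging aid, not something taken from multi_report.sh, and the "DEBUG:" prefix is an arbitrary choice:

```shell
#!/bin/sh
# Hypothetical debugging aid, not part of multi_report.sh: shadow date with
# a function that logs its arguments to stderr, then runs the real command.
date() {
    echo "DEBUG: date $*" >&2
    command date "$@"
}

# Every later call in the same script now announces itself before it runs.
today=$(date +%Y-%m-%d)
echo "$today"
```

The last DEBUG line printed before the "illegal option" error would then show the arguments of the failing call.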

The wife needs open heart surgery, it's no fun. But she's going to be in the hospital for over a week. I'll be bringing my laptop to do some office work and then some script troubleshooting to keep my mind busy.

freenas-supero · Dec 12, 2022

I am pretty sure it is line 5860 with

datestamp2=$(date -Idate)

This is the only place where you pass the -I option to the date command.

joeschmuck · Dec 18, 2022

freenas-supero said:
I am pretty sure it is line 5860 with

This is the only place where you call argument I after the date command

You are absolutely correct. The -I option did not exist in FreeBSD 11.x; it was introduced in version 12.0.

You can replace this command with datestamp2=$(date +%Y-%m-%d) and it should work fine.
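If a script has to run on systems both with and without -I, the same fallback can be wrapped in a helper. This is a minimal sketch; the function name `iso_date` is made up for illustration, and it assumes a date that rejects -I exits non-zero, as the usage message above suggests:

```shell
#!/bin/sh
# Try the ISO 8601 shortcut first; if this date(1) rejects -I, fall back to
# an explicit format string, which does not depend on the -I option.
iso_date() {
    date -Idate 2>/dev/null || date +%Y-%m-%d
}

datestamp2=$(iso_date)
echo "$datestamp2"
```

Using the explicit `+%Y-%m-%d` format everywhere is simpler still and produces the same YYYY-MM-DD string.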

Please let me know if this works. I also have not tested it on Debian so I need to look that up as well but I expect it to work fine.

And sorry for not getting back to you sooner, the wife is still in the hospital but looking to have her depart the day before Christmas, well we hope.

freenas-supero · Jan 1, 2023

Your script is now running on a weekly basis and reporting the pool/drive stats well, but there are some things that look strange to me:

1. I wonder why it's reporting the "Scrub repaired bytes" as empty but colored pink?
2. Why is the email title: *WARNING* SMART Testing Results for freenas *WARNING* ??

(Note that the last scrub age of 22 for the pool is normal; the server was shut down on the day the last scrub was scheduled to run ;)

joeschmuck · Jan 2, 2023

I sent you a private message requesting some data. The warning message and colors are not what I'm concerned about; it's the fact that you have empty boxes, which is causing the issue. So why are any of the boxes empty? That is what the private message is about.

freenas-supero · Jan 2, 2023

I sent you the entire email report. Only the last scrub age boxes are empty, but the zpool status command is reporting repaired "0" for all pools.... No errors on that side. I believe this is only a reporting issue, but I'll let you check it out, and please ask if you need anything else.

By the way I hope your wife is doing much better now!!

joeschmuck · Jan 2, 2023

Thanks, she is doing much better.

As for the scrub issue... I got your data and I need to look into it. I'm almost positive I know what is going on, and I hope it's not an issue unique to FreeNAS 11. I know I can address it, but I need to install a VM of FreeNAS 11 in order to test it out and verify it works correctly.
