
Scripts to report SMART, ZPool and UPS status, HDD/CPU T°, HDD identification and backup the config


Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I wanted to share my scripts because I think I'm not the only one who wants to monitor his FreeNAS server a bit more closely and in a more personalized way than the automatic nightly emails would do.

Please note that you're the only one responsible for what you do in the CLI, so don't blame me if you mess up your system.

This post is segmented like this:
  • SMART report email
  • ZPool report email
  • UPS report email
  • Display CPU and HDD temperatures
  • Display drives identification infos (Device, GPTID, Serial)
  • Send config backup to email
  • Misc
  • Script basics
  • Changelog



SMART report email

Based on https://forums.freenas.org/index.php?threads/setup-smart-reporting-via-email.6211/ and https://forums.freenas.org/index.ph...cript-to-show-smart-attribute-overview.23804/

For this script I wanted the maximum information in the minimum text size possible. The first block is a summary of the most important values for each drive; it is useful when you don't want to spend too much time reading the detailed blocks.

SATA version:

If a drive is over a chosen limit temperature, or has any reallocated, pending or uncorrectable sectors, or if the last test age is over testAgeWarn then the chosen warning symbol will be added to the end of the device name. If it is over the critical temperature or if any of the reallocated, pending or uncorrectable sectors value is over sectorsCrit then it's the critical symbol that will be added instead of the warning symbol.

Then there is a detailed block for each drive with the overall health self-test result, the SMART attributes, the error log, and the details of the last self-test.
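The warning/critical rule described above for the SATA summary can be sketched roughly like this. It is only an illustration, not the code from the pastebin script: testAgeWarn and sectorsCrit are the parameter names mentioned above, while tempWarn, tempCrit, the two symbols, all the values and the drive_flag function are assumptions made up for the example.

```shell
#!/bin/sh
# testAgeWarn and sectorsCrit are named in the post; tempWarn, tempCrit,
# the symbols and all the values below are assumptions for this sketch.
tempWarn=40        # degrees Celsius
tempCrit=45
sectorsCrit=10
testAgeWarn=5      # days
warnSymbol="?"
critSymbol="!"

# drive_flag TEMP REALLOCATED PENDING UNCORRECTABLE LAST_TEST_AGE_DAYS
# Prints the critical symbol, the warning symbol, or nothing at all.
drive_flag() {
    temp=$1; reAlloc=$2; pending=$3; offlineUnc=$4; testAge=$5
    if [ "$temp" -gt "$tempCrit" ] || [ "$reAlloc" -gt "$sectorsCrit" ] ||
       [ "$pending" -gt "$sectorsCrit" ] || [ "$offlineUnc" -gt "$sectorsCrit" ]; then
        printf '%s' "$critSymbol"
    elif [ "$temp" -gt "$tempWarn" ] || [ "$reAlloc" -gt 0 ] ||
         [ "$pending" -gt 0 ] || [ "$offlineUnc" -gt 0 ] ||
         [ "$testAge" -gt "$testAgeWarn" ]; then
        printf '%s' "$warnSymbol"
    fi
}

# The symbol is appended to the device name in the summary line.
printf 'ada0%s\n' "$(drive_flag 36 0 0 0 2)"    # healthy drive, no symbol
printf 'ada1%s\n' "$(drive_flag 42 0 0 0 2)"    # warm drive, warning symbol
printf 'ada2%s\n' "$(drive_flag 36 0 24 0 2)"   # many pending sectors, critical symbol
```

The critical test comes first so that a drive that meets both conditions only gets the critical symbol.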

The output (shortened): http://pastebin.com/wjfXbSGw

The script: http://pastebin.com/9xBRFFuB Don't forget to put your email address and your drives labels in the parameters section at the top of the script ;)

SAS version:

If a drive is over a chosen limit temperature then the chosen warning symbol will be added to the end of the device name. If it is over the critical temperature then it's the critical symbol that will be added instead of the warning symbol.

Then there is a detailed block for each drive with the SMART attributes, the error log, and the details of the last self-test.

The output (there's only one drive in this example because I did the tests with a file provided by cyberjock, thanks to him BTW, as I don't have any SAS drive; look at the output example of the SATA version if you want to see what it looks like with more than one drive): http://pastebin.com/56LuTdKe

The script: http://pastebin.com/veDv2FfZ Don't forget to put your email address and your drives labels in the parameters section at the top of the script ;)



ZPool report email

Like the SMART script, this one outputs a summary of the most important values of the pools at the top of the email. The chosen critical symbol will be added at the end of the pool name if any of these conditions is met: the pool status is equal to "FAULTED", the used space percentage is greater than the usedCrit value, the last scrub errors value is greater than 0.

Likewise, the chosen warning symbol will be added if any of these conditions is met: the pool status is different from "ONLINE", the value of the read, write or checksum errors is greater than 0, the used space percentage is greater than the usedWarn value, the last scrub repaired value is greater than 0, the last scrub is older than the value of scrubAgeWarn (in days).

If the pool has never been scrubbed or if there is a resilver in progress then "N/A" will replace the scrub values.

Then the summary is followed by the output of zpool status -v for each pool.
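Here is a rough sketch of the pool conditions listed above, for those who prefer to read them as code. It is not the code of the pastebin script: usedWarn, usedCrit and scrubAgeWarn are the parameter names from the post, whereas the values, the symbols and the is_over and pool_flag functions are assumptions for the example. The "N/A" scrub values are simply never counted as over a limit.

```shell
#!/bin/sh
# usedWarn, usedCrit and scrubAgeWarn are named in the post; their values,
# the symbols and the helper functions are assumptions for this sketch.
usedWarn=75        # percent
usedCrit=90        # percent
scrubAgeWarn=30    # days
warnSymbol="?"
critSymbol="!"

# is_over VALUE LIMIT: true when VALUE is a number greater than LIMIT.
# "N/A" (pool never scrubbed, or resilver in progress) is never over.
is_over() {
    [ "$1" != "N/A" ] && [ "$1" -gt "$2" ]
}

# pool_flag STATUS IO_ERRORS USED_PERCENT SCRUB_REPAIRED SCRUB_ERRORS SCRUB_AGE_DAYS
# IO_ERRORS is the sum of the read, write and checksum error counters.
pool_flag() {
    status=$1; ioErrors=$2; used=$3
    scrubRepaired=$4; scrubErrors=$5; scrubAge=$6
    if [ "$status" = "FAULTED" ] || is_over "$used" "$usedCrit" ||
       is_over "$scrubErrors" 0; then
        printf '%s' "$critSymbol"
    elif [ "$status" != "ONLINE" ] || is_over "$ioErrors" 0 ||
         is_over "$used" "$usedWarn" || is_over "$scrubRepaired" 0 ||
         is_over "$scrubAge" "$scrubAgeWarn"; then
        printf '%s' "$warnSymbol"
    fi
}

printf 'tank%s\n' "$(pool_flag ONLINE 0 50 0 0 10)"          # nothing to report
printf 'backup%s\n' "$(pool_flag DEGRADED 0 50 N/A N/A N/A)" # warning symbol
printf 'media%s\n' "$(pool_flag ONLINE 0 95 0 0 10)"         # critical symbol
```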

The output: http://pastebin.com/rd3edPHU

The script: http://pastebin.com/hQ1j6F2g Again, don't forget to put your email address and your pool(s) name(s) in the parameters section at the top of the script.



UPS report email

This script is a bit less useful than the others (I originally created this one to hack the UPS notifier script, if you're interested you can find the hack in this thread) but it allows you to keep an eye on your UPS (particularly on the battery charge).

The output: http://pastebin.com/zKp98XQJ

The script: http://pastebin.com/SzCm7qKS As always, don't forget to put your email address in the parameters section at the top of the script. You can heavily personalize the attributes depending on your particular UPS, use upsc your_ups_name@localhost to see all the attributes and pick the ones you want :)
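As an illustration of picking attributes, here is a small sketch that pulls a single value out of the "name: value" lines that upsc prints. The ups_value function is a made-up helper for this example (it is not part of the pastebin script), and your_ups_name is a placeholder, as above.

```shell
#!/bin/sh
# ups_value ATTRIBUTE
# Reads "name: value" lines (the format printed by upsc) on stdin and
# prints the value of the requested attribute. Hypothetical helper.
ups_value() {
    awk -F': ' -v key="$1" '$1 == key { sub(/^[^:]*: /, ""); print; exit }'
}

# On the server it would be used like this (your_ups_name is a placeholder):
#   upsc your_ups_name@localhost | ups_value battery.charge
# Here a canned sample stands in for the real upsc output:
printf 'battery.charge: 100\nups.status: OL\n' | ups_value ups.status
```

Reading from stdin keeps the helper easy to try on a saved copy of the upsc output before wiring it to the real UPS.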



Display CPU and HDD temperatures

Based on https://forums.freenas.org/index.ph...-hdd-mobo-gpu-temperatures-on-freenas-8.2994/

Don't forget to put the number of CPU cores and your drives labels in the parameters section at the top of the script.

The output: http://pastebin.com/MaQ30u2S

The script (SATA version): http://pastebin.com/FtZKahQk
The script (SAS version): http://pastebin.com/7z7bVV54
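To give an idea of what the two parameters are used for, here is a minimal sketch of a temperature loop. It is an assumption about how such a script can be written, not a copy of the pastebin code: hdd_temp and show_temps are made-up names, the sysctl node dev.cpu.N.temperature is only present on FreeBSD when the CPU temperature driver is loaded, and the SMART attribute name can differ between drive models.

```shell
#!/bin/sh
# hdd_temp: reads `smartctl -A` output on stdin and prints the raw value
# (10th column) of the Temperature_Celsius attribute. Hypothetical helper.
hdd_temp() {
    awk '/Temperature_Celsius/ { print $10; exit }'
}

# show_temps NUMBER_OF_CORES DRIVE...
# FreeBSD/FreeNAS only, needs root for smartctl.
show_temps() {
    numberOfCores=$1; shift
    core=0
    while [ "$core" -lt "$numberOfCores" ]; do
        printf 'CPU %s: %s\n' "$core" "$(sysctl -n "dev.cpu.${core}.temperature")"
        core=$((core + 1))
    done
    for drive in "$@"; do
        printf '%s: %s\n' "$drive" "$(smartctl -A "/dev/${drive}" | hdd_temp)"
    done
}

# Example call on the server (4 cores, two SATA drives):
#   show_temps 4 ada0 ada1
```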



Display drives identification infos (Device, GPTID, Serial)

This script is particularly useful when you have to replace a failed drive and you don't know which drive is which for example.

Be careful not to rely on the device name if you've rebooted since you ran the script, because it can change from reboot to reboot.

The output: http://pastebin.com/HgwjJNWu

The script (SATA version): http://pastebin.com/DsjT51aq
The script (SAS version): http://pastebin.com/59AxfasQ
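If you're curious how the device/GPTID matching can be done, here is a minimal sketch. It assumes the usual three-column layout of `glabel status` (Name, Status, Components) and is not taken from the pastebin script; gptid_for_device is a made-up helper name and the GPTIDs in the sample are fake.

```shell
#!/bin/sh
# gptid_for_device DEVICE
# Reads `glabel status` output on stdin and prints the gptid label(s) whose
# component is a partition of DEVICE (ada1 matches ada1p2 but not ada10p2).
gptid_for_device() {
    awk -v dev="$1" '$1 ~ /^gptid\// && $3 ~ ("^" dev "p[0-9]+$") { print $1 }'
}

# On the server:
#   glabel status | gptid_for_device ada1
#   smartctl -i /dev/ada1 | awk '/Serial Number:/ { print $3 }'
# Here a canned sample with fake GPTIDs stands in for the real output:
sample='                                      Name  Status  Components
gptid/aaaa1111-0000-11e4-8000-000000000001     N/A  ada0p2
gptid/bbbb2222-0000-11e4-8000-000000000002     N/A  ada1p2'
printf '%s\n' "$sample" | gptid_for_device ada1
```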



Send config backup to email

Based on https://forums.freenas.org/index.php?threads/nightly-check-of-freenas-database.19999/

This script checks the configuration file integrity then sends a copy of it and md5 + sha256 checksums in a tar file to the email address if everything is ok, or sends an error message otherwise.

Because of the method used to attach the file it works with Gmail but doesn't work with Outlook.com and Thunderbird (other email clients haven't been tested).

The script: http://pastebin.com/syF2JeAU Don't forget to put your email address in the parameters section at the top of the script ;)
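The "copy + checksums in a tar file" part can be sketched as follows. This is a simplified illustration under several assumptions and not the pastebin script itself: checksum and make_backup_tar are made-up names, the integrity check and the email step are only shown as comments, and the helper falls back to md5sum/sha256sum so that it can also be tried on a non-FreeBSD machine.

```shell
#!/bin/sh
# checksum BSD_TOOL GNU_TOOL FILE
# Prints only the hash, using the FreeBSD tool (md5, sha256) when it exists
# and the GNU one (md5sum, sha256sum) otherwise.
checksum() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" -q "$3"
    else
        "$2" "$3" | awk '{ print $1 }'
    fi
}

# make_backup_tar CONFIG_FILE WORK_DIR
# Copies the config to WORK_DIR, writes its md5 and sha256 next to it and
# packs the three files into WORK_DIR/config_backup.tar.
make_backup_tar() {
    config=$1; workdir=$2
    name=$(basename "$config")
    cp "$config" "$workdir/$name" &&
    checksum md5 md5sum "$workdir/$name" > "$workdir/$name.md5" &&
    checksum sha256 sha256sum "$workdir/$name" > "$workdir/$name.sha256" &&
    tar -cf "$workdir/config_backup.tar" -C "$workdir" \
        "$name" "$name.md5" "$name.sha256"
}

# On the server the config is an SQLite database, so the integrity check
# could be done first with something like (prints "ok" when it passes):
#   sqlite3 /data/freenas-v1.db "pragma integrity_check;"
# and then:
#   make_backup_tar /data/freenas-v1.db /tmp/config_backup
```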



Misc

### For those who want a self-adaptive script for the drives you can follow this tutorial: http://pastebin.com/7RFkrxJU

### For those who want a self-adaptive script for the ZPools you can follow this tutorial: http://pastebin.com/VpCLUtry
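To give a rough idea of what "self-adaptive" means here (this is a sketch of one possible approach and not necessarily what the two tutorials do): instead of hardcoding the lists, the script can ask the system for them. On FreeBSD, `sysctl -n kern.disks` prints the disk names on one line and `zpool list -H -o name` prints one pool name per line. The list_drives filter below is a made-up helper that keeps only the ada and da devices.

```shell
#!/bin/sh
# list_drives: reads a whitespace-separated list of disk names on stdin and
# prints the ada*/da* ones, one per line, sorted. Hypothetical helper.
list_drives() {
    tr ' ' '\n' | grep -E '^(ada|da)[0-9]+$' | sort
}

# On the server:
#   drives=$(sysctl -n kern.disks | list_drives)
#   pools=$(zpool list -H -o name)
# Here a canned sample stands in for the kern.disks value:
printf 'da1 da0 ada0 cd0\n' | list_drives
```

The filter drops things like optical drives (cd0) that have no SMART data; adapt the pattern if you boot from a USB stick that shows up as a da device and you don't want it in the report.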

### I use CRON to run the scripts automatically after the automated SMART tests and scrubs, according to this schedule:
Code:
+===================+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+
|task          day->|01   |02   |03   |04   |05   |06   |07   |08   |09   |10   |11   |12   |13   |14   |15   |16   |17   |18   |19   |20   |21   |22   |23   |24   |25   |26   |27   |28   |29   |30   |31   |
+===================+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+=====+
|boot scrub         |04:00|     |     |     |     |     |     |     |     |     |     |     |     |     |     |04:00|     |     |     |     |     |     |     |     |     |     |     |     |     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|pool scrub         |05:00|     |     |     |     |     |     |     |     |     |     |     |     |     |     |05:00|     |     |     |     |     |     |     |     |     |     |     |     |     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|long smart test    |     |04:00|     |     |     |     |     |     |     |     |     |     |     |     |     |     |04:00|     |     |     |     |     |     |     |     |     |     |     |     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|short smart test   |     |     |     |     |     |     |06:00|     |     |     |     |06:00|     |     |     |     |     |     |     |     |     |06:00|     |     |     |     |06:00|     |     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|send smart report  |     |     |07:00|     |     |     |     |07:00|     |     |     |     |07:00|     |     |     |     |07:00|     |     |     |     |07:00|     |     |     |     |07:00|     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|send zpool report  |     |     |07:01|     |     |     |     |     |     |     |     |     |     |     |     |     |     |07:01|     |     |     |     |     |     |     |     |     |     |     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|send ups report    |     |     |07:02|     |     |     |     |     |     |     |     |     |     |     |     |     |     |07:02|     |     |     |     |     |     |     |     |     |     |     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|send config backup |     |     |07:03|     |     |     |     |     |     |     |     |     |     |     |     |     |     |07:03|     |     |     |     |     |     |     |     |     |     |     |     |     |
+-------------------+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
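For reference, the four "send ..." rows of the table translate to the following schedule fields. The script names are placeholders (only the /root/scripts/ directory comes from the example path used in this post), and on FreeNAS the tasks should still be created in the web GUI; the function just prints the equivalent crontab lines so you can read the values off them.

```shell
#!/bin/sh
# report_crontab: prints crontab-style lines for the four report tasks of
# the table above. Script names are placeholders, not the real file names.
report_crontab() {
    cat <<'EOF'
# min hour day-of-month     month weekday command
0     7    3,8,13,18,23,28  *     *       /root/scripts/smart_report.sh
1     7    3,18             *     *       /root/scripts/zpool_report.sh
2     7    3,18             *     *       /root/scripts/ups_report.sh
3     7    3,18             *     *       /root/scripts/config_backup.sh
EOF
}

report_crontab
```

The SMART report runs the day after each SMART test (short or long), and the other reports run two days after each scrub.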



### Of course you can personalize the scripts to show or hide the values you want. It's pretty simple, but if you have a problem you can post below and I'll help you ;)

### Feel free to suggest improvements :)



Script basics

I recommend using PuTTY (or a UNIX system) to SSH in with a full-screen window (to avoid breaking long lines in the script) and the nano text editor. The CLI integrated in the web GUI is full of bugs; don't use it for more than quickly running a few commands to test something.

Do not put your scripts in /bin or other system directories, because if you ever inadvertently edit or delete a system script or binary you won't like the result... And especially on FreeNAS I wouldn't use /bin because of the updates, the /base thing, etc. I recommend putting your scripts either on one of your pools or in a directory in the home directory of one of the users.

What I like to do on my UNIX systems is to create a directory "scripts" in my home directory to put all my scripts, the path will then look like this: /home/your_user_name/scripts/

I recommend against using special characters and/or spaces in the directories and files names, they can only bring problems. Remember too that UNIX systems are case sensitive.
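If you want to check a name before using it, a tiny helper like this one does the job. It is only an illustration; is_safe_name is a made-up function and the allowed character set (letters, digits, dot, underscore, hyphen) is my own conservative choice.

```shell
#!/bin/sh
# is_safe_name NAME
# Succeeds when NAME is not empty and contains only letters, digits,
# dots, underscores and hyphens (so no spaces or special characters).
is_safe_name() {
    case $1 in
        ''|*[!A-Za-z0-9._-]*) return 1 ;;
        *) return 0 ;;
    esac
}

for name in smart_report.sh "my script.sh"; do
    if is_safe_name "$name"; then
        printf '%s: ok\n' "$name"
    else
        printf '%s: please rename\n' "$name"
    fi
done
```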

So, first create a directory (if you want) to put your scripts in: mkdir scripts

Go into this directory (cd scripts) and open a new file: nano -w your_script.sh

Copy the raw paste data of the script on pastebin (at the bottom of the page), be careful to not forget the first or last line, the best thing to do is to click on the text, do a Ctrl + A and then a Ctrl + C.

Paste the data into the file you created by right-clicking. (On UNIX systems, anything you highlight is copied to the clipboard, and a middle-click, or a right-click in PuTTY, pastes whatever is currently in the clipboard.)

Quickly check that everything is as it should be, make the changes you want (email address, ...), then save and quit the editor (for nano: Ctrl + O then Enter, and then Ctrl + X)

Add execution rights to the file: chmod +x your_script.sh

If you need you can change the owner (and group; just omit the ":group" part if you don't need it): chown user:group your_script.sh

Now you should have a working script, you can test it like this: ./your_script.sh
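The steps above, put together as one small sketch. The install_script function is a made-up helper: it copies a file you have already downloaded instead of pasting into nano, and the paths in the example call are placeholders.

```shell
#!/bin/sh
# install_script DIRECTORY NAME SOURCE_FILE
# Creates the scripts directory if needed, copies the downloaded script
# into it (this stands in for the paste-into-nano step) and makes it
# executable.
install_script() {
    dir=$1; name=$2; source_file=$3
    mkdir -p "$dir" || return 1
    cp "$source_file" "$dir/$name" || return 1
    chmod +x "$dir/$name"
}

# Example (placeholder paths), followed by a test run:
#   install_script "$HOME/scripts" your_script.sh /tmp/downloaded_paste.txt
#   "$HOME/scripts/your_script.sh"
```

Don't forget to edit the parameters section of the copied file before the first run.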

Please note that several (all?) of the scripts need root rights, so use the root account to execute them.

You can now add a CRON task in the web GUI. Make sure you select "root" as the user and use the absolute path to your script as the command (for example: /root/scripts/your_script.sh).



Changelog

You can find the changelog here: http://pastebin.com/QupKTWAK
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I've actually written a script for gathering SMART info and putting it in a table. I have to add SAS drive support though, as iXsystems uses a lot of SAS (which outputs different SMART info).

Moved to the scripts section!
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Interesting, I didn't know that SAS drives report different SMART info.

I saw you mention in another topic a guide you've written about interpreting the SMART info for the noobs (and I had the same idea :D). It would be pretty interesting to read but I can't find it, is this normal?

Thank you ;)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
It's not released yet.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Oh, ok.
 

titan_rw

Neophyte Sage
Joined
Sep 1, 2012
Messages
591
I can see that hardcoding the devices (da0, da1, etc) is going to be problematic.

You're going to have to remember to update the script if you make any hardware changes. For example, if I have extra ports, and change a drive from being connected directly to the MB to going through a SAS HBA, there'll be one less ADA and one more DA device.

Kind of a pain to need to constantly edit monitoring scripts.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
It's just that I prefer having full control over the scripts, and I don't plan to change that kind of thing very often. But I understand that it's not the case for everyone, so I added the method for a self-adaptive script in the "Misc" section at the end of the first post ;)

Thanks for the comment :)
 

nick779

Member
Joined
Dec 17, 2014
Messages
189
My God....

THANK YOU THANK YOU THANK YOU
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
You're welcome ;)
 

nick779

Member
Joined
Dec 17, 2014
Messages
189
Question: none of my drives seem to support the Seek Errors, Total Seeks, High Fly Writes, and Cmd Timeout fields in the main output. I attempted to remove these fields and ended up with awk errors.
Would you mind modifying the script to remove those 4 columns in the quick glance output?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
This should do the trick:
Code:
# One summary row per drive: identity (-i) plus SMART attributes (-A).
# $10 is the RAW_VALUE column of the smartctl attribute table.
for drive in $drives
do
    (
        (echo "device: ${drive}"; smartctl -A -i "/dev/${drive}") | awk '\
        /device:/{device=$2} \
        /Serial Number:/{serial=$3} \
        /Temperature_Celsius/{temp=$10} \
        /Power_On_Hours/{onHours=$10} \
        /Start_Stop_Count/{startStop=$10} \
        /Spin_Retry_Count/{spinRetry=$10} \
        /Reallocated_Sector/{reAlloc=$10} \
        /Current_Pending_Sector/{pending=$10} \
        /Offline_Uncorrectable/{offlineUnc=$10} \
        END {
            printf "|%-6s|%-15s| %s |%5s|%5s|%5s|%7s|%7s|%8s|\n",
            device, serial, temp, onHours, startStop, spinRetry, reAlloc, pending, offlineUnc;
        }'
    ) >> "${logfile}"
done


Edit: I've somewhat optimized the script by using -A -i instead of -a for the smartctl options.
 

nick779

Member
Joined
Dec 17, 2014
Messages
189
Works perfectly.

Edit: I don't want to clutter this thread, so I'm making a new one for the cron issue.
 

Cellobita

Member
Joined
Jul 15, 2011
Messages
47
Excellent scripts! An interesting option for the SMART email report would be to print a * after the device name if the reallocated, pending or uncorrectable sectors values for it are not 0 (zero). This would allow you to see at a glance if any drive has bad or flaky sectors. Is this doable?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Thanks! Oh yeah, excellent idea, I can definitely add that :)
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I've modified the script according to your idea and added a max temperature at the same time (because why not? :D)
 

Cellobita

Member
Joined
Jul 15, 2011
Messages
47
Works great - your scripts are much appreciated!
 

marian78

Member
Joined
Jun 30, 2011
Messages
207
Thanks, dear sir.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Thanks ;)

I added the time to the UPS script because my mailbox doesn't show the seconds by default (I have to go through a few clicks to see them, which is just a pain in the ass). When the UPS goes on battery for less than a minute, the emails show the same time, so I can't easily see how long the UPS has been on battery. (I hacked the default UPS script to send the same info as this one when the UPS status changes; you can see how in this thread.)
 

adrianwi

Neophyte Sage
Joined
Oct 15, 2013
Messages
1,178
Great work! Thanks.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I completely recreated the ZPool script to include a summary block as in the SMART script (it just took one afternoon and one evening after all... xD)

Thanks Adrian ;)

Edit: I added the critical symbol to the summary of the SMART script like in the ZPool script.
 