Scripts to report SMART, ZPool and UPS status, HDD/CPU T°, HDD identification and backup the config

Magius · May 1, 2022

dak180 said:
Could you try out my first pass and comment in the GitHub issue tracking this? (I have no SAS drives and therefore no way to test myself.)

I just tested it and commented on GitHub. High-level summary: it was ~98% perfect. I left you a couple of notes with line numbers to fix minor issues, but otherwise it looks awesome! I also left some notes on additional things you could add in the future if you like, but right now none of them are output in the JSON format (which is why I didn't mention them above when giving the JSON parsing instructions). I probably should have added those to the bug report ticket last night; I'll see about updating the ticket or making a new one...

I used to live and breathe this stuff and have written several tools of my own for parsing and interpreting health data, so if you have any other questions please reach out. There's a lot of vendor proprietary stuff I wish I could share publicly to enhance the state of the tools we have available (including my own pySMART) but a lot of that stuff (particularly for SSDs!) they keep close to the chest and only share under NDA/PIA. :(

bollar · May 9, 2022

FYI, I tried running this on SCALE and got a "bc is missing, please install" error. bc isn't included in SCALE and getting it would require a manual download and installation using dpkg.

Maybe there's another way to reformat the hours data?

dak180 · May 9, 2022

bollar said:
FYI, I tried running this on SCALE and got a "bc is missing, please install" error. bc isn't included in SCALE and getting it would require a manual download and installation using dpkg.

A known issue; see also this issue on Jira. It will require 10 votes for the devs to look at it.

bollar · May 9, 2022

dak180 said:
A known issue; see also this issue on Jira. It will require 10 votes for the devs to look at it.

Okay. I gave it the first vote…

bollar · May 9, 2022

Given awk is used earlier in the script and the math seems easy, it might be able to be used instead of bc. I'll think about that.

Maybe something like this:

Code:

local yrs="$(awk -v awkvar="$onHours" 'BEGIN { print awkvar / 8760; }')"

dak180 · May 9, 2022

bollar said:
Given awk is used earlier in the script

One of my goals is to eliminate as much awk as possible, for both speed and clarity. The only reason I have not yet taken it out of the zfs status section is that I would like to wait until they add a JSON output option (currently in progress) and then change all the parsing once.
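
For what it's worth, the hours-to-years conversion discussed above can probably be done with shell integer arithmetic alone, so neither bc nor awk is needed. This is only a minimal sketch: hours_to_years is a hypothetical helper name, not something from the script, and it assumes two decimal places are enough and that the input is a plain non-negative integer.

```shell
#!/bin/sh
# Hypothetical helper (not from the script): convert power-on hours to
# years with two decimal places using only integer shell arithmetic.
hours_to_years() {
    h2y_hours="$1"
    # Scale by 100 before dividing so two decimals survive the integer
    # division; adding 4380 (half of 8760) rounds to the nearest hundredth.
    h2y_scaled=$(( (h2y_hours * 100 + 4380) / 8760 ))
    printf '%d.%02d\n' "$(( h2y_scaled / 100 ))" "$(( h2y_scaled % 100 ))"
}

hours_to_years 8760     # prints 1.00
hours_to_years 12345    # prints 1.41
```

The 8760 divisor (365 * 24) is the same one used in the awk suggestion; leap years are ignored in both.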

Magius · May 13, 2022

bollar said:
Given awk is used earlier in the script and the math seems easy, it might be able to be used instead of bc. I'll think about that.

Maybe something like this:

Code:
local yrs="$(awk -v awkvar="$onHours" 'BEGIN { print awkvar / 8760; }')"

I submitted a handful of bug fixes and feature adds over the last couple weeks and almost all of them used awk. dak accepted all the contributions, but before merging my code he swapped out all the awks for something else. He seems pretty set on getting rid of all the awk, which honestly isn't a bad idea. I'm sure he'd accept a patch if you put one together.

Here's an example way of parsing column formatted data without awk, which seems like 90% of what we all use awk for:

Code:

# echo "this is example:      column output" | awk '{print $4" "$5}'
column output
# echo "this is example:      column output" | tr -s " " | cut -d ' ' -sf '4 5'
column output
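
The same split can also be done with no external commands at all, using the shell's own read builtin. A minimal sketch, assuming the fields are separated by spaces or tabs; print_fields_4_5 is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the 4th and 5th whitespace-separated fields
# of a string without calling awk, tr or cut.
print_fields_4_5() {
    # read splits on IFS and collapses runs of blanks; the leading "_"
    # names discard fields 1-3 and the trailing "_" swallows any extras.
    read -r _ _ _ pf_f4 pf_f5 _ <<EOF
$1
EOF
    printf '%s %s\n' "$pf_f4" "$pf_f5"
}

print_fields_4_5 "this is example:      column output"    # prints: column output
```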

Ericloewe · May 13, 2022

As someone with no strong opinions on awk, why the awk hate?

Magius · May 13, 2022

Ericloewe said:
As someone with no strong opinions on awk, why the awk hate?

I can't speak for dak and I personally use awk all the time (like I said he had to remove it from almost all the patches I submitted :)). I believe it's just a personal choice on his part not to invoke another programming language inline like we often do. Keep the code in "pure bash"?
He also said something above about performance, which I've never measured to compare, but I could see if you're invoking awk hundreds of times in a script, setting up a whole new programming environment each time just to split a string into parts, that overhead could probably add up? I'd be curious to run some timing tests and see what the impact is, but at the end of the day it's his script so I'll follow his rules :)
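
A rough way to run that kind of timing test might look like the sketch below. All three function names are hypothetical, the iteration count is arbitrary, and the numbers will vary by system, so treat it as a starting point and not as a benchmark result.

```shell
#!/bin/sh
# Split a line by spawning awk (one new process per call).
split_with_awk() {
    printf '%s\n' "$1" | awk '{print $4" "$5}'
}

# Split the same line with the shell's read builtin (no new process).
split_with_shell() {
    read -r _ _ _ sw_f4 sw_f5 _ <<EOF
$1
EOF
    printf '%s %s\n' "$sw_f4" "$sw_f5"
}

# Run a command N times, discarding its output.
run_n() {
    rn_count="$1"; shift
    rn_i=0
    while [ "$rn_i" -lt "$rn_count" ]; do
        "$@" >/dev/null
        rn_i=$(( rn_i + 1 ))
    done
}

# Example comparison using bash's time keyword:
#   time run_n 1000 split_with_awk   "this is example:      column output"
#   time run_n 1000 split_with_shell "this is example:      column output"
```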

dak180 · May 13, 2022

Ericloewe said:
As someone with no strong opinions on awk, why the awk hate?

More than anything else it is about easy future readability and maintainability; see my first refactor commit for an example. It also makes things like shellcheck.net much more useful.

creoleninja · Jun 30, 2022

THANK YOU!! Helped me out 7 years later. Mahalo!

qmcb23YR · Jul 1, 2022

@dak180 I more or less managed to make the script work (for my purposes) under SCALE.

installed bc (I know, not great)
manually defined my disks due to a sysctl error, i.e. changed <<< "$(for drive in $(sysctl -n kern.disks | sed -e 's:nvd:nvme:g'); do to <<< "$(for drive in sda sdb ...; do
changed sed -i '' -e to sed -i -e
Tried to make all the date instances work, but failed with the 'last scrub' column, as date -jf does not work under Linux

The only thing that is not displaying correctly is the 'last scrub age' column. I'll keep searching, but so far I have yet to find the correct Linux syntax. Any idea what to use instead of date -jf?
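
On the sed -i difference in the list above: BSD sed wants a separate backup-suffix argument after -i (hence the empty ''), while GNU sed does not accept one. One way to sidestep the difference is to skip -i and go through a temporary file. A minimal sketch, assuming mktemp is available; sed_in_place and the temp file template are hypothetical names, not part of the script.

```shell
#!/bin/sh
# Hypothetical wrapper: apply one sed expression to a file "in place"
# without relying on the non-portable -i flag.
sed_in_place() {
    sip_expr="$1"
    sip_file="$2"
    sip_tmp="$(mktemp "${TMPDIR:-/tmp}/sedtmp.XXXXXX")" || return 1
    if sed -e "$sip_expr" "$sip_file" > "$sip_tmp"; then
        # Overwrite the contents rather than moving the temp file, so the
        # original file keeps its owner and permissions.
        cat "$sip_tmp" > "$sip_file"
        rm -f "$sip_tmp"
    else
        rm -f "$sip_tmp"
        return 1
    fi
}

# Example: sed_in_place 's:nvd:nvme:g' /path/to/some/file
```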

dak180 · Jul 1, 2022

qmcb23YR said:
I more or less managed to make the script work (for my purposes) under SCALE.

Post your edits to a fork on github (ideally with the commit(s) referencing the relevant issue) and I will see what I can do.

qmcb23YR said:
installed bc (I know, not great)

[NAS-115175] - iXsystems TrueNAS Jira

qmcb23YR · Jul 1, 2022

I do not have a GitHub account.

Anyway, it's solved after a bit of trial and error--turns out Linux's date does not much care about the input format, as long as the day/year are switched around vs what's in the script, else there's an invalid date error. This might depend on the user's time zone preferences, I have not done any testing around this.

The below works, and the script is now fully functional under SCALE, at least for the reporting functions.

Code:

            # Convert time/datestamp format presented by zpool status, compare to current date, calculate scrub age
            if [ "${multiDay}" -ge 1 ] ; then
                scrubDate="$(echo "${statusOutput}" | grep "scan:" | awk '{print $15"-"$14"-"$17" "$16}')"
            else
                scrubDate="$(echo "${statusOutput}" | grep "scan:" | awk '{print $13"-"$12"-"$15" "$14}')"
            fi
            scrubTS="$(date -d "${scrubDate}" +%s)"
            currentTS="$(date +%s)"
            scrubAge="$((((currentTS - scrubTS) + 43200) / 86400))"
            if [ "${multiDay}" -ge 1 ] ; then
                scrubTime="$(echo "${statusOutput}" | grep "scan" | awk '{print $6" "$7" "$8}')"
            else
                scrubTime="$(echo "${statusOutput}" | grep "scan" | awk '{print $6}')"
            fi
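
If the goal is one code path that runs on both CORE and SCALE, something along these lines might work. It is only a sketch under a few assumptions: the scan line ends with the weekday, month, day, time and year (which matches the field numbers in the awk calls above), GNU date is detected by its --version flag, and the BSD input format string is my own guess at a matching strptime pattern. All three function names are hypothetical.

```shell
#!/bin/sh
# Extract "DD-Mon-YYYY HH:MM:SS" from a zpool status scan line.
# Counting from the end of the line means the single-day and multi-day
# duration layouts need no separate branches. The body runs in a subshell
# so set -f and the positional parameters do not leak to the caller.
scrub_date_from_scan_line() (
    set -f
    set -- $1
    [ "$#" -ge 4 ] || exit 1
    shift $(( $# - 4 ))               # keep: month, day, time, year
    printf '%s-%s-%s %s\n' "$2" "$1" "$4" "$3"
)

# Convert that string to epoch seconds with whichever date is installed.
to_epoch() {
    if date --version >/dev/null 2>&1; then
        date -d "$1" +%s                            # GNU date (Linux)
    else
        date -j -f '%d-%b-%Y %H:%M:%S' "$1" +%s     # BSD date (assumed format)
    fi
}

# Same rounding as the snippet above: nearest whole day.
scrub_age_days() {
    printf '%s\n' "$(( (($1 - $2) + 43200) / 86400 ))"
}

# Example:
#   scrubDate="$(scrub_date_from_scan_line "$(echo "${statusOutput}" | grep "scan:")")"
#   scrubAge="$(scrub_age_days "$(date +%s)" "$(to_epoch "${scrubDate}")")"
```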
