
multi_report.sh version for Core and Scale 3.0

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994

TooMuchData

Contributor
Joined
Jan 4, 2015
Messages
188
TooMuchData updated multi_report.sh version for Core and Scale with a new update entry:

Joe is the Eveready Bunny of scripting!

# v1.6d (05 October 2022)
# - Thanks goes out to ChrisRJ for offering some great suggestions to enhance and optimize the script.
# - Updated gptid text and help text areas (clarifying information)
# - Updated the -dump parameter to -dump [all] and included non-SMART attachments.
# - Added Automatic UDMA_CRC, MultiZone, and Reallocated Sector Compensation to -config advanced option K.
# - Fixed Warranty Date always showing as expired.
# - Added Helium and Raw Read Error Rates to...

Read the rest of this update entry...
 

TooMuchData

Contributor
Joined
Jan 4, 2015
Messages
188
TooMuchData updated multi_report.sh version for Core and Scale with a new update entry:

Corrected Version 1.6d

# v1.6d (05 October 2022)
# - Thanks goes out to ChrisRJ for offering some great suggestions to enhance and optimize the script.
# - Updated gptid text and help text areas (clarifying information)
# - Updated the -dump parameter to -dump [all] and included non-SMART attachments.
# - Added Automatic UDMA_CRC, MultiZone, and Reallocated Sector Compensation to -config advanced option K.
# - Fixed Warranty Date always showing as expired.
# - Added Helium and Raw Read Error Rates to...

Read the rest of this update entry...
 

TooMuchData

Contributor
Joined
Jan 4, 2015
Messages
188
TooMuchData updated multi_report.sh version for Core and Scale with a new update entry:

Corrected and Improved

# v1.6d-1 (08 October 2022)
# - Bug Fix for converting multiple numbers from Octal to Decimal. The previous process worked "most" of the time
# -- but we always aim for 100% working.
#
# The multi_report_config file is compatible with versions back to v1.6d.
#
# v1.6d (05 October 2022)
# - Thanks goes out to ChrisRJ for offering some great suggestions to enhance and optimize the script.
# - Updated gptid text and help text areas (clarifying information)
# - Updated the -dump...

Read the rest of this update entry...
 

TooMuchData

Contributor
Joined
Jan 4, 2015
Messages
188
TooMuchData updated multi_report.sh version for Core and Scale with a new update entry:

More fixes from The Schmuck

# v1.6d-2 (09 October 2022)
# - Bug fix for NVMe power on hours.
# --- Unfortunately as the script gets more complex it's very easy to induce a problem. And since I do not have
# --- a lot of different hardware, I need the users to contact me and tell me there is an issue so I can fix it.
# --- It's unfortunate that I've had two bug fixes already but them's the breaks.
# - Updated to support more drives' Min/Max temps and display the non-existent value if nothing is obtained vice...

Read the rest of this update entry...
 

Deeda

Explorer
Joined
Feb 16, 2021
Messages
65
Thanks Joe, just updated to the latest version and it's working well.
 

isopropyl

Contributor
Joined
Jan 29, 2022
Messages
159
What is the proper way to run this with TrueNAS?
I see the field to input the e-mail address, and I have e-mail notifications set up. I entered the e-mail address in that field. So my question is simply: how do I set the script to run, and where do I place it?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
What is the proper way to run this with TrueNAS?
I see the field to input the e-mail address, and I have e-mail notifications set up. I entered the e-mail address in that field. So my question is simply: how do I set the script to run, and where do I place it?
I created a folder for the script and have a cronjob running it every week.
Code:
######### INSTRUCTIONS ON USE OF THIS SCRIPT
#
# This script will perform three main functions:
# 1: Generate a report and send an email on your drive(s) status.
# 2: Create a copy of your Config File and attach to the same email.
# 3: Create a statistical database and attach to the same email.
#
# In order to configure the script properly read over the User-definable Parameters before making any changes.
# Make changes as indicated by the section instructions.
#
# To run the program from the command line, use ./program_name.sh [-h] for additional help instructions,
# and [-config] to run the configuration routine (highly recommended).
#
# If you create an external configuration file, you never have to edit the script,
# so how many times do I need to say it is highly recommended?  And I may force the
# change to require the external configuration file.
#
# You may need to make the script executable using "chmod +x program_name.sh"
#
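
A minimal sketch of that kind of setup, assuming a dataset path such as /mnt/tank/scripts (yours will differ) and a weekly schedule picked purely as an example:
Code:
# Illustrative only -- adjust the dataset path to your own pool layout.
mkdir -p /mnt/tank/scripts
cp multi_report.sh /mnt/tank/scripts/
chmod +x /mnt/tank/scripts/multi_report.sh

# Run the configuration routine once to create the external config file (recommended above).
/mnt/tank/scripts/multi_report.sh -config

# Then schedule it weekly, either as a Cron Job task in the TrueNAS web UI
# or with a crontab entry such as (every Monday at 02:00):
# 0 2 * * 1 /mnt/tank/scripts/multi_report.sh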
 
Last edited:

Deeda

Explorer
Joined
Feb 16, 2021
Messages
65
Hi Joe,

Running the latest version, have just noticed the email reports for one of my servers report the pool size incorrectly. Please see screenshot attached.
 

Attachments

  • Screenshot 2022-10-26 202232.png
    Screenshot 2022-10-26 202232.png
    14.9 KB · Views: 82

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Hi Joe,

Running the latest version, have just noticed the email reports for one of my servers report the pool size incorrectly. Please see screenshot attached.
Can you provide me some details so I can fix it up? I need the files created this way so I can pass them through the script on my end to find out what the issue is; it can't be a cut/paste operation, as that at times will not be processed exactly the same. Sorry that I'm requesting a lot of data from you, but I haven't heard of anyone else having this issue, so I'm perplexed, especially if the other pools are reporting correctly.

I need to know the multi_report_config.txt value under General Settings -> pool_capacity="zfs" or ="zpool". The default is "zfs". You can change this value to "zpool" to see what the results are, but I prefer the "zfs" value as it lines up with the TrueNAS values. "zpool" was the older way this script displayed the data.
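
If you want to eyeball the difference between the two capacity sources yourself, something like this will show it (Pool2 is just the pool name used below; the column choices here are illustrative, not what the script itself queries):
Code:
# Capacity as zfs sees it (the script's default source):
zfs list -o name,used,available Pool2
# Capacity as zpool sees it (the older "zpool" style):
zpool list -o name,size,allocated,free,capacity Pool2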

The commands below will place the files in the location you run the commands from. You could place them in /tmp/ (e.g., /tmp/pool_status.txt) if you desire and they will be deleted upon reboot, but you need access to them to copy the files off the system. PM me if you need further assistance.

Code:
zpool status Pool2 > pool_status.txt
zpool list -H -p -o capacity Pool2 > pool_used.txt
zpool list -H -o size Pool2 > pool_size.txt
zpool list -H -o free Pool2 > pool_free.txt
zfs list Pool2 > zfs_list.txt

Then attach the file in the forums and I'll grab it. I might not be able to do anything until Friday, busy week at work so I'm getting home late.

If I need more data then I will PM you. Actually, I will PM you with an updated script when I fix the issue to make sure it works. If it does, it will be in the next version release, which I will likely make happen in the next month.

-Joe
 

awasb

Patron
Joined
Jan 11, 2021
Messages
415
Please use >> instead of >.

>> will append data.
> will overwrite.
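
A trivial illustration, nothing specific to this script:
Code:
echo "first"  > demo.txt    # > creates or overwrites demo.txt
echo "second" >> demo.txt   # >> appends to demo.txt
cat demo.txt                # shows both lines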
 
Last edited:

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Please use >> instead of >.

>> will append data.
> will overwrite.
I do not want appended data. That does not help me. They need to be clean files for me to process them. Do it how I listed, please.
 

awasb

Patron
Joined
Jan 11, 2021
Messages
415
Ah. Sorry. Misread that. It's a one time action. Again: Sorry.
 

Deeda

Explorer
Joined
Feb 16, 2021
Messages
65
Hi Joe,

I've attached the files requested.

In my config file pool_capacity="zfs"
 

Attachments

  • zfs_list.txt
    84 bytes · Views: 70
  • pool_free.txt
    6 bytes · Views: 74
  • pool_size.txt
    6 bytes · Views: 96
  • pool_used.txt
    2 bytes · Views: 69
  • pool_status.txt
    680 bytes · Views: 104

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
I've attached the files requested.
Thanks. The data you provided looks correct; now I need to figure out what blasted math is wrong. I will be able to feed your exact data into the script to troubleshoot it. Math in BASH sucks!

Ah. Sorry. Misread that. It's a one time action. Again: Sorry.
No problem. The reason I ask for the data in this way is that when 'awk' looks through it, any special/hidden characters can throw me for a loop, so I need the data presented in its exact format. Cut and paste often interprets some characters and will rain hell all over me as I'm scratching my head trying to figure out why I can't replicate the problem. I've included the -dump parameter in the script so it will automatically dump the data I typically need (drive data), but it does not include zpool info, YET. It should on the next version, but I just hope I don't need to collect all that data many more times.
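
For anyone curious how such hidden characters show up, a quick illustrative way to inspect a captured file before awk ever sees it (file name taken from the commands above):
Code:
# Show non-printing characters:
cat -v pool_status.txt
# Or look at the raw byte-by-byte view:
od -c pool_status.txt | head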
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
@Deeda You have a message and a file to see if it fixes the issue.
 

syruprise

Cadet
Joined
Jul 7, 2019
Messages
3
First off, thanks a hundred million, because this multi_report script has made keeping track of both my Core & Scale systems much nicer. These are all super minor things, but I figured I would share my system's quirks and such.

dax/daxx drives jumbled:

They appear jumbled in the HDD Summary Report and in the SMART summary report on my Core system. It can be fixed in statistical_data by just adding a zero in front so it reads da0x/da0xx. I don't know if that would work here with this email formatting; there is probably a more elegant way to do it, but I don't know.

Helium on Toshiba MG0# drive:
These Toshiba drives use SMART attribute IDs 23 & 24 for helium. They read from zero instead of one hundred, but it would still be nice to see them in the Summary Report for a quick glance in case of a statistical change. On the Core system.

Reserve NAND blocks on Micron/Crucial SSDs:
On Crucial MX / Micron SSDs, SMART attribute ID 180 is Unused Reserve NAND blocks. My understanding is that this counts down the overprovisioned NAND blocks left on the drive. It would be nice to have it on the SSD summary report on the SCALE system. It would be even better to have a system like you did with UDMA CRC errors, where you could set a number and warn if it has dropped. That might be too much of an edge case to put in all that work, I admit; SSDs in my experience just randomly die anyway.
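
For reference, pulling a single raw attribute value out of smartctl is roughly this (attribute ID 180 and /dev/da0 are just the examples from this post; column 10 is RAW_VALUE in smartctl's attribute table):
Code:
smartctl -A /dev/da0 | awk '$1 == 180 {print $10}'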

Commented email section:
I had a hell of a time getting the script to work without failing when I first tried it out. It took a hot minute, but eventually I figured out that email providers (gmail/outlook/etc.) were NOT liking the (from="TrueNAS@local.com") part. Once I switched the from section to (from="myemail@address.com") it has worked fine. You might put something in the comment section above, like: "The from address does not need to be changed, but if sending fails, just enter your own email address here as well." I have no picture of the failure, but I could maybe recreate it if needed.
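
In other words, something along these lines in the settings (the exact layout of that section in the script may differ; both addresses are the placeholders from this post):
Code:
# The from address usually does not need to be changed, but if sending fails,
# try your own address here as well:
# from="TrueNAS@local.com"        # rejected by some providers (gmail/outlook/etc.)
from="myemail@address.com"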

Many thanks!
 

Attachments

  • micron-ssd.png
    micron-ssd.png
    27.4 KB · Views: 76
  • toshiba-drives.png
    toshiba-drives.png
    31.9 KB · Views: 97
  • da-order.png
    da-order.png
    6.2 KB · Views: 94

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
These are all super minor things but figured i would share my systems quirks and stuff.
While they might be small, minor things, they are things nonetheless. To address some of these:

dax/daxx drives jumbled:
It can be fixed in statistical_data by just adding a zero in front so it reads da0x/da0xx.
Adding a leading zero would make the device name technically incorrect and would definitely cause confusion for anyone troubleshooting a drive. For example, if I have the script report that drive /dev/da01 has a bad sector and then manually run the command smartctl -a /dev/da01, it will return an error that the device does not exist. But I appreciate you trying to offer a solution; most people do not make that effort.

That is odd; for everyone else who has used it in both Core and Scale the drives are sorted, at least in the results people have sent me. I know early versions were not sorted. I will look at the sort routine to make sure I'm sorting properly and that I didn't break it at some point in time.
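
As a side note on the sorting itself: a plain lexical sort jumbles names like da2/da10, whereas a version sort keeps them in device order (assuming the system's sort supports -V, which recent FreeBSD and Linux versions do):
Code:
printf '%s\n' da0 da2 da10 da11 | sort      # lexical: da0 da10 da11 da2
printf '%s\n' da0 da2 da10 da11 | sort -V   # version sort: da0 da2 da10 da11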

Helium on Toshiba MG0# drive:
Reserve NAND blocks on Micron/Crucial SSDs:
When some data is listed as "unknown attribute" I can't guess what it pertains to. I do not have a table of drive makes/models to do this work; that is what I rely on 'smartctl' to decode. I would need the -dump command run and the selected drive files sent my way to add them. I need to test the code to make sure it works and I don't mess something up, which is very easy to do as this script has gotten more complex each month.

Commented email section:
That is an odd problem and the first time I've heard of it; I suspect your email server does not like that address. What email server/service do you use? I'm using msn.com (now called outlook.com) now, and have used hotmail.com and gmail.com in the past, but I have no idea if they would work today. I can add a comment to address it, though.

I will send you a Conversation request (PM). I'd like to collect your data in order to update the script.
 

Cuprum

Cadet
Joined
Aug 15, 2018
Messages
6
Hi everyone!

First, thank you so much to the team making this script possible, it helps a lot for doing the follow up of my server!

As you can see in my signature, my system uses two mirrored Kingston A400 SSDs as boot drives. They show the wear level with attribute ID 231 and the attribute name SSD_Life_Left, and both are currently at 99 (as per the raw value). The issue is that the report shows "Wear Level" at 1 when emailed. See below:

[Inline screenshot: Wear Level Report.png]


As per Kingston's SMART Attribute Details (link, PDF file): the attribute indicates the approximate SSD life left where 100 = best and 1 = worst.

Is there any workaround to set the right value of the wear level in the report? For reference, I'm attaching the smartctl --all output for the drives, but if a dump or additional info is needed, please let me know.
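
For comparison, the normalized and raw values for attribute 231 can be pulled directly like this (ada0 is just the boot device name from the attachments; columns 4 and 10 are VALUE and RAW_VALUE in smartctl's attribute table):
Code:
# Print normalized VALUE and RAW_VALUE for SSD_Life_Left (ID 231):
smartctl -A /dev/ada0 | awk '$1 == 231 {print $4, $10}'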

Thanks!
 

Attachments

  • ada0.txt
    6.4 KB · Views: 77
  • ada1.txt
    6.4 KB · Views: 67

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Thanks for reporting the error. It's difficult to try to get every version of every drive out there and make the SMART data work for you.

In the meantime you could manually edit the multi_report_config.txt file and look for the value 'wearLevelCrit=9' and change it to a value of '1'.

You can also change it using the '-config' option, then a -> a -> c, and then change it from 9 to 1.
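
If editing by hand is a pain, a one-liner along these lines should do the same thing (this assumes the value is currently 9 as described above; back up the config first, and use whatever path your config file actually lives at):
Code:
sed -i.bak 's/^wearLevelCrit=9/wearLevelCrit=1/' multi_report_config.txt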

I might have a fix today, but before I make a change I need to make sure I do not break something else. When I do have a fix I will PM you and attach the updated script for you to test. I would appreciate quick feedback on it if possible, since I'm about to release a new version any day now, and if this does fix the problem I'd like to include it. Additionally, I'd like to collect some data from you in a PM for my testing when updating the script. It's good to have '-dump' drive data since I do not have all the different types of drives at my fingertips.
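
For reference, that is the same -dump option from the v1.6d changelog earlier in the thread; invoking it looks roughly like this (illustrative only; exact options may differ by version), run from the directory the script lives in:
Code:
./multi_report.sh -dump all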

-Joe
 
Last edited: