multi_report.sh version for Core and Scale 3.0

mistermanko · Mar 22, 2022

joeschmuck said:
Interesting. Out of curiosity, did you change anything like the date format of the system and language? Anything which might change the way the system handles date/time functions? I'm not sure why that would cause this; I would expect the date function to just work properly regardless. I do not recall using the underscore in the date section. And while it looks like you are running TrueNAS 12.0-U8 (Core), I do want to verify you are running Core. The date function quoted above is for Core and I would hate it if the code failed to recognize you were really running Scale.

I can't recall if I've ever changed the date and time settings, but here they are:

And can confirm I'm running core 12.0 U8. If I can be of any further help, let me know.

joeschmuck · Mar 22, 2022

mistermanko said:
I can't recall if I've ever changed the date and time settings, but here they are:
View attachment 54292
And can confirm I'm running core 12.0 U8. If I can be of any further help, let me know.

I will give this a try on my system sometime this week to see what happens. If there is a fix, hopefully it can be made easily.

joeschmuck · Mar 22, 2022

Something is wrong with my TrueNAS: I'm unable to change the time zone. Weird. Even after rebooting it still reverts back. Maybe my config file is corrupt, which is possible since I have been using it for many years without rebuilding anything. I am also using 12.0-U8, so I won't be able to look at this until the weekend. If you change the timezone to America/New_York and you no longer have the issue, then we know it's the timezone. From my search on the internet, the locale is an issue and the %b value is affected by it. Maybe I could do something else with the code, but until I can change my timezone I can't really do much except investigate a different way to fix it. "on" should be the first three letters of the month; I'm not sure if "on" is correct for Germany.

mistermanko · Mar 24, 2022

Ok I dug a little deeper in your script. The relevant line is referring to the date and time specified by the last run scrub. The GUI localization has no impact on how zpool status is displaying this time value. See here:

scan: scrub repaired 0B in 1 days 00:16:33 with 0 errors on Sat Mar 19 03:49:37 2022
The script greps it with this line:
scrubDate="$(echo "$statusOutput" | grep "scan" | awk '{print $15"-"$12"-"$13"_"$14}')"
Which results in: 19-on-Sat_Mar
So the following date conversion obviously fails.
But how can we fix it?
Maybe the awk part needs to change to awk '{print $17"-"$14"-"$15"_"$16}'
with a result of 2022-Mar-19_03:49:37 in my example, which in turn makes the date conversion work.

date -j -f "%Y-%b-%e_%H:%M:%S" "2022-Mar-19_03:49:37" "+%s" results in: 1647658177. This may be the desired value!?
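Since the script is meant to run on both Core (FreeBSD) and Scale (Debian), here is a minimal sketch of doing the same conversion on either system. The helper name to_epoch is hypothetical and not taken from multi_report.sh, and the fallback assumes GNU date is what Debian provides. LC_ALL=C is set so that %b expects English month abbreviations.

```sh
#!/bin/sh
# to_epoch: hypothetical helper, not part of multi_report.sh.
# Input format: YYYY-Mon-D_HH:MM:SS, for example 2022-Mar-19_03:49:37
to_epoch() {
  stamp="$1"
  # FreeBSD date: -j means "do not set the clock", -f gives the input format.
  if LC_ALL=C date -j -f "%Y-%b-%e_%H:%M:%S" "$stamp" "+%s" 2>/dev/null; then
    return 0
  fi
  # GNU date has no -j/-f pair, so rearrange the pieces into
  # "19 Mar 2022 03:49:37" and let -d parse that instead.
  ymd="${stamp%%_*}"
  hms="${stamp#*_}"
  year="${ymd%%-*}"
  rest="${ymd#*-}"
  mon="${rest%%-*}"
  day="${rest#*-}"
  LC_ALL=C date -d "$day $mon $year $hms" "+%s"
}

to_epoch "2022-Mar-19_03:49:37"
```

The printed number depends on the system time zone, because the scrub timestamp is interpreted as local time.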

Seems that a lot of awk positions are off. Here is another one:
scrubErrors="$(echo "$statusOutput" | grep "scan" | awk '{print $8}')"
Should be awk '{print $10}' to print 0 scrub errors instead of the last scrub duration.
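To illustrate why the positions move: the "1 days" part adds two fields to the scan line, so every fixed field number after it is off by two. A minimal sketch, using the two scan lines quoted in this thread, that counts fields from the end of the line instead. The function names are made up for this example and are not from multi_report.sh.

```sh
#!/bin/sh
# Two scan lines from this thread: one scrub under 24 hours, one over.
short='  scan: scrub repaired 0B in 06:10:30 with 0 errors on Sun Mar  6 06:10:30 2022'
long='  scan: scrub repaired 0B in 1 days 00:16:33 with 0 errors on Sat Mar 19 03:49:37 2022'

# Fixed positions, as in the script: only right for the short form.
fixed_date() { printf '%s\n' "$1" | awk '{print $15"-"$12"-"$13"_"$14}'; }

# Positions counted from the last field (NF): right for both forms,
# because the extra "1 days" fields sit in front of them.
tail_date()   { printf '%s\n' "$1" | awk '{print $NF"-"$(NF-3)"-"$(NF-2)"_"$(NF-1)}'; }
tail_errors() { printf '%s\n' "$1" | awk '{print $(NF-7)}'; }

fixed_date "$long"    # 19-on-Sat_Mar
tail_date "$long"     # 2022-Mar-19_03:49:37
tail_date "$short"    # 2022-Mar-6_06:10:30
tail_errors "$long"   # 0
tail_errors "$short"  # 0
```

This only covers the "scrub repaired ..." form of the line; other scan messages would need their own handling.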

joeschmuck · Mar 25, 2022

mistermanko said:
Seems that a lot of awk positions are off. Here is another one:
scrubErrors="$(echo "$statusOutput" | grep "scan" | awk '{print $8}')"
Should be awk '{print $10}' to print 0 scrub errors instead of the last scrub duration.

But Duration is what is reported in the cell header and why we are doing all that time stuff. You can of course modify the code to your liking, we could also add another column to include Scrub Errors. I'm easy since I did not come up with that portion of the code originally but I do like knowing how long a Scrub takes. If there is a Scrub error, I generally expect an email from TrueNAS.

mistermanko said:

Nope, I don't think so.

Please take this as genuine, please modify the script and test it out, then post your version in this thread so others can test it out. Make sure it runs on both FreeBSD and Debian. I am open to change and I promote any good idea that people may desire.

joeschmuck · Mar 26, 2022

joeschmuck said:
You can of course modify the code to your liking, we could also add another column to include Scrub Errors.

Needless to say, I was trying to answer a question while at work. There already is a Scrub Errors column of course. I will try to look into the date issue but as I said before, I may need to rebuild my configuration file because I cannot get TrueNAS to change. My config file has been with me and just "upgraded" over the years. Guess I will need to recreate it. Oh joy.

EDIT: So we are examining the output of "zpool status pool" and the line with "scan" in it. At the end is the date the scan was performed. That format seems to depend on the TZ selected, so now it's in a different format than expected. What I need to do now is check which TZ is selected and then write a routine for each TZ. This can be done with a simple "case" command, but first I still need to be able to change the TZ on my NAS. I should be able to do that later this afternoon. Then I can see what each format looks like and write the script to compensate. I'm curious how many other places this needs to occur as well and whether Debian needs this fix too; I hope not, but I'll plan for the worst. So version 14d may be out in the near future to fix this bug.

TooMuchData · Mar 26, 2022

TooMuchData updated multi_report.sh versions for Core and Scale with a new update entry:

Minor formatting change.

Removed surrounding brackets from HDD/SSD sizes.


TooMuchData · Mar 26, 2022

TooMuchData updated multi_report.sh versions for Core and Scale with a new update entry:

Minor formatting change.

Removed brackets from device sizes.


joeschmuck · Mar 26, 2022

@mistermanko I see the issue now. It has to do with the scrub taking "1 days", which extends the line of text. It has nothing to do with the time zone. So I need to adjust for this. Working on it now.

joeschmuck · Mar 26, 2022

@mistermanko Attached is a special version for you to try out; please let me know how it works. It "should" compensate for the longer repair time reported. This is not a complete fix. I just wanted to make sure it passes the initial test; if it does, I will make the other three changes I need to make.

EDIT: I pulled it, I found other breaking code due to a scrub taking longer than 24 hours. Fixing it now.

joeschmuck · Mar 26, 2022

Several changes made. The scrub reporting issue was larger than I imagined and I took an easy way out and I did test it on both versions of TrueNAS. Now when there is one or more days for a scrub, it will add 24 hours for each day, up to the 7th day. If a scrub takes longer than that, I could add a few more days, easy stuff now. Jeff has also removed the brackets from the drive capacity and that is included as well.
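The "add 24 hours for each day" idea can be sketched as follows. This is not the code from the attached script, just one possible way to do it, and the helper name fold_days is made up for the example. It looks at the field after "in" on the scan line and, if the field after that is "days", folds the days into the hour count.

```sh
#!/bin/sh
# fold_days: hypothetical helper, not the routine used in multi_report.sh.
# Prints the scrub duration as HH:MM:SS with any "N days" folded into hours.
fold_days() {
  printf '%s\n' "$1" | awk '
    /scan: scrub repaired/ {
      # Find the word "in"; the duration starts right after it.
      for (i = 1; i <= NF; i++) if ($i == "in") break
      if ($(i+2) == "days") { days = $(i+1); hms = $(i+3) }
      else                  { days = 0;      hms = $(i+1) }
      split(hms, t, ":")
      printf "%02d:%s:%s\n", t[1] + days * 24, t[2], t[3]
    }'
}

short='  scan: scrub repaired 0B in 06:10:30 with 0 errors on Sun Mar  6 06:10:30 2022'
long='  scan: scrub repaired 0B in 1 days 00:16:33 with 0 errors on Sat Mar 19 03:49:37 2022'

fold_days "$short"   # 06:10:30
fold_days "$long"    # 24:16:33
```

Because the day count is multiplied rather than matched against a fixed list, this variant has no upper limit on the number of days.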

Let's see what else falls from the sky.

TooMuchData · Mar 26, 2022

TooMuchData updated multi_report.sh versions for Core and Scale with a new update entry:

Minor formatting change (continued).

Still version 1.4c, but date revised to today.


TooMuchData · Mar 27, 2022

TooMuchData updated multi_report.sh versions for Core and Scale with a new update entry:

joeschmuck fixed the 24+ hour scrub issue

Thanks again, joeschmuck.


TooMuchData · Mar 27, 2022

TooMuchData updated multi_report.sh versions for Core and Scale with a new update entry:

Yet another joeschmuck fix for the 24+ hour scrub issue.

This is the last of the 1.4c versions. It will handle simultaneous, year-long scrubs! Please report your longest scrub time in the discussion section. joeschmuck is offering condolence prizes (once the scrub completes).


mistermanko · Mar 30, 2022

Tried out the new version, works pretty well except:
line 727: [: too many arguments
That would be:

Code:

if [ $scrubDays == "days" ]; then
   scrubextra="$(echo "$statusOutput" | grep "scan" | awk '{print $6}')"

Extra info: there is a scrub in progress right now. It seems the script detects it correctly.

Bildschirmfoto 2022-03-30 um 11.06.40.png

joeschmuck · Mar 30, 2022

mistermanko said:
Tried out the new version, works pretty well except:
line 727: [: too many arguments
That would be:

Code:
if [ $scrubDays == "days" ]; then
   scrubextra="$(echo "$statusOutput" | grep "scan" | awk '{print $6}')"

Extra info: there is a scrub in progress right now. It seems the script detects it correctly. View attachment 54387

So I never got that error when I ran it, but that doesn't mean anything. I'm at work right now but you have my email address, I believe. Send me a copy of your script as-is and let me run it on my system to see if it works. Hopefully the issue is obvious. There are definitely not too many arguments on that IF statement. And I'm assuming that you are still running TrueNAS 12.0-U8 for this issue.

mistermanko · Mar 30, 2022

Was also at work, here is the script as is, except my mail address. Still at CORE U8, yes.

joeschmuck · Mar 30, 2022

I just ran that exact script, no errors. The only changes I made were my email address and the file location for the statistics to /tmp/. How are you running the script?

My method for testing:

Code:

root@freenas:/mnt/farm2/scripts # ./run_multi_report.sh
Creating statistical datafile
Collecting Drive Information
Running Purging Routine
Sending Email
root@freenas:/mnt/farm2/scripts #

I ran it also as a CRON job, no error messages in the cron log nor the messages log.

All I can think of is you have a scrub message format issue, well to be correct, I may have a formatting issue. Please post the current output from 'zpool status' and then maybe there is something obvious, I hope. I will need to fake the software by entering at line 723 your zpool status data which will be similar to mine below:

Code:

statusOutput="pool: farm2
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 06:10:30 with 0 errors on Sun Mar  6 06:10:30 2022
config:

        NAME                                            STATE     READ WRITE CKSUM
        farm2                                           ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/64a06668-d52f-11e7-ab84-0cc47ab37c5a  ONLINE       0     0     0
            gptid/6528d863-d52f-11e7-ab84-0cc47ab37c5a  ONLINE       0     0     0
            gptid/65b68ce1-d52f-11e7-ab84-0cc47ab37c5a  ONLINE       0     0     0
            gptid/66431f30-d52f-11e7-ab84-0cc47ab37c5a  ONLINE       0     0     0

errors: No known data errors"

So whatever you give me, if you have multiple pools, I will need to do the testing for all variations. And please use code brackets so the format remains the same.
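One way to "fake" the zpool output without editing the script at a fixed line is to read it from a saved file when one is supplied. This is only a sketch under assumptions: the function get_status and the variable STATUS_FIXTURE are hypothetical and do not exist in multi_report.sh.

```sh
#!/bin/sh
# get_status: hypothetical wrapper, not part of multi_report.sh.
# If STATUS_FIXTURE (also hypothetical) names a readable file, its content is
# used in place of the live "zpool status" output, so a pasted status report
# from another system can be tested locally.
get_status() {
  pool="$1"
  if [ -n "${STATUS_FIXTURE:-}" ] && [ -r "$STATUS_FIXTURE" ]; then
    cat "$STATUS_FIXTURE"
  else
    zpool status "$pool"
  fi
}

# Example: save a pasted report to a temporary file and parse it.
fixture="$(mktemp)"
printf '%s\n' '  pool: farm2' ' state: ONLINE' \
  '  scan: scrub repaired 0B in 06:10:30 with 0 errors on Sun Mar  6 06:10:30 2022' > "$fixture"

STATUS_FIXTURE="$fixture"
statusOutput="$(get_status farm2)"
printf '%s\n' "$statusOutput" | grep "scan:"
rm -f "$fixture"
```

With STATUS_FIXTURE unset, the function falls through to the real zpool status call, so normal runs are unaffected.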

Cheers,
-Mark

mistermanko · Mar 30, 2022

Good evening,
interestingly, in the meantime the scrub on my mainpool finished and I ran the script again without any errors.
So it's safe to assume that the formatting issue comes from the zpool status message shown while a scrub is in progress.
Let me restart the scrub on the main pool (hopefully my HDDs withstand the frequent scrubbing :shudder:) and I'll report back if the error occurs.

mistermanko · Mar 30, 2022

Ok, so here we are again. Scrub is running on mainpool and the script is finishing with the exact same error.

/mnt/mainpool/scripts/Multi_Report_Script/run_multi_report.sh: line 727: [: too many arguments

Code:

 pool: mainpool
 state: ONLINE
  scan: scrub in progress since Wed Mar 30 22:40:37 2022
        3.31T scanned at 740M/s, 1.48T issued at 330M/s, 14.3T total
        0B repaired, 10.33% done, 11:18:01 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        mainpool                                        ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/086f0ce8-519b-11ea-84c9-002590472e07  ONLINE       0     0     0
            gptid/08e989a1-519b-11ea-84c9-002590472e07  ONLINE       0     0     0
            gptid/08b1b7be-519b-11ea-84c9-002590472e07  ONLINE       0     0     0
            gptid/08cd2b7c-519b-11ea-84c9-002590472e07  ONLINE       0     0     0
            gptid/b75a7f3c-8267-11ea-8342-0cc47a05eaa0  ONLINE       0     0     0
            gptid/090df988-519b-11ea-84c9-002590472e07  ONLINE       0     0     0

errors: No known data errors
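A possible explanation, based only on the output above: while a scrub is running, grep "scan" matches two lines, the "scan:" line and the "3.31T scanned at ..." line, so a variable filled from that pipeline can contain two words. Used unquoted inside [ ], two words become two arguments, which would produce "too many arguments". The sketch below assumes scrubDays is filled with awk '{print $7}'; that part of the script is not quoted in this thread, so treat it as a guess.

```sh
#!/bin/sh
statusOutput='  scan: scrub in progress since Wed Mar 30 22:40:37 2022
        3.31T scanned at 740M/s, 1.48T issued at 330M/s, 14.3T total
        0B repaired, 10.33% done, 11:18:01 to go'

# "scan" is also a substring of "scanned", so two lines match here.
matches="$(printf '%s\n' "$statusOutput" | grep -c "scan")"

# Assumed reconstruction of how scrubDays is filled: one word per matched line.
unsafeDays="$(printf '%s\n' "$statusOutput" | grep "scan" | awk '{print $7}')"

# Matching "scan:" (with the colon) keeps only the status line.
scanLine="$(printf '%s\n' "$statusOutput" | grep "scan:")"
scrubDays="$(printf '%s\n' "$scanLine" | awk '{print $7}')"

# A running scrub has no duration yet, so detect that case first.
case "$scanLine" in
  *"scrub in progress"*) scrubState="in progress" ;;
  *)                     scrubState="finished" ;;
esac

# Quoting the variable keeps the test at exactly three arguments.
if [ "$scrubDays" = "days" ]; then
  echo "multi-day scrub"
fi
echo "lines matched by grep scan: $matches, scrub state: $scrubState"
```

The single = is used here because it works in both sh and bash, while == inside [ ] is a bash extension.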

Bildschirmfoto 2022-03-31 um 00.01.49.png
