SOLVED Help with truenas information in grafana/influxDB.

Joined
Jan 27, 2020
Messages
577
Well, somehow the measurement of .df_complex.free for my main pool is exactly what the GUI is telling me is free on my main pool. So I can deduce used space via: total (I already know) - free = used.
 

ragametal

Contributor
Joined
May 4, 2021
Messages
188
I have the same results as you on the "free" space, but how do you calculate the "Total"? I mean, I know that 100% of the disk space is not usable, so (disk space) x (# of disks) is not equal to "Pool total space".

Care to expand on how you did it?
I'm not criticizing; I honestly want to know for my own education.

The reason for calculating the total is to use it to calculate the "used space" per your suggestion.
 
Joined
Jan 27, 2020
Messages
577
You can calculate raw capacity easily: number of disks * disk size = raw pool capacity. Of course there are deductions with ZFS.
I used this calculator to calculate my pool's total: https://wintelguy.com/zfs-calc.pl
For me, a zpool with RAID-Z2 of 6 disks of 4 TB each gives a raw total of 24 TB and, as per the calculator, a usable space of 15.94 TB (I rounded up to 16 TB for the Grafana total).
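
The arithmetic above can be sketched in a few lines of shell. This is only an illustration: the function names `raw_capacity` and `used_space` are made up, the usable total is a hand-entered constant taken from the calculator (nothing here derives it), and the 10 TB free figure is an arbitrary example value.

```sh
#!/bin/sh
# Hypothetical helpers, not part of TrueNAS or collectd.

# raw capacity = number of disks * size of one disk (same unit in and out)
raw_capacity() {
    echo $(( $1 * $2 ))
}

# used = total - free (all values in the same unit)
used_space() {
    echo $(( $1 - $2 ))
}

raw_capacity 6 4    # 6 disks of 4 TB each: prints 24 (TB raw)
used_space 16 10    # 16 TB usable total, 10 TB free (example value): prints 6
```

The usable total still has to be updated by hand whenever the pool layout changes, which is the main drawback of working with a constant.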
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
My dashboards are for multiple systems, some with multiple pools, so I want the reporting system to tell me in Influx what size/total, used and free are for each pool :wink:
 

ragametal

This is a great workaround.
With the actual size of the pool I was able to calculate the "Used" space based on the "Free" space.

used.jpg


Now, @Patrick M. Hausen, I admit that using a constant like this is not ideal, as it will require manual intervention per pool, but I really don't feel comfortable adding or altering files on the TrueNAS base system. That is why I like to experiment inside jails.
 
Joined
Jan 27, 2020
Messages
577
Sure, your script comes in handy when multiple instances are at play. I'll give it a try soon.
 

Patrick M. Hausen

Putting a script file in a location on your storage pool - mine are at /mnt/hdd/scripts/* - is not considered altering the TrueNAS system. And cron jobs are an officially supported UI feature.
 
Joined
Jan 27, 2020
Messages
577
Are you using the "reporting" function that is built in or a separate graphite installation? I use the former to push data into InfluxDB and visualize with Grafana. That's actually collectd running inside TN and using the graphite plain text API to deliver the data.

For pool space I have created this shell script that delivers all the data I need. The prefix is defined to match with what the built in collectd does.
Code:
root@freenas[/mnt/hdd/scripts]# cat zpool-metrics.sh
#! /bin/sh

HOST="192.168.1.55"
PORT="2003"
PREFIX="servers"

time=$(/bin/date +%s)
hostname=$(/bin/hostname | /usr/bin/tr '.' '_')

/usr/local/sbin/zpool list -Hp | while read pool size alloc free ignore
do
    ralloc=$(echo "scale=8;${alloc}/${size}" | /usr/bin/bc)
    rfree=$(echo "scale=8;${free}/${size}" | /usr/bin/bc)

    echo "${PREFIX}.${hostname}.zpool.${pool}.size ${size} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.alloc ${alloc} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.alloc-ratio ${ralloc} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.free ${free} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.free-ratio ${rfree} ${time}"
done | /usr/bin/nc "${HOST}" "${PORT}" -w2
Ok tried it out.
So this is interesting, and in the end it comes down to personal preference which metric you would rather see:

zpool, as per your script, reports a pool size of 21.8T and 6.94T free, which is 69% usage.
The GUI (screenshot 1658239047532.png) reports a usage of 69% and 4.48T free. Doing the math, that comes down to a size of 14.45T.
Same usage but different total sizes - ZFS magic. Like I said, it comes down to what one would like to see in Grafana as the total.
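
For reference, the "doing the math" step for the GUI figures can be written out as total = free / (1 - used fraction). The sketch below is only a worked example; the function name `total_from_free` is made up, and awk is used simply because plain sh arithmetic has no decimals.

```sh
#!/bin/sh
# Hypothetical helper: derive the pool total from the free space and the
# used fraction, i.e. total = free / (1 - used_fraction).
total_from_free() {
    awk -v free="$1" -v used="$2" 'BEGIN { printf "%.2f\n", free / (1 - used) }'
}

total_from_free 4.48 0.69    # GUI figures from the post: prints 14.45
```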

To wrap this up: your script provides a fast way of getting all the metrics for every pool into InfluxDB. I very much prefer that way.
Thank you Patrick! Maybe make it a resource ;-)
 

Patrick M. Hausen

You have to keep in mind that the zpool command reports sizes inconsistently. For a pool built strictly from mirrored vdevs, usable space is reported, i.e. half of the disks (or a third if you triple mirror, etc.). For a pool built from RAIDZn vdevs, the raw drive capacity is reported instead. Possibly the other numbers are not coherently computable.

No idea what it's going to do for mixed vdevs.
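
A possible way around the inconsistency, sketched under assumptions: query the pool's root dataset with `zfs list -Hp -o name,used,avail` rather than `zpool list`, and treat used + avail as the total. The dataset figures should be net of RAIDZ parity, but whether their sum matches the GUI exactly is not verified here. The function name `dataset_metrics`, the pool name `tank` and the byte counts are made up for the dry run.

```sh
#!/bin/sh
# Hypothetical helper: reads "name used avail" lines (tab separated, bytes),
# as printed by `zfs list -Hp -o name,used,avail <pool>`, and prints
# "name size used avail" with size = used + avail.
dataset_metrics() {
    while read -r name used avail
    do
        echo "${name} $(( used + avail )) ${used} ${avail}"
    done
}

# Dry run with made-up numbers instead of a real pool:
printf 'tank\t11000000000000\t4900000000000\n' | dataset_metrics

# On a real system (assumption: the pool is called "tank"):
#   zfs list -Hp -o name,used,avail tank | dataset_metrics
```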
 

ragametal

Awesome, one of my issues is resolved, thank you both.

Now, do you happen to have a magic sauce to get the UPS Status in grafana?
 

Patrick M. Hausen

Write a script similar to mine using upscmd instead of zpool command :wink:
 

ragametal

I know your heart is in the right place, and I know it is to my benefit to figure these things out for myself, but my scripting skills are nothing to brag about.

I have done some simple scripts before, but I honestly don't have the knowledge to do this one. As a matter of fact, I'm having problems understanding all the steps in your script, not because it is complicated but because I'm not familiar with some of the commands and system variables that you used.

Please keep in mind that I don't do IT for a living; I do it for my company, which is running on a limited budget.

Anyway, thank you for pointing me in the right direction. I will try to develop the script and report back once I do.

Now to the next problem. Do you have a clue about what is going on with the network metrics?
 

Patrick M. Hausen

The TrueNAS values seem to be 8 times the Grafana ones. One is counting bits, one is counting bytes, possibly. Check your configuration. What collectd sends to Influx are either packets or octets, i.e. bytes as you can see here.

What my script does is this:
Code:
# get current time in Unix timestamp format, save in $time
time=$(/bin/date +%s)
# get hostname, replace "." with "_", save in $hostname
hostname=$(/bin/hostname | /usr/bin/tr '.' '_')

# iterate over all ZFS pools as output by `zpool list`
# - read the values into the variables $pool $size $alloc $free
# - read the last column of the output into the variable $ignore - we don't use it
/usr/local/sbin/zpool list -Hp | while read pool size alloc free ignore
do
    # calculate ratio from absolute values to 8 digits precision with the `bc` calculator
    ralloc=$(echo "scale=8;${alloc}/${size}" | /usr/bin/bc) 
    rfree=$(echo "scale=8;${free}/${size}" | /usr/bin/bc) 

    # output all the values in graphite plain text format
    echo "${PREFIX}.${hostname}.zpool.${pool}.size ${size} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.alloc ${alloc} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.alloc-ratio ${ralloc} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.free ${free} ${time}"
    echo "${PREFIX}.${hostname}.zpool.${pool}.free-ratio ${rfree} ${time}"
# and pipe all the output for all pools and values into the `nc` program to send them over the network
done | /usr/bin/nc "${HOST}" "${PORT}" -w2
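
To try the loop without touching the network, a dry run along these lines may help. It is a sketch, not the script above: the host name `freenas_local`, the pool line and the timestamp are made-up sample values, the function name `format_pool_metrics` is hypothetical, and awk stands in for bc when computing the ratio (so the ratio prints with a leading zero).

```sh
#!/bin/sh
# Dry run: feed one made-up `zpool list -Hp` line through the same kind of
# loop and print the graphite lines instead of piping them to nc.
PREFIX="servers"
hostname="freenas_local"    # made-up host name, dots already replaced
time=1658239047             # fixed sample timestamp so the output is repeatable

format_pool_metrics() {
    while read -r pool size alloc free ignore
    do
        # awk is used here in place of bc for the ratio
        ralloc=$(awk -v a="${alloc}" -v s="${size}" 'BEGIN { printf "%.8f", a / s }')
        echo "${PREFIX}.${hostname}.zpool.${pool}.size ${size} ${time}"
        echo "${PREFIX}.${hostname}.zpool.${pool}.alloc ${alloc} ${time}"
        echo "${PREFIX}.${hostname}.zpool.${pool}.alloc-ratio ${ralloc} ${time}"
    done
}

# name, size, alloc, free, then the columns the loop ignores
printf 'tank\t1000\t250\t750\t-\t-\t0\t25\t1.00\tONLINE\t-\n' | format_pool_metrics
```

Each printed line has the shape "metric.path value timestamp", which is what gets piped to nc in the real script.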


You can give me a sample output of fetching the UPS status with `upscmd` - I can help you with the script. Getting the status is left as an exercise :wink:
 

ragametal

@Patrick M. Hausen, Grafana is reading “Octets” and the units are set to “bytes/sec(SI)” based on your advice, but the end result was still way off from what TrueNAS was reporting in its GUI.

So, in Grafana I just multiplied the octet values by a factor of 8 and now the results, while not identical, are very close to what TrueNAS is reporting. Thanks for that.
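
The factor of 8 is the octet-to-bit conversion, assuming the GUI graphs bits per second (the thread does not confirm which unit the GUI uses). A minimal worked example, with a made-up function name and sample rate:

```sh
#!/bin/sh
# Hypothetical helper: convert a rate in octets (bytes) per second
# to bits per second.
octets_to_bits() {
    echo $(( $1 * 8 ))
}

octets_to_bits 125000    # 125000 octets/s: prints 1000000 (bits/s)
```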

About the script: challenge accepted. It was very kind of you to spend your time adding comments to your script so I could understand it better.

The following is the output of `upscmd -l` as requested:
Code:
root@socrates[~]# upscmd -l Socrates-ups
Instant commands supported on UPS [Socrates-ups]:

beeper.disable - Description unavailable
beeper.enable - Description unavailable
beeper.mute - Description unavailable
beeper.off - Description unavailable
beeper.on - Description unavailable
load.off - Description unavailable
load.off.delay - Description unavailable
load.on - Description unavailable
load.on.delay - Description unavailable
shutdown.return - Description unavailable
shutdown.stayoff - Description unavailable
shutdown.stop - Description unavailable


As for my exercise, I figured out a way of getting the status directly as follows:
Code:
root@socrates[~]# upsc Socrates-ups | grep 'ups.status'
ups.status: OL
 

ragametal

@Patrick M. Hausen, I have not tried it yet, but this is my first attempt at creating the script to obtain the UPS status and send it to Grafana:
Code:
#! /bin/sh
#script to obtain the UPS status and send it
#to InfluxDB and Grafana
#
#by Ragametal 07-21-2022

##################################################################
# modify the variables below to meet your needs

# change HOST to match the IP of the InfluxDB jail
HOST="10.0.0.24"

# change the PORT to match the InfluxDB port (default is shown)
PORT="2003"

# change PREFIX to match the InfluxDB items (default is shown)
PREFIX="servers"

# change the UPS to match the name of the UPS in Truenas
UPS="Socrates-ups"

# end of variables

# start of script
# DO NOT MODIFY BELOW THIS LINE unless you understand the code
#####################################################################

# get current time in Unix timestamp format, save in $time
time=$(/bin/date +%s)
# get hostname, replace "." with "_", save in $hostname
hostname=$(/bin/hostname | /usr/bin/tr '.' '_')

# - read the UPS Status value and save it to the variable $upsstatus
upsstatus=/usr/local/bin/upsc ${UPS} | grep 'ups.status'

# output UPS status values in graphite plain text format
echo "${PREFIX}.${hostname}.nut.${UPS}.ups.status ${upsstatus} ${time}"


Do you see anything inherently wrong in it?
 

Patrick M. Hausen

The upsstatus line won't work that way. For one, you have to enclose the command in $() so the OUTPUT ends up in the variable. Second, you will probably want to strip the "ups.status:" in front.

What are the possible values and what do they mean? I would need that info to make a complete suggestion.
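
Both points can be tried without a UPS attached. Without `$()`, the shell reads `upsstatus=/usr/local/bin/upsc ${UPS}` as "run the command named by `${UPS}` with upsstatus set in its environment", so nothing is captured. The sketch below uses a stand-in function `fake_upsc` with made-up output, and strips the prefix with a parameter expansion, which is just one of several ways to do it.

```sh
#!/bin/sh
# Stand-in for `upsc Socrates-ups`, so this runs without a UPS attached.
# The output lines are made up for the example.
fake_upsc() {
    printf 'ups.mfr: Example\nups.status: OL\n'
}

# $( ... ) captures the output of the pipeline in the variable
raw=$(fake_upsc | grep 'ups.status')
echo "${raw}"          # prints: ups.status: OL

# drop everything up to and including the first ": "
upsstatus=${raw#*: }
echo "${upsstatus}"    # prints: OL
```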
 

ragametal

Thanks for the input on the upsstatus line; I honestly didn't see it.

The following are the possible UPS status values and their meaning:

Value     Status
OL        On line (mains is present)
OB        On battery (mains is not present)
LB        Low battery
RB        The battery needs to be replaced
CHRG      The battery is charging
DISCHRG   The battery is discharging (inverter is providing load power)
BYPASS    UPS bypass circuit is active (no battery protection is available)
CAL       UPS is currently performing runtime calibration (on battery)
OFF       UPS is offline and is not supplying power to the load
OVER      UPS is overloaded
TRIM      UPS is trimming incoming voltage
BOOST     UPS is boosting incoming voltage
*         unknown state
 

Patrick M. Hausen

I don't know off the top of my head - can you parse these as strings in Grafana? I.e. can we send them directly to Influx? Or do they need to be converted to numerical values?

In any case to get that status in a clean way:
Code:
upsstatus=$(/usr/local/bin/upsc ${UPS} | /usr/bin/awk '/ups\.status:/ { print $2 }')


Then, if that's enough to send, you are done. If not, some code like this can do the trick:
Code:
case ${upsstatus} in
    OL)
        # online
        numstatus=0
    ;;
    OB)
        # on battery
        numstatus=1
    ;;
    LB)
        # low battery
        numstatus=2
    ;;
[...]
    *)
        # unknown
        numstatus=99
    ;;
esac
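
One thing to watch for: NUT can report several status flags at once, separated by spaces (for example "OL CHRG"), and the awk one-liner only picks up the first of them. The sketch below wraps the same kind of case statement in a hypothetical function `status_to_num` that makes this behaviour explicit. Only the codes shown above are mapped; the remaining ones are left out here as well.

```sh
#!/bin/sh
# Hypothetical wrapper: map the first flag of a ups.status string to a number.
status_to_num() {
    # keep only the text before the first space, e.g. "OL CHRG" -> "OL"
    first=${1%% *}
    case "${first}" in
        OL) echo 0 ;;     # online
        OB) echo 1 ;;     # on battery
        LB) echo 2 ;;     # low battery
        *)  echo 99 ;;    # unknown
    esac
}

status_to_num "OL"         # prints 0
status_to_num "OL CHRG"    # prints 0 (second flag is ignored)
status_to_num "FOO"        # prints 99
```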
 

ragametal

@Patrick M. Hausen, I have incorporated your suggestions into the script, but for some reason the ups.status is not available in Grafana yet. I ran the script manually from the CLI and got no errors, so I'm not sure what the problem is. I also restarted the Grafana jail and that didn't help either.

Any ideas?

Btw, the script as it stands right now is as follows:
Code:
#! /bin/sh
#script to obtain the UPS status and send it
#to InfluxDB and Grafana
#
#by Ragametal 07-21-2022

##################################################################
# modify the variables below to meet your needs

# change HOST to match the IP of the InfluxDB jail
HOST="10.0.0.24"

# change the PORT to match the InfluxDB port (default is shown)
PORT="2003"

# change PREFIX to match the InfluxDB items (default is shown)
PREFIX="servers"

# change the UPS to match the name of the UPS in Truenas
UPS="Socrates-ups"

# end of variables

# start of script
# DO NOT MODIFY BELOW THIS LINE unless you understand the code
#####################################################################

# get current time in Unix timestamp format, save in $time
time=$(/bin/date +%s)
# get hostname, replace "." with "_", save in $hostname
hostname=$(/bin/hostname | /usr/bin/tr '.' '_')

# - read the UPS Status value and save it to the variable $upsstatus
upsstatus=$(/usr/local/bin/upsc ${UPS} | /usr/bin/awk '/ups\.status:/ { print $2 }')

case ${upsstatus} in
    OL)
        # online
        numstatus=0
    ;;
    OB)
        # on battery
        numstatus=1
    ;;
    LB)
        # low battery
        numstatus=2
    ;;
    RB)
        # replace battery
        numstatus=3
    ;;
    CHRG)
        # charging battery
        numstatus=4
    ;;
    DISCHRG)
        # discharging battery
        numstatus=5
    ;;
    BYPASS)
        # bypass circuit is active
        numstatus=6
    ;;
    CAL)
        # performing calibration
        numstatus=7
    ;;
    OFF)
        # offline
        numstatus=8
    ;;
    OVER)
        # overloaded
        numstatus=9
    ;;
    TRIM)
        # trimming incoming voltage
        numstatus=10
    ;;
    BOOST)
        # boosting incoming voltage
        numstatus=11
    ;;
    *)
        # unknown
        numstatus=12
    ;;
esac

# output UPS status values in graphite plain text format
echo "${PREFIX}.${hostname}.nut.${UPS}.ups.status ${upsstatus} ${time}" | /usr/bin/nc "${HOST}" "${PORT}" -w2
 

Patrick M. Hausen

Change this:
echo "${PREFIX}.${hostname}.nut.${UPS}.ups.status ${upsstatus} ${time}" | /usr/bin/nc "${HOST}" "${PORT}" -w2
to this, probably:
echo "${PREFIX}.${hostname}.nut.${UPS}.ups.status ${numstatus} ${time}" | /usr/bin/nc "${HOST}" "${PORT}" -w2

You went all the way to set $numstatus and then did not use it in the final statement.
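
This would also explain why the script ran without errors: nc happily sends whatever it is given, while the graphite plaintext format expects "path value timestamp" with a numeric value, so a string such as "OL" is presumably discarded on the receiving side. A small check before sending can make that visible. The function name `is_integer` is made up for this sketch.

```sh
#!/bin/sh
# Hypothetical check: succeed only if the argument is a non-empty string
# of digits, i.e. something graphite can store as a value.
is_integer() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

for value in OL 0
do
    if is_integer "${value}"; then
        echo "${value}: ok to send"
    else
        echo "${value}: not numeric, would likely be dropped"
    fi
done
```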
 