Custom Metric Collector for Prometheus' Node Exporter

Status
Not open for further replies.

Maelos

Explorer
Joined
Feb 21, 2018
Messages
99
Hello again all. I have been trying to get back into programming, specifically Go, and come to you for help once more. The objective is to create a custom collector for FreeNAS. It will work within the node exporter, which exposes metrics to Prometheus, a newer monitoring system. I have had a basic version working here, https://forums.freenas.org/index.php?threads/freenas-esxi-lab-build-log.61833/page-8#post-450251, but ran into a few problems. I just finished the Udemy course, https://www.udemy.com/learn-how-to-code/, and am trying to put it to use, both to better understand the code within the Prometheus project (written in Go) and for professional/home lab use. My first priority is to get temperature monitoring fixed. I can then expand to export whatever metrics people are interested in. I am doing this because I don't feel email alerts are adequate, and because I want to learn.

Problem #1 - The current CPU temperature reads as 400 M. While I know I am undervolting the fans...
https://github.com/prometheus/node_exporter/blob/master/collector/cpu_freebsd.go
Code:
temp, err := unix.SysctlUint32(fmt.Sprintf("dev.cpu.%d.temperature", cpu))
// ...error handling
ch <- c.temp.mustNewConstMetric(float64(temp-2732)/10, lcpu)

This takes the temperature from...(next) and then does a conversion (no idea to or from what) before sending it off to a channel (used for concurrent and parallel processing).
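On the conversion in the snippet above: FreeBSD reports dev.cpu.N.temperature in tenths of a kelvin, so subtracting 2732 and dividing by 10 gives degrees Celsius. A minimal sketch follows, assuming a hypothetical raw reading below 2732 (for example 0 from a sensor that is not answering). Because temp is a uint32, the subtraction wraps around instead of going negative, which would be one possible explanation for a reading in the hundreds of millions. The function names are made up for illustration and are not part of node_exporter.

```go
package main

import "fmt"

// naiveCelsius uses the same expression as the quoted snippet.
// raw is unsigned, so raw-2732 wraps around when raw < 2732.
func naiveCelsius(raw uint32) float64 {
	return float64(raw-2732) / 10
}

// signedCelsius widens to a signed type before subtracting,
// so small raw values give a negative result instead of wrapping.
func signedCelsius(raw uint32) float64 {
	return float64(int64(raw)-2732) / 10
}

func main() {
	// 3032 deci-kelvin is a plausible healthy reading.
	fmt.Println("normal reading, naive: ", naiveCelsius(3032))
	fmt.Println("normal reading, signed:", signedCelsius(3032))

	// Hypothetical bad reading of 0: the naive version wraps.
	fmt.Println("zero reading, naive:   ", naiveCelsius(0))
	fmt.Println("zero reading, signed:  ", signedCelsius(0))
}
```

If the raw sysctl value on the affected box turns out to be sane, the cause is somewhere else; checking it with sysctl dev.cpu.0.temperature on the command line is a quick way to find out.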

https://github.com/golang/sys/blob/master/unix/syscall_bsd.go (took out error detecting)
Code:
func SysctlUint32(name string) (uint32, error) { return SysctlUint32Args(name) }

func SysctlUint32Args(name string, args ...int) (uint32, error) {
	mib, err := sysctlmib(name, args...)
	// (check of err removed here for brevity)
	n := uintptr(4)
	buf := make([]byte, 4)
	if err := sysctl(mib, &buf[0], &n, nil, 0); err != nil {
		return 0, err
	}

	if n != 4 {
		return 0, EIO
	}

	return *(*uint32)(unsafe.Pointer(&buf[0])), nil
}

Which calls...not gonna keep going down the wormhole.

I propose writing a new program, creating a slice of structs for the CPUs and a slice of structs for the drives. This allows expansion, as we can add additional fields and their respective calls. I need to translate the shell scripts used for the drive and CPU temperatures that I previously mushed together with the SMART report. The scripts are here: https://github.com/Spearfoot/FreeNAS-scripts. I then need to get those translated scripts to feed into the Prometheus node exporter like the other collectors. Any help or guidance is appreciated. I will post up what I have so far soon.

<I moved the code to a second post so it will be easier to update. Eventually I will just update it via github...>

I know there is a bit of work and understanding to get through, but I'm hitting a bit of a wall and thought it worthwhile to reach out for help here. I have tried the Golang and Prometheus IRC channels without much luck.

After going back and re-reading some of the posts related to creating a custom collector, it appears that just running a script from cron that writes the data to the text file picked up by the node exporter is the best bet. I really want to get a custom one done in Go though...

I think I may actually have it, though now I need to test it. Now that I have it nearly ready, I realized that I am going to have to make two separate collectors, one for the CPU info (from IPMI) and one for the drive temps (from FreeNAS). It may almost be easier to use the IPMI exporter and try to modify one of the working disk collectors...
 

Maelos

Explorer
Joined
Feb 21, 2018
Messages
99
Code:
package collector

import (
	"bufio"
	"bytes"
	"fmt"
	"os/exec" // for using external processes - the ipmitool call
	"regexp"
	"strconv" // converts the sensor text to an int and the CPU index to a label string
	"strings"

	"github.com/prometheus/client_golang/prometheus"
)

//--------------------------------
// *** CPU Temperatures ***
//--------------------------------

// IPMI variables/settings
const (
	ipmiTool = "/usr/local/bin/ipmitool"
	ipmiHost = "IPMI ADDRESS"  // IP address or DNS-resolvable hostname of IPMI server
	ipmiUser = "IPMI Username" // IPMI username
	// IPMI password file. This is a file containing the IPMI user's password
	// on a single line and should have 0600 permissions. ipmitool reads it
	// itself through -f, so the program never needs to load the password.
	ipmiPWFile = "/root/ipmi_password"
)

// matches sensor names such as "CPU Temp" or "CPU1 Temp" (like grep -i 'cpu.*temp')
var cpuTempRE = regexp.MustCompile(`(?i)cpu.*temp`)

// a struct (like a multi-field object) for the CPUs, leaves room for more later
type cpu struct {
	temp int
}

// getCPUTemps asks ipmitool for all sensors, keeps the CPU temperature rows
// and returns one cpu struct per row.
func getCPUTemps() ([]cpu, error) {
	// exec.Command does not go through a shell, so every flag and value must be
	// its own argument and pipes to grep/awk cannot be used. The filtering the
	// shell script did with grep and awk is done in Go below.
	cmd := exec.Command(ipmiTool, "-I", "lanplus", "-H", ipmiHost,
		"-U", ipmiUser, "-f", ipmiPWFile, "sdr", "elist", "all")
	raw, err := cmd.Output() // returns a slice of bytes and an error
	if err != nil {
		return nil, fmt.Errorf("running ipmitool: %v", err)
	}

	var out []cpu
	sc := bufio.NewScanner(bytes.NewReader(raw))
	for sc.Scan() {
		// rows are expected to look like: CPU Temp | 01h | ok | 3.1 | 38 degrees C
		fields := strings.Split(sc.Text(), "|")
		if len(fields) < 5 || !cpuTempRE.MatchString(fields[0]) {
			continue
		}
		reading := strings.Fields(fields[4])
		if len(reading) == 0 {
			continue
		}
		t, err := strconv.Atoi(reading[0])
		if err != nil {
			continue // sensor is listed but has no numeric reading
		}
		out = append(out, cpu{temp: t})
	}
	return out, sc.Err() // returns the slice of cpu structs
}

//Beginning of the Go section that I believe is required for the collector to run

// freenasCollector holds the metric description for this collector. It is not
// called statCollector because cpu_freebsd.go appears to define that name in
// the same package already.
type freenasCollector struct {
	temp typedDesc
}

func init() {
	//registers the collector with collector.go to be started/fetched with each update
	registerCollector("freenas", defaultEnabled, NewFreenasCollector)
}

// NewFreenasCollector returns a new Collector exposing CPU temperatures from IPMI.
func NewFreenasCollector() (Collector, error) {
	return &freenasCollector{
		temp: typedDesc{
			prometheus.NewDesc( // github.com/prometheus/client_golang/prometheus/desc.go
				// builds the name "node_cpu_temperature_celsius". I may need to adjust
				// the namespace, as the stock FreeBSD cpu collector uses the same name.
				prometheus.BuildFQName(namespace, cpuCollectorSubsystem, "temperature_celsius"),
				"CPU temperature",
				[]string{"cpu"}, nil, // one label holding the CPU index
			), prometheus.GaugeValue},
	}, nil
}

// Update exposes CPU temperatures gathered through ipmitool.
func (c *freenasCollector) Update(ch chan<- prometheus.Metric) error {
	cpus, err := getCPUTemps()
	if err != nil {
		return err
	}
	for i, cp := range cpus {
		lcpu := strconv.Itoa(i)
		ch <- c.temp.mustNewConstMetric(float64(cp.temp), lcpu)
	}
	return nil
}

 

Maelos

Explorer
Joined
Feb 21, 2018
Messages
99
I am proud enough of my little accomplishment to make this a bump. Life has been busy but I got my FreeNAS custom exporter for Prometheus working. You can see it at www.github.com/maelos/freenas_exporter . While it is basic, it is working and is the start of something much grander. I dream of making an exporter that can handle all of the reports that the community has developed, presenting them via Grafana, with a flexible alert manager. This will take the humble daily email up to an enterprise standard of monitoring metrics, alerting, and insight. Yaay... :smile:
 

blacs30

Dabbler
Joined
Mar 12, 2017
Messages
22
Hi Maelos,
this is definitely a nice start.
It would be great to have one exporter for all the needed freenas metrics.
Currently I use these to get all the metrics on freenas:

ipmi exporter
nut exporter
node exporter
smart exporter
zpool exporter
zfs exporter

Some of them are in Go, some in Python, I think it's around 50/50.

Currently I don't have time though to work on this.
 

Maelos

Explorer
Joined
Feb 21, 2018
Messages
99
Thank you for the comment and tips. I am actually getting back into setting up the server and am doing burn in tests for the drives. I mean to finish the scripts to gather the drive temps, then will start on more. I'd love to incorporate all of those you mentioned, so again, thank you for the tip.
 