Help with SNMP monitoring of TrueNAS Scale in Zabbix; OID for ZPool Health not right?

surfrock66

Dabbler
Joined
Apr 2, 2013
Messages
36
I'm running a fully updated Zabbix environment and using the built-in TrueNAS SNMP monitoring template to watch my TrueNAS Scale system. I think the template was written for TrueNAS Core, but it works almost perfectly for TrueNAS Scale. My TrueNAS version is 22.12.1.

Discovery works fine. However, I get 2 "Problems", one for each of my 2 pools ("boot-pool" and "sr66-nas-v01"), for the following sensor: "TrueNAS: Pool [sr66-nas-v01]: Status is not online". This comes from the following discovery trigger:

last(/HOSTNAME.SUBDOMAIN.DOMAIN.com/truenas.zpool.health[sr66-nas-v01]) <> 0

That item is built with the following SNMP OID in the item prototypes:

.1.3.6.1.4.1.50536.1.1.1.1.7.{#SNMPINDEX}

{#SNMPINDEX} is getting discovered as "1" for boot-pool and "2" for sr66-nas-v01.
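To make the macro substitution concrete, here is a minimal Python sketch of how the item prototype OID expands for the two discovered indexes. The helper name `expand_oid` is made up for illustration; only the OID and the index-to-pool mapping come from this post.

```python
# Item prototype OID exactly as quoted from the template.
OID_PROTOTYPE = ".1.3.6.1.4.1.50536.1.1.1.1.7.{#SNMPINDEX}"

# Discovered index -> pool name, as reported in this post.
discovered = {"1": "boot-pool", "2": "sr66-nas-v01"}


def expand_oid(prototype: str, snmp_index: str) -> str:
    """Substitute the {#SNMPINDEX} discovery macro (hypothetical helper)."""
    return prototype.replace("{#SNMPINDEX}", snmp_index)


# Pool name -> fully expanded OID that the resulting item would poll.
item_oids = {pool: expand_oid(OID_PROTOTYPE, idx) for idx, pool in discovered.items()}

for pool, oid in item_oids.items():
    print(f"{pool}: {oid}")
```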

Now, the MIB for this is defined here: https://mibs.observium.org/mib/FREENAS-MIB/#zpoolHealth

Unfortunately, there's a disconnect. The values expected would be this:

online(0),
degraded(1),
faulted(2),
offline(3),
unavail(4),
removed(5)

However, the value I am getting from that OID is some giant integer:

[Screenshot attachment: 1681253337557.png]


I suspect the pool index in the OID is wrong, so the item is pulling the wrong value. I can't find documentation for the OIDs that TrueNAS should present, but if I knew what the OID should be for individual discovered zpools, I'd be happy to submit a pull request to fix the template.
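As a rough illustration of the mismatch, the enum listed above can be written out in Python and used to decode a polled value. This is only a sketch: the names `ZpoolHealth` and `decode_health` are made up here, and the large integer is a placeholder, not the actual value from the screenshot.

```python
from enum import IntEnum


class ZpoolHealth(IntEnum):
    """zpoolHealth values as listed in the FREENAS-MIB page linked above."""
    ONLINE = 0
    DEGRADED = 1
    FAULTED = 2
    OFFLINE = 3
    UNAVAIL = 4
    REMOVED = 5


def decode_health(raw: int) -> str:
    """Return the enum name, or a note when the value is outside the enum."""
    try:
        return ZpoolHealth(raw).name
    except ValueError:
        return f"unexpected value {raw} (not a zpoolHealth enum member)"


print(decode_health(0))
# Placeholder large integer, standing in for the value in the screenshot.
print(decode_health(123456789012))
```

Anything outside 0 to 5 cannot be a valid health state under that MIB definition, which is why the trigger `<> 0` fires for both pools.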
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Perhaps update to SCALE 22.12.2 and see if the issue persists.
 

surfrock66

Dabbler
Joined
Apr 2, 2013
Messages
36
Wow, that was released today. I usually don't update on day 1, but thanks for letting me know to check. I doubt that's it; I would think an OID change would be a bigger change than a point release. My guess is that this is a difference between Core and Scale.
 

serhii.prk

Cadet
Joined
May 3, 2023
Messages
2
I have the same problem. I've already updated to the latest version, but the problem still exists. The OID returns invalid data:

root@zabbix:~# snmpwalk -v2c -c public <host_address> .1.3.6.1.4.1.50536.1.1.1.1.7

iso.3.6.1.4.1.50536.1.1.1.1.7.1 = Counter64: 43461197824
iso.3.6.1.4.1.50536.1.1.1.1.7.2 = Counter64: 164410417152
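For anyone who wants to check their own output, here is a small Python sketch that parses snmpwalk lines like the ones above and flags values that cannot be a health enum. The function names are hypothetical, and the regular expression assumes the plain `OID = TYPE: value` output format shown in this post.

```python
import re

# Matches lines such as: iso.3.6.1... = Counter64: 43461197824
SNMPWALK_LINE = re.compile(r"^(?P<oid>\S+) = (?P<type>[\w-]+): (?P<value>.*)$")


def parse_snmpwalk(output: str):
    """Return a list of (oid, snmp_type, value) tuples from snmpwalk text."""
    rows = []
    for line in output.splitlines():
        match = SNMPWALK_LINE.match(line.strip())
        if match:
            rows.append((match["oid"], match["type"], match["value"].strip('"')))
    return rows


def looks_like_health_enum(snmp_type: str, value: str) -> bool:
    """True only for a plain INTEGER in the 0..5 range of the MIB enum."""
    return snmp_type == "INTEGER" and value.isdigit() and int(value) <= 5


# Output copied from the snmpwalk run above.
sample = """\
iso.3.6.1.4.1.50536.1.1.1.1.7.1 = Counter64: 43461197824
iso.3.6.1.4.1.50536.1.1.1.1.7.2 = Counter64: 164410417152
"""

rows = parse_snmpwalk(sample)
for oid, snmp_type, value in rows:
    verdict = "enum-like" if looks_like_health_enum(snmp_type, value) else "not a health enum"
    print(oid, snmp_type, value, verdict)
```

Both rows come back as Counter64 with very large values, so whatever column 7 holds on this system, it does not match the enum the template expects.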
 

serhii.prk

Cadet
Joined
May 3, 2023
Messages
2
I found a solution.

root@zabbix:~# snmpwalk -v2c -c public <host_address> .1.3.6.1.4.1.50536.1.1.1.1.3
iso.3.6.1.4.1.50536.1.1.1.1.3.1 = STRING: "ONLINE"
iso.3.6.1.4.1.50536.1.1.1.1.3.2 = STRING: "ONLINE"

So the Zabbix template just needs to be fixed.
 

dauntless101

Cadet
Joined
Mar 21, 2017
Messages
7
Sorry, what did you change in the template to fix this? Thanks

I updated to the latest version of the template from the repo, and it looks like the problem still exists today.
 

salvadorgzz

Cadet
Joined
Aug 23, 2023
Messages
1
Not @serhii.prk, but I was able to use his notes to figure out how to fix the template. I'm not sure if this is the proper fix, but it works locally. Hopefully this helps whoever runs into this problem next.

1. Open the template. (Go to Data collection / Templates and search for "truenas".)
2. Find the ZFS pools discovery rule. (In the discovery rules section)
3. In item prototypes, edit the "TrueNAS: Pool [{#POOLNAME}]: Health" prototype. I changed the SNMP OID to ".1.3.6.1.4.1.50536.1.1.1.1.3.{#SNMPINDEX}", and the Type of information to Text. Then save/update.
4. Edit trigger prototypes. (Since the type of info changed to text, the old trigger expression needs updating.) Look up the trigger prototype called "TrueNAS: Pool [{#POOLNAME}]: Status is not online". Change the expression to:
last(/TrueNAS by SNMP/truenas.zpool.health[{#POOLNAME}]) <> "ONLINE"

I don't know if this was needed, but to see the updates applied quickly, I went to the host, used "unlink and clear" on the template in the templates section, and then added it back again.
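To show why the trigger change matters, here is a rough Python sketch of the old and new comparisons. It only mimics the two expressions quoted in this thread; it is not how Zabbix evaluates triggers internally, the function names are made up, and "DEGRADED" is just an illustrative non-ONLINE value.

```python
def old_trigger_fires(last_value: int) -> bool:
    """Mirrors the original expression: last(...) <> 0"""
    return last_value != 0


def new_trigger_fires(last_value: str) -> bool:
    """Mirrors the updated expression: last(...) <> "ONLINE" """
    return last_value != "ONLINE"


# Column 7 value reported earlier in the thread: the old trigger always fires.
print(old_trigger_fires(164410417152))

# Column 3 value reported earlier in the thread: the new trigger stays quiet.
print(new_trigger_fires("ONLINE"))

# Illustrative unhealthy state: the new trigger fires as intended.
print(new_trigger_fires("DEGRADED"))
```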
 