FN 11RC1 sending test to alert services

Status
Not open for further replies.

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
Installed 11RC1 last night on my backup FN box, love being able to just switch trains to do the upgrade. Looking through the new features so I can hopefully provide some useful feedback.

Trying to set up the alert services - I can test Slack and PagerDuty and have set up targets on the receiving end. Is there doc on setting up these services that I'm missing? I looked at http://doc.freenas.org/11/system.html#alert-services but it's pretty rudimentary. I was unsure about the "Cluster name" field for Slack.

I also don't see any way to test these integrations like you can with an email, so is there any easy way to generate a warning or critical alert that will allow me to confirm these things are working? I'd like something a bit less destructive than pulling a drive as I do send replication to this host.

I opened https://bugs.freenas.org/issues/23928 for sending a test message.
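
Until a built-in test exists, one way to at least rule out the receiving end is to post a message to Slack directly, independent of FreeNAS. The following is a minimal sketch, assuming the Slack target is a standard incoming webhook that accepts a JSON body with a "text" field. The webhook URL is a placeholder and the function names are made up for illustration. It says nothing about how FreeNAS itself talks to Slack.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10.0):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder URL (assumption): substitute the webhook URL from your
    # own Slack workspace before sending anything.
    req = build_slack_request(
        "https://hooks.slack.com/services/T000/B000/XXXX",
        "Test message: checking the Slack side of the alert path",
    )
    print(req.get_method(), req.full_url)
    # Uncomment to actually send the message:
    # print(send(req))
```

If the message shows up in the channel, the Slack side is fine and any remaining problem is on the sending side.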
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
So my test box very considerately has disk errors:

Code:
Device: /dev/ada3, 2 Offline uncorrectable sectors


I'm getting emailed very aggressively, once a minute. I haven't gotten anything through PagerDuty or Slack, however. Is there anywhere to look for debugging output on this? I find nothing matching "slack" or "pagerduty" in /var/log.
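
For reference, the kind of search described above can be sketched as a small script that walks a directory and reports every line mentioning either service, ignoring case. This is only an illustration. The function name is hypothetical, and it does not assume the alert services actually log anywhere under /var/log.

```python
import os
import re

# Case-insensitive match for either service name.
PATTERN = re.compile(r"slack|pagerduty", re.IGNORECASE)


def search_logs(root, pattern=PATTERN):
    """Yield (path, line_number, line) for each matching line under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps odd bytes from aborting the scan.
                # Compressed rotated logs are read raw, not decompressed.
                with open(path, "r", errors="replace") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            yield path, lineno, line.rstrip("\n")
            except OSError:
                # Skip files that cannot be opened (permissions, sockets).
                continue


if __name__ == "__main__":
    for path, lineno, line in search_logs("/var/log"):
        print(f"{path}:{lineno}: {line}")
```

An empty result only means nothing was logged under that directory, not that nothing was attempted.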
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
Noticing a whole bunch of no fox being given for this, so probably talking to myself here, but...

Created another PagerDuty integration using API v2 and unack'd the failing disk alert. Immediately got emails, but nothing via PagerDuty and still nothing via Slack. Still no idea how to debug this, and there's no doc explaining how to set it up, so I don't know if I'm doing it wrong or it's just plain broken.
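
One way to narrow this down is to confirm the integration key itself works by sending a trigger event straight to PagerDuty, bypassing FreeNAS. A minimal sketch follows, assuming the key belongs to an Events API v2 integration. The function names, the key, and the source name are placeholders. A key created for a different integration type may need the older v1 endpoint instead, and nothing here establishes which one FreeNAS uses.

```python
import json
import urllib.request

# PagerDuty Events API v2 endpoint.
EVENTS_V2_URL = "https://events.pagerduty.com/v2/enqueue"
SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key, summary, source, severity="warning"):
    """Return the JSON-serialisable body of an Events API v2 trigger event."""
    if severity not in SEVERITIES:
        raise ValueError(f"severity must be one of {sorted(SEVERITIES)}")
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }


def build_request(event):
    """Build (but do not send) the POST request carrying the event."""
    return urllib.request.Request(
        EVENTS_V2_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder key and source (assumptions): use your own integration key.
    event = build_trigger_event(
        "YOUR_INTEGRATION_KEY", "Test alert from backup NAS", "backup-nas"
    )
    print(json.dumps(event, indent=2))
    # Uncomment to actually send the event:
    # with urllib.request.urlopen(build_request(event), timeout=10) as resp:
    #     print(resp.status, resp.read().decode("utf-8"))
```

If PagerDuty opens an incident from this, the key and service are good and the problem is on the sending side.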
 

Bonnie Follweiler

QA Technician
iXsystems
Joined
May 23, 2016
Messages
35
Hello, you're not talking to yourself. :D
I am in the process of setting these up on my system. I will keep you updated on my progress, and I would appreciate it if you would continue to post updates as you go along.
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
I tried a couple things this morning with no luck - so I opened bug reports. I then noticed that my monitoring was informing me of an update. Updated to 11.0-RC2 and my Slack integration worked, so I updated that bug report (24018) and they closed it.

Still no dice from PagerDuty, tried a couple things with no luck - the associated issue is https://bugs.freenas.org/issues/24017

One thing that's different with RC2 is I'm not getting emailed every minute, which is good... but I'm wondering if there is now a timeout on all alerts. In RC1 I'd get emailed every minute until I went to the GUI and unchecked the alerts on the failing disk. If RC2 only sends notifications once for each critical alert, and my PD key was wrong (still unsure if it uses API v1 or API v2) then it might have marked it as sent and is not re-sending. That's completely guesswork.

Looked into the logs again on RC2 and still don't see anything that seems relevant for debugging the Slack or PD notifications.
 

Bonnie Follweiler

QA Technician
iXsystems
Joined
May 23, 2016
Messages
35
Thank you for the update. I am also working on PagerDuty and OpsGenie, and now I will add Slack to the list as well (thank you for the info on that).
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
I managed to get Slack working as noted above, and when I found out that the PD integration should use the Consul integration, I generated a new key for that, and that actually worked. Didn't know that FreeNAS uses Consul for monitoring, good to know.

I've forked and checked out the FreeNAS doc repo, and if I can figure it out, I'll update the docs for at least those two services.
 

Bonnie Follweiler

QA Technician
iXsystems
Joined
May 23, 2016
Messages
35
We must have found out about the Consul integration around the same time. I also FINALLY got PagerDuty to work. I had to create a warning message to check it and, when I attempted it, I had left out several lines of code. Once I figured that out, thanks to the help of a patient friend, it worked.

Thank you for volunteering to update the docs.
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
No worries on updating the docs, but no promises as I have to figure out yet another format for writing docs and then how to test it. Will do what I can, though.
 

wblock

Documentation Engineer
Joined
Nov 14, 2014
Messages
1,506
If you like, state the information you want to include and I'll add it to the docs.
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
Sorry I haven't gotten to this - things have gotten busy for me and I haven't been able to make the time yet.
 