Issue with APC BGM1500 UPS

Joined
Jan 3, 2022
Messages
3
Hi all, newb here. I did try searching but didn't find anything relevant to what I'm seeing. I'll try to post all the relevant information, but please let me know if it's not enough and I can check and add more.

Issue: After I set up the UPS service, things seemingly work for a little less than a day. Then communication with the UPS fails, producing messages in the log followed by non-stop emails about COMMBAD and NOCOMM.

Setup (I will try to include only relevant information to keep it short)
Version: TrueNAS-12.0-U7
UPS in question is APC BGM1500
It's a hand me down Skylake system though I'm not sure how relevant that is to this issue :)

UPS service setup
Identifier: ups
UPS Mode: Master (I only really care about safely shutting down this box. The router attached to it can die for all I care for now; I might try to set something up there later, once I figure this out)
Driver: usbhid-ups (which seems to be what newer APC Back-UPS units use?)
Port: ugen0.2

Timeline
I think when I first set up the UPS on Sunday, it worked for a little less than 24 hours before similar errors appeared. I don't have logs going back that far. I turned off the UPS service when the error happened, turned it back on a day later, and it seemed to be working fine. I could run upsc against it and see stats. I checked the logs and nothing UPS-related happened before 17:36 today.

Emails
17:36: (UPS notification)NOTIFICATION: 'COMMBAD'
17:36: (TrueNAS notification)New alerts: * Communication with UPS ups lost.
17:41: (UPS notification)NOTIFICATION: 'NOCOMM'
17:41: (TrueNAS notification)The following alert has been cleared: * Communication with UPS ups lost. (This is confusing to me)
17:46: NOCOMM
17:51: NOCOMM

Logs from the time things seem to have failed (with tons of repeats after that, until I turned the service off)
Jan 4 17:36:08 truenas 1 2022-01-04T17:36:08.765443-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround
Jan 4 17:36:09 truenas 1 2022-01-04T17:36:09.164442-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround
Jan 4 17:36:09 truenas 1 2022-01-04T17:36:09.164551-08:00 truenas.local upsd 76902 - - Data for UPS [ups] is stale - check driver
Jan 4 17:36:10 truenas 1 2022-01-04T17:36:10.463770-08:00 truenas.local upsmon76911 - - Poll UPS [ups@localhost:3493] failed - Data stale
Jan 4 17:36:10 truenas 1 2022-01-04T17:36:10.463785-08:00 truenas.local upsmon76911 - - Communications with UPS ups@localhost:3493 lost
Jan 4 17:36:10 truenas 1 2022-01-04T17:36:10.778441-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround
Jan 4 17:36:12 truenas 1 2022-01-04T17:36:12.801492-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround
Jan 4 17:36:14 truenas 1 2022-01-04T17:36:14.807485-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround
Jan 4 17:36:15 truenas 1 2022-01-04T17:36:15.513942-08:00 truenas.local upsmon76911 - - Poll UPS [ups@localhost:3493] failed - Data stale
Jan 4 17:36:16 truenas 1 2022-01-04T17:36:16.829435-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround
Jan 4 17:36:18 truenas 1 2022-01-04T17:36:18.649993-08:00 truenas.local collectd 76950 - - nut plugin: nut_read: upscli_list_start (ups) failed: Data stale
Jan 4 17:36:18 truenas 1 2022-01-04T17:36:18.844438-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround
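With this many repeated lines, it can help to collapse the log into "which process said what, how often, and when first". Below is a minimal Python sketch of that idea. The regular expression is an assumption based only on the line layout shown above (ISO timestamp, host, process name, PID, then " - - " and the message), and the `summarize` name is made up for illustration; other syslog layouts would need a different pattern.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
#   <ISO timestamp> <host> <process> <pid> - - <message>
# The process name and PID may be separated by a space or run together.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[A-Za-z][A-Za-z-]*?)\s*(?P<pid>\d+) - - "
    r"(?P<msg>.*)"
)


def summarize(lines):
    """Count each (process, message) pair and record when it first appeared."""
    counts = Counter()
    first_seen = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # not in the assumed layout; skip it
        key = (m["proc"], m["msg"])
        counts[key] += 1
        first_seen.setdefault(key, m["ts"])
    return counts, first_seen


# Three lines copied from the excerpt above.
SAMPLE = [
    "Jan 4 17:36:08 truenas 1 2022-01-04T17:36:08.765443-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround",
    "Jan 4 17:36:09 truenas 1 2022-01-04T17:36:09.164442-08:00 truenas.local usbhid-ups 76900 - - device->Product is NULL so it is not possible to determine whether to activate max_report_size workaround",
    "Jan 4 17:36:09 truenas 1 2022-01-04T17:36:09.164551-08:00 truenas.local upsd 76902 - - Data for UPS [ups] is stale - check driver",
]

if __name__ == "__main__":
    counts, first_seen = summarize(SAMPLE)
    for (proc, msg), n in counts.most_common():
        print(f"{n:4d}x {proc}: first at {first_seen[(proc, msg)]}: {msg[:60]}")
```

Feeding it the whole log file (for example `summarize(open(path))`, where `path` is wherever the log lives on your system) should make it easier to see whether the driver message starts before or after upsd reports stale data.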

After about 20 minutes (I was out of the house) I turned off the UPS service, waited a minute, and turned it back on. When I ran "upsc ups@localhost" I got a message, which I unfortunately cannot recall, that essentially said it could not communicate with the UPS.

Now, about an hour and a half later, I have cleared the events in the UPS and turned the service back on, and I can query the UPS from the shell again.
root@truenas[~]# upsc ups@localhost
battery.charge: 100
battery.charge.low: 1
battery.mfr.date: 1980/01/01
battery.runtime: 5141
battery.runtime.low: 120
battery.type: PbAc
battery.voltage: 27.6
battery.voltage.nominal: 24.0
device.mfr: American Power Conversion
device.model: Back-UPS BGM1500B
device.serial: 0B2044P00655
device.type: ups
driver.name: usbhid-ups
driver.parameter.pollfreq: 30
driver.parameter.pollinterval: 2
driver.parameter.port: /dev/ugen0.2
driver.parameter.synchronous: no
driver.version: 2.7.4
driver.version.data: APC HID 0.96
driver.version.internal: 0.41
input.sensitivity: medium
input.transfer.high: 147
input.transfer.low: 88
input.transfer.reason: input voltage out of range
input.voltage: 121.0
input.voltage.nominal: 120
ups.beeper.status: muted
ups.delay.shutdown: 20
ups.firmware: 31316S12-31320S10
ups.load: 7
ups.mfr: American Power Conversion
ups.mfr.date: 2020/10/27
ups.model: Back-UPS BGM1500B
ups.productid: 0002
ups.realpower.nominal
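
For anyone who wants to catch the exact moment the data goes stale, here is a small Python sketch that parses `upsc` output like the above and checks for the stale error. The function names (`parse_upsc`, `is_stale`, `query`) are made up for illustration, and it assumes `upsc` is on the PATH and that the error text contains "Data stale" in some capitalization.

```python
import subprocess


def parse_upsc(text):
    """Turn `upsc` output ("name: value" per line) into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # skip lines without a "name: value" shape (e.g. cut-off ones)
            values[name.strip()] = value.strip()
    return values


def is_stale(text):
    """True if the output contains the stale-data error (case-insensitive)."""
    return "data stale" in text.lower()


def query(ups="ups@localhost"):
    """Run upsc and return stdout and stderr combined, so errors are not lost."""
    result = subprocess.run(["upsc", ups], capture_output=True, text=True)
    return result.stdout + result.stderr


# A few lines copied from the output above, including the cut-off last line.
SAMPLE = "battery.charge: 100\nbattery.runtime: 5141\nups.realpower.nominal\n"

if __name__ == "__main__":
    print(parse_upsc(SAMPLE))
    print(is_stale("Error: Data Stale"))
```

Calling `query()` from a cron job or a loop and logging the time whenever `is_stale(...)` returns true would give a record of when each dropout starts and ends, which is useful to compare against the driver messages in the log.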

What kind of troubleshooting can/should I do now? Any suggestions would be welcome. Thank you!
 
Joined
Jan 3, 2022
Messages
3
Well, it happened again this morning. The response I get when I query the UPS is "Error: Data Stale". The email notification is COMMBAD, followed by a bunch of NOCOMM. Curiously, the non-UPS-specific notification still says "communication with UPS lost" is cleared about 5 minutes after the issue begins, but the issue remains.
 

bearattack11

Dabbler
Joined
Jan 1, 2021
Messages
11
I’m having the same issue with a CyberPower OR700LCDRM1U with the usbhid-ups driver. I get an alert, which then seems to clear itself after a few minutes. During the error state, when I try to run the command I also get an “Error: Data Stale” response. Once the alert clears I can ping the UPS again without issue.

Did you end up finding a fix?
 
Joined
Jan 3, 2022
Messages
3
Nope. I switched to an older UPS (something like https://www.costco.com/cyberpower-1...-greenpower-technology.product.100277321.html) and it has worked flawlessly since. I went through a bunch of different drivers and they either didn't work or produced the same issue, and I was tired of troubleshooting.
 