APC UPS not powering back on properly...

Status
Not open for further replies.

rruss

Dabbler
Joined
Sep 14, 2015
Messages
35
Hello,

I'm running FreeNAS-9.3-STABLE-201511040813 with a USB connection to an APC SmartUPS 3000. The UPS is configured using the "APC ups 2 Smart-UPS USB USB (usbhid-ups)" driver in master mode. Several machines communicate with the FreeNAS box in slave mode. Everything seems to be working great - I can query details from the master or slaves, and everyone gets console notifications of communication errors (forced when I change the UPS config in the GUI).

A little background info - I have 3 outlet groups set up to do a sequenced power up.
Group 1 - comes on immediately for networking gear
Group 2 - comes on 2 minutes later powering the FreeNAS server
Group 3 - comes on 8 minutes later powering other servers - after FreeNAS has had time to boot and NFS mounts are available.

For testing purposes, I have FreeNAS set to initiate a powerdown when "UPS goes on battery" instead of on low battery, to avoid running the battery down just to test the procedure.

After setting everything up, I simulated a power failure by pulling the plug on the UPS. All of the servers shut down (slaves and FreeNAS master), although I wasn't happy with the sequence. FreeNAS didn't give the other servers time to shut down - but I'm sure I can find the parameter governing this timing. The key is to have the system commit to the power down once it is initiated: once the slaves have been notified, there is no turning back.

Here is my problem... The FreeNAS gives the UPS the shutdown command on its way down, and the UPS powers everything off, and sits in what looks like a deep sleep mode. When I reconnect the power, the UPS turns back on immediately (as expected, since the battery wasn't drained), but instead of going through the 10 minute sequence of powering on the outlet groups, all 3 groups just come on immediately.

This leads to servers booting before the NFS mounts are available, etc... not the desired behavior.

I think it may have something to do with the actual command that is being issued by the driver to shut the UPS down. Has anyone else encountered this type of issue?

Thanks.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
What makes you think it has something to do with the command sent from FreeNAS? Do you know that the UPS works correctly if you use APC's software?
 

rruss

Dabbler
Joined
Sep 14, 2015
Messages
35
That is a fair question...

I intend to troubleshoot more in my own setup, and I'll update here if the PowerChute software yields a different result. That will take some time to schedule, however. I was hoping that someone else with the APC Smart UPS family could comment on whether they had good results with sequenced power-ons and, if so, whether they did anything to their NUT configuration to get that result.
 

rruss

Dabbler
Joined
Sep 14, 2015
Messages
35
Ok - I connected the UPS to a Windows 8 machine running the PowerChute Business Edition agent/server/console programs. From their software, I can see the power-on delays that I programmed at the front panel.

I have their software set to shut down the computer (and UPS) 1 minute after power failure. When I pulled the plug on the UPS, it continued running for 60 seconds, then the PC initiated a shutdown, and after the programmed delay the UPS shut off as well. Only this time, instead of sitting in sleep with "Waiting for AC" on the UPS front panel display and the LEDs sequencing, it sat there with "Outlet Group 2 on by USB in: 120s", and nothing happened. Once AC power was restored, the 120s counted down to 0 and the outlet group turned on, and after their programmed delays the other outlet groups turned on as well.

So...

With Powerchute, it sits with a message of "Outlet Group 2 on by USB in: 120s" and works correctly when AC is restored.
With NUT (using usbhid-ups driver), it sits with a message of "Waiting for AC", and ignores power-on delays and just turns everything on at once when AC is restored. :(

This appears to be a limitation of the usbhid-ups driver, and not specifically an issue with FreeNAS. I may try to use NUT and this driver from a generic linux machine to see if I can get it working properly when I have more control over the NUT setup (FreeNAS doesn't give you free access to the config files and overwrites them if you set anything directly in the files). But since the driver is told to shut down the UPS, and the driver decides which low-level command to send to do that, I imagine the UPS will always wind up in this "Waiting for AC" state when turned off by this driver.

Options as I see them -
1) Figure out if there are settings for usbhid-ups that will put the UPS in the same state that PowerChute does, or modify the driver so that it does. This may require editing the source and lots of experimenting.
2) Hope APC sees this as an actual bug and provides a firmware update to change the UPS behavior when AC power is restored in this state. Highly unlikely as they don't "support" anything but powerchute, and their tech support hung up on me when I tried to get them to connect me to someone that might know more about the system.
3) Stop using power on delays and configure my servers directly to delay their boot sequence to wait for the file shares on the freenas box to be available before they finish booting.
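
For option 1, a first step could be to dump what the driver exposes and look for delay-related variables. Below is a minimal sketch that parses `upsc` output (one "name: value" pair per line) and keeps only names with a `delay` component. The sample text is hypothetical and was not captured from this UPS; the usbhid-ups man page also documents `ondelay` and `offdelay` driver options, but whether they affect outlet-group sequencing on this model is untested here.

```python
def parse_upsc(output):
    """Parse `upsc` output ("name: value" per line) into a dict."""
    variables = {}
    for line in output.splitlines():
        name, sep, value = line.partition(": ")
        if sep:  # skip lines that are not "name: value"
            variables[name.strip()] = value.strip()
    return variables


def delay_variables(variables):
    """Keep only variables whose dotted name has a 'delay' component."""
    return {k: v for k, v in variables.items() if "delay" in k.split(".")}


# Hypothetical sample, NOT real output from the SmartUPS in this thread.
SAMPLE = """\
battery.charge: 100
ups.delay.shutdown: 20
ups.delay.start: 30
outlet.2.delay.start: 120
ups.status: OL
"""

delays = delay_variables(parse_upsc(SAMPLE))
```

On a live system the text would come from something like `subprocess.run(["upsc", "ups@localhost"], capture_output=True, text=True).stdout`, where `ups@localhost` is an assumed UPS name.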
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Depending on exactly what you're doing, delaying boot sequences to wait for a file server can be really bad. It can prevent you from bringing your servers up at all if the file share isn't available.

A better solution would be to wrap your application in a program that checks for the presence of the file share. If it doesn't find the share, it sleeps for X mins, and then tries to reconnect. Once it connects, it lets the program launch. That way, your computer can completely boot, but your application won't be running without the share.
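
A rough sketch of that kind of wrapper, assuming the share shows up as a mount point (something else, such as /etc/fstab plus a retry of `mount -a`, or an automounter, is assumed to do the actual mounting). The function names, the 60 second retry, and the `/mnt/nas` path in the comment are all made up for illustration.

```python
import os
import subprocess
import time


def wait_for_mount(path, retry_seconds=60, max_attempts=None,
                   is_mounted=os.path.ismount, sleep=time.sleep):
    """Poll until `path` is a mount point.

    Returns True once the mount is present, or False if `max_attempts`
    checks have been made without success. With max_attempts=None it
    polls forever.
    """
    attempt = 0
    while True:
        if is_mounted(path):
            return True
        attempt += 1
        if max_attempts is not None and attempt >= max_attempts:
            return False
        sleep(retry_seconds)


def run_when_mounted(mount_point, command):
    """Wait for the share, then launch the application and return its exit code."""
    wait_for_mount(mount_point)
    return subprocess.call(command)


# Hypothetical usage: run_when_mounted("/mnt/nas", ["myapp", "--flag"])
```

The machine boots normally either way; only the wrapped application is held back until the share is there.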
 

rruss

Dabbler
Joined
Sep 14, 2015
Messages
35
When I say "delay" the boot sequence, I really mean sleeping for a pre-determined amount of time to give the freenas box time to make the file shares available... Not to actually wait for the shares. If they don't come up, the machines would boot normally.

What I want to avoid is needing to manually log into each machine and "fix" the NFS mounts because they didn't get mounted at boot time.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
For Windows, you can set boot time tasks that will start a batch file that will run until it connects to the NFS mount. For Linux, you can do similar things with a shell script.
 

rruss

Dabbler
Joined
Sep 14, 2015
Messages
35
Hi Nick2253 - Thanks. I know how to script all of these things in Linux. The point is, there is a clean way to do things, and there is the way to hack things to work. The clean way is to let the box simply boot and mount all of the filesystems defined in /etc/fstab. If I can get a clean delayed boot for my servers, this is much preferred over maintaining a shadow file of filesystem definitions and then having a script run on boot up to look for the NFS mounts and issue the mount commands one by one to mount everything once it sees them.
 

rsquared

Explorer
Joined
Nov 17, 2015
Messages
81
The easier way may be to just set a longer timeout in the boot loader... Assuming Grub since you mentioned Linux, edit /etc/default/grub and set the GRUB_TIMEOUT variable to something like 120.
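
If there are several servers to change, the edit could be scripted. This is only a sketch of the text substitution: `set_grub_timeout` is a made-up helper that takes the file contents as a string and returns the modified text, leaving reading and writing /etc/default/grub (as root) to the caller.

```python
import re


def set_grub_timeout(config_text, seconds):
    """Return config_text with GRUB_TIMEOUT set to `seconds`.

    Replaces an existing uncommented GRUB_TIMEOUT= line, or appends one
    if none is present. Other variables (including GRUB_TIMEOUT_STYLE)
    are left untouched.
    """
    new_line = f"GRUB_TIMEOUT={seconds}"
    pattern = re.compile(r"^GRUB_TIMEOUT=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(new_line, config_text)
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"
```

The change only takes effect after the GRUB configuration is regenerated, typically with `update-grub` on Debian/Ubuntu or `grub2-mkconfig -o /boot/grub2/grub.cfg` on RHEL-style systems.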
 

rruss

Dabbler
Joined
Sep 14, 2015
Messages
35
rsquared - that's a very good idea. Just delay the entire boot process from starting... which is essentially what delaying the power-on at the UPS would do.
 

rruss

Dabbler
Joined
Sep 14, 2015
Messages
35
So, for completeness' sake - here are the conclusions I came to and the solution I settled on to get the desired behavior.

Using the generic usbhid-ups driver, my particular UPS (APC SMX3000LV) goes into a different shutdown state than it does when commanded by the powerchute software. This state does not follow the proper power-on delay sequences that had been programmed into the UPS. APC - This is a bug, you should follow these delays any time the outlet group goes from off to on, period. But, I won't hold my breath for a firmware update and I need an immediate solution. Here is what I've settled on:

UPS Setup:

  • I've set the "on" delays on all outlet groups to 0.
  • I've set the "off" delays to 3 minutes (it does respect this timing when it shuts down via NUT) to make sure FreeNAS has plenty of time to finish syncing and powering off after it kills the UPS.
  • Turn on is programmed to hold off until battery has enough charge to allow everything to reboot, and be shut back down again cleanly. (This still needs to be tested, hopefully the UPS respects this timing)
FreeNAS setup:
  • I programmed it to trigger shutdown when "UPS goes on battery", and set the shutdown timer to 600 seconds.
    • This time has to be long enough to make sure that if power has failed again on a reboot, that the linux servers have had enough time to boot and hear the command to shut down again! Otherwise, they'd crash when FreeNAS eventually turns off the UPS.
    • Since FreeNAS takes 7 minutes to boot, and the linux boxes delay boot by 9 minutes (see below), FreeNAS can get the power-fail notice as soon as 2 minutes before the linux boxes start booting. They take about 5 minutes to boot, so there are a few minutes of margin there.
  • Due to a bug in FreeNAS, this also sets the "FINALDELAY" in upsmon.conf to 600 seconds. So after the attached NUT slaves are told to shut down, FreeNAS will wait another 600 seconds before shutting itself (and the UPS) down.
    • and once the UPS is told to shut down, it waits another 3 minutes as per the UPS setup itself.
Linux servers setup:
  • The linux servers are all NUT slaves to FreeNAS. When they are told to shut down they simply shut down - happy that the NFS mounts provided by FreeNAS are still stable and present during that process.
  • As per rsquared's suggestion, I've altered the grub configuration on all of them to delay their boot by 9 minutes to give FreeNAS ample time to finish booting and have the NFS mounts up and ready before they are needed.
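
The timing margins above can be checked with a little arithmetic. This sketch uses only the figures quoted in this post and assumes the worst case: all outlets come on together at t=0 and power fails again right as FreeNAS finishes booting, so the 600 second shutdown timer starts at the 7 minute mark. The function names are made up for illustration.

```python
MINUTE = 60

# Figures quoted in the post, in seconds.
NAS_BOOT = 7 * MINUTE        # FreeNAS boot time
GRUB_DELAY = 9 * MINUTE      # boot delay on the linux servers
LINUX_BOOT = 5 * MINUTE      # linux boot time after the delay
SHUTDOWN_TIMER = 600         # FreeNAS "UPS goes on battery" timer
FINAL_DELAY = 600            # FINALDELAY in upsmon.conf
UPS_OFF_DELAY = 3 * MINUTE   # "off" delay programmed into the UPS


def reboot_margin(nas_boot, grub_delay, linux_boot, shutdown_timer):
    """Seconds between the linux boxes being up and the earliest shutdown notice."""
    linux_ready = grub_delay + linux_boot
    earliest_notice = nas_boot + shutdown_timer
    return earliest_notice - linux_ready


def notice_to_power_off(final_delay, ups_off_delay):
    """Seconds the slaves have between the shutdown notice and the outlets going dead."""
    return final_delay + ups_off_delay


margin = reboot_margin(NAS_BOOT, GRUB_DELAY, LINUX_BOOT, SHUTDOWN_TIMER)
window = notice_to_power_off(FINAL_DELAY, UPS_OFF_DELAY)
print(f"margin: {margin // MINUTE} min, shutdown window: {window // MINUTE} min")
# prints "margin: 3 min, shutdown window: 13 min"
```

A negative margin would mean the linux boxes could still be booting when the shutdown notice goes out, which is the crash case described above.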

There are always pathological cases of power failure, recovery, re-failure, etc, that aren't accounted for, but I think this covers just about everything and since FreeNAS is directly controlling the UPS, FreeNAS itself should be completely protected. The linux servers are cloneable and disposable by comparison.
 