Thor is messing with my UPS

Or: Why are my battery status LEDs blinking all the time?

This post is related to a post from 2017 titled The Tale of Thor’s angry electrons. Related in that it takes place at the same location in the western part of Norway where my family lives. Most of the equipment referenced in the previous post has been replaced by now. Most notably, the old ADSL internet line has been replaced by a long-distance fiber line. That reduced the number of internet outages considerably. The power line and transformers were also updated at some point. I do not remember if this was before or after the incident chronicled in 2017, but it gave a massive improvement in power delivery. Before the upgrade, multi-day outages were not uncommon during the rainy season. And in this part of Norway, the rainy season never ends, unless it is replaced by a short-lived snowstorm or a massive heat-wave, that is, a couple of days with temperatures above 20 degrees C.

But back to the internet. There will be no internet without power. But wait, we have mobile phones and laptops, I hear you say. Well, the mobile phone talks to a base station. This base station requires power to operate. So even if we have a handy dead dinosaur converter that creates enough electricity to keep the fish frozen and the laptops charged, without power to the base station and the local internet distribution point there will not be any internet. Neither the magic floating wireless internet nor the more traditional and stable wired variety coming out of the wall.

As you would know if you have read the previous chronicle, we have employed several measures to ensure a stable Internet connection (and power delivery). One of those measures is an APC SmartUPS 1500. It makes sure that the core network components receive clean power, and it provides backup power for at least 30 minutes. As a line-interactive UPS it is definitely a massive overkill for a residential building, but it has done everything asked of it without complaining since it was made in 2008.

I chose the APC SmartUPS series because I have only ever seen one that was utterly destroyed. It was connected to a network switch in the engine compartment of a massive cargo ship and had been subjected to “a small amount of water”. It still tried its best though. It didn’t care that the batteries had expanded inside the battery compartment and had to be removed using a crowbar and a hazard suit. Fitted with a new-ish battery it provided output, but the charging circuit was destroyed. Sadly no pictures, this was a long time ago.

But enough reminiscing, let us move forward in time to sometime before Christmas 2021. The batteries inside the UPS were no longer deemed functional, and the UPS announced this by beeping constantly and igniting the red “Battery Fault LED”. This is an impressive feat by the way, as to the best of my knowledge they were the original batteries from 2008. 2-5 years is a normal lifetime for a pack of UPS batteries. Thor and his wayward electrons were still the primary suspects though, as this incident happened at the peak of thunder-season. That is, an extra nasty part of the ever constant rainy season.

A new original battery pack was ordered. These are relatively easy to make yourself if you have access to the cells by the way. They are usually some kind of Yuasa 12V lead-acid cell, and this particular pack needs two of them. They were of course all on backorder though due to the ongoing supply-chain issues. Some snooping around on the interwebs revealed a nice deal on the original battery pack, so it was put on order and arrived around three weeks later. I did not replace it myself, but it does not in any way require a rocket surgeon:

  • Yank off the plastic front cover without destroying it
  • Unscrew two Phillips no. 2 screws holding the battery cover/hold-down in place
  • Remove the cover
  • Pull out the old battery
  • Disconnect the battery cable.
  • Reassemble in reverse.
From the APC manual. UPS shown with the front panel in place.
From the APC manual. Note the front panel on top. The battery hold-down is removed and not shown. It is held in place by two screws at the top and latches into the main body at the bottom.

Problem

Finally we have arrived at the problem from the title. The story so far may be slightly interesting, but replacing a UPS battery pack is easy as long as you are able to lift the battery (it is heavy) and know how to operate a Phillips no. 2 screwdriver, the most common screwdriver in existence. The problem was as follows:

After the battery pack was replaced, the battery fault warning light was extinguished as expected. The battery charge indicators were however blinking constantly even after the battery was fully charged.

Note that the markings are different from those in the manual. The LED functions are however identical. APC changed from pictograms to text at some point.

Analysis

The blinking lights indicate that the battery is not powerful enough to supply the minimum amount of runtime for the connected load. The minimum runtime setting was still at its default value of 2 minutes. I seem to remember an indicated runtime in excess of 30 minutes when the UPS was installed, and the load has been reduced significantly since then as the connected equipment has been upgraded with newer, less power-hungry models. Thus, something is very wrong and the aforementioned wayward electrons are still prime suspects.

From what I can tell, when a new battery is fitted there are several tests performed during a more or less normal self test. The UPS performs such a test every time it is turned on, and you can trigger a test by pressing the On/test button if the UPS is already running. A self test is also triggered periodically every 14 days or so, depending on settings and model.

Note: This paragraph contains speculation based on testing and information from memory. There are at least two battery monitoring systems, one of which checks the general battery condition, and the more interesting one in this case that calculates the battery capacity as runtime for the current load. The runtime capacity of the battery is stored as a “Battery constant”. This value is used to calculate the runtime for varying loads. As the battery ages, this constant will change to reflect that the battery gets weaker over time. The constant should automatically revert to the default value when a new battery is detected. It should; but this can sometimes fail. When that happens you may experience a situation where the calculated runtime for the current load is 0 minutes.

When the runtime is 0 minutes (and perhaps when it is less than the aforementioned minimum runtime of 2 minutes), the self test fails (indicated in the UPS log). This probably blocks a reset of the battery constant, as said constant is supposed to be reset as part of a successful self test.

Speculation aside, what is needed is a reset of the battery constant to its default value. The actual value and how to set it manually is a closely guarded secret by the way, but APC has an article detailing the recommended steps to correct the situation here. To summarize:

  • Reset the minimum runtime value (default 2) to a lesser value. No use, as 0 (the measured value) is not greater than 0 (the lowest permissible value).
  • Replace the battery. Already done, and the battery voltage was a constant 27V or thereabouts which is a nice value for a fully charged 24V battery on trickle charge.
  • Perform a manual calibration with a constant load of no less than 30%. Now this looks promising, as the normal load is around 10%. Maybe it needs a heavier load?
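The low-runtime check behind the blinking LEDs can be sketched in a few lines of Python. This is only an illustration of how I read the first bullet: the function name is made up, and the assumption is that the warning clears only when the estimated runtime is strictly greater than the configured minimum. The real firmware logic is not documented here.

```python
DEFAULT_MIN_RUNTIME_MIN = 2  # default minimum runtime mentioned in the post


def low_runtime_warning(estimated_runtime_min: float,
                        min_runtime_min: float = DEFAULT_MIN_RUNTIME_MIN) -> bool:
    """Return True when the estimated runtime is not greater than the minimum.

    Assumption: the UPS is only satisfied when runtime > minimum.
    """
    return not (estimated_runtime_min > min_runtime_min)


# Faulty state: calculated runtime is 0 minutes, minimum left at the default.
print(low_runtime_warning(0))
# Lowering the minimum to 0 does not help, because 0 is not greater than 0.
print(low_runtime_warning(0, min_runtime_min=0))
# The roughly 30 minutes remembered from installation would clear the warning.
print(low_runtime_warning(30))
```

Under that assumption, no minimum runtime setting can clear the warning while the calculated runtime stays at 0, which is why the first bullet is a dead end.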

Manual calibration attempt

I turned off all the loads, disconnected them and connected a heater element that pulled a constant 50% load and disconnected the input power. A few minutes or rather seconds later the UPS shut down. I removed the heater element and restarted the UPS without a load and let it recharge. It recharged in about an hour, but after reconnecting the load the calculated runtime was still 0 minutes.

An interesting tidbit is the fact that when I re-ran this test with the normal 10% load and PowerChute open, the battery voltage was more or less constant throughout the test, even though the charge sank to a reported 5%. I do not know what charging algorithm is employed in the UPS, but it is normal for the voltage to vary while charging. Let us compare to a normal lead-acid car battery which is basically the same type of cell. A depleted battery with just enough charge to get the engine running will usually start off with a charging voltage around 14.9 V, falling towards 13.9 V as the battery is charged. (Values will vary by car model.) Be aware that these values are measured while the charger is connected. If you disconnect a fully charged new 12 V lead-acid battery and let it stabilize, the voltage will usually be somewhere around 12.9 V. As the UPS battery pack contains two 12 V cells in series, a fully charged new battery should be around 25.8 V without a charger or load connected.
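The series arithmetic is simple enough to do in your head, but here is a small Python sketch of it for reference. The per-unit figures are the car-battery values quoted above, so the charging numbers are only a rough guide; the charger inside the UPS may well use different voltages. The function name is just for illustration.

```python
def series_voltage(per_unit_volts: float, count: int = 2) -> float:
    """Voltage of identical 12 V units wired in series (voltages add up)."""
    return per_unit_volts * count


# Per-unit values quoted in the text (car battery figures, assumed comparable).
figures = [
    ("resting, fully charged", 12.9),
    ("charging, depleted", 14.9),
    ("charging, nearly full", 13.9),
]

for label, volts in figures:
    print(f"{label}: {series_voltage(volts):.1f} V for the two-unit pack")
```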

Logic reset

I came across a video from APC support (APC is owned by Schneider) showing how to reset the internal logic in the UPS. You basically disconnect everything, remove the battery and press the power button for 10 seconds. It did not change anything.

Talking to the UPS

This UPS model has three forms of communication with the outside world:

  • A USB cable talking to PowerChute
  • A web-server/SNMP expansion card (not equipped on this one)
  • What looks like an RS232 serial cable, but is not.
Rear view from the APC manual

I know that the USB cable works, as it is connected to a computer running PowerChute (the official software from APC). I tried using the serial port, as internet rumors claimed that you could talk to the UPS using PuTTY. When I connected a standard null modem cable, the UPS promptly shut down. No warning, it just died and refused to restart. This is when I became convinced that Thor had finally claimed another victim. These units are expensive, and I quickly found out just how expensive they had become in the current supply chain. And yes, they are still made. In a newer version of course, but basically the same unit.

So I decided to dig deeper. A search through the APC forums revealed that this was an intended result. You see, APC have special null modem cables for this purpose that can be connected to the UPS without triggering the kill switch. Whether or not such a cable was included with this model is unclear, but if so we could not locate it. They were moderately expensive, and also on backorder everywhere. We can only speculate as to what the intention of such a kill switch was originally, but not having the cables in stock does not contribute to sales. Note: Removing the null modem cable allowed the UPS to be restarted.

I shall spare you the long story about “The quest for a 940-024C cable”. TLDR: I made my own cable. This involved importing parts from the UK and two trips to a local supplier (still 50 clicks away), but surprisingly it was all done in just three days. The cable is not difficult to make with the right parts and equipment, just search for 940-024C pinout. Be aware that there are a couple of different models of this cable with other part numbers, and different UPS models may require a different variant.

Custom 940-024C with an RS232 to USB adapter
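With the right cable in place you can also query the UPS from a script instead of a terminal program. The Python sketch below is not from APC. The single-character commands, the "SM" reply and the reply formats are assumptions taken from unofficial write-ups of the serial protocol (such as the apcupsd manual), and the function names are my own. It only reads values and changes nothing. Remember that a standard null modem cable will shut the UPS down.

```python
# Single-character commands as described in unofficial protocol write-ups.
# Treat these, and the example reply formats, as assumptions.
ENTER_SMART_MODE = b"Y"            # expected reply: "SM"
READ_COMMANDS = {
    "battery_voltage": b"B",       # e.g. "27.20"
    "battery_charge_pct": b"f",    # e.g. "100.0"
    "runtime_min": b"j",           # e.g. "0000:" (minutes, trailing colon)
    "load_pct": b"P",              # e.g. "010.4"
}


def ask(port, command: bytes) -> str:
    """Send one command and return the reply line without CR/LF."""
    port.write(command)
    return port.readline().decode("ascii", errors="replace").strip()


def read_status(port) -> dict:
    """Enter smart mode and read a few values as floats.

    `port` is anything with write() and readline(), for example a
    pyserial Serial object opened at 2400 baud.
    """
    if ask(port, ENTER_SMART_MODE) != "SM":
        raise RuntimeError("UPS did not answer 'SM', check cable and COM port")
    status = {}
    for name, command in READ_COMMANDS.items():
        status[name] = float(ask(port, command).rstrip(":"))
    return status


# Usage with pyserial (third-party; "COM3" is a hypothetical port name):
#   import serial
#   with serial.Serial("COM3", 2400, timeout=2) as port:
#       print(read_status(port))
```

Saving the returned dictionary before and after any fix attempt gives you the same kind of record as taking pictures of the screen.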

Solution

To reset the battery constant I used a tool called APC-FIX. It appears that the problem was widespread enough with this generation of UPS that someone in Belarus made the effort to create a specialized tool for the job. The documentation I could find is all written in Russian, but scroogle and bing have nice translators available that make it somewhat readable.

Action plan

Pictures below.

Warning: This action plan details running third party non-approved software that changes settings not supposed to be user accessible. If you mess around with this tool you could brick your UPS. Take pictures along the way so you know what the values were before you or the program changed them. Also, be prepared for a power outage at any time.

  • Disconnect the USB cable
  • If you are using a computer that has PowerChute installed, disable any running services.
  • Get a hold of an RS-232 adapter or one of those rare computers that are still delivered with a built-in port. USB adapters work fine as long as you get one with a genuine Prolific chip. Fake chips that stop working due to driver problems have been an issue.
  • Get, find or make a 940-024C cable.
  • Remove the SNMP/Network card from the UPS if such a card is installed.
  • Connect the serial cable and find the port number in device manager.
  • Tell APC-FIX what com port you are using and activate Battery const auto fix mode.
  • Press Connect and wait for the program to work its magic. The UPS will beep and make noises while this process is running.
  • If it fails, make sure the battery self-identifies as 100% charged and try again.
  • I had to run it twice before a valid constant was set. The first run cleared the constant and gave me a valid runtime. After running a self-test, I re-ran APC-FIX and it successfully calculated a valid constant.
  • Stop APC-FIX and disconnect the serial cable
  • Re-connect the USB cable and re-install the SNMP/network card if you removed it.
  • Restart PowerChute and check the runtime.

You can find an unofficial list of APC UPS register/constant values here: https://kirbah.github.io/apc-ups/UPS-constants/

Pictures

Before we start, the runtime is at 0 minutes and const 0 is set to 09
Set the COM-port and enable Auto fix
After the first try const 0 is blank and the runtime starts to rise with the battery charge.
After the second and final attempt the constant is set to A1 (the default value for this model), and the runtime is almost two hours.

Author: DizzyBadger

SQL Server DBA, Cluster expert, Principal Analyst
