Dark magic at the backup site

This is a story in the “Knights of Hyper-V” series, an attempt at humor with actual technical content hidden in the details.

It was a nice Friday afternoon, and the Knights of Hyper-V were holding an end-of-week council meeting. The mood was light, as cunning and not-so-cunning plans were laid for the days ahead. There were misbehaving servers in need of an attitude adjustment, and misbehaving minions in need of a reality check. Just as the head knight was delivering a lengthy speech on the proper use of Virtual Machine Queue incantations and the correct grammar for Receive Side Scaling spells of power, a stressed-out envoy from the all-seeing monitors arrived and demanded an immediate audience. They had lost all communication with one of the ghosts in the offsite backup dungeon. “Not another network armor issue!” exclaimed one of the Knights. She knew that some of the hosts at the offsite dungeon still used the inferior (and highly unstable) Broadcom plates as a connection to The Wasteland of Nexus.

The Knights leapt into action and went to interrogate the hosts. To their great surprise, the hosts were in an uproar. The cluster event scroll was gushing with blood-red critical alerts, and host 1 was completely unreachable. Undoubtedly, some dark magic was veiling this information from the gaze of the all-seeing monitors. The Knights bade the VMM daemons to put the missing host in maintenance mode. This proved to be a great mistake. As the VMM daemons tried to herd all the ghosts over to hosts 2 and 3, the cluster log cried out in red agony once more, but this time with storage problems on host 3 as well. Thus the cluster was left with only one working node, and perhaps a cursed node spewing corrupt data into the storage. The Knights had no other choice but to take it all down and put all the ghosts to sleep. What had started out as a quiet afternoon had suddenly turned into a frenzy of unruly runaway hosts and buzzards circling in the sky above the backup site.

Spells from the unholy book of Drac were recited, and a portal was opened to the dungeon, making it possible to behold the console of the unreachable host. It was in disarray, sending network packets at full speed without reaching The Wasteland of Nexus. Once more a finger was pointed at the unstable Broadcom plates. More spells were uttered, but to no avail. Then the Wizard of Badgerville entered the room and recited the first commandment of IT: “Have you tried turning it off and on again?” His voice echoed through the room. No one answered, but one of the Knights hammered away at the portal’s console, and before long the host was restarting. The Knights watched impatiently while the host rebooted. It seemed to take forever, but finally they were greeted by a logon prompt. A quick test revealed that the host was up and running, but the storage error was not gone. A minion from the storage realm was summoned. While they waited, the Knights examined the scroll of changes, hoping to uncover the reason why the network was unstable at the backup dungeon. It was easy to blame the Broadcom plates, but something just didn’t smell right. The scroll revealed that a prior from the Nexus cult had been down in the dungeon the previous evening. Exactly what dark magic he had performed there was unclear, but according to the event scrolls it had most certainly triggered a cluster failover. It did not explain the storage problem, though.

The storage minion arrived and started chanting strange storage spells into the portal. Nothing happened. Other spells were tried, to no avail. It soon became apparent that the Knights had to wake up one of the ghosts that had been put to sleep, as it was responsible for transmitting spells to the storage realm. Thus, another problem was revealed. Some rogue Knight had been fiddling with the storage settings, and the hosts now had an additional undocumented storage array connected. According to the storage minion, everything was fine. Closer inspection of host 3 revealed that such was not the case. Host 3 had not been properly configured to talk to the new array, and was firm in its belief that at least four new arrays had been connected. Because of this, storage packets were sent all over the place, causing great confusion among the hosts. The Wizard of Badgerville was consulted. He searched his notebook of spells for a long time, before suddenly chanting something about multipathing sternly into the portal. Host 3 shrieked in agony and vanished from view. Then nothing happened. The cluster log listed host 3 as down, and the Knights prepared for a trek over the icy plains to visit the host in person. Swords were sharpened, and spiked horseshoes were commissioned from the blacksmith. Just as they were ready to depart, host 3 reappeared in the portal, looking much better this time. An investigation revealed that the rogue Knight had not followed the correct procedure for connecting storage. In his haste to please the annoying service-team minions he had cut corners, and everything had appeared nice and shiny as long as no one ever tried to use host 3 for anything other than a paperweight. All the ghosts were awake, and the Knights were more than ready to call it a day, unsaddle the horses and go home. But that was not going to happen anytime soon…

The ghost that started it all still had no connection to the rest of the world. Yet another ghost was also out of touch, but the remaining twenty or so were all fine. A real puzzle indeed. Suspicion was cast on the Prior from the Nexus cult. What had he really done in the dungeon? Had he achieved his goal, or had it all failed? Was he perhaps still there, trapped in some kind of magic spider web of cables? Sadly, there was nothing the Knights could do about it. The connection to the Wasteland of Nexus was clearly up and running again, so the problem had to be somewhere inside the wasteland itself. There was no other way: another prior from The Cult of Nexus was required. A prior who spoke the dark dialect of IOS was summoned, and the ticket was handed over to him. All the Knights could do at this point was wait and test, test and wait, as the prior went further into the labyrinth of routing configuration. Finally, after several hours, all the problems were found and fixed. Exhausted, the Knights went home to rest. Whether or not the second prior ever found his way out of the maze is unknown to this day, and all attempts to contact the first prior have so far been unsuccessful.

Author: DizzyBadger

SQL Server DBA, Cluster expert, Principal Analyst
