List VM Networks in SCVMM

To list the connected networks and subnets for each VM managed by VMM, run the following in a VMM-connected PowerShell window:

Get-VM | Select-Object -ExpandProperty VirtualNetworkAdapters | Select-Object Name, VMNetwork, VMSubnet, IPv4Subnets, IPv4Addresses | Sort-Object VMNetwork | Format-Table

A new (and improved?) wasteland

This is a story in the “Knights of Hyper-V” series, an attempt at humor with actual technical content hidden in the details. This particular one is just for fun though. Any resemblance to actual trademarks, people or events (real or those that can only be found residing inside your mind) is purely coincidental and should be disregarded.

The knights of Hyper-V were doing some spring cleaning. Or, it was actually summer, and thus too late in the year to call it spring cleaning anymore. Project setbacks, slow equipment deliveries and the plague-that-shall-not-be-named had severely hampered progress. But finally, the day had arrived to replace some of the hard-working VMs with fresh new ones, running updated software versions glistening in the summer sun. Or covered in the more gloomy, but oh so common, summer rain. And perhaps snow, locusts or other more or less funny local phenomena.

The old servers were not really all that old, but a change in networking politics had ushered in an early swap-over. We were leaving The Wasteland of Nexus for a new and supposedly better (and cheaper) Wasteland. With software-defined wasteland processors, or something to that effect. The knights did not really care; all they knew was that new network armor plate connections were required, and that was always a pain in the backside. The application minions would be grumpy, as they would have to write scroll after scroll of requests beseeching safe paths through the walls of fire.

But enough of the backstory. After a long, looong time the imposed quest for a new wasteland was nearing its end, and it was time for cleanup. Most of the knights were finally on summer vacation, preparing to queue along the congested paths to the beach, waiting in line to look over a cliff, visiting distant relations or hiding in a deep dungeon to escape the aforementioned plague. Only a skeleton crew (not composed of actual skeletons this time) remained to watch over the systems and do the odd cleanup job. A passing minstrel wrote an ode to one of the old servers in exchange for a late breakfast, or early lunch depending on your point of view:

Ode to server sixteen

New servers come in, and old ones get phased out.
It exists now only as a memory
Vanished into thin air
Like a fleeting ghost in the machine
Binary code rearranged to form new beginnings
It will always remain in our hearts

For the time being, all was well in the kingdom. All the VMs were kept in line by the automated all-seeing eye of OM, and it was time to relax, read, and practice dragon slaying if one was so inclined. Till next time, enjoy your life such as it is. Remember, things could always be worse. Before you know it, the roars of a three-headed Application bug dragon and the distant horrified screams of application team minions could wake you from your slumber…

Upgrade to VMM 2019, another knight’s tale

This is a story in the “Knights of Hyper-V” series, an attempt at humor with actual technical content hidden in the details.

The gremlins of the blue window had been up to their usual antics. In 2018 they promised a semi-annual update channel for System Center Virtual Machine Manager. After a lot of badgering (by angry badgers) the knights had caved and installed SCVMM 1807. (That adventure has been chronicled here.) As you are probably aware, the gremlins of the blue window are not to be trusted. Come 2019 they changed their minds and pretended to never have mentioned a semi-annual channel. Thus, the knights were left with a soon-to-be unsupported installation and had to come up with a plan of attack. They could only hope for time to implement it before the gremlins changed the landscape again. Maybe a virtual dragon to be slain next time? Or dark wizards? The head knight shuddered, shrugged it off, and went to study the not-so-secret scrolls of SCVMM updates. They were written in gremlinese and had to be translated to the common tongue before they could be implemented. The gremlins were of the belief that everyone else was living in a soft and cushy wonderland without any walls of fire, application hobbits or networking orcs, and wrote accordingly. Thus, if you just followed their plans you would sooner or later be stuck in an underground dungeon filled with stinky water, without a flotation spell or propulsion device.

Continue reading “Upgrade to VMM 2019, another knight’s tale”

Logical switch uplink profile gone

Problem

When you try to connect a new VM to a logical switch you get a lot of strange error messages related to missing ports or no available switch. The errors seem random.

Analysis

If you check the logical switch properties of an affected host, you will notice that the uplink profile is missing:

[Screenshot: the logical switch properties of the host, with the uplink profile missing]

If you look at the network adapter properties of an affected VM, you will notice that the Logical Switch field is blank:

[Screenshot: the network adapter properties of the VM, with the Logical Switch field blank]

This is connected to a WMI problem. Some Windows updates remove the VMM WMI MOFs required for the VMM agent to manage the logical switch on the host. See details at MS Tech.

Solution

MOFCOMP to the rescue. Run the following commands in an administrative PowerShell prompt. For the change to become visible in VMM, you have to refresh the cluster/node. Note: some versions use a different path to the MOF files, so verify the path if the command fails.


mofcomp "$env:SystemDrive\Program Files\Microsoft System Center\Virtual Machine Manager\setup\scvmmswitchportsettings.mof"
mofcomp "$env:SystemDrive\Program Files\Microsoft System Center\Virtual Machine Manager\DHCPServerExtension\VMMDHCPSvr.mof"
# Verify that the VMM classes are back in the WMI repository
Get-CimClass -Namespace root/virtualization/v2 -ClassName *vmm*
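As noted above, VMM has to refresh the cluster/node before the repair becomes visible. A minimal sketch of triggering that refresh from the VMM PowerShell window, with hypothetical host and cluster names:

# Hypothetical names; run from the VMM PowerShell window after mofcomp.
Get-SCVMHost -ComputerName "hyperv01.contoso.local" | Read-SCVMHost
# For a clustered host, refresh the whole cluster instead:
Get-SCVMHostCluster -Name "hvcluster01" | Read-SCVMHostCluster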

Hypervisor not running

Problem

After upgrading my lab to VMM 1801, and subsequently VMM 1806 (see https://lokna.no/?p=2519), VMs refused to start on one of my hosts. Event ID 20148 was logged when I tried to create a new VM. I restarted the host in the hope of a quick fix, but the result was that none of the VMs living on this host wanted to boot.

“Virtual machine ‘NAME’ could not be started because the hypervisor is not running.”

[Screenshot: the error message logged when starting the VM]

Solution

For some reason the hypervisor has been disabled. You can check this by running bcdedit in an administrative prompt: hypervisorlaunchtype should be set to Auto. If it is not, change it by running the following command:

bcdedit /set hypervisorlaunchtype auto
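To verify the setting before or after the change, you can filter the bcdedit output; a quick sketch that works in an elevated PowerShell prompt:

# Show the hypervisorlaunchtype entry from the boot configuration.
# No output means the setting is absent, and the hypervisor will not start.
bcdedit /enum | Select-String -Pattern "hypervisorlaunchtype"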


After that, reboot the host and everything should be running again. Unless, of course, you have a completely different issue preventing your VMs from starting.


Upgrade to VMM 1801, a knight’s tale

This is a story in the “Knights of Hyper-V” series, an attempt at humor with actual technical content hidden in the details.

A proclamation had been issued several moons ago by the gremlins of the blue window, declaring that a new version of the Virtual Machine Manager had been released. This had mostly been ignored by our merry knights; they were all busy building new systems, putting out fires and slaying dragons. You know, the usual stuff. Thus, they had no time to spare for such things as maintenance on systems that were chugging along nicely without issues. But when a second proclamation appeared about an even newer version, it was decided to spend some time trying to do an upgrade in the lab, down in the spare dungeon.

Alas, this was not to be an easy task. The lab servers were in dire need of some maintenance as well, and one of the hosts flat out refused to respond to commands. Closer inspection revealed a “No bootable device” error on the local console, the result of a botched patching run a long time ago. For some reason the main partition was no longer marked active, a relatively easy fix in diskpart. But on to the main quest. Rumors would have it that there was no in-place upgrade from SCVMM 2016 to SCVMM 1801. Those rumors were true indeed.

A knight was sent into the maze of documentation to look for answers. He came upon several dead ends and a lot of references to the hidden cat of 404, but he persisted and finally ended up at https://docs.microsoft.com/en-us/system-center/vmm/upgrade-vmm?view=sc-vmm-1801. Just as in the upgrade from SCVMM 2012 to SCVMM 2016, an uninstall/reinstall was required.

A cunning plan is devised

The SCVMM 1801 scroll of system requirements was reviewed to make sure that our systems were supported. The spare dungeon contains a single VM running both SCVMM and SQL Server, and some old hosts. The VMM VM has the following setup:

  • Windows Server 2016
  • SQL Server 2012 SP4
  • SCVMM 2016 4.0.2314.0 (UR5)

After some pondering around the table reading the scroll of upgrade instructions mentioned above, the following plan was agreed upon:

  • Checkpoint/snapshot the VMM server.
  • Create a Copy-Only backup of the VMM database (see the sketch after this list).
  • Reboot the VMM Server to make sure there are no pending reboots or other nasty stuff lurking in memory.
  • Uninstall VMM 2016.
  • Restart the server.
  • Install VMM 1801.
  • Upgrade VMM to 1807.
  • Remove the checkpoint.
  • Update the VMM Agent on the hosts.
  • Turn off Diagnostic and Usage data.
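For the Copy-Only backup in step two, the SqlServer PowerShell module can do the job. A minimal sketch, assuming the default database name VirtualManagerDB, a local default instance and a hypothetical backup path:

# Database name, instance and backup path are assumptions; adjust to your environment.
Import-Module SqlServer
Backup-SqlDatabase -ServerInstance "localhost" -Database "VirtualManagerDB" -BackupFile "D:\Backup\VirtualManagerDB_preupgrade.bak" -CopyOnly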

Note: If you are running other System Center products, make sure that you review the upgrade sequence. Especially noteworthy is the fact that Operations Manager should be upgraded before VMM.

Continue reading “Upgrade to VMM 1801, a knight’s tale”

Configure VMQ and RSS on physical servers

Introduction

The samples below were collected from Windows Server 2016.

The primary objective is to avoid weighing down Core 0 with networking traffic. This is the first core on the first NUMA node, and this core is responsible for a lot of kernel processing. If this core suffers from contention, a wild blue screen of death will appear. Thus, we want our network adapters to use other cores to process their traffic. We can achieve this in three ways, depending on what we use the adapter for:

  • Enable Receive Side Scaling (RSS) and configure it to use specific cores.
  • Enable Virtual Machine Queueing (VMQ) and configure it to use specific cores.
  • Set the preferred NUMA node.

For physical machines

On network adapters used for generic traffic, we should enable RSS and disable VMQ. On adapters that are part of a virtual switch, we should disable RSS and enable VMQ. The preferred NUMA node should be configured for all physical adapters.
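As an illustration, something like the following applies those rules on Windows Server 2016. The adapter names and processor numbers are hypothetical; pick cores that match your NUMA layout and stay clear of core 0:

# Generic-traffic adapter: RSS on, VMQ off, traffic steered away from core 0.
Set-NetAdapterRss -Name "NIC-Mgmt" -BaseProcessorNumber 2 -MaxProcessors 4 -NumaNode 0
Disable-NetAdapterVmq -Name "NIC-Mgmt"
# Virtual switch uplink: VMQ on, RSS off.
Disable-NetAdapterRss -Name "NIC-Team1"
Set-NetAdapterVmq -Name "NIC-Team1" -BaseProcessorNumber 16 -MaxProcessors 8
Enable-NetAdapterVmq -Name "NIC-Team1"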

For virtual machines

If the machine has more than one CPU, enable vRSS.
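vRSS is enabled per adapter inside the guest. A quick sketch, where the adapter name “Ethernet” is an assumption (check with Get-NetAdapter first):

# Run inside the VM; the adapter name is an assumption.
Enable-NetAdapterRss -Name "Ethernet"
Get-NetAdapterRss -Name "Ethernet" | Format-List Name, Enabled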

Investigating the NUMA architecture

Sockets and NUMA nodes

Sysinternals coreinfo -s -n will show the relationship between logical processors, sockets and NUMA nodes. In the example below we have a server with two sockets and four NUMA nodes.

[Screenshot: coreinfo output showing two sockets and four NUMA nodes]

Closest NUMA node

Each PCIe adapter is physically connected to a specific NUMA node. If possible, RSS/VMQ should be mapped to cores on the same NUMA node that the NIC is connected to. Get-NetAdapterRss will show you which NUMA node is closest for each adapter. The adapter in the sample is connected to (closest to) NUMA node 0, as the NUMA distance for cores in group 0 is 0. We can also see that the NUMA distance to node 1 for this particular port is lower than the distance to nodes 2 and 3. This is caused by the fact that nodes 0 and 1 are on the same physical CPU, whereas nodes 2 and 3 are on another physical CPU.
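A trimmed-down way to pull just the NUMA-related fields is sketched below; the adapter name is hypothetical:

# Hypothetical adapter name; RssProcessorArray lists each core with its NUMA distance.
Get-NetAdapterRss -Name "NIC-Team1" | Format-List Name, NumaNode, RssProcessorArray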

[Screenshot: Get-NetAdapterRss output showing processor groups and NUMA distances]

Continue reading “Configure VMQ and RSS on physical servers”

Script to migrate VMs back to their preferred node

Situation

You have a set of VMs that are not running on their preferred node due to maintenance or some kind of outage triggering an unscheduled migration. You have set one (and just one) preferred host for all your VMs. You have done this because you want to balance your VMs manually to guarantee a certain minimum of performance to each VM. By the way, automatic load balancing cannot do that; there will be a lag between a usage spike and load balancing if load balancing is required. But I digress. The point is, the VMs are not running where they should, and you have not enabled automatic failback because you are afraid of node flapping or other inconveniences that could create problems. Hopefully though, you have some kind of monitoring in place to tell you that the VMs are restless, so you can fix the problem and subsequently corral them into their designated hosts. Oh, and you are using Virtual Machine Manager. You could do this on the individual cluster level as well, but that would be another script for another day.

If you understand the scenario above and self-identify with at least parts of it, this script is for you. If not, this script could cause all kinds of trouble.

Script Notes

  • You can skip “Connect to VMM Server” and “Add VMM Cmdlets” if you are running this script from the SCVMM PowerShell window.
  • The MoveVMs variable can be set to $false to get a list only. This could be a smart choice for your first run.
  • The script ignores VMs that are not clustered and VMs that do not have a preferred server set.
  • I do not know what will happen if you have more than one preferred server set.


The script


#######################################################################################################################
#   _____     __     ______     ______     __  __     ______     ______     _____     ______     ______     ______    #
#  /\  __-.  /\ \   /\___  \   /\___  \   /\ \_\ \   /\  == \   /\  __ \   /\  __-.  /\  ___\   /\  ___\   /\  == \   #
#  \ \ \/\ \ \ \ \  \/_/  /__  \/_/  /__  \ \____ \  \ \  __< \ \  __ \  \ \ \/\ \ \ \ \__ \  \ \  __\   \ \  __<   #
#   \ \____-  \ \_\   /\_____\   /\_____\  \/\_____\  \ \_____\  \ \_\ \_\  \ \____-  \ \_____\  \ \_____\  \ \_\ \_\ #
#    \/____/   \/_/   \/_____/   \/_____/   \/_____/   \/_____/   \/_/\/_/   \/____/   \/_____/   \/_____/   \/_/ /_/ #
#                                                                                                                     #
#                                                   http://lokna.no                                                   #
#---------------------------------------------------------------------------------------------------------------------#
#                                          -----=== Elevation required ===----                                        #
#---------------------------------------------------------------------------------------------------------------------#
# Purpose: List VMs that are not running at their preferred host, and migrate them to the correct host.               #
#                                                                                                                     #
#=====================================================================================================================#
# Notes:                                                                                                              #
# There is an option to disable VM migration. If migration is disabled, a list is returned of VMs that are running at #
# the wrong host.                                                                                                     #
#                                                                                                                     #
#######################################################################################################################



$CaptureTime = (Get-Date -Format "yyyy-MM-dd HH:mm:ss")
Write-Host "-----$CaptureTime-----`n"
# Add the VMM cmdlets to the PowerShell session
Import-Module -Name "virtualmachinemanager"

# Connect to the VMM server
Get-VMMServer -ComputerName VMM.Server.Name | Select-Object Name

#Options
$HostGroup = "All Hosts\SQLMGMT\*" # End this with a star. You can go all the way down to an individual VM: All Hosts\Hostgroup\VM.
$MoveVMs = $true # If set to $true, we will try to migrate VMs to their preferred host.
# List VMs in the host group
$VMs = Get-SCVirtualMachine | Where-Object { $_.IsHighlyAvailable -eq $true -and $_.ClusterPreferredOwner -ne $null -and $_.HostGroupPath -like $HostGroup }

# Process
Foreach ($VM in $VMs) 
{
    # Get the Preferred Owner and the Current Owner
    $Preferred = Get-SCVirtualMachine $VM.Name | Select-Object -ExpandProperty ClusterPreferredOwner
    $Current = $VM.HostName
    
    
    # List discrepancies
    If ($Preferred -ne $Current) 
    {
        Write-Host "VM $VM should be running at $Preferred but is running at $Current." -ForegroundColor Yellow
        If ($MoveVMs -eq $true)
        {
            $NewHost = Get-SCVMHost -ComputerName $Preferred.Name
            Write-Host "We are trying to move $VM from $Current to $NewHost." -ForegroundColor Green
            Move-SCVirtualMachine -VM $VM -VMHost $NewHost|Select-Object ComputerNameString, HostName
        } 
    }
}


Slow startup on VMs with Virtual Fibre Channel

Problem

After you enable Virtual Fibre Channel on a Hyper-V VM, it takes forever to start up. In this instance, forever equals about two minutes. The VM is stuck in the Starting… state for most of this period.

Analysis

During startup, the hypervisor waits for LUNs to appear on the virtual HBA before the VM is allowed to boot the OS. When there are no LUNs defined for a VM, e.g. when you are deploying a new VM, the hypervisor patiently waits for 90 seconds before it gives up. Thus, startup of the VM is delayed by 90 seconds if there are no LUNs presented to the VM, or if the SAN is down or misconfigured. Event ID 32213 from Hyper-V-SynthFC is logged:

[Screenshot: Event ID 32213 from Hyper-V-SynthFC]

Solution

Depending on your specific cause, one of the following should do the trick:

  • Present some LUNs.
  • Remove the virtual HBA adapters if they are not in use (see the sketch after this list).
  • Correct the SAN config to make sure the VM is able to talk to the SAN.
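For the second option, the virtual HBAs can be listed and removed with the Hyper-V PowerShell module on the host. A minimal sketch with a hypothetical VM name; review the list before you remove anything:

# Hypothetical VM name; run on the Hyper-V host.
Get-VMFibreChannelHba -VMName "VM01"
Get-VMFibreChannelHba -VMName "VM01" | Remove-VMFibreChannelHba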

Failed to allocate VMQ

Problem

Event ID 113 from Hyper-V-VmSwitch is logged each time a VM is started on a host:

Failed to allocate VMQ for NIC [GUID] (Friendly Name: [VM name]) on switch [GUID] (Friendly Name: [Switch name]). Reason – The OID failed. Status = An invalid parameter was passed to a service or function.

[Screenshot: Event ID 113 from Hyper-V-VmSwitch]
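To check how often the failure occurs on a host, you can query the event log; a sketch, assuming the VmSwitch events land in the System log:

# Assumption: the events are written to the System log on the host.
Get-WinEvent -FilterHashtable @{ LogName = 'System'; ProviderName = 'Microsoft-Windows-Hyper-V-VmSwitch'; Id = 113 } -MaxEvents 10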

Solution

This is caused by a bug. Download and install KB3031598 to fix the problem.