Upgrade to VMM 1801, a knights tale

This is a story in the “Knights of Hyper-V” series, an attempt at humor with actual technical content hidden in the details.

A proclamation had been issued several moons ago by the gremlins of the blue window, declaring that a new version of the Virtual Machine Manager had been released. This had mostly been ignored by our merry knights, they were all busy building new systems, putting out fires and slaying dragons. You know, the usual stuff. Thus they had no time to spare for doing such things as maintenance on systems that were chugging along nicely without issues. But when a second proclamation appeared about an even newer version, it was decided to spend some time trying to do an upgrade in the lab, down in the spare dungeon.

Alas, this was not to be an easy task. The lab servers were in dire need of some maintenance as well, and one of the host flat out refused to respond to commands. Closer inspection revealed a “No bootable device” error on the local console. The results of a botched patching run a long time ago. For some reason the main partition was no longer marked active, a relatively easy fix in diskpart. But on to the main quest. Rumors would have it that there was no in place upgrade from SCVMM 2016 to SCVMM 1801. Those rumors were true indeed.

A knight was sent into the maze of documentation to look for answers. He came upon several dead ends and a lot of references to the hidden cat of 404, but he persisted and finally ended up at https://docs.microsoft.com/en-us/system-center/vmm/upgrade-vmm?view=sc-vmm-1801. Just as in the upgrade from SCVMM 2012 to SCVMM 2016, an uninstall/reinstall was required.

A cunning plan is devised

The SCVMM 1801 scroll of  system requirements were reviewed to make sure that our systems were supported. The spare dungeon contains a single VM running both SCVMM and SQL Server and some old hosts. The VMM VM has the following setup:

  • Windows Server 2016
  • SQL Server 2012 SP4
  • SCVMM 2016 4.0.2314.0 (UR5)

After some pondering around the table reading the scroll of upgrade instructions mentioned above, the following plan was agreed upon:

  • Checkpoint/snapshot the VMM server.
  • Create a Copy-Only backup of the VMM database.
  • Reboot the VMM Server to make sure there are no pending reboots or other nasty stuff lurking in memory.
  • Uninstall VMM 2016.
  • Restart the server.
  • Install VMM 1801.
  • Upgrade VMM to 1807.
  • Remove the checkpoint.
  • Update the VMM Agent on the hosts.
  • Turn off Diagnostic and Usage data.

Note: If you are running other System Center products, make sure that you review the upgrade sequence. Especially noteworthy is the fact that Operations Manager should be upgraded before VMM.

Continue reading “Upgrade to VMM 1801, a knights tale”

CredSSP encryption oracle remediation

Problem

One of my minions contacted me about a strange error message connecting to a server. He was running scheduled maintenance, but he was unable to connect via RDP to one of his servers. The error message looked like this:

image

“An authentication error occurred The Function requested is not supported”

“This could be due to CredSSP encryption oracle remediation”

Analysis

Some Microsoft gremlin thought it was a good idea to block remote connections to Windows 2012R2 servers missing the march 2018 CredSSP patch if your client is patched. You know, just to make it extra easy to patch the servers. They even try to blame Oracle for their mess.

According to 4093492, this fine function was enabled on 2018-05-08. “By default, after this update is installed, patched clients cannot communicate with unpatched servers.” You can override this by creating a GPO and restarting all affected systems, but that would leave you permanently vulnerable to what is in fact a security issue. Moreover, as a reboot is needed for the workaround it is easier to just patch the servers (which was our initial plan).

Solution

Install the patches from this list on your servers: https://portal.msrc.microsoft.com/en-us/security-guidance/advisory/CVE-2018-0886. If you are lucky they are just VMs and you have access to the VM console ore some kind of KVM. If you are not lucky, a trip to the server room it is.

Configure VMQ and RSS on physical servers

Introduction

Samples below are collected from Windows Server 2016

The primary objective is to avoid weighing down Core 0 with networking traffic. This is the first core on the first NUMA node, and this core is responsible for a lot of kernel processing. If this core suffers from contention, a wild blue screen of death will appear. Thus, we want our network adapters to use other cores to process their traffic. We can achieve this in three ways, depending on what we use the adapter for:

  • Enable Receive Side Scaling (RSS) and configure it to use specific cores.
  • Enable Virtual Message Queueing and configure it to use specific cores
  • Set the preferred NUMA node

For physical machines

On network adapters used for generic traffic, we should enable RSS and disable VMQ. On adapters that are part of a virtual switch, we should disable RSS and enable VMQ. The preferred NUMA node should be configured for all physical adapters.

For virtual machines

If the machine has more than one CPU, enable vRSS.

Investigating the NUMA architecture

Sockets and NUMA nodes

Sysinternals coreinfo -s -n will show the relationship between logical processors, sockets and NUMA nodes. In the example below we have a CPU with two sockets and four nodes.

clip_image002

Closest NUMA node

Each PCIE adapter is physically connected to a specific NUMA node. If possible, RSS / VMQ should be mapped to cores on the same NUMA node that the NIC is connected to. Get-NetadApterRss will show you which NUMA node is closest for each adapter. The one in the sample is connected to/closest to NUMA node 0 as the NUMA distance for cores in group 0 is 0. We can also see that the NUMA distance to node 1 for this particular port is lower than the distance to nodes 2 and 3. This is caused by the fact that node 0 and 1 are on the same physical CPU, whereas node 2 and 3 are on another physical CPU.

clip_image004 Continue reading “Configure VMQ and RSS on physical servers”

Script to migrate VMs back to their preferred node

Situation

You have a set of VMs that are not running at their preferred node due to maintenance or some kind of outage triggering unscheduled migration. You have set one (and just one) preferred host for all you VMs. You have done this because you are want to balance your VMs manually to guarantee a certain minimum of performance to each VM. By the way, automatic load balancing cannot do that, there will be a lag between a usage spike and load balancing if load balancing is required. But I digress. The point is, the VMs are not running where they should, and you have not enabled automatic failback because you are afraid of node flapping or other inconveniences that could create problems. Hopefully though, you have some kind of monitoring in place to tell you that the VMs are restless and you need to fix the problem and subsequently corral them into their designated hosts. Oh, and you are using Virtual Machine Manager. You could do this on the individual cluster level as well, but that would be another script for another day.

If you understand the scenario above and self-identify with at least parts of it, this script is for you. If not, this script could cause all kinds of trouble.

Script Notes

  • You can skip “Connect to VMM Server” and “Add VMM Cmdlets” if you are running this script from the SCVMM PowerShell window.
  • The MoveVMs variable can be set to $false to get a list. this could be a smart choice for your first run.
  • The script ignores VMs that are not clustered and VMs that does not have a preferred server set.
  • I do not know what will happen if you have more than one preferred server set.

 

The script

 

#######################################################################################################################
#   _____     __     ______     ______     __  __     ______     ______     _____     ______     ______     ______    #
#  /\  __-.  /\ \   /\___  \   /\___  \   /\ \_\ \   /\  == \   /\  __ \   /\  __-.  /\  ___\   /\  ___\   /\  == \   #
#  \ \ \/\ \ \ \ \  \/_/  /__  \/_/  /__  \ \____ \  \ \  __< \ \  __ \  \ \ \/\ \ \ \ \__ \  \ \  __\   \ \  __<   #
#   \ \____-  \ \_\   /\_____\   /\_____\  \/\_____\  \ \_____\  \ \_\ \_\  \ \____-  \ \_____\  \ \_____\  \ \_\ \_\ #
#    \/____/   \/_/   \/_____/   \/_____/   \/_____/   \/_____/   \/_/\/_/   \/____/   \/_____/   \/_____/   \/_/ /_/ #
#                                                                                                                     #
#                                                   http://lokna.no                                                   #
#---------------------------------------------------------------------------------------------------------------------#
#                                          -----=== Elevation required ===----                                        #
#---------------------------------------------------------------------------------------------------------------------#
# Purpose: List VMs that are not running at their preferred host, and migrate them to the correct host.               #
#                                                                                                                     #
#=====================================================================================================================#
# Notes:                                                                                                              #
# There is an option to disable VM migration. If migration is disabled, a list is returned of VMs that are running at #
# the wrong host.                                                                                                     #
#                                                                                                                     #
#######################################################################################################################



$CaptureTime = (Get-Date -Format "yyyy-MM-dd HH:mm:ss")
Write-Host "-----$CaptureTime-----`n"
# Add the VMM cmdlets to the powershell
Import-Module -Name "virtualmachinemanager"

# Connect to the VMM server
Get-VMMServer –ComputerName VMM.Server.Name|Select-Object Name

#Options
$HostGroup = "All Hosts\SQLMGMT\*" #End this with a star. You can go down to an individual VM. All Hosts\Hostgroup\VM.
$MoveVMs = $true #If you set this to true, we will try to migrate VMS to their preferred host.
#List VMS in the host group
$VMs = Get-SCVirtualMachine | where { $_.IsHighlyAvailable -eq $true -and $_.ClusterPreferredOwner -ne $null -and $_.HostGroupPath -like $HostGroup }

# Process
Foreach ($VM in $VMs) 
{
    # Get the Preferred Owner and the Current Owner
    $Preferred = Get-SCVirtualMachine $VM.Name | Select-Object -ExpandProperty clusterpreferredowner
    $Current = $VM.HostName
    
    
    # List discrepancies
    If ($Preferred -ne $Current) 
    {
        Write-Host "VM $VM should be running at $Preferred but is running at $Current." -ForegroundColor Yellow
        If ($MoveVMs -eq $true)
        {
            $NewHost = Get-SCVMHost -ComputerName $Preferred.Name
            Write-Host "We are trying to move $VM from  $Current to $NewHost." -ForegroundColor Green
            Move-SCVirtualMachine -VM $VM -VMHost $NewHost|Select-Object ComputerNameString, HostName
        } 
    }
}


Hyper-V VM with VirtualFC fails to start

Problem

This is just a quick note to remember the solution and EventIDs.

The VM fails to start complaining about failed resources or resource not available in Failover Cluster manager. Analysis of the event log reveals messages related to VirtualFC:

  • EventID 32110 from Hyper-V-SynthFC: ‘VMName’: NPIV virtual port operation on virtual port (WWN) failed with an error: The world wide port name already exists on the fabric. (Virtual machine ID ID)
  • EventID 32265 from Hyper-V-SynthFC: ‘VMName’: Virtual port (WWN) creation failed with a NPIV error(Virtual machine ID ID).
  • EventID 32100 from Hyper-V-VMMS: ‘VMNAME’: NPIV virtual port operation on virtual port (WWN) failed with an unknown error. (Virtual machine ID ID)
  • EventID 1205 from Microsoft-Windows-FailoverClustering: The Cluster service failed to bring clustered role ‘SCVMM VM Name Resources’ completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

Analysis

The events point in the direction of Virtual Fibre Channel or Fibre Channel issues. After a while we realised that one of the nodes in the cluster did not release the WWN when a VM migrated away from it. Further analysis revealed that the FC driver versions were different.

SNAGHTML66058b10

SNAGHTML660875d4

Solution

  • Make sure all cluster nodes are running the exact same driver and firmware for the SAN and network adapters. This is crucial for failovers to function smoothly.
  • To “release” the stuck WWNs you have to reboot the offending node. To figure out which node is holding the WWN you have to consult the FC Switch logs. Or you could just do a rolling restart and restart all nodes until it starts working.
  • I have successfully worked around the problem by removing and re-adding the virtual FC adapters n the VM that is not working. I do not know why this resolved the problem.
  • Another workaround would be to change the WWN on the virtual FC adapters. You would of course have to make this change at the SAN side as well.

Is your LAPS working as it should?

Intro

So, you have implemented LAPS, and you are wondering whether or not it is working as it should? Or at least, you should wonder about that. You see, LAPS is a solution with quite a few “moving parts”, and all of them have to work for your local administrator passwords to be randomized and rotated automatically. You need a Group Policy Client Side Extension on each and every Server and Workstation (client), you need a GPO using said extension, and you need to extend the schema and set AD permissions. If any of these are not working properly somewhere, LAPS will not work properly. The most usual problems are:

  • The GPO CSE is not deployed to some clients.
  • The GPO is not linked in all OUs where you have clients.

 Detection

We can easily check if LAPS is working for a specific client be reading the contents of the AD attributes associated with LAPS. We need access to read these properties, so all automated and manual testes mentioned henceforth has to be run by an account with permissions to read the properties. The LAPS operations guide details how you should configure the permissions. That being said, you should of course also test the permissions to make sure that only privileged users are able to read said properties. The properties are called:

  • ms-Mcs-AdmPwd stores the password as clear text.
  • ms-Mcs-AdmPwdExpirationTime stores the point in time for the next password change. The GPO checks this value when it is applied and resets the password if the time has passed.

We can use both of these to test if LAPS has been applied to a specific computer object at least once. If you do a manual test by using the Attribute Editor in AD Users and Computers you will see both. I have written PowerShell commands to automate the process based on the value of the ms-Mcs-AdmPwdExpirationTime attribute.

List computers without LAPS

This lists all computer objects without a LAPS expiry set. Virtual cluster computer objects are excluded. The results are exported to the file C:\TEMP\NoLaps.csv.

get-adcomputer -Properties Name, operatingSystem, Description, ms-Mcs-AdmPwdExpirationTime `
-LDAPFilter "(&(!ms-Mcs-AdmPwdExpirationTime=*)(operatingSystem=Windows*)(!Description=ClusterAwareUpdate*)(!Description=Failover cluster virtual network name account))"|`
Select Name, operatingSystem, Description, ms-Mcs-AdmPwdExpirationTime| Sort-Object Name | export-csv C:\Temp\NoLaps.csv -Delimiter ";" -NoTypeInformation

List computers with expired LAPS

Lists all computer objects where LAPS has been applied at least once, where the expiration time has passed. These are usually computers that are not powered on, maybe removed but not properly deleted from the AD. The results are exported to the file C:\TEMP\ExpiredLaps.csv.

$now = Get-Date
get-adcomputer -Properties Name, operatingSystem, Description, ms-Mcs-AdmPwdExpirationTime `
-LDAPFilter "(&(ms-Mcs-AdmPwdExpirationTime=*)(operatingSystem=Windows*)(!Description=ClusterAwareUpdate*)(!Description=Failover cluster virtual network name account))"|`
Select Name, operatingSystem, Description, @{N='ExpiryTime'; E={[DateTime]::FromFileTime($_."ms-Mcs-AdmPwdExpirationTime")}}| `
Where-Object ExpiryTime -lt $now| Sort-Object ExpiryTime| export-csv C:\Temp\ExpiredLAPS.csv -Delimiter ";" -NoTypeInformation 

Get LAPS Expiration date for one or more computer(s)

This command lists the expiration time for one or more computers based on an LDAP filter. The sample filter (Name=Badger*) will list all computers whose name starts with Badger. Computers where the expiration time is not set are filtered out. For more information about the LDAP filter syntax se this link:  https://social.technet.microsoft.com/wiki/contents/articles/5392.active-directory-ldap-syntax-filters.aspx

$now = Get-Date
 get-adcomputer -Properties Name, operatingSystem, Description, ms-Mcs-AdmPwdExpirationTime `
-LDAPFilter "(&(ms-Mcs-AdmPwdExpirationTime=*)(Name=Badger*))"|`
Select Name, operatingSystem, Description, @{N='ExpiryTime'; E={[DateTime]::FromFileTime($_."ms-Mcs-AdmPwdExpirationTime")}}| Sort-Object Name

Get LAPS expiration date for one or more computers, excluding those with no expiry set

Similar to above, but includes computer objects where the expiration time is not set. Those return 01.01.1601 01.00.00 as ExpiryTime because of the conversion of 0 from FileTime to DateTime. To put it in another way, if the expiration time is reported as 01.01.1601 01.00.00 it has not been set.

$now = Get-Date
 get-adcomputer -Properties Name, operatingSystem, Description, ms-Mcs-AdmPwdExpirationTime `
-LDAPFilter "(&(!ms-Mcs-AdmPwdExpirationTime=*)(Name=Badger*))"|`
Select Name, operatingSystem, Description, @{N='ExpiryTime'; E={[DateTime]::FromFileTime($_."ms-Mcs-AdmPwdExpirationTime")}}| Sort-Object Name

Securing Windows Active Directory

This is a list of measures you can implement to increase your Windows AD Security. The list is in no way exhaustive, and some of the items overlap. Be aware that security recommendations change over time. This article was originally created 2018.01.22. If that is several years in the past when you read this, I cannot promise that all recommendations are up to date.

LAPS – Local administrator password management

Implementing LAPS ensures that all your domain-joined computers have a unique password that is changed periodically for the local administrator account. It operates as a GPO Client Side Extension, and thus requires you to install and register a DLL on each target computer. You can do this via GPO, in your VM image, or through any other software deployment solution you may use.

On the management computers and/or the DC itself, you have to add management tools and GPO Editor templates. There is a graphical user interface and a PowerShell module. The PowerShell module also includes the commands necessary to extend the AD Schema for storing the passwords and their associated expiry date.

See https://technet.microsoft.com/en-us/mt227395.aspx for details.

Securing the built-in Administrator account

 

The built in Administrator account in the domain should be secured. The ObjectSID of the domain admin account always ends in -500, and is thus easy to identify even if the name has been changed. The guidance used to be “Disable the Administrator account”, but it has been changed due to some recovery scenarios requiring an active Administrator-account. Specifically, the Administrator account is the only account able to log on when no global catalogs are online.

See https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/plan/security-best-practices/appendix-d–securing-built-in-administrator-accounts-in-active-directory for details and an implementation guide. Some highlights are shown below.

Set the DOMAIN\Administrator account as sensitive and require smart card

 

clip_image001

Create a GPO to prevent Domain Admins from logging on to member servers or workstations

I have gone a bit further than the guide here, adding Domain Admins and Guests for good measure. The “Local account and member of Administrators group” is related to denying local administrator accounts access to the computer from the network. More about this below.

Make sure that this GPO does not apply to domain controllers, that is, do not link it at the domain level.

clip_image002

 

Block remote access for local accounts

Add Guests, Local account and member of Administrators group, Domain Admins, Enterprise Admins and Schema Admins to the policy Computer Configuration\Windows Settings\Local Policies\User Rights Assignment\Deny Access to this computer from the network.

clip_image003

For details, see https://blogs.technet.microsoft.com/secguide/2014/09/02/blocking-remote-use-of-local-accounts/

Disable weak ciphers for Windows Secure Channel

You can build a GPO to limit the cipher suites used by the Windows Secure Channel API, and by extension IIS. Be aware that this does not in any way limit other usage of weak ciphers. For instance, a TomCat server running on the same computer may very well use RC4 even if you have removed it from the list of Windows secure channel ciphers.

The GPO is located at Computer Configuration\Administrative Templates\Network\SSL Configuration Settings\Cipher Suites.

When you enable this setting, you get a list of all the default ciphers as a long comma separated string. Which ciphers you get is dependent of the Windows version. The easiest way to edit this list is to copy the string into a text editor. You can change the order to change the priority and remove weak ciphers.

clip_image004

clip_image005

 

Do not allow local users to run remote elevated sessions

Do not apply this fix: https://support.microsoft.com/en-us/help/951016/description-of-user-account-control-and-remote-restrictions-in-windows

That is, do not create the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System \LocalAccountTokenFilter Policy value, and if it exists, make sure it is set to 0. We could of course create a GPO to enforce this setting.

clip_image006

For details, see https://www.harmj0y.net/blog/redteaming/pass-the-hash-is-dead-long-live-localaccounttokenfilterpolicy/

Set a password policy and a lockout policy

 

  • Password length: 8 characters. Encourage users to create passwords with a random length between 8 and 20 characters. You want your users to have passwords that vary in length. If you set this limit to 14, chances are all passwords are exactly 14 characters long. This makes it a lot easier to crack them.
  • Complexity not required. If you require complexity, users tend to add numbers and capitals at the start and end of the password.
  • Password history: 10.
  • Maximum password age: 0, that is password never expires. To frequent password changes may lead to bad password diversity and predictable passwords. Leaked passwords are almost always exploited immediately, so there is no point in forcing a monthly password change. If you must, set the maximum age to one year. Urge users to choose new passwords that are completely different from the previous passwords. That is, do not use MypassWord1, MypasswOrd2 and so on.
  • Do not enable the reversible encryption option. Ever. Just don’t.
  • Lockout policy: Locked for 24 hours after five unsuccessful attempts.

clip_image007

clip_image008

For background information, see:

Enforce SMB Signing and disable SMB1

 

Enforce signing

You can enforce signing on both the server and client side. The server side is shown below. Be aware that some services require this setting to be disabled. If you have such services, create an overriding GPO for those servers only, leaving SMB signing on in the rest of the domain.

 

clip_image009

See https://technet.microsoft.com/en-us/library/cc731957(v=ws.11).aspx

Disable SMB1

You have to create some registry-GPO settings. Details are at the link below. Be aware that legacy clients like Windows XP will be dependent on SMBv1 on Domain Controllers to access the Sysvol share. The recommendation is still to disable SMBv1 everywhere.

 

clip_image010

 

See https://blogs.technet.microsoft.com/staysafe/2017/05/17/disable-smb-v1-in-managed-environments-with-ad-group-policy/ for details.

Create new computer objects in a separate OU, not in the Computers container

 

Thus you can delegate permissions to manage them, and you can apply GPOs to newly added computers. You do this with the rdircmp console application.

  • Log on to a domain controller.
  • Start an administrative CMD-shell
  • Execute rdircmp [FQDN of OU]

clip_image011

You can verify or check this setting using PowerShell:

Get-ADDomain |Select-Object ComputersContainer.

Limit the number of domain admins

Domain admin accounts should only be used for domain administration tasks, and you should not have many of them. Do not use service accounts with domain admin access.

A recommended number of domain admins is 5.

Avoid explicit permissions, prefer group permissions

 

All permissions in AD should preferably be given to groups, not individual users. This makes it a lot easier to manage permissions, and it is also easier to see what permissions a user has based on which groups he is a member of. That is, if you follow this principle. There will always be exceptions, but they should be few and far between.

Limit the number of people with delegated access to AD

 

AD administration task can be delegated. For instance, your service desk could be able to reset passwords and create users without full domain admin access. It is important to limit these delegations and keep tabs on them.

Use dedicated domain controllers

 

  • Make sure that your domain has at least two domain controllers.
  • If they are virtual, they should not be on the same cluster. Preferably you should have at least one dedicated physical domain controller.
  • Do not install anything on you domain controllers, with the exception of backup agents, antivirus software, monitoring agents and software deployment agents.
  • Do not enable the Hypervisor role on your physical domain controllers to run other software in a VM.
  • Make sure you have a system state backup of your domain controllers.

No trusts between domains

 

Avoid using forests and trusts between domains. Trusted domains should be handled as a single security context (e.g. dev, test, production, management etc), and thus you only really need one domain for each security context unless you want to divide it based on departments or divisions.

Enforce the Windows firewall

 

Make sure that Windows Firewall is turned on. There are many ways to do this, e.g. SCCM or GPO.

Install antivirus software on all servers and workstations

 

And make sure that it is activated and up to date. SCCM enables you to monitor and manage the default Windows Defender antivirus. Most commercial AntiVirus software comes with some kind of centralized management and monitoring tool.

Log out RDP sessions after 24 hours

 

Remote desktop server sessions that are still active (idle or disconnected) after 24 hours should be logged out automatically. Really active sessions are left for 5 days.

The GPO settings are located at:

Computer Configuration\Administrative Templates\Windows Components\Remote Desktop Services\Remote Desktop Session Host\Session Time limits

  • Set time limit for disconnected sessions: 1 day.
  • Set time limit for active but idle RDS sessions: 1 day.
  • Set time limit for active RDS sessions: 5 days.
  • End session when time limits are reached: Enabled
  • Set time limit for logoff of RemoteApp sessions: 1 day.

Group Managed Service accounts and Managed Service Accounts

Enable the domain for group managed service accounts, and encourage its use on supported services.

https://docs.microsoft.com/en-us/windows-server/security/group-managed-service-accounts/group-managed-service-accounts-overview

https://blogs.msdn.microsoft.com/markweberblog/2016/05/25/group-managed-service-accounts-gmsa-and-sql-server-2016/

Removing a drive from a cluster group moves all cluster group resources to Available Storage

Problem

During routine maintenance on a SQL Server cluster we were planning to remove one of the clustered drives. We had previously replaced the SAN, and this disk was backed by an old storage unit that we wanted to decommission. So we made sure that there were no dependencies, right-clicked the drive in Failover Cluster Manager under the SQL Server role and pressed “Remove from SQL Server”. Promptly the drive vanished from view, together with all other cluster resources associated with the role…

After a slightly panicky check to make sure that the SQL Server instance was still running (it was), we started to wonder about what was happening. Running Get-ClusterResource in PowerShell revealed that all our missing resources had been moved to the “Available Storage” resource group.

image

We did a failover to verify that the instance was still working, and it gladly failed over with the Available Storage group. There is a total of 4 instances of SQL Server on the sample cluster pictured above.

Solution

The usual warning: Performing this procedure may result in an outage. If you do not understand the commands, read up on them before you try.

Move the resources back to the SQL Server resource group. If you move the SQL Server resource, that is the resource with the ResourceType SQL Server, all other dependent resources should follow. If your dependency settings are not configured correctly, you may have to move some of the resources independently.

Command: Get-ClusterResource “SQL Server (instance)”|Move-ClusterResource –Group “SQL Server (instance)”

Just replace Instance with the name of your SQL Server instance.

Then, run Get-ClusterResource|Sort-Object OwnerGroup, ResourceType to verify that all you resources are associated with the correct resource group. The result should look something like this. As a minimum, you should have an IP address, a network name, SQL Server, SQL Server Agent and one ore more Physical disk drives.

image

Microsoft Update with PSWindowsUpdate 2.0

Preface

This is an update to my previous post about PSWindowsUpdate located here: https://lokna.no/?p=2132. The content is pretty much the same, but updated for PSWindowsUpdate 2.0.

Most of my Windows servers are patched by WSUS, SCCM or a similar automated patch management solution at regular intervals. But not all. Some servers are just too important to be autopatched. This is a combination of SLA requirements making downtime difficult to schedule and the sheer impact of a botched patch run on backend servers. Thus, a more hands-on approach is needed. In W2012R2 and far back this was easily achieved by running the manual Windows Update application. I ran through the process in QA, let it simmer for a while and went on to repeat the process in production if no nefarious effects were found during testing. Some systems even have three or more staging levels. It is a very manual process, but it works, and as we are required to hand-hold the servers during the update anyway, it does not really cost anything. Then along came Windows Server 2016. Or Windows 10 I should really say, as the Update-module in W2016 is carbon copied from W10 without changes. It is even trying to convince me to install W10 Creators update on my servers…

clip_image001

In Windows Server 2016 the lazy bastards at Microsoft just could not be bothered to implement the functionality from W2012R2 WU. It is no longer possible to defer specific updates I do not want, such as the stupid Silverlight mess. If I want Microsoft update, then I have to take it all. And if I should become slightly insane and suddenly decide I want driver updates from WU, the only way to do that is to go through device manager and check every single device for updates. Or install WUMT, a shady custom WU client of unknown origin.

I could of course use WSUS or SCCM to push just the updates I want, but then I have to magically imagine what updates each server wants and add them to an ever growing number of target groups. Every time I have a patch run. Now that is expensive. If I had enough of the “special needs” servers to justify the manpower-cost, I would have done so long ago. Thus, another solution was needed…

PSWindowsUpdate to the rescue. PSWindUpdate is a Powershell module written by a user called MichalGajda enabling management of Windows Update through Powershell. You can find it here: https://www.powershellgallery.com/packages/PSWindowsUpdate/2.0.0.0. In this post I go through how to install the module and use it to run Microsoft Update in a way that resembles the functionality from W2012R2. You could tell the module to install a certain list of updates, but I found it easier to hide the unwanted updates. It also ensures that they are not added by mistake with the next round of patches.

Getting started

(See the following chapters for details.)

  • You should of course start by installing the module. This should be a one-time deal, unless a new version has been released since last time you used it. New versions of the module should of course be tested in QA like any other software.
  • Then, make sure that Microsoft Update is active.
  • Check for updates to get a list of available patches.
  • Hide any unwanted patches
  • Install the updates
  • Re-check for updates to make sure there are no “round-two” patches to install.

Continue reading “Microsoft Update with PSWindowsUpdate 2.0”

About the UserAccountControl attribute

Intro

When working with MIM you will sooner or later have to deal directly with the UserAccountControl Active Directory attribute. This attribute defines account options, and we use it most prevalently to enable and disable users, but there are a lot of other options as well. These options are stored in a binary value as bit flags, where each bit defines a specific function.

Bit number 1 (or 2 if you are not used to zero-based numbering) defines whether or not an account is enabled. Bit number 9 defines an account as a normal account. Thus, a normal disabled account will have bits 1 and 9 set to one. As long as no other bits are set, the decimal value is 2^9 + 2^1 = 514 or (0010 0000 0010). If we enable the account, the value is 2^9 = 512 (0010 0000 0000).

In MIM we are usually presented with decimal values. These are easier to read, but not necessarily easier to understand.

Continue reading “About the UserAccountControl attribute”