Running out of time in the lab

First a friendly warning; This post details procedures for messing with the time service on domain controllers. As always, if you do not understand the commands or their consequences; seek guidance.

Problem

I have been upgrading my lab to Windows Server 2016 in preparation for a production rollout. Some may feel I am late to the game, but I have always been reluctant to roll out new server operating systems quickly. I prefer to have a good baseline of other peoples problems to look for in your friendly neighborhood tracking service (AKA search engine) when something goes wrong.

Anyways, some weeks ago I rolled out 2016 on my domain controller. When I came back to upgrade the Hyper-V hosts, I noticed time was off by 126 seconds between the DC and the client. As the clock on the DC was correct, I figured the problem was client related. Into the abyss of w32tm we go.

Analysis

The Windows Time Service is not exactly known for its user friendliness, so I just started with the normal shotgun approach at the client:

net stop w32time
w32tm /config /syncfromflags:domhier
net start w32time

These commands, if executed at an administrative command prompt, will remind the client to get its time from the domain time sync hierarchy, in other words one of the DCs. If possible. Otherwise it will just let the clock drift until it passes the time delta maximum, at which time it will not be able to talk to the DC any more. This is usually the point when your friendly local monitoring system will alert you to the issue. Or your users will complain. But I digress.

Issuing a w32tm /resync command afterwards should guarantee an attempt to sync, and hopefully a successful result. At least in my dreams. In reality though, it just produced another nasty error:  0x800705B4. The tracking service indicated that it translates to “operation timed out”. 

The next step was to try a stripchart. The stripchart option instructs w32tm to query a given computer and show the time delta between the local and remote computer. Kind of like ping for time servers. The result should look something like this:

SNAGHTMLbf7b59

But unfortunately, this is what I got:

image

I shall spare you the details of all the head-scratching and ancient Viking rituals performed at the poor client to no avail. Suffice it to say that I finally realized the problem had to be related to the DC upgrade. I tried running the stripchart from the DC itself against localhost, and that failed as well. That should have been a clue that something was wrong with Time Service itself. But as troubleshooting the Time Service involves decoding its registry keys, I went to confirm the firewall rules instead. Which of course were hunky-dory.

image

I then ran dcdiag /test:advertising /v to check if the server was set to advertise as a time server:

image

 

The next step was to reset the configuration for the Time Service. The official procedure is as follows:

net stop w32time
w32tm.exe /unregister
w32tm.exe /register
net start w32time

This procedure usually ends with some error message complaining about the service being unable to start due to some kind of permission issue with the service. I seem to remember error 1902 is one of the options. If this happens, first try 2 consecutive reboots. Yes, two. Not one. Don’t ask why, no one knows. If that does not help, try again but this time with a reboot after the unregister command.

The procedure ran flawlessly this time, but it did not solve the problem.

Time to don the explorer’s hat and venture into the maze of the registry. The Time Service hangs out in HKLM\System\CurrentControlSet\Services\W32Time. After some digging around, I found that the NTP Server Enabled key was set to 0. Which would suggest that it was turned off. I mean, registry settings are tricksy, but there are limits. I tried changing it to 1 and restarted the service.

image

Suddenly, everything works. The question is why… Not why it started working, but why the setting was changed to 0. I am positive time sync was working fine prior to the upgrade. Back to the tracking service I went. Could there be a new method for time sync in Windows 2016? Was it all a big conspiracy caused by Russian hackers in league with Trump? Of course not. As usual the culprits are the makers of the code.

Solution

My scenario is not a complete match, but in KB3201265 Microsoft admits to having botched the upgrade process for Windows Time Service in both Windows Server 2016 and the corresponding Windows 10 1607. Basically, it applies default registry settings for a non-domain-joined server. Optimistic as always they tell you to export the registry settings for the service PRIOR to upgrading. As if I have the time to read every KB they publish. Anyways, it also details a couple of other possible solutions, such as how to get the previous registry settings out of Windows.old.

My recommendation is as such: Do not upgrade your domain controllers. Especially not in production. I only did it in the lab because I wanted to save time.

If you as me have put yourself in this situation, and honestly, why else would you have read this far, I recommend following method 3 in KB3201265. Unless you feel comfortable exploring the registry and fixing it manually.