Permission denied restoring imported backup

Problem

Trying to restore a backup of a database on a different server than the one where the backup originated generates the following error message: “The operating system returned the error ‘5(failed to retrieve text for this error. Reason: 15105)’ while attempting ‘RestoreContainer::ValidateTargetForCreation’”.


Analysis

Error 5 is, as always, access denied. I don’t know why the text retrieval fails, but that is a problem for another day :). Restoring backups that originate from the destination server works as expected on that server, and the SQL Server service account has the required permissions granted on both the .bak file and the target folder. Restoring the same backup file on other servers during staging also worked as expected. I thus concluded that there had to be some difference between the staging and production servers causing the issue:

  • The staging servers are virtual, stand-alone servers; production is a physical cluster.
  • The staging servers use VM drives as data volumes; production has SAN disks attached to mount points.
  • Staging has a couple of cores and > 20 GiB of RAM; production has 16 cores and > 200 GiB.

Then it hit me: mount points are renowned for causing strange permission issues, because the permissions for the mount point and for the mounted volume itself are stored separately in different ACLs.


Further investigation revealed that the service account had full control permissions in both ACLs, but the volume permissions on the data and transaction log volumes were granted via group membership, while the mount point permissions on those volumes were granted explicitly. This doesn’t seem to be a problem for other operations, but when you restore a database to a different destination than the one it came from, the service account needs explicit permissions on the destination folder(s). That is, when the original file name and Restore As are not the same.


Solution

Grant the SQL Server service account explicit permissions on both the volume and the mount point. If you have different mount points for transaction log and data files, you have to do this on both folders. Furthermore, I would guess similar errors could occur if the service account lacks access to the source .bak file.
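If you prefer the command line, a minimal sketch with icacls could look like this; the paths and the service account name are hypothetical placeholders for your own environment. Note that a mounted folder has two ACLs: icacls against the path typically operates on the mounted volume root, so verify the mount point folder’s own ACL separately (e.g. on the Security tab viewed from its parent folder).

#Grant the (hypothetical) SQL Server service account explicit full control
#on the data and log mount point folders and the volumes mounted there
icacls "D:\MSSQL\Data" /grant "DOMAIN\svc-sqlserver:(OI)(CI)F"
icacls "D:\MSSQL\Log"  /grant "DOMAIN\svc-sqlserver:(OI)(CI)F"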

Cluster disk resource XX contains an invalid mount point

Problem

During cluster startup or failover, one of the following events is logged in the system event log:


Event-ID 1208 from Physical Disk Resource: Cluster disk resource ‘[Resource name]’ contains an invalid mount point. Both the source and target disks associated with the mount point must be clustered disks, and must be members of the same group.
Mount point ‘[Mount path]’ for volume ‘\\?\Volume{[GUID]}\’ references an invalid target disk. Please ensure that the target disk is also a clustered disk and in the same group as the source disk (hosting the mount point).

Cause and investigation

The cause could of course be that the base drive is not a clustered disk, as the event message states. If that is the case, read a book about WFC (Windows Failover Clustering) and try again. If not, I have found the following causes:

  • If the mount point path is C:\$Recycle.bin\[guid], it is caused by replacing a SAN drive with another one at the same drive letter or mount point but with a different LUN. This confuses the recycle bin.
  • If the clustered drive for either the mount point or the volume being mounted is in maintenance mode and/or currently running autochk/chkdsk. This can happen so quickly that you are unable to detect it, and by the time you come back to check, the services are already up and running. Unless you disable it, WFC will run autochk/chkdsk when a drive with the dirty bit set is brought online. This is probably logged somewhere, but I have yet to determine in which log. Look in the application event log for Chkdsk events or something like this:

Event 17207 from MSSQL[instance]:

Event 1066 from FailoverClustering

Resolution

  • If it is the $Recycle.bin folder, make sure you have a backup of your data and delete the mount point folder under C:\$Recycle.bin. You might have to take ownership of the folder to be able to complete this task. If files are in use, take all cluster resources offline and try again.
  • If you suspect a corrupt mount point or drive, run chkdsk on ALL clustered drives. See https://lokna.no/?p=1194 for details.

Check C:\Windows\Cluster\Reports (the default location) for files named ChkDSK_[volume].txt, indicating that the cluster service has triggered an automatic chkdsk on a drive.
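To speed up the investigation, a quick sketch along these lines lists any chkdsk reports and queries the dirty bit directly. The report path is the default one, and the volume paths are placeholders for your own drive letters and mount points:

#List chkdsk reports left behind by the cluster service (default location)
Get-ChildItem 'C:\Windows\Cluster\Reports' -Filter 'ChkDSK_*.txt' |
	Sort-Object LastWriteTime -Descending |
	Select-Object Name, LastWriteTime

#fsutil reports whether the dirty bit is set; mounted folder paths should also work
fsutil dirty query E:
fsutil dirty query E:\MountPoints\Data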

Run disk maintenance on a failover cluster mountpoint

Problem

“Validate this cluster” or another tool tells you that the dirty bit is set for a cluster shared volume, and taking the disk offline and online again (to trigger autochk) does not help.


Continue reading “Run disk maintenance on a failover cluster mountpoint”

Annoying default settings

I have never quite liked the way Microsoft wants me to use Windows Explorer. The standard settings are quite annoying to me, but I understand why they are the way they are on end-user versions of Windows. Joe User is stupid, usually more so than you might imagine possible, so it is important to protect him against himself. On a server, on the other hand, I would think we could anticipate some minimal knowledge about the file system. A server user should be able to look at a system file without thinking: “Hmm, bootmgr is a file I haven’t seen before. I should probably delete it. And that big Windows folder just contains a lot of strange files I never use. I’m deleting some of those too, it will leave more room for pictures of my cat!”. But no, it has the same stupid defaults as the home editions. Because of this, I have had to create a list of all the stuff I have to remember to change whenever I log on to a new server, lest I go insane and maul the next poor user who wants me to recover the database he “forgot” to back up before the disk crashed. :P

Continue reading “Annoying default settings”

Event ID 1006 from GroupPolicy

Problem

Event 1006 is logged several times each day in the system event log with the message “The processing of Group Policy failed. Windows could not authenticate to the Active Directory service on a domain controller. (LDAP Bind function call failed). Look in the details tab for error code and description.” The details pane lists Invalid Credentials as the error description.


Analysis

This error is most likely caused by a user session that is logged on to the machine with an expired domain password. The user name event property identifies the user in question. This situation typically arises when users stay logged on to a computer or server for several weeks at a time, long enough for a domain password expiry policy to force a password change. The user is prompted to change the password at the next login, but if the user never logs out, the session keeps running with the old credentials. The same error will occur whether the user’s session is a disconnected or an active remote desktop session.

Solution

Log out and log back in to trigger the password change dialog. If the password has already been changed on another computer or directly in the directory, just log back in with your new password.

If your own session isn’t the culprit, you can forcibly log out another user using Remote Desktop Services Manager (server only) or Task Manager. Be aware that this method will close all programs in that session without saving.
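The same can be achieved from an elevated command prompt or PowerShell window. A minimal sketch; the session ID below is of course a placeholder:

#List the sessions on the server with their IDs and states
query session

#Log off the offending session by ID (forcibly closes its programs)
logoff 2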

File server memory leak?

Problem

On two of our file servers (Windows 2008 R2) we noticed an increase in memory usage over time. It would start out at, say, 1.5 GiB after a boot, and then slowly work its way up to 6 GiB, which was the amount of memory allocated to the server (VMware). This being a busy file server due to hosting our user profiles for Citrix, we tried increasing the memory allocation to 8 GiB. Sadly, the only effect was that reaching 99% memory usage took longer after a reboot. After a day or two it would be back up. Further investigation revealed that it also affected performance. Backing up 800 GiB took 18 hours, and once in a while it would just give up. Testing also revealed that profile access was sometimes slow.

Continue reading “File server memory leak?”

Folder copy with logging in Powershell, and a bit about scripts in general

Problem

I have been trying for some time to find an easy method for keeping my scripts up to date on my servers. I could of course use robocopy or something like that, but I wanted something written in PS. I figured I would learn something along the way, and I also had hopes that this would be fairly easy to accomplish. It would seem I was wrong on the easy part, or perhaps I have over-engineered it slightly ;).

Not wanting to re-invent the wheel, I summoned the powers of the closest search engine to come up with some samples I could build on. I am a bit prejudiced against scripts from the web in general, as I usually find that most scripts longer than a few lines have some logical bugs in them. Scripts are, in general, easy to get started with, but it is very difficult to produce robust scripts. I have debugged countless VB and PowerShell scripts (both my own and those of others) that were working fine in the lab, and perhaps also in production, until they suddenly ceased to function as expected. Usually this is caused by some simple logical error surfacing due to changes in the environment the script is running in, but from time to time you come across some obscure scripting engine bug you have to program around. And of course you have the purely idiotic fails, such as creating VB scripts requiring “On error resume next” at the top of the script. Those are usually doomed by design and can take days to debug. Since I am fairly proficient in C# I usually just write a small utility .exe instead, thus circumventing many of the problems altogether. Once you have spent 4 hours debugging an error caused by a misspelled variable name in the middle of a 200 line script, you start dreaming about the wonders of explicit variable declaration and IntelliSense. Anyways, I think that is enough ranting for one post. :)
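As an aside, PowerShell does have a partial remedy for the misspelled-variable problem. A minimal sketch:

#Set-StrictMode makes references to uninitialized variables throw at run time,
#catching the misspelled-variable class of bugs early
Set-StrictMode -Version Latest
$lineCount = 200
Write-Host $lineCuont   #Now throws instead of silently evaluating to nothing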

Continue reading “Folder copy with logging in Powershell, and a bit about scripts in general”

Enable logging for Windows Firewall (2008R2)

When troubleshooting problems with the internal Windows firewall it might be beneficial to know exactly what traffic is being blocked. One can of course just turn the firewall off to test if things start working, and then search for documentation for the failing application or service. Sadly, such an approach causes security issues during testing, and documentation is often not complete as to which ports an application actually depends on. The firewall log makes it somewhat easier to troubleshoot without having to disable the firewall completely.

Configuration

Start by bringing up the firewall properties from the Windows Firewall with Advanced Security MMC snap-in.


You can configure logging for each of the profiles (domain, public and private). By default they all log to the same file, %windir%\system32\LogFiles\Firewall\pfirewall.log. It might be smart to use different log files if you have connections on more than one profile, e.g. if you have one LAN and one WAN adapter. Logging dropped packets only is recommended, as logging successful connections will fill up the log quickly on a busy server.


I would recommend turning logging off when troubleshooting is finished, and leaving the log size limit at 4 096 KiB. If you specify a different folder than the default one, you must make sure that the firewall service has the necessary file system permissions. Unlike the w3svc log, the firewall log is limited to two files, the main .log file and a .old file. This ensures that the disk is not filled with firewall log files, and translates to a maximum disk space allocation of two times the size limit.
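The same settings can be scripted with netsh instead of the GUI. A sketch for the domain profile, run from an elevated command prompt and assuming the default file location; repeat for the other profiles, or use allprofiles:

netsh advfirewall set domainprofile logging droppedconnections enable
netsh advfirewall set domainprofile logging maxfilesize 4096
netsh advfirewall set domainprofile logging filename "%windir%\system32\LogFiles\Firewall\pfirewall.log"

Replace enable with disable on the first line to turn logging off again when you are done.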

Analysis

The log files are space-delimited and can be imported into a spreadsheet for analysis, but it is easier to use a specialized log analyzer such as Sawmill (a large professional tool) or ZedLan Firewall Log Analyser (freeware).
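For quick ad-hoc analysis, PowerShell can also chew through the file. A minimal sketch, assuming the default log location and the standard #Fields header line at the top of the log:

#Parse pfirewall.log into objects using the field names from the header line
$logPath = "$env:windir\system32\LogFiles\Firewall\pfirewall.log"
$lines   = Get-Content $logPath
$fields  = (($lines | Where-Object { $_ -like '#Fields:*' } | Select-Object -First 1) -replace '^#Fields: ') -split ' '
$entries = $lines | Where-Object { $_ -notmatch '^#' -and $_ } | ConvertFrom-Csv -Delimiter ' ' -Header $fields

#Example: the ten destination ports with the most dropped packets
$entries | Where-Object { $_.action -eq 'DROP' } |
	Group-Object 'dst-port' | Sort-Object Count -Descending | Select-Object -First 10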

Unable to access local drive(s)

Problem

On a Windows 2008 or 2008 R2 server, administrators are unable to browse the contents of local drives while logged on to the server, either directly at the console or via remote desktop. Access to the same drive through a network share works fine. UAC is turned on, and the local Administrators group has full control access to the drive(s) in question. You get an “Access denied” error in Windows Explorer even when running in an elevated process (administrator mode).

The problem also affects Windows Vista and 7.

Analysis

If you try to access the drive using a program other than Windows Explorer, you can access the drive as long as the program is running in an elevated session. The problem seems to affect Windows Explorer alone, but I am not sure about that. What I have been able to establish, though, is that it only affects users who are members of the local “Administrators” group. If a user has explicit access, or access through another group, everything works as expected.

I detected the problem while migrating files and permissions from an old 2003 server to a new one running 2008 R2, and I think it is related to the local “Users” group not being granted access to the drive. Not denied, just removed from the root ACL of the drive.

Solutions

  • Add explicit access to the drive for the administrative users that need access.
  • Turn off UAC (not recommended).
  • Create a new group called Local_Admin_Access or something like that, add the local Administrators group as a member, and give the new group full control of the drive.
  • Give the local group “Interactive” full control of the drive. This grants access to any user who has local logon permissions and is currently logged on to the server; see the sketch below.
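A minimal sketch of the last option using Get-Acl/Set-Acl; “D:\” is a placeholder for the affected drive:

#Grant NT AUTHORITY\INTERACTIVE full control of the drive root,
#inherited by all folders (ContainerInherit) and files (ObjectInherit)
$acl  = Get-Acl 'D:\'
$rule = New-Object System.Security.AccessControl.FileSystemAccessRule -ArgumentList 'NT AUTHORITY\INTERACTIVE',
	'FullControl', 'ContainerInherit,ObjectInherit', 'None', 'Allow'
$acl.AddAccessRule($rule)
Set-Acl -Path 'D:\' -AclObject $acl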

Check if processes are running

Intended for use as part of a bigger script where you need to check whether a process or processes are running or closed. The script checks if an array of processes is running on the system, and counts how many of them are running in an integer variable.

#Declare variables. The script: scope lets checkProcesses update the counter.
[int]$script:intRunning = 0
[bool]$Debug = $true

#Main logic
function Main
{
	$menucolor = [System.ConsoleColor]::White
	Write-Host '-------------------------------------------------------------------' -ForegroundColor $menucolor
	Write-Host '|                 Check if processes are running                  |' -ForegroundColor $menucolor
	Write-Host '|                         Jan Kåre Lokna                          |' -ForegroundColor $menucolor
	Write-Host '|                              v 1.0                              |' -ForegroundColor $menucolor
	Write-Host '-------------------------------------------------------------------' -ForegroundColor $menucolor
	Write-Host
	checkProcesses
}

#Check processes
function checkProcesses
{
	#The last entry is intentionally bogus, to demonstrate the "not running" path
	$processes = "iexplore", "winamp", "Opera", "dfdsafs"

	foreach ($process in $processes)
	{
		try
		{
			#Get-Process throws when no matching process exists, sending us to the catch block
			$null = Get-Process $process -ErrorAction Stop
			if ($Debug) { Write-Host $process 'is running' -ForegroundColor Green }
			$script:intRunning += 1
		}
		catch
		{
			if ($Debug) { Write-Host $process 'is not running' -ForegroundColor Magenta }
		}
	}
	if ($Debug) { Write-Host "Running processes:" $script:intRunning "of" $processes.Count }
}
. Main