Local System certificate store pooched after Windows update

Problem

After patching one of our SQL servers, it started acting strangely. Suddenly, the Reporting Services service refused to serve HTTPS requests, and the SCOM monitoring agent refused to start. The error message from the Reporting Services website, as reported by Opera, was “Secure connection: fatal error 552”. This could be translated to either “Requested file action aborted, storage allocation exceeded”, which is an FTP status code, or “552 – Unknown authentication service call-back”, which is a more likely explanation.

An examination of the event logs on the server revealed some certificate-related messages from the SCOM agent:

Log Name:      Operations Manager
Source:        HealthService
Date:          17.03.2011 17:26:55
Event ID:      7029
Task Category: Health Service
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ##########
Description:
The Health Service was detected that the private key for secure data processing has been removed or is invalid.  The certificate and key will be regenerated.

Log Name:      Operations Manager
Source:        HealthService
Date:          17.03.2011 17:26:55
Event ID:      7022
Task Category: Health Service
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ##########
Description:
The Health Service has downloaded secure configuration for management group ##########, and processing the configuration failed with error code Cannot find the certificate and private key for decryption.(0x8009200B).

Log Name:      Operations Manager
Source:        HealthService
Date:          17.03.2011 17:26:55
Event ID:      1220
Task Category: Health Service
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ##########
Description:
Received configuration cannot be processed. Management group "##########". The error is Cannot find the certificate and private key for decryption.(0x8009200B).

When we tried to restart the service, the following event occurred:

Log Name:      Operations Manager
Source:        OpsMgr Connector
Date:          23.03.2011 09:07:33
Event ID:      21021
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ##########
Description:
No certificate could be loaded or created.  This Health Service will not be able to communicate with other health services.  Look for previous events in the event log for more detail.

We also tried to assign a new HTTPS certificate to MSSQL Reporting services, which raised the following events:

Log Name:      System
Source:        Schannel
Date:          23.03.2011 10:19:09
Event ID:      36870
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ##########
Description:
A fatal error occurred when attempting to access the SSL server credential private key. The error code returned from the cryptographic module is 0x80090016.

Log Name:      System
Source:        Schannel
Date:          23.03.2011 10:19:09
Event ID:      36870
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      ##########
Description:
A fatal error occurred when attempting to access the SSL server credential private key. The error code returned from the cryptographic module is 0x8009030d.
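
The error codes in these events are standard Windows HRESULT values, so they can be taken apart and looked up. Here is a minimal sketch in Python; the lookup table only covers the three codes seen above (names and texts as defined in winerror.h), and the function name decode_hresult is our own invention for illustration.

```python
# Codes that appear in the events above; names and texts are from winerror.h.
KNOWN_HRESULTS = {
    0x8009200B: ("CRYPT_E_NO_KEY_PROPERTY",
                 "Cannot find the certificate and private key for decryption."),
    0x80090016: ("NTE_BAD_KEYSET", "Keyset does not exist."),
    0x8009030D: ("SEC_E_UNKNOWN_CREDENTIALS",
                 "The credentials supplied to the package were not recognized."),
}


def decode_hresult(value):
    """Split a 32-bit HRESULT into its failure bit, facility and code."""
    name, text = KNOWN_HRESULTS.get(value, ("(not in table)", ""))
    return {
        "failure": bool((value >> 31) & 1),   # top bit set means failure
        "facility": (value >> 16) & 0x7FF,    # 11-bit facility field
        "code": value & 0xFFFF,               # low 16 bits
        "name": name,
        "text": text,
    }


for hr in (0x8009200B, 0x80090016, 0x8009030D):
    d = decode_hresult(hr)
    print(f"0x{hr:08X}: facility={d['facility']} "
          f"code=0x{d['code']:04X} {d['name']}: {d['text']}")
```

All three codes belong to facility 9, the security facility. NTE_BAD_KEYSET in particular means the key container could not be opened, which fits a private key file that exists on disk but cannot be read.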

Further investigation led us to an article on TechNet. It relates to a Windows 2000 server, but the event log messages it mentions look a lot like the ones listed above. And happily, it put us on the right track to a solution.

Solution

All our problems were caused by the fact that the local computer certificate store on the server was pooched. To be specific: The local System user and the local Administrators group did not have the necessary file system access rights to the folder where the certificates are stored. On Windows 2000 they are located in

%SystemDrive%\Documents and Settings\All Users\Application Data\Microsoft\Crypto\RSA\MachineKeys
%SystemDrive%\Documents and Settings\All Users\Application Data\Microsoft\Crypto\RSA\S-1-5-18

Our server, on the other hand, ran Windows Server 2008 R2, where the folders are:

%ProgramData%\Microsoft\Crypto\RSA\MachineKeys
%ProgramData%\Microsoft\Crypto\RSA\S-1-5-18

We found a description of these folders on MSDN.
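
To see what the permissions currently look like without clicking through the security dialogs, the folders can be listed with the built-in icacls tool. The sketch below assumes the key folders live under the ProgramData folder (normally C:\ProgramData on Windows Server 2008 R2); key_folders and show_acls are hypothetical helper names, and the script should be run from an elevated prompt.

```python
import ntpath
import os
import subprocess


def key_folders(program_data):
    """Return the two machine key folders below the given ProgramData path."""
    rsa = ntpath.join(program_data, "Microsoft", "Crypto", "RSA")
    return [ntpath.join(rsa, "MachineKeys"), ntpath.join(rsa, "S-1-5-18")]


def show_acls(program_data=None, run=subprocess.run):
    """Print the ACL of each key folder by calling icacls on it."""
    # Assumption: fall back to C:\ProgramData if the variable is not set.
    base = program_data or os.environ.get("ProgramData", r"C:\ProgramData")
    for folder in key_folders(base):
        # icacls with only a path displays the ACL and changes nothing.
        run(["icacls", folder], check=False)
```

Calling show_acls() on both the broken server and a working one gives two listings that are easy to compare side by side.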

The System user and the Administrators group should be assigned Full Control on these folders and all subfolders and files. Furthermore, both folders and their subfolders and files should be owned by the Administrators group. We checked a working server, and on its MachineKeys folder the Everyone group was also assigned Full Control.
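
We fixed the permissions by hand, but the same change could be scripted. This is only a sketch of how that might look with icacls, not what we actually ran: repair_commands is a made-up helper, the folder paths assume the default C:\ProgramData location, and the script only prints the commands so they can be reviewed before being pasted into an elevated prompt.

```python
import subprocess

# Well-known SIDs; using SIDs avoids problems with localized account names.
ADMINISTRATORS = "*S-1-5-32-544"   # built-in Administrators group
LOCAL_SYSTEM = "*S-1-5-18"         # Local System account

# Assumption: default ProgramData location on the system drive.
FOLDERS = [
    r"C:\ProgramData\Microsoft\Crypto\RSA\MachineKeys",
    r"C:\ProgramData\Microsoft\Crypto\RSA\S-1-5-18",
]


def repair_commands(folder):
    """Build icacls commands that set the owner and grant Full Control."""
    return [
        # Make Administrators the owner of the folder and everything below it.
        ["icacls", folder, "/setowner", ADMINISTRATORS, "/T", "/C"],
        # Grant inheritable Full Control to Administrators and Local System.
        ["icacls", folder, "/grant",
         f"{ADMINISTRATORS}:(OI)(CI)F", f"{LOCAL_SYSTEM}:(OI)(CI)F",
         "/T", "/C"],
    ]


# Dry run: print the commands instead of executing them.
for folder in FOLDERS:
    for cmd in repair_commands(folder):
        print(subprocess.list2cmdline(cmd))
```

If the Administrators group has lost access completely, icacls may be refused; in that case taking ownership first with takeown /F <folder> /A /R is probably needed.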

The following screenshots are from a working server that has not experienced the errors:

image

The dialog says Special permissions, but it is actually Full Control.

image

image

image

After the permissions had been corrected, we restarted the Cryptographic Services service (CryptSvc) to make sure the certificate store was working.
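
The restart can be done from the Services console, or scripted. A small sketch using the built-in net command follows; restart_service is a hypothetical helper, CryptSvc is the short name of the Cryptographic Services service, and the call needs an elevated prompt.

```python
import subprocess


def restart_service(name="CryptSvc", run=subprocess.run):
    """Stop and then start a Windows service with the net command."""
    # /y answers the confirmation prompt about dependent services.
    run(["net", "stop", name, "/y"], check=True)
    run(["net", "start", name], check=True)
```

Nothing runs until restart_service() is called. With check=True, a failed stop raises an exception before the start is attempted.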

We also had to create a new certificate for the MSSQL Reporting services and bind the new certificate to the service. But as long as you haven’t tampered with the Reporting services certificate binding (like we did during troubleshooting), it shouldn’t be necessary.

Author: DizzyBadger

SQL Server DBA, Cluster expert, Principal Analyst
