Orchestrator bug kills database server

Update

20131031: Code to limit the size of the error log files added to the solution. So far only tested on SQL 2012 SP1 CU2.

Problem

The database server or its operating system starts complaining about low disk space. Analysis reveals that the volume containing the default MSSQL log folder is very low on space, maybe even at 0 bytes free. Further analysis shows that the ERRORLOG files are taking up all remaining space on the volume.

Analysis

The ERRORLOG files are filled to the brim with messages like this one:

Date		07.01.2013 21:09:09
Log		SQL Server (Current - 07.01.2013 14:57:00)

Source		spid29s

Message
The activated proc '[Microsoft.SystemCenter.Orchestrator.Maintenance].[MaintenanceWorker]' running on queue 'OrchestratorDB.Microsoft.SystemCenter.Orchestrator.Maintenance.MaintenanceServiceQueue' output the following:  'Error: 9002, The transaction log for database 'OrchestratorDB' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases'

Orchestrator has a maintenance process that does not handle a situation where the database transaction log is unable to grow, either because of size restrictions or because the disk is full. When this happens, it bloats the ERRORLOG with the message shown above. This happens very quickly, as the process retries several times each second, so even a 10 GB volume is filled in hours. If this situation is left unchecked for a couple of days, the risk of NTFS corruption on the volume is high.
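
The error message itself points to the log_reuse_wait_desc column in sys.databases. A minimal sketch of that check follows; the database name OrchestratorDB is taken from the sample message above and may differ in your environment.

```sql
-- Why can the transaction log for OrchestratorDB not be reused?
-- LOG_BACKUP, for example, means the log is waiting for a log backup.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'OrchestratorDB';

-- Log size and percentage of the log in use for every database on the instance
DBCC SQLPERF(LOGSPACE);
```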

Solution

Disable access to the Orchestrator database or stop the Orchestrator servers to prevent further bloating.
Stop the database instance.
Delete the bloated error log files to free up space.
Start the instance again.

Change the number of error logs SQL Server is allowed to keep (the default is 6; the sample sets it to 10) and limit the size of each error log file to about 50 MB. This gives you a total of 11 error log files (10 archived logs plus the current one), each with a maximum size of about 50 000 KB, for a total of about 550 MB.

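USE [master]
GO
-- Keep 10 archived error logs in addition to the current one
EXEC xp_instance_regwrite N'HKEY_LOCAL_MACHINE',
 N'Software\Microsoft\MSSQLServer\MSSQLServer',
 N'NumErrorLogs', REG_DWORD, 10
GO
-- Limit each error log file to 50 000 KB; a new log is started when the limit is reached
EXEC xp_instance_regwrite N'HKEY_LOCAL_MACHINE',
 N'Software\Microsoft\MSSQLServer\MSSQLServer',
 N'ErrorLogSizeInKb', REG_DWORD, 50000
GO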
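
To confirm that the values were written, they can be read back with xp_instance_regread, the read counterpart of the procedure used above. This is only a sketch: both procedures are undocumented, and whether the new limits apply before the next instance restart is something to verify on your own build. sp_cycle_errorlog closes the current error log and starts a new one.

```sql
USE [master]
GO
-- Read back the number of archived error logs to keep
EXEC xp_instance_regread N'HKEY_LOCAL_MACHINE',
 N'Software\Microsoft\MSSQLServer\MSSQLServer',
 N'NumErrorLogs'
GO
-- Read back the size limit per error log file, in KB
EXEC xp_instance_regread N'HKEY_LOCAL_MACHINE',
 N'Software\Microsoft\MSSQLServer\MSSQLServer',
 N'ErrorLogSizeInKb'
GO
-- Close the current error log and start a fresh one
EXEC sp_cycle_errorlog
GO
```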

Solve the problem regarding the Orchestrator DB transaction log. Usually such problems are caused by either a failed backup job (thus no log truncation) or the simple fact that you need to allocate more space for the data. If a failed backup job is the problem, consider shrinking the transaction logs back down to their normal size. See https://lokna.no/?p=349 for more information about shrinking troublesome logs (Norwegian version only).
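
For readers who cannot use the Norwegian article, a rough sketch of shrinking the log could look like the following. The backup path, the logical file name OrchestratorDB_log and the 1024 MB target are all assumptions for illustration; look up the real logical name with the first query, and skip the log backup if the database uses the simple recovery model.

```sql
USE [OrchestratorDB]
GO
-- Find the logical name and current size (in 8 KB pages) of the log file
SELECT name, type_desc, size
FROM sys.database_files
GO
-- Full or bulk-logged recovery model only: back up the log so the inactive
-- part can be truncated. The path is a placeholder.
BACKUP LOG [OrchestratorDB] TO DISK = N'X:\Backup\OrchestratorDB_log.trn'
GO
-- Shrink the log file to roughly 1024 MB.
-- 'OrchestratorDB_log' is an assumed logical file name.
DBCC SHRINKFILE (N'OrchestratorDB_log', 1024)
GO
```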
Enable access to the Orchestrator database.
Monitor the error logs and see if the values you chose for log size and number of logs fit your workload. If the logs contain at least a day of messages, but not more than a month, you are in the butter zone.
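
One way to keep an eye on this is to list the error log files with their dates and sizes, and to search the current log for the Orchestrator message. The sketch below relies on two undocumented extended procedures, so treat the parameters as assumptions to check against your SQL Server version.

```sql
-- List the current (archive 0) and archived error logs with date and size in bytes
EXEC master.dbo.xp_enumerrorlogs
GO
-- Search the current SQL Server error log (log 0, type 1) for entries
-- mentioning the Orchestrator maintenance procedure
EXEC master.dbo.xp_readerrorlog 0, 1, N'MaintenanceWorker'
GO
```

If the oldest archive is less than a day old, the logs are rolling over too quickly and the limits may need to be raised.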

Last edit: Friday, November 1, 2013

Author: DizzyBadger

SQL Server DBA, Cluster expert, Principal Analyst
