Removing large backup history with sp_delete_backuphistory


In SQL Server, the sp_delete_backuphistory stored procedure is the standard, supported method to purge old data from the backup and restore history tables in the msdb database:

  • backupfile
  • backupfilegroup
  • backupmediafamily
  • backupmediaset
  • backupset
  • restorefile
  • restorefilegroup
  • restorehistory

Recently, I encountered an issue running sp_delete_backuphistory on servers that hosted a large number of databases with frequent log backup and restore operations. The cleanup task hadn’t been scheduled and the history tables had grown very large over several months. The msdb database was also hosted on a volume with limited IOPS.

If you attempt to run sp_delete_backuphistory under these conditions, you will likely encounter these issues:

  • The operation will take a long time
  • Log backup and restore operations are blocked during the process
  • Your msdb log file will grow large if you try to remove too much data at once
  • If you kill the process at any point, the whole operation will roll back.

The sp_delete_backuphistory stored procedure gets the job done in most cases, but it isn’t well optimized when you have a large amount of history to clear.

Optimized Solution for Backup History Cleanup

To clear a large amount of history without resorting to schema modifications or unsupported methods, we can take an incremental approach to removing data.

The script below first checks how much retention we currently have. It then calls sp_delete_backuphistory in a loop, removing data one hour (configurable) at a time until we hit the target retention. We can also specify a maximum runtime for the process.

This approach won’t necessarily speed things up but it provides the following benefits:

  • Data is cleared as a series of smaller transactions rather than a single large transaction
  • Smaller transactions allow for log truncation and prevent the log file from growing.
  • We can interrupt the process without rolling back all the progress that has been made (only the current batch is rolled back).
  • Log backups/restores might still be blocked, but only for the duration it takes to complete a single loop – not for the entire time it takes to clear the history. This can still impact RPO/RTO though.
  • We can control how long the process runs for. We can keep it running outside business hours or allow for breaks to prevent log backup/restores from falling too far behind.
  • The script will print progress information.
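The loop described above might be sketched like this. This is a minimal sketch, not the exact script from the original post: the variable names, batch size and thresholds are illustrative.

```sql
USE msdb;

DECLARE @RetentionDays     int = 90,      -- target retention (illustrative)
        @BatchHours        int = 1,       -- remove this much history per loop
        @MaxRuntimeMinutes int = 60,      -- stop after this long
        @StartTime         datetime = GETDATE(),
        @OldestBackup      datetime,
        @CutOff            datetime = DATEADD(DAY, -90, GETDATE()),
        @msg               nvarchar(100);

-- Check how much retention we currently have
SELECT @OldestBackup = MIN(backup_start_date) FROM dbo.backupset;

WHILE @OldestBackup < @CutOff
   AND DATEDIFF(MINUTE, @StartTime, GETDATE()) < @MaxRuntimeMinutes
BEGIN
    -- Each call removes one small slice of history as its own transaction,
    -- so interrupting the process only rolls back the current batch.
    SET @OldestBackup = DATEADD(HOUR, @BatchHours, @OldestBackup);
    IF @OldestBackup > @CutOff SET @OldestBackup = @CutOff;

    EXEC dbo.sp_delete_backuphistory @oldest_date = @OldestBackup;

    -- Print progress (RAISERROR ... WITH NOWAIT flushes immediately)
    SET @msg = N'Deleted history older than ' + CONVERT(nvarchar(19), @OldestBackup, 120);
    RAISERROR(@msg, 0, 1) WITH NOWAIT;
END
```

Because each iteration commits separately, the msdb log can truncate between batches instead of growing for the duration of a single large delete.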


These related links might be interesting:

Get drive data for ALL drives in T-SQL

There are a few methods you can use to get drive information using T-SQL. The problem is that all these methods have some limitations: they provide free space but not drive capacity, or they only provide data for volumes that contain database files. Ideally, Microsoft would provide an easy way to get both drive capacity and free space for all drives. In this article I’ll demonstrate a way to do this using T-SQL.

Here are a few existing methods that you might consider before we continue:

If these work for your needs, great! If you need capacity and free space for ALL drives you will need a different solution. The best option might be to query that data outside of SQL Server using PowerShell or your programming/scripting language of choice. If you need to do this in T-SQL though, the script below provides a way.

The script uses xp_cmdshell to run a PowerShell command that gets data for ALL the volumes on your system. PowerShell formats this data as XML, which we capture in the @output table variable – one row per line of XML. We then convert this back to a single string and store it as XML in the variable @x, and finally shred the XML into a table format. This is a bit of a hack – sys.dm_os_volume_stats is a cleaner method, but it doesn’t give you data for all drives. As a bonus, you get more detailed data about those drives: file system, allocation unit size and more.
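The technique might be sketched as follows. This is an illustrative simplification of the approach described above, not the full script; it assumes xp_cmdshell is enabled and uses Win32_Volume, which returns capacity and free space but not the more detailed attributes mentioned above.

```sql
DECLARE @output TABLE (ID int IDENTITY(1,1), line nvarchar(max));
DECLARE @cmd nvarchar(4000) =
    N'powershell.exe -NoProfile -Command "Get-CimInstance Win32_Volume | Select-Object Name, Capacity, FreeSpace | ConvertTo-Xml -As String"';

-- Capture the PowerShell output, one row per line of XML
INSERT INTO @output (line)
EXEC master.sys.xp_cmdshell @cmd;

-- Reassemble the lines into a single XML value.
-- The <?xml ...?> declaration is skipped to avoid an encoding
-- mismatch when casting an nvarchar value to xml.
DECLARE @x xml = (
    SELECT CAST((SELECT line + NCHAR(10)
                 FROM @output
                 WHERE line IS NOT NULL
                   AND line NOT LIKE '<?xml%'
                 ORDER BY ID
                 FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)') AS xml)
);

-- Shred the XML into a table format
SELECT  n.value('(Property[@Name="Name"])[1]',      'nvarchar(255)') AS volume_name,
        n.value('(Property[@Name="Capacity"])[1]',  'bigint')        AS capacity_bytes,
        n.value('(Property[@Name="FreeSpace"])[1]', 'bigint')        AS free_space_bytes
FROM @x.nodes('/Objects/Object') AS t(n);
```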

A similar technique could be used to capture the output of other PowerShell commands in T-SQL. It might not always be a good idea, but I think it’s an interesting technique to demo.

How to handle SQL Agent Jobs with Always On Availability Groups?


When you configure your SQL Server instances for high availability, you want failover to be as seamless as possible. If you use availability groups, mirroring or log shipping as part of your high availability strategy, you will need to take care of anything that lives outside the context of your databases. That might include logins, SQL Server Agent jobs and ensuring consistent server-level configuration across your availability nodes. This article will discuss how to handle SQL Server Agent jobs.

Not so seamless failover of agent jobs

If you have any SQL Agent jobs that run in the context of your user databases, these will need special handling to ensure a seamless failover. Ideally, Microsoft would have added some improvements to the SQL Agent to help with this. Maybe given us agent jobs that can live within the context of a user database instead of msdb?

As it stands you will need to keep your agent jobs in sync between the primary and secondaries and figure out a mechanism to activate the jobs on the new primary following failover and disable them on the old primary.

A Solution

There are a few ways to handle this, but my approach is to run all the jobs in the context of the master database. I then check if the user database is accessible and execute the command using a three-part name. See the script below for an example. This works for availability groups, mirroring and log shipping.
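A minimal sketch of this job-step pattern is shown below. "MyDatabase" and dbo.MyMaintenanceProc are placeholder names; the job step runs in the context of master and only does work when the database is the writable primary on the local node.

```sql
-- Runs in the context of master on every node (primary and secondaries).
-- Only the node hosting the writable primary copy actually does any work.
IF DATABASEPROPERTYEX('MyDatabase', 'Status') = 'ONLINE'
   AND DATABASEPROPERTYEX('MyDatabase', 'Updateability') = 'READ_WRITE'
BEGIN
    -- Database is accessible and writable here: run the work via a 3-part name
    EXEC MyDatabase.dbo.MyMaintenanceProc;
END
ELSE
BEGIN
    PRINT 'MyDatabase is not the writable primary on this server - nothing to do.';
END
```

Because the check runs on every execution, no job needs to be enabled or disabled at failover time.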

With this approach you can create your jobs and have them active on your primary and secondaries. You don’t need to worry about enabling/disabling jobs on failover – the jobs are already running, but they only do any work if the database is primary. This makes the failover totally seamless. The downside to this approach is that the jobs will report that they ran successfully even if they didn’t do any work, which could cause confusion.

The next problem is keeping the jobs in sync between your servers. You can script out the jobs using SSMS easily enough, but this can get tedious if you have a larger number of jobs. I’ve created a PowerShell script below to help with this. You can use it to report whether your jobs are out of sync and also to perform a synchronization. See the comments at the top of the script for some usage examples.

Set Max Server Memory

Set Max Server Server Memory at Startup

Max server memory is one of the SQL Server configuration options that you should change from the default value. Brent Ozar has some good advice on this here. TL;DR: you should leave some extra memory for the OS.

A good starting point is to leave 10% (4GB minimum). If you are running other processes on your SQL Server instances, like SSRS or SSAS, or if you are running multiple instances, you might need to leave a lot more.

You can easily set the max server memory to your desired value in MB using sp_configure or SSMS and simply forget about it:
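For example, on a server with 32GB of RAM you might set max server memory to 28GB. The 28672MB value below is illustrative:

```sql
-- Set max server memory to 28GB (28672 MB) and apply it immediately
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'max server memory (MB)', 28672;
RECONFIGURE;
```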

What if you add or remove memory from your SQL instance? This is a setting you need to remember to change. Gone are the days when you needed to physically open your servers and install new DIMMs. With the advent of cloud and hypervisors, adding memory can be as easy as a few mouse clicks or running a script.

The point of this article is to consider doing this in a script at startup. If you scale up or down to a new instance type in the cloud, your max server memory can automatically be set to an appropriate value by a startup script. It’s one less thing to think about.

Show me the code!

This script will apply the 10%/4GB rule. You could adjust the formula as required.
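A minimal sketch of the 10%/4GB rule: reserve the larger of 10% of physical memory or 4GB for the OS, and set max server memory to the remainder. Physical memory is read from sys.dm_os_sys_memory.

```sql
DECLARE @physMemMB      int,
        @reserveMB      int,
        @maxServerMemMB int;

-- Total physical memory on the server, in MB
SELECT @physMemMB = total_physical_memory_kb / 1024
FROM sys.dm_os_sys_memory;

-- Reserve 10% for the OS, with a 4GB (4096MB) minimum
SET @reserveMB = @physMemMB / 10;
IF @reserveMB < 4096 SET @reserveMB = 4096;
SET @maxServerMemMB = @physMemMB - @reserveMB;

-- Apply the calculated value
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'max server memory (MB)', @maxServerMemMB;
RECONFIGURE;
```

Adjust the formula (the 10% factor or the 4096MB floor) to suit your environment, particularly if SSRS, SSAS or other instances share the server.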

You just need to run this as a SQL Agent job at startup which you can do using the script below:
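To schedule a job to run at Agent startup, you can attach a schedule with freq_type = 64 ("start automatically when SQL Server Agent starts") to the job. The job name below is a placeholder; the job itself would contain the memory-setting script above as a T-SQL step.

```sql
USE msdb;

-- Attach a "run when SQL Server Agent starts" schedule to an existing job
EXEC dbo.sp_add_jobschedule
    @job_name  = N'Set Max Server Memory',   -- placeholder job name
    @name      = N'On Agent startup',
    @enabled   = 1,
    @freq_type = 64;  -- 64 = start when SQL Server Agent starts
```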

Daily DBA Checks

DBA Daily Checks Email Report

Update: This was my original solution to automate the DBA Checks process. I don’t recommend people use this solution as it doesn’t scale well to a larger number of servers, and the CLR function used to get drive space requires the TRUSTWORTHY property to be enabled, which is a potential security risk.

I recommend people use DBA Dash for daily DBA Checks. This gives you a central dashboard so you can see the status of all your SQL instances on a single page with drill downs for more detailed information.


In addition to alerts, I recommend running daily checks to validate that backups are running as expected, servers have sufficient disk space and agent jobs are running successfully. Although alerts provide immediate notification that there is an issue, you can’t assume that everything is OK just because you haven’t received an email notification.

For example, an email alert might get caught by a spam filter or matched by a rule in your email system that moves it to a different folder. It’s also possible there was a problem sending the notification email. If you are not careful with how you configure alerts you can easily get spammed by a large number of alert emails and important notifications could be lost in this flood.

You might not get a failed backup job notification if your backup job never ran – maybe the job got disabled or the SQL Agent wasn’t running. These are some of the reasons I recommend daily checks.

The solution in this post has each SQL Server send a report via email that runs various checks and highlights any issues. It’s a solution I used when I managed a smaller number of servers, but as the number of instances I managed increased it became unwieldy. Depending on the number of servers you manage, you might find it useful.

I now use a different solution that collects data from multiple servers in a central repository. This allows me to have a single dashboard that highlights problems across multiple servers instead of having to check individual reports for each server. I hope to share my new method at some point.

Report Sample

The image below is a sample report that was generated from a test instance of SQL Server. 

DBA Checks HTML Report Sample


  • SQL Server 2005 (SP3)
    There seems to be a bug in earlier builds related to attachments in email messages.  You might get an error similar to this one:
    Exception happened when running extended stored procedure ‘xp_sysmail_format_query’ in the library ‘xpstar90.dll’. SQL Server is terminating process 62. Exception type: Win32 exception; Exception code: 0xc0000005.
    You can fix this by upgrading to the latest service pack or removing the code that adds the error log attachment to the email. Restarting the SQL Server Agent can also provide a temporary fix for this issue, but the problem is likely to recur.
  • SQL Server 2008
  • SQL Server 2008 R2

Installation Instructions

  • Create a database for DBA related tasks if one doesn’t already exist.
  • Install SQLIO_fnGetDrives() CLR function.  See here.
  • Install DBA Checks stored procedures.  (Script below)
  • Make sure you have configured database mail on the SQL Server instance.
  • Create a SQL Server Agent Job to run the dbo.DBAChecks stored procedure.

Version History

  • 03-Apr-2012 – Version 1.2 BETA – Fix to DBAChecks_JobStats stored procedure. Added CDATA section.
  • 14-Dec-2010 – Version 1.1 BETA – Fixed datetime conversion bug in DBAChecks_JobStats procedure.
  • 13-Jul-2009 – Version 1.0 BETA – Beta release.

Running the report

The DBA Checks report is run by executing the dbo.DBAChecks stored procedure. This stored procedure takes a number of parameters, but only one is required:

exec dbo.DBAChecks @recipients=''

The code below shows a call to the DBAChecks stored procedure with all parameters specified:

EXEC dbo.DBAChecks @AgentJobsNumDays=3,
    @UptimeCritical=1440,

A full explanation of these parameters is available here:

  • @AgentJobsNumDays – The number of days SQL Server Agent jobs are reported over.
  • @FileStatsIncludedDatabases – A list of databases (comma-separated) to display file stats for. Default value is NULL (all databases).
  • @FileStatsExcludedDatabases – A list of databases (comma-separated) that are excluded from database file stats. Default value is NULL (no excluded databases).
  • @FileStatsPctUsedWarning – If the percent used space in the database file is larger than this value (but less than the critical threshold) it will be highlighted in yellow.
  • @FileStatsPctUsedCritical – If the percent used space in the database file is larger than this value it will be highlighted in red.
  • @DiffWarningThresholdDays – Highlights differential backups that have not been completed for over “X” number of days.
  • @FullWarningThresholdDays – Highlights full backups that have not been completed for over “X” number of days.
  • @TranWarningThresholdHours – Highlights transaction log backups that have not been completed for over “X” number of hours.
  • @FreeDiskSpacePercentWarningThreshold – Used to highlight disk drives with low disk space in yellow, where the free disk space percent is less than the value specified.
  • @FreeDiskSpacePercentCriticalThreshold – Used to highlight disk drives with low disk space in red, where the free disk space percent is less than the value specified.
  • @UptimeCritical – The uptime in minutes threshold that causes the uptime to be highlighted in red.
  • @UptimeWarning – The uptime in minutes threshold that causes the uptime to be highlighted in yellow.
  • @ErrorLogDays – The number of days’ worth of events included in the attached error log HTML file.
  • @Recipients – The email addresses where the report will be sent.
  • @MailProfile – The mail profile used to send the email. NULL = default profile.

Database Code

  • dbo.DBAChecks – This is the stored procedure you run to generate the email report. It collates the information from the other stored procedures into a single email report. The parameters are described in the previous section.
  • dbo.DBAChecks_Backups – Produces HTML for the “Backups” section of the report.
  • dbo.DBAChecks_DBFiles – Produces HTML for the “Database Files” section of the report.
  • dbo.DBAChecks_DiskDrives – Produces HTML for the “Disk Drives” section of the report.
  • dbo.DBAChecks_ErrorLog – Produces HTML for the “ErrorLog.htm” report attachment. Review and amend the filter applied to the error log as appropriate.
  • dbo.DBAChecks_FailedAgentJobs – Produces HTML for the “Failed Jobs” section of the report.
  • dbo.DBAChecks_JobStats – Produces HTML for the “Agent Job Stats” section of the report.
  • dbo.SQLIO_fnGetDrives() – This CLR function is used by the dbo.DBAChecks_DiskDrives stored procedure to produce the report on disk drives and their available free space. This function can be obtained here.


Daily checks are an important part of any DBA’s routine, giving you confidence that your SQL instances are running problem-free and allowing you to take corrective action when required. Alerts are also important, but they should complement daily checks rather than replace them. If you rely on by-exception reporting alone, your servers will be at risk if something happens to the alert process.

I’m sure there are a wide variety of methods people currently use to perform their daily checks. The process you use is not important as long as it enables you to effectively monitor your SQL Server instances. If you’ve already got a routine that works for you – great!  If not, hopefully this article will provide you with a better solution.


SQL Server CLR IO Utility

Warning: the use of TRUSTWORTHY described in this article can be a potential security risk so you might want to consider alternative options.


Using CLR (.NET Framework) code in SQL Server is a very powerful way of extending the built-in features and functionality. The .NET Framework provides access to a library of useful code, allowing you to accomplish useful tasks with a minimal amount of code. The SQL Server IO project does just that, providing access to some useful IO utilities from within SQL Server by leveraging the power of the .NET Framework.

Installation Instructions

You need to ensure that CLR is enabled for the server and that the TRUSTWORTHY database setting is turned on. The TRUSTWORTHY setting is required to allow the CLR code to have external access.

To enable CLR, run the following code:
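```sql
-- Enable CLR integration at the server level
EXEC sys.sp_configure N'clr enabled', 1;
RECONFIGURE;
```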

To enable the trustworthy setting for the database, run the following code:
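```sql
-- Enable the TRUSTWORTHY setting for the database where the CLR objects
-- will be installed. "YourDatabase" is a placeholder name.
ALTER DATABASE YourDatabase SET TRUSTWORTHY ON;
```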

Warning: the use of TRUSTWORTHY can be a potential security risk.

Next, download the “T-SQL Code” and run it in SQL Server Management Studio to install the CLR functions and stored procedures in the current database.

Note: Consider the security implications before running any of the code in this article.

Code Download

DownloadT-SQL Code

DownloadVisual Studio Source Code

CLR Functions


Returns information about the drives installed on your computer/server.

SELECT * FROM dbo.SQLIO_fnGetDrives()

SQLIO_fnGetDrives function output


Returns a list of files in the specified folder.  The function takes 3 parameters:

@path – The folder from which to start the search
@searchPattern – The search string to match against the names of files in path
@includeSubfolders – Specifies whether the search operation should include all subdirectories or only the current directory.

SQLIO_fnGetFiles function output


Returns a list of subfolders within the specified folder.  The function takes 3 parameters:

@path – The folder from which to start the search
@searchPattern – The search string to match against the names of subfolders in path
@includeSubfolders – Specifies whether the search operation should include all subdirectories or only the current directory.


Takes a file path as a parameter and returns the bit value “1” if the file exists.


Takes a folder path as a parameter and returns the bit value “1” if the folder exists.


Takes the path of a file as a parameter and returns the file data in varbinary format.


Takes the path of a file as a parameter and returns the file data in nvarchar format.

CLR Stored Procedures


Takes the path of a file as a parameter and deletes the specified file.


Takes the path of a folder as a parameter and deletes the specified folder.


Takes the path of a folder, a search pattern and an include-subfolders parameter (similar to the dbo.SQLIO_fnGetFiles function) and deletes all the files matched by the search.


This stored procedure is the same as the dbo.SQLIO_DeleteFiles procedure, but with an additional maxAge date parameter.  Only files older than the maxAge date specified will be deleted.


Takes a path as a parameter and creates the new folder on the file system.


Creates a new file by taking a file path and a varbinary data value.


Creates a new text file by specifying a file path and a string value.


Appends text to the end of a file by specifying a file path and the text to append.


Takes source and destination parameters, moving a file from one location to the other.


This stored procedure takes source and destination parameters, copying a file from one location to the other.


Takes source and destination parameters, copying a folder from one location to the other.

SQL Server Database Restore Utility

This script is a simple utility that allows users to restore SQL Server databases from a backup file. It’s easy to use for non-DBAs, and it provides a method of restoring databases when the management tools are not installed. It’s a good demonstration of an HTA application, but it’s not intended as a replacement for SQL Server Management Studio. You can only use this utility for simple restore scenarios.

Backup/Restore Progress script

Show running time and estimated time to go for SQL backup and restore operations.
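One common way to get this information is from sys.dm_exec_requests, which exposes percent_complete and estimated_completion_time (in milliseconds) for BACKUP and RESTORE commands. A minimal version:

```sql
-- Progress of currently running backup and restore operations
SELECT  r.session_id,
        r.command,
        d.name AS database_name,
        r.percent_complete,
        r.start_time,
        DATEDIFF(SECOND, r.start_time, GETDATE()) AS elapsed_seconds,
        r.estimated_completion_time / 1000        AS est_seconds_remaining
FROM sys.dm_exec_requests AS r
JOIN sys.databases AS d
    ON d.database_id = r.database_id
WHERE r.command LIKE 'BACKUP%'
   OR r.command LIKE 'RESTORE%';
```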

T-SQL Proper Case Function

This function converts a string to proper case.

For example, it converts “a long time ago in a galaxy far, far away”

To this: A Long Time Ago In A Galaxy Far, Far Away

Note: It might be better to do this in the application rather than the database, and it could be better written as a CLR function. Still, this is useful if you need a T-SQL solution for a one-off purpose.
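A minimal T-SQL sketch of such a function is shown below. The function name and the simple word-boundary logic (spaces only) are illustrative; the original may handle additional delimiters.

```sql
CREATE FUNCTION dbo.ProperCase (@input nvarchar(4000))
RETURNS nvarchar(4000)
AS
BEGIN
    DECLARE @output  nvarchar(4000) = LOWER(@input),
            @i       int = 1,
            @newWord bit = 1;

    WHILE @i <= LEN(@output)
    BEGIN
        -- Capitalize the first character of each word
        IF @newWord = 1
            SET @output = STUFF(@output, @i, 1, UPPER(SUBSTRING(@output, @i, 1)));
        -- The next character starts a new word if this one is a space
        SET @newWord = CASE WHEN SUBSTRING(@output, @i, 1) = N' ' THEN 1 ELSE 0 END;
        SET @i += 1;
    END
    RETURN @output;
END
```

Usage:

```sql
SELECT dbo.ProperCase(N'a long time ago in a galaxy far, far away');
-- A Long Time Ago In A Galaxy Far, Far Away
```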