DBA Dash Running Queries

DBA Dash vs sp_WhoIsActive / spBlitzWho

Introduction

sp_WhoIsActive and sp_BlitzWho are popular stored procedures created by the database community. They both capture queries currently running on your server. If you are a seasoned DBA it’s likely you have come across one or both of these tools. sp_WhoIsActive in particular has been around for a very long time, deployed to many, many thousands of servers. It’s helped me solve countless performance issues. Shout out to Adam Machanic who created the tool – Thank you!

The Brent Ozar team created sp_BlitzWho due to licensing issues distributing sp_WhoIsActive. The licensing issues are no longer a concern as both tools are now open source and can be freely distributed. sp_BlitzWho still has a more permissive license but the difference isn’t likely to matter to most people. Both work great. Each has unique selling points. Include them on all your SQL Server deployments! The excellent dbatools can help with that.

DBA Dash isn’t a stored procedure for showing active queries – it’s a full monitoring tool with a Windows app front end. So this is a bit of a weird comparison. It does capture running queries though which is where we can draw some comparisons.

Re-inventing the Wheel

While developing DBA Dash I did consider just embedding sp_BlitzWho or sp_WhoIsActive rather than re-invent the wheel. Licensing was maybe a concern for sp_WhoIsActive but I had other reasons for creating my own solution. I wanted a solution designed specifically for regular collection. sp_BlitzWho and sp_WhoIsActive can both log to a table. But they were designed for interactive use in SSMS.

Performance

The goal for DBA Dash was to limit the overhead of query collection. Also, I did not need or want the data to be formatted for human consumption on collection. The numbers below show a performance comparison. I’m using the defaults for each and averaging over 10 executions. 12 active queries were running for this test.

ToolAvg CPU (ms)Avg Duration (ms)
sp_WhoIsActive159166
sp_BlitzWho231241
DBA Dash (with session waits)615
Performance Tests

Note: I don’t worry about the overhead of running any of these tools interactively. Also, if you want to log the output to a table at regular intervals it’s not likely to hurt the performance of your server. Still, for regular monitoring, it’s good to keep overhead as low as possible.

If you want to test this for yourself, the capture query for DBA Dash is here. Confession time – this isn’t a fair fight. I’m not collecting query text or query plans. sp_BlizWho collects query plans by default so it’s at a disadvantage – the tools are not all doing the same thing. Also, the performance differences between these tools will vary based on several other factors.

So how does DBA Dash show query text?

  • Takes a distinct list of sql_handles from the running query capture
  • Collect text associated with those handles (if required).
  • Cache the handles.
  • Ignore the handles we’ve already captured on subsequent collections.

Query plan capture features a similar optimization. Also, DBA Dash doesn’t need to do anything with the formatting of the data for human consumption – at least not on collection.

Query text and query plans are stored in separate tables. Not duplicated by row. DBA Dash can also take care of data retention – clearing out old data efficiently with some partition switching.

This isn’t just about efficient data collection. DBA Dash has a custom GUI and isn’t bound by the same constraints as sp_WhoIsActive and sp_BlitzWho. This allows us to do some interesting things.

Summary Data

At the top level, we see the summary of the last running queries snapshot for each instance. In this case, we can see some serious blocking going on with instance DASH2019.

From there we can drill down to the DASH2019 instance and see a list of recent snapshots. We can also go back in time to any custom date range and see a list of the snapshots collected.

Use the summary data above to determine which snapshots are most interesting. Then drill down to the individual snapshots themselves:

Blocking

In the screenshot above I’ve zoomed into the blocking section of the report. DBA Dash highlights the sessions blocked in the “Blocking Session ID” column. In some cases, there will be long blocking chains. This can make it difficult to work out which is the root blocker by looking at the blocking session id and traversing the blocking chain manually. With sp_WhoIsActive you can pass @find_block_leaders=1 to make this easier which gives you a blocked session count.

With DBA Dash you can just look at the “Blocking Hierarchy” column. Take session 64. This is blocked by session 79 and we have a hierarchy of “90 \ 79”. Session 79 is blocking the current session 64. Session 90 is blocking session 79 and is our root blocker. The “Root Blocker” column also identifies which sessions are root blockers. You can filter to show just the root blockers at the click of a button.

The “Blocked Count” column shows how many queries are blocked directly by a session. You also have a “Blocked Count Recursive” column that shows you a total count of sessions blocked by the query (directly/indirectly). The counts are also clickable – allowing you to navigate the blocking chain interactively.

So what’s actually going on in this snapshot?

DBA Dash identifies session 90 as the root blocker. Session 90 isn’t blocked or waiting on anything. The CPU time and reads are very small. The status column is “sleeping” and we should pay close attention to this. A sleeping session is dormant – waiting for input from the app. This is a problem if the session has an open transaction – holding locks needed by other sessions. DBA Dash gives us the user, hostname, application name along with the query text that will assist in tracking this down.

The root cause of this issue is an update statement that ran from SSMS. It had a BEGIN TRAN but didn’t issue a COMMIT or ROLLBACK. This left the transaction open, holding locks that caused the blocking.

Sleeping sessions causing blocking are something to watch out for. It might indicate problems that need to be fixed in the app. e.g. Not using the “using” pattern, doing slow work in the middle of the transaction like calling a web service.

Waits

DBA Dash shows what queries are currently waiting on. It also captures the wait resource and parses the wait resource to make it easier to decipher. Session level waits are also available in DBA Dash (and sp_BlitzWho). DBA Dash extends this by providing a clickable link that will show the session level waits in table format with more detail.

Text

DBA Dash gives you the statement text and the full query text associated with the SQL handle. Just click the links in the grid to see the text formatted in a new window.

Plans

DBA Dash can show you query plans at the click of a button. Configure this using the service configuration tool. The collection process is optimized the same as for query text. Also, you can reduce the cost of collection further with plan collection thresholds.

A query plan tells you exactly how SQL Server was attempting to process the query. It also gives you the parameters that were used to compile the query plan.

Batch Completed / RPC Completed

Having the batch text and statement is great. For a typical RPC call, we can see the current statement along with the text of the associated stored procedure (or batch). What’s missing are the parameter values passed in from the client application. Unfortunately, we can’t get this by querying the DMVs. DBA Dash has a trick up its sleeve though. We can enable the capture of slow queries which creates an extended event session to capture queries that take longer than 1 second to run (configurable).

What’s cool is that you can click a session ID in running queries and see the associated RPC/batch completed event. This gives you the metrics associated with the completed execution and also the query text with the parameter values. Having the parameter values is useful when trying to reproduce and diagnose performance issues.

Note: The associated RPC/batch completed event is only available if the query has completed. It also needs to meet the threshold for collection. DBA Dash collects this data every 1min by default which adds additional delay. For the most recent snapshot, you might need to wait before the RPC/batch completed event is available.

Full Duplex

The link between running queries and RPC/batch completed is two-way. The Slow Queries tab shows you the RPC/batch completed events captured from the extended events session. From here you can click the session id and see all the running queries snapshots captured for that session while the query was running. Having access to the running queries snapshots can help answer questions about why a query was slow.
e.g. Was it blocked?
What was it waiting for?
What was the query plan used?
What statement was running?

Note: For queries with shorter execution times you might not have any data as running queries is captured every 1min by default.

Grouping and Export

You can group the running query capture in DBA Dash with the click of a button. For example, you might want to count the number of running queries by application, query hash, hostname….lots of options here. From there you can drill down to see the queries associated with your selected group value.

Exporting the data to Excel is also just a click away.

Not a GUI person?

The captured running query data is available in the DBA Dash repository database for you to query manually if required. I’d recommend using the dbo.RunningQueriesInfo view. The application calls dbo.RunningQueries_Get to return a specific snapshot (which uses the view).

Wrap up

Running queries capture is just a small (but interesting) part of what DBA Dash can do. There is a lot more that the tool can do – pair DBA Dash with query store and you have a lot of bases covered.

sp_WhoIsActive and sp_BlitzWho still have a place in my toolbox. I use them often. They are great tools and DBA Dash doesn’t replace them. I’m, running these tools interactively where DBA Dash is capturing valuable performance data round the clock for both current and retrospective analysis.

There is a lot of value in running a quick command in SSMS and having the results available immediately – in the same tool. With DBA Dash you have to wait for the next scheduled collection which is every 1min by default. Also, not everyone will want a full monitoring tool just to capture running queries. sp_WhoIsActive and sp_BitzWho both have their unique selling points and DBA Dash doesn’t replicate all their features and functionality.

If you haven’t tried DBA Dash, it’s a free monitoring tool for SQL Server that I created. My employer Trimble has allowed me to share the tool open source and it’s available on GitHub here. sp_WhoIsActive and sp_BlitzWho are also open source. Give them a try today.

Daily DBA Checks

DBA Daily Checks Email Report

Update: This was my original solution to automate the DBA Checks process. I don’t recommend people use this solution as it doesn’t scale well to a larger number of servers and the CLR function used to get drive space requires the TRUSTWORTHY property to be enabled which can be a potential security risk.

I recommend people use DBA Dash for daily DBA Checks. This gives you a central dashboard so you can see the status of all your SQL instances on a single page with drill downs for more detailed information.

Introduction

In addition to alerts I recommend running daily checks to validate that backups are running as expected, servers have sufficient disk space and agent jobs are running successfully.  Although alerts provide immediate notification that there is an issue you can’t assume that everything is OK because you haven’t received an email notification.

For example, an email alert might get caught by a spam filter or matched by a rule in your email system that moves it to a different folder. It’s also possible there was a problem sending the notification email. If you are not careful with how you configure alerts you can easily get spammed by a large number of alert emails and important notifications could be lost in this flood.

You might not get a failed backup job notification if your backup job never ran – maybe the job got disabled or the SQL Agent wasn’t running. These are some of the reasons I recommend daily checks.

The solution in this post has each SQL Server send a report via email that runs various checks and highlights any issues. It’s a solution I used when I managed a smaller number of servers but as the number of instances I managed increased it became unwieldy. Depending on the number of serves you manage you might find it useful.

I now use a different solution that collects data from multiple servers in a central repository. This allows me to have a single dashboard that highlights problems across multiple servers instead of having to check individual reports for each server. I hope to share my new method at some point.

Report Sample

The image below is a sample report that was generated from a test instance of SQL Server. 

DBA Checks HTML Report Sample

Compatibility

  • SQL Server 2005 (SP3)
    There seems to be a bug in earlier builds related to attachments in email messages.  You might get an error similar to this one:
    Exception happened when running extended stored procedure ‘xp_sysmail_format_query’ in the library ‘xpstar90.dll’. SQL Server is terminating process 62. Exception type: Win32 exception; Exception code: 0xc0000005.
    You can fix this by upgrading to the latest service pack or removing the code that adds the error log attachment to the email.  Restarting the SQL Server Agent can also provide a temporary fix for this issue, but the problem is likely to re-occur.
  • SQL Server 2008
  • SQL Server 2008 R2

Installation Instructions

  • Create a database for DBA related tasks if one doesn’t already exist.
  • Install SQLIO_fnGetDrives() CLR function.  See here.
  • Install DBA Checks stored procedures.  (Script below)
  • Make sure you have configured database mail on the SQL Server instance.
  • Create a SQL Server Agent Job to run the dbo.DBAChecks stored procedure.

Version History

DateVersionNotes
03-Apr-2012Version 1.2 BETAFix to DBAChecks_JobStats stored procedure.  Added CDATA section.
14-Dec-2010Version 1.1 BETAFixed datetime conversion bug in DBAChecks_JobStats procedure.
13-Jul-2009Version 1.0 BETABeta release

Running the report

The DBA Checks report is run by executing the dbo.DBAChecks stored procedure.  This stored procedure takes a number of parameters, but only one is required:exec dbo.DBAChecks @recipients=’dummyemail@wisesoft.co.uk’

The code below shows a call to the DBAChecks stored procedure with all parameters specified:EXEC dbo.DBAChecks @AgentJobsNumDays=3,
@FileStatsIncludedDatabases=NULL,
@FileStatsExcludedDatabases=NULL,
@FileStatsPctUsedWarning=90,
@FileStatsPctUsedCritical=95,
@DiffWarningThresholdDays=3,
@FullWarningThresholdDays=7,
@TranWarningThresholdHours=4,
@FreeDiskSpacePercentWarningThreshold=15,
@FreeDiskSpacePercentCriticalThreshold=10,
@UptimeCritical=1440 ,
@UptimeWarning=2880,
@ErrorLogDays=3,
@Recipients=’dummyemail@wisesoft.co.uk’,
@MailProfile=NULL

A full explanation of these parameters is available here:

 @AgentJobsNumDaysThe number of days SQL SQL Server jobs are reported over.
 @FileStatsIncludedDatabasesA list of databases (comma-separated) to display file stats for.  Default value is NULL (All databases).
 @FileStatsExcludedDatabasesA list of databases (comma-separated) that are excluded from database file stats.  Default values is NULL (No excluded databases)
 @FileStatsPctUsedWarningIf the percent used space in the database file is larger than this value (but less than critical threshold) it will be highlighted in yellow.
 @FileStatsPctUsedCriticalIf the percent used space in the database file is larger than this value it will be highlighted in red.
 @DiffWarningThresholdDaysHighlights differential backups that have not been completed for over “X” number of days
 @FullWarningThresholdDaysHighlights full backups that have not been completed for over “X” number of days
 @TranWarningThresholdHoursHighlights transaction log backups that have not been completed for over “X” number of hours.
 @FreeDiskSpacePercentWarningThresholdUsed to highlight disk drives with low disk space in yellow, where the free disk space percent is less than the value specified.
 @FreeDiskSpacePercentCriticalThresholdUsed to highlight disk drives with low disk space in red, where the free disk space percent is less than the value specified.
 @UptimeCriticalThe uptime in minutes threshold that causes the uptime to be highlighted in red.
 @UptimeWarningThe uptime in minutes threshold that causes the uptime to be highlighted in yellow.
 @ErrorLogDaysThe number of days worth of events included in the attached error log html file.
 @RecipientsThe email addresses where the report will be sent.
 @MailProfileThe mail profile used to send the email.  NULL = default profile.

Database Code

NameInfo
dbo.DBAChecksThis is the stored procedure you run to generate the email report. The stored procedure collates the information from the other stored procedures into a single email report.   The parameters are described in the previous section. 
dbo.DBAChecks_BackupsProduces HTML for the “Backups” section of the report.
dbo.DBAChecks_DBFilesProduces HTML for the “Database Files” section of the report. 
dbo.DBAChecks_DiskDrivesProduces HTML for the “Disk Drives” section of the report.
dbo.DBAChecks_ErrorLogProduces HTML for the “ErrorLog.htm” report attachment.  Review and ammend the filter applied to the error log as appropriate.
dbo.DBAChecks_FailedAgentJobsProduces HTML for the “Failed Jobs” section of the report.
dbo.DBAChecks_JobStatsProduces HTML for the “Agent Job Stats” section of the report
dbo.SQLIO_fnGetDrives()This CLR function is used by the dbo.DBAChecks_DiskDrives stored procedure to produce the report on disk drives and their available free space.  This function can be obtained here.

Conclusion

Daily checks are an important part of any DBA’s routine, giving you confidence that your SQL instances are running problem free and allowing you to take corrective actions when required.  Alerts are also important, but these should be used to compliment daily checks, rather than to replace them.  If you rely on by exception reporting only, your servers will be at risk if something happens to the alert process. 

I’m sure there are a wide variety of methods people currently use to perform their daily checks. The process you use is not important as long as it enables you to effectively monitor your SQL Server instances. If you’ve already got a routine that works for you – great!  If not, hopefully this article will provide you with a better solution.