T-SQL Tuesday #154 Invitation – SQL Server 2022

Posted on September 13, 2022 by David

Introduction

This is my first time participating in T-SQL Tuesday. The invitation from Glenn Berry asks us to write about what we’ve been doing with SQL 2022.

Getting DBA Dash ready for SQL 2022

I was eager to test DBA Dash with SQL 2022 and started testing with the first public release (CTP 2.0). I added a SQL 2022 instance to my lab environment, created using AutomatedLab.

Unlike my other lab machines, I ran through the installer for 2022 manually to get up and running quickly and test the new installer.

DBA Dash only needed some minor modifications to work with SQL 2022. The first and most important step was to add version 16 to the hardcoded list of supported versions and also update some static data for SQL Server 2022. I hardcode the list of supported versions to give me the opportunity to test new versions first and fix any issues.

This initial support for SQL 2022 was added in May. I’ve been running DBA Dash in my lab environment since then and it works well.

DBA Dash – support for new features in SQL 2022

QAT_DEFLATE Backup Compression

Version 2.22.0 adds support for capturing the compression algorithm used on SQL 2022.

Glenn Berry has some good articles on this new backup compression algorithm here and here. I’m getting similar results to Glenn with my own testing using QAT_DEFLATE (software mode) – faster backups, higher compression and lower resource utilization.
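
As a rough illustration, the sketch below takes a backup with the new algorithm and then reads the algorithm back from the backup history in msdb. The database name and backup path are placeholders. It also assumes that QAT (hardware or software mode) has already been enabled on the instance, as described in the SQL Server 2022 documentation.

```sql
-- Hypothetical database name and path; adjust for your environment.
-- Assumes integrated acceleration (QAT) is already enabled on the instance.
BACKUP DATABASE [MyDatabase]
TO DISK = N'D:\Backups\MyDatabase_QAT.bak'
WITH COMPRESSION (ALGORITHM = QAT_DEFLATE), STATS = 10;

-- On SQL 2022 the backup history records which algorithm was used.
SELECT TOP (10)
       database_name,
       backup_finish_date,
       backup_size,
       compressed_backup_size,
       compression_algorithm
FROM msdb.dbo.backupset
ORDER BY backup_finish_date DESC;
```

Comparing backup_size with compressed_backup_size for backups taken with different algorithms is a simple way to repeat this kind of test on your own data.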

sys.databases

SQL 2022 adds some new columns to sys.databases:

is_data_retention_on
is_ledger_on
is_change_feed_enabled

I’m planning to add these to the DBA Dash collection. Only is_ledger_on is currently documented.
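
A minimal query to look at the new columns might look like this. It only works on SQL 2022 or later, because the columns don't exist on earlier versions.

```sql
-- The three columns below are new in SQL 2022;
-- this query fails with an invalid column error on older versions.
SELECT name,
       is_data_retention_on,
       is_ledger_on,
       is_change_feed_enabled
FROM sys.databases
ORDER BY name;
```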

Other

On the to-do list is to test DBA Dash with contained availability groups. DBA Dash has monitoring for agent jobs and availability groups so it’s likely some changes will be required in this area.

What I’m excited about in SQL 2022

The intelligent query processing features in SQL 2022 and the tempdb concurrency enhancements are very interesting. There are also some really useful language enhancements like GENERATE_SERIES, DATE_BUCKET, GREATEST and LEAST that I can see myself using. Null handling also gets easier with IS [NOT] DISTINCT FROM. I can’t use any of this for DBA Dash as I want to keep the repository compatible with SQL 2016 for now. The language enhancements will be useful for some of the SaaS databases I manage, though, once we upgrade.
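
For anyone who hasn't tried them yet, here is a short sketch of the language enhancements mentioned above. The literal values are arbitrary examples. The sketch assumes a SQL 2022 instance and, for GENERATE_SERIES, a database at compatibility level 160.

```sql
-- Generate the integers 1 to 5 without a numbers table.
SELECT value FROM GENERATE_SERIES(1, 5);

-- Round a timestamp down to the start of its 15-minute bucket.
SELECT DATE_BUCKET(MINUTE, 15, CAST('2022-09-13T10:37:00' AS datetime2(0))) AS bucket_start;

-- Largest and smallest of a list of values.
SELECT GREATEST(1, 5, 3) AS largest,
       LEAST(1, 5, 3)    AS smallest;

-- NULL-safe comparison: IS NOT DISTINCT FROM treats two NULLs as equal,
-- whereas a plain = comparison evaluates to UNKNOWN.
DECLARE @a int = NULL, @b int = NULL;
SELECT CASE WHEN @a IS NOT DISTINCT FROM @b THEN 'same' ELSE 'different' END AS null_safe_compare,
       CASE WHEN @a = @b THEN 'same' ELSE 'different' END                  AS plain_equals;
```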

The ability to fail over back and forth between Azure SQL Managed Instance and SQL 2022 is also a game changer. It certainly takes some of the risk out of the process if you are considering using managed instances.

The feature I’m most excited about? Backup/Restore to S3. If you host SQL instances on AWS you have to figure out how to get your backups to S3. There isn’t a standard way to do this. Many people back up to an EBS volume first and then push the files to S3, which isn’t ideal. EBS storage is expensive and the two-step process to get the backups to S3 adds time and complexity. The restore process is also slower and more complex as you need to pull the backups from S3 before you can restore them.

The ability to back up directly to S3 is a huge win. It also has value outside of AWS as there are a number of other S3-compatible storage providers.
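
As a sketch of what a direct backup looks like, the example below creates a credential for the bucket and backs up to an s3:// URL. The endpoint, bucket, database name and keys are all placeholders; check the SQL Server 2022 documentation for the exact URL format your storage provider needs.

```sql
-- Placeholders throughout: replace <endpoint>, <port>, <bucket> and the keys.
-- The credential name must match the URL prefix used by the backup.
CREATE CREDENTIAL [s3://<endpoint>:<port>/<bucket>]
WITH IDENTITY = 'S3 Access Key',
     SECRET   = '<AccessKeyID>:<SecretKeyID>';

-- Back up straight to the bucket, with no intermediate EBS volume.
BACKUP DATABASE [MyDatabase]
TO URL = 's3://<endpoint>:<port>/<bucket>/MyDatabase.bak'
WITH COMPRESSION, STATS = 10;

-- Restores read from the same location using RESTORE ... FROM URL.
RESTORE HEADERONLY
FROM URL = 's3://<endpoint>:<port>/<bucket>/MyDatabase.bak';
```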

September Free Community Tools Awareness Month

Posted on September 8, 2022 by David

I’m writing this post to join in with September Free Community Tools Awareness Month. There are many awesome free community tools. I’ve chosen to blog about 3 of them that I think are interesting and maybe less well known in the database community. I’m also taking the opportunity to shamelessly plug my own tool, DBA Dash.

DBA Dash

DBA Dash is a monitoring tool for SQL Server. It’s something I use every day and it provides value in several different ways. It supports everything from SQL 2005 to SQL 2022 and Azure DB.

🏎️If you have a performance issue, DBA Dash can help you get to the root cause. You can troubleshoot an issue that is occurring right now or do a postmortem for an issue that occurred weeks ago.

🏥It’s also a tool for health checks. Do you have missing backups, agent job failures, or corruption? Is log shipping, availability groups, or mirroring broken? Are your servers running out of disk space? DBA Dash checks all of these and more. Recently added is a check for identity columns running out of values.

⚙️DBA Dash captures configuration data for your whole SQL Server estate (config settings, trace flags, tempdb). Use the configuration data to check for configuration settings that are different between your SQL instances – great for validating the config of new SQL instances. Check the patch level of your SQL instances. Keep track of when updates were installed and configuration changes made.

I created this tool and I’m also its #1 user. The tool is totally free without restrictions or limitations – even better, it’s open source. Get it on GitHub.

This video will give you a quick overview of the tool.

Note: This video was created in January when the tool first launched. Many new features and enhancements have been made since the video was created.

AutomatedLab

AutomatedLab enables you to quickly set up lab environments on Hyper-V and Azure. I use this to create lab environments for DBA Dash on Hyper-V.

The lab environments enable me to test DBA Dash with SQL versions from 2005 to 2022 as well as availability groups, and different configurations. I also use the lab environment as a general playground for learning and experimentation – something useful to have for any DBA.

The power of AutomatedLab is that I can re-create my lab by running a PowerShell script and I can tear it down just as easily. Infrastructure as code!

dbatools is also a key component to setting up my lab environment and it’s one of my favourite community tools. Use this for automating anything relating to SQL Server.

Pssdiag/SqlNexus

This is a monitoring tool for SQL Server, but unlike DBA Dash it’s not a tool for 24/7 monitoring. I don’t use this tool very often but it can be useful if you are dealing with a particularly tricky SQL issue.

I’ve been a DBA for 17 years and I’ve had a small number of edge case issues that required a phone call to Microsoft product support. During these engagements I’ve been asked to run a pssdiag session and upload it for their offline analysis. You can create your own pssdiag session with SqlDiag Manager and analyse it with SqlNexus.

I would run this tool for short periods of time while you are experiencing particularly tricky issues and use other tools like DBA Dash for your regular monitoring.

This is a Microsoft tool, but like the other tools on the list it’s also open source!

Should I Store Files in the Database?

Posted on March 5, 2022 by David

SQL Server can store binary large objects such as files inside the database. Why might you want to store files inside the database? What problems might it cause and what alternative options should you be considering?

Background

I have experience managing several large SQL databases consisting of up to 160TB of blob data – using in-database BLOBs as well as FILESTREAM. If you think you might need to store hundreds of GB or TBs of BLOB data you should think very carefully about where you place that data – your SQL Server database might not be a good solution! It can be made to work, and I’ll provide some tips in this article, but ultimately we ended up migrating our BLOBs out of SQL. In this article, I’ll explain why you might want to consider alternative options and how to manage huge volumes of blob data in SQL.

Why store in the DB?

FullText indexing.
This is the most compelling reason to store your blobs in the database in my opinion. There are alternative options to consider like Elastic Search which I actually think is better and more flexible than FullText search in SQL Server. Fulltext search is seamless though and easy to implement.
Transactional consistency
I’ve seen this argument used as a reason for storing blobs in the DB, but I think in most cases you can engineer what you require with the blobs stored externally. For example, you could commit the blob to an appropriate store first and then add the associated metadata to the DB, ensuring you don’t have any metadata without an associated BLOB. A process could be added to clean up orphaned BLOB data. Alternatively, you could use a queue to ensure reliable processing of both the blob and the metadata.
Note: You need to use explicit transactions to get transactional consistency in the database. Large BLOBS could create long-running transactions which result in blocking and poor concurrency. Doing the BLOB operation as the first part of the transaction is probably the best option if you need this.
Easy backup/restore
Backup/Restore is probably a compelling reason NOT to store your files in the DB (discussed later), but there is a simplicity in having all your ‘content’ data included in a database backup file. Also, your DB and blobs will be transactionally consistent if you need to roll back to an earlier time – ensuring no orphaned blobs or metadata (provided you used explicit transactions).
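
To make the point about explicit transactions concrete, here is a minimal sketch that writes the BLOB first and the metadata second inside one transaction. The temp tables and column names are invented for the example and don't come from any real schema.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE #DocumentBlob (
    DocumentID int IDENTITY(1,1) PRIMARY KEY,
    Content    varbinary(max) NOT NULL
);
CREATE TABLE #DocumentMeta (
    DocumentID   int PRIMARY KEY,
    DocumentName nvarchar(260) NOT NULL,
    UploadedAt   datetime2(0) NOT NULL
);

SET XACT_ABORT ON;  -- any error rolls back the whole transaction

BEGIN TRANSACTION;

    -- Do the (potentially slow) BLOB insert first, before touching the metadata table.
    INSERT INTO #DocumentBlob (Content)
    VALUES (0x255044462D312E37);  -- a few sample bytes standing in for a file

    DECLARE @DocumentID int = SCOPE_IDENTITY();

    -- Then add the metadata that points at the BLOB.
    INSERT INTO #DocumentMeta (DocumentID, DocumentName, UploadedAt)
    VALUES (@DocumentID, N'example.pdf', SYSUTCDATETIME());

COMMIT TRANSACTION;
```

Either both rows are committed or neither is, so you don't end up with metadata that points at a missing BLOB.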

The Problems

DB size!
Your databases can potentially grow to enormous sizes. Most of the negatives below are related to this. If you don’t expect to store very large volumes of BLOB data some of these issues might not apply. Evaluate the information in this article and make an informed decision for your use case.
Storage costs
The database is probably the most expensive place to store your BLOB data. If you are hosting in the cloud you might find that there are options that are significantly cheaper and have better durability (e.g. S3, Azure blob).
Backups
Your database backups are going to be large! Also, because they are large they are going to be SLOW and EXPENSIVE! Blobs are also inefficient to back up in the database.
Restores
If backups are slow as a result of BLOB data, also think about restores. DR/HA also becomes critical as you move into VLDB territory.
DR/HA
This becomes more necessary and also more expensive as a result of storing BLOB data in your DB. Your DR copies also need to include the blob data. Looking to scale out reads with AlwaysOn AGs? The extra storage cost for the BLOBs will add up.
DBCC checks
Make a coffee, it’s going to take a long time! You might need to think of partitioning your DBCC checks so that they can complete out of hours. DBCC checks are very resource-intensive and blob data pushing your DBs into VLDB territory can create significant challenges here.
Risk of Corruption
Corruption is usually the result of a SQL Server bug, driver, or hardware failure. With larger volumes of data, you could be more at risk.
Performance
SQL Server licensing is expensive and there is some CPU overhead in serving files from your DB. Also, if you are not using FILESTREAM the data pages associated with the BLOB data will consume space in the buffer pool – wasting valuable physical memory.
2GB size limit for in database BLOBs.
Removing files doesn’t automatically reclaim any space (in database blobs)

Inefficient Backup

Why are backups of BLOB data inefficient in the database? Upload a new file into your database – how many times is it backed up?

In the next transaction log backup.
In the next diff backup – and the next until a full backup is taken.
Every full backup from now until the end of time or until the blob is deleted, regardless of whether the file has changed.

So how often should it have been backed up? Just Once. You might want multiple copies for redundancy but you don’t need to keep backing up files that don’t change. If you choose to store your files in the database you don’t have that option. Your backups and restores are going to be slow and inefficient. What if disaster strikes and you need to restore from backup? How long will that take and what pain will it cause the business?

What about FILESTREAM?

FILESTREAM is a hybrid approach giving you all the benefits of storing blobs in the DB with none of the downsides, RIGHT? Not quite…

Filestream solves the problem of blobs consuming space in the buffer pool. It solves the 2GB limitation imposed for traditional BLOBs and it also provides a more efficient way of streaming larger BLOB data. Filestream data doesn’t count towards the 10GB SQL Express limitation – which might allow some people to stay on the free edition of SQL Server a while longer.

It solves some problems, but this shouldn’t be considered a best-of-both-worlds solution as most of the issues mentioned still apply to FILESTREAM. For instance, it doesn’t help with DB size, backup/restore, storage costs, DR/HA, DBCC, or potential for corruption. It maybe helps to some extent with performance – but you are still adding extra load to your DB server.

If you are going to store blobs in SQL Server larger than 1MB, FILESTREAM is probably the best option. For smaller blobs, it might be more efficient to store in database. The best option? Skip FILESTREAM and store the blob data completely outside the database.

If not the DB, Where?

If you are running in the cloud you should look towards solutions similar to S3 or Azure blob. For example S3 is cheaper and has better durability than EBS volumes. Also, with cloud blob storage you don’t have to worry about expanding drives or running out of space (just paying the cloud provider bill).

If you need fulltext indexing, there are alternative solutions like ElasticSearch or Solr that are worth looking at. Those solutions will require you to extract the RAW text from your BLOB data where this could be done automatically for you in SQL Server with IFilters. Apache Tika might be an option to consider as an alternative. IFilters are maybe still an option but you would have to write the code to extract the text yourself using something like IFilterTextReader.

Note: Fulltext is quite a compelling reason to use SQL Server for blobs. I think there are better alternatives but they all require more effort to develop. For smaller projects where you need fulltext search, storing blobs in the DB is an option to consider. For larger projects that predict high volumes of data I would look at alternative solutions. It’s more effort but it could prove cheaper in the long run with more efficient storage, backups etc.

I still need to store my BLOBs in the DB. Help!

As volumes of data increase, backups, restores and DBCC checks become increasingly difficult. The largest DB I managed grew to 160TB which would require a sustained average backup rate of 970MB/sec to complete a FULL backup within a generous 48hrs window. So how do you cope as volumes of data increase? Throwing money at your hardware vendor might be one solution, but there is another option…
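
The backup rate quoted above is simple arithmetic, and you can plug in your own database size and backup window. This sketch assumes binary units (1TB = 1024 × 1024 MB).

```sql
-- 160TB expressed in MB, divided by the number of seconds in 48 hours.
DECLARE @SizeTB decimal(18,2) = 160,
        @WindowHours int = 48;

SELECT CAST(@SizeTB * 1024 * 1024 / (@WindowHours * 60 * 60) AS decimal(10,1)) AS RequiredMBPerSec;
-- Returns approximately 970.9
```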

Partitioning is the key! Ideally your blob data is immutable (INSERT only). This is something you can engineer but you might need to consider things like GDPR. I would recommend partitioning your blob data by date and using separate filegroups per partition. This allows you to mark older filegroups as readonly. You need to create one-off backups of your readonly filegroups and then you can create regular backups of just the writable filegroups. This can seriously cut down on the size of your weekly backups.

Note: Marking filegroups readonly requires exclusive access to your database. This means a very brief outage to mark filegroups readonly, which isn’t ideal. Storing your blobs in a separate DB might be an option to consider to limit the impact of this.
If you need to delete data in your readonly filegroup you will first need to mark it writeable (exclusive access required again). You would then delete as normal and mark the filegroup readonly again. At this point a new FULL backup of the filegroup is required.
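
The readonly filegroup approach can be sketched roughly as follows. The database name, filegroup name and backup paths are made up for the example, and the partitioning itself (one filegroup per date range) is assumed to be in place already.

```sql
-- Hypothetical names: database BlobDB with an older partition on filegroup FG_2021.

-- 1. Mark the older filegroup readonly (needs exclusive access to the database).
ALTER DATABASE [BlobDB] MODIFY FILEGROUP [FG_2021] READ_ONLY;

-- 2. Take a one-off backup of the readonly filegroup and keep it safe.
BACKUP DATABASE [BlobDB]
    FILEGROUP = N'FG_2021'
TO DISK = N'D:\Backups\BlobDB_FG_2021.bak'
WITH COMPRESSION, CHECKSUM;

-- 3. Regular backups then only need to cover the writable filegroups.
BACKUP DATABASE [BlobDB]
    READ_WRITE_FILEGROUPS
TO DISK = N'D:\Backups\BlobDB_ReadWrite.bak'
WITH COMPRESSION, CHECKSUM;
```

If you later mark FG_2021 writable to delete data, step 2 has to be repeated once it is readonly again.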

What if my table isn’t INSERT only?

If you expect the data to be mostly readonly you could handle updates as an insert operation and update any references to point to the new blob. You can keep a version history for your blobs in this way which might be useful for your application. Deletes could also be done as a “soft” delete where you remove the pointers to the blob.

In some cases there might be a legal obligation to physically delete data and a soft delete won’t cut it. Or maybe the quantity of update and delete operations is high so you can’t make the table insert only without adding significantly to your data storage problems. Ideally this is another reason to consider moving the files out of the DB.

Filegroups can still be useful in this scenario. If you have filegroups with older, less volatile data you can take less frequent FULL backups of these filegroups. You will still need to take regular diff backups of these filegroups but the diff backups will be quite small.

What about DBCC?

You can run DBCC checks at the FILEGROUP level which allows you to partition your DBCC checks. You can prioritize running checks for your writable filegroups which are most at risk from corruption.
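
A filegroup-level check might look like the sketch below; the database and filegroup names are placeholders. Checks that aren't tied to a single filegroup, such as DBCC CHECKCATALOG, would still need to be scheduled separately.

```sql
-- Hypothetical names. Check the writable filegroup most often...
USE [BlobDB];
DBCC CHECKFILEGROUP (N'FG_Current') WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- ...and rotate through the older filegroups on a less frequent schedule.
DBCC CHECKFILEGROUP (N'FG_2021') WITH NO_INFOMSGS, ALL_ERRORMSGS;
```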

What about Restore?

Restores are still going to be slow. Also if you are using FILEGROUP backups your restores have become more complex. If you get this wrong you might be in a position where you are unable to restore your database. The individual filegroup backups represent your database at different points in time – you need to bring these to a transactionally consistent point in time to recover your database. You do this by applying your diff and transaction log backups.

With filegroup backups, you might have some additional options available to bring things online much faster. It’s possible to do an online piecemeal restore where you recover the “active” part of your database and bring it online. You can then restore your older filegroups while users are accessing the database. This is an enterprise edition feature though.
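
In outline, an online piecemeal restore could look like the following sketch. The names and paths are placeholders, and a real restore sequence depends on your recovery model and which diff and log backups you have, so practise this in a lab before relying on it.

```sql
-- Hypothetical names and paths throughout.

-- 1. Restore the primary and read-write filegroups first.
RESTORE DATABASE [BlobDB]
    READ_WRITE_FILEGROUPS
FROM DISK = N'D:\Backups\BlobDB_ReadWrite.bak'
WITH PARTIAL, NORECOVERY;

-- 2. Apply log backups and recover; the "active" part of the database is now online.
RESTORE LOG [BlobDB]
FROM DISK = N'D:\Backups\BlobDB_Log.trn'
WITH RECOVERY;

-- 3. Restore the older readonly filegroups while users are connected.
RESTORE DATABASE [BlobDB]
    FILEGROUP = N'FG_2021'
FROM DISK = N'D:\Backups\BlobDB_FG_2021.bak'
WITH RECOVERY;
```

Until step 3 completes, queries that touch data in FG_2021 fail, but the rest of the database is usable.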

More Info

Some of the topics in this article, like partitioning and online piecemeal restores, I’ve only covered briefly. I’d recommend viewing the MCM (Microsoft Certified Master) videos on online piecemeal restore. They are no longer available on Microsoft’s website, but SQL Skills has them here.

The big shrink of 2019:

DBA Dash captures the moment when blob data was removed from one of our databases back in 2019. The files were moved to cloud blob storage and the associated table, files and filegroups were simply dropped from the database. No actual shrinking required which would have been a slow and painful process!

DBA Dash Overhead

Posted on March 2, 2022 by David

Introduction

All monitoring tools have some overhead associated with them but generally, the overhead is low compared to the value they deliver. How much load does DBA Dash add to a server?

To find out, I set up an extended events session to capture the queries run by the DBA Dash agent. I left the trace running for over 8hrs (8:23:33, ~504mins). This was done on a real production server – one of our busiest servers that supports a SaaS application. This was done during peak load. The results are valid for this server running version 2.13.1 of DBA Dash.

TLDR: The overhead of running DBA Dash is very small and it delivers great value!

Results

Query	Samples	Avg Duration (ms)	Avg CPU (ms)	Avg Reads	Max Duration (ms)	Max CPU (ms)
Instance	1520	0	0	12	2	16
RunningQueries (changed to capture every 30sec instead of the default 1min on this server; session waits are collected)	1008	92	69	50	500	157
Plan Collection (off by default but enabled on this server; only runs if there are plans to collect)	942	30	24	2036	236	141
Text Collection (collects text for running queries; only runs if there is text to be collected)	818	12	9	232	38	32
SlowQueries (disabled by default; the query has a 1 second WAITFOR, so look at CPU time)	504	1467	295	3521	1793	500
PerformanceCounters	504	41	28	551	140	94
CPU	504	29	25	32	94	63
ObjectExecutionStats	504	17	14	1006	191	47
Waits	504	3	3	0	28	16
AvailabilityReplicas	504	11	2	39	52	16
DatabasesHADR	504	2	2	362	4	16
AvailabilityGroups	504	21	2	34	121	16
MemoryUsage	504	1	1	31	3	16
IOStats	504	0	0	0	2	16
JobHistory	504	0	0	12	60	31
DBFiles	8	6115	523	12199	16585	797
Backups	8	279	215	12625	510	281
Custom Check (a custom check that is NOT part of DBA Dash and is unique to this server; this is YOUR code, or mine in this case)	8	1053	188	299241	1312	203
Databases	8	139	107	21564	275	141
ServerExtraProperties	8	598	41	1061	1088	62
DBConfig	8	83	37	1843	183	78
LogRestores	8	22	16	247	48	47
LastGoodCheckDB	8	30	16	590	76	47
DBTuningOptions	8	18	14	300	31	31
Corruption	8	23	10	706	47	32
SysConfig	8	3	8	308	8	16
ServerProperties	8	2	2	1	5	15
TraceFlags	8	0	0	0	0	0
DatabaseMirroring	8	1	0	43	2	0
OSInfo	8	3	0	146	6	0
Alerts	8	8	0	206	13	0
Max date modified from sysjobs	8	0	0	30	1	0

The queries with 504 samples are run every 1min. RunningQueries was sampled 1008 times on this server as it was changed to collect every 30 seconds instead of every 1min.

Of these frequently run query captures, ironically it’s the SlowQueries capture query that is the slowest. The query has a 1-second delay built in while it starts a new extended event session before flushing the ring_buffer of the old one. It’s better to look at the average CPU time of 295ms for this query. It’s still the heaviest of the queries that are collected every 1min.

This is quite a busy production server and SlowQueries captures all queries taking longer than 1 second to run. This collection isn’t enabled by default and you can configure the threshold for collection as required.

The next slowest of the frequently executed queries is the RunningQueries capture with a 92ms average execution time. I’ve previously compared this to sp_WhoIsActive. In a lab environment with 12 active queries the average duration was only 15ms, so results will vary depending on your workload. The Plan Collection and Text Collection also form part of the RunningQueries capture, with plan collection being optional.

Of the hourly collections, it’s the DBFiles collection that is the slowest. The overhead of this query will likely be different on your server depending on the databases and files on your SQL instance.

Other Caveats

The Slow Query capture doesn’t take into account the overhead of the associated extended event.
The repository database is on a different server. If you host both on the same server you will have additional overhead.
The DBA Dash agent is run on a different server. If you run the agent on your monitored SQL instance this will also add additional overhead.
The results above are valid for this server with version 2.13.1 of DBA Dash.
See here for a list of what DBA Dash captures and when. The analysis above doesn’t include collections that are run daily as they didn’t fall into the 8hr window of this test.
You have full control over the collection schedule. Disable some collections or change the frequency as required!

How to test

I’ve posted the results above to give an indication of the overhead of running DBA Dash. If you want to monitor the activity on your own server, this extended events session will get you started:

CREATE EVENT SESSION [DBADashAgent] ON SERVER 
ADD EVENT sqlserver.rpc_completed(
    ACTION(sqlserver.client_app_name)
    WHERE ([sqlserver].[like_i_sql_unicode_string]([sqlserver].[client_app_name],N'DBADash%') AND [object_name]<>N'sp_reset_connection')),
ADD EVENT sqlserver.sql_batch_completed(
    ACTION(sqlserver.client_app_name)
    WHERE ([sqlserver].[like_i_sql_unicode_string]([sqlserver].[client_app_name],N'DBADash%')))
ADD TARGET package0.event_file(SET filename=N'DBADashAgent')
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO
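
Once the session exists, it needs to be started, and the captured events need to be loaded into a table before they can be grouped. The sketch below is one possible way to do that and isn't necessarily how the results above were produced. It shreds the event file into dbo._DBADashTrace using sys.fn_xe_file_target_read_file. The file pattern assumes the target writes to the default log directory, so you may need a full path on your server.

```sql
-- Start the capture, leave it running for as long as you need...
ALTER EVENT SESSION [DBADashAgent] ON SERVER STATE = START;
GO

-- ...then stop it before loading the results.
ALTER EVENT SESSION [DBADashAgent] ON SERVER STATE = STOP;
GO

-- Shred the event XML into a table (assumed layout, adjust columns as required).
SELECT x.event_xml.value('(event/@name)[1]', 'nvarchar(128)')                          AS event_name,
       x.event_xml.value('(event/@timestamp)[1]', 'datetime2(3)')                      AS event_time,
       x.event_xml.value('(event/data[@name="duration"]/value)[1]', 'bigint')          AS duration,
       x.event_xml.value('(event/data[@name="cpu_time"]/value)[1]', 'bigint')          AS cpu_time,
       x.event_xml.value('(event/data[@name="logical_reads"]/value)[1]', 'bigint')     AS logical_reads,
       x.event_xml.value('(event/data[@name="batch_text"]/value)[1]', 'nvarchar(max)') AS batch_text,
       x.event_xml.value('(event/data[@name="statement"]/value)[1]', 'nvarchar(max)')  AS statement
INTO dbo._DBADashTrace
FROM sys.fn_xe_file_target_read_file(N'DBADashAgent*.xel', NULL, NULL, NULL) AS f
CROSS APPLY (SELECT CAST(f.event_data AS xml) AS event_xml) AS x;
```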

I saved the results to a table and ran a quick and dirty query to group the results.

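-- Group the captured events by query and summarise duration, CPU and reads.
-- duration and cpu_time are captured in microseconds, so divide by 1000 for ms.
SELECT grp,
        COUNT(*) Samples,
        AVG(duration/1000) AvgDurationMs,
        AVG(cpu_time/1000) AvgCPU,
        AVG(logical_reads) AvgReads,
        MAX(duration/1000) MaxDurationMs,
        MAX(cpu_time/1000) MaxCPUMs
FROM dbo._DBADashTrace
-- sql_batch_completed populates batch_text; rpc_completed populates statement.
OUTER APPLY(SELECT ISNULL(batch_text,statement) AS txt) calc1
-- Collapse queries whose text varies between executions into a single named group.
OUTER APPLY(SELECT CASE WHEN txt LIKE '%@plan%' THEN '{Plan Collection}'
                        WHEN txt LIKE '%@handles%' THEN '{Text Collection}'
                        WHEN txt LIKE '%sysjobhistory%' THEN '{JobHistory}'
                        ELSE txt END AS grp) calc2
GROUP BY grp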
