Guide to SQL Server Query Tuning

on May 1, 2020

Following on from my Learner’s Guide to SQL Server Performance Triage, I’m tackling Query Tuning. In this guide, I’m experimenting with an outline style rather than expanding each paragraph.

Why do we need query tuning?

We’ve seen incredible improvements over the last 15 years:

Storage speed has massively increased due to advances in both storage technology and network bandwidth, CPUs have become much faster, and prices for memory have dropped dramatically.
Database optimizers are constantly improving and finding ways to make queries more adaptable and to conquer areas of poor optimization.

Yet there are still professionals who make a good living tuning queries, and training others to tune queries. This process involves finding specific slow queries that are key to the performance of an application and making strategic changes, whether in the code, the database structures, the instance configuration, or something else, to ensure that these specific queries consistently execute with a given speed or to a required standard of performance.

Why?

Data sizes are also dramatically increasing
Customer expectations about performance / speed of applications have also risen
Customers expect to not have to wait for results now – they want to see current status right away

Who needs to do query tuning? Who doesn’t?

Query tuning is done by:

Database administrators
Developers who specialize in performance or in databases specifically

Full stack developers don’t generally do query tuning unless they have a specific interest or work experience. This is a specialization rather than a “quick learning” task, so most full-stack developers simply don’t have time, and they need to engage a more specialized person to help. Teams who don’t have a readily available specialist may periodically bring in consultants to help with this.

There are also many database administrators who manage databases where only “basic” availability and performance are required. These databases are used by cost-conscious organizations who don’t need every database to be tuned like a race-car: most of their databases are used by internal users who are used to moderate performance.

What skills are involved in query tuning?

I am not a wizard at TSQL, but I still became pretty good at query tuning. A very unscientific estimate of what skills make someone good at query tuning:

65% - the time, interest, and resources (including a network of people to ask) to build an extensive map of “performance trivia” – this is patterns that don’t optimize well, edge conditions where performance goes bad, understanding of trace flags and configuration on TSQL.
20% - interest and ability to learn how the database engine optimizes and processes queries using indexes and other resources, and what impacts concurrency as multiple queries run against a live database at the same time
15% - understanding of the TSQL language and different ways of rewriting a query to produce a given result set

Popular built-in query tuning tools and how they work together

Execution plans

These are a “map” of how the query is run behind the scenes.

“Estimated” execution plans show the choices the optimizer has made to run the query, including how many rows it estimates will flow through different parts of the plan
“Actual” execution plans are estimated plans updated with runtime statistics for things like how many rows flowed through the plan. If the plan is “adaptive” it will contain some information about which options were chosen.

Wait statistics

Was the query slow because it took a long time to read a lot of data? Or because it couldn’t get a lock resource? Wait statistics help sort this out.
Some wait statistics are included in actual execution plans as of SQL Server 2016 SP1+, but not all waits are tracked per query due to overhead

Query Store

SQL Server 2016+, all editions

This feature tracks execution plans and aggregate runtime metrics (duration, cpu usage) along with aggregate wait stats
This also has the ability to “freeze” plans
Query Store information restores with the database itself, so it can be shared between environments if desired.

Dynamic management views and performance counters

These help understand overall instance bottlenecks during the slow performance
Example: overall wait statistics for the instance and metrics about storage latency during the time the queries performed poorly can help explain if the query really needs to be tuned
- Query tuning frequently needs to do callbacks to “workload tuning”

SQL Trace and Extended Events Traces

These are tricky to use for query tuning because it’s easy to slow down your workload and cause performance problems when tracing:
- Execution plans (filtering doesn’t help in this case, the plans are all examined / collected and the filter is applied too late)
- Wait statistics (filtering can help here, but the data collected is so massive that you have to be very careful – and sorting through and querying the collected data is also quite cumbersome

Tough problems

Parameter Sniffing

If a query is “sometimes fast, sometimes slow”, this may be the cause
For parameterized queries, SQL Server “sniffs” the values supplied for the parameters on first execution. This plan is reused when other values are provided on subsequent execution unless something happens to cause recompilation.
An example…
- The CustomerOrderDetails procedure is initially compiled for @CompanyId= 1001, a tiny customer with one order. A plan expecting a very small amount of orders is generated, which allocates very little memory for sorts and joins.
- The CustomerOrders procedure is then run for @CompanyId= 128, our biggest customer with a million rows of details about their orders. As the query runs all the estimates are too small, there are inefficient one-by-one reads and memory spills are happening all over the place
- Someone restarts the SQL Server because “it’s slow”. This causes queries to all compile freshly. CustomerOrderDetails runs for the first time with @CompanyId = 128 and compiles a plan suited to many rows, and runs fast. Nobody understands what happened.
  - In this case the “slow plan” would not longer be in the DMVs. It would be stored in Query Store if that was in use.

Contention

It’s difficult to predict how queries will interact with one another in a live workload

Shared resources:
- Query workspace memory – a certain amount of memory needs to be allocated for sorts/ joins/ moving data around in a query. Lots of queries executing at once which need significant workspace memory can cause a problem with this.
  - Sometimes the queries are estimating they need far more of this memory than they need and it will need to be tuned – people are unlikely to realize this outside of a live workload
- The number of queries doing modification and the approach to locking is difficult to predict outside of a live workload
  - Changes in query plans can cause blocking when it wasn’t present before
Changes in server resources – even improvements – can cause blocking when it wasn’t present before
- Example: moving to a new server with more memory and faster CPUs saw an increase in lock waits, because storage waits went down and query execution sped up.

Is testing in production required?

Often, yes. One of the examples of this is parallelism.

Tuning the level of parallelism for a workload, and for specific queries in that workload tends to be quite hardware specific, and you need a live environment.

Workload “replays” are available within SQL Server’s toolkit but they are:

Time consuming to set up
Only replays – you can’t “amp up” the activity meaningfully (deleting the same rows 10 times isn’t the same as deleting different rows 10 times)

Automated query tuning: history and evolution

Automated Plan Correction

SQL Server 2017, Enterprise Edition

Built on top of Query Store
Detects queries that are sometimes fast and sometimes slow
Can recommend changes only if desired
Can freeze plans, test if it helps, and react accordingly
- Freezing is intended as a temporary fix – it’s recommended that a user evaluate the query for tuning as a longer term fix
Very good feature for helping identify parameter sniffing

Intelligent Query Processing

IQP features have been releasing over the last few versions of SQL Server
Several of these features address common query tuning issues in SQL Server
Examples:
- Scalar function inlining: Scalar functions have historically been very poorly optimized in SQL Server, causing slow row-by-row performance
  - This feature “inlines” the functions into the larger execution plans and rewrites the plan accordingly
- Adaptive joins: Adaptive logic is built into plans in some cases so that not only a “single path” for execution is available in the plan
  - This currently applies to limited scenarios
- Batch mode execution:
  - Optimizes how SQL Server handles scans of data by processing multiple values at once
  - This originally was available only for queries which reference columnstore indexes.
  - Batch mode for rowstore indexes introduced in SQL Server 2019