Here’s a little bit about our procedure
The procedure is named: PopularNames. When this procedure, is run and compiled with Threshold=null, it gets a “slow plan”. I’m going to just show you the estimated plan right now. I did CTRL+L to bring up the estimated plan. Here is what this slow plan looks like: this query uses windowing functions in it, and if we go all the way over to the right here it scans.
Notice that this is a long plan. I’m scrolling a lot, right? It does an index scan on a index agg.FirstNameByYourState, and then it does some parallelism work. Then it does a bunch of these segment and window spool operators, breaking the data into chunks and calculating data over windows.
If I look at this window spool the slow plan is all in row mode. This is using row by row operations and it takes a while for this guy to run. We’ll see some information about that in Query Store.
However, what if this plan compiles with a different parameter?
That first one was Threshold=null. If we compile this procedure with, Threshold=50,000, for example, something that isn’t null.
I did CTRL+L again to look at that estimated plan.
Hey, did you noticed I didn’t have to keep scrolling nearly as much? This plan is much smaller! It’s fit all of the work into fewer operators. It’s still doing a index scan, but notice that we don’t have all those segment operators, we don’t have all the windows spool operators.
Instead we have this fancy new window aggregate operator and this window aggregate operator is screaming fast. It’s a new batch mode operator in SQL Server 2017. It doesn’t do everything row by row. It works on larger chunks of data, and it’s just smart too.
We don’t have to have all those segment operators and all that, it’s not flowing individually in as many parts. This is a really cool new feature. But SQL Server is just not deciding to use this, depending on what parameter the procedure’s compiled with, because this is not the prettiest procedure. The code in this procedure is, like real world code, it’s a little convoluted and we have an optimization problem.
What I’ve already done is:
- I configured the database, I configured Query Store
- First I ran – I compiled the fast plan. I let it run a while, it didn’t take very long. The fast plan is fast and it only took 1.5 minutes to run it 42 times.
- Then I forced a recompile, and I compiled the ‘slow plan’ that uses null as the parameter, and I ran that more often. It took eight minutes to run it 77 times
Automatic tuning doesn’t kick in right away
It’s looking at averages [and things like standard deviations].
What is the average CPU time when this runs? So, it was slow for a while before automatic plan correction said, “hey, okay it’s looking like you are slower and I’m going to go ahead and apply that last plan.”
How can I tell if it’s been tuned?
Our new buddy, if we’re running Enterprise Edition, or I’m running Developer Edition, so you can easily see this in a test environment without having to pay for Enterprise Edition.
I am now going to look at: sys.dm_db_tuning_recommendations…
This is where it has the suggestions it makes about, ‘hey I think this plan would be faster’, as well as what the status of that plan is. Now that I’ve queried tuning_recommendations, I can see that I have a recommendation here.
It is of the type of FORCE_LAST_GOOD_PLAN. Right now that’s all the automatic tuning I have. But in the future I may have more types of automatic tuning, so this will tell me which one it is. And I love the detail it gives me in this reason field. It says, ‘hey, I noticed that the average CPU time was fast and that it got much slower.’ It even gives me the details on that: it has a scoring system which is an internal magic number about how confident it is about this suggestion. It goes up to 100, this one’s a 42. It’s got kind of a new currency of suggestion worth, and I can see information on suggestions.
Now, one little aside here: I’m using ‘At Time Zone’ in my T-SQL
In your production code if you’re looking at large rowsets, behind the scenes ‘At Time Zone’ makes a call out to Windows and asks the operating system for help with time zones. That can be slow against large row sets.
For small rowsets like this it’s really convenient to use, but whenever I use it now I like to make sure that folks know that you might want to test this if you’re using this against large rowsets.
We can see lots of cool things about this, including the state of the suggestion
Right now our suggestion is being verified
This is a really great part of the feature.
Think about it this way: in my particular procedure that I have here, my fast plan really is always fast. Whatever parameter I run it with it is faster than that other plan that has the old window spool operator. But there are going to be some cases where queries have a different situation. Where, okay, we have a historic event where Plan A was really fast for a while. And then it got Plan B that was slower. Notably slower.
But if we force Plan A, maybe it runs with different parameter values than we saw before. And in some of those cases Plan A may not be the best thing. It may get even slower or as slow, right? That’s why we have a verification process to say, “okay, I’ve tried forcing this but let’s see how it really works out. If it isn’t working out great, we may back off from it.”
It does have some of that intelligence built into it.
It also has intelligence around knowing this may not be the best plan forever. If things change, I may want to let go of this and see: are things fast without me forcing this plan?
So this is–
Think of this as a temporary safety net that has some consciousness in it about trying to make sure that it really is helping things, rather than accidentally making things worse
I love that it has that smarts in it.
Let’s also look at the details column. Over here on the right there is another column that has details in it. And this is another, JSON column. It has more details in it about what plan was forced as well as the implementation details. I have a more complex query here just to unpack the JSON there. We can even get from these query recommendations the script that was used.
Because I have auto-tuning enabled, this script was used to force the plan.
If I’m running Enterprise edition or Developer Edition and I haven’t enabled that last good plan forcing, I can still see the suggestion in here. It’ll have a different status. It’ll be like, ‘I didn’t force it because you don’t have Automatic Tuning enabled.’ But I could still see that suggestion and say, ‘oh I want to go figure out okay what’s, query_id= 25?’ And I want to look at plan_id= 5, which it says is the plan that looks better, and then I want to look at the regressed plan. Which is plan 6. I want to figure out why are you sometimes getting plan 5 or plan 6?
Maybe I want to manually force a plan
But maybe I don’t!
Maybe I want to figure out is there a better way to fix it? That can be really, really powerful.
Question: Is this available in 2016, or only 2017?
Query Store is available in 2016. The Automatic Tuning feature is in 2017+.
You can see the reports and a lot of the information manually in 2016, but this Automatic Tuning is 2017, or Azure SQL Database. Yeah.
I love the details we get in here. We even get information about the CPU time of the regressed and the recommended plan.
One note about these recommendations: they are not persisted to disk for you
If you want to get the information even after databases are taken offline, or after restarts, you need to pull the recommendations out and store them elsewhere yourself.
I think they’re really valuable information even if you aren’t using that Automatic Tuning, so I would recommend doing that and saying, ‘I want to see where it’s noticing these plan regressions.’