We can force joins, and we can break our query into pieces
You can specify which physical joins SQL Server is allowed to use for your query with table hints or query hints. But be careful: forcing the wrong join type can slow the query down.
Another way of “bossing” the optimizer around is to break our query into pieces using a temp table or a table variable. This forces the optimizer to handle our code in shorter steps. But does it actually make the query faster in this case?
Read the Microsoft KB on Trace Flag 2453. The KB discusses that you can also use RECOMPILE hints to let the optimizer see the row count in table variables – that approach has some downsides, hence the new option of the trace flag. You’ll learn more about the downsides of RECOMPILE in an upcoming module.
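As a rough sketch of the RECOMPILE alternative (the table variable and column names here are hypothetical, not the course’s exact code):

```sql
-- Hypothetical sketch: with OPTION (RECOMPILE), the statement is
-- recompiled at execution time, after @NameRank is populated, so the
-- optimizer can see the table variable's actual row count
SELECT nr.FirstNameId, fnby.StateCode
FROM @NameRank AS nr
JOIN agg.FirstNameByYearStateWide AS fnby
    ON nr.FirstNameId = fnby.FirstNameId
OPTION (RECOMPILE);
```

The trade-off is that you pay compilation cost on every execution – one of the downsides covered later.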
Let’s say that we want to force a specific type of join
Now, in the plans that we’ve looked at so far, hash joins have not appeared in the plans that were fast. But just for experimentation, let’s say we want to see what happens if we force a hash join between a couple of different parts of the plan.
Well you can use join hints in different places. You have options on how you implement these. I’ve already run these two queries, but before we look at the results, let’s look at how I hinted the joins.
On the first query, I specified a join hint as a table hint
I said, for the join to agg.FirstNameByYearStateWide, I want you to use a hash join. I didn’t say anything about the join to ref.FirstName, though. Alternatively, I can specify the join type as a query hint.
In this case, I’ve written OPTION (HASH JOIN). What I’ve done here is say, “Here’s the list of the join types that you can use when implementing all of the joins in the query,” and since I’ve only listed hash join, the only option for every join is a hash join. So let’s take a look at the performance. Well, our first query, using a hash join via the table hint, took 16.5 seconds.
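As a sketch of the two hint styles (the CTE name, join columns, and select list are assumptions, not the exact course code):

```sql
-- 1) Join hint written as a table hint: forces only this one join
SELECT fnby.StateCode, fnby.Gender
FROM NameRank AS nr
INNER HASH JOIN agg.FirstNameByYearStateWide AS fnby
    ON nr.FirstNameId = fnby.FirstNameId
JOIN ref.FirstName AS fn          -- this join is left to the optimizer
    ON nr.FirstNameId = fn.FirstNameId;

-- 2) Join hint written as a query hint: every join in the query
--    must be implemented as a hash join
SELECT fnby.StateCode, fnby.Gender
FROM NameRank AS nr
JOIN agg.FirstNameByYearStateWide AS fnby
    ON nr.FirstNameId = fnby.FirstNameId
JOIN ref.FirstName AS fn
    ON nr.FirstNameId = fn.FirstNameId
OPTION (HASH JOIN);
```

The table-hint form gives finer control; the query-hint form applies to the whole statement.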
Our second query, where we did the query hint, really wasn’t much better at about 16 seconds.
And looking at the execution plans, that first query, where I did the table hint, it still chose a nested loop when going to ref.FirstName because I only put the hint on a specific join. It did do what I asked, and it used the hash join for the results from the CTE and agg.FirstNameByYearStateWide.
For the second query, where I used a query hint to say the only join type available is a hash join, sure enough, I have a hash match here, going to ref.FirstName, which is now up at the top right. And then I also have another hash match join down here. So hash match is everywhere.
I do have the little double arrows on the operators indicating my query went parallel, but that parallelism didn’t win me super fast performance. This got slower.
So I didn’t have any indication that a hash join was going to be great, because the queries that I saw with hash joins in them were not the fast queries. But this is an example of why you’ve got to be careful.
I might think, “Oh, if I make it use a big join, it’ll be faster since it’s dealing with a sizable amount of data.”
But I’ve got to be careful because it can make it slower.
Well what if I want to force a loop join?
We have seen that the plans with nested loop joins between the windowing function CTE and agg.FirstNameByYearStateWide were faster. So what if I say, “Okay, you can use the new cardinality estimator, but trust me, you want to do a nested loop join at this place”?
So instead of doing that FAST hint, or doing that cardinality estimator hint, I’m now using a join hint. Let’s go ahead and execute that and see how it does, leaving the estimates from the new cardinality estimator in place.
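A minimal sketch of the loop join hint in table-hint form (the CTE name and columns are assumptions):

```sql
-- INNER LOOP JOIN forces a nested loop for this join only;
-- any other joins in the query are still up to the optimizer
SELECT fnby.StateCode, fnby.Gender
FROM NameRank AS nr
INNER LOOP JOIN agg.FirstNameByYearStateWide AS fnby
    ON nr.FirstNameId = fnby.FirstNameId;
```

Worth knowing: per Microsoft’s documentation, specifying a join hint in the FROM clause also enforces the join order for the query, which is part of why hints like this can backfire.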
Performance isn’t too bad: I get just over six seconds of elapsed time, and 5.2 seconds of CPU time. And looking at my plan, sure enough it uses the join I asked it to, and it also decides to use a nested loop join when it goes to ref.FirstName. I still have a warning on my SELECT, though, of course.
Let’s see what we allocated for memory here, looking at that warning on the SELECT. It’s just under 200 MB of memory granted, because different join types in SQL Server have different requirements for how much memory to allocate when data flows through the join.
And other operators in my plan, like a sort, still need memory granted based on the number of rows SQL Server thinks will flow into them.
Well, so we haven’t set any records with our join hint. What if we take a new approach?
And this is a bullying approach: what if I use a temp table? I’m not using hints at this point. I’m just saying, what if I materialize the results from my windowing function? “Okay, go ahead and run the windowing function, filter to the rows where the rank by gender and row is number one, and dump those into a temp table named NameRank.”
Then, instead of using a CTE, I’m using a temp table so that we materialize those results, and then I join to the temp table.
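The pattern might look something like this (the ranking logic, column names, and source of the CTE are assumptions; the course’s actual windowing function differs):

```sql
-- Step 1: materialize the filtered windowing-function results
-- into a temp table
SELECT FirstNameId, Gender, StateCode
INTO #NameRank
FROM (
    SELECT fnby.FirstNameId, fnby.Gender, fnby.StateCode,
           ROW_NUMBER() OVER (PARTITION BY fnby.Gender, fnby.StateCode
                              ORDER BY fnby.NameCount DESC) AS RankByGender
    FROM agg.FirstNameByYearStateWide AS fnby
) AS ranked
WHERE RankByGender = 1;

-- Step 2: join to the temp table; SQL Server can create statistics
-- on #NameRank, so it gets a real row count estimate here
SELECT fn.FirstName, fnby.StateCode, fnby.Gender
FROM #NameRank AS nr
JOIN agg.FirstNameByYearStateWide AS fnby
    ON nr.FirstNameId = fnby.FirstNameId
JOIN ref.FirstName AS fn
    ON nr.FirstNameId = fn.FirstNameId;
```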
Temp tables support statistics
That can help SQL Server estimate, “Okay, how many rows do I actually have in this temp table when I join to it?” But of course, we have a two-step process now. So let’s go ahead and give this a run. I don’t have hints here, but I am forcing that materialization by breaking the query into parts.
And, hm, how are we doing here? Well, we’ve got a total run time of about five seconds. My first query, which dumps the rows into the temp table, takes about five seconds of duration and about the same CPU time, and the second query is about half a second of duration with very little CPU time.
Looking at my execution plans, I now have multiple plans. When SQL Server runs the top query, I still have that problem where it thinks a lot of rows are coming out of the filter, but all it’s doing after that point is dumping rows into the temp table. When we query the temp table in the bottom query, we have a table scan of the temp table.
And when we hover over the rows coming out of it, SQL Server can see, because of its ability to create statistics on the temp table, “Oh, I’ve got 10,903 rows in there. I’m willing to do a nested loop to go get the extra rows in agg.FirstNameByYearStateWide, no problem.”
And I don’t have a warning about memory allocation at all on this second part of the plan. So I have a pretty good run time here, and I’m not over allocating memory for this. It does look like a good improvement. So we got rid of that excessive grant problem.
Can we do the same thing using a table variable?
I’ve got slightly different syntax because I’m declaring a table variable, but I’m doing the same thing: I take my ranked baby names, put them into the table variable, and then join back to the table variable. Let’s give that a run.
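The same two-step pattern with a table variable might be sketched like this (column names, types, and the ranking logic are assumptions):

```sql
-- Hypothetical sketch: table variable version of the same pattern
DECLARE @NameRank TABLE (
    FirstNameId INT NOT NULL,
    Gender CHAR(1) NOT NULL,
    StateCode CHAR(2) NOT NULL
);

INSERT @NameRank (FirstNameId, Gender, StateCode)
SELECT FirstNameId, Gender, StateCode
FROM (
    SELECT fnby.FirstNameId, fnby.Gender, fnby.StateCode,
           ROW_NUMBER() OVER (PARTITION BY fnby.Gender, fnby.StateCode
                              ORDER BY fnby.NameCount DESC) AS RankByGender
    FROM agg.FirstNameByYearStateWide AS fnby
) AS ranked
WHERE RankByGender = 1;

SELECT fn.FirstName, nr.Gender, nr.StateCode
FROM @NameRank AS nr
JOIN ref.FirstName AS fn
    ON nr.FirstNameId = fn.FirstNameId;
```

Note that, unlike a temp table, a table variable is scoped to the batch, so the declare, insert, and select all have to run together.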
Table variables don’t have the same magic ability to create column statistics on them that temp tables have. But still, my first step here of populating the table variable isn’t bad.
We’re at about four and a half seconds, and about half a second to run the second part of the query as well. We have very similar performance here. We’re going to have the exact same estimation problem, populating the table variable. We do have a different shape on this second plan here.
Because when we look at how many rows SQL Server thought were going to come out of the table variable named NameRank, it only thought there was going to be one row. That’s because table variables don’t have those statistics created on them and SQL Server’s just like, “I don’t know.” It guesses zero, and then it rounds that up to one.
But in this case that doesn’t turn out too badly for us because the low estimate leads to a nested loop plan, which ends up working pretty well for this specific query. I get lucky here with the table variable and it turns out okay.
We could say, “Well, what happens if we make the table variable smarter?”
Trace Flag 2453 was introduced in SQL Server 2012 Service Pack 2. This trace flag doesn’t give SQL Server whole statistics for table variables, but it does let SQL Server see a row estimate of how many rows are in the table variable without you having to use a RECOMPILE hint. This trace flag doesn’t work with QUERYTRACEON, though.
That’s an important thing to know about QUERYTRACEON: it doesn’t work with every trace flag, only with a list of supported trace flags.
But I can run DBCC TRACEON globally, or at my session level if I’m a sysadmin, and I am. So I’m going to turn on trace flag 2453 for my session, and then run my table variable query.
And I do have the QUERYTRACEON syntax in here in case you want to test it, but it doesn’t work for this trace flag. I have to turn it on for the whole session; the QUERYTRACEON query hint doesn’t work.
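The session-level syntax looks like this:

```sql
-- Turn trace flag 2453 on for this session (requires sysadmin)
DBCC TRACEON (2453);

-- ... run the table variable batch here ...

-- Turn it back off for the session when finished
DBCC TRACEOFF (2453);

-- By contrast, the query-level hint form is not supported for 2453,
-- since QUERYTRACEON only works with a documented list of trace flags:
-- OPTION (QUERYTRACEON 2453)
```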
But I’m going to run that code that puts the ranked names into the table variable, and then joins back to it and queries it. So I’m using my table variables still, but I’m making it a little bit smarter. I get pretty similar performance. Four and a half seconds for the first step, and then about half a second for the second step.
And looking at my second step, I do have a different plan because when it looked at my table variable, it said, “Oh, I can see the row count in there now.”
It was willing to do a nested loop for looking up all those reporting columns, but then it decided, “Hey, it might be worth it to do a merge join when it comes to joining back to ref.FirstName.”
There is a sort operator that appeared here too. If SQL Server is using a merge join, merge joins require both inputs to be sorted the same way. So if they aren’t already sorted that way, the plan has to include a sort operator, and that doesn’t come for free; it has a cost associated with it.
But the results were still very similar to the table variable’s performance before: it made a different join choice, but performed very similarly in our case.
So I’m going to turn off my trace flag for my session. It didn’t make a huge improvement here, but in some cases, especially if you have a bunch of table variables in your existing code someplace, it might be worth testing whether turning on this trace flag could help improve performance. As you’ve seen, it doesn’t always make a difference, and there is always a potential for regression, but sometimes SQL Server can make smarter decisions based on that row count. It’s worth a test, though as I said, we’ve got to bully at a larger level: we can’t hint individual queries to take advantage of it.
There are more ways that we can tune this query though. We’re going to take a look at that in the next video.