Can we find a more stable solution and can we parameterize the query?
The method I’m going to use for parameterizing this is a stored procedure. And our first step is just to create a stored procedure with our original query. I haven’t done any tuning on this one, this query does not, there’s no recompile hint in here at all, we haven’t even rewritten the joins.
This change, the only change here is that we have made YearToRank instead of a local variable, and you can put local variables in procedures by declaring women the body, but we have said as we create the procedure, here’s YearToRank, this is a parameter for this stored procedure.
Are we seeing differences just by the fact that we are using a parameter instead of a local variable?
In this case does the behavior where it can sniff the value of a parameter, it can sniff and look at 1991, does it result in dramatically different optimization? Well, doesn’t fix all our problems for sure. We still have 26,000 logical reads. Just making this a parameter doesn’t fix everything that’s wrong with this query. We still have all 1.8 million rows, almost 1.9 million rows going into the filter in that driver branch and every branch.
And we can see that it has not pushed the filter down.
Let’s incorporate that step where we simplified the joins.
That definitely seems like a good step to take. This incorporates the change where instead of the joins having column plus 10 equals other value we’re doing parameter plus 10 equals other value in another table. So we are simplifying the logic in the joins for it. We haven’t fixed the problem with the outer branch but could this result in some improvements just by itself? It’s worth testing, right? Well, we’re still at 26,000 logical reads, and when we look at our execution plan there are no seeks to be seen at the end of these branches. It hasn’t resulted in any predicates being pushed down.
This by itself doesn’t fix the problem.
Would a recompile hint work?
Our goal is not to use a recompile hint because even though we now have a stored procedure here, even though I have a proper parameter here, by adding a recompile query hint I’m saying, “Optimize this just for me and don’t reuse it for anybody.” But it is worth testing and looking at does this result in the predicate being pushed down in the procedure just like it did with the local variable?
And you may notice that came back really fast and sure enough the recompile hint combined with rewriting the join logic in the stored procedure does get us just like with the local variable, it does get us below 500 logical reads.
And if we look at our seeks, just like with the local variable when we put that recompile hint on SQL Server said, “Ah, I will just go ahead and treat this as if it was a literal value. I’m not going to reuse this, it’s just for you, so I am going to look at that value and treat it just like a literal.” It isn’t even being treated like a parameter anymore. And once it does that it is able to push that filter down to the end there, but at the price of plan reusability, so we’ve got some downsides.
Another quick fix approach: a temp table
When I’m kind of still looking for other quick fixed I might say okay, instead of us running this window function over and over and over, I am just going to run it once.
And so instead of a CTE I am, just for once and all, I’m going to do this window function for everybody and then I’m going to pull from the temp table and join to that and access that temp table, and do the window function, and do the rank for different years. But I’ve already precalculated what the ranks are before they went into the temp table. So I’m going to go ahead and do this procedure here and we’re just going to go ahead and rank for all years and then join back to it. And when I look at this and do rank by year for 1991, I do get different performance, I have multiple steps now.
But even just that first step on rank by year of populating everything, that takes 5,202 steps, and then when we’re accessing the temp table later, we’re at 30,425 logical reads there.
Temp tables can be good tuning tools to use in stored procedures and can help break down complex queries into simpler steps. They can be really useful. But in this case it’s a path I could try to go down farther but I don’t think I need to do that. I think that I might be able to take an approach where I do this all still in one query. I don’t think the temp table’s going to be a quick fix.
Well, what about a forceseek hint?
And this is usually when I’m looking for quick fixes and I get to the point where I’m like, “What about random hint X?” “What about random hint Y?” That’s usually when I come to the end of the quick fix section and then turn to, “Okay, let’s actually go through a tuning process.”
But let’s indulge ourselves. Also trace flag falls into this category. What if I enabled this trace flag? Maybe a miracle will appear.
In this case what we’re saying is, we know that the behavior that we want is that we want it to do a seek on that clustered index, so what if I just put in a forceseek? Will it then figure out magically that I want it to push that predicate down there and do a seek there?
Well, if I try this with forceseek, this is often the way that flailing around and trying random trace flags or query hints, this is often what it results in. And I’m not against trace flags or query hints, I’m just acknowledging that this is not the first time I’ve created this message. SQL Server says to me, “Actually, your hint screwed me up and I can’t even compile a plan, so you don’t even get an execution this time.”
All right, well, let’s step back from this. Instead of trying more hints, let’s step back and say: How can I rewrite the TSQL for this: I’m going to try to keep it in one statement, but I’m going to try to break the query down and start fresh.