There are hints I can use to tell SQL Server to optimize for a value of @RowID that will result in a “smaller” plan.
This procedure’s called TempTableOptimizedForTest, and it’s got the same code that we have for our normal temp table that we’ve been using.
We create it, we add a million rows, but when we go to update the temp table we have added an OPTIMIZE FOR hint that says: pretend that I passed in @RowID = 8000.
When I optimized this query for 8,000 rows earlier, I got that narrow plan for the temp table, the one we saw perform fast. So this time we’re telling SQL Server: give me an execution plan that’s optimized for 8,000 rows.
We should get that narrow plan, but actually update whatever we’ve passed in for @RowID, which is going to be 800,000 rows. We are purposefully compiling, just for this update statement, a plan for a smaller number of rows than we are actually updating. Let’s create that procedure and execute it.
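As a sketch, the hinted update looks something like this. The procedure name comes from the demo, but the temp table name, columns, and fill logic here are placeholders rather than the exact course code:

```sql
CREATE OR ALTER PROCEDURE dbo.TempTableOptimizedForTest
    @RowID INT
AS
BEGIN
    -- Placeholder temp table; the real demo table has its own schema
    CREATE TABLE #UpdateMe
    (
        RowID INT NOT NULL PRIMARY KEY CLUSTERED,
        ColVal INT NOT NULL
    );

    /* ... insert a million rows into #UpdateMe ... */

    UPDATE #UpdateMe
    SET ColVal = ColVal + 1
    WHERE RowID <= @RowID
    OPTION (OPTIMIZE FOR (@RowID = 8000)); -- compile the plan as if only 8,000 rows qualify
END;
GO
```

The hint only changes what the optimizer assumes at compile time; at runtime the update still processes however many rows actually match @RowID.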
I’m going to turn off my actual plans, and first just measure the duration and statistics IO for it.
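Measuring it that way looks roughly like this (a minimal sketch; the parameter value matches the 800,000-row test described here):

```sql
-- Report elapsed time and logical reads per statement
SET STATISTICS TIME, IO ON;

EXEC dbo.TempTableOptimizedForTest @RowID = 800000;

SET STATISTICS TIME, IO OFF;
```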
Woohoo! This went well! The duration of my update statement was one second.
How much IO did I do? We will scroll on over and see that we did exactly what we want, 53,000 logical reads.
I’m going to turn on execution plans now and rerun it, I think I managed to click that button, yes I did, right before I ran it.
Sure enough, because we used OPTIMIZE FOR, we got the narrow plan for this clustered index update.
What we found is that when we’re updating a lot of rows in a temp table inside a stored procedure, it is still fast when we get the narrow plan, so that worked out for us in this case. This fix isn’t always going to be suitable for everyone.
In this case, I did have an easy value to optimize for.
That’s not always true, depending on what you’re doing. Sometimes this approach works, but you won’t always have an obvious value to say “OPTIMIZE FOR this.” Here, it worked out for us.
Another option to make the temp table faster is to change the scope
Remember how, when we were running these queries outside the context of a stored procedure, as ad hoc SQL, they were faster? The temp table acted like the user table when it wasn’t in the stored procedure.
Well, in this case we can use Dynamic SQL to achieve that if we want. We are now going to create TempTableDSQLTest. In here, we create our temp table, we add a million rows to it, but when it comes to running the update, I am setting a piece of Dynamic SQL to be my update.
I am still using @RowID. I am executing my Dynamic SQL with sp_executesql, which allows me to pass parameter values in. So I can keep the stored procedure and keep the @RowID parameter there, and pass that parameter value into the Dynamic SQL executed by sp_executesql. I don’t have to turn it into just a concatenated string that gets executed; I can maintain it as parameterized code.
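A minimal sketch of that pattern, using placeholder table and column names. Note that a temp table created in the outer procedure is still visible inside the Dynamic SQL, because sp_executesql runs it in a child scope:

```sql
-- Inside the stored procedure, after #UpdateMe has been created and filled:
DECLARE @sql NVARCHAR(MAX) = N'
    UPDATE #UpdateMe
    SET ColVal = ColVal + 1
    WHERE RowID <= @RowID;';

-- Parameterized execution: @RowID stays a real parameter,
-- it is never concatenated into the string
EXEC sys.sp_executesql
    @stmt   = @sql,
    @params = N'@RowID INT',
    @RowID  = @RowID;
```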
After running this, I do a similar thing right here where before we run it we check the time, after we run it we check the time.
The difference here is we are using Dynamic SQL and exec sp_executesql here. Turning off our execution plans and running it first with just statistics time and IO, let’s see how this turns out.
We got our fast duration by changing the scope of the update statement and executing it as Dynamic SQL, I’m at just over one second.
If I scroll right and look, yes, we are NOT getting that massive inflation of logical reads. Well, what does the execution plan look like? I’ll turn on my actual execution plans and run it again.
Let’s see, are we getting wide plan? Are we getting narrow plan?
We are getting wide plan, but the good version of wide plan.
Query two, executed as Dynamic SQL from sp_executesql (so I still have my parameterization there): the IO statistics on the clustered index seek… Yes, we’ve got our 53,000.
If we go on over to our clustered index update, yes, we aren’t doing extra IO, we already found those rows. We shouldn’t have to be doing a bunch of logical reads to find them again.
By changing the scope, we avoided the problem. To be clear, we’re definitely getting into buggy territory here: this is not a pattern I would expect to need for a simple update of a temp table. But if I need to change my code now, while I wait for feedback from Microsoft on whether this can be fixed, this is a change I can make so I can still use a temp table where that’s appropriate for me, without massively inflating my logical reads and having my query slow down on me.
I love that there is a way to get that workaround.
Summing up what worked to speed up this temp table
We can coerce the narrow update plan, which in our testing was always fast with our temp table, even in the context of a stored procedure. This can be tricky in real-world code: with the OPTIMIZE FOR hint, there isn’t always an obvious value to optimize for.
Executing the update from Dynamic SQL also works well for us. You do have to become comfortable with Dynamic SQL, and it adds a level of complexity to your code, but generally it’s easier to fit in, because you don’t have to figure out exactly the right OPTIMIZE FOR value for each bit of code.
Yes, this is a real bug
In fact, the folks who gave me this question and said, “Hey, do you know anything about this?” They also pursued other avenues, and on MSDN forums they were chatting about this and did get confirmation from a Microsoft employee that, “Yeah, this doesn’t look right, this doesn’t look good. We have identified an issue and are evaluating a fix.”
They point out the workarounds recommended based on what Microsoft saw: the Dynamic SQL approach, or a separate stored procedure. They found that using sub/child stored procedures to change the scope that way would work as well.
I personally tend to think Dynamic SQL is a little easier to manage, but depending on how things work at your business you could use either one if you ran into this problem.
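A sketch of that sub/child procedure variant, with hypothetical names. It relies on the fact that a temp table created by a caller is visible inside the procedures it calls:

```sql
-- Hypothetical child procedure: #UpdateMe is created by the caller,
-- and temp tables created by a caller are visible in procedures it calls
CREATE OR ALTER PROCEDURE dbo.ChildUpdateTempTable
    @RowID INT
AS
BEGIN
    UPDATE #UpdateMe
    SET ColVal = ColVal + 1
    WHERE RowID <= @RowID;
END;
GO

-- In the parent procedure, after creating and filling #UpdateMe:
-- EXEC dbo.ChildUpdateTempTable @RowID = @RowID;
```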
The reason that I think this is such an interesting problem is…
It gives you a great place to practice performance tuning and troubleshooting steps: measuring the duration, CPU, and IO usage of the statement, as well as looking at individual operators.
We were really focused on IO here, but on recent versions of SQL Server you can also look at CPU time for each of those operators in your actual plans.
Looking for differences in execution plans can also clue you in.
Like in this example, it is complicated; it often isn’t as simple as “Wide Plan Bad.” What we had to dig through was: in some cases the wide plan is bad, so what are those cases?
And those nuances are the nuances that you work with when you’re tuning real queries in the real world.
Execution plans give you great insight into where you may be able to influence things to get a plan shape, or plan style, that turns out to be fast. Thinking about scope is also often helpful.
This question came to me from Mike and Frank
Thank you so much for the awesome question, I loved working through this and working through emails with you. It was a real pleasure, and I hope that someday soon I’ll be able to update this presentation with a link to the item announcing that fixes are being released for this, hopefully through many versions of SQL Server.
Thank you so much for joining me for this class
I hope that you have enjoyed our tour through this performance tuning problem.
Tempdb is always an exciting place to work ;)
I’m Kendra Little from SQLWorkbooks.com, I hope to see you in another course again soon.