I’m back in Management Studio.
Just to prove that I’m not cheating, I’m going to go ahead and show the problem again
Here is the way the deadlock happens. In session one, we run the first update. In session two, we run the select, and then back in session one, we run the second update, and whoa, we’ve got a deadlock that happens.
Now I’m going to take a look at the solution file, and just to avoid accidental spoilers, I have put some empty space at the top of the file in case you accidentally open the solution file first.
I use the ContosoRetailDW database.
Here is our first index that we’re going to try out to solve this problem
I’m going to create an index on DimProductSubcategory on the ProductSubcategoryName. I’m going to include two columns.
Well, hey, how did I come up with the idea to get that exact index definition?
I’ll show you how. Back on our SELECT query that is involved in the deadlock.
Remember I was saying, I want to figure out how to give this SELECT query different avenues? I’m going to look at the estimated execution plan. I just hit Control + L.
And if I’m thinking about indexing the DimProductSubcategory table– I picked that name out of the deadlock graph– what I want to do is go through the execution plan and say, okay, where is this table?
All these tables all have similar names. We’re looking for DimProductSubcategory.
DimProductSubcategory happens to be in the top right of the plan here. If I hover over this, what I really want to know is, what is a seek predicate, if it’s present. We don’t have a seek predicate. We just have a predicate, which is also known as a hidden filter. It’s looking at all of these rows, and on each one, it’s checking, is ProductSubcategoryName like Cell%.
If I make ProductSubcategoryName my index key, that will help it find these rows faster. It won’t have to look at the table itself, which in this case is a clustered primary key.
The output list is interesting as well, because I don’t want it to have to go back and also grab these columns from the clustered primary key. It’s finding the rows that begin with Cell%, and then it’s outputting or returning ProductSubcategoryKey and ProductCategoryKey.
So, ProductSubcategoryName, I’m going to put as my key column in the index. I am going to put these two columns as included columns so that my non-clustered index covers this query, at least as far as this table goes.
Now, covering isn’t always required to resolve a deadlock, but these columns aren’t super wide, and I’m going to go ahead and say: I want it to be able to go straight to these rows, and I want to give it everything it needs for the purpose of this test.
Let’s go ahead and create the index
It has the columns that we saw in the execution plan, and now we’re going to go back and see, does our deadlock happen if we repeat this process?
In session one, our transaction starts as usual. In session two, the query starts, and it is still blocked. This query didn’t finish fast. It is still blocked, but now when I run the second update, notice that the second update, it isn’t blocked. We don’t end up in a situation where both of them are blocking one another. I had a little blocking, but the index that I added allowed my SELECT query to read from a different object that avoided the second update statement.
So it’s kind of like we, we had a little bit of confusion, we’re sort of going around a roundabout together.
That index DID work, although I still had a little bit of blocking between the processes.
Let’s drop that index, let’s clean it up, and test out our second index option
Our second option that we saw in the execution plan involves saying, okay, what if we don’t want to request a lock from our SELECT query on DimProductCategory?
Let’s figure out what columns we want from that by looking at DimProductCategory in our estimated execution plan.
Doing Control + L again on my SELECT query, and we want DimProductCategory this time. This is DimProduct. If we keep going to the right, here is DimProductCategory. SQL Server is already doing a clustered seek on this.
The first index that we created got us out of doing a clustered index scan on DimProductSubcategory. It’s back to doing that clustered index scan, because I dropped the first index we tried. But we are already doing a clustered index seek on this. It’s able to, for DimProductCategory, it’s able to seek to the rows that it wants already, and this might make you think, well, how is adding an index going to help?
We want to add an index that it can also seek on, but the idea is to give it a different path.
Because the clustered index, even if we’re seeking on it, the clustered index is the table itself. It contains all of the columns in a table. So an update that’s locking a row in the clustered index, it’s locking a row with all of our columns in it, right? So, let’s look at the properties of this one.
This time, we do have seek predicates, because we’re doing a seek, not a scan, and the seek predicate is on ProductCategoryKey. Our index here is going to be a simple index on ProductCategoryKey on DimProductCategory.
I’m going to go ahead and create that index, and let’s test and see if this index resolves our deadlock as well
Now that I have the index on DimProductCategory, the situation happens again.
Our transaction begins. The first update runs. Now in the second session, it runs the SELECT, and this time, it isn’t blocked.
This makes sense, because remember from our deadlock graph, when the SELECT query ran, it had a lock on one table. It was waiting for a lock on DimProductCategory. B
y creating this nonclustered index, which remember, we created this on ProductCategoryKey. We’ve created an index just for ProductCategoryKey. Look at this update statement that touches DimProductCategory. It isn’t updating that key value. It is updating the name.
So the nonclustered index that contains only the key value and doesn’t contain the name, it doesn’t have to get locked by that update, so we don’t get blocked in the first place.
Now there’s no blocking at all, and so, when the update completes, which, you know, I’m rolling back so it’s re-runnable, but imagine I committed it, then we don’t have the chance of a deadlock at all, because our SELECT query was never blocked.
The downside of this index is…
If someone goes and looks at our table on ProductCategory, what they’re going to see is we have what looks like a duplicate index situation.
I’ve run just the built-in sp_helpindex. It isn’t perfect, but it at least gives me the index keys, right? And it shows me that on this table, we had a clustered primary key on ProductCategoryKey.
So our whole table is ordered by ProductCategoryKey, and now I have a nonclustered index that only contains ProductCategoryKey. This nonclustered index, it isn’t going to be obvious that it’s there to prevent a deadlock from happening, so we need to– if we’re going to go this route and create and index that is a near-duplicate, they aren’t identical because the clustered index has all of the columns in the table in it, but these indexes do have the same key column.
We want to make sure if we do go this route that we have documentation to make sure that someone doesn’t come along and clean up our index, thinking, oh, that doesn’t need to be there. Those are two index solutions to this problem. I hope you enjoyed playing along.
If you would like to solve another deadlock…
The course Troubleshooting Blocking and Deadlocks for Beginners has a whole section on deadlocks and a totally different example deadlock that you can try your hand at solving.
Thanks for playing along in this SQL challenge. I hope that you had as much fun as I did. I’ll see you in a future challenge.