Strategies to prevent deadlocks (9 minutes)

Transcript

Fighting deadlocks is tough in SQL Server

You can have more than two sessions involved in the deadlock things can get pretty complicated. And the fact that the input buffer and the deadlock graph doesn’t always give you the whole of the transaction that’s involved for that session does give you extra leg work to do.

The good news is, after you work with deadlocks for awhile and you get practice going, there will come a time when you know you’ve been working with more complex ones and you hit one and you’re able to quickly read the deadlock graph and quickly come up with ideas for how you can prevent it from happening again.

It is pretty fantastic when you have the situation where you get to be the deadlock hero and you can get there with work and with practice.

Indexes are my favorite way to prevent deadlocks from coming back

They’re my favorite because it tends to be relatively low risk to add an index. When you’re comparing changing TSQL involved and adding an index, there’s far less risk that you’re going to make the data wrong for someone by adding the index.

There is more work than I showed for adding an index

You want to make sure that you’re not adding a near duplicate index to something that’s already there.

For instance in the case of the index I added, there might be an index on country ID that just doesn’t have the included column I used, so instead of adding a new index, modifying an existing index might be a better bet.

I also need to check how many indexes does the table have already, is it okay to add another index for the size of the table and the insert rate. When is a safe time to add an index.

So there is, even when adding an index is a lower risk way to solve a deadlock, you wanna make sure you don’t skip the due diligence in your haste to become the deadlock hero.

Every deadlock really is a special monster

I like using that dedicated trace that I showed to collect the special monsters in a dedicated place so that things like the built in system health session don’t roll off the deadlocks and I make sure that I don’t miss them. If it’s important to fight deadlocks on that system.

Another tool that can come in handy is setting the deadlock priority when you have the ability to change the code you can set the deadlock priority for a session. And this won’t prevent the deadlock from happening it really is a way for you to get to decide who is going to be the victim if the deadlock happens. Deadlock priority can be set to high or medium or low.

There’s also a granularity that you can find all the details in books online for setting the deadlock priority for a session sometimes it can be a useful intermediate step, it’s not very satisfying, it doesn’t solve the root cause but if the deadlock monitor is picking a session that, as the victim who you really don’t wanna get killed off it can alleviate some of the pain for a short period of time.

Applications should catch deadlocks and notify monitoring

Generally you want to find out about deadlocks from the calling application and this just makes sense because the calling application needs to handle errors and failed queries against the SQL Server. It’s not just deadlocks that can happen to an application.

They can have timeouts, they can have connection failures they can have all sorts of things go wrong. Whenever they hit any kind of failure, they need to handle that error and generally as part of application design you want them to log it and that to be tied in to an alerting system as well. So usually you want the calling application to let you know when it is the deadlock victim.

What if I need to make SQL Server notify me of deadlocks?

If you need you can have the SQL Server notify you when deadlocks happen, it’s a little bit weird to set up but it, the commands involved are documented commands from Microsoft.

Michael Campbell has an article on how to do this on Sqlmag.com that will step you through how to do it. You can’t just set up an agent alert by itself you gotta do a little extra work with SP_altermessage to get the agent to actually get notified about the deadlocks.

Working with the deadlock graph: why setting up a repro is worth it

When you get the deadlocks and you’re working with the deadlock graph I always work very hard to see if I can set up a repro for the deadlock. Outside of production, use a restored copy of the database set up the sequence of commands and be able to simulate the deadlock.

It does take time, that’s not usually fast to set up.

The reason that it’s worth the time is if you don’t have a repro of the deadlock, usually deadlocks happen kinda sporadically you know sometimes they’ll happen in bursts and then they’ll go away for a long time. And then they’ll come back.

If you don’t have a way to reproduce the deadlock, you may still be able to hypothesize a fix that you think will work but if you implement it without the repro how do you know when it worked?

Do you wait a day, do you wait an hour? Do you wait a week?

It becomes very very difficult to actually prove that you prevented the deadlock because of the sporadic nature of them on most systems.

So being able to reproduce it outside of production and figure out can I solve it with an index, or can I solve it with another method which I think are usually a little harder. Because they require more testing, it’s just more satisfying when you’ve got that repro.

Other solutions: TSQL and Isolation Levels

Sometimes rewriting the TSQL can help prevent a deadlock. Sometimes changing the isolation level can help prevent the deadlock from happening as well.

I’m not talking about NOLOCK here, I already mentioned in path module, NOLOCK means I don’t care if the data is correct and almost always that is not acceptable to the business, by changing the isolation level I really mean here using optimistic locking in SQL Server: snapshot isolation level or read committed snapshot isolation level. They’re two different isolation levels. They can be enabled separately or together but they’re ways to have your database use versioning, when you have enabled snapshot or read committed snapshot. Inserts, updates, and deletes will use timestamps and versioning in the tempdb database.

So that transaction can read data that is current from the time either of statement or the transaction starts depending on what you’re using. But basically they use this versioning to get around some of the blocking issues in SQL Server.

Using snapshot or read committed snapshot, writers don’t block readers. Readers don’t block writers.

Now writers will still block writers, so this isn’t eliminating the possibility of deadlocks altogether: it’s just some it’s more of a strategic database level change.

You’ve gotta make sure that your system, your instance can support the versioning needed in tempdb. So that you don’t cause other problems.

You’ve got to make sure that you don’t run into race conditions if you’re using read committed snapshot isolation and then if you’re using snapshot you’ve gotta say which queries are gonna use it.

So it’s a whole training course by itself. I’m a really huge fan of them, I think they can be really powerful, it’s not as much of a troubleshooting step though as much as it is sort of a strategic database level change in how you’re handling your approach to concurrency, so it’s a fantastic tool. It can help you prevent deadlocks.

It’s just a higher level of strategy than sort of I’m handling an individual deadlock kind of troubleshooting tool too. So it’s definitely – it’ll show up on our learning plan of things you’ll wanna learn about in the future. But it’s not something you wanna do just as a reaction to an incident or two which you may be able to solve with a simpler tool.

Thanks for joining me in our battle to diagnose and prevent deadlocks! Using the scripts in this session you can absolutely build up your skills and you can become a deadlock fighting hero.