Diagnosis for stack dumps - a discontinued feature (8 minutes)

This video is for historical purposes only

Features come and features go, and this one was discontinued in January, 2018. If you are a customer who would like help interpreting stack dumps, Microsoft’s current guidance is to engage their product support team.

There’s a brand new add-in in town

Microsoft’s new add-in allows you to upload stack dumps to the cloud for automated diagnosis. See a demo of this tool (currently in preview).

Read this post to get a list of supported SQL Server versions and more (currently SQL Server 2008+).

You must have a recent version of the SSMS client tools to use the add-in.

Download SQL Server Diagnostics (Preview) from Microsoft here.

Transcript

We have recently gotten a really promising, exciting new feature for SQL Server Management Studio. It is an extension for diagnostics. As of this recording, it is currently in preview mode, and I have downloaded it from Microsoft and installed it.

Now that I’ve installed it, under my tools menu, I have SQL Server Diagnostics. What we’re going to be looking at is this cool feature where we can, if we choose, use this extension to upload stack dumps, if we’ve had stack dumps on our SQL Server.

Uploading a stack dump for analysis

We’re going to upload that to Microsoft and ask them to analyze our stack dump file in the cloud, and I think this is just a great use. You need to decide whether or not it’s appropriate to share your stack dump files in this way, but if you would be calling support with Microsoft anyway, then this seems like something you’re probably going to be comfortable with.

What I’m going to do is open this up, and I’ve actually already uploaded a few files that you can see. I can immediately, if I want to, go into my upload history and I can see the dates that I uploaded, what I uploaded, the status, and any comments that they have. I can also clear out my history if I want. This is from the client tool. I’m uploading and having this analyzed from my client tool. You don’t have to do this from the production server at all, because on the ‘Analyze SQL memory dump’ tab, you get to pick lots of things, at least a couple of things. You get to pick what region to upload it to, and West US definitely is a good choice for me given my location. If you want an email, you can provide an email address, and the email it sends is pretty nice, I’ve already gotten one or two, and then we select the file that we want them to analyze.

Your stack dumps, if you have them– hopefully you don’t have these all the time– because very often, not all the time, but very often when you have a stack dump, your SQL Server freezes up while it’s happening. These are stored in your SQL Server Error Log directory.

This is the location where I have mine here, and if I scroll down, see all these SQLDump.mdmp files? This is where they’re kept. If you’re looking at your SQL Error Log itself and you have a stack dump in the SQL Error Log, it’ll be like hey, stack dump.

So I’m going to grab, let’s grab .003.mdmp, and let’s get that uploading to Microsoft. I’m going to send it to that directory, and grab. Notice that I had a lot of files in there, but it’s like this is the extension that I know about for these mini dump files. I can’t just upload anything. I’m going to grab 003, and I’m going to say open. There we go. And I’m going to then click upload. It does take a while, it’s got to go ahead and send it up to the cloud. We’ve got to find available resources to perform the analysis and the recommendations.

Don’t forget to do some analysis yourself

This doesn’t completely tell you everything that you might be able to learn about your mini dump files. Importantly, there are still some simple things that you can do! Don’t JUST do this part. So I uploaded SQL dump dot 003.

If I go back to my list, notice that I have other files for this.

I’ve got a log file, and a text file. I can open the text file. I can open this, and just me, human me, has been provided some information by SQL Server that I can look at and notice right away: DBCC results, right. We’ve got output here from DBCC CHECKDB, against a database called CorruptMe.

Back when I ran this, I had purposefully corrupted some data for testing different things with CHECKDB and demoing corruption, and we can see, in the text file for this – it doesn’t take a genius to interpret this was database corruption.

It also got a mini dump file created with it, and that’s what I have asked Microsoft to analyze.

Now it says, no KB found, but you know, I can look at the text file and say oh, this was data corruption, and we don’t have the CorruptMe database anymore because I actually got rid of that.

Now it is saying hey, you could potentially look at cumulative updates. I’m on SQL Server 2016, I think SP1. I could compare my versions with these and see what I want to do, but right this isn’t, in some cases, looking at a mini dump file, if it can specifically analyze a bug and make recommendations. You can look at those and evaluate it, but in this case, it doesn’t yet seem to have the capability to be like oh, “you had corruption from CHECKDB, you should look at the CorruptMe database,” yet.

This is best if you act quickly – but exercise caution

Looking at my upload history, I do have one of these, my 005 mini dump file, it did recommend some KBs for me. I really liked picking this because I don’t really remember what I was doing when I generated this mini dump. I don’t know what caused this, but if we look at the KBs, it says one of these is like an error related to a procedure that takes a table valued parameter from RPC calls. I don’t remember doing anything like that, but that’s interesting.

Another one says deadlock with deferred, huh. I actually happen to know that this instance has never been part of an Availability Group, so it’s interesting that that bug is related, but it doesn’t make a ton of sense to me.

Now the final one has to do with a clustered columnstore index. These are potentially related things, and it is likely that I was doing something awful to a clustered columnstore index knowing me, right?

What I’m trying to say is even when it recommends changes for you in ‘Analyze dumps’, you still want to test these before you deploy them to production, and make sure they’re good.

If you can monitor for stack dumps when they happen and address them right away, you’re much more likely to be able to hone in on okay, what types of things were we running that causes this?

Looking back like I did at a mini dump I generated months and months ago is not always the most helpful thing and it isn’t going to have superpowers in terms of being able to pinpoint exactly what caused it. Another thing to check out on SQL Server diagnostics: Notice the just plain view recommendations option. It tells me we can look through different products and see recent releases with updates, what the fixes are, and what’s in there.

I think this is really cool that this info is now closer at hand and more available right in Management Studio, and again, this is still just in preview, this is early days, I’m excited that this exists, and really excited to see what more comes.