Archive | April, 2009

The DBA Sees Your Blocking Spids… A Bird's Eye Summary of Blocking

Maybe you’re a user in a reporting database running a long query in the read committed isolation level, merrily blocking a process which needs to load data.

Maybe you’re an application misbehaving in a transactional front end database, blocking other applications.

Or possibly you’re a middle tier application who is behaving well and trying to do a small update, but the table you’re updating holds a legacy trigger designed when it was a much smaller table, and that trigger is doing data validation with long running locks.

With there being a lot of users, a lot of applications, and in general a lot of blocking, it can be useful to get a bird’s eye view.

I manage a system where we frequently used to see lots of blocking related to triggers. We’d see one spid block another, and then that spid block a few more, and so on. We would also occasionally see a middle tier application open a connection and start doing work, then stop activity but leave a transaction open. In that case it could hold locks until the connection terminated, even though the spid was sleeping.

So it was extremely useful to be able to see quickly who the spids were at the root of the blocking trees. There would often be a very large number of connections open sp_who2 wasn’t the easiest method.

I also like to collect the execution plans for blockers, which makes it much easier to track down the cause.

Hence, this script…
Continue Reading →

Comments { 0 }

Index Usage Statistics with ColumnList and Index Size

As an add on to my last post, here is what I currently do use to track index usage. This shows usage, columns in the index, and index size on disk. The size can be quite useful to know when evaluating how much an index is worth– typically if this index is large then you’re paying a fair amount on the inserts.

If it’s not easy to tell the data types from your column names, that is a modification you’d definitely want to make. Remember that indexes that have a uniqueidentifier at the head are much more likely to cause page splits and be more work to maintain, so those indexes are more “expensive”. (In my current system I do have the luxury of a consistent naming convention where it’s fairly easy to tell the datatypes in indexed columns, so I haven’t added the datatype to the column list.)

The data represents usage since last sql server restart. I am about to start working on a project to automate collection and storage for review over a longer period of time. It’s been on the list for a while and is finally just about in the top priority spot.

It would be useful if the data in the index statistics/missing index DMVs could be cleared every time the data is stored off, as can be done with wait statistics. If you think so too, please lend your support for feedback item 264140, “Clearing dm_db_missing_index_*” at connect.microsoft.com.
Continue Reading →

Comments { 0 }

Everything About Your Indexes (well, almost)

I am going to post my monstrously big index query.

Why? Because it’s AWESOME. No really, it actually is awesome. At least, if you like that sort of thing. I use some variant of this almost daily, and I tweak it fairly regularly to suit the needs of whatever I’m working on. So it’s a work in progress, but I find it constantly valuable.

Awesome? Oh Really? Why?
This query describes the size, basic definition, location, number of rows, partition status, and enabled/disabled status for all clustered and nonclustered indexes in a database. I typically sort them by descending size, since my primary usage is when a drive space alert fires, or when someone asks one of the million “how much space would it take if we wanted to [x]?” questions.

When you are working with a database which has many indexes that are partitioned over multiple filegroups, which are spread out over multiple drives, this can be very useful when a reindex fails due to a file filling up. Or when you want to estimate how much free space you need to main in a given filegroup in order to be able to reindex the indexes using it.

And it’s also just a really useful big picture reference for what you’ve got, where it is, and how much room it’s taking up.

And it’s *fast*.

Well, Indexes aren’t Everything.
That depends who you ask. ;)

And I’m actually cheating, because I include index id 0, which is actually no index at all. As an extra bonus for the introductory offer, you get HEAPS for free in this query. Yes, you heard it here first.

But I don’t have index usage in this. At least not yet! I’m considering working in a rating since last sql start, but that’s for another day.
Continue Reading →

Comments { 0 }

How Stale are my Statistics?

It can be pretty difficult to manage statistics in data warehouses, or even OLTP databases that have very large tables.

This is because, even with auto_update_statistics turned on, SQL is pretty conservative about when to update statistics due to the cost of the operation. For large tables, statistics are updated when “500 + 20% of the number of rows in the table when the statistics were gathered” have changed. (see BOL here)

So for a table with 50 million rows, statistics will auto update when more than 10 million 500 rows have changed. I have a lot of tables with a lot of rows, and this can be a problem. Take a fact table, for instance, where the key is sorted by an integer representing a date. Every day, a large amount of new records are loaded and there is suddenly a lot of records for a new day. Typically though, the number of rows changed in one day in these large fact tables is not enough to trigger automatic updating of statistics.

You don’t want to update statistics more often than you have to, however. A schema stability lock is put in place, and depending on what you’re doing it can cost a bit in IO. You want to learn how often you need to manage your statistics in a given set of tables, and also have a guideline to use to ensure you’re staying within where you need to be over time.

Continue Reading →

Comments { 2 }

Checking Permissions on Linked Servers

One reason I started this blog was just the idea of going through my catalog of scripts and reviewing them and sharing out what might be useful to people.

Here is a quick one I put together a while back. I was starting to work with a group of servers [at an unnamed company, always an unnamed company!]. Some of the instances had been configured long ago, and I found some linked servers where passwords had been hardcoded into the login mappings.

This can be a big security vulnerability, particularly if the option has been chosen to map all users to that login, and the login has significant powers on the other end of the linked server.
Continue Reading →

Comments { 0 }

A Table Summarizing All Agent Jobs with Steps…

Also on the topic of SQL Agent jobs– each time I work with a new system, it can take a while to familiarize myself with what all the Sql Agent jobs do. Often there are quite a few jobs, and sometimes they have legacy names that either don’t describe what the job does very well anymore, or is just hard to understand.

Plus, I don’t like opening jobs in the SQL Agent itself very much, since it only opens in an ‘edit’ view. I very much prefer selecting job details out of the tables in msdb, it’s just safer.

Because of this, a while back I wrote a SQL script that takes a lot of descriptive information about a job in MSDB and pivots it out into a table. The table will automatically have as many columns as are required– I have a server where a job has 41 steps, so it’s got 41 columns for step, each in order.
Continue Reading →

Comments { 0 }

SQL Agent Jobs- Checking for failed steps at the end of a job.

I use the SQL agent a lot, and it is handy for a lot of things, but  it can be frustrating to not be able to pass state information between steps.

For example, I have a job where I want to execute data verification steps against multiple tables. It makes sense to have the check for each table in its own step with a clear label to simplify troubleshooting– so when the job fails, you can see which step had an error and know from the name exactly what’s wrong.  But I want all steps in the job to run, regardless of whether a step fails— I want to check for failure at the end.

The most basic way to do this is to have each job step log to a table. This isn’t really bad, but I’d rather not maintain a table for every job of this type. It leaves room for failure, it’s more to maintain, and it just feels redundant for  the most part: all of the job history  is tracked in MSDB anyway, shouldn’t I be able to use that?

Well, I think I can…
Continue Reading →

Comments { 2 }