Archive | Performance RSS feed for this section

SQLPASS Day 2- Optimization Timeouts and All about TLogs

SQLPass unfortunately can’t last forever, but happily it’s still going strong. Here’s some highlights from my Day #2.

Paul Randal Knows Exactly What’s Going on in Your Transaction Log…

A definite highlight of day 2 was attending Paul Randal‘s session on Logging and Recovery in SQL Server. I’ve read Pauls’ blog posts on this topic and attended his classes before, but even being familiar with the material I find I always learn something from his talks. You just can’t beat being strong on the basics!

I took a lot of notes in the session, this is my favorite excerpt from my notes:

  • SQL Server must reserve space in the TLOG so that it can roll back the active transactions, if needed.
  • Once a VLF no longer contains log records that are required, it can be cleared
  • This is done by a log backup in full or bulk_logged recovery models, or by checkpiont in simple
  • All that happens when a VLF is “cleared” is that it is marked as inactive
    • Nothing is cleared at that time
    • Nothing is truncated
    • Nothing is overwritten
    • The log file size does not change
    • The only thing that happens is that whole VLFs are marked inactive if possible (no active transactions)

Ben Nevarz asks, “How You Doing, Optimizer?”

One of my favorite pieces of information on day 2 was in Ben Nevarez‘s talk on how the query optimizer works. He mentioned this DMV, which I hadn’t used before yesterday:

Sys.dm_exec_query_optimizer_info Check me out!

The other useful bit of info is that the timeout flag is recorded in the xml for the sql plans, so plans which the optimizer finds so complicated that it times out on compilation can be queried from the cache!

SQLPASS homework assignment: Write and test this query, determine how to automate running it and collecting the information.

Sample Queries

This sample from BOL  to find excessive compiles/recompiles:

select *
from sys.dm_exec_query_optimizer_info
where counter = 'optimizations'
or counter = 'elapsed time'

See Also…

Ben Nevarez on the Phases of Query Optimization

Conor Cunningham’s Blog on sys.dm_exec_query_optimizer_info– excerpt:

“The other use of the DMV is to get a good statistical picture of a running system.  Say that I’m a DBA and I want to know how many queries in my application have hints or _need_ hints to work well.  Well, this will tell you.  Granted, it doesn’t separate recompiles from compiles, and if you have a system where plans are getting kicked out of the cache things may be a bit skewed, but I can tell you that this is far better than simply guessing.  Often the DB application developer doesn’t realize that they’ve built an application that requires a lot of hinting or a lot of compilations, and you can see this in more detail than you get with the performance counters”

Comments { 0 }

SQL PASS Day 1: To Free or Not To Free the Proc Cache?

Yesterday was day 1 of SQL PASS 2009. I am attending a variety of sessions on execution plans this year, and along the way I heard three very different opinions yesterday on managing the procedure cache in presentations.

Rule of Thumb: The “it depends” answer is usually right.

Opinion 1: Never Ever Clear the Proc Cache on a Production Server

This first opinion came in a good, solid presentation on using execution plans for troubleshooting. There were some good examples of when you want sql to look at the statistics and trigger generating a new plan, and when you don’t. (AKA when parameter sniffing is a good or a bad thing.) But the speaker was wholeheartedly against clearing the proc cache in production.

While I can definitely see this being true for some systems, I have definitely seen advantages of clearing the proc cache on others (more to come below), so I already knew this was too simple an answer for me– at least until I’ve solved the problems I have with out of date statistics on frequently modified large tables.

(Thanks to Grant Fritchey for a great presentation.)

Opinion 2: Be Free, Procedure Cache, be Free!

This second opinion came in a session on using DMVs to troubleshoot performance. This session was even geared toward OLTP systems, and the speaker said he regularly frees the procedure cache on his production sql servers at a given interval. He sees slight CPU pressure after doing so, but has the benefit of being able to capture and trend exactly what procedures go into the cache using the DMVs afterward (with the benefit of clean timestamps).

So in his environment, he has no issues clearing the proc cache.

(Thanks to Dr.DMV for a great talk!)

Opinion 3: It Depends: Check the Size of Your Proc Cache, Free if You Need To (and can handle the CPU for Recompilations)

The third speaker (Maciej Pilecki) talked about looking at the total size of the proc cache, and stressed that as this cache grows, it can steal space from the buffer pool. For each system, you should look at the size of the procedure cache and the amount of execution plan reuse you are getting on the system.

There are two main performance benefits to plan reuse (whether parameterized adhoc queries or procedure queries):

  • Speed: (recompiling takes time and CPU resources)
  • Smaller proc cache / More room for buffer pool to hold data in memory

Bonus: Maciej also mentioned how the ‘Optimize for Adhoc Workload’ option in sql 2008 can help alleviate bloat of the adhoc procedure cache. When enabled, this will only cache adhoc plans on their second run– for the first run sql will just store a small record that the query was executed once.

I really enjoyed these sessions, and one of the great things about PASS is the opportunity to hear and synthesize different perspectives on these topics.

Love it!

See also: Maciej Pilecki’s post on clearing only the adhoc part of the cache
Lara Rubbelke’s post on memory pressure and the proc cache

Comments { 0 }

Who's Using All that Space in TempDb, and What's their Plan?

Whatcha Doing in My TempDb???

This post contains a script that I adapted from the fantastic SQL Server Storage Engine Blog.

It comes in handy in my job all the time! Sometimes tempdb is filling up, but sometimes I just want to monitor the amount of tempdb and check out execution plans of heavy tempdb users while watching performance on a server. It just really comes in handy more frequently than I would have thought before I started using it.

Note: This script returns space used in tempdb only, regardless of the db context it’s run in, and it only works for tempdb.

Continue Reading →

Comments { 0 }

Average Daily Job Runtime

Here’s a query I found useful today– this week we moved many of our production datamart servers to SQL 2K5 SP3 CU4, and today among the course of other issues I wanted to take a look at my job runtimes to see if they might be noticeably slower or faster than prior runs. I often am in a similar situation after deploying significant changes to our codebase.

Since most of my processing runs in SQL agent jobs, looking at average runtime per day is a pretty convenient index of performance. However, the load in processing varies by day of week, so it’s frequently useful to check activity for only a certain day of the week.

This script allows for both. I usually want to tweak the conditions, so I don’t set them in variables at the top, I edit them within the query itself each time:

use msdb;

select
	d.jobname
	,d.servername
	, avgDurationMinutes=avg(d.durationMinutes)
	, daydate=convert(char(10),startdatetime,101)
from (
	select
		jobname=j.name
		,servername=server
		,startdatetime=
			CONVERT (DATETIME, RTRIM(run_date))
			+ (
				run_time * 9
				+ run_time % 10000 * 6
				+ run_time % 100 * 10
			) / 216e4
		, durationMinutes=
				(CAST(SUBSTRING((right('0000000' + convert(varchar(7), run_duration), 7)),1,3) AS INT) * 60 * 60
				 + CAST(SUBSTRING((right('0000000' + convert(varchar(7), run_duration), 7)),4,2) AS INT) * 60
				 + CAST(SUBSTRING((right('0000000' + convert(varchar(7), run_duration), 7)),6,2) AS INT)
				)/60.

		,enddatetime =
		dateadd
			(ss,
				(CAST(SUBSTRING((right('0000000' + convert(varchar(7), run_duration), 7)),1,3) AS INT) * 60 * 60
				 + CAST(SUBSTRING((right('0000000' + convert(varchar(7), run_duration), 7)),4,2) AS INT) * 60
				 + CAST(SUBSTRING((right('0000000' + convert(varchar(7), run_duration), 7)),6,2) AS INT)
				)
			,
			(CONVERT (DATETIME, RTRIM(run_date))
			+ (
				run_time * 9
				+ run_time % 10000 * 6
				+ run_time % 100 * 10
			) / 216e4 )
			)
		, retries_attempted
	from sysjobs j (nolock)
	join sysjobhistory h  on
		h.job_id = j.job_id
		and h.step_id = 0 -- look only at the job outcome step for the total job runtime
	where
		j.name in ('<strong>JobName</strong>')  -- Set the jobname here

) d
where
	datepart(dw,startdatetime)=7 -- Set  your day of week here if desired. 7=Saturday
group by
	d.jobname
	,servername
	,convert(char(10),startdatetime,101)
order by
	d.jobname
	,servername
	,cast(convert(char(10),startdatetime,101)as datetime) desc
Comments { 0 }

Finding Plans and Stats for Queries like '%something%'

I often need to find a query plan in the cache for a process that has run long overnight. Typically I’ll be able to figure out from our logging some of the tables involved in the query. Sometimes I will have most of the executing text but won’t know exactly what dates or reference points were included.

Even when I have enough information to get an estimated plan, it’s usually really helpful if I can pull the actual plan out of the cache along with runtime statistics.

The query below is what I use at this point to try to find these plans– I also sometimes use it just to look for long running queries in general.

One note to remember– the last_execution_time field is the time of the plan activities at the last execution. So if you’re looking for a query that ran for an hour, this time would show at the beginning of that execution. (The logging on my systems is done after a batch of activities complete, so I always have to do a bit of work to figure out approximately when the activity would have started and look around that time for the plan.)

Continue Reading →

Comments { 0 }

The DBA Sees Your Blocking Spids… A Bird's Eye Summary of Blocking

Maybe you’re a user in a reporting database running a long query in the read committed isolation level, merrily blocking a process which needs to load data.

Maybe you’re an application misbehaving in a transactional front end database, blocking other applications.

Or possibly you’re a middle tier application who is behaving well and trying to do a small update, but the table you’re updating holds a legacy trigger designed when it was a much smaller table, and that trigger is doing data validation with long running locks.

With there being a lot of users, a lot of applications, and in general a lot of blocking, it can be useful to get a bird’s eye view.

I manage a system where we frequently used to see lots of blocking related to triggers. We’d see one spid block another, and then that spid block a few more, and so on. We would also occasionally see a middle tier application open a connection and start doing work, then stop activity but leave a transaction open. In that case it could hold locks until the connection terminated, even though the spid was sleeping.

So it was extremely useful to be able to see quickly who the spids were at the root of the blocking trees. There would often be a very large number of connections open sp_who2 wasn’t the easiest method.

I also like to collect the execution plans for blockers, which makes it much easier to track down the cause.

Hence, this script…
Continue Reading →

Comments { 0 }

Index Usage Statistics with ColumnList and Index Size

As an add on to my last post, here is what I currently do use to track index usage. This shows usage, columns in the index, and index size on disk. The size can be quite useful to know when evaluating how much an index is worth– typically if this index is large then you’re paying a fair amount on the inserts.

If it’s not easy to tell the data types from your column names, that is a modification you’d definitely want to make. Remember that indexes that have a uniqueidentifier at the head are much more likely to cause page splits and be more work to maintain, so those indexes are more “expensive”. (In my current system I do have the luxury of a consistent naming convention where it’s fairly easy to tell the datatypes in indexed columns, so I haven’t added the datatype to the column list.)

The data represents usage since last sql server restart. I am about to start working on a project to automate collection and storage for review over a longer period of time. It’s been on the list for a while and is finally just about in the top priority spot.

It would be useful if the data in the index statistics/missing index DMVs could be cleared every time the data is stored off, as can be done with wait statistics. If you think so too, please lend your support for feedback item 264140, “Clearing dm_db_missing_index_*” at connect.microsoft.com.
Continue Reading →

Comments { 0 }

How Stale are my Statistics?

It can be pretty difficult to manage statistics in data warehouses, or even OLTP databases that have very large tables.

This is because, even with auto_update_statistics turned on, SQL is pretty conservative about when to update statistics due to the cost of the operation. For large tables, statistics are updated when “500 + 20% of the number of rows in the table when the statistics were gathered” have changed. (see BOL here)

So for a table with 50 million rows, statistics will auto update when more than 10 million 500 rows have changed. I have a lot of tables with a lot of rows, and this can be a problem. Take a fact table, for instance, where the key is sorted by an integer representing a date. Every day, a large amount of new records are loaded and there is suddenly a lot of records for a new day. Typically though, the number of rows changed in one day in these large fact tables is not enough to trigger automatic updating of statistics.

You don’t want to update statistics more often than you have to, however. A schema stability lock is put in place, and depending on what you’re doing it can cost a bit in IO. You want to learn how often you need to manage your statistics in a given set of tables, and also have a guideline to use to ensure you’re staying within where you need to be over time.

Continue Reading →

Comments { 2 }