Which pages are in my index?
Sys.dm_db_index_physical_stats described to me the shape of my index, but it didn’t give me exact page numbers.
I can get page numbers from an undocumented dynamic management view that we got in SQL Server 2012 called sys.dm_db_database_page_allocations.
Similarly, I’m telling SQL Server: I want to look in this database at the DimEmployee table for index ID 28, and I want detailed information from that. This will look at all the pages assigned to that object and return to me: okay, what are the page numbers?
I’m going to to take those page numbers and plug them into more undocumented commands to say, hey what’s on this page? Let’s take a look at it.
I only want to look at index pages, and I only want pages that have been allocated.
Right now for my index, there’s only one page at page level two
That’s our root page of the index, and here is its page number.
Page level one, that is our intermediate pages of the index
That level that only has two pages in it, and we get the page numbers for that. For that level we also have more information. Because there’s more than one page in the level, we can see that the first page at the intermediate level, it knows who the next page is, and the next page knows who the previous page was.
These pages, when there’s more than one page at a level, they have pointers back and forth.
Here, page level zero, these are our leaf level pages for the index
Here’s the first leaf page, it says okay I know who the next page in line is, and then so on and so forth.
Let’s examine pages with DBCC PAGE
We’re going to start here at the root. I’m copying out the page ID for the root of our index, and I’m going to paste that into this command right here.
Trace Flag 3604
But before I run it, before I run the undocumented DBCC PAGE command, I’m going to make sure that I have a trace flag on for my session (TF 3604). DBCC TRACEON says please take the output from DBCC PAGE if I run it, and put it in my messages tab here in Management Studio. Please return it to me. So I’ve turned on that trace flag, and now I can run the undocumented DBCC PAGE.
I give it the name of a database, the file number I want to look at pages in, the page number I want to look at, and then information about the style of results I want.
My previous query did tell me (against sys.dm_db_database_page_allocations) – it did tell me, if I look here in allocated_page_file_id, all of these pages are in database file one for this database. I haven’t configured this with multiple data files. So data file one, here is our page number that we copied out for the root page, and I want to look at this index page. As it’s a non-clustered index and I want result style three, which will return information for what’s on this page in this nice convenient table format.
The root page (level 2, in this case)
It says this page is in file ID one, and here’s the page number that we’re looking at. There are two rows on this page. I am at currently index level 2, that’s where my root page is. These child pages, they are each also in file ID one, and here are the page IDs of the child pages.
Key column guideposts
Then I get some interesting information in three columns. Notice that I have three key columns here. I defined my index with three key columns and they are all on the root page. They have – for the first child page, they just say NULL. Then for the second child page, they have some sample data. These are guideposts, this data for the key columns.
These data columns are guiding me if I’m looking for data. Let’s say I’m looking for the LastName Adams. Well, the guidepost I have for the second page is telling me this child page manages data for this value and higher. If want a value lower or previous to that such as Adams, I want to look on the child page before that, which in this case is the first one. It doesn’t have a guidepost, it’s everything up to, but just before, the guidepost for the next page. Well, that’s interesting.
Intermediate pages (level 1 only, in this case)
Let’s take a look at these child pages. I’m going to copy that first child page there and now I’m going to copy the second child page, and we’re going to plug those in to more DBCC PAGE commands. We’re now looking at the intermediate pages. Looking at the first intermediate page, it has quite a few rows, it has 186 child pages that it knows about.
On this intermediate page we have those key columns listed again with more of these guideposts. We don’t have any of those included columns here though. Neither the root page nor this intermediate page had anything about those included columns. We simply have sample values for the key columns. And so we can see, for the first row, we again have NULLs because it contains everything up to, everything just before the LastName Abrus, FirstName Luka. And so on, so we have these guideposts saying if we are looking for a row, here is where to go to find it.
We can also see really clear information on how this is all sorted
Everything is sorted first by LastName. The sorting for FirstName only comes in if we have a duplicate value for LastName. We have two Alexanders, so David comes first and then Michelle.
But if we were to just look at FirstName without looking at LastName, these can be in any order, right? Because we have a variety of LastNames, we don’t have a lot of duplicates in that, so the sorting of the second column only comes into play when we have a duplicate value for the FirstName.
Similarly for EmployeeKey, we would only have to explicitly change an order for that if we have duplicate values for LastName and then also for FirstName. In that case the third key would determine what order they were in. So that first key column, this first key column of LastName is incredibly powerful in terms of placing rows in this index, because in this case we have a lot of unique values for that LastName column.
Let’s look at our other intermediate page. We have two intermediate pages. This first one knew about a lot of child pages. Looking at our second intermediate page with this DBCC PAGE command, it has fewer child pages it knows about. It only knows about 58.
This isn’t evenly distributed, there’s no balance to be maintained. [Intermediate pages] don’t have to evenly divide the children that they know about.
Our first line here, our first guidepost on this page is to a familiar value. This was the guidepost to this page on our root page. Suriya Sajjateerakool, apologies Suriya for mangling your name, she has data on this child page 158449.
The index leaf (level 0, always)
Let’s take a look at that child page. I’m going to plug that into this command, which we know our index has three levels. We’ve already looked at two levels, so this should be the leaf index, this should be the final level.
And sure enough, when I look at this page there’s only one row on this page. That’s because I did that dastardly trick when I created the index and I said with FILLFACTOR one. FILLFACTOR applies to the leaf of the index, it was only able to put one row on this page and it is for Suriya.
Notice, we are at the leaf of the index. There’s no child page for the leaf, we are at the lowest level of the index. In our tree, we’re at the leaf level, there’s nothing further, and we have more than just our key columns.
The leaf contains keys and includes
We have the three key columns, and they’re sorted, but we also have the columns identified as the included columns. We have MiddleName, the EmergencyContactName, as well as the Status.
The leaf level of the index contains the keys, and things are ordered by the keys, then whatever we defined as those included columns are also present here. If I want to run a query that says: ‘Okay, please give me the Status of the employee with this LastName and this FirstName and this EmployeeKey’, it can get me that status information without having to go look at any other structure. Doesn’t have to go check for anything in the base table, it is included in the leaf of my index.