(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)
I was working recently on a client’s corruption issue where I needed to know how to find which page an index row is on. The client was getting weird errors from DBCC CHECKDB around missing nonclustered index records, but SELECTs using that index on the supposedly missing key values would work. I suspected a stale read problem from the I/O subsystem as part of building the database snapshot that DBCC CHECKDB uses, but needed to prove which page the issue was on.
There’s a cool undocumented column that you can select called %%PHYSLOC%% that gives the page:record:slot of the selected record, and I’ve used it many times when working on corruption/data recovery issues for clients. It gives the information back in hex, so there’s a companion function called sys.fn_PhysLocCracker that formats the output nicely. Both of these have been in the product since SQL Server 2005.
Let me show you an example using the old AdventureWorks sample database.
Let’s say there’s a corruption issue where it says there’s a missing row for StateProvinceID = 1 and AddressID = 519 in the IX_Address_StateProvinceID nonclustered index (which is index ID 4) of the Person.Address table. If I want to prove that that row is NOT missing, here’s what I can do:
SELECT [StateProvinceID] , [AddressID] , [physloc].* FROM [Person].[Address] WITH (INDEX = 4) CROSS APPLY sys.fn_PhysLocCracker (%%physloc%%) AS [physloc] WHERE [StateProvinceID] = 1 AND [AddressID] = 519; GO
I’m selecting the index keys for the nonclustered index and using an index hint to force that index to be used.
And if I get any output, the row exists. In this case, I get:
StateProvinceID AddressID file_id page_id slot_id --------------- ----------- ----------- ----------- ----------- 1 519 1 8120 9
I’ve also used this to find rows that I need to bit-twiddle using DBCC WRITEPAGE to work around corruptions to allow data recovery from trashed databases – tedious, but possible if you know what you’re doing.
Bottom line: there are quite a few little undocumented columns and functions you can use to easily get internals information about SQL Server, with real-world applications.
9 thoughts on “The Curious Case of… how to find which page an index row is on”
This is awesome! Are there any other interesting hidden columns?
Yes, future blog posts… :-)
Excellent, can’t wait :)
I have often wondered how SQL Server keeps track of which row goes to which file when there are multiple files in a file group. So now, you shown how to find out which file the row is in. Is there a chance you could discuss how the engine tracks the information. It’s just an itch I’d like to scratch. I don’t imagine it would actually be useful to the average DBA (is that a contradiction?) or developer.
The row goes on whichever page it *has* to based on the key value – there’s no choice. If it’s going on a brand new page, new allocations are based on round-robin through the files in the filegroup, taking into account proportional fill.
I’ve had something similar on 2012 and I think there was a DMV or using a few DMV’s where I was able to figure out where corruption was.
I’ve only had to do it once 3-4 years ago and my memory is hazy of how I did it
I have read about Stale Reads in many places but found it hard to grasp. Please as always explain in details about this too as who is better than you in explaining in layman terms.
Much obliged as always sir.
It’s simple – the I/O subsystem returns old data rather than what’s been written by SQL Server. E.g. a page is read from disk at time T0. It’s changed in memory and written back to disk at time T1. It’s dropped from memory at time T2. It’s read back into memory at time T3, but the I/O subsystem returns the page as of time T0, not time T1.
Thank you Sir :)