sqlskills-logo-2015-white.png

Data Page Count Influence on the Query Execution Plan

In my post Row and Page Count Influence on Estimated CPU and I/O Cost I demonstrated how I/O cost of a clustered index scan had sensitivity to page counts but not row counts.  For this post I’ll lay out a direct connection between data page counts and the query execution plan that gets generated.

To illustrate the scenario, I created a table in the Credit database based on the charge table and I added two indexes, one clustered and one nonclustered:

 
USE [Credit];
GO

SELECT TOP 575000
[charge_no],
[member_no],
[provider_no],
[category_no],
[charge_dt],
[charge_amt],
[statement_no],
[charge_code]
INTO [dbo].[charge_demo]
FROM [dbo].[charge];
GO

CREATE CLUSTERED INDEX [charge_demo_charge_no]
ON [dbo].[charge_demo] ([charge_no]);
GO

CREATE NONCLUSTERED INDEX [charge_demo_charge_amt]
ON [dbo].[charge_demo] ([charge_amt])
INCLUDE ([member_no])
WITH (FILLFACTOR = 100);
GO
 

Next, I checked the data page counts by index for this new 575,000 row table:

SELECT  [index_id],
[in_row_data_page_count]
FROM    [sys].[dm_db_partition_stats]
WHERE   [object_id] = OBJECT_ID('dbo.charge_demo');
GO

The clustered index has 3,426 data pages and the nonclustered index has 1,567 data pages.

Next I looked at the execution plan for the following query:

SELECT [member_no],
SUM([charge_amt]) AS [charge_amt]
FROM [dbo].[charge_demo]
WHERE [charge_amt] > 0
GROUP BY [member_no]
OPTION (RECOMPILE);
GO
 

The query execution plan (via SQL Sentry Plan Explorer) was as follows:

image

The overall estimated subtree cost for the plan ended up being 4.6168.

Next, I rebuilt the nonclustered index using a very low fill factor (far lower than I would ever recommend, but I was doing this to demonstrate the placement of the same number of rows over many more pages than the original default fill factor):

CREATE NONCLUSTERED INDEX [charge_demo_charge_amt]
ON [dbo].[charge_demo] ([charge_amt])
INCLUDE ([member_no])
WITH (FILLFACTOR = 1, DROP_EXISTING = ON);
GO
 

The clustered index still has 3,426 data pages (since we didn’t change it), but now the nonclustered index has 143,753 data pages instead of the original 1,567 data pages.  And again, this represents the same 575,000 row count. Re-executing the original test query, I saw the following changed plan:

image

The overall estimated subtree cost for the plan increased to 54.3065 with a few other significant changes as well.  The second plan switched to using a clustered index scan instead of a nonclustered index seek.  Also, the second plan uses a stream aggregate with an “injected” sort operation, instead of the original plan’s hash match aggregate operation.

5 thoughts on “Data Page Count Influence on the Query Execution Plan

  1. Good stuff. So, when an index has significantly less than expected use, evaluate: fragmentation, fit to query candidates(key columns/included columns), and page count comparison to the alternative path selected by the plan?

    1. Thanks Lonny. That’s a good list of considerations. And I think the overall takeaway is that fragmentation doesn’t just impact I/O and memory, but also impacts plan choices.

    1. Hi Neeraj,

      Fragmentation can inflate the page count – and the query optimizer looks at page count for some operators, so yes. But the QO doesn’t look directly at fragmentation as part of the set of properties considered…

      Joe

Comments are closed.

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.