Data Page Count Influence on the Query Execution Plan

In my post Row and Page Count Influence on Estimated CPU and I/O Cost I demonstrated how I/O cost of a clustered index scan had sensitivity to page counts but not row counts. For this post I’ll lay out a direct connection between data page counts and the query execution plan that gets generated.

To illustrate the scenario, I created a table in the Credit database based on the charge table and I added two indexes, one clustered and one nonclustered:

 
USE [Credit];
GO

SELECT TOP 575000
[charge_no],
[member_no],
[provider_no],
[category_no],
[charge_dt],
[charge_amt],
[statement_no],
[charge_code]
INTO [dbo].[charge_demo]
FROM [dbo].[charge];
GO

CREATE CLUSTERED INDEX [charge_demo_charge_no]
ON [dbo].[charge_demo] ([charge_no]);
GO

CREATE NONCLUSTERED INDEX [charge_demo_charge_amt]
ON [dbo].[charge_demo] ([charge_amt])
INCLUDE ([member_no])
WITH (FILLFACTOR = 100);
GO

Next, I checked the data page counts by index for this new 575,000 row table:

SELECT  [index_id],
[in_row_data_page_count]
FROM    [sys].[dm_db_partition_stats]
WHERE   [object_id] = OBJECT_ID('dbo.charge_demo');
GO

The clustered index has 3,426 data pages and the nonclustered index has 1,567 data pages.

Next I looked at the execution plan for the following query:

SELECT [member_no],
SUM([charge_amt]) AS [charge_amt]
FROM [dbo].[charge_demo]
WHERE [charge_amt] &gt; 0
GROUP BY [member_no]
OPTION (RECOMPILE);
GO

The query execution plan (via SQL Sentry Plan Explorer) was as follows:

The overall estimated subtree cost for the plan ended up being 4.6168.

Next, I rebuilt the nonclustered index using a very low fill factor (far lower than I would ever recommend, but I was doing this to demonstrate the placement of the same number of rows over many more pages than the original default fill factor):

CREATE NONCLUSTERED INDEX [charge_demo_charge_amt]
ON [dbo].[charge_demo] ([charge_amt])
INCLUDE ([member_no])
WITH (FILLFACTOR = 1, DROP_EXISTING = ON);
GO

The clustered index still has 3,426 data pages (since we didn’t change it), but now the nonclustered index has 143,753 data pages instead of the original 1,567 data pages. And again, this represents the same 575,000 row count. Re-executing the original test query, I saw the following changed plan:

The overall estimated subtree cost for the plan increased to 54.3065 with a few other significant changes as well. The second plan switched to using a clustered index scan instead of a nonclustered index seek. Also, the second plan uses a stream aggregate with an “injected” sort operation, instead of the original plan’s hash match aggregate operation.

5 thoughts on “Data Page Count Influence on the Query Execution Plan”

Lonny Niederstadt says:

April 14, 2013 at 7:49 pm

Good stuff. So, when an index has significantly less than expected use, evaluate: fragmentation, fit to query candidates(key columns/included columns), and page count comparison to the alternative path selected by the plan?
1. Joseph Sack says:
  
  April 15, 2013 at 6:34 am
  
  Thanks Lonny. That’s a good list of considerations. And I think the overall takeaway is that fragmentation doesn’t just impact I/O and memory, but also impacts plan choices.
Neeraj says:

October 21, 2013 at 11:48 pm

thanks the the post,
i have a question here
does fragmentation also affect the query plan?
1. Joseph Sack says:
  
  October 22, 2013 at 6:07 am
  
  Hi Neeraj,
  
  Fragmentation can inflate the page count – and the query optimizer looks at page count for some operators, so yes. But the QO doesn’t look directly at fragmentation as part of the set of properties considered…
  
  Joe

Comments are closed.

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

Data Page Count Influence on the Query Execution Plan

5 thoughts on “Data Page Count Influence on the Query Execution Plan”

Other articles

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 2″

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 1”

SQLIntersection Post-Conference Session

Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator

Finding Timeout-Prone Code with max_elapsed_time

Deck from “Practical SQL Server Cardinality Estimation”

Imagine feeling confident enough to handle whatever your database throws at you.