I had an interesting question yesterday in class about query execution plan control flow (from the root of the plan to the leaf level) and data flow (from leaf to root) and thought I would share a small demo I put together over lunch break to walk through the discussion.

Let’s start by dropping clean buffers (on a test system, please):

DBCC DROPCLEANBUFFERS;

USE [Credit];
GO

SELECT  COUNT(*) AS [page_count]
FROM    [sys].[dm_os_buffer_descriptors] AS bd
WHERE   [bd].[database_id] = DB_ID() AND
[bd].[allocation_unit_id] = 15045483298816;
GO

This returns 0 rows for the allocation_unit_id associated with the table we’re about to query from the Credit database:

SELECT  [charge].[charge_no]
FROM    [dbo].[charge]
ORDER BY [charge].[charge_no];
GO

The actual plan shows the following (via SQL Sentry Plan Explorer):

image thumb Control Flow vs. Data Flow Demo

Nothing fancy, just a Clustered Index Scan.  And in terms of page counts from sys.dm_os_buffer_descriptors, we see 9,303 data pages in cache now.

Now let’s drop the clean buffers and execute a query returning just the top 100 rows:

DBCC DROPCLEANBUFFERS;

USE [Credit];
GO

SELECT TOP 100
[charge].[charge_no]
FROM    [dbo].[charge]
ORDER BY [charge].[charge_no];
GO

The actual plan is as follows:

image thumb1 Control Flow vs. Data Flow Demo

Re-executing the query against sys.dm_os_buffer_descriptors, this time we see just 13 data pages.

The original question/discussion was with regards to the storage engine – and whether all data pages still get loaded into memory even with a TOP.  As we see in this scenario, this was not the case.