Inside the Storage Engine: What’s in the buffer pool?

This is a quick post inspired by a question I was sent in email (thanks Marcos!) which very neatly lets me show a DMV I’ve been meaning to blog about for a while. And the weather here in Redmond really sucks right now so I can’t go outside – blogging will serve as my work-avoidance strategy this afternoon :-).

The (paraphrased) question is: A checkpoint is a process that writes all dirty pages to disk, and is per-database. So, if the data cache can hold a page from any database, how does checkpoint know which pages to check for a dirty status? Does it scan through buffer pool looking for pages for database X and process only those? Or is data cache somehow partitioned by database? I’d like to know a bit better how it works under the covers.

The answer is that pages are stored in buffers in the buffer pool (aka buffer cache or data cache), and the buffers are indeed hashed so they can easily be found by database. You can see what pages are currently in the buffer pool, and their status using the sys.dm_os_buffer_descriptors DMV in 2005:

SELECT * FROM sys.dm_os_buffer_descriptors;
database_id file_id  page_id  page_level  allocation_unit_id   page_type      row_count   free_space_in_bytes is_modified
----------- -------- -------- ----------- -------------------- -------------- ----------- ------------------- -----------
1           1        9        0           6488064              BOOT_PAGE      1           7362                0
1           1        6        0           6488064              DIFF_MAP_PAGE  2           6                   0
1           1        7        0           6488064              ML_MAP_PAGE    2           6                   0
1           1        104      0           262144               DATA_PAGE      100         4196                0
1           1        105      0           851968               DATA_PAGE      65          5041                0
1           1        106      0           262144               DATA_PAGE      197         413                 0
1           1        107      0           262144               DATA_PAGE      207         23                  0
1           1        108      1           262144               INDEX_PAGE     7           7949                0

I cut off the output rather than list all 3258 pages in the buffer pool on my laptop. The DMV gives you back some info from the pages themselves as well as you can see(remember all this is in memory so it’s quick to find).
I played around with the DMV a little bit and came up with a neat script that will tell you may many clean and dirty pages there are in the buffer pool per-database.

   (CASE WHEN ([is_modified] = 1) THEN N'Dirty' ELSE N'Clean' END) AS N'Page State',
   (CASE WHEN ([database_id] = 32767) THEN N'Resource Database' ELSE DB_NAME ([database_id]) END) AS N'Database Name',
   COUNT (*) AS N'Page Count'
FROM sys.dm_os_buffer_descriptors
   GROUP BY [database_id], [is_modified]
   ORDER BY [database_id], [is_modified];
Page State Database Name           Page Count
---------- ----------------------- ----------
Clean      master                  302
Dirty      master                  1
Clean      tempdb                  88
Dirty      tempdb                  52
Clean      model                   56
Clean      msdb                    622
Dirty      msdb                    5
Clean      adventureworks          110
Clean      DemoRestoreOrRepair     64
Clean      DBMaint2008             88
Clean      DemoFatalCorruption1    64
Clean      DemoFatalCorruption2    64
Clean      broken                  64
Clean      DemoFatalCorruption3    64
Clean      DemoCorruptMetadata     111
Clean      DemoDataPurity          88
Clean      SalesDB                 123
Clean      DemoNCIndex             88
Clean      shrinktest              88
Clean      DemoRestoreOrRepairCopy 64
Clean      DemoSuspect             64
Clean      FileHeaderTest          96
Clean      MultiFileDB             96
Clean      HA2008                  88
Clean      SalesDB_Snapshot        21
Clean      BootPageTest            88
Clean      Resource Database       599

Later this week I’ll try to blog a script that can tell you how much of a particular table is in memory ([Edit 12/8/11: that script is here]). Enjoy!

14 thoughts on “Inside the Storage Engine: What’s in the buffer pool?

  1. Paul,

    I like your script using the sys.dm_os_buffer_descriptors DMV.

    In the interest of readability, I revised it to show just one line per database, with both the clean and dirty page counts on the same line. It will usually have fewer result lines than the original version, and will show counts of zero if a given database didn’t have both clean and dirty pages in the buffer pool.

    The revised SQL statement is:

    (CASE WHEN ([database_id] = 32767) THEN ‘Resource Database’ ELSE DB_NAME (database_id) END) AS ‘Database Name’,
    SUM(CASE WHEN ([is_modified] = 1) THEN 1 ELSE 0 END) AS DirtyPageCount,
    SUM(CASE WHEN ([is_modified] = 1) THEN 0 ELSE 1 END) AS CleanPageCount
    FROM sys.dm_os_buffer_descriptors
    GROUP BY database_id
    ORDER BY DB_NAME(database_id)

    The resulting output for your example data would be (recreated manually – I couldn’t get tabs to work in this post – sorry!):

    Database Name DirtyPageCount CleanPageCount
    —————– ————– ————–
    Resource Database 0 599
    adventureworks 0 110
    BootPageTest 0 88
    broken 0 64
    DBMaint2008 0 88
    DemoCorruptMetadata 0 111
    DemoDataPurity 0 88
    DemoFatalCorruption1 0 64
    DemoFatalCorruption2 0 64
    DemoFatalCorruption3 0 64
    DemoNCIndex 0 88
    DemoRestoreOrRepair 0 64
    DemoRestoreOrRepairCopy 0 64
    DemoSuspect 0 64
    FileHeaderTest 0 96
    HA2008 0 88
    master 1 302
    model 0 56
    msdb 5 622
    MultiFileDB 0 96
    SalesDB 0 123
    SalesDB_Snapshot 0 21
    shrinktest 0 88
    tempdb 52 88

    I also changed the “Order By” clause to sort alphabetically by DB name (rather than DB ID).

    Scott R.

  2. I ran this script on a few production servers & noticed that tempdb seems to always be the DB with the most dirty pages. Even busy DBs tended to have only a few hundred pages dirty at max, whilst tempdb was often in the 10’s of thousands.

    Any explanation as to why this is the case? Are they not checkpointed perhaps?

  3. Hey Greg – tempdb does not have automatic checkpoints the same as user databases, as there’s no such thing as crash recovery of tempdb. As such, the only automatic checkpoint of tempdb is when the transaction log becomes 70% full. This is what’s causing the behavior you’re seeing. Cheers

  4. Hi Paul –

    Thanks for the script but the original question was regarding the checkpoint how does it know which are dirty pages.
    Does it scan all the pages or goes to the particular page because it is hashed.


  5. Paul, i runned the script and found some dbs showing clean page count with more than 10000 , does it mean the clean pages stored in the buffer pool causes the memory occupied for sql engine ?

  6. When I see lot of ‘dirtypages’ in different databases should I issue run ‘checkpoint’ to clears dirty pages? or what do I need to do? Please explain.

  7. Hello Paul,
    I’m facing with a question that drives me crazy:
    I try to figure out if plan cache (like other memory areas) are included in buffer pool.
    I consider the output of sys.dm_os_memory_clerks, where MEMORYCLERK_SQLBUFFERPOOL, MEMORYCLERK_SQLCONNECTIONPOOL, CACHESTORE_SQLCP seem to be distinct memory areas. From that output I understand that plan cache (CACHESTORE_SQLCP + CACHESTORE_OBJCP + CACHESTORE_PHDR + CACHESTORE_XPROC) is not included in MEMORYCLERK_SQLBUFFERPOOL because they are different clerk types.
    At the same time if I run DBCC FREEPROCCACHE, MEMORYCLERK_SQLBUFFERPOOL is not touched at all.
    The same I say about connection memory (MEMORYCLERK_SQLCONNECTIONPOOL) and others…
    Am I correct?

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.