Inside the Storage Engine: Anatomy of an extent

This one’s a quickie.

In the previous post I explained about database pages – their structure and some page types. Now I’d like to explain how pages are grouped into units called extents. An extent is a group of eight physically consecutive pages in a data file. Extents are always aligned on 64KB boundaries (i.e. 8-page boundaries), starting at the beginning of the file. Extents, and all their properties, are exactly the same in all versions, but their uses are not.

There are two types of extents: mixed extents and dedicated (or uniform) extents.

Mixed extents (versions prior to SQL Server 2016)

The first 8 pages that are allocated to any IAM chain are single-page allocations, which are called mixed pages. This means that only a single page is allocated to the IAM chain at a time, rather than a whole extent. The rationale behind this is to allow very small tables to take up the minimum amount of space. Once an IAM chain has passed the 8-page threshold and switched to allocating dedicated extents, it will never go back to allocating mixed pages again.

These mixed pages are allocated from mixed extents that are not allocated to any particular IAM chain. The extent is tracked as being allocated globally (by a GAM page), so no IAM chain can allocate it. If the mixed extent has any unallocated pages in it, it will also be tracked by an SGAM page. When a mixed page is required to be allocated, the SGAM pages are checked to see if any such extents are available. If not, a new mixed extent is allocated, a page is allocated from it and then the extent is tracked by the relevant SGAM page until all its pages are allocated.

As the mixed extent is not allocated to any particular IAM chain, this means that it may hold pages allocated to possibly 8 separate IAM chains. IAM pages themselves are always mixed pages, regardless of how many pages have been allocated to an IAM chain. This means a mixed extent may hold a variety of page types too,including IAM, data, index, or text pages.

You can use trace flag 1118 to disable this behavior, which is especially important to do if you have tempdb allocation bitmap contention.

In all these versions, IAM pages are also mixed pages.

Mixed extents (SQL Server 2016 onward)

The use of mixed extents is disabled by default, and the only pages that are mixed pages are IAM pages, and this cannot be disabled.

Dedicated/Uniform extents

In SQL Server 2016 onward, all allocations to an IAM chain are from dedicated extents (except the IAM chain’s IAM pages).

In versions prior to SQL Server 2016, unless trace flag 118 is enabled, the first 8 pages allocation to an IAM chain are mixed pages, and once the 8-page threshold is passed, all further allocations to an IAM chain are from dedicated extents (except the IAM chain’s IAM pages).

Dedicated extents means that an extent at a time is allocated to an IAM chain and marked as such in one of the IAM pages in the IAM chain – whichever one maps the GAM interval that the extent is in. The extent is also tracked as being allocated by the relevant GAM page so no other IAM chain can allocate it.

All pages from a dedicated extent must be allocated to the same IAM chain. They do not all need to be the same type, however. For instance, a clustered index has a mixture of data and index pages. Also, when an extent is allocated to an IAM chain the pages in the extent are not all allocated at once (except for certain bulk operations). The pages are usually allocated as needed. The allocation state of each page is tracked using a PFS page.

When all the pages from a dedicated extent are deallocated, the extent itself is deallocated from its owning IAM chain and is available again for allocation to any of IAM chain, or to become a mixed extent.

Tracking changes to extents for backup

Changes to extents are tracked in two places:

  • Any extent that has been changed since the last full backup will be tracked in the relevant differential bitmap page. This is how a differential backup knows which extents to backup instead of the whole database. All diff map pages are reset when the next full backup is taken.
  • Any extent that has been changed by a bulk-logged operation since the last log backup will be tracked in the relevant minimally-logged bitmap page. Any log backup that’s taken after a bulk-logged operation will also include all extents tracked this way. All ML map pages are reset when the next log backup is taken.

More on PFS, GAM, SGAM, ML, and DIFF pages in a not-too-distant future post.

6 thoughts on “Inside the Storage Engine: Anatomy of an extent

  1. For clarity Paul, “All diff map pages are reset when the next backup is taken.” is referring to a FULL backup, yes?

  2. In my DB, in a Filegroup the relationship between used extents vs total extents is 40%.
    How can that be improved?

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.