SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.
Here’s a question I’m often asked (paraphrasing):
I’ve got a large heap table where the space isn’t being given up when I delete a large number of records, but then when I shrink the database the heap is reduced in size. Can you explain?
This behavior is how SQL Server works, but it’s pretty non-intuitive. When a page in an index becomes empty, it’s always deallocated, as an empty page isn’t allowed in a SQL Server index structure from SQL Server 2005 onward. However, the structure of a heap is different and as a result, the behavior is too.
Whenever a row is deleted in a heap, it’s usually the case that the page containing the row does not become empty. However, if the heap page that the row is stored on becomes empty as a result of the delete, the page cannot be deallocated from the table unless an exclusive (X) table lock is held (to remove the page from the internal free-space “tracking” mechanism that is in place for heaps). It’s not common that a table level X lock is held unless lock escalation has occurred (because you’re deleting enough rows to trigger escalation from individual row X locks to a single table X lock), or if you specifically use the TABLOCK hint on the delete statement, for instance. But, because both of these circumstances are unlikely, the empty heap page usually cannot be deallocated.
There is a Knowledge Base article that describes this phenomenon: KB 913399. However, the KB article only references up to and including SQL Server 2005 but this behavior exists in all version and is very easy to reproduce if you want to prove it to yourself. Also, if you’re using one of the flavors of snapshot isolation then even using TABLOCK won’t allow the pages to be deallocated.
The script below will create a database, create a table with one row per heap page, show you how many pages and rows, and how full each page is. Next it’ll delete all the rows, and show you the pages are still there, with no rows, and empty. Give it a try!
CREATE DATABASE [EmptyHeapTest]; GO USE [EmptyHeapTest]; GO CREATE TABLE [test] ([c1] INT IDENTITY, [c2] CHAR (8000) DEFAULT 'a'); GO SET NOCOUNT ON; GO -- Insert 1000 rows, which will allocate 1000 pages INSERT INTO [test] DEFAULT VALUES; GO 1000 -- How many pages and how full? SELECT [page_count], [record_count], [avg_page_space_used_in_percent] FROM sys.dm_db_index_physical_stats (DB_ID (N'EmptyHeapTest'), OBJECT_ID (N'test'), 0, DEFAULT, 'DETAILED'); GO
page_count record_count avg_page_space_used_in_percent -------------------- -------------------- ------------------------------ 1000 1000 98.974549048678
-- Delete all the rows DELETE FROM [test]; GO -- Did all the pages get deleted? SELECT [page_count], [record_count], [avg_page_space_used_in_percent] FROM sys.dm_db_index_physical_stats (DB_ID (N'EmptyHeapTest'), OBJECT_ID (N'test'), 0, DEFAULT, 'DETAILED'); GO
page_count record_count avg_page_space_used_in_percent -------------------- -------------------- ------------------------------ 1000 0 0
The empty pages will be reused by subsequent inserts (into the same table, of course), but if the space isn’t going to be reused following a large delete in a heap, you might consider using the TABLOCK hint to allow the empty pages to be deallocated and the space made available for other objects in the database to use.
Another alternative is to just use a clustered index instead, or if a heap is necessary, you could rebuild the heap using ALTER TABLE … REBUILD (that was added in SQL Server 2008 to support enabling compression on a heap), with the caveat that this will cause all the table’s nonclustered indexes to be rebuilt.
On the extreme end (in my opinion), you could reclaim the empty heap space using a shrink operation. Shrink won’t free up space inside pages as it moves them (with the exception of compacting LOB pages as it goes – somewhat unsuccessfully depending on which version and build you’re on – see KB 2967240), but it will remove empty pages rather than moving them. This will effectively shrink the heap after a large delete, but with the usual caveats about shrink causing index fragmentation and generally being an expensive, slow operation to perform.
So now you know why you may have a bunch of empty pages in your heaps!
5 thoughts on “SQLskills SQL101: Why does my heap have a bunch of empty pages?”
You stated “The empty pages will be reused by subsequent inserts”.
My understanding was that following a delete from a heap, if a table lock is not taken either via lock escalation or by explicitly using a table lock query hint, then the empty pages are not deallocated and other objects in the database cannot reuse the empty pages. But you’re saying the empty pages can be reused by subsequent inserts?
What would you recommend as the most optimal way to find heaps that are possibly effected by this behaviour and have potentially lots of unallocated pages and wasted space? For example, if you had a 200GB heap, using sys.dm_db_index_physical_stats could take a lot of time and resources so I’m wondering what other options are available? Could you use sys.dm_db_partition_stats to potentially show this?
Subsequent inserts into the same table. I believe you should see this in the difference between reserved and used pages.
Thanks for your very interesting blog
From my experience if the transaction isolation level is RCSI then deleting from heap table with TABLOCK will not lead to deallocate page, and they still allocation on table
Indeed – as the rows need to be kept around until the version store is cleaned up, and then the pages can’t be deallocated, as per usual.