(Be sure to join our community to get our bi-weekly newsletter with exclusive content, demo videos, and other SQL Server goodies! Also check out our online training courses.)
One of my biggest hot-buttons is around shrinking data files. Although I used to own the shrink code while I was at Microsoft, I never had a chance to rewrite it so that data file shrink is a more palatable operation. I really don’t like shrink.
Now, don’t confuse shrinking the transaction log with shrinking data files. Shrinking the log is necessary if your log has grown out of control, or as part of a process to remove excessive VLF fragmentation (see Kimberly’s excellent posts on this here and here). However, shrinking the log should be a rare operation and should not be part of any regular maintenance you perform.
Shrinking of data files should be performed even more rarely, if at all. Here’s why – data file shrink causes *massive* index fragmentation. Let me demonstrate with a simple script you can run. The script below will create a data file, create a 10MB ‘filler’ table at the start of the data file, create a 10MB ‘production’ clustered index, and then analyze the fragmentation of the new clustered index.
USE [master];
GO
IF DATABASEPROPERTYEX (N'DBMaint2008', N'Version') IS NOT NULL
DROP DATABASE [DBMaint2008];
GO
CREATE DATABASE DBMaint2008;
GO
USE [DBMaint2008];
GO
SET NOCOUNT ON;
GO
-- Create the 10MB filler table at the 'front' of the data file
CREATE TABLE [FillerTable] (
[c1] INT IDENTITY,
[c2] CHAR (8000) DEFAULT 'filler');
GO
-- Fill up the filler table
INSERT INTO [FillerTable] DEFAULT VALUES;
GO 1280
-- Create the production table, which will be 'after' the filler table in the data file
CREATE TABLE [ProdTable] (
[c1] INT IDENTITY,
[c2] CHAR (8000) DEFAULT 'production');
CREATE CLUSTERED INDEX [prod_cl] ON [ProdTable] ([c1]);
GO
INSERT INTO [ProdTable] DEFAULT VALUES;
GO 1280
-- Check the fragmentation of the production table
SELECT
[avg_fragmentation_in_percent]
FROM sys.dm_db_index_physical_stats (
DB_ID (N'DBMaint2008'), OBJECT_ID (N'ProdTable'), 1, NULL, 'LIMITED');
GO
avg_fragmentation_in_percent ----------------------------- 0.390625
The logical fragmentation of the clustered index before the shrink is a near-perfect 0.4%.
Now I’ll drop the ‘filler’ table, run a shrink to reclaim the space, and re-analyze the fragmentation of the clustered index:
-- Drop the filler table, creating 10MB of free space at the 'front' of the data file
DROP TABLE [FillerTable];
GO
-- Shrink the database
DBCC SHRINKDATABASE ([DBMaint2008]);
GO
-- Check the index fragmentation again
SELECT
[avg_fragmentation_in_percent]
FROM sys.dm_db_index_physical_stats (
DB_ID (N'DBMaint2008'), OBJECT_ID (N'ProdTable'), 1, NULL, 'LIMITED');
GO
DbId FileId CurrentSize MinimumSize UsedPages EstimatedPages ----- ------- ------------ ------------ ---------- --------------- 6 1 1456 152 1448 1440 6 2 63 63 56 56 DBCC execution completed. If DBCC printed error messages, contact your system administrator. avg_fragmentation_in_percent ----------------------------- 99.296875
Wow! After the shrink, the logical fragmentation is almost 100%. The shrink operation *completely* fragmented the index, removing any chance of efficient range scans on it by ensuring the all range-scan readahead I/Os will be single-page I/Os.
Why does this happen? A data file shrink operation works on a single file at a time, and uses the GAM bitmaps (see Inside The Storage Engine: GAM, SGAM, PFS and other allocation maps) to find the highest page allocated in the file. It then moves it as far towards the front of the file as it can, and so on, and so on. In the case above, it completely reversed the order of the clustered index, taking it from perfectly defragmented to perfectly fragmented.
The same code is used for DBCC SHRINKFILE, DBCC SHRINKDATABASE, and auto-shrink – they’re equally as bad. As well as introducing index fragmentation, data file shrink also generates a lot of I/O, uses a lot of CPU, and generates *loads* of transaction log – as everything it does is fully logged.
Data file shrink should never be part of regular maintenance, and you should NEVER, NEVER have auto-shrink enabled. I tried to have it removed from the product for SQL 2005 and SQL 2008 when I was in a position to do so – the only reason it’s still there is for backwards compatibility. Don’t fall into the trap of having a maintenance plan that rebuilds all indexes and then tries to reclaim the space required to rebuild the indexes by running a shrink – that’s a zero-sum game where all you do is generate a log of transaction log for no actual gain in performance.
So what if you *do* need to run a shrink? For instance, if you’ve deleted a large proportion of a very large database and the database isn’t likely to grow, or you need to empty a file before removing it?
The method I like to recommend is as follows:
- Create a new filegroup
- Move all affected tables and indexes into the new filegroup using the CREATE INDEX … WITH (DROP_EXISTING = ON) ON syntax, to move the tables and remove fragmentation from them at the same time
- Drop the old filegroup that you were going to shrink anyway (or shrink it way down if its the primary filegroup)
Basically you need to provision some more space before you can shrink the old files, but it’s a much cleaner mechanism.
If you absolutely have no choice and have to run a data file shrink operation, be aware that you’re going to cause index fragmentation and you should take steps to remove it afterwards if it’s going to cause performance problems. The only way to remove index fragmentation without causing data file growth again is to use DBCC INDEXDEFRAG or ALTER INDEX … REORGANIZE. These commands only require a single 8KB page of extra space, instead of needing to build a whole new index in the case of an index rebuild operation.
Bottom line – try to avoid running data file shrink at all costs!
30 Responses to Why you should not shrink your data files
Wow, that’s a fairly radical suggestion–moving all tables into a new file group instead of running a shrink.
Not quite sure, but I get the vague feeling you don’t like shrink ;)
Is that CREATE INDEX … WITH (DROP_EXISTING) syntax supported on SQL 2000?
I know you’re a busy man, but what about writing a 3rd party utility that does a "smart shrink" without fragmenting indexes?
You can work on that right after you finish your "run DBCC CHECKDB on a database backup" utility :)
Brad,
>>Is that CREATE INDEX … WITH (DROP_EXISTING) syntax supported on SQL 2000?
Yes, it is supported. Here is the BOL link. http://msdn.microsoft.com/en-us/library/aa258260(SQL.80).aspx
>>Not quite sure, but I get the vague feeling you don’t like shrink ;)
Easy one here. Paul doesn’t like SHRINK and a lot of DBA’s also don’t like that. You may want to catch up his previous post (http://www.sqlskills.com/blogs/paul/post/Auto-shrink-e28093-turn-it-OFF!.aspx)
>>what about writing a 3rd party utility that does a "smart shrink" without fragmenting indexes?
Never heard about this. Looks like a disk defragmenter kind of stuff; it may remove external fragmentation of
the files but shouldn’t run this as regular maintenance. Keep the files (data & log) as large as possible and not allow auto growth to kick in. But keep the auto growth turned on just in case.
Paul, hope you don’t mind this.
Brad, DROP_EXISTING exists on SQL 2000
I’m not sure how to apply this in a data-warehouse environment, where most of my slack space is caused by the process of creating the clustered index. Say I create a new database and bulk-insert a 250 gig table into it. I then build a clustered index and it swells to have 200 gig of free space when it’s done. SORT_IN_TEMP_DB helps, but it’s not always an option. Where does the sort occur when the table is moved to a new filegroup?
Hi Paul,
Thanks for this post. I have many databases in my enviroment with Auto_Shrink option set on. This post make me write a script to revert this situation. Follow below..(works well in SQL 2000, i dont test in 2005+)
DECLARE @DataBase sysname
DECLARE database_cursor CURSOR
FOR SELECT
quotename(name) as Base
FROM
master..sysdatabases
WHERE
DATABASEPROPERTYEX(name,’Status’)=’ONLINE’
AND DATABASEPROPERTYEX(name,’IsAutoShrink’) = 1
OPEN database_cursor
FETCH NEXT
FROM database_cursor
INTO @Base
WHILE @@FETCH_STATUS = 0
BEGIN
EXEC (‘ALTER DATABASE ‘ + @BASE + ‘ SET AUTO_SHRINK OFF’)
FETCH NEXT
FROM database_cursor
INTO @Base
END
CLOSE database_cursor
DEALLOCATE database_cursor
GO
will this work for a msdb database in SQL2005? by having a new file?
No – it needs to be a new filegroup – and you can’t get rid of the primary filegroup (although you could shrink it down after moving some of the larger objects to the new filegroup)
Paul, First of all, thanks for the great blog.
I get it that shrink ops cause massive index fragmentation. It appeared that prior versions of shrinkfile/db single thread operations. Is that still the case? I currently have a shrinkfile running on a mid sized data file (100gb) that has been running for quite a few hours – I’m trying to find the bottleneck. The machine is dedicated to running this shrink. Via perfmon, I’m seeing a lot of activity in the SQLServer:Databases Shrink Data Movements Bytes/Sec, Log Flushes/sec, and Log Bytes Flushed/sec. I see usage across all processors (4) – typically not more than 5% or so. There is no paging going on and disk queue lengths are consistenly below 1.
Yes, shrink is always single-threaded – which of course contributes to the long run-time. In your case, it could be that shrink is waiting for locks, it could be moving LOB data or heap data (which is very slow). Shrink’s just not a fast beast.
Thanks
[...] Why you should not shrink your data files [...]
[...] Grrr. See Why you should not shrink your data files for why shrink is bad. Auto-shrink is even worse. I tried to have it removed durign SQL 2005 [...]
[...] out this blog post where I go into more details and explain an alternative to using shrink: Why you should not shrink your data files and this one for how to manage your data files correctly: Importance of data file size [...]
[...] file shrinking. Just this morning I wrote a long blog post about this – see Why you should not shrink your data files. Running data file shrink causes index fragmentation, uses lots of resources, and the vast majority [...]
Understand the anomaly of Shrink, but what is the best way to move tables with its index from primary group at one go. i mean no manual drop and create. script which does it automatically for each tables in the the file group.
You’ll need to use CREATE INDEX .. WITH (DROP_EXISTING = ON) and specify the destination filegroup. You can come up with a script to do it easily.
Hello,
I’m seeking to understand if the above discussion also applies to Sql Server Compact 3.5. I’m dealing with a mobile application whose privately deployed DB should be capped at about 200MB. The plan was to programmatically purge old data as it becomes stale, followed by the shrink operation described here: http://msdn.microsoft.com/en-us/library/ms171752(v=sql.100)
Since Sql CE is a file-based DB that apparently doesn’t support clustered indexes, I’m in doubt as to whether the above discussion applies. At the same time, performance degradation is a top concern. Could someone share their knowlege?
I know nothing about Compact Edition at all I’m afraid, so can’t answer your question.
i don’t want to recover my deleted data n sql server 2008,,,….? so it’s happen with shrink option .. if yes the how can i solve it…
please help me on this topic i have such a problem in my current project..
my email id: araniyabipin@gmail.com / lotus.websolution@gmail.com
Are you asking how to get data back about it’s been deleted and a shrink occurs? If so the answer is you restore from a backup.
[...] [...]
[...] As always, there are trade-offs. Shrinking data files is generally frowned upon as it causes high levels of fragmentation. It’s also very slow, and LOB data makes it even slower. Still, fragmentation can be solved [...]
[...] As always, there are trade-offs. Shrinking data files is generally frowned upon as it causes high levels of fragmentation. It’s also very slow, and LOB data makes it even slower. Still, fragmentation can be solved [...]
I’m new to all of this, so i’d like to know what can I do to fix the fragmentation that weekly shrinking (over many years, apparently) has caused in my SQL 2005 database. Coincidentally, this is the only database that was set up this way and it is the only one where users report huge problems with performance.
You’ll need to defrag the indexes to remove the fragmentation. Check out http://technet.microsoft.com/en-us/magazine/2008.08.database.aspx
Has this now been fixed in 2008 R2? I ran your above test and avg_fragmentation_in_percent comes out at 0.
Nope – the repro works perfectly on R2 – just ran it to make doubly sure.
Interesting is there any other reason I would be getting 0?
Which result are you getting 0 for, the first one of the second one? Try a select * from the DMV and see if it gives meainingful results.
Both the post and pre shrink results are 0.