Does Index Fragmentation Matter with SSD’s?

Two weeks ago at SQL Connections in Orlando, FL, I got to participate in a session that Paul and Kimberly do at the end of the conference titled "Follow the Rabbit." The premise of the session is that Paul and Kimberly throw a big list of topics up on the screen, anyone in the audience can ask any question they have about one of those topics, and Paul and Kimberly try to answer it. I wasn't the only person answering questions in this session; Maciej Pilecki, another MVP who also recently passed the Microsoft Certified Master exam for SQL Server 2008, answered a number of questions as well. One of my favorite questions asked during the session was, "Does index fragmentation matter with SSDs anymore?" Paul's answer was very practical and dealt with the wasted space that excessive index fragmentation can cause in a database that uses a uniqueidentifier column with NEWID() as the primary key; given the cost per gigabyte of SSD storage, that can be quite significant. Paul made a couple of other points that escape me, primarily because my mind immediately started churning through ideas for how to test the impact that index fragmentation actually has on a database.

It should be no surprise that this blog post is going to use Extended Events; it has, after all, been my favorite feature in SQL Server since it first appeared in an early CTP of SQL Server 2008. One of the sessions I presented at SQL Connections last month was on Extended Events, and I made pretty heavy use of the I/O-related events as part of that session. I also made heavy use of the I/O-related events back in December for a couple of my XEvent a Day blog posts, so I knew I could get detailed information back from the system about what kind of impact fragmentation really has on a database. And since I have an SSD in my laptop, which, while not enterprise class, is more than worthy of running the tests on, I thought I'd take a whirl at figuring out how much impact fragmentation really has.

When I began working on this problem, I started out using Windows Server 2008 R2 and SQL Server 2008 R2, primarily because that is my standard VM platform for the Immersion training and for presentations, and it was immediately available. I originally intended to use the I/O size information from the file_read_completed event to aggregate the I/O sizes being read from disk, but I soon learned that this information is not available in SQL Server 2008 as I had originally thought. I knew I had used it in the past, so I went back to my XEvent a Day blog series and found it in An XEvent a Day (29 of 31) – The Future – Looking at Database Startup in Denali.

Bummer!

I happened to have the old Denali CTP1 VM that I used to create that blog post on an external hard disk, so I decided to make use of it for this investigation. I had originally hoped to create a reproducible demo that anyone could use in their SQL Server 2008 environment to see the impact of fragmentation, but since that is not possible, this post is based on the information available through SQL Server Denali instead.

The first thing we need is a database with two tables that contain identical data: one fragmented and the other not. To create this, we'll use a uniqueidentifier with NEWID() as the primary key for the first table and a uniqueidentifier with NEWSEQUENTIALID() as the primary key for the second table. Then we'll load approximately 1,000 pages worth of data into the first table, copy that data into the second table, and rebuild the indexes on the second table to remove any residual fragmentation.

CREATE DATABASE FragmentationTest
GO
USE FragmentationTest
GO
-- Table that will become heavily fragmented: random GUIDs from NEWID()
-- as the clustered primary key cause page splits throughout the index
CREATE TABLE GuidHighFragmentation
(UniqueID UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
 FirstName nvarchar(50) NOT NULL,
 LastName nvarchar(50) NOT NULL)
GO
 
CREATE NONCLUSTERED INDEX IX_GuidHighFragmentation_LastName
ON GuidHighFragmentation(LastName)
GO
 
-- Table that will stay largely unfragmented: NEWSEQUENTIALID() generates
-- ever-increasing GUIDs, so new rows are appended to the end of the index
CREATE TABLE GuidLowFragmentation
(UniqueID UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
 FirstName nvarchar(50) NOT NULL,
 LastName nvarchar(50) NOT NULL)
GO
 
CREATE NONCLUSTERED INDEX IX_GuidLowFragmentation_LastName
ON GuidLowFragmentation(LastName)
GO
 
-- Load roughly 1,000 pages of data into the high-fragmentation table
-- (1,000 rows per batch, repeated 70 times by the GO 70 batch separator)
INSERT INTO GuidHighFragmentation (FirstName, LastName)
SELECT TOP 1000
    a.name, b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.name IS NOT NULL 
    AND b.name IS NOT NULL
ORDER BY NEWID()
GO 70
 
-- Copy the same data into the low-fragmentation table...
INSERT INTO GuidLowFragmentation (FirstName, LastName)
SELECT FirstName, LastName
FROM GuidHighFragmentation
GO
 
-- ...and rebuild its indexes to remove any residual fragmentation
ALTER INDEX ALL ON GuidLowFragmentation REBUILD

With our tables built, we can validate the fragmentation levels by querying the sys.dm_db_index_physical_stats() DMF:

SELECT 
    OBJECT_NAME(ps.object_id) AS table_name,
    i.name,
    ps.index_id,
    ps.index_depth,
    avg_fragmentation_in_percent,
    fragment_count,
    page_count,
    avg_page_space_used_in_percent,
    record_count
FROM sys.dm_db_index_physical_stats(
        DB_ID(), 
        NULL, 
        NULL, 
        NULL, 
        'DETAILED') AS ps
JOIN sys.indexes AS i
    ON ps.object_id = i.object_id
        AND ps.index_id = i.index_id
WHERE index_level = 0

[Screenshot: sys.dm_db_index_physical_stats output showing the fragmentation levels, page counts, and record counts for both tables]

Next we will create our event session to capture the I/O events that are related to physical reads from disk.  The event session is being created using dynamic SQL so that each of the events has a predicate on the sqlserver.database_id for the current database.

-- Create an Event Session to investigate our IO operations
IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='FragmentationEffect')
    DROP EVENT SESSION [FragmentationEffect] ON SERVER;
DECLARE @sqlcmd nvarchar(4000) = '
CREATE EVENT SESSION FragmentationEffect
ON SERVER
ADD EVENT sqlserver.sql_statement_starting
( ACTION (sqlserver.sql_text)),
ADD EVENT sqlserver.sql_statement_completed
( ACTION (sqlserver.sql_text)),
ADD EVENT sqlserver.file_read
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_read_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.physical_page_read
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_requested
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+'))
ADD TARGET package0.asynchronous_file_target(
     SET filename=''C:\SQLskills\EE_FragmentationEffect.xel'',
         metadatafile=''C:\SQLskills\EE_FragmentationEffect.xem'')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, 
      TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY = 5 SECONDS)'
EXEC (@sqlcmd)
GO

This event session is configured with MAX_MEMORY set to 8MB to account for the partitioning of the memory buffers and to ensure there is ample buffer space for the session, mitigating the potential for event loss. The EVENT_RETENTION_MODE for the session cannot be configured as NO_EVENT_LOSS because the sqlserver.physical_page_read event is part of the session. TRACK_CAUSALITY is turned ON so that events can be correlated, tying the I/O operations back to the statement that generated them, and MAX_DISPATCH_LATENCY is set to five seconds because I am impatient and don't want to wait the default thirty seconds for the events to be dispatched to the targets while doing a demo.
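If you want to verify how those options were actually stored before starting the session, the server-level catalog views expose them. This isn't part of the original demo, just a quick sanity check; note that max_dispatch_latency is reported in milliseconds and max_memory in kilobytes.

-- Optional sanity check: confirm the session options were applied as intended
SELECT 
    name,
    max_memory,                 -- KB
    event_retention_mode_desc,
    max_dispatch_latency,       -- milliseconds
    track_causality
FROM sys.server_event_sessions
WHERE name = 'FragmentationEffect'
GO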

Prior to starting the event session, a manual CHECKPOINT will be issued against the database to flush any dirty pages from the buffer cache to disk, and then the buffer cache will be cleared using DBCC DROPCLEANBUFFERS to ensure that the test statements have to physically read the data from disk into cache.

-- Issue checkpoint to flush dirty buffers to disk
CHECKPOINT
GO
 
-- Clear the Buffer Cache to force reads from Disk 
DBCC DROPCLEANBUFFERS 
GO 

With this completed, we can now start our event session and run the same query against each table to force a scan of all the pages in the table from disk, then stop the event session so that our target file only has events associated with this specific test in it.

-- Start the Event Session
ALTER EVENT SESSION FragmentationEffect
ON SERVER
STATE=START
GO
 
-- Aggregate the data from both tables 
SELECT LastName, COUNT(*)
FROM GuidLowFragmentation
GROUP BY LastName
GO        
SELECT LastName, COUNT(*)
FROM GuidHighFragmentation
GROUP BY LastName
GO
 
-- Wait for the events to dispatch to the target
WAITFOR DELAY '00:00:10'
GO
 
-- Stop the Event Session
ALTER EVENT SESSION FragmentationEffect
ON SERVER
STATE=STOP
GO

Now that we have our data, we need to read it from the target file and break the event data down into columns to simplify analysis. To do this, we'll first load the event data as XML into a staging table, because trying to parse the XML while reading directly from the file through the DMF is incredibly slow by comparison. Then we can shred the XML into a tabular format based on the events being collected.

-- Drop Results tables if they exist
IF OBJECT_ID('FragmentationEffectResults') IS NOT NULL
    DROP TABLE FragmentationEffectResults 
GO
IF OBJECT_ID('FragmentationEffectResultsParsed') IS NOT NULL
    DROP TABLE FragmentationEffectResultsParsed 
GO
 
-- Create results table to load data from XE files
CREATE TABLE FragmentationEffectResults
(RowID int identity primary key, event_data XML)
GO
 
-- Load the event data from the file target
INSERT INTO FragmentationEffectResults(event_data)
SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\SQLskills\EE_FragmentationEffect*.xel', 
                                     'C:\SQLskills\EE_FragmentationEffect*.xem', 
                                     null, null)
GO
 
-- Parse the event data
SELECT 
    RowID,
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
             event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/data[@name="mode"]/text)[1]', 'nvarchar(4000)') AS [mode],
    event_data.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(4000)') AS [file_handle],
    event_data.value('(event/data[@name="offset"]/value)[1]', 'bigint') AS [offset],
    event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
    event_data.value('(event/data[@name="file_group_id"]/value)[1]', 'int') AS [file_group_id],
    event_data.value('(event/data[@name="size"]/value)[1]', 'bigint') AS [size],
    event_data.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(100)') AS [wait_type],
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS [sql_text],
    event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
    event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
    event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
INTO FragmentationEffectResultsParsed
FROM FragmentationEffectResults
ORDER BY RowID

This is where the real fun begins. There are too many events in the result set for it to be useful in raw form, so I decided to locate the activity_id from TRACK_CAUSALITY for the sqlserver.sql_statement_starting event of each test query and store those values in variables for use when querying the results. The first thing I looked at was how many times each of the events actually fired for each table.

DECLARE @FragmentationHighActivityID varchar(50),
        @FragmentationLowActivityID varchar(50)
        
SELECT @FragmentationHighActivityID = activity_id
FROM FragmentationEffectResultsParsed
WHERE event_name = 'sql_statement_starting'
  AND sql_text LIKE '%GuidHighFragmentation%'
 
SELECT @FragmentationLowActivityID = activity_id
FROM FragmentationEffectResultsParsed
WHERE event_name = 'sql_statement_starting'
  AND sql_text LIKE '%GuidLowFragmentation%'
 
-- Aggregate the results by the event name and file_id
SELECT event_name, 
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END as QueryTable, 
    COUNT(*) AS occurrences
FROM FragmentationEffectResultsParsed
WHERE activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
  AND event_name IN ('async_io_completed', 'async_io_requested', 
        'file_read', 'file_read_completed', 'physical_page_read')
GROUP BY event_name, activity_id
ORDER BY activity_id, event_name

[Screenshot: event counts by event name for each table in the first test run]

Impressive! SQL Server had to perform ten times as many I/O operations against the fragmented table as it did against the non-fragmented table. You might notice in the output that the number of physical_page_read events is larger than the number of pages in the indexes for each table. That is expected: I restricted the output of sys.dm_db_index_physical_stats to just level 0 of the indexes, the leaf level. There are two additional levels in the nonclustered indexes being scanned, but they only account for nine additional pages for the fragmented index and five additional pages for the non-fragmented index. The remaining pages come from the system metadata tables, which the engine also has to read, and this can be confirmed by reading the parsed event data in the dbo.FragmentationEffectResultsParsed table.
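A query along the following lines (not part of the original demo, and reusing the activity_id variables declared above, so it needs to run in the same batch as those declarations) lists the distinct pages behind the physical_page_read events for each statement; the page_ids that don't belong to the two indexes map back to the system metadata pages.

-- Hypothetical follow-up query: list the distinct pages read by each test
-- statement (run in the same batch as the activity_id variable declarations)
SELECT 
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END as QueryTable, 
    file_id, 
    page_id,
    COUNT(*) AS reads_of_page
FROM FragmentationEffectResultsParsed
WHERE event_name = 'physical_page_read'
  AND activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
GROUP BY activity_id, file_id, page_id
ORDER BY QueryTable, file_id, page_id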

Where the impact gets really interesting is when we look at the I/O sizes for the file_read_completed events, aggregating the results by table and by the I/O size being read from disk, using the same activity_id variables from the previous query.

-- Aggregate the file_read_completed events by the IO size and table
SELECT
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END as QueryTable, 
    size/1024 as read_size_kb, 
    size/1024/8 as read_size_pages, 
    COUNT(*) AS occurrences
FROM FragmentationEffectResultsParsed
WHERE activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
  AND event_name IN ('file_read_completed')
GROUP BY database_id, file_id, size, activity_id
ORDER BY QueryTable, size desc
GO

[Screenshot: file_read_completed counts grouped by read size for each table in the first test run]

I suspected that fragmentation would hurt the database engine's ability to do large sequential reads while scanning the index, but I was not expecting the impact to be as bad as it actually was. I ran this test numerous times to rule out some kind of fluke, and the results are consistently repeatable with relatively little variation. The smallest read against the non-fragmented index was 64KB, while the fragmented table did the majority of its I/O with reads of 64KB or smaller.
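Another way to slice the same data (not shown in the original output, and again reusing the activity_id variables from the earlier batch) is to total the bytes read per statement, which makes the difference in overall read volume and average read size explicit.

-- Hypothetical summary query: total and average read size per statement
SELECT
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END as QueryTable, 
    COUNT(*) AS read_requests,
    SUM(size)/1024 AS total_read_kb,
    AVG(size)/1024 AS avg_read_size_kb
FROM FragmentationEffectResultsParsed
WHERE activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
  AND event_name = 'file_read_completed'
GROUP BY activity_id
ORDER BY QueryTable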

Now I know what you are thinking: the non-fragmented index in this case is smaller than the fragmented one, so of course it does less total I/O. You'd be right to think that, but keep in mind that both tables contain the same number of records, which demonstrates the wasted space Paul talked about in the session at SQL Connections. Not to worry, though; we can address that by adding additional records to the non-fragmented table to push its page count above that of the fragmented one.

-- Add additional data to bring page count higher than the GuidHighFragmentation table!
INSERT INTO GuidLowFragmentation (FirstName, LastName)
SELECT TOP 10000
    a.name, b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.name IS NOT NULL 
    AND b.name IS NOT NULL
ORDER BY NEWID()
GO 10
 
-- Rebuild the indexes to remove any fragmentation
ALTER INDEX ALL ON GuidLowFragmentation REBUILD
GO

If we go back and rerun our query against sys.dm_db_index_physical_stats, the non-fragmented index now has an additional 100,000 records and just over 1,000 additional pages.

[Screenshot: sys.dm_db_index_physical_stats output after the additional inserts into GuidLowFragmentation]

Now we can delete the target files from our original test, issue a CHECKPOINT, flush the buffer cache, and rerun the tests to see what effect more than doubling the index page count has on the non-fragmented index's I/O operations.
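For reference, the reset between the two runs amounts to something like the following sketch (not the exact script used; the old EE_FragmentationEffect .xel and .xem files in C:\SQLskills\ have to be deleted from the file system first so the second run starts with empty target files).

-- Sketch of the reset for the second test run (delete the old
-- C:\SQLskills\EE_FragmentationEffect*.xel and *.xem files first)
CHECKPOINT
GO
DBCC DROPCLEANBUFFERS
GO
ALTER EVENT SESSION FragmentationEffect
ON SERVER
STATE=START
GO
-- ...then rerun the two aggregate SELECTs, the WAITFOR, and STATE=STOP as
-- before, and reload and reparse the target file. The load script above drops
-- and recreates the results tables, so it can be rerun as-is.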

[Screenshot: event counts by event name for each table in the second test run]

For the event counts, even with more pages in its index, the non-fragmented table still incurs significantly fewer I/O operations.

[Screenshot: file_read_completed counts grouped by read size for each table in the second test run]

For the I/O sizes, the smallest I/O performed against the non-fragmented table is still 64KB, and there was an increase in reads at the 256KB and 512KB sizes.

While SSDs might reduce the I/O latency of random read operations, they don't negate the need to adhere to proper design principles for primary keys, fill factor on indexes, and index maintenance. The actual runtimes of the two tests were nearly identical, and which one took a few milliseconds longer than the other often flipped back and forth for this demo. Keep in mind, however, that these tables held less than 16MB of data in total until the additional records were added to the non-fragmented table. Even at larger data sizes the performance of the two scans will be close enough that the average end user won't notice the difference, but under the covers the number of I/O operations being performed is significantly different. One thing to consider is that an SSD, while faster, still has an upper limit on the number of I/O operations per second (IOPS) it can perform. Your average database might not be able to push that limit even with heavy fragmentation, but with the cost of SSDs still at a premium, I wouldn't waste any of the I/O capacity available if I had them in my server.

So, to wrap up, the answer to the question "Does index fragmentation matter with SSDs?" is: it depends on whether you care about wasted space and wasted I/O more than anything else, but yes, it still matters and it still has an impact.

One note about this post:

The demo for this post works with SQL Server 2008 and 2008 R2, with the exception that you don't get the I/O size back from the file_read_completed event. You can still see the impact that fragmentation has on the number of I/O operations being performed using the demo exactly as provided in this blog post with SQL Server 2008 and 2008 R2. I chose to go the Denali route because the impact on the I/O size is, in my opinion, quite interesting.

See you on the playground!

11 thoughts on "Does Index Fragmentation Matter with SSD's?"

  1. Jonathan,

    Too good post… very helpful… You can play like anything with Extended Events… I have read your white paper on Extended Events… Can you please let me know some of the resources to get hands-on with Extended Events…

    Thanks
    Aditya

  2. I always assumed an SSD drive for random access (such as a web server) under heavy fragmentation would perform really well, and never had a few SSDs to test them. This totally destroys the theory that having fragmented data was even better, as it’ll perform the same and the data might wear out certain sections of the drive that were heavily used the most. Were you surprised by the findings Paul? What were your assumptions going in, if you had any? Thanks for the post.

  3. Extended Events are my new favourite way to find and resolve performance issues (in SQL 2008)
    I’m presenting @ a User Group meeting tonight {13th April} on Extended Events, I even have Christian Bolton in my audience for that

    I used Extended Events in my Demo @ SQLBits 8 we just had last weekend – it was great to demo this cool feature on Waits to a 120+ crowd.

    Jonathan you are my 1st stop on learning this cool feature.. Thanks again for another great post on the subject

  4. Given the high cost of SSDs, you’re right that we wouldn’t want our database fragmented and taking more space than it should. And I would look down on any DBA that didn’t try to keep his tables well-maintained and his DBs in tip-top shape.

    But the two operations took the same amount of time, and (you said) probably would even with larger tables. Apparently, with an SSD it doesn’t matter how many separate "sequential reads" there are. The concept may not even be applicable to SSDs. The reason multiple I/O operations are costly on a spinning disk is that seeks involve the physical movement of a head from one place to another, but with SSDs it seems a seek has very low overhead–as I would expect, there being no moving parts. So your statement about "wasting I/O" seems nonsensical.

    Wasting space, on the other hand, matters especially with expensive disk. So would row- and page-level compression help with size? I know it would cost some CPU, but if my memory serves me properly, in some cases overall CPU is lower with compression because even reads use CPU as well and with fewer reads there is less CPU usage.

    Anyway, the conclusion I get is a little different from yours: "with SSDs fragmentation doesn’t impact performance–just size." Which is a given. So best practice hasn’t changed, but you can worry a little less if you’ve got a few tables that are fragmented more than you’d like.

    I’d be very interested if you could rerun your tests with 1GB tables to settle once and for all the fragmentation/performance question.

  5. Aditya,

    If you have read my blog posts and the whitepaper I wrote, you have everything you need; it is just a matter of jumping in and working with the feature to learn what it can do.

  6. Ali,

    The only thing that was surprising about the outcome of the test was that the impact on read ahead was as significant as it was. I fully expected that more I/O would be done for the fragmented index, and I expected to see smaller I/O sizes, but not as significantly smaller as they turned out to be.

  7. Awesome post! Now I don't have to finish my much less adequate article on the same subject :). I field two questions regularly on SSDs and indexes: one, do I still need multiple indexes, and two, should I continue to do index rebuilds. I'll be sending this link around for sure.

  8. Erik,

    I don't have time this week to put together a larger test, but I will mock up a larger database, get performance differences for a more realistic scenario including multi-table joins, and post the results. The statement about wasting I/O is driven by the fact that SSDs still have a maximum number of IOPS they can handle, and if fragmentation is forcing your queries to perform 7-10 times the number of I/O operations and you consistently have physical access occurring, you can hit that peak point. The counter argument to this would be that you could increase the RAM in the server to reduce the I/O needs by having more pages in memory, but then you are throwing hardware at the problem again. It's a vicious circle that I see happen a lot on the forums, and bad design/processes/maintenance chews through bigger hardware faster and faster.

    1. Hey Matt,

      The page split and density issue from fragmentation as well as the small block I/O that results is still the same on 2019 and SSDs. Nothing has changed and there is still a point where correcting fragmentation matters and setting fill factor appropriately to reduce the impact is important, especially with AGs that have readable secondary replicas that incur the 14-byte version tag on the primary for row-version support under RCSI on the secondary.
