Two weeks ago at SQL Connections in Orlando, FL, I got to participate in a session that Paul and Kimberly present at the end of the conference titled "Follow the Rabbit." The premise of the session is that Paul and Kimberly put a big list of topics up on the screen, anyone in the audience can ask any question they have about one of those topics, and Paul and Kimberly try to answer it. I wasn't the only person who ended up answering questions in this session; Maciej Pilecki, another MVP who also recently passed the Microsoft Certified Master exam for SQL Server 2008, answered a number of questions as well. One of my favorite questions asked during this session was, "Does index fragmentation matter with SSDs anymore?" Paul's answer to the question was very practical and dealt with the wasted space that excessive index fragmentation can cause in a database that uses a uniqueidentifier column with NEWID() as the primary key; given the cost per gigabyte of SSD storage, this can be quite significant. Paul made a couple of other points that escape me, primarily because my mind immediately started churning through ideas for how to test the impact that index fragmentation actually has on a database.
It should be no surprise that this blog post is going to use Extended Events; it has, after all, been my favorite feature in SQL Server since it first appeared in an early CTP of SQL Server 2008. One of the sessions I presented at SQL Connections last month was on Extended Events, and I made some pretty heavy use of the I/O-related events as part of that session. I also made heavy use of the I/O-related events back in December for a couple of my XEvent a Day blog posts, so I knew I could get some detailed information back from the system about what kind of impact fragmentation really has on a database. Since I have an SSD in my laptop, which while not enterprise class is more than worthy of these tests, I thought I'd give a whirl at figuring out how much impact fragmentation really has.
When I began working on this problem, I started out using Windows Server 2008 R2 and SQL Server 2008 R2, primarily because that is my standard VM platform for the Immersion Training and for presentations, and it was immediately available. I originally intended to use the I/O size information from the file_read_completed event to aggregate the I/O sizes being read from disk, but I soon learned that this information is not available in SQL Server 2008 as I had originally thought. I knew I had used it in the past, so I went back to my XEvent a Day blog series and found An XEvent a Day (29 of 31) – The Future – Looking at Database Startup in Denali.
I happened to have the old Denali CTP1 VM that I used to create that blog post on an external hard disk, so I decided to make use of it for this investigation. I had originally hoped to create a reproducible demo that anyone could use in their SQL Server 2008 environment to see the impact of fragmentation, but since that is not possible, this post is based on the information available through SQL Server Denali instead.
The first thing that we will need is a database with two tables that have identical data in them; one fragmented and the other not. To create this, we’ll use a uniqueidentifier with newid() as the primary key for the first table and a uniqueidentifier with newsequentialid() as the primary key for the second table. Then we will load approximately 1000 pages worth of data into the first table and then copy that data into the second table, and rebuild the indexes on the second table to remove any residual fragmentation from it.
CREATE DATABASE FragmentationTest
GO
USE FragmentationTest
GO
CREATE TABLE GuidHighFragmentation
(UniqueID UNIQUEIDENTIFIER DEFAULT NEWID() PRIMARY KEY,
 FirstName nvarchar(50) NOT NULL,
 LastName nvarchar(50) NOT NULL)
GO
CREATE NONCLUSTERED INDEX IX_GuidHighFragmentation_LastName
ON GuidHighFragmentation(LastName)
GO
CREATE TABLE GuidLowFragmentation
(UniqueID UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID() PRIMARY KEY,
 FirstName nvarchar(50) NOT NULL,
 LastName nvarchar(50) NOT NULL)
GO
CREATE NONCLUSTERED INDEX IX_GuidLowFragmentation_LastName
ON GuidLowFragmentation(LastName)
GO
INSERT INTO GuidHighFragmentation (FirstName, LastName)
SELECT TOP 1000
    a.name, b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.name IS NOT NULL
  AND b.name IS NOT NULL
ORDER BY NEWID()
GO 70
INSERT INTO GuidLowFragmentation (FirstName, LastName)
SELECT FirstName, LastName
FROM GuidHighFragmentation
GO
ALTER INDEX ALL ON GuidLowFragmentation REBUILD
With our tables built, we can validate the fragmentation information by querying the sys.dm_db_index_physical_stats() DMF:
SELECT
    OBJECT_NAME(ps.object_id),
    i.name,
    ps.index_id,
    ps.index_depth,
    avg_fragmentation_in_percent,
    fragment_count,
    page_count,
    avg_page_space_used_in_percent,
    record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
JOIN sys.indexes AS i
    ON ps.object_id = i.object_id
   AND ps.index_id = i.index_id
WHERE index_level = 0
Next we will create our event session to capture the I/O events that are related to physical reads from disk. The event session is being created using dynamic SQL so that each of the events has a predicate on the sqlserver.database_id for the current database.
-- Create an Event Session to investigate our IO operations
IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='FragmentationEffect')
    DROP EVENT SESSION [FragmentationEffect] ON SERVER;
DECLARE @sqlcmd nvarchar(4000) = 'CREATE EVENT SESSION FragmentationEffect
ON SERVER
ADD EVENT sqlserver.sql_statement_starting
( ACTION (sqlserver.sql_text)),
ADD EVENT sqlserver.sql_statement_completed
( ACTION (sqlserver.sql_text)),
ADD EVENT sqlserver.file_read
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_read_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.physical_page_read
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_requested
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+'))
ADD TARGET package0.asynchronous_file_target
(SET filename=''C:\SQLskills\EE_FragmentationEffect.xel'',
 metadatafile=''C:\SQLskills\EE_FragmentationEffect.xem'')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY = 5 SECONDS)'
EXEC (@sqlcmd)
GO
This event session is configured with MAX_MEMORY set at 8MB to account for the partitioning of the memory buffers and to ensure that there is ample buffer space for the session, to try to mitigate the potential for event loss. The EVENT_RETENTION_MODE for the session cannot be configured for NO_EVENT_LOSS since the sqlserver.physical_page_read event is a part of the event session. TRACK_CAUSALITY is turned ON so that events can be correlated, tying the I/O operations back to the statement that generated them, and MAX_DISPATCH_LATENCY has been set to five seconds because I am impatient and don't want to wait the default of thirty seconds for the events to be dispatched to the targets while doing a demo.
Prior to starting the event session, a manual CHECKPOINT will be issued against the database to flush any dirty pages from the buffer cache to disk, and then the buffer cache will be cleared using DBCC DROPCLEANBUFFERS to ensure that the test statements have to physically read the data from disk into cache.
-- Issue checkpoint to flush dirty buffers to disk
CHECKPOINT
GO
-- Clear the Buffer Cache to force reads from Disk
DBCC DROPCLEANBUFFERS
GO
With this completed, we can now start our event session and run the same query against each table to force a scan of all the pages in the table from disk, then stop the event session so that our target file only has events associated with this specific test in it.
-- Start the Event Session
ALTER EVENT SESSION FragmentationEffect
ON SERVER
STATE=START
GO
-- Aggregate the data from both tables
SELECT LastName, COUNT(*)
FROM GuidLowFragmentation
GROUP BY LastName
GO
SELECT LastName, COUNT(*)
FROM GuidHighFragmentation
GROUP BY LastName
GO
-- Wait for the events to dispatch to the target
WAITFOR DELAY '00:00:10'
GO
-- Stop the Event Session
ALTER EVENT SESSION FragmentationEffect
ON SERVER
STATE=STOP
GO
Now that we have our data, we need to read it from the file and break down the event data into columns to simplify analysis of the information. To do this, we’ll first load the event data as XML into a staging table. Trying to parse the XML while reading from the file through the DMF is incredibly slow by comparison. Then we can shred the XML into a tabular format based on the events being collected.
-- Drop Results tables if they exist
IF OBJECT_ID('FragmentationEffectResults') IS NOT NULL
    DROP TABLE FragmentationEffectResults
GO
IF OBJECT_ID('FragmentationEffectResultsParsed') IS NOT NULL
    DROP TABLE FragmentationEffectResultsParsed
GO
-- Create results table to load data from XE files
CREATE TABLE FragmentationEffectResults
(RowID int identity primary key, event_data XML)
GO
-- Load the event data from the file target
INSERT INTO FragmentationEffectResults(event_data)
SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file(
    'C:\SQLskills\EE_FragmentationEffect*.xel',
    'C:\SQLskills\EE_FragmentationEffect*.xem',
    null, null)
GO
-- Parse the event data
SELECT
    RowID,
    event_data.value('(event/@name)', 'varchar(50)') AS event_name,
    DATEADD(hh,
        DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP),
        event_data.value('(event/@timestamp)', 'datetime2')) AS [timestamp],
    COALESCE(
        event_data.value('(event/data[@name="database_id"]/value)', 'int'),
        event_data.value('(event/action[@name="database_id"]/value)', 'int')) AS database_id,
    event_data.value('(event/data[@name="mode"]/text)', 'nvarchar(4000)') AS [mode],
    event_data.value('(event/data[@name="file_handle"]/value)', 'nvarchar(4000)') AS [file_handle],
    event_data.value('(event/data[@name="offset"]/value)', 'bigint') AS [offset],
    event_data.value('(event/data[@name="page_id"]/value)', 'int') AS [page_id],
    event_data.value('(event/data[@name="file_id"]/value)', 'int') AS [file_id],
    event_data.value('(event/data[@name="file_group_id"]/value)', 'int') AS [file_group_id],
    event_data.value('(event/data[@name="size"]/value)', 'bigint') AS [size],
    event_data.value('(event/data[@name="wait_type"]/text)', 'nvarchar(100)') AS [wait_type],
    event_data.value('(event/data[@name="duration"]/value)', 'bigint') AS [duration],
    event_data.value('(event/action[@name="sql_text"]/value)', 'nvarchar(4000)') AS [sql_text],
    event_data.value('(event/data[@name="cpu"]/value)', 'int') AS [cpu],
    event_data.value('(event/data[@name="reads"]/value)', 'bigint') AS [reads],
    event_data.value('(event/data[@name="writes"]/value)', 'bigint') AS [writes],
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)', 'varchar(50)'), 1, 36) AS uniqueidentifier) AS activity_id,
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)', 'varchar(50)'), 38, 10) AS int) AS event_sequence
INTO FragmentationEffectResultsParsed
FROM FragmentationEffectResults
ORDER BY RowID
This is where the real fun begins. There are too many events in the results set for it to be useful in raw form, so what I decided to do was locate the activity_id values from TRACK_CAUSALITY for the sqlserver.sql_statement_starting events for both of the tables, and store those in variables for use in querying the results set. The first thing I looked at was how many times each of the events actually fired for each of the tables.
DECLARE @FragmentationHighActivityID varchar(50),
        @FragmentationLowActivityID varchar(50)

SELECT @FragmentationHighActivityID = activity_id
FROM FragmentationEffectResultsParsed
WHERE event_name = 'sql_statement_starting'
  AND sql_text LIKE '%GuidHighFragmentation%'

SELECT @FragmentationLowActivityID = activity_id
FROM FragmentationEffectResultsParsed
WHERE event_name = 'sql_statement_starting'
  AND sql_text LIKE '%GuidLowFragmentation%'

-- Aggregate the results by the event name and file_id
SELECT event_name,
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END AS QueryTable,
    COUNT(*) AS occurrences
FROM FragmentationEffectResultsParsed
WHERE activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
  AND event_name IN ('async_io_completed', 'async_io_requested',
        'file_read', 'file_read_completed', 'physical_page_read')
GROUP BY event_name, activity_id
ORDER BY activity_id, event_name
Impressive! SQL Server had to do ten times the I/O operations against the fragmented table that it did against the non-fragmented table. You might notice in the output that the number of physical_page_read events is larger than the number of actual pages in the indexes for the table. That is because I restricted the output of sys.dm_db_index_physical_stats to just level 0 of the indexes, the leaf level. There are two additional levels in the nonclustered indexes being scanned, but they only account for nine additional pages for the fragmented index and five additional pages for the non-fragmented index. The remaining pages come from the system metadata tables, which have to be read by the engine as well; this can be confirmed by reading the parsed event data from the dbo.FragmentationEffectResultsParsed table.
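To see where the pages above the leaf level come from, one option (a sketch for verification, not part of the original demo) is to rerun the sys.dm_db_index_physical_stats query without the level filter, so the intermediate and root levels are included:

```sql
-- Show the page count at every level of each index (level 0 is the leaf);
-- the handful of pages above level 0 account for part of the difference
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name AS index_name,
       ps.index_level,
       ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ps
JOIN sys.indexes AS i
    ON ps.object_id = i.object_id
   AND ps.index_id = i.index_id
ORDER BY table_name, index_name, ps.index_level
```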
Where the impact really gets interesting is when we take a look at the I/O sizes for file_read_completed events and aggregate the results based on the table and the I/O size being read from disk using the same activity_id’s from the previous query for aggregation.
-- Aggregate the file_read_completed events by the IO size and table
SELECT
    CASE activity_id
        WHEN @FragmentationHighActivityID THEN 'GuidHighFragmentation'
        WHEN @FragmentationLowActivityID THEN 'GuidLowFragmentation'
        ELSE 'UNKNOWN'
    END AS QueryTable,
    size/1024 AS read_size_kb,
    size/1024/8 AS read_size_pages,
    COUNT(*) AS occurrences
FROM FragmentationEffectResultsParsed
WHERE activity_id IN (@FragmentationHighActivityID, @FragmentationLowActivityID)
  AND event_name IN ('file_read_completed')
GROUP BY database_id, file_id, size, activity_id
ORDER BY QueryTable, size DESC
GO
I suspected that fragmentation would impact the ability of the database engine to do large sequential reads while scanning the index, but I was not expecting the impact to be as bad as it actually was. I ran this test numerous times to rule out some kind of fluke, and the results are consistently repeatable within a relatively low margin of change. The smallest read size for the non-fragmented index was 64KB, while the fragmented table did the majority of its I/O in reads of 64KB or smaller.
Now I know what you are thinking: the non-fragmented index in this case is smaller than the fragmented one, so it should do less total I/O. You'd be right to think that, but keep in mind that both tables contain the same number of records, demonstrating the wasted space that Paul talked about in the session at SQL Connections. Not to worry, though: we can level the playing field by adding records to the non-fragmented table to push its page count above that of the fragmented one.
-- Add additional data to bring page count higher than the GuidHighFragmentation table!
INSERT INTO GuidLowFragmentation (FirstName, LastName)
SELECT TOP 10000
    a.name, b.name
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b
WHERE a.name IS NOT NULL
  AND b.name IS NOT NULL
ORDER BY NEWID()
GO 10
-- Rebuild the indexes to remove any fragmentation
ALTER INDEX ALL ON GuidLowFragmentation REBUILD
GO
If we go back and rerun our query against sys.dm_db_index_physical_stats, the non-fragmented index now has an additional 100,000 records and just over 1,000 more pages.
Now we can delete the files from our original test, execute a CHECKPOINT, flush the buffer cache, and rerun the tests to see what effect more than double the index pages has on the non-fragmented index's I/O operations.
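As a sketch, that reset looks like the following (using xp_cmdshell to remove the old target files is just one option and assumes it is enabled; deleting them in Explorer works equally well):

```sql
-- Remove the old event files so the next test starts with a clean target
EXEC xp_cmdshell 'DEL C:\SQLskills\EE_FragmentationEffect*.xe*';
GO
-- Flush dirty pages and clear the buffer cache again
CHECKPOINT
GO
DBCC DROPCLEANBUFFERS
GO
```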
For the event counts, even with more pages in the index, the non-fragmented table still incurs significantly fewer I/O operations.
For the I/O sizes, the smallest I/O performed for the non-fragmented table is still 64KB, and there was an increase in reads at the 256KB and 512KB sizes.
While SSDs might reduce the I/O latency of random read operations, they don't negate the need to adhere to proper design principles for primary keys, fill factors on indexes, and index maintenance. The actual runtimes of the two tests were nearly identical, and which one took a few milliseconds longer than the other often flipped back and forth for this demo. However, keep in mind that these tables held less than 16MB of data in total until the additional records were added to the non-fragmented table. Even at larger data sizes the performance of the two scans will be close enough that the average end user wouldn't notice the difference, but under the covers the number of I/O operations being performed is significantly different. One thing to consider is that an SSD, while faster, still has an upper limit on the number of I/O operations per second (IOPS) it can perform. Your average database might not be able to push that limit even with heavy fragmentation, but with the cost of SSDs still at a premium, I wouldn't waste any of the IOPS available if I had them in my server.
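As one illustration of the maintenance mentioned above, a rebuild can specify a fill factor that leaves free space for the random NEWID() inserts (the value 80 here is purely illustrative, not a recommendation derived from these tests):

```sql
-- Rebuild with free space left on each leaf page to absorb random inserts
-- and delay page splits; the right fill factor depends on the insert rate
ALTER INDEX ALL ON GuidHighFragmentation
REBUILD WITH (FILLFACTOR = 80);
```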
So to wrap up, the answer to the question "Does index fragmentation matter with SSDs?" is: it depends on how much you care about wasting space and wasting I/O, but YES, it still matters and it still has an impact.
One note about this post:
The demo for this post works with SQL Server 2008 and 2008 R2 with the exception that you don’t get the I/O size back from the file_read_completed event. You can still see the impact that fragmentation has on the number of I/O operations being performed using the demo exactly as provided from this blog post with SQL Server 2008 and 2008 R2. I chose to go the Denali route because the impact to the I/O size is quite interesting IMO.
See you on the playground!