Looking at multiple data files and proportional fill with Extended Events

At SQL Connections, I presented a session titled “Learn SQL Server Internals with Extended Events” where I demonstrated a number of ways to use Extended Events to learn about the internal workings of the database engine.  The morning of the session I was chatting with someone about a problem they had seen, and the topic of proportional fill came up, along with how the database engine stripes data across multiple files in a user database.  From that discussion I got the idea to play around with multiple database files, and I built a demo using Extended Events that showed how proportional fill works inside the database engine.  This wasn’t a planned demo for my presentation, and I had plenty of other demos that showed various SQL Server internals, but it turned out to be a really good demo that I decided to throw into the mix, which put me way over budget for time.  I decided to leave it up to the audience which demo they wanted to see last: one of the originally planned demos, or the one I wrote that morning on proportional fill.  The majority wanted to see the one on proportional fill, and based on the crowd interest and feedback it turned out to be the best demo of the entire session.  What was most interesting to me was the number of people attending the session who had never heard of the concept of proportional fill in SQL Server.

Proportional fill is the algorithm SQL Server uses to determine how much data is written to each of the files in a multi-file filegroup, based on the proportion of free space within each file, so that the files become full at approximately the same time.  Proportional fill has nothing to do with the actual file sizes; it is based strictly on the free space within each file.  For example, if one file has twice as much free space as another file in the same filegroup, it will receive roughly twice as many allocations in each round of allocations.

To demonstrate proportional fill in SQL Server, we’ll take a look at how page writes and I/O operations are distributed across varying configurations of a test database, using the same test and event session for each configuration.

Basic Example

To look at how proportional fill functions, we’ll start with a basic example, using a database with a separate filegroup for user objects that has been marked as the default filegroup for the database.  The UserObjects filegroup will have four data files, each 32MB in size.

-- Create a multi-file database with evenly sized files
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects2', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects3', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects4', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB )
 LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO

With the database created, we’ll create a table to load data into for our tests, based on the AdventureWorks SalesOrderHeader table, which will also be the source for the test data:

-- Create a table to load data into for the tests
USE [MultipleDataFiles]
GO
IF OBJECT_ID('SalesOrderHeader') IS NOT NULL
    DROP TABLE [dbo].[SalesOrderHeader]
GO
SET NOCOUNT ON
GO
CREATE TABLE [dbo].[SalesOrderHeader](
    [SalesOrderID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [RevisionNumber] [tinyint] NOT NULL,
    [OrderDate] [datetime] NOT NULL,
    [DueDate] [datetime] NOT NULL,
    [ShipDate] [datetime] NULL,
    [Status] [tinyint] NOT NULL,
    [OnlineOrderFlag] [bit] NOT NULL,
    [SalesOrderNumber] [nvarchar](25) NOT NULL,
    [PurchaseOrderNumber] [nvarchar](25) NULL,
    [AccountNumber] [nvarchar](15) NULL,
    [CustomerID] [int] NOT NULL,
    [SalesPersonID] [int] NULL,
    [TerritoryID] [int] NULL,
    [BillToAddressID] [int] NOT NULL,
    [ShipToAddressID] [int] NOT NULL,
    [ShipMethodID] [int] NOT NULL,
    [CreditCardID] [int] NULL,
    [CreditCardApprovalCode] [varchar](15) NULL,
    [CurrencyRateID] [int] NULL,
    [SubTotal] [money] NOT NULL,
    [TaxAmt] [money] NOT NULL,
    [Freight] [money] NOT NULL,
    [TotalDue] [money] NOT NULL,
    [Comment] [nvarchar](128) NULL,
    [rowguid] [uniqueidentifier] NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
) 
GO

Now we’ll create our event session using dynamic SQL so we can add the database_id of our test database to the predicate of each event, restricting the events to firing only for our test database and minimizing the need to filter through the event session data later on.  The event session collects the sqlserver.checkpoint_begin, sqlserver.checkpoint_end, sqlserver.file_written, sqlserver.file_write_completed, sqlserver.physical_page_write, sqlos.async_io_requested, and sqlos.async_io_completed events.  The checkpoint events are included in the event session to show that writes don’t immediately begin to occur to the data files, but instead occur in response to checkpoint operations in the database engine.

-- Create our Event Session dynamically
IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='MultipleDataFiles')
    DROP EVENT SESSION [MultipleDataFiles] ON SERVER;
DECLARE @sqlcmd nvarchar(4000) = '
CREATE EVENT SESSION MultipleDataFiles
ON SERVER
ADD EVENT sqlserver.checkpoint_begin
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.checkpoint_end
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_written
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_write_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.physical_page_write
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_requested
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlos.async_io_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+'))
ADD TARGET package0.asynchronous_file_target(
     SET filename=''C:\SQLskills\MultipleDataFiles.xel'',
         metadatafile=''C:\SQLskills\MultipleDataFiles.xem'')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY = 5 SECONDS)'
EXEC (@sqlcmd)
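
Because the session DDL is built with dynamic SQL, it can be worth a quick look at the session definition to confirm the database_id made it into each predicate before starting it.  This check isn’t part of the original demo; it just queries the event session catalog views:

-- Optional check: list the events and predicates in the session definition
SELECT s.name AS session_name, 
    e.package AS package_name, 
    e.name AS event_name, 
    e.predicate
FROM sys.server_event_sessions AS s
INNER JOIN sys.server_event_session_events AS e
    ON e.event_session_id = s.event_session_id
WHERE s.name = 'MultipleDataFiles';
GO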

With the event session created we can run our data load:

-- Start the Event Session
ALTER EVENT SESSION MultipleDataFiles
ON SERVER
STATE=START
GO
 
-- Load data into the test database
INSERT INTO dbo.SalesOrderHeader
SELECT RevisionNumber, 
    DATEADD(DD, 1126+number, OrderDate), 
    DATEADD(DD, 1126+number, DueDate), 
    DATEADD(DD, 1126+number, ShipDate), 
    soh.Status, OnlineOrderFlag, SalesOrderNumber, 
    PurchaseOrderNumber, AccountNumber, 
    CustomerID, SalesPersonID, TerritoryID, 
    BillToAddressID, ShipToAddressID, 
    ShipMethodID, CreditCardID, CreditCardApprovalCode, 
    CurrencyRateID, SubTotal, TaxAmt, Freight, 
    TotalDue, Comment, rowguid, 
    DATEADD(DD, 1126+number, ModifiedDate)
FROM AdventureWorks2008R2.Sales.SalesOrderHeader AS soh
CROSS JOIN master.dbo.spt_values AS sv
WHERE sv.type = N'P'
  AND sv.number > 0 AND sv.number < 6
GO 3
 
-- Flush all dirty pages to disk
CHECKPOINT
GO
 
-- Stop the Event Session
ALTER EVENT SESSION MultipleDataFiles
ON SERVER
STATE=STOP
GO

The above script loads roughly 92MB of data into the test table, which keeps the insert operation smaller than the combined size of the four data files.  Note that there is an explicit checkpoint in the test to force all dirty pages to be written to the appropriate data files on disk.  Without this checkpoint, the file writes could appear imbalanced simply because of the timing of the last automatic checkpoint and the dirty pages that were in cache when it occurred.  By manually checkpointing the database before stopping the event session, we ensure the file writes are captured accurately.  To view the information captured by the event session, we’ll read the event data into a table and then shred the XML into an intermediate table, which allows further analysis of the detailed information if you so desire.  Finally, we’ll aggregate the events and pivot the results on file_id to see how SQL Server wrote to the database files.
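
If you want to sanity check the amount of data the load produced before digging into the event data, sp_spaceused gives a rough figure; this check is my own addition and isn’t required for the demo:

-- Optional check: approximate size of the data loaded into the test table
EXEC sp_spaceused N'dbo.SalesOrderHeader';
GO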

-- Drop Results tables if they exist
IF OBJECT_ID('MultipleDataFileResults') IS NOT NULL
    DROP TABLE MultipleDataFileResults 
GO
IF OBJECT_ID('MultipleDataFileResultsParsed') IS NOT NULL
    DROP TABLE MultipleDataFileResultsParsed 
GO
 
-- Create results table to load data from XE files
CREATE TABLE MultipleDataFileResults
(RowID int identity primary key, event_data XML)
GO
 
-- Load the event data from the file target
INSERT INTO MultipleDataFileResults(event_data)
SELECT
    CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\SQLskills\MultipleDataFiles*.xel', 
                                     'C:\SQLskills\MultipleDataFiles*.xem', 
                                     null, null)
GO
 
-- Parse the event data
SELECT 
    RowID,
    event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
    DATEADD(hh, 
            DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP), 
            event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
    COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'), 
             event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
    event_data.value('(event/data[@name="mode"]/text)[1]', 'nvarchar(4000)') AS [mode],
    event_data.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(4000)') AS [file_handle],
    event_data.value('(event/data[@name="offset"]/value)[1]', 'bigint') AS [offset],
    event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
    event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
    event_data.value('(event/data[@name="file_group_id"]/value)[1]', 'int') AS [file_group_id],
    event_data.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(100)') AS [wait_type],
    event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
    event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS [sql_text],
    event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
    event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
    event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
    CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
INTO MultipleDataFileResultsParsed
FROM MultipleDataFileResults
ORDER BY Rowid
 
-- Aggregate the results by the event name and file_id
SELECT file_id, [file_write_completed],[file_written], [physical_page_write]
FROM
(    SELECT event_name, file_id, COUNT(*) AS occurences
    FROM MultipleDataFileResultsParsed
    WHERE event_name IN ('file_write_completed', 'file_written', 'physical_page_write')
    GROUP BY event_name, file_id
) AS tab
PIVOT
(    MAX(occurences) 
    FOR event_name IN ([file_write_completed],[file_written], [physical_page_write])) AS pvt

The output of our pivot operation shows that the four data files in the UserObjects filegroup are written to relatively evenly, which is what we’d expect given that the amount of free space is equal across the four files.  One item that should become immediately apparent is the importance of the transaction log, which received twenty-three times as many file writes as the nearest data file.

[Image: pivoted write counts per file for the evenly sized files]
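
To put an actual number on that comparison, the parsed results can be grouped by file.  This query is a small addition of mine, and it assumes the transaction log is file_id 2, which is the default for a database:

-- Compare file_written counts for the log (file_id 2 by default) against the data files
SELECT file_id,
    CASE WHEN file_id = 2 THEN 'log' ELSE 'data' END AS file_type,
    COUNT(*) AS file_written_count
FROM MultipleDataFileResultsParsed
WHERE event_name = 'file_written'
GROUP BY file_id
ORDER BY file_written_count DESC;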

Different file sizes but the same free space

As stated in the intro to this blog post, proportional fill is based on the amount of free space in each file relative to the other files in the filegroup, not on the actual file sizes.  To demonstrate this, we’ll create the database with a single 32MB file in the UserObjects filegroup, create our table, and load approximately 31MB of data into it.  Then we’ll increase the size of that first file to 63MB, leaving roughly 32MB free, and add three additional 32MB files to the filegroup.

-- Delete target files from previous tests
EXECUTE sp_configure 'show advanced options', 1; RECONFIGURE;
EXECUTE sp_configure 'xp_cmdshell', 1; RECONFIGURE; 
EXEC xp_cmdshell 'DEL C:\SQLskills\MultipleDataFiles*';
EXECUTE sp_configure 'xp_cmdshell', 0; RECONFIGURE;
EXECUTE sp_configure 'show advanced options', 0; RECONFIGURE;
 
-- Drop the test database from the server
USE [master]
GO
IF DB_ID('MultipleDataFiles') IS NOT NULL
BEGIN
    ALTER DATABASE [MultipleDataFiles] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [MultipleDataFiles];
END
GO
-- Create the database with a single data file in the UserObjects filegroup
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB )
LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO
 
-- Create a table to load data into for the tests
USE [MultipleDataFiles]
GO
CREATE TABLE [dbo].[SalesOrderHeader](
    [SalesOrderID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [RevisionNumber] [tinyint] NOT NULL,
    [OrderDate] [datetime] NOT NULL,
    [DueDate] [datetime] NOT NULL,
    [ShipDate] [datetime] NULL,
    [Status] [tinyint] NOT NULL,
    [OnlineOrderFlag] [bit] NOT NULL,
    [SalesOrderNumber] [nvarchar](25) NOT NULL,
    [PurchaseOrderNumber] [nvarchar](25) NULL,
    [AccountNumber] [nvarchar](15) NULL,
    [CustomerID] [int] NOT NULL,
    [SalesPersonID] [int] NULL,
    [TerritoryID] [int] NULL,
    [BillToAddressID] [int] NOT NULL,
    [ShipToAddressID] [int] NOT NULL,
    [ShipMethodID] [int] NOT NULL,
    [CreditCardID] [int] NULL,
    [CreditCardApprovalCode] [varchar](15) NULL,
    [CurrencyRateID] [int] NULL,
    [SubTotal] [money] NOT NULL,
    [TaxAmt] [money] NOT NULL,
    [Freight] [money] NOT NULL,
    [TotalDue] [money] NOT NULL,
    [Comment] [nvarchar](128) NULL,
    [rowguid] [uniqueidentifier] NOT NULL,
    [ModifiedDate] [datetime] NOT NULL
) 
GO
 
-- Load ~31MB of data into the test database
INSERT INTO dbo.SalesOrderHeader
SELECT RevisionNumber, 
    DATEADD(DD, 1126+number, OrderDate), 
    DATEADD(DD, 1126+number, DueDate), 
    DATEADD(DD, 1126+number, ShipDate), 
    soh.Status, OnlineOrderFlag, SalesOrderNumber, 
    PurchaseOrderNumber, AccountNumber, 
    CustomerID, SalesPersonID, TerritoryID, 
    BillToAddressID, ShipToAddressID, 
    ShipMethodID, CreditCardID, CreditCardApprovalCode, 
    CurrencyRateID, SubTotal, TaxAmt, Freight, 
    TotalDue, Comment, rowguid, 
    DATEADD(DD, 1126+number, ModifiedDate)
FROM AdventureWorks2008R2.Sales.SalesOrderHeader AS soh
CROSS JOIN master.dbo.spt_values AS sv
WHERE sv.type = N'P'
  AND sv.number > 0 AND sv.number < 6
 
-- Grow the first data file to 63MB leaving 32MB free space
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILE (    NAME = N'MultipleDataFiles_UserObjects1', 
                SIZE = 64512KB )
GO
 
-- Add second file with 32MB size
ALTER DATABASE [MultipleDataFiles] 
ADD FILE (    NAME = N'MultipleDataFiles_UserObjects2', 
            FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
            SIZE = 32768KB , 
            FILEGROWTH = 32768KB ) 
TO FILEGROUP [UserObjects]
GO
 
-- Add third file with 32MB size
ALTER DATABASE [MultipleDataFiles] 
ADD FILE (    NAME = N'MultipleDataFiles_UserObjects3', 
            FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
            SIZE = 32768KB , 
            FILEGROWTH = 32768KB ) 
TO FILEGROUP [UserObjects]
GO
 
-- Add fourth file with 32MB size
ALTER DATABASE [MultipleDataFiles] 
ADD FILE (    NAME = N'MultipleDataFiles_UserObjects4', 
            FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
            SIZE = 32768KB , 
            FILEGROWTH = 32768KB ) 
TO FILEGROUP [UserObjects]
GO

With the new database set up, the exact same test from the first demo can be run to see how proportional fill behaves when one data file is larger than the others but has the same free space.
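
Before running the test, you can confirm that the first file really is roughly twice the size of the others while all four have about the same free space.  This query is my own addition for illustration and relies on FILEPROPERTY reporting the number of used pages in each file; run it in the test database:

-- Optional check: size and free space of each data file in the database
SELECT name, 
    size/128 AS size_mb, 
    (size - FILEPROPERTY(name, 'SpaceUsed'))/128 AS free_space_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
GO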

[Image: pivoted write counts per file with one larger file but equal free space]

The impact of different free space amounts

Since proportional fill is based on free space, let’s look at the impact that having different free space in one file has on the writes that occur.  To set up this test, the database will be created with one file sized at 64MB and the remaining files sized at 32MB.

-- Delete target files from previous tests
EXECUTE sp_configure 'show advanced options', 1; RECONFIGURE;
EXECUTE sp_configure 'xp_cmdshell', 1; RECONFIGURE; 
EXEC xp_cmdshell 'DEL C:\SQLskills\MultipleDataFiles*';
EXECUTE sp_configure 'xp_cmdshell', 0; RECONFIGURE;
EXECUTE sp_configure 'show advanced options', 0; RECONFIGURE;
 
-- Drop the database from the server
USE [master]
GO
IF DB_ID('MultipleDataFiles') IS NOT NULL
BEGIN
    ALTER DATABASE [MultipleDataFiles] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [MultipleDataFiles];
END
GO
-- Create a multi-file database with one file larger than the others
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 65536KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects2', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects3', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
( NAME = N'MultipleDataFiles_UserObjects4', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB )
 LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO

The exact same test can be rerun, and when the event data is parsed, the 64MB file shows roughly twice as many write operations and page writes as the 32MB files, right in proportion to its free space.

[Image: pivoted write counts per file for the different free space test]
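
If you’d rather see the proportion directly than compare raw counts, the parsed results can be expressed as a share of the total page writes.  This is an addition of mine against the MultipleDataFileResultsParsed table from the earlier script, again assuming file_id 2 is the transaction log:

-- Express physical_page_write counts per data file as a share of the total
SELECT file_id,
    COUNT(*) AS page_writes,
    CAST(100.0 * COUNT(*) / SUM(COUNT(*)) OVER() AS decimal(5,2)) AS pct_of_total
FROM MultipleDataFileResultsParsed
WHERE event_name = 'physical_page_write'
  AND file_id <> 2
GROUP BY file_id
ORDER BY file_id;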

The impact of autogrowth

One question I had after beginning to look at this was what impact autogrowth would have on data files that were evenly sized and used the same autogrowth increment.  We could easily test this by increasing the number of executions of the insert operation from the original test to force the database to grow, but I was really interested in the impact over repeated growth operations, and being impatient, I wasn’t willing to wait for large insert operations to complete.  Instead, I went back and recreated the database using 8MB data files set to grow by 8MB, and then changed the GO 3 batch terminator for the INSERT in the test to GO 5 to test the impact of autogrowth.

-- Delete target files from previous tests
EXECUTE sp_configure 'show advanced options', 1; RECONFIGURE;
EXECUTE sp_configure 'xp_cmdshell', 1; RECONFIGURE; 
EXEC xp_cmdshell 'DEL C:\SQLskills\MultipleDataFiles*';
EXECUTE sp_configure 'xp_cmdshell', 0; RECONFIGURE;
EXECUTE sp_configure 'show advanced options', 0; RECONFIGURE;
 
-- Drop the database from the server
USE [master]
GO
IF DB_ID('MultipleDataFiles') IS NOT NULL
BEGIN
    ALTER DATABASE [MultipleDataFiles] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE [MultipleDataFiles];
END
GO
-- Create a multi-file database with small, evenly sized files to force autogrowth
CREATE DATABASE [MultipleDataFiles] ON  PRIMARY 
( NAME = N'MultipleDataFiles', 
    FILENAME = N'H:\SQLData\MultipleDataFiles.mdf' , 
    SIZE = 32768KB , 
    FILEGROWTH = 32768KB ), 
FILEGROUP [UserObjects] 
( NAME = N'MultipleDataFiles_UserObjects1', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects1.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB ), 
( NAME = N'MultipleDataFiles_UserObjects2', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects2.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB ), 
( NAME = N'MultipleDataFiles_UserObjects3', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects3.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB ), 
( NAME = N'MultipleDataFiles_UserObjects4', 
    FILENAME = N'H:\SQLData\MultipleDataFiles_UserObjects4.ndf' , 
    SIZE = 8192KB , 
    FILEGROWTH = 8192KB )
 LOG ON 
( NAME = N'MultipleDataFiles_log', 
    FILENAME = N'L:\SQLLogs\MultipleDataFiles_log.ldf' , 
    SIZE = 131072KB , 
    FILEGROWTH = 32768KB )
GO
ALTER DATABASE [MultipleDataFiles] 
MODIFY FILEGROUP [UserObjects] DEFAULT
GO

The outcome of the autogrowth tests at five iterations for the INSERT batch seemed odd to me because one of the files was lagging the other three significantly.

[Image: pivoted write counts per file after the GO 5 autogrowth test]

I wasn’t satisfied with the initial results from a couple of runs with five batch executions of the INSERT, so I decided to validate the results at larger scales: ten, twenty, and fifty iterations of the INSERT operation.

Using GO 10

[Image: pivoted write counts per file using GO 10]

Using GO 20

[Image: pivoted write counts per file using GO 20]

Using GO 50

[Image: pivoted write counts per file using GO 50]

At the end of each of these series of tests, the files were always within a single autogrowth increment of each other, so proportional fill is clearly keeping things relatively equal throughout the tests.  There is, however, the potential for hot spotting of a single data file when an autogrow occurs, at least until the other data files grow as well.  In these tests the autogrowth increments were kept small, primarily due to storage limitations on the SSD in my laptop and the configuration of the VM I was working on, but I am definitely going to make it a point to test this again at a later date using larger autogrowth increments to see what the longer-term impact is and whether the hot spots caused by autogrowth can impact performance significantly.  I have always espoused manual file size management, even in large environments, so that all of the files in a multi-file filegroup can be grown at the same time and situations like this can be avoided.
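
One way to see just how close the files ended up is to check their sizes after each test run.  This query is my own addition; because the files use a fixed-size growth increment, the growth column is reported in 8KB pages and can be converted to MB the same way as the size column:

-- Check the data file sizes after the autogrowth test; the UserObjects files
-- should be within one autogrowth increment (8MB in this test) of each other
USE [MultipleDataFiles]
GO
SELECT name, 
    size/128 AS size_mb, 
    growth/128 AS growth_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
GO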

So there you have it, evidence of how proportional fill functions inside of SQL Server.  Hope you found it interesting.

See you on the playground!
