Parsing SQLIO Output to Excel Charts using Regex in PowerShell

Today Joe Webb (Blog|Twitter) blogged about The Power of Regex in Powershell, and in his post he shows how to parse the SQL Server Error Log for events of interest.  At the end of his blog post Joe asked about other places where Regular Expressions have been useful in PowerShell so I thought I’d blog my script for parsing SQLIO output using Regex in PowerShell, to populate an Excel worksheet and build charts based on the results automatically.

If you've never used SQLIO, Brent Ozar (Blog|Twitter) has an article on SQLServerPedia titled SAN Performance Tuning with SQLIO that includes a lot of information as well as a video tutorial on its use.  I have been using SQLIO quite a bit this year, mainly to gather before-and-after performance information when reconfiguring the storage of existing systems, so that I can show the business the percent impact of storage changes like migrating from RAID 5 to RAID 10, or aligning the disk partitions used for database and log file storage.  I have also used it to benchmark new SAN configurations, as well as the SAN configuration of our Virtual Machine environment, to show team members why I have recommended that a number of our physical servers be migrated onto Virtual Machines.

I generally use a param.txt file similar to:

G:\MSSQL\testfile1.dat 4 0x0 32768

which creates a 32GB test file.  I then have a DOS batch file that runs the actual SQLIO tests using a script like:

sqlio -kW -t16 -s10 -o8 -fsequential -b8 -BH -LS -Fparam.txt

sqlio -kW -t8 -s360 -o8 -frandom -b8 -BH -LS G:\MSSQL\testfile1.dat > post.txt
sqlio -kW -t8 -s360 -o8 -frandom -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -frandom -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -frandom -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -frandom -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt

sqlio -kR -t8 -s360 -o8 -frandom -b8 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -frandom -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -frandom -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -frandom -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -frandom -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt

sqlio -kW -t8 -s360 -o8 -fsequential -b8 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -fsequential -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -fsequential -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -fsequential -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kW -t8 -s360 -o8 -fsequential -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt

sqlio -kR -t8 -s360 -o8 -fsequential -b8 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -fsequential -b32 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -fsequential -b64 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -fsequential -b128 -BH -LS G:\MSSQL\testfile1.dat >> post.txt
sqlio -kR -t8 -s360 -o8 -fsequential -b256 -BH -LS G:\MSSQL\testfile1.dat >> post.txt

This runs the gamut of tests using 8, 32, 64, 128, and 256KB block sizes, with 8 threads and 8 outstanding IOs per thread.  You can certainly expand the test set much further than this and do a very exhaustive testing regimen; I am by no means saying that my tests above are the best fit, but they have sufficiently met my needs for the tuning I have been doing.  Once the tests have been run, the post.txt file will contain an output entry similar to the following for every test that was run.

sqlio v1.5.SG
using system counter for latency timings, 14318180 counts per second
16 threads writing for 60 secs to file d:\MSSQL\testfile1.dat
    using 8KB random IOs
    enabling multiple I/Os per thread with 8 outstanding
    buffering set to use hardware disk cache (but not file cache)
using current size: 2048 MB for file: d:\MSSQL\testfile1.dat
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 18527.91
MBs/sec:   144.74
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 6
Max_Latency(ms): 4900
histogram:
ms: 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 78  6  8  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  2

Linchi Shea (Blog) wrote a blog post three years ago that showed how to Parse the sqlio.exe Output using Perl.  I found this while searching for a smarter way to parse the long output files I was generating, short of doing it manually, but I don't have Perl installed anywhere, and I really felt that I should be able to do this natively using the tools readily available to me on my Windows 7 laptop, so I looked to PowerShell for a solution.

Reading the file data in is accomplished using Get-Content, which returns a collection of strings.  At first I tried casting this to a [string] object directly, but soon found out that when I did this the resulting string was missing the carriage returns and line feeds that existed in the original file.  This initially posed a couple of problems for parsing the outputs, but doing a [string]::Join operation using [Environment]::NewLine as the separator returned an appropriately CRLF-delimited string.

$filedata = [string]::Join([Environment]::NewLine,(Get-Content $FileName))

Now that I had my file data loaded into the $filedata variable, the next step was to split this string on the "sqlio v1.5.SG" header that is output at the beginning of each test run.  Since PowerShell is .NET based, this is easily accomplished by using the Split() method of the System.String object, which is the System.Type of the $filedata object.

$Results = $filedata.Split( [String[]]"sqlio v1.5.SG", [StringSplitOptions]::RemoveEmptyEntries )
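
Each element of $Results now holds the complete output of a single test run, so a quick sanity check is to confirm that the element count matches the number of sqlio commands in the batch file:

$Results.Count    # one entry per test run captured in post.txt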

At first I started to write a foreach loop over this object (let's face it, old habits die hard), but instead I made use of piping, which allows a collection like the one returned by the Split() method to be worked on in a single operation.  Piping in PowerShell is similar to a manufacturing process where an object gets transformed and then passed on until the ultimate result is achieved.  Once an object is piped, it becomes accessible through the $_ variable name, allowing the object to be transformed along the pipeline.  By piping the result of the Split() I was able to write a select statement that uses Regex to pull the information of interest out of each test run.

$Results = $Results |
    select @{Name="Threads"; Expression={[int]([regex]::Match($_, "(\d+)?\sthreads\s(reading|writing)").Groups[1].Value)}},`
           @{Name="Operation"; Expression={switch ([regex]::Match($_, "(\d+)?\sthreads\s(reading|writing)").Groups[2].Value)
                                           {
                                               "reading" {"Read"}
                                               "writing" {"Write"}
                                           }}},`
           @{Name="Duration"; Expression={[int]([regex]::Match($_, "for\s(\d+)?\ssecs").Groups[1].Value)}},`
           @{Name="IOSize"; Expression={[int]([regex]::Match($_, "\tusing\s(\d+)?KB\s(sequential|random)").Groups[1].Value)}},`
           @{Name="IOType"; Expression={switch ([regex]::Match($_, "\tusing\s(\d+)?KB\s(sequential|random)").Groups[2].Value)
                                           {
                                               "random" {"Random"}
                                               "sequential" {"Sequential"}
                                           }}},`
           @{Name="PendingIO"; Expression={[int]([regex]::Match($_, "with\s(\d+)?\soutstanding").Groups[1].Value)}},`
           @{Name="FileSize"; Expression={[int]([regex]::Match($_, "\s(\d+)?\sMB\sfor\sfile").Groups[1].Value)}},`
           @{Name="IOPS"; Expression={[decimal]([regex]::Match($_, "IOs\/sec\:\s+(\d+\.\d+)?").Groups[1].Value)}},`
           @{Name="MBs_Sec"; Expression={[decimal]([regex]::Match($_, "MBs\/sec\:\s+(\d+\.\d+)?").Groups[1].Value)}},`
           @{Name="MinLat_ms"; Expression={[int]([regex]::Match($_, "Min.{0,}?\:\s(\d+)?").Groups[1].Value)}},`
           @{Name="AvgLat_ms"; Expression={[int]([regex]::Match($_, "Avg.{0,}?\:\s(\d+)?").Groups[1].Value)}},`
           @{Name="MaxLat_ms"; Expression={[int]([regex]::Match($_, "Max.{0,}?\:\s(\d+)?").Groups[1].Value)}}

The select returns a new type of object that has properties named Threads, Operation, Duration, IOSize, IOType, PendingIO, FileSize, IOPS, MBs_Sec, MinLat_ms, AvgLat_ms, and MaxLat_ms.  This alone is sufficient to proceed to creating an Excel file from the information, but I wanted the information sorted the same way every time I ran this, primarily because I sometimes change how I run my tests, and I like consistency in the output.  Since I am so anal retentive thorough, I fixed this by piping the output from the select to the Sort-Object cmdlet, sorting by IOSize, IOType, Operation, and Threads.
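
In the full script this sort is just one more step appended to the pipeline, along these lines:

# Keep the output ordering consistent from run to run
$Results = $Results | Sort-Object IOSize, IOType, Operation, Threads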

Getting the data into Excel was actually fairly simple to do, thanks to Buck Woody (Blog|Twitter) and SQL Rockstar Tom LaRock (Blog|Twitter).  Buck wrote a blog post titled Create an Excel Graph of your Big Tables – With Powershell, and Tom wrote a Simple Talk article, Monitoring SQL Server Virtual Log File Fragmentation, both of which use the Interop Excel objects to create an Excel workbook from PowerShell, populate the workbook with data, and build charts using the data.  My code for the Excel integration is based entirely on their examples.

# Create a visible Excel instance with a new workbook
$Excel = New-Object -ComObject Excel.Application
$Excel.Visible = $true
$WorkBook = $Excel.WorkBooks.Add()
# Name the first worksheet and delete the two extra default worksheets
$WorkBook.WorkSheets.Item(1).Name = "RawData"
$WorkBook.WorkSheets.Item(3).Delete()
$WorkBook.WorkSheets.Item(2).Delete()
$WorkSheet = $WorkBook.WorkSheets.Item("RawData")
# Row 1 holds the headers, so the data rows start at row 2
$x = 2
$WorkSheet.Cells.Item(1,1) = "Threads"
$WorkSheet.Cells.Item(1,2) = "Operation"
$WorkSheet.Cells.Item(1,3) = "Duration"
$WorkSheet.Cells.Item(1,4) = "IOSize"
$WorkSheet.Cells.Item(1,5) = "IOType"
$WorkSheet.Cells.Item(1,6) = "PendingIO"
$WorkSheet.Cells.Item(1,7) = "FileSize"
$WorkSheet.Cells.Item(1,8) = "IOPS"
$WorkSheet.Cells.Item(1,9) = "MBs/Sec"
$WorkSheet.Cells.Item(1,10) = "Min_Lat(ms)"
$WorkSheet.Cells.Item(1,11) = "Avg_Lat(ms)"
$WorkSheet.Cells.Item(1,12) = "Max_Lat(ms)"
$WorkSheet.Cells.Item(1,13) = "Caption"

# % is the alias for ForEach-Object; write one worksheet row per test result
$Results | % {
    $WorkSheet.Cells.Item($x,1) = $_.Threads
    $WorkSheet.Cells.Item($x,2) = $_.Operation
    $WorkSheet.Cells.Item($x,3) = $_.Duration
    $WorkSheet.Cells.Item($x,4) = $_.IOSize
    $WorkSheet.Cells.Item($x,5) = $_.IOType
    $WorkSheet.Cells.Item($x,6) = $_.PendingIO
    $WorkSheet.Cells.Item($x,7) = $_.FileSize
    $WorkSheet.Cells.Item($x,8) = $_.IOPS
    $WorkSheet.Cells.Item($x,9) = $_.MBs_Sec
    $WorkSheet.Cells.Item($x,10) = $_.MinLat_ms
    $WorkSheet.Cells.Item($x,11) = $_.AvgLat_ms
    $WorkSheet.Cells.Item($x,12) = $_.MaxLat_ms
    $WorkSheet.Cells.Item($x,13) = [string]$_.IOSize + "KB " + [string]$_.IOType + " " + `
                                [string]$_.Operation + " " + [string]$_.Threads + `
                                " Threads " + [string]$_.PendingIO + " pending"
    $x++}

$WorkBook.Charts.Add() | Out-Null
$Chart = $WorkBook.ActiveChart
# SetSourceData() and xValues are intentionally set twice; see the note below
$Chart.SetSourceData($WorkSheet.Range("H1:H$x"))
$Chart.SeriesCollection(1).xValues = $WorkSheet.Range("M2:M$x")
$Chart.SetSourceData($WorkSheet.Range("H1:H$x"))
$Chart.SeriesCollection(1).xValues = $WorkSheet.Range("M2:M$x")
$Chart.Name = "IOPS"

$WorkBook.Charts.Add() | Out-Null
$WorkBook.ActiveChart.SetSourceData($WorkSheet.Range("I1:I$x"))
$Chart = $WorkBook.ActiveChart
$Chart.SeriesCollection(1).xValues = $WorkSheet.Range("M2:M$x")
$Chart.Name = "MBs Sec"

I don't know why, but the first chart doesn't refresh properly unless I call SetSourceData() and set the SeriesCollection xValues property twice.  Since the code block that loads the data into the worksheet tracks the rows with the $x variable, the Range for each chart is always row 1 to $x.  The H column happens to be the column that holds the IOPS values, and the I column holds the MBs/Sec values in the workbook.  The only thing missing at this point is a parameter declaration at the head of the script file to require an input parameter, $FileName, which is passed to Get-Content to read the information.

param( [Parameter(Mandatory=$TRUE)]
       [ValidateNotNullOrEmpty()]
       [string]
       $FileName)
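
Running the script against one of the SQLIO output files captured by the batch file then looks like:

.\SQLIOResults.ps1 -FileName .\post.txt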

The complete script is attached to this blog post.  If you use it, let me know your thoughts.

SQLIOResults.ps1 (4.98 kb)

Does tempdb Get Recreated From model at Startup?

In my last post, Does the tempdb Log file get Zero Initialized at Startup?, I questioned whether or not tempdb is actually created from the model database at startup.  There is actually an easy way to prove that this statement, at least as far as the internal contents of the tempdb database are concerned, is in fact TRUE.  Many thanks go out to Bob Ward (Blog | Twitter) for pointing this out while trading emails with me.

To validate that tempdb is actually copied from the model database at startup, all that is necessary is to make a change to the model database and then restart the SQL Server instance to validate that the change is in fact reflected in the tempdb database after instance recovery completes.  To do this, I created a table in the model database and then inserted a row into the table using the following script:

USE [model]
GO
CREATE TABLE [SomeTable]
(RowID INT IDENTITY PRIMARY KEY)
GO
INSERT INTO [SomeTable] DEFAULT VALUES
GO
SELECT *
FROM [SomeTable]

Once this query has been run and the instance has been restarted, proving that the contents of the model database have in fact been copied into tempdb requires nothing more than querying the SomeTable table in tempdb.

USE [tempdb]
GO
SELECT *
FROM [SomeTable]

This returns one row with a RowID of 1, matching the information previously created in model and proving that tempdb, at least internally, is recreated from model, in line with the KB article previously mentioned (see KB 307487).

This still leaves a few questions about how tempdb is actually allocated when the physical files already exist, but it proves that my previous statement is false:

So that leaves me asking, does SQL Server really recreate tempdb from the model database every time it starts?  According to KB Article 307487:

“When SQL Server starts, the tempdb is re-created by using a copy of the model database and is reset to its last configured size.”

It would certainly seem as if this statement is wrong based on testing.

The tempdb database is certainly created at least internally from a copy of the model database, but the question about how the physical file allocations occur still stands.

Does the tempdb Log file get Zero Initialized at Startup?

While working on a problem today, I happened to think about what the impact to startup might be for a really large tempdb transaction log file.  It's fairly common knowledge that data files in SQL Server 2005+ on Windows Server 2003+ can be instant initialized, but transaction log files cannot.  If this is news to you, see the following blog posts:

Kimberly L. Tripp | Instant Initialization – What, Why and How?
In Recovery… | Misconceptions around instant file initialization
In Recovery… | Search Engine Q&A #24: Why can’t the transaction log use instant initialization?
Tibor Karaszi : Do you have Instant File Initialization?

The thought occurred to me today that despite having log files 8GB in size for tempdb, I've never really noticed SQL Server taking that long to start up.  So I jumped on Twitter and shot a tweet out to Paul Randal (Blog | Twitter), including the #sqlhelp hashtag to see what others in the community thought.  I got a couple of comments: one linking me to Paul Randal's blog post, another saying to test it, and another saying the transaction log is always zero initialized.

When I got home tonight I was still thinking about this, and as I went to test it, I remembered that I had once read a blog post about a trace flag that would output information about zero file initialization.  A quick search found the post on the Premier Field Engineers Blog.  So I jumped on a test system I have and added the -T3004 and -T3605 trace flags to the startup parameters.  The PFE blog provided the following disclaimer about using these trace flags, so I will as well.

WARNING: These trace flags should be used under the guidance of Microsoft SQL Server support. They are used in this post for discussion purposes only and may not be supported in future versions.
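
For reference, the trace flags can be added to the startup parameters through SQL Server Configuration Manager, or, for a quick test against a default instance, passed when starting the service from an elevated prompt, something like:

# Restart the default instance with both trace flags enabled for this start only
net stop MSSQLSERVER
net start MSSQLSERVER /T3004 /T3605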

After setting the flags, I checked tempdb, which was currently set up with a 4GB transaction log file.  With this information in hand, I restarted the instance, and once it was online I opened the ErrorLog to look at what -T3004 could tell me about tempdb log file initialization.

2010-05-13 18:42:13.52 spid12s     Clearing tempdb database.
<……….skipped content…………>
2010-05-13 18:42:30.93 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 2 to 17 (0x4000 to 0x22000)
2010-05-13 18:42:30.93 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf
2010-05-13 18:42:30.93 spid12s     Starting up database 'tempdb'.
2010-05-13 18:42:30.96 spid12s     FixupLogTail(progress) zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from 0x5000 to 0x6000.
2010-05-13 18:42:30.96 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 3 to 483 (0x6000 to 0x3c6000)
2010-05-13 18:42:30.96 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf

Hmm, so the log file is zeroed for tempdb, but it doesn't take a rocket scientist to notice that the page count being zeroed out (17 - 2 = 15 pages, or a mere 120KB at 8KB per page) doesn't add up to 4GB of space.  To check this, I created a new user database with an 8GB data file and a 4GB log file to see the output for zeroing out a 4GB transaction log.

2010-05-13 18:45:42.61 spid54      Zeroing L:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\ZeroLog_log.ldf from page 0 to 524288 (0x0 to 0x100000000)
2010-05-13 18:46:20.92 spid54      Zeroing completed on L:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\ZeroLog_log.ldf
2010-05-13 18:46:36.35 spid54      Starting up database 'ZeroLog'.
2010-05-13 18:46:36.36 spid54      FixupLogTail(progress) zeroing L:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\ZeroLog_log.ldf from 0x5000 to 0x6000.
2010-05-13 18:46:36.36 spid54      Zeroing L:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\ZeroLog_log.ldf from page 3 to 483 (0x6000 to 0x3c6000)
2010-05-13 18:46:36.37 spid54      Zeroing completed on L:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\ZeroLog_log.ldf

So a user database with a 4GB log file zeroes out 524288 pages, which at 8KB per page is the full 4GB of space.  I got a private message on Twitter about the topic from Remus Rusanu (Blog | Twitter), telling me that while the entire file isn't initialized at startup, if you use ALTER DATABASE to grow the size of the tempdb log, the space you grow by will be entirely zero initialized.  To test that, I grew the log file out to 8GB in size:

ALTER DATABASE [tempdb] MODIFY FILE ( NAME = N'templog', SIZE = 8388608KB )

When I ran this, it completed immediately, so I jumped over to the ErrorLog to pull the results and was momentarily confused by what I got back.

2010-05-13 19:20:24.02 spid57      Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 524288 to 524319 (0x100000000 to 0x10003e000)
2010-05-13 19:20:24.02 spid57      Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf

It only initialized 31 pages??  Can that be correct?  So I went back and checked my log file's actual size, and I remembered the little problem mentioned by Kimberly Tripp in her blog post Transaction Log VLFs – too many or too few? that occurs when you grow a log file in 4GB increments.  So I reran the ALTER DATABASE statement, and this time it took a minute for the command to complete.  Much better; now I knew some zeroing out of the file had occurred, and the ErrorLog output proved it:

2010-05-13 19:21:17.47 spid57      Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 524319 to 1048576 (0x10003e000 to 0x200000000)
2010-05-13 19:21:49.54 spid57      Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf

Much better.  So now the question begging to be asked is: why isn't the log file zero initialized in its entirety at startup?  I can't speak with complete certainty on this, but I would guess that it has to do with the fact that the tempdb transaction log is never used for crash recovery, so it doesn't really matter that the space isn't zero initialized.  Paul Randal explains in his blog post, In Recovery… | Search Engine Q&A #24: Why can't the transaction log use instant initialization?, how the parity bits are used during crash recovery to identify where recovery should stop processing log records.  Perhaps full zero initialization is skipped for the tempdb log at startup because the log is never used for crash recovery.  That doesn't explain why the log does perform full zero initialization when grown, though, unless it is due to the way the log can wrap around: based on the VLF Usage After Log Reuse example, if the log space continues to be used without truncation, then when the log gets back to FSeqNo 29 the log will have to grow, causing the allocation to become non-sequential since FSeqNo 30-35 are still active.  Maybe someone else can explain better why the tempdb log has to be zero initialized when grown but not at startup.

EDIT:

After posting this, I jumped back over to Twitter and saw some interesting comments from Brent Ozar (Blog | Twitter) that made me go back and test the impact of deleting the tempdb files from disk and then starting the instance up.  When I did this, the entire file was zero initialized:

2010-05-13 20:21:06.21 spid12s     Clearing tempdb database.
2010-05-13 20:21:06.23 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 0 to 1048576 (0x0 to 0x200000000)
<……….skipped content…………>
2010-05-13 20:22:22.81 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf
2010-05-13 20:22:22.84 spid12s     Starting up database 'tempdb'.
2010-05-13 20:22:22.85 spid12s     FixupLogTail(progress) zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from 0x5000 to 0x6000.
2010-05-13 20:22:22.85 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 3 to 483 (0x6000 to 0x3c6000)
2010-05-13 20:22:22.86 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf

So the answer is YES, the tempdb transaction log is completely initialized when it is first physically created, but after that it is not zero initialized in its entirety as the instance starts up.  There was a definite difference in startup times on my server when I deleted the files.

With Files

2010-05-13 18:42:12.54 Server      Microsoft SQL Server 2008 (SP1) - 10.0.2757.0 (X64)
2010-05-13 18:42:31.04 spid8s      Recovery is complete. This is an informational message only. No user action is required.

Without Files

2010-05-13 20:21:05.49 Server      Microsoft SQL Server 2008 (SP1) - 10.0.2757.0 (X64)
2010-05-13 20:22:22.97 spid9s      Recovery is complete. This is an informational message only. No user action is required.

When the files preexist, instance startup only took 19 seconds; without the files it took 1 minute and 17 seconds.  This actually makes the zero initialization during log growth make sense.  I'd say a fair bit of my misunderstanding of this comes from the way tempdb is so often referred to as being recreated at restart.  It's not actually recreated based on these tests, but it is cleared, as shown by the "Clearing tempdb database." log entries.  Interesting stuff.

ANOTHER EDIT:

So after thinking about this some more, curiosity got the best of me, and I wanted to know what would happen if the size of the tempdb log file on disk was different from the size configured for tempdb.  To test this, I used ALTER DATABASE to change the size of the log file from 8GB to 1GB and restarted SQL Server.  This reduced the size of the log file at startup from 8GB to 1GB, and the log showed that it didn't zero initialize the entire file:

2010-05-13 21:20:45.92 spid12s     Clearing tempdb database.

<……….skipped content…………>
2010-05-13 21:21:02.59 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 2 to 17 (0x4000 to 0x22000)
2010-05-13 21:21:02.59 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf
2010-05-13 21:21:02.62 spid12s     Starting up database 'tempdb'.
2010-05-13 21:21:02.63 spid12s     FixupLogTail(progress) zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from 0x5000 to 0x6000.
2010-05-13 21:21:02.63 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 3 to 483 (0x6000 to 0x3c6000)
2010-05-13 21:21:02.64 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf

In addition to this, the startup time was fast, taking only 17 seconds to complete recovery of the instance.

2010-05-13 21:20:45.23 Server      Microsoft SQL Server 2008 (SP1) - 10.0.2757.0 (X64)
2010-05-13 21:21:02.78 spid9s      Recovery is complete. This is an informational message only. No user action is required.

So I shut down SQL Server and renamed the log file to templog_small.ldf, and then restarted SQL.  Once recovery completed, I grew the transaction log back out to 8GB and once again shut down SQL Server.  Then I renamed the current 8GB log file to templog_big.ldf and renamed templog_small.ldf back to templog.ldf, replacing the 8GB log file with the 1GB log file.  SQL Server recognized the change during startup and once again zero initialized the entire log.
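
For reference, the swap itself is just a couple of renames while the instance is stopped; a rough PowerShell sketch, assuming the data path shown in the ErrorLog entries above and a default instance:

# Stop the instance so the tempdb log file can be swapped out
net stop MSSQLSERVER

# Swap the 8GB log out for the previously saved 1GB copy
$dataPath = 'T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data'
Rename-Item (Join-Path $dataPath 'templog.ldf') 'templog_big.ldf'
Rename-Item (Join-Path $dataPath 'templog_small.ldf') 'templog.ldf'

# At startup SQL Server detects the difference and zeroes the log back out
net start MSSQLSERVER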

2010-05-13 21:25:11.72 spid12s     Clearing tempdb database.
2010-05-13 21:25:11.73 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 0 to 1048576 (0x0 to 0x200000000)
<……….skipped content…………>
2010-05-13 21:26:17.95 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf
2010-05-13 21:26:28.05 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 2 to 17 (0x4000 to 0x22000)
2010-05-13 21:26:28.05 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf
2010-05-13 21:26:28.07 spid12s     Starting up database 'tempdb'.
2010-05-13 21:26:28.09 spid12s     FixupLogTail(progress) zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from 0x5000 to 0x6000.
2010-05-13 21:26:28.09 spid12s     Zeroing T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf from page 3 to 483 (0x6000 to 0x3c6000)
2010-05-13 21:26:28.10 spid12s     Zeroing completed on T:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\Data\templog.ldf

This once again impacted the time required to recover the instance.

2010-05-13 21:25:11.01 Server      Microsoft SQL Server 2008 (SP1) - 10.0.2757.0 (X64)
2010-05-13 21:26:28.23 spid9s      Recovery is complete. This is an informational message only. No user action is required.

So that leaves me asking, does SQL Server really recreate tempdb from the model database every time it starts?  According to KB Article 307487:

“When SQL Server starts, the tempdb is re-created by using a copy of the model database and is reset to its last configured size.”

It would certainly seem as if this statement is wrong based on testing.  What I find most interesting is that if I reverse the process and replace a 1GB log file with a previously created 8GB log file, SQL Server doesn't zero initialize the 8GB file; it just shrinks it back to 1GB.  It only performs the zero initialization as part of having to grow a preexisting log file during instance startup.