TSQL Tuesday #11 – Misconceptions – The Tempdb Log File and VLF Counts

T-SQL Tuesday

This post is part of the monthly community event called T-SQL Tuesday started by Adam Machanic (blog|twitter) and hosted by someone else each month. This month the host is Sankar Reddy (blog|twitter) and the topic is Misconceptions in SQL Server. You can follow posts for this theme on Twitter by looking at #TSQL2sDay hashtag.

It is fairly common knowledge these days that proper Transaction Log management includes properly sizing the log file so that the number of VLF’s is balanced with their size for optimum performance.  If this is a foreign concept or news to you, I’d recommend that you jump over to SQLSkills and read Kimberly Tripp’s blog posts on the subject.

Transaction Log VLFs – too many or too few?

Back in July, Crys Manson (Blog|Twitter) asked about the importance of VLF count on the TempDB Log, and a conversation ensued about what the impact of tempdb recreation would be on the VLF count for the tempdb log file.

@crysmansonDoes # of VLFs matter as much with tempdb log? Recommend reading? If you grow out tempdb (for VLF) everything is lost on restart. #sqlhelp

@SQLSarg

@crysmanson if you just let tempdb autogrow out then it is reset to the configured size at startup #sqlhelp

@SQLSoldier

@SQLSarg So the question is will it create new VLF’s? #sqlhelp

Not knowing the answer, I did some quick testing to see, and found that when the instance restarts, the number of VLF’s inside the tempdb log file is reset based on the size of the file, following the information contained in Kimberly’s blog post.  If the initial size of the tempdb log file is 1GB or larger, it is evenly divided into 16 VLF’s regardless of how large the file and resulting VLF’s might be.  To validate this, I wrote a Powershell script to test various sizes for the tempdb log, following the prescribed best practices for VLF sizing from Kimberly’s post both before and after restart.

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.SMO")  | Out-Null

function ResetTest ([int]$size)
{
    $log.Refresh();

    if ($log.Size -gt $size)
    {
        $log.set_Size($size);
        $log.Alter();
        Restart-Service -Force -Name "MSSQLSERVER" -WarningAction SilentlyContinue |Out-Null
    }
}

function RunTest ([string]$testnumber, [int]$setsize, [int]$growsize)
{
    $log.Refresh();
    $size = $log.Size;
    while ($size -lt $setsize)
    {
        $log.set_Size(($size + $growsize));
        $log.Alter();
        $log.Refresh();
        $size = $log.Size;
    }
    
    $vlfbefore = $tempdb.ExecuteWithResults("DBCC LOGINFO").Tables[0].Rows.Count
    
    Restart-Service -Force -Name "MSSQLSERVER" -WarningAction SilentlyContinue | Out-Null
    
    Start-Sleep -Seconds 60 #Wait for Instance Recovery to Complete 
    
    $tempdb.Refresh();

    $vlfafter = $tempdb.ExecuteWithResults("DBCC LOGINFO").Tables[0].Rows.Count
        
    $results = New-Object PSObject -Property @{            
        TestNumber                = $testnumber                 
        LogFileSize               = $log.Size              
        VLFBeforeRestart         = $vlfbefore
        VLFAfterRestart          = $vlfafter}    

    $results | select TestNumber, LogFileSize, VLFBeforeRestart, VLFAfterRestart
}
        
cls

$smosvr = New-Object ('Microsoft.SqlServer.Management.Smo.Server') "(local)"
$tempdb = $smosvr.Databases['tempdb']
$log = $tempdb.LogFiles.Item(0);

ResetTest 8387584
RunTest "16GB Log" 16775168 8387584
ResetTest 8387584
RunTest "24GB Log" 25162752 8387584
ResetTest 8387584
RunTest "32GB Log" 33550336 8387584

Regardless of how you manually grew the log file to set appropriate VLF sizes, at server restart the log is always reset to 16 VLF’s of equal size, which is not necessarily ideal according to additional discussion that occurred on Twitter.  However, there is not much that you can do about this, aside from setting the log file to 8GB for its initial size and then setting a reasonable size for Autogrowth so that it can grow if necessary, and still minimize the VLF count which once again may not be ideal for all environments.

Bug in SQL 2008: Why you shouldn’t create large tables in the Model Database

If you follow me on Twitter (@SQLPoolBoy) you probably saw tweets about a bug I encountered in SQL Server 2008 recently.  What was hoped to be a normal Monday morning started off with alerts from a new production server that DBCC CHECKDB had failed for our new Sharepoint 2010 primary content database.  Thankfully, the Sharepoint 2010 portal was still in pre-production configuration and validation, so while I had a problem it wasn’t necessarily a major one.  When I started looking at the issue, one item that struck me as odd was, the CHECKDB job didn’t output error information when it was running against this database, it just failed, and terminated the connection, so in order to determine what exactly was wrong I had to rerun DBCC CHECKDB WITH ALL_ERRORMSGS, NO_INFOMSGS on the database again.  Since we moved the content database from our Dev/Test server where it was initially configured to the production server on Friday, I decided to run CHECKDB against the Dev/Test server while CHECKDB ran again on the production server so I could help determine if the problem pre-existed the migration, or was new to the production server.

Of course it being a Monday, both of the CHECKDB runs failed exactly the same way, without providing information as to where the corruption in the database existed.  So on the test server, I placed the database into SINGLE_USER mode and reran CHECKDB.  The output was similar to the following:

Msg 824, Level 24, State 2, Line 1

SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:691341; actual 0:0). It occurred during a read of page (1:691341) in database ID 25 at offset 0x0000015191a000 in file ‘D:\SQLData\MSSQL10.MSSQLSERVER\MSSQL\DATA\Test.mdf’. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

The same week that we moved this content database from our development to or anticipated production environment, we had a disk failure in our RAID 4 array, so I immediately suspected that we might have a problem with one of our other disks, so I ran disk diagnostics on the development server but found nothing wrong.  I was about to toss in the towel and admit failure as a DBA when I tried rebuilding indexes as a last hope to see if possible the corruption was index related and I ran into a Numbers table in the Sharepoint content Database.  Knowing that this isn’t supposed to be there, my memory kicked in and I realized I had created this in model as a part of my previous blog posts on Tempdb, Does tempdb Get Recreated From model at Startup and Does the tempdb Log file get Zero Initialized at Startup? As a followup to those posts I decided to get really stupid and create a massive Numbers table in model to see if I could significantly impact instance startup by doing something stupid, only I never got around to playing further since it took a while to create the Numbers table seeded with the maximum value for an interger column.

So I dropped it from the content database and immediately it became corruption free.  So I ran CHECKDB on model and it was fine, ran a SELECT * FROM Numbers and it returned from model.  Try it in another Sharepoint database and I got the exact same error.  I ran a reindex of the other Sharepoint database on the development server and it failed once again on the Numbers table, and once again, dropping this table from the database cleared the issue, so it is related to this oversized Numbers table that I created in model.

At this point I decided to create a connect feedback for the issue and solicit other members of the community for verification that this is in fact a problem inside of SQL Server 2008.  The details are available on the following link:

https://connect.microsoft.com/SQLServer/feedback/details/569843/database-corruption-from-model-database-with-large-table

I have been able to repro this on Windows Server 2008 R2 x64 and Windows Server 2008 x64with SQL Server 2008 x64 SP1 and SP1 + various CU’s installed, as well as on Windows Server 2008 x86 with SQL Server 2008 SP1 x86 installed.  However, I couldn’t repro it SQL Server 2005 on any platform.

Here is the entire repro script from the Connect Feedback (Be forewarned, this will break your SQL Instance until you drop the Numbers table from Model):

USE [master]
GO
ALTER DATABASE model SET RECOVERY FULL
GO
USE [model]
GO
— Suppress data loading messages
SET NOCOUNT ON
— Create Sample Data using a Table Varable
SELECT TOP 2147483647 IDENTITY(INT,0,1) AS N
INTO Numbers
FROM sysobjects a, sysobjects b, sysobjects c, sysobjects d, sysobjects e
— Create a Primary Key and Clustered Index to control data order
ALTER TABLE dbo.Numbers ADD CONSTRAINT
PK_Numbers PRIMARY KEY CLUSTERED (N)
USE master;
GO
ALTER DATABASE [model] SET RECOVERY SIMPLE
GO
USE [model]
GO
DBCC SHRINKFILE (N’modellog’ , 0, TRUNCATEONLY)
GO
USE [model]
GO
DBCC SHRINKFILE (N’modeldev’ , 6144)
GO
USE [master];
GO
CREATE DATABASE Test
GO
USE [Test]
GO
SELECT * FROM Numbers
GO

I also sent this to a good friend of mine at Microsoft, Bob Ward, who graciously took the time to run the repro script and validate that this is in fact a form of bug in SQL Server 2008.  He is still looking into it, but I would suspect that since it requires a 4GB+ table to be created in the model database, and the corruption problem has a valid workaround of dropping the table from model and any user database created based on the bloated model database, that this doesn’t necessarily warrant a Critical fix priority.  What is really interesting with this bug is that smaller tables, 1-2GB inside of model don’t trigger the same error.  I seems to be size based, with really large objects in model actually being the root cause of the problem.

I have been keeping in touch with Bob since he first replied to my email regarding this, and he is still looking into the reported bug.  One certainly has to question what kind of person would actually create a 4GB table in the model database of SQL Server, knowing that this object would be copied to every database created on the server after it was created (wait a minute, I actually did this).  As a note, the tempdb database on the server also reported corruption issues when it was restarted with this large object in the model database, and there was an impact to startup time by restarting the instance with this object in the model database (potentially a topic for another blog post as originally intended).

Does tempdb Get Recreated From model at Startup?

In my last post Does the tempdb Log file get Zero Initialized at Startup? I questioned whether or not tempdb is actually created from the model database or not at startup.  There is actually an easy way to prove that this statement, at least internally to the tempdb database is in fact TRUE.  Many thanks go out to Bob Ward (Blog | Twitter) for pointing this out after trading emails with him.

To validate that tempdb is actually copied at startup from the model database, all that is necessary is to make a change to the model database, and restart the SQL Server database instance to validate that the change is in fact reflected in the tempdb database after instance recovery completes.  To do this, I created a table in the model database and then created a row in the table using the following script:

USE [model]
GO
CREATE TABLE [SomeTable]
(RowID INT IDENTITY PRIMARY KEY)
GO
INSERT INTO [SomeTable] DEFAULT VALUES
GO
SELECT *
FROM [SomeTable]

Once this query has been run, and the instance has been restarted. to prove that the contents of the model database have in fact been copied into tempdb, all that is needed is to query the SomeTable table from tempdb. 

USE [tempdb]
GO
SELECT *
FROM [SomeTable]

This will return one row which has a RowID of 1 which matches the information previously created in model, proving that tempdb, at least internally is recreated from model, inline with the KB article previously mentioned (See: KB 307487).

This still leaves a few questions about how tempdb is actually allocated when the physical files already exist, but proves that my previous statement is false:

So that leaves me asking, does SQL Server really recreate tempdb from the model database every time it starts?  According to KB Article 307487 :

“When SQL Server starts, the tempdb is re-created by using a copy of the model database and is reset to its last configured size.”

It would certainly seem as if this statement is wrong based on testing.

The tempdb database is certainly created at least internally from a copy of the model database, but the question about how the physical file allocations occur still stands.