SQL Server 2016 Upgrade Testing with the New Cardinality Estimator: Context Matters

This week I’ve been working with SQL Server 2016 and it’s been a lot of fun – this is truly a great release from the SQL Server team.  Yesterday I was working on some upgrade testing with regard to the new Cardinality Estimator (CE), introduced in SQL Server 2014 and well explained in this whitepaper written by my friend Joe Sack.  If you are upgrading to SQL Server 2014 or 2016, you should test not just your upgrade process, but also the performance of your queries with the new CE.  In some cases we have seen good/great improvements in query performance with the new CE, but in other cases we have seen significant regressions which dramatically affect overall system performance.

There are a variety of things to look for when testing beyond just query duration and resource use – you should also be looking at estimates and plan shape, which means capturing the execution plan.  I was doing just that yesterday, and in my testing I was changing the compatibility level for the user database and running my queries with the different CEs.  I was then confirming the CE version used for the query by looking at the plan and checking the CardinalityEstimationModelVersion attribute.  Remember that the CE version is tied to the compatibility level, and the CE that will be used is determined by the current database context.  So if you’re running cross-database or distributed queries, this is something to which you need to pay attention.
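If you’re checking a lot of queries, you don’t have to open each plan by hand – you can also shred the plan XML in the plan cache for that attribute.  This is just a sketch (not from the original post), and it assumes the plans you care about are still in cache:

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (25)
	[qp].[query_plan].value('(//StmtSimple/@CardinalityEstimationModelVersion)[1]', 'INT') AS [CE_Version],
	[st].[text] AS [QueryText]
FROM [sys].[dm_exec_query_stats] AS [qs]
CROSS APPLY [sys].[dm_exec_query_plan]([qs].[plan_handle]) AS [qp]
CROSS APPLY [sys].[dm_exec_sql_text]([qs].[sql_handle]) AS [st]
ORDER BY [qs].[last_execution_time] DESC;
GO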

Consider this scenario…

I have a new installation of SQL Server 2016, and I restore my user database to the instance.  By default the compatibility level is not changed, unless the database you’re restoring has a compatibility level of 90 (remember you can upgrade from SQL Server 2005+ to any other version).  In that case, it will automatically be bumped up to 100 for both SQL Server 2014 and SQL Server 2016.  For more details on supported compatibility levels for each version and the differences between the levels, see ALTER DATABASE Compatibility Level.

Because this is a new installation of SQL Server 2016, the system databases have compatibility level of 130.  I restore two user databases to my instance (previously running on SQL Server 2012), call them AdventureWorks and WideWorldImporters.  I leave the compatibility level for AdventureWorks at 110, and set it to 130 for WideWorldImporters.  I then start running queries, some of which are specific to a database, others which query both databases.

In the case where I’m in the context of AdventureWorks, the old CE is used – even when I query the WideWorldImporters database.  And if I’m in the WideWorldImporters database, querying across to AdventureWorks, the new CE is used.  Here’s the setup code:

/*
	Restore the databases
	(change locations as appropriate)
*/
USE [master];
GO

RESTORE DATABASE [WideWorldImporters]
	FROM  DISK = N'C:\Backups\WideWorldImporters-Full.bak'
	WITH  FILE = 1,
	MOVE N'WWI_Primary' TO N'C:\Databases\WideWorldImporters\WideWorldImporters.mdf',
	MOVE N'WWI_UserData' TO N'C:\Databases\WideWorldImporters\WideWorldImporters_UserData.ndf',
	MOVE N'WWI_Log' TO N'C:\Databases\WideWorldImporters\WideWorldImporters.ldf',
	MOVE N'WWI_InMemory_Data_1' TO N'C:\Databases\WideWorldImporters\WideWorldImporters_InMemory_Data_1',
	NOUNLOAD,
	REPLACE,
	STATS = 5;
GO

RESTORE DATABASE [AdventureWorks2016]
	FROM  DISK = N'C:\Backups\AW2014_Base.bak'
	WITH  FILE = 1,
	MOVE N'AdventureWorks2014_Data' TO N'C:\Databases\AdventureWorks2016\AdventureWorks2016_Data.mdf',
	MOVE N'AdventureWorks2014_Log' TO N'C:\Databases\AdventureWorks2016\AdventureWorks2016_Log.ldf',
	NOUNLOAD,
	REPLACE,
	STATS = 5;
GO

/*
	Set the compatibility levels
*/
USE [master];
GO
ALTER DATABASE [WideWorldImporters] SET COMPATIBILITY_LEVEL = 130;
GO
ALTER DATABASE [AdventureWorks2016] SET COMPATIBILITY_LEVEL = 110;
GO

First, run the query in the context of WideWorldImporters, with the execution plan displayed:

USE [WideWorldImporters];
GO

SET STATISTICS XML ON;
GO

SELECT
	[o].[CustomerID],
	[c].[StoreID],
	[o].[OrderDate],
	[ol].[StockItemID],
	[ol].[Quantity],
	[ol].[UnitPrice]
FROM [WideWorldImporters].[Sales].[Orders] [o]
JOIN [WideWorldImporters].[Sales].[OrderLines] [ol] ON [o].[OrderID] = [ol].[OrderID]
JOIN [AdventureWorks2016].[Sales].[Customer] [c] ON [c].[CustomerID] = [o].[CustomerID]
WHERE [o].[OrderDate] BETWEEN '2016-05-01' AND '2016-05-31'
ORDER BY [o].[OrderDate] DESC;
GO

SET STATISTICS XML OFF;
GO

If you open the graphical plan, display the Properties window (F4), and then click on the SELECT operator in the plan, you can see that the CardinalityEstimationModelVersion is 130 (you can also view the plan XML and search for the attribute there):

Execution plan using the new CE

Now run the query in the context of AdventureWorks, again with the execution plan displayed:

USE [AdventureWorks2016];
GO

SET STATISTICS XML ON;
GO

SELECT
	[o].[CustomerID],
	[c].[StoreID],
	[o].[OrderDate],
	[ol].[StockItemID],
	[ol].[Quantity],
	[ol].[UnitPrice]
FROM [WideWorldImporters].[Sales].[Orders] [o]
JOIN [WideWorldImporters].[Sales].[OrderLines] [ol] ON [o].[OrderID] = [ol].[OrderID]
JOIN [AdventureWorks2016].[Sales].[Customer] [c] ON [c].[CustomerID] = [o].[CustomerID]
WHERE [o].[OrderDate] BETWEEN '2016-05-01' AND '2016-05-31'
ORDER BY [o].[OrderDate] DESC;
GO

SET STATISTICS XML OFF;
GO

This time when you look at the graphical plan, the CardinalityEstimationModelVersion is 70, indicating it’s using the old CE:

Execution plan using the old CE

In this example, I have users connecting to both databases, and they can be querying one database or both.  Note that if you happen to keep users in the context of one database (e.g. a user database that has no data in it, but which you use as a “gateway” to other databases), then the compatibility level for that database will determine which CE the queries use.  The exception to all this, of course, is the CE-related trace flags.  You can override the compatibility level and force the optimizer to use a specific CE by using one of two trace flags:

  • Use Trace Flag 9481 to revert to the legacy CE behavior from the context of a database with a compatibility level of 120 or higher
  • Use Trace Flag 2312 to enable the new CE from the context of a database with a compatibility level below 120
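Both trace flags can also be applied at the query level with the QUERYTRACEON hint, rather than enabling them instance-wide.  Here’s a sketch (not from the original post) of forcing the legacy CE for a single query while in the WideWorldImporters (130) context:

USE [WideWorldImporters];
GO

SELECT
	[o].[CustomerID],
	[o].[OrderDate],
	[ol].[StockItemID],
	[ol].[Quantity]
FROM [Sales].[Orders] [o]
JOIN [Sales].[OrderLines] [ol] ON [o].[OrderID] = [ol].[OrderID]
WHERE [o].[OrderDate] BETWEEN '2016-05-01' AND '2016-05-31'
OPTION (QUERYTRACEON 9481);	/* force the legacy CE for this query only */
GO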

Therefore, when you’re testing your upgrade to SQL Server 2014 or 2016, take the time to look beyond the basics.  Upgrading is not just about making sure nothing breaks, it’s also about making sure that performance doesn’t degrade, even if you don’t change hardware (especially if you don’t change hardware!).  We don’t typically expect that upgrading to a newer software version would negatively affect performance, but because the change to SQL Server’s Cardinality Estimator is significant, this is one you definitely want to test.

 

Mapping Events from Trace to Extended Events

Every time I talk about migrating from Profiler and Trace to Extended Events (XE), I seem to add something to my presentation.  It’s not always intentional, and I don’t know if it’s because I just can’t stop tweaking my demos and content, or something else, but here in London at IEPTO2 this week, I added a query that helps you see which event in Extended Events is comparable to the event you’re used to using in Trace/Profiler.  While most of the events in XE have a name similar to the event in Trace (e.g. sp_statement_completed for SP:StmtCompleted), some mappings aren’t so intuitive.  For example, SP:Starting in Trace is module_start in XE, and SP:Completed in Trace is module_end in XE.  That’s not so bad, but if you’re ever monitoring file growths, note that the database_file_size_change event in XE is the single event that covers the following four events in Trace: Data File Auto Grow, Data File Auto Shrink, Log File Auto Grow, and Log File Auto Shrink.

(Note: if you happen to notice databases_data_file_size_changed or databases_log_file_size_changed in the list of XE, do not use either event.  Both are deprecated and won’t return any information.  Use database_file_size_change.)
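As an example of putting that event to work, here’s a minimal event session sketch for capturing file size changes (not from the original post – the session name and file target path are placeholders, so adjust them for your environment):

CREATE EVENT SESSION [TrackFileSizeChanges] ON SERVER
ADD EVENT sqlserver.database_file_size_change
(
	ACTION (sqlserver.database_name, sqlserver.sql_text)
)
ADD TARGET package0.event_file (SET filename = N'C:\XE\TrackFileSizeChanges.xel')
WITH (STARTUP_STATE = ON);
GO

ALTER EVENT SESSION [TrackFileSizeChanges] ON SERVER STATE = START;
GO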

Anyway, those fun one-offs aside, you’ll see that the query and its output are pretty straightforward:

SELECT 
	[t].[trace_event_id] [TraceEventID], 
	[t].[name] [Trace_Event], 
	CASE	
		WHEN [t].[name] LIKE '%Audit%' AND [xe].[xe_event_name] IS NULL THEN 'This event is not implemented in Extended Events, it may be a Server Audit Event'
		WHEN ([t].[name] LIKE '%Audit Lo%' OR [t].[name] LIKE '%Audit Server%') AND [xe].[xe_event_name] IS NOT NULL THEN 'The XE Event [' + [xe].[xe_event_name] + '] exists, but use Server Audit for this event instead'
		WHEN [xe].[xe_event_name] IS NULL THEN 'No comparable XE Event'
		ELSE [xe].[xe_event_name]
	END	[XE_Event],
	[xe].[package_name] [XE_Package]
FROM [sys].[trace_events] [t] 
LEFT OUTER JOIN [sys].[trace_xe_event_map] [xe] ON [t].[trace_event_id] = [xe].[trace_event_id]
ORDER BY [t].[name] ;
GO 

You can run this on SQL Server 2012 or higher.  You could run it on SQL Server 2008/2008R2, but remember that there was not a comparable event in Extended Events for every event in Trace until SQL Server 2012.  Thus you will have some unmapped events in the 2008/2008R2 releases.

Something that you’ll notice in the query: I’m accounting for Audit events.  The  Audit Feature was introduced in SQL Server 2008 but was Enterprise Edition only.  In SQL Server 2012, Server Audit is supported in all editions, so all Audit events from Trace should instead be captured with Server Audit (which leverages Extended Events behind the scenes).

Finally, don’t forget that if you’re looking to take an existing Trace and convert it to Extended Events, you can use Jonathan’s stored procedure to get the event session DDL.

Happy Extended Eventing!

 

SQL Trace versus Extended Events: How long does it take to startup a trace vs. an event session?

Yesterday I asked how everyone uses Trace and Profiler…  Jonathan and I have talked about creating a library of Extended Events scripts to help people migrate off of Profiler and Trace, and that post was designed to generate some ideas about what people might find useful.

I haven’t gotten a ton of comments, but I did get a few (thank you to those who have responded!), and I decided to take one of them, create both a Trace and an Extended Events session, and see how long each took.  Jonathan has mentioned before that he can create an XE session as fast as a Trace, and I’ve been thinking that I can as well, so I thought I’d test it.  It’s a straightforward Trace versus Extended Events test.  Want to see which is faster?  Watch the video here.

 

Tell me how you use Profiler and Trace

Back in April I wrote a post asking why people tend to avoid Extended Events.  Many of you provided feedback, which I greatly appreciated.  A common theme in the responses was time.  Many people found that creating an event session in Extended Events that was comparable to one they would create in Trace took longer.  Ok, I get that…so as a follow-up I am interested in knowing how you typically use Profiler and Trace.  If you would be willing to share that information, I would be extremely grateful.  At a minimum, I’d like to know what events you add, what columns you select for those events, what filters you add, and then how you view the data (e.g. do you run Profiler and watch the data live, or do you script it out as a server-side trace and write it to a file, then view it later for analysis?).  If it’s easiest to post the trace definition (i.e. export it from the Profiler UI), that works too.  Again, any information you can share in the comments (or if you want to send me an email, that’s great too) would be greatly appreciated.  My goal is to create a library of event session definitions for these “typical” traces that people create, and share them publicly.  Stay tuned for another post once I (hopefully!) start getting some information.  Thanks again!

SQL Server Query Store: Default Settings

Last week I wrote a quick post about storage settings for Query Store.  Today I’m going to take a moment to discuss the default settings that are applied when you initially enable Query Store.  I’m not going to go through all the settings and what they mean; you can find that information here.  But I do want to discuss where they originate.

When you enable Query Store through the UI, you have the option to change any of the options (e.g. Max Size (MB), Size Based Cleanup Mode).  The values for the settings are pulled from model, just like other database defaults (Recovery Model, Compatibility Level, etc.).  Now, what’s interesting is that you can change these defaults in model, but only using T-SQL.  In fact, you can only enable Query Store for model and msdb using T-SQL; the Query Store page is not available for either database when you expand Databases, then System Databases, right-click on the database, and select Properties.

The Query Store documentation specifically states that it cannot be enabled for master and tempdb, which leads us to believe we can enable it for model and msdb.  Now, I haven’t figured out a use case for either (I have yet to troubleshoot query performance against model or msdb), but I figured I’d try it for both.  I started with model, just using the basic syntax which would pull the default values for the options:

USE [master];
GO
ALTER DATABASE [model] SET QUERY_STORE = ON;
GO
ALTER DATABASE [model] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

Then I checked to confirm the values that were set:

USE [model];
GO
SELECT actual_state_desc, readonly_reason, desired_state_desc, current_storage_size_mb,
max_storage_size_mb, flush_interval_seconds, interval_length_minutes,
stale_query_threshold_days, size_based_cleanup_mode_desc,
query_capture_mode_desc, max_plans_per_query
FROM sys.database_query_store_options;
GO

Output from sys.database_query_store_options for model

My output was empty.  Huh?

I tested msdb.

USE [master];
GO
ALTER DATABASE [msdb] SET QUERY_STORE = ON;
GO
ALTER DATABASE [msdb] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO
USE [msdb];
GO
SELECT actual_state_desc, readonly_reason, desired_state_desc, current_storage_size_mb,
max_storage_size_mb, flush_interval_seconds, interval_length_minutes,
stale_query_threshold_days, size_based_cleanup_mode_desc,
query_capture_mode_desc, max_plans_per_query
FROM sys.database_query_store_options;
GO

This time I had output:

Output from sys.database_query_store_options for msdb

Ok.  So…maybe I can only use model to change default values.  Let’s see if that works.  I edited the Query Store settings for model:

USE [master];
GO
ALTER DATABASE [model] SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 78),
DATA_FLUSH_INTERVAL_SECONDS = 900,
INTERVAL_LENGTH_MINUTES = 60,
MAX_STORAGE_SIZE_MB = 2048,
QUERY_CAPTURE_MODE = ALL,
SIZE_BASED_CLEANUP_MODE = AUTO,
MAX_PLANS_PER_QUERY = 200);
GO

Then I created a new database and enabled Query Store:

CREATE DATABASE [QSTest]
CONTAINMENT = NONE
ON  PRIMARY
( NAME = N'QSTest', FILENAME = N'C:\Databases\QSTest.mdf' , SIZE = 8192KB , FILEGROWTH = 65536KB )
LOG ON
( NAME = N'QSTest_log', FILENAME = N'C:\Databases\QSTest_log.ldf' , SIZE = 8192KB , FILEGROWTH = 65536KB );
GO
USE [master];
GO
ALTER DATABASE [QSTest] SET QUERY_STORE = ON;
GO
ALTER DATABASE [QSTest] SET QUERY_STORE (OPERATION_MODE = READ_WRITE);
GO

I checked sys.database_query_store_options:

Output from sys.database_query_store_options for QSTest user database

It definitely picked up the changes I made to model.

So…if I can’t see the settings for model in sys.database_query_store_options, then is Query Store really enabled?  I ran a few queries against model, then checked the Query Store catalog views:

USE [model];
GO
SELECT *
FROM sys.all_objects;
GO
SELECT *
FROM sys.all_columns;
GO
SELECT *
FROM sys.database_files;
GO
SELECT qst.query_text_id, qsq.query_id, qsq.object_id, qsq.count_compiles, qst.query_sql_text, qst.statement_sql_handle, qsq.query_hash
FROM sys.query_store_query_text qst
JOIN sys.query_store_query qsq ON qst.query_text_id = qsq.query_text_id;
GO

Guess what?  There was no data in the Query Store tables.  So…let’s repeat the test against msdb.

USE [msdb];
GO
SELECT *
FROM sys.all_objects;
GO
SELECT *
FROM sys.all_columns;
GO
SELECT *
FROM sys.database_files;
GO
SELECT qst.query_text_id, qsq.query_id, qsq.object_id, qsq.count_compiles, qst.query_sql_text, qst.statement_sql_handle, qsq.query_hash
FROM sys.query_store_query_text qst
JOIN sys.query_store_query qsq ON qst.query_text_id = qsq.query_text_id;
GO

In this case, there WAS data in the Query Store tables.

So, what does this mean?  That you can only truly enable and use Query Store against one system database: msdb.  It cannot be enabled for master or tempdb (the attempt throws an error: Msg 12438, Level 16, State 1, Line 19 – Cannot perform action because Query Store cannot be enabled on system database tempdb.), and while it can be enabled for model, that’s only for the purposes of changing the default Query Store settings for new databases on the instance (I’ve confirmed this with the SQL Server team).  Is that something you want to do?  Perhaps…but I can change those settings on a per-database basis when I’m initially enabling Query Store, or later on if desired.  I’m always wary of changing settings in model, though I do change the Recovery Model to SIMPLE so I don’t end up with a ridiculously large transaction log…the default is FULL, and if someone doesn’t set up transaction log backups then that log is going to grow until it consumes all space on the disk.  I also set the Auto Growth values to fixed, reasonable numbers.

For each database where I enable Query Store, I’d consider the workload and then look at the settings.  I tend to think that the default value of 100MB for MAX_STORAGE_SIZE_MB is really low, and I would be inclined to bump up STALE_QUERY_THRESHOLD_DAYS from 30 to something a bit higher.  I’d also probably drop DATA_FLUSH_INTERVAL_SECONDS to something lower than 900 seconds (15 minutes) if my storage can support it.  This setting determines how often Query Store data is flushed to disk.  If it’s every 15 minutes, then I could potentially lose up to 15 minutes of Query Store data if my server happened to crash before it could be written to disk.  I’d also think about changing INTERVAL_LENGTH_MINUTES to a value smaller than 60 if I wanted to aggregate my query data over a smaller window of time.  Sometimes interesting events happen within a 60-minute time frame, and they can get lost when data is aggregated across that window.  However, aggregating more frequently means I’m adding processing overhead to the system – there’s a trade-off there to figure out.
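As an example, here’s one way to apply that kind of tuning to a user database.  This is just a sketch – the database name and the specific values are illustrative placeholders, not recommendations from the original post:

USE [master];
GO
ALTER DATABASE [YourUserDatabase] SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
MAX_STORAGE_SIZE_MB = 1024,		/* larger than the 100MB default */
CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 60),
DATA_FLUSH_INTERVAL_SECONDS = 600,	/* flush more often than the 900-second default */
INTERVAL_LENGTH_MINUTES = 30,		/* finer-grained aggregation than 60 minutes */
SIZE_BASED_CLEANUP_MODE = AUTO);
GO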

The take home message…you probably don’t need to enable Query Store for any system database, but if you want to change the default values that user databases pick up when you enable it, you can do that using model.

SQL Server Query Store: Data Storage Options (and a Request!)

Microsoft recently announced that SQL Server 2016 RTM will be available June 1, 2016, and it was also confirmed that Query Store will be available in ALL editions of SQL Server.  This is pretty exciting, as there was some concern that it would be Enterprise-only.  I’ve done a couple sessions on Query Store this year, and the response to the feature has been fantastic.  If you haven’t checked it out yet, I highly recommend it (download 2016 RC3 here), and in this post I want to talk about the Query Store data – where it’s located and how it’s retained.

Query Store Data Storage

Enabling Query Store is pretty simple.  It’s done on a per-database basis (note that Query Store cannot be enabled for master or tempdb) either through the UI or via T-SQL.  I first restored a copy of the AdventureWorks2016 database (download from here).  Interestingly enough, Query Store is already enabled for the database.  But if it were not, to enable it you would right-click on the database and select Properties, go to the Query Store page, and then select Read Write for the Operation Mode (Requested).  This is the code that runs if you script it out from the UI:

USE [master]
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE = ON
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE (OPERATION_MODE = READ_WRITE)
GO

 

Multiple settings are then configured to the default values for Query Store – it’s analogous to running this code:

USE [master]
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE = ON
GO
ALTER DATABASE [AdventureWorks2016] SET QUERY_STORE (
OPERATION_MODE = READ_WRITE,
CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
DATA_FLUSH_INTERVAL_SECONDS = 900,
INTERVAL_LENGTH_MINUTES = 60,
MAX_STORAGE_SIZE_MB = 100,
QUERY_CAPTURE_MODE = ALL,
SIZE_BASED_CLEANUP_MODE = AUTO,
MAX_PLANS_PER_QUERY = 200);
GO

 

The setting I want to point out today is MAX_STORAGE_SIZE_MB.  Understand that with Query Store, the data is persisted *IN* the user database.  Which means that if you back up and restore the database to another instance (or detach and attach it), the Query Store data goes with the database.  This is pretty handy.  You have the ability to set how much space can be used by Query Store within the user database with the MAX_STORAGE_SIZE_MB setting.  The default value, 100MB, is just a starting point.  Because workloads vary, and the data *you* might want to keep is different from what someone else wants to keep, there aren’t any recommendations as to what this size “should” be at this point.  It’s something you’re going to have to test out in your environment – just like figuring out the “best” size for the transaction log for a database, or the size of tempdb for an instance.  Note that the more varied the workload (more unique plans), the more space you might need.  Remus Rusanu points out in a recent post that the SQL Server team vastly under-estimated the space needed for a client workload – it consumed the allocated Query Store space in minutes rather than months.  Disk space is pretty cheap – I would probably start with a few GB of space for Query Store and monitor how much is used with this query:

SELECT
actual_state_desc,
current_storage_size_mb,
max_storage_size_mb,
flush_interval_seconds,
interval_length_minutes,
size_based_cleanup_mode_desc
FROM sys.database_query_store_options;
GO

SQL Server can automatically start to purge data from the store based on how much of that allocated space is used.  This is determined by the size_based_cleanup_mode_desc setting, which defaults to AUTO.  With AUTO set, SQL Server will start purging data when the space used gets close to the max_storage_size_mb setting.  If you have this set to OFF, then once the max_storage_size_mb limit is reached, Query Store will stop collecting data and actual_state_desc will switch to READ_ONLY.
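Building on the query above, here’s a minimal sketch of a check you could run (or schedule) to warn when Query Store is approaching its limit – the 80% threshold is just an illustrative value, not a recommendation from the original post:

SELECT
	actual_state_desc,
	current_storage_size_mb,
	max_storage_size_mb
FROM sys.database_query_store_options
WHERE current_storage_size_mb > 0.8 * max_storage_size_mb
	OR actual_state_desc = N'READ_ONLY';
GO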

Using Query Store to Find Query Regressions or Improvements

Ok, so now we’re clear on those storage-related settings.  There’s more of a conversation to have about how you’re going to use this data to find regressed queries, etc…but that’s an entirely separate post.  For now, I want you to think about the value of this data and how you can use it to measure the impact of code changes, SQL Server patches and upgrades, hardware changes, and so on.

When you restore a copy of your production database to your Test/Dev/QA/UAT environment (because you have at least one of those, right?), the Query Store data from production will come with it.  This is great, because you can then run your workload in that environment and do comparisons using the Query Store data.  Awesome.  But what about this scenario, proposed by an attendee of last week’s IEPTO2 class…

You have an automated process, developed in-house, to back up and restore production databases down to lower environments (like Test/Dev/QA/UAT) on a regular basis (daily, weekly).  Because the Query Store data lives in the user database, you actually want to clear the Query Store data once the database is restored in the lower environment.  Then you test new code, etc. against the database.

When the production data is restored again, all the data collected during testing in Test/Dev/QA/UAT will be lost…unless there is a way to export it.  Meaning, let’s save off the Query Store data generated during testing, before the refresh, so I can do comparisons between THOSE tests.

Now, there are catalog views that allow you to view the Query Store data.  You can copy that data into another database using SELECT INTO, and then do comparisons, but wouldn’t it be nice to have some kind of export option?  There’s a Connect item for that:

Export Query Store tables separately from the database tables: https://connect.microsoft.com/SQLServer/feedback/details/2620017/export-query-store-tables-separately-from-the-database-tables

If you think this is something that would be useful for Query Store, please up-vote it!  Again, Query Store is available in ALL editions of SQL Server 2016 – this is definitely a feature you CAN use and will want to use!  This potential option won’t make it into RTM, but with the change in how SQL Server releases CUs, the more important the SQL Server team sees this option (as a result of votes), the faster it might make it into a release.
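In the meantime, here’s a minimal sketch of the SELECT INTO approach mentioned above.  The archive database, table name, and source database name are placeholders, and you would likely want to pull additional columns (and the runtime stats intervals) depending on the comparisons you need:

USE [QSArchive];	/* hypothetical database for saving off Query Store data */
GO
SELECT [qsq].[query_id], [qsq].[query_hash], [qst].[query_sql_text],
	[qsp].[plan_id], [qsp].[query_plan], [qsrs].[count_executions], [qsrs].[avg_duration]
INTO [dbo].[QueryStore_PreRefresh]
FROM [YourTestDatabase].[sys].[query_store_query] [qsq]
JOIN [YourTestDatabase].[sys].[query_store_query_text] [qst] ON [qsq].[query_text_id] = [qst].[query_text_id]
JOIN [YourTestDatabase].[sys].[query_store_plan] [qsp] ON [qsq].[query_id] = [qsp].[query_id]
JOIN [YourTestDatabase].[sys].[query_store_runtime_stats] [qsrs] ON [qsp].[plan_id] = [qsrs].[plan_id];
GO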

If you haven’t looked into Query Store, I definitely recommend doing so!  When I demo’d the feature last week, one of our attendees said that he would consider skipping 2014 and going straight to 2016, just for this feature alone.  A feature that’s available in EVERY Edition.

Why do YOU avoid Extended Events?

On Monday this week I had an interesting exchange on Twitter with a bunch of folks who are die-hard Profiler/Trace users, and have no interest in using Extended Events.  To wit:

Tweet from Denny about Profiler (used with permission)

Now, Denny and I are good friends, and his tweet didn’t upset me in any way; it just got me thinking.  Why are DBAs and developers so resistant to using Extended Events?  I have some theories, but I realized I should collect some data.

Therefore, whether you:

  • Have never tried Extended Events,
  • Have tried Extended Events but would rather keep using Profiler/Trace,
  • Use Extended Events occasionally but still prefer Profiler/Trace,

I want to hear from you.  Whatever the reason, I want to know – so please leave a comment below.  My goal is to understand what the main challenges are so that I can then provide options and solutions, or create Connect items for the product team to address any gaps in functionality.

Extended Events *is* the replacement for Profiler/Trace; it’s not going away.  I really want people to be prepared for the time when Profiler and Trace are removed from the product.  And I want to provide feedback to the SQL Server product team to address limitations that people find in Extended Events.  If the feature is lacking something, we need to work together to create change.

Thanks in advance for your help, and if you haven’t tried XE, or are looking for a refresher, then please attend my webinar next Tuesday, April 5th at 12PM EDT: Kicking and Screaming: Replacing Profiler with Extended Events.  I’d love to see you there and can help get you started with XE!

EDIT 2:37PM EDT: If you are still running 2008 or 2008R2, then our advice has always been to stick with Trace and Profiler.  If you’re running SQL Server 2012 or higher, then I recommend Extended Events.  Why?  Because it wasn’t until SQL Server 2012 that every event from Trace had a comparable event in Extended Events.  So, if your argument is that you don’t want to learn XQuery and XML because you’re on 2008 or 2008R2, I’m right there with you and will tell you that’s fine – wait until 2012 to use XE.

Instant File Initialization: Easier to Enable in SQL Server 2016 (and some updated numbers)

The ability to have SQL Server data files skip zero initialization when they are created or grown has been available since SQL Server 2005.  By default, when you create a new data file in SQL Server, or extend the size of an existing one, zeroes are written to the file.  Depending on the size of the file or its growth, and the type of storage, this can take a while.  With Instant File Initialization (IFI), space is allocated for the data file but no zeroes are written.  Prior to SQL Server 2016, to enable this feature you had to edit the Local Security Policy to give the account that runs the SQL Server service the “Perform volume maintenance tasks” right (from Start | Run, type secpol, then within the Local Security Policy expand Local Policies, then User Rights Assignment).  This was a task that DBAs had to perform separately from the SQL Server installation (or have a server admin do it for them), and if you did not make the change before installing SQL Server, then it required restarting SQL Server after making the change for it to take effect.  This has changed with the SQL Server 2016 installation, as you can now select the option “Grant Perform Volume Maintenance Task privilege to SQL Server Database Engine Service” when you specify the service accounts, and this will grant the right to the service account at that time.

Enable Instant File Initialization during SQL Server 2016 installation

There is a potential security risk to using this feature, which Kimberly discusses in her Instant Initialization – What, Why and How? post.  The information presented in her post is still valid and worth the read, but Glenn and I did re-run some tests recently, just to get some current numbers to show the benefits of IFI.  We ran the same four tests that Kimberly ran way back in 2007 (!) on four different sets of storage: two sets of 15K disks (one in a RAID10 array, the other in a RAID1 array) and two sets of flash storage (Fusion-io cards).  More information on the storage is at the end of the post.  The tests were:

 

1. Create a 20GB database
2. Grow an existing database by 10GB
3. Restore a 30GB empty database
4. Restore a 30GB database with 10GB of data

 

The tests were run on two different physical servers, both running SQL Server 2014.  Details for each storage system are listed below for reference, and the test results were as we expected:

 

Duration for file modification or database restore, with and without IFI

 

When IFI is not enabled, the time to zero out a file and write data is a function of sequential write performance on the drive(s) where the SQL Server data file(s) are located.  When IFI is enabled, creating or growing a data file is so fast that the time is of no significant consequence.  With IFI enabled, the time it takes to create or grow a file varies by only seconds between 15K, SSD, flash, and magnetic storage.  However, if you do not enable IFI, there can be drastic differences in create, grow, and restore times depending on the storage.
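If you want to confirm whether IFI is actually in effect on an instance, one common approach is to temporarily enable trace flags 3004 and 3605 (which log file-zeroing operations to the error log), create a throwaway database, and check which files were zeroed – if only the log file shows up, IFI is working.  This is a sketch, not from the original post; the database name is a placeholder:

DBCC TRACEON (3004, 3605, -1);	/* log zeroing operations to the error log */
GO
CREATE DATABASE [IFI_Test];	/* throwaway database */
GO
EXEC xp_readerrorlog 0, 1, N'Zeroing';	/* with IFI, only the log file should be listed */
GO
DBCC TRACEOFF (3004, 3605, -1);
GO
DROP DATABASE [IFI_Test];
GO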

Storage Details:

  • 15K RAID10 = Six (6) 300GB 15K disks in RAID 10
  • Flash Drive1 = 640GB Fusion-io ioDrive Duo
  • Flash Drive2 = 2.41TB Fusion-io ioDrive2 Duo
  • 15K RAID1 = Two (2) 300GB Seagate Savvio 15K drives in RAID 1

Note: This post was edited on April 13, 2016, to clarify the storage configuration based on a helpful comment.

SQL Server Setup Error: The directory name is invalid

A few weeks ago I got an email from someone who had attended our Accidental DBA IE class last year, and this person was getting the following error when trying to apply a cumulative update:

SQL Server Setup failure: The directory name is invalid.

The initial email didn’t have a lot of details, so I started asking questions to understand what version was being installed, the environment configuration, etc.  Turns out this was a two-node Windows Server Failover Cluster (WSFC) with multiple SQL Server 2012 instances installed, and one of the instances was still running on the node this person was trying to patch.  To be clear, the two nodes were SRV1 and SRV2, and the instances were PROD-A and PROD-B running on SRV1, and PROD-C which was running on SRV2.  This person was trying to install the cumulative update on SRV2.

Now, those of you who manage clusters may be thinking “Doesn’t this DBA know that the way you do rolling upgrades is by not having any instances running on the node you’re trying to patch?”  Well, not everyone is an experienced DBA; a lot of people are Accidental or Junior DBAs, and if this is the first cluster you’re supporting, you may not know that, or understand why.  Further, when you patch a stand-alone server (one that’s NOT in a cluster), it’s not like you shut down the instance yourself before applying the CU, right?

We checked the summary installation log, located in C:\Program Files\Microsoft SQL Server\110\Setup Bootstrap\Log and found the following Exit message:

The directory ‘M:\a13e546ad3e0de04a828\’ doesn’t exist.

The M drive was a resource for PROD-C, along with the N drive.  There was also a quorum drive (Q) and the local C drive.  So how was M not available?

Well, it was – initially, when the install started.  When the installer runs, it puts its files on the first network drive that it finds (if it’s an administrative installation), or on the drive with the most free space (see: ROOTDRIVE property).  In this case, the M drive met the criteria.  When the installer then stopped the instance and took the cluster disks offline, the M drive was suddenly gone, hence the invalid directory.

You could argue that this is a bug…maybe…but the solution I suggested was to move PROD-C over to the SRV1 node, then run the installation.  You could also specify the directory as part of a command-line install, thereby using a different disk, but downtime was permitted in this scenario, so the failover wasn’t a deal-breaker.  Once this was done, the installation ran fine, and the latest CU was applied on that node.  The DBA then went through the process of failing all the instances over to the patched node, and then applying the CU on SRV1.

As an aside, if you’re not sure of the current service pack, cumulative update, or hotfix available for your SQL Server version, I recommend this site which has all versions and releases and links to the downloads.  And, for those of you running SQL Server 2014, CU5 for SP1 just came out yesterday and has some interesting fixes (see https://support.microsoft.com/en-us/kb/3130926).

 

Collection of Baseline Scripts

The topic of baselines in SQL Server is one that I’ve had an interest in for a long time.  In fact, the very first session I ever gave back in 2011 was on baselines.  I still believe they are incredibly important, and most of the data I capture is still the same, but I have tweaked a couple of things over the years.  I’m in the process of creating a set of baseline scripts that folks can use to automate the capture of this information, in the event that they do not have (or cannot afford) a third-party monitoring tool (note: a monitoring tool such as SQL Sentry’s Performance Advisor can make life WAY easier, but I know that not everyone can justify the need to management).  For now, I’m starting with links to all relevant posts, and then I’ll update this post once I have everything finalized.

These scripts are just a starting point for what to monitor.  One thing I like to point out in our IEPTO2: Performance Tuning and Optimization course is that there is A LOT of data you can capture related to your SQL Server environment.  Your options include Performance Monitor (using Custom Data Collectors), queries via Extended Events or Trace (depending on version), and any data from the DMVs or system views within SQL Server.  You have to decide what to capture based on:

1) What problem you might be trying to solve in your environment, and

2) What information is most important for you to have.  Start simple, and then work your way up.

Figure out the one or two most critical things to capture, and start there, and then add on.
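To make that concrete, here’s a minimal sketch of the kind of simple capture I mean – snapshotting wait stats into a table on a schedule so you can compare them over time.  The database and table names are placeholders (not part of the original scripts):

USE [BaselineData];	/* hypothetical database for baseline captures */
GO
IF OBJECT_ID(N'dbo.WaitStatsSnapshot') IS NULL
	CREATE TABLE [dbo].[WaitStatsSnapshot] (
		[capture_time] DATETIME2 NOT NULL DEFAULT SYSDATETIME(),
		[wait_type] NVARCHAR(60) NOT NULL,
		[waiting_tasks_count] BIGINT NOT NULL,
		[wait_time_ms] BIGINT NOT NULL,
		[signal_wait_time_ms] BIGINT NOT NULL);
GO
INSERT INTO [dbo].[WaitStatsSnapshot] ([wait_type], [waiting_tasks_count], [wait_time_ms], [signal_wait_time_ms])
SELECT [wait_type], [waiting_tasks_count], [wait_time_ms], [signal_wait_time_ms]
FROM [sys].[dm_os_wait_stats]
WHERE [waiting_tasks_count] > 0;
GO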

If you find there’s something missing from my scripts, let me know and I’ll try to get it added!

Monitoring in general

Configuration Information

Disk and I/O

Maintenance

Errors

Performance

Pluralsight