Inconsistent analysis in read committed using locking

Well, I had a second post already planned and partially written but a comment on the first post (in what’s going to become a multi-part series) has made me decide to do a different part 2 post. So, make sure you read this post first: Locking, isolation, and read consistency.

The comment was about how read committed using locking has inconsistencies – and, yes, that’s true. Read committed using locking is prone to multiple inconsistencies; there are things you can do to reduce some anomalies but in a highly volatile OLTP environment you will get inconsistencies for long running readers. This is why having a secondary server for analysis is always a great idea. Then, everyone gets the same version of the truth – the point in time to which the secondary was updated (is that nightly, weekly, monthly?). Now, if the secondary is getting updates through transactional replication, you are back to the same problem. But, here, I cannot imagine NOT using read committed with versioning. Probably one of the BEST places to use versioning is on a replication subscriber. Replication won’t be blocked by readers and readers won’t be blocked by replication:

SubscriberLockingWithVersioning

Why is read committed with versioning preferred for readers?

More and more, we’re being tasked with analyzing “current” data. This is harder and harder to do without introducing anomalies. So, I’ll discuss non-repeatable reads in the bounds of a single statement. To get this to occur with a small data set and without setting up a bunch of connections, etc. I’ll engineer the anomaly.

Example

To do this, I’m going to use a sample database called Credit. You can download a copy of it from here. Restore the 2008 database if you’re working with 2008, 2008R2, 2012, or 2014. Only restore the 2000 database if you’re working with 2000 or 2005 (but, you can’t do versioning on 2000 so this is a 2005 or higher example).

Also, if you’re working with SQL Server 2014, you’ll potentially want to change your compatibility mode to allow for the new cardinality estimation model. However, I often recommend staying with the old CE until you’ve done some testing. To read a bit more on that, check out the section titled: Cardinality Estimator Options for SQL Server 2014 in this blog post.

So, to setup for this example – we need to:

(1) Restore the Credit sample database

(2) Leave the compatibility mode at the level restored (this will use the legacy CE). Consider testing with the new CE but for this example, the behavior (and all examples / locking, etc.) don’t actually change.

Scenario / example

(3) This is where things will get a bit more interesting – you’ll need a couple of windows to create the scenario…

FIRST WINDOW – setup the table:

USE [Credit];
GO

IF OBJECTPROPERTY(object_id('[dbo].[MembersOrdered]'), 'IsUserTable') = 1
    DROP TABLE [dbo].[MembersOrdered];
GO

-- Create a copy of the Member table to mess with!
SELECT *
INTO [dbo].[MembersOrdered]
FROM [dbo].[Member];
GO

-- Create a bad clustered index that's prone to both fragmentation and
-- record relocation. This is part of what makes this scenario more likely!
-- But, that's not the main point of discussion HERE. This is good
-- because it's also easy to visualize
CREATE CLUSTERED INDEX [MembersOrderedLastname]
ON [dbo].[MembersOrdered]
    ([lastname], [firstname], [middleinitial]);
GO

SELECT COUNT(*) FROM [dbo].[MembersOrdered];
GO

Now… we’ll engineer the problem. But, I’ll describe it visually here:

This is a rough approximation of what the data looks like when the table is clustered by lastname

This is a rough approximation of what the data looks like when the table is clustered by lastname (not that I’m recommending that you do this).

The table has NO nonclustered indexes and ONLY a clustered index on lastname. The data is essentially ordered by lastname (not going to get technical about what the data looks like on the page here or the fact that it’s really the slot array that maintains order not the actual rows… for this example – the logical structure of the table is good enough to get the point).

(4) So, now that the table has been created – we will start by creating a problem (a row that will end up blocking us):

SECOND WINDOW – we’ll create a blocking transaction (t1)…

USE [Credit];
GO

BEGIN TRANSACTION
UPDATE [dbo].[MembersOrdered]
    SET [lastname] = 'test',
        [firstname] = 'test'
WHERE [member_no] = 9965;

Don’t get me wrong – this is a horribly bad practice (to have interaction in the midst of a transaction and to only send part of the transaction… if you walk away – SQL Server is incredibly patient and will hold locks until the transaction is finished [or, killed by an administrator]). But, the point… now our table will essentially look like this:

 

Yes, there are other locks that exist but this is the only one we really care about here...

Yes, there are other locks that exist but this is the only one we really care about here…

And, yes, there are MORE locks than just what I show here. SQL Server will also hold an IX lock on the page where this row resides and an IX lock on the table as well. And, of course, SQL Server will also hold a shared lock on the database. All of these locks (IX and the database-level S lock) are used as indicators. Individually, they don’t really block all that many other operations. You can have MANY IX locks on the same page at the same time and you can have many IX locks on the same table at the same time. Essentially these locks are used to show that someone has the “intent” to exclusively lock resources on this resource.

If you’re interested, check out sys.dm_tran_locks to see the locks held at each point of this example…

(5) So, now that the table has a locked row, we will try again to get a row count. Go back to the FIRST WINDOW and JUST re-run the query to get a count:

SELECT COUNT(*) FROM [dbo].[MembersOrdered];
GO

This will be blocked. And, it will sit blocked until the transaction finishes (either through a commit or rollback OR by being killed – none of which we’ll do…yet ).

So, what’s interesting and we need a bit of back-story to fill in is that our count query reads ONLY committed rows (we know that – it’s read committed locking). What a lot of people don’t know is that the row-level shared locks are released as soon as the resource has been read. This is to reduce blocking and because there’s no need (in read committed ) to preserve the state of the row (like repeatable reads does). And, I’ll prove that with an update…

(6) Now we’ll update a row that’s just sitting there and not locked (but, may have already been counted [locked, read, and then the lock was released] by our count query).

THIRD WINDOW – update and move a row that was already counted

USE [Credit];
GO

UPDATE [dbo].[MembersOrdered]
 SET [lastname] = 'ZZZuckerman',
 [firstname] = 'ZZZachary'
WHERE [lastname] = 'ANDERSON'
 AND [firstname] = 'AMTLVWQBYOEMHD';
GO

OK, now here’s another horribly bad practice (not using a key for a modification). But, this table has all sorts of problems (which is what makes it a wonder example for this). And, if I don’t go with something that’s efficiently indexed for this update then the locking would be different. I NEED to get row-level locking here. So, I have to do the update above. And, now what’s happened:

There are so many bad things going on with this . What if there are two rows with that name (ok, unlikely) but you still shouldn't do modifications without a KEY (primary / or candidate)!

There are so many bad things going on with this poor table. What if there are two rows with that name (ok, unlikely) but you still shouldn’t do modifications without a KEY (primary / or candidate)!

(7) Now, it’s time to “complete” the transaction that was blocking our count query.

Go back to the SECOND WINDOW and complete the transaction (to be honest, you can COMMIT or ROLLBACK – it really won’t matter for this example):

ROLLBACK TRANSACTION;

(8) Step 7 will have unblocked the row count query so all you need to do is go back to the first window and check the count that’s sitting there… which will be 10001  rows in our table that only has 10000 rows. Why? Because you were NOT guaranteed repeatable reads even in the bounds of a single statement. 

Re-run the count query and you’ll get 10000 again.

Why read committed with versioning solves this problem?

When you change the database to read_commmitted_snapshot you change ALL read committed statements to use version-based reads. The best part is that we’ll never WAIT for in-flight rows nor will we see rows whose transactions completed within the bounds of your statement [the longer running the statements the more that this is possible]. The incorrect count would not occur for a few reasons but I’ll create a list of the reasons here:

When the row modification (t1) is made – the row will be versioned (SQL uses “copy on write” / “point in time” = snapshot) to take the BEFORE modification VERSION of the row and put it into the version store.

When the count begins it will just quickly continue (and not be blocked by the X lock held on the ‘Gohan’ row). This is not a problem, the columns are changing but the row is not being deleted – yes, we should count that row.

If I could switch to the window and run the update fast enough, it too would not be blocked by the version-based read NOR would the moved row be visible (even if it made it there in time) by the version-based read because it’s new location (handled as an INSERT) would be NEWER than our count statement’s START time. Our count will reconcile to the point in time when the statement BEGAN. Nothing is blocked and those changes (even when committed before we get there) are NOT visible to our version-based read.

Is Summary 

Yes, read committed with versioning is often MUCH preferred over read committed using locking. But, versioning does come at an expense. You do need to do monitoring and analysis of THE version store (in THE tempdb). Someday (SQL Server 2016???) maybe we’ll multiple tempdbs (that would be awesome) and ideally, we’ll have some nice capabilities for controlling / limiting version store overhead (of one database over another). But, for right now – you have to make sure that your tempdb is optimized! And, if you have an optimized tempdb and you don’t have a lot of really poorly written long-running queries, then you might find out that the overhead is less than you expected. This is absolutely something that you should be not only considering but using!

Whitepaper: Working with tempdb in SQL Server 2005

Plus, be sure to review Paul’s tempdb category for other tips / tricks, etc.

OK, so now the originally planned part 2 is now trending to be part 3. Depending on questions – I’ll get to that in the next few days!

Thanks for reading,
k

 

Locking, isolation, and read consistency

Today I ran into a really interesting locking / blocking problem that I want to discuss… originally, a related discussion came up last week in our IEPTO1 course (Immersion Event on Performance Tuning – Part 1) but today I ran into a similar problem that brought me back to the original discussion and I was able to reproduce the original scenario with a really interesting combination of things (some of which I did not expect and that led me down a super interesting path). However, before I can get to the weirdness (and try to make sense of it), I need to make sure the foundation is set well… In IEPTO1, I discuss locking types and how long locks are held for the different isolation levels – spending a great deal of time talking about the default mode of locking (read committed – with locking) vs. all other options:

  • Read uncommitted ( same as using NOLOCK ) and a lower isolation level than the default. This allows dirty reads.
  • Read committed using locking ( this is the DEFAULT when read_committed_snapshot has NOT been turned on )
  • Read committed using version ( this is the DEFAULT when read_committed_snapshot has been turned on )
  • Repeatable reads ( which also uses locking – even when one or both flavors of versioning has been enabled )
  • Serializable ( which also uses locking [ same as using HOLDLOCK ] – even when one or both flavors of versioning has been enabled )
  • Snapshot Isolation ( which is ALLOWED if allow_snapshot_isolation has been turned on and the client requests it, but won’t be used by anyone unless SET TRANSACTION ISOLATION LEVEL SNAPSHOT has been requested. Here you’ll have all the overhead of versioning without anyone using it unless you make code changes for snapshot isolation.)

Essentially, your database can be configured for one of FOUR possible states:

  1. No options set (connections default to read committed using locking)
  2. ONLY read_committed_snapshot set ( connections default to read committed using versioning; no other code changes needed for read committed statements ). This provides statement-level read consistency. Every read will reconcile to the point in time when the STATEMENT started.
  3. ONLY allow_snapshot_isolation set ( connections DEFAULT to read committed using LOCKING – because of the absence of read_committed_snapshot ). For those that request snapshot isolation, this provides transaction-level read consistency. Every read will reconcile to the point in time when the TRANSACTION started.
  4. BOTH read_committed_snapshot and allow_snapshot_isolation set. Without snapshot transaction isolation level requested, statements will reconcile to the point in time that the statement started and when snapshot isolation is requested, then all statements will reconcile to the point in time that the transaction began.

To be honest, I’ve been wanting to get that “simple” view of the locking world written down for a while. Even the combination of options ( that a database can really be configured into one of four possible states ) is not very well known. Often content about versioning just states that you need to turn both options on and they don’t describe that they’re really distinct environments.

Statement-level Read Consistency

What I love about read_committed_snapshot is that you get statement-level read consistency… what this means is that you can get a definable point in time to which your statement reconciles – that point, the time when the statement started. So, if you ask for a count of rows, you get THE count of rows that were present when your statement started. The best part about it is that this count is not only accurate to the point in time that hte statement started, it also does so without preventing transactions / locks on the object where you’re counting rows. Without a lot of detail here – it does this by copying the transactionally consistent VERSION of the row into the version store and leaving behind a pointer to allow readers to use without being blocked AND without blocking other writers. The marketing tagline makes it sound perfect ( readers don’t block writers and writers don’t block readers ). All of this happens AUTOMATICALLY and without code changes for all statements running with read committed isolation. If a statement has a hard-coded lock hint ( like NOLOCK or HOLDLOCK, etc. ) then their statement-level hints will override this and use locking.

NOTE: Windows Azure SQL Database defaults to read committed using VERSIONING for new databases.

Transaction-level Read Consistency

What I love about allow_snapshot_isolation is that you don’t have to use it, unless you really want version-based TRANSACTIONS. And, oddly, many of you won’t want this for the majority of your complex updates / OLTP transactions ( where you’ll also have to deal with version-based reads and update conflicts ). If your writers are in read committed using locking OR read committed using versioning then an update will use locks and not version-based reads so conflicts are prevented. The place to use transaction-level read consistency is for “transactions” that will handle multiple reads across volatile data. Imagine that you want to have 3 reports / queries reconcile to the same point in time ( and you want to do this real-time reporting in your OLTP environment ). What you could do is this:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
GO
BEGIN TRANSACTION
SELECT... --query1
SELECT... --query2
SELECT... --query3
COMMIT TRANSACTION
GO

So, your use of snapshot isolation should be a bit more limited IMO. And, largely limited to controlled reporting. Yes, you can do this with modifications / transactions… but if you have transaction-level read consistency and then you have multiple readers / writers to the same data then you’ll have to make sure you have good error handling ( you should be using TRY / CATCH ) so that you can deal with the conflict detected by SQL Server and resolved with the following error:

Msg 3960, Level 16, State 2, Line 6
Snapshot isolation transaction aborted due to update conflict. You cannot use snapshot isolation to access table 'dbo.member' directly or indirectly in database 'Credit' to update, delete, or insert the row that has been modified or deleted by another transaction. Retry the transaction or change the isolation level for the update/delete statement.

In Summary

So, to bring it all together. All LOCKING discussions really need to start with how the session and the database are handling isolation. Different behaviors occur in the different configurations and some locks are held for different amounts of time depending on your isolation level.

And, that’s a good start for now. I wanted to get a few of these basics out there first as I have a couple of problems I want to describe (and resolve) with long running transactions and blocking. And, some special cases with schema locks as well as some special code I’ve written to reduce some of the horrible problems you can run into with long blocking chains (and my code works across earlier versions of SQL Server and for MORE statements than just partition switching and index rebuilds… but, I’m getting ahead of myself on this one). Stay tuned, I’m going to get to those in tomorrow’s post and I’ll compare them against the low-priority lock wait feature in SQL Server 2014.

Also, if you’re more interested in learning more about versioning – check out the whitepaper I wrote (for SQL 2005) and which was updated for name changes and a few other things by Neal Graves ( thanks Neal! ): SQL Server 2005 Row Versioning-based Transaction Isolation

Thanks for reading!
k

Stored Procedure Execution with Parameters, Variables, and Literals

In Nov 2014, SQLskills made an offer to user group leaders to deliver remote user group sessions in 2015 and we’ve been having great fun delivering them. So far the team has delivered 34 sessions and we have 72 signed up! At my last session (for the San Diego SQL Server User Group), I delivered a session on stored procedures. It was great fun and I had a few questions at the end that made me come up with some new sample code… that’s where this post is coming from!

The question was because I said to make note of something being a parameter and not a variable (during the lecture) and so the question was – what’s the difference and how does that change the behavior?

Definitions

Parameters are values that are being passed into a stored procedure. Since these are part of the call to execute the procedure; these are “known” during compilation / optimization (but, that’s only IF SQL Server has to compile / optimize, more on that in a minute)

Variables are assigned at runtime and therefore are “unknown” during compilation / optimization.

Literals are known at all times as they are exactly that – a simple / straightforward value

Review this heavily commented stored procedure:

CREATE PROCEDURE ParamVarLit
    (@p1    varchar(15)) -- defining the parameter name @p1 AND the data type
AS
-- Declarations / assignments
DECLARE @v1 varchar(15); -- defining the variable name @v1 AND the data type
SELECT @v1 = @p1;        -- assigning the variable to the parameter input value

DECLARE @v2 varchar(15); -- defining the variable name @v2 AND the data type
SELECT @v2 = 'Tripp';    -- assigning the variable to a literal value

                         -- Note: also acceptible is this format
                         -- DECLARE @v2 varchar(15) = 'Tripp'

-- Statement 1 (using a parameter)
SELECT [m].*
FROM [dbo].[member] AS [m]
WHERE [m].[lastname] = @p1 

-- Statement 2 (using a variable)
SELECT [m].*
FROM [dbo].[member] AS [m]
WHERE [m].[lastname] = @v1; -- or @v2, these will work EXACTLY the same way!

-- Statement 3 (using a literal)
SELECT [m].*
FROM [dbo].[member] AS [m]
WHERE [m].[lastname] = 'Tripp';
GO

In the stored procedure you can see that there’s one input parameter, we’ll assign that when we call the stored procedure.

We can assign parameters by name (line two) or we can assign them by position (note, when you have a lot of parameters this can be frustrating)

EXEC [dbo].[ParamVarLit] @p1 = 'Tripp' -- assignment by name
EXEC [dbo].[ParamVarLit] 'Tripp'       -- assignment by position

Outside of that, everything else is being defined / assigned  within the stored procedure.

Setup

Now, we have to see how each of them works from an optimization perspective. To do this, I’m going to use a sample database called Credit. You can download a copy of it from here. Restore the 2008 database if you’re working with 2008, 2008R2, 2012, or 2014. Only restore the 2000 database if you’re working with 2000 or 2005.

Also, if you’re working with SQL Server 2014, you’ll potentially want to change your compatibility mode to allow for the new cardinality estimation model. However, I often recommend staying with the old CE until you’ve done some testing. To read a bit more on that, check out the section titled: Cardinality Estimator Options for SQL Server 2014 in this blog post.

So, to setup for this demo – we need to:

(1) Restore the Credit sample database

(2) Leave the compatibility mode at the level restored (this will use the legacy CE). Consider testing with the new CE but for this example, the behavior (and all estimates) don’t actually change.

(3) Update a row and add an index – use this code:

USE Credit;
GO

UPDATE [dbo].[member]
    SET [lastname] = 'Tripp'
    WHERE [member_no] = 1234;
GO

CREATE INDEX [MemberLastName] ON [dbo].[member] ([Lastname]);
GO

Execution

To execute this procedure it’s simple – just use one of the execution methods above and make sure that you turn on “Show Actual Execution Plan” from the Query, drop-down menu.

EXEC [dbo].[ParamVarLit] @p1 = 'Tripp' -- assignment by name
GO

This is what you’ll see when you execute:

ExecutionPlans

 

Notice that the first statement and the third statement both use an index to look up the row but statement two does not. Why? It’s all tied to whether or not the value was “known” at the time of optimization. And, most importantly – this procedure was the executed when there wasn’t already a plan in cache. Because there wasn’t already a plan in cache, SQL Server was able to optimize this procedure for the parameters passed in at THIS execution. This does NOT happen for subsequent executions.

More specifically, when the statement KNOWS that the value is ‘Tripp’ (statement 1 knows because it’s getting optmized for ‘Tripp’ and statement 3 knows because the value is hard-coded for ‘Tripp’) then SQL Server can look to the statistics to determine how much data is going to be processed. In this case, SQL Server estimates that there are very few rows with a last name of ‘Tripp’ (from the statistics, it thinks there’s only 1). As a result, an index would be helpful to find this highly selective result so it chose a plan to use an index.

For statement 2 though, SQL Server doesn’t seem to know to use an index. Why? Because here the variable (@v1) was unknown at the time of compilation / optimization. The variable is not assigned until actual execution but execution only occurs after a plan has been generated. So, the problem with variables is that SQL Server doesn’t know their actual values until after it’s chosen a plan. This can be both good and bad. Remember, SQL Server has to do something… So, in this case, SQL Server uses an average to estimate the rows and come up with a plan. This average comes from the density_vector component of a statistic rather than this histogram. If your data is reasonably evenly distributed then this can be good. And, it also means that the plan won’t change change after it’s kicked out of cache and a different execution occurs with different parameters. Some have learned this trick and have used it with success – but, only because their data is either evenly distributed OR the executions are using values that all resemble the average.

NOTE: This is EXACTLY the same behavior as using the OPTION (OPTIMIZE FOR UNKNOWN) clause on the statement within the stored procedure.

In this case, however, ‘Tripp’ is NOT like the average value and so the plan for statement 2 is not ideal for a variable assigned to ‘Tripp’. The data has a lot of duplicates and the average number of rows for most names is quite high (where an index is no longer userful). However, ‘Tripp’ is really not an average data value here and so the plan might be good for most other values. But, in this case, it’s not good for the value Tripp.

Execute the procedure again but supply a different value for @p1:

EXEC [dbo].[ParamVarLit] @p1 = 'Anderson' -- assignment by name
GO

This is what you’ll see when you execute with ‘Anderson’ AFTER having created a plan for ‘Tripp':

ExecutionPlans

Wait – there’s no difference?

Nope, absolutely none! Really, review every aspect of your output /plan [not the actual values] and you’ll see it’s exactly the same! The plan that you see is always the estimated plan and the estimated plan is chosen when the stored procedure is optimized / compiled. Optimization / compilation occurs only when there isn’t already a plan in cache for that procedure.

Why are they the same?

The parameter – ‘Tripp’ was used on the first execution and this is what was “sniffed” and use for optimization. When I say “sniffed” all that means is that the value was KNOWN such that the optimizer could look at the statistics (and specifically the histogram) to estimate how many rows had a last name of ‘Tripp.’ It turns out that the estimate was 1. You can see this by hovering over the Index Seek in the first statement:

Diagrams

 

If you were to kick this plan out of cache and re-execute with ‘Anderson’ then something interesting would happen:

EXEC [sp_recompile] '[dbo].[ParamVarLit]';
GO

EXEC [dbo].[ParamVarLit] @p1 = 'Anderson';
GO

Results in this plan:

Statement1Estimate_Anderson

There’s really one incredibly important observation here: ONLY the first statement’s plan changed!

The first statement’s plan changed because on this execution SQL Server was able to “sniff” the parameter and optimize / compile a plan specific to it. More specifically, when the statement KNOWS that the value is ‘Anderson’ (again, only statement 1 knows this) then SQL Server can look to the statistics to determine how much data is going to be processed. In this case, SQL Server estimates that there are numerous rows with a last name of ‘Anderson’ (from the statistics, estimates 385 rows). As a result, the data is not selective enough to warrant using an index so in this case, SQL Server uses a table scan. (shown as a clustered index scan solely because the table has a clustered index)

Bringing It All Together

Parameters are evaluated and sniffed ONLY when a plan is being created. This ONLY happens when a plan is NOT already in the cache. And, sniffing is fantastic for THAT specific execution because “sniffing” a parameter lets SQL Server use the histogram component of statistic to determine the estimate. While the histogram is not always perfect, it’s usually a more accurate way of estimating rows. But, this can also lead to parameter sniffing problems where subsequent executions don’t perform well because the plan in cache wasn’t optimized for their values. There are many solutions to this problem, I covered a few of the options here: Building High Performance Stored Procedures.

Variables are ALWAYS deemed “unknown” and they cannot be sniffed. In this case, SQL Server doesn’t have a value to lookup in a statistic’s histogram. Instead, SQL Server uses an average to estimate the rows and come up with a plan. But. as I mentioned – this can be sometimes good and sometimes bad.

The literal is the easiest of them all. SQL Server knows this value and there’s absolutely nothing that will ever change that value. This can be sniffed and it will use the histogram getting the best estimate to use for optimization.

Play around with this sample procedure. Review the plans and the estimates v. actuals. Next week I’ll dive more into the statistics themselves and where the specific estimates are coming from!

Have fun and thanks for reading!
k

Prepared statements: be careful when given “THE” solution for all of your development practices

I’m not sure about how to title this OR how to avoid the term “it depends” but I also don’t feel that “it depends” is a bad phrase. Some folks think of it as a cop-out when they don’t have a better answer but the problem is that it’s often the RIGHT answer. There are rarely simple solutions that work in ALL situations. And, as a result – it depends – is really correct. What’s most important is knowing what it depends ON.

So, where am I going with this one?

I recently received a question that started by referencing a post titled: NHibernate, the database query cache and parameter sizes. My issue is that there is truth in that post (and also in the comments) but there are also a few statements that just aren’t quite right. In fact, there’s one that I thought was right but I had to really dig into it to see what was really happening and what’s happening is NOT what I expected. And, this is really the crux of the problem. Sometimes the behavior (or, solution) is completely counter-intuitive. But, above all, there’s one fatal mistake in this post (and many others) – where they state what THE solution is. Some statements are claiming that they are [ALWAYS] better than the other and unfortunately, that’s just not the case… (comments were closed so I couldn’t comment on it directly)

What I really want to dissect a bit is this part of the post:

There are arguments for both sides, but most DBAs would really like to see only a single query plan for all parameter sizes. I didn’t have the answer off the top of my head, but I was pretty sure that NHibernate did the right thing from the point of view of the database.

As it turn out, I was wrong. By default, NHibernate will generate two separate query plans for this query in the database. Sounds bad, right?

Not really, here is the fix, just put the following in your configuration:

<property name='prepare_sql'>true</property>

And, this particular comment:

The side-effect of calling Prepare() is that a so-called “prepared statement” is created in the DB. This is essentially a statement that is precompiled in the DB and gives back a handle so future calls on THE SAME CONNECTION can be made just referencing this handle and not sending the entire sql again. If all calls to the DB were made on the same connection its true that the execution plans would be reused, but as we know – connections are drawn from the connection pool as needed and handed back in a random manner. The prepared statements on the connection are un-prepared everytime a new session is created and the connection is soft-initialized.

First and foremost, the post and the comments aren’t bad. These people are working to make their environments better and to help others to sort out their bizarre behaviors / problems. They are truly trying to help. I don’t suspect even a tiny bit of their intentionally wanting to lead anyone astray. However, some of this is just wrong. So, let me begin.

There ARE arguments for both sides.

Yes, this is TERRIBLY important. The reason there are arguments for both sides is that EACH side is correct SOME of the time. Neither is correct ALL of the time and, of course, something has to be chosen as the default. Let me try to give a basic example of this… I’ll use a traffic analogy. It’s 2am, you’ve been coding for quite a few hours, and you want to get some great diner food – in Seattle. But, you live in Redmond (across the bridge). The shortest distance between you and the restaurant will get you there in 30 minutes but requires you to cross a toll-bridge. So, the trade-offs are money for time. But, you decide to go solely “for performance” and you choose that plan… take the 520 over to Seattle.

Now, let’s say that plan is “in cache” and MUST be re-used the next time you want to go to the diner. However, the next time you want to go – it’s 4:30pm, on a Friday, and there’s a Husky game… the plan to use 520 is a bad idea. Not because it’s not the shortest distance but because of other problems (congestion, “blocking,” etc…). And, don’t get me wrong – SQL Server does NOT change plans for “blocking” or “congestion” (so, the analogy breaks down a bit here). But, you can still see the point.

A chosen plan for one set of parameters may not be best with different parameters.

And, this is why there isn’t a single right answer for EVERY statement. So, if you’re looking for THE right choice (should the statement’s plan be placed in the cache or not) then this is the tough part – what’s really best depends on what I call – the stability of the plan generated against a wide variety of different parameters.

With the right kind of testing, you can execute a statement with a variety of different parameter combinations and very whether or not each of their optimal plans has the same pattern / shape. If they do – then, I’d say that statement’s optimal plan is stable. However, (and this is often MORE likely), if the optimal plan varies, then that statement’s optimal plan is unstable (across different parameter combinations).

The end result is really a set of two rules:

  1. When a statement’s optimal plan is stable and does NOT vary across executions – a plan should be saved.
  2. When a statement’s optimal plan is unstable and varies across executions – a plan should not be saved.

“Most DBAs would really like to see only a single query plan for all parameter sizes.”

Well, only when the plan is stable. If the plan is unstable then another problem occurs – this is often referred to as “parameter sniffing problems” or “parameter sensitivity problems” (a.k.a. PSP). PSP can actually end up being a worse problem. If a bad plan sits in cache and is re-used for all executions (executing an inefficient plan) then you can end up with statements doing a tremendous amount of I/O or spilling to disk with bad memory grants or doing the wrong type of join, etc. when they shouldn’t. I’ve seen PSP essentially take a server down (causing so much I/O that the server is essentially inaccessible and unable to process any additional requests. Some clients even “react” to this by restarting SQL Server… problem solved. Well, cache cleared and the bad plan will be thrown out. Hopefully, the next execution will be what we want and a reasonable plan will get into cache. At least temporarily.

And, this is how it goes. Poor performance occurs and often a variety of things are blamed (cache or I/O or statistics or indexes) and some quick fix (updating statistics is another common one) SEEMS to solve the problem. But, ultimately, it’s probably a side-effect of what you’ve done that’s solved the problem (NOTE: Updating statistics often “invalidates” plans so the side-effect of having updated statistics might be what solved your problem and NOT the updating of statistics itself). I talked about this in a related post: What caused that plan to go horribly wrong – should you update statistics?

So, to bring it back to the main point… HOW you execute and whether or not YOU choose to force the statement to be cached are VERY important decisions and unfortunately depend on a statement by statement basis. I’ll try to bring it all together in just a moment.

“I was pretty sure that NHibernate did the right thing from the point of view of the database. As it turn out, I was wrong. By default, NHibernate will generate two separate query plans for this query in the database. Sounds bad, right?

First off, I’d have to agree WITH the default. The default is often referred to as “simple” parameterization. This is where SQL Server looks at the statement and decides – no, I shouldn’t cache a single plan because the plan might not be “safe” across all executions. And, as a result, it must be compiled for EVERY execution. I agree with the default because many executing a bad plan can be WAY more expensive than recompiling to get a good plan and executing that.

But, there is a bad side to this too. If the statement is STABLE then recompiling every time wastes time in compiling (CPU) and, it can also lead to plan cache bloat. (I’ve talked more about this here: Plan cache and optimizing for adhoc workloads)

So… I think that NHibernate is doing the right thing by default.

But, having said that, that default is NOT great for everything.

“Just put the following in your configuration: <property name=‘prepare_sql’>true</property>”

MAYBE – but, only when the plan is STABLE.  (yes, I know… I still haven’t fully described how you can tell if a plan is stable)

So, what does that do? With a bit of digging (and, from the email that I received), it appears as though this essentially uses / created a “prepared” statement. And, this also ties into one the comment that I pointed out above. And, while the comment sounds correct (and, I even agreed with it at first); that’s not actually how it works (yes, I was surprised too).

Prepared statements using sp_prepare

It turns out that telling NHibernate to use prepare_sql = true effectively does something like the following:

DECLARE @handle int
EXEC sp_prepare @handle OUTPUT,
     N'@P1 VARCHAR(15)',
     N'SELECT [m].* FROM [dbo].[member] AS [m] WHERE [m].[lastname] LIKE @p1';
EXEC sp_execute @handle, N'Anderson';
EXEC sp_unprepare @handle;
GO

At first glance it seemed like SQL would have to generate a different plan for every execution (except those that used the same handle) but I found that as long as statement was the same – the plan was the same. ACROSS CONNECTIONS. This surprised me. Even with the sp_unprepare. Even with the explicit @handle. YES, they re-used the plan. So, the effect is what was desired by the prepare_sql property being set to true. But, this is NOT always ideal (again, remember, the execution of a bad plan can often be more expensive than the cost of recompilation). But, not, recompiling every time should be avoided where the plans are stable.

So, then I started wondering about what sp_unprepare does and I found another post discussing this a bit more: Watch out those prepared SQL statements which is interesting in and of itself but that just exposes yet-another issue with where / how the handles are managed. So, if you’re calling these yourself then you definitely want to sp_unprepare. But, that still doesn’t get rid of the PSP problem. And, again, these PSP problems can be across connections.

Prepare your statements or not – that is the question?!

It really all boils down to this – effective testing. Most developers are fantastic at code coverage testing. However, most don’t know about what I call “plan stability” testing. The good news is that if you are doing good code-coverage testing then it should make code-stability testing easy!

Code-coverage testing should thoroughly test all of your code branches and all of your possible parameter combinations (or, at least, a healthy set of the more common combinations). In addition to making sure that these code combinations “do the right thing,” what I want you to do is see if their optimal plan varies across these executions. There are a few ways to do this:

For statements: If you can isolate the statement and hard-code different combinations of parameters – this might be the easiest way to do a test. What I would do with the statement above is something like this:

SELECT [m].* FROM [dbo].[member] AS [m] WHERE [m].[lastname] = 'Tripp';
SELECT [m].* FROM [dbo].[member] AS [m] WHERE [m].[lastname] = 'Anderson';
SELECT [m].* FROM [dbo].[member] AS [m] WHERE [m].[lastname] = 'Smith';
SELECT [m].* FROM [dbo].[member] AS [m] WHERE [m].[lastname] = 'Abbas';
GO

Then, execute all of these in SSMS with the option “Include Actual Execution Plan” (from the Query drop-down). Now, before you think you need to understand everything about plans and what they mean… here’s the good news. For “stability-testing” you do NOT need to care what the plan does. What you need to look at is the plan shape (OK, sometimes it’s not quite this easy and I would look a little deeper [like at the types of operators and sometimes deeper still] but I will venture that the large majority of huge performance problems are caused by vastly different plan shapes and so the majority of problems can be found JUST by looking at the shape of the plan and not having to dive in much deeper).

If the plan shape varies then the optimal plan for this statement varies; do not FORCE / PREPARE it into cache. 

If the plan shape does not vary then the optimal plan for this statement is consistent / stable; this statement should be forced / prepared in the cache. (IMPORTANT NOTE: if you don’t do a very good job testing a wide-enough range of combinations you might make a bad decision here. It’s very important to have realistic data, realistic parameter combinations, and a number of them.)

For stored procedures: For stored procedures there’s a similar / easy way to see if the plan varies from execution to execution by using EXEC procedure parameters WITH RECOMPILE. While this does have some limitations, this can also be helpful at knowing if a stored procedure might be prone to similar PSP. I did write a detailed post on that here: Building High Performance Stored Procedures.

The End Result

Blog posts like the one I referenced are based in fact. And, I believe that what they’ve done did help for THEIR problem. However, trying to over simplify (for ALL environments) a single coding practice is often NOT IDEAL. Unfortunately, it depends IS the right general answer here.

If you’re looking for a lot more content around this, check out the links I’ve included in this post – there are numerous free resources including an online video I did as part of PASStv (linked to in the Building High Performance Stored Procedures post). And, if you still want more content, demos, and details, check out my two Pluralsight courses specifically targeting these problems. Start with this one: SQL Server: Optimizing Ad Hoc Statement Performance and then watch this one: SQL Server: Optimizing Stored Procedure Performance.

Have fun!
k

Our 5th SQLintersection is coming up in May 2015

I can’t believe we’re already approaching our 5th co-located SQLintersection & DEVintersection events… if you’re interested in how we got here – check out some of my past posts:

  1. SQLintersection: a new year, a new conference
  2. SQLintersection’s Fall Conference – It’s all about ROI!
  3. Fall SQLintersection is coming up soon and we can’t wait!
  4. SQLintersection Conference and SQLafterDark Evening Event – what a fantastic week in VegasSQLafterDark

The last post is from our most recent event in Las Vegas this past fall. It was our biggest show yet and while we’ve largely kept our focus on ROI and technical content the same (performance / troubleshooting / tales-from-the-trenches with best practices on how to fix them) we’ve also added even more social events so that you really get time to intersect with the conference attendees and speakers. The addition of the SQL-specific evening event SQLafterDark was wildly popular from our Fall show and that’s returning for Spring!

And, so are many of our phenomenal speakers. But, we also have a few new industry-experts rounding out our line-up. We’re still working out a few remaining sessions but the show is already packed with enough to keep you busy for weeks after the show ends.

SQLintersection: Great Speakers!

Once again, I think a great show starts with great speakers and current / useful content. All of these speakers are industry-experts that have worked in data / SQL for years (some can even boast decades) but all are still focused on consulting and working in the trenches. Every speaker is a SQL Server MVP (with the except of the Microsoft speakers – but, I don’t think anyone’s going to question their SQL knowledge :)) and some are Microsoft Certified Masters in SQL Server. But, no matter what – ALL are focused on providing the most ROI that’s possible in their session(s) and/or their workshops. Check out this list of speakers:

  • Aaron Bertrand, Sr. Consultant, SQL Sentry, Inc. [blog | twitter]
  • Allen White, Practice Leader, Upsearch [blog | twitter]
  • Brent Ozar, Brent Ozar Unlimited [blog | twitter]
  • David Pless, Dedicated Support Engineer (PFE), Microsoft
  • Erin Stellato, Principal Consultant, SQLskills.com [blog | twitter]
  • Kevin Kline, Director of Engineering Services, SQL Sentry, Inc. [blog | twitter]
  • Kimberly L. Tripp, President/Founder, SQLskills.com [blog | twitter]
  • Paul S. Randal, CEO / Owner, SQLskills.com [blog | twitter]
  • Shep Sheppard, Senior Program Manager, Microsoft AzureCAT [blog twitter]
  • Tim Chapman, Dedicated Support Engineer (PFE), Microsoft [blog | twitter]
  • Tim Radney, Principal Consultant, SQLskills.com [blog | twitter]

SQLintersection: When is it all happening?

The show officially runs from Monday, May 18 through Thursday, May 21 but there are both pre-conference and post-conference workshops that extend the show over a total of up to 6 full days. For the full conference, you’ll want to be there from Sunday, May 17 through Friday, May 22.

  • Sunday, May 17 – pre, pre-con day. There are two workshops running:
    • Very Large Tables: Optimizing Performance and Availability through Partitioning by Kimberly L. Tripp
      Manage SQL Server Efficiently with PowerShell by Allen White
  • Monday, May 18 – pre-con day. There are two workshops running. And, the first keynote of the conference is Monday evening.
    • Performance Troubleshooting using Waits and Latches by Paul Randal
      Setting Up and Troubleshooting SQL Server by Brent Ozar
  • Tuesday, May 19 through Thursday, May 21 is the main conference. Conference sessions will run all day in multiple tracks:
    • Check out our sessions online here: http://www.sqlintersection.com/SQL-Conference/Sessions
    • Be sure to check out our cross-conference events and sessions.
    • Get your pop-culture trivia and techie-SQL-trivia hat on and join us for SQLafterDark on Wednesday evening, May 20th
  • Friday, May 21 is our final full day running with an absolutely applicable workshop for most development environments.
    • Performance Tuning When You Can’t Fix the Queries by Brent Ozar

SQLintersection: Why is it for you?

If you want practical information delivered by speakers that not-only know the technologies but are competent and consistently, highly-rated presenters – this is the show for you. You will understand the RIGHT features to troubleshoot and solve your performance and availability problems now! 

Check us out: www.SQLintersection.com.

We hope to see you there!
kt

PS – Use the discount code ‘SQLskills’ when you register and receive $50 off registration!

Nonclustered indexes require the “lookup” key in the b-tree when?

I received a great question in email and it’s something I cover in our IEPTO1 (Immersion Event on Performance Tuning, Part 1) so I thought I’d write a post about it…

Question: 

When you have a non-unique key value in a nonclustered index, SQL Server adds the RID / Row Identifier (if the NC is on a heap table) or the clustered key (if the table has a clustered index) to the navigational structure (the b-tree) of the nonclustered index to make non-unique index rows unique. What is the reason behind that implementation?

Answer: 

SQL Server requires the lookup value (the RID or the clustering key) to go up the tree when a nonclustered key is non-unique in order to efficiently locate rows in the leaf level.

Scenario: Take a non-unique, nonclustered index on gender (on an employee table that’s clustered by EmployeeID). To show this, I’ll create a simple tree structure with only 24 rows (12 female and 12 male employees).

Note: this is NOT exactly what the index looks like but for simplicity, I’ll use this. Also, this data set is small for simplicity (you can understand this problem with a small data set); the problem is even more exaggerated for larger and larger data sets. Finally, if you’re new to index internals, consider checking out my online course here.

Possible Option 1: Only push the non-unique nonclustered key up the tree

If the leaf level is sorted only by the nonclustered key of gender then there’s no order to the EmployeeID values within each gender grouping:

Imagine a nonclustered index on gender with a clustered index on EmployeeID. If the leaf level were not sorted then every request (DELETE / UPDATE) would have to scan ALL of the Fs to find the female row they were searching for… This is a “What If” picture; this is NOT how nonclustered indexes are stored.

In this case, the index has very few uses. We can get a count of Female or Male employees easily but not much else is efficient. A SELECT will have to scan (from the start – every time); it can stop once the record is found (EmployeeID is unique) but the cost of always starting at page 1 (for female) or page 4 (for male) would be too expensive with many rows / pages.

Having said that though, you might even start by wondering what an index on gender would benefit anyway? The most useful requests are exactly that (counts by gender, list by gender). So, for most queries this structure could work and it really wouldn’t matter for queries. Inserts could just go to the end of each section. But, what about deletes? What if an employee were to leave? If you were to execute the following:

DELETE [Employee] WHERE [EmployeeID] = 27;
GO

How would SQL Server find that row? They would know that the employee is female when they went to the data row itself. But, finding this row within the female grouping of this nonclustered index would ALWAYS require a scan through all female rows. The same would be true for an update.

So, while this could work for inserts as well as some queries, it’s horribly inefficient for updates and deletes.

Possible Option 2: Sort the leaf level but still ONLY push the non-unique nonclustered key up the tree

Even if they were sorted – if you don’t “push” the clustered key value (EmployeeID) up the tree, the index cannot be efficiently searched / modified, we still have to start at the lowest value and scan until we find the row to DELETE / UPDATE. This is a “What If” picture; this is NOT how nonclustered indexes are stored.

Here we don’t save much; we can still stop a scan once a value was found. But, we can do that in the structure for Option 1 (as long as EmployeeID was defined as unique).  About the only added benefit of this structure is for an ORDER BY query.

SELECT [EmployeeID], [Gender]
FROM [Employee]
ORDER BY [Gender], [EmployeeID];
GO

For Option 1 this query would have to add an additional sort. For Option 2, we’d save this. But, we’d still have the same problem for a DELETE / UPDATE; we’d have to scan to find the specific row to modify.

So, we’re really back to the same issues as Option 1: this could work for inserts as well as even a few more queries, it’s horribly inefficient for updates and deletes.

SQL Server pushes the “lookup value” up the tree

This makes each row in a nonclustered tree unique. This makes every query directly seekable to the specific page on which that row resides. We can support queries with additional predicates (such as WHERE EmployeeID >). Again, this index doesn’t provide a lot of uses BUT it’s still MORE useful when it’s sorted AND seekable by the composite key and where that key is unique up the tree.

A delete can go directly to the leaf page.

DELETE [Employee] WHERE [EmployeeID] = 27;
GO

After going to the row, we’d find that EmployeeID 27 is female. Using the root page we can see that the first pointer is for EmployeeID 2 or higher. The second pointer is for EmployeeID 59 or higher. Since we’re looking for 27 – we’d use the first pointer which points to page 1 for data.

In the same way, an update can go directly to the leaf page. An insert has a specific place to insert (yes, these can cause splits).

With both the non-unique nonclustered key as well as the clustered key, you have a tree structure that can be efficiently searched / modified / updated.

So, while I could have used a larger and more realistic example for this – this definitely shows the need for the clustering key in the b-tree portion of an index.

Have fun and keep those questions coming!

Cheers,
k

 

 

SQLintersection Conference and SQLafterDark Evening Event – what a fantastic week in Vegas!

DEV-F2014

SQL-F2014

 

 

 

201310281834-_MG_9240

The wonderful skyline in Vegas. This is the view of the hotel “New York, New York” which is directly across the street from the MGM Grand where we’ve held SQLintersection for Fall 2013, Fall 2014, and we’ll be heading back for Fall 2015. We hope to see you there!

It’s been a busy few weeks… and it all came together amazing well last week as we had our twice yearly SQLintersection conference (the Fall event is usually held in Las Vegas). It was only 2 years ago that we decided to move to a different format and really handle a conference the way that WE want to (you can read a bit more about that here: SQLintersection: a new year, a new conference). And, I’m happy to say that most of it went really well. We’re still learning little tips and tricks along the way and there are ALWAYS things that we can improve but overall, if we can keep on the same track as last week’s event, we’ll be very happy!

First off, our speaker line-up was PHENOMENAL!

  • Aaron Bertrand, Sr. Consultant, SQL Sentry, Inc. [blog | twitter]
  • Andrew J. Kelly, Mentor, SolidQ [blog | twitter]
  • Bob Ward, Principal Architect Escalation Engineer, Microsoft [blog | twitter]
  • Brent Ozar, Brent Ozar Unlimited [blog | twitter]
  • David Pless, Dedicated Support Engineer (PFE), Microsoft
  • Glenn Berry, Principal Consultant, SQLskills.com [blog twitter]
  • Jeremiah Peschka, Brent Ozar Unlimited [blog | twitter]
  • Jonathan Kehayias, Principal Consultant, SQLskills.com [blog | twitter]
  • Joseph Sack, Sack Consulting [website | blog]
  • Kendra Little, Managing Director, Brent Ozar Unlimited [blog | twitter]
  • Kevin Farlee, Storage Engine Program Manager, SQL Server, Microsoft
  • Kevin Kline, Director of Engineering Services, SQL Sentry, Inc. [blog | twitter]
  • Kimberly L. Tripp, President/Founder, SQLskills.com [blog | twitter]
  • Paul S. Randal, CEO/Owner, SQLskills.com [blog | twitter]
  • Paul White, SQL Kiwi Limited [blog | twitter]
  • Shep Sheppard, Senior Program Manager, Microsoft AzureCAT [blog twitter]
  • Tim Chapman, Dedicated Support Engineer (PFE), Microsoft [blog | twitter]

Our Content was Top Notch!

In addition to our 8 full-day workshops (3 precons on Sunday, 3 precons on Monday, and 2 post-cons on Friday), our 44 regular conference sessions and our two fantastic SQL-oriented keynotes (in additional to many developer and show keynotes), we also held our first-ever SQLafterDark evening event. We put this together somewhat at the last minute as we decided that we wanted to have more of a social evening event just for SQL folks. And, despite the late addition – it worked out perfectly.

201411081046-2014-11-08 10.46.10

SQLintersection workshop workbooks – lots of content over 3 different workshop days – and a few folks did ALL THREE days of workshops!

SQLafterDark – a new and PERMANENT addition to SQLintersection

TempLogo

Almost every speaker was able to join us and we came up with some great prizes to raffle off AND some great questions for a trivia game. We started the evening with everyone coming in and grabbing a drink. We socialized for the first 30+ mins and I brought attendees and speakers together by hearing questions and then matching the right folks together. We all relaxed and unwound from an intense day of sessions and we all had a drink (the first round was sponsored by SQL Sentry and SQLskills).

201411122102-_MG_0215

Roundtables, an open bar, TONs of great prizes, a great set of SQL folks all in one place – to relax / chat / unwind. Definitely an event to repeat!

201411122102-_MG_0218

Since it was SQLafterDark, I decided to go with a grainy black and white image of the event. What a fantastic evening. A new STAPLE for our SQLintersection event.

B2WKQh5CYAA_m9S

Our 24-question, two-round trivia game was filled with both technical and pop-culture trivia (questions supplied by SQL speakers and tied to their interests). Here’s the team that WON the overall game for the night. They each received a SQL Sentry Plan Explorer Pro license for their having won the challenging game. They also got bragging rights for the rest of the conference!

Accessing our FANTASTIC content post-show

This is one of the areas that we still need to improve upon. And, for this I’d like to apologize. It’s now 3 days after the show and the content has not yet been posted online. For those of you who have sent me mail – keep on doing that. I’m happy to get you the content you need directly. As for a show (with many moving parts), I’ve learned that I can only “ask” so much. And, since I don’t directly control the posting, it’s harder than I imagined to get all of the “updates” posted to our update location. If you were at the event then you know the location where this content is supposed to be and you know that I had told you that it should be available by Monday. I’m not going to point fingers but I had been told that it would be posted Friday / Monday after the show. But, a combination of things has delayed this process (no, it’s NOT acceptable IMO). This is one of the things that we WILL be improving for our next show – I PROMISE!

So… until all of the updates are posted (which really should be VERY soon; I’ve been PROMISED today / tonight), feel free to email me with your requests for the content that you need / want (give me the speaker and session name and I’ll work to get the resources to you ASAP).

UPDATE: About 2 hours after I wrote this post, I received the email that our resources are online! Yeah. And, we’ll make sure that this process is streamlined for the next conference so that things don’t go into the following week. Thanks for your patience everyone!!

Finally, a few pictures from the DEVintersection and SQLintersection conferences!

201411110952-_MG_0148

Our individual show logos were highlighted on the walls…

201411110950-_MG_0142

This was just as we were setting up for the first evening keynote by Microsoft Executive Vice President Scott Guthrie. Scott’s on stage; he’s in red (way off in the distance).

201411110955-_MG_0153

Here, Scott Guthrie and Richard Campbell (of DotNetRocks and RunAs Radio) prepare for the opening keynote. Talk about two fantastic techie folks taking the stage…

201411110956-_MG_0156

One more of Scott and Richard.

201411110951-_MG_0147

Turn off the lights, we’re going to open the doors!

201411111003-_MG_0166

And now, it’s time to start the keynote!

201411111009-_MG_0177

A few more angles for our opening keynote session with Scott Guthrie – Executive Vice President of the Cloud and Enterprise group in Microsoft. He is best known for his work on ASP.NET, which he and colleague Mark Anders developed while at Microsoft.

201411111011-_MG_0181

A few more angles for our opening keynote session with Scott Guthrie – Executive Vice President of the Cloud and Enterprise group in Microsoft. He is best known for his work on ASP.NET, which he and colleague Mark Anders developed while at Microsoft.

201411111006-_MG_0173

Last one! A few more angles for our opening keynote session with Scott Guthrie – Executive Vice President of the Cloud and Enterprise group in Microsoft. He is best known for his work on ASP.NET, which he and colleague Mark Anders developed while at Microsoft.

201411111304-2014-11-11 13.04.16

Lunch with 1500+ of my favorite techie folks (the lunch hall had just opened so I grabbed a quick shot before folks sat down). I heard lots of great things about our food during the event – which is always a good thing as it’s hard to feed so many folks at the same time! But, Vegas does this really well!

And now for our SQLintersection sessions:

201411131010-_MG_0228

Here’s Kendra Little showing an execution plan.

Here’s Bob Ward delivering an 8:30am session on Troubleshooting. I love this session and his words of wisdom on how to troubleshoot so that you can avoid calling him! This session always makes me think of Cuba Gooding Jr. in Jerry Maguire – “Help me, help you.”

201411121452-_MG_0208

I had a packed house for sessions on statement execution and stored procedures and with lunch in between my two sessions, I got to answer a lot of questions and hang out in the room (I try to do this if I can as it gives me more time with attendees). THANKS for all of your great questions!

201411121245-_MG_0201

Another angle, another session… this time it’s optimizing stored procedures!

201411121451-_MG_0206

OK, thanks Paul – you got me again! ;-)

201411121108-_MG_0200

Here’s Paul White discussing Parallelism in a two-parter of deep 300 and 400 level sessions.

201411120415-_MG_0191

Here’s Brent talking something… (OK, I can’t remember exactly which session this was). But, it is Brent – just doing his magical thing!

201411121107-_MG_0199

And here’s Microsoft’s Tim Chapman talking about tools for troubleshooting.

201411120414-_MG_0188

Andrew Kelly talks TempDB optimizations and usage.

201411120414-_MG_0189

Another one of Andrew Kelly talking tempdb.

201411111457-_MG_0193

And here’s my lovely husband talking Index Fragmentation. Or, something. Really, I don’t listen to him anymore… Ha! Just kidding!! Just checking to see if anyone’s actually reading all of these captions. ;-)

More fun stuff

On Tuesday evening, we had a speaker cocktail party after speakers / attendees went to the expo hall. I don’t have a lot of pictures from this but I do have a favorite that I took of two of our dear friends – Mark Minasi (left) and Richard Campbell (right). These two are an ABSOLUTE blast to sit back and hang out with…

201411120402-2014-11-11 20.02.13

Then, on Friday evening, Paul and I wrapped up the week heading out for dinner and a show (“O”) with Bob Ward and his lovely wife, Ginger. What a great way to end the week! If you have the opportunity to see a Cirque du Soleil show, I highly recommend it. In fact, one of my favorite things to do while in Vegas is “hit a show” and we try to do this almost every time we’re in Vegas. We’ve seen some wonderful presenters over the years – Bette Midler, Cher, Elton John, Rod Stewart… and our favorite Wayne Newton (no, seriously – he was so genuine and so down to earth – it was incredible). My Mother and Wayne Newton were born on the EXACT same day so we actually got tickets because my Mother wanted to see him (we never would have chosen him on our own but we’re SOOOOO glad that we did). Anyway, Vegas is really AMAZING for shows!

Here are a few pictures from our evening out and about with Bob and Ginger:

201411141932-2014-11-14 19.32.04

There’s a piano bar just off the main entrance of the Bellagio. We stopped there before dinner. I asked to hear some ColdPlay and the pianist knocked it out of the park. It was EXCELLENT.

201411142012-IMG_0804

Here we are at dinner… tapas, margaritas, and wonderful conversation – what a way to end a crazy week. Thanks Bob / Ginger!

201411142153-2014-11-14 21.53.23

This is during the “pre-show” as everyone is getting settled into their seats.

201411142334-2014-11-14 23.34.40

This is the ceiling in the Bellagio “O” theater – a sight in an of itself. Gorgeous theater specially-built for “O” (which is a water show – and, the stage is a big part [and, super interesting part] of the show)! Highly recommended!!

201411142157-2014-11-14 21.57.43

Yep… he’s all mine. I can’t even get a quick selfie shot that’s nice. Welcome to my world.

OK, that’s it from our Fall SQLintersection show. We’re already planning for Spring and Fall 2015. Be sure to save the date!

SQLintersection 2015: Save the Date

Spring 2015 – the week of May 18, 2015 and the location is still TBD (we should know later this week)

Fall 2015 – week of November 2, 2015 and we’re back in Las Vegas, NV

Thanks so much! We’re really looking forward to our next show already (already planning a few things for SQLafterDark).

Cheers,
k

Building High Performance Stored Procedures

[NOTE: Update 12 Nov 2014. The most recent bug (KB article KB2965069) is THANKFULLY very unlikely for most environments (mentioned below). As a result, I’m glad to say that we can use OPTION (RECOMPILE) as a much easier (and safer) solution. So, while I’ll still leave solution 3 as an option if you run into troubles with OPTION (RECOMPILE), I can luckily say that if you’re on the latest SP/CU – the problems are incredibly unlikely. The BEST hybrid solution is shown below as solution 2.]

In my SQL PASS Summit 2014 session last week, I spoke about building high performance stored procedures and properly dealing with parameter sniffing problems… if you’re interested, the session was recorded for PASStv and you can watch it here.

It all ties to something incredibly common in development – I call it the multipurpose procedure. Simply put, it’s one application dialog with many possible options and then sitting behind it – ONE stored procedure. This one stored procedure is supposed to handle every possible case of execution with multiple parameters. For example, imagine a dialog to search for a customer.

CustomerSearch

The user can enter in any combination of these elements and the procedure’s header looks like this:

CREATE PROC [dbo].[GetCustomerInformation]
(
    @CustomerID      INT = NULL
    , @LastName	     VARCHAR (30) = NULL
    , @FirstName     VARCHAR (30) = NULL
    , @MiddleInitial CHAR(1) = NULL
    , @EmailAddress  VARCHAR(128) = NULL
    , @Region_no     TINYINT = NULL
    , @Cust_code     TINYINT = NULL
)
...

And then the procedure’s main WHERE clause looks like this:

WHERE ([C].[CustomerID] = @CustomerID OR @CustomerID IS NULL)
    AND ([C].[lastname] LIKE @LastName OR @LastName IS NULL)
    AND ([C].[firstname] LIKE @FirstName OR @FirstName IS NULL)
    AND ([C].[middleinitial] = @MiddleInitial OR @MiddleInitial IS NULL)
    AND ([C].[EmailAddress] LIKE @EmailAddress OR @EmailAddress IS NULL)
    AND ([C].[region_no] = @Region_no OR @Region_no IS NULL)
    AND ([C].[cust_code] = @Cust_code OR @Cust_code IS NULL)GO

Or, possibly like this:

WHERE [C].[CustomerID] = COALESCE(@CustomerID, [C].[CustomerID])
    AND [C].[lastname] LIKE COALESCE(@lastname, [C].[lastname])
    AND [C].[firstname] LIKE COALESCE(@firstname, [C].[firstname])
    AND [C].[middleinitial] = COALESCE(@MiddleInitial, [C].[middleinitial])
    AND [C].[Email] LIKE COALESCE(@EmailAddress, [C].[Email])
    AND [C].[region_no] = COALESCE(@Region_no, [C].[region_no])
    AND [C].[cust_code] = COALESCE(@Cust_code, [C].[Cust_code]);

Or, maybe even like this:

WHERE [C].[CustomerID] = CASE WHEN @CustomerID IS NULL THEN [C].[CustomerID] ELSE @CustomerID END
    AND [C].[lastname] LIKE CASE WHEN @lastname IS NULL THEN [C].[lastname] ELSE @lastname END
    AND [C].[firstname] LIKE CASE WHEN @firstname IS NULL THEN [C].[firstname] ELSE @firstname END
    AND [C].[middleinitial] = CASE WHEN @MiddleInitial IS NULL THEN [C].[middleinitial] ELSE @MiddleInitial END
    AND [C].[Email] LIKE CASE WHEN @EmailAddress IS NULL THEN [C].[Email] ELSE @EmailAddress END
    AND [C].[region_no] = CASE WHEN @Region_no IS NULL THEN [C].[region_no] ELSE @Region_no END
    AND [C].[cust_code] = CASE WHEN @Cust_code IS NULL THEN [C].[cust_code] ELSE @Cust_code END

But, no matter which of these it looks like – they’re all going to perform horribly. An OSFA procedure does not optimize well and the end result is that you’ll get one plan in cache. If that was generated from a different combination of parameters than the ones you’re executing, you might get an absolutely abysmal plan. The concept is fairly simple – when a procedure executes and there isn’t already a plan in cache (for that procedure), then SQL Server has to generate one. To do so it “sniffs” the input parameters and optimizes based on the parameters sniffed. This is the plan that gets stored with the procedure and saved for subsequent executions. Depending on your workload characteristics, you might end up with a very common plan in cache and other users also executing the common parameters might see reasonable performance (especially if the WHERE does NOT use LIKE and has nothing but equality-based criteria [those aren’t quite as bad as this one]). But, there will still be cases where an atypical execution performs horribly.

And, unfortunately, sometimes the worst combination happens – a very atypical execution is the one that gets sniffed and then everyone suffers.

So, the admin might update statistics or force a recompile (or, even restart SQL Server) to try and get around this problem. These things “fix” the immediate problem by kicking the plan out of cache but they are NOT A SOLUTION.

Solution 1: The Simple Solution – OPTION (RECOMPILE)

Since SQL Server 2005, we’ve had an option to add OPTION (RECOMPILE) to the offending statement. When OPTION (RECOMPILE) works – it works incredibly well. However, there have been bugs and you’ll want to make sure that you’re on the latest SP or CU. It’s definitely had a bit of a checkered past. Here’s a list of issues and links to KB articles / bugs around this problem:

  • KB2965069 FIX: Incorrect result when you execute a query that uses WITH RECOMPILE option in SQL Server 2012 or SQL Server 2014. Fixed in:
    • Cumulative Update 4 for SQL Server 2014
    • Cumulative Update 2 for SQL Server 2012 SP2
    • Cumulative Update 11 for SQL Server 2012 SP1
  • KB968693 FIX: A query that uses parameters and the RECOMPILE option returns incorrect results when you run the query in multiple connections concurrently in SQL Server 2008. Fixed in:
    • SQL Server 2008 CU4
    • SQL Server 2008 R2 CU1
  • I don’t have a KB article link for this one but in both SQL Server 2008 and SQL Server 2008 R2 there are versions where some performance related features were unable. For example, filtered indexes could not be used inside of stored procedures even if you used OPTION (RECOMPILE) because of changes with the behavior or bugs. So, I’d fall back on using DSE (shown by solution 3).

Anyway, despite these issues (the first one listed above – is INCREDIBLY unlikely), I LOVE the simplicity of OPTION(RECOMPILE) because all you have to do is change the offending statement like this:

SELECT ...
FROM ...
WHERE ...
OPTION (RECOMPILE);

The reason I love this is for the simplicity of the statement. The reason that I have an issue with just doing this as a standard is two-fold:

  • First, if you do this too often (or, for extremely frequently executed procedures), you might end up using too much CPU.
  • Second, I’ve seen some environments where this is used to fix ONE problem and then it becomes a standard for ALL problems. This DEFINITELY should NOT become a standard coding practice. You only want to use this if your testing shows parameter sniffing problems and plan unstability. Stored procedures that have stable plans should be compiled, saved, and reused.

A better solution is one where you recompile for unstable plans but for stable plans – you place those in cache for reuse.

A Hybrid Solution – recompile when unstable, cache when stable

My main problem with adding OPTION (RECOMPILE) to many statements is that it can be costly to do when you’re executing thousands of procedures over and over again. And, if you really think about it, not all combinations really need to be recompiled.

For example, take an easier combination of just 3 parameters: CustomerID, Lastname, and Firstname. The possible combinations for execution will be:

  1. CustomerID alone
  2. Lastname alone
  3. Firstname alone
  4. CustomerID and Lastname together
  5. CustomerID and Firstname together
  6. Firstname and Lastname together
  7. Or, all three – CustomerID, Lastname, and Firstname

My questions for you to consider are:

  • Are all of those going to want the same execution plan?
  • And, what if they supply wildcards?

The most important consideration is that ANY query that submits CustomerID should NOT be recompiled, right? Think about it – if someone says I want to search for all of the people that have an ‘e’ in their last name (Lastname LIKE ‘%e%’) AND that have an ‘a’ in their first name (Firstname LIKE ‘%a%’) AND that have a CustomerID of 123456789 then all of a sudden this becomes an incredibly simple query. We’re going to go an lookup CustomerID 123456789 and then check to see if they have an ‘a’ in their first name and an ‘e’ in their last name. But, the lookup is incredibly simple. So, ANY execution where CustomerID is supplied should NOT be recompiled; the plan for those executions is STABLE. However, Lastname and Firstname searches might be highly unstable – especially if there’s a leading wildcard. So, if wildcard is present then let’s recompile if they don’t supply at least 3 characters for BOTH the first name and last name – preceding the wildcard character. And, it’s also OK if they don’t supply a wildcard at all. Then, the overall number of recompilations will be much lower – we’ll save CPU and we’ll get better performance by using a lot of STABLE, precompiled plans.

To do this, there are TWO versions. Solution 2 uses OPTION (RECOMPILE) strategically. Solution 3 using dynamic string execution instead.

Solution 2 – recompile when unstable [by adding OPTION (RECOMPILE)], cache when stable (using sp_executesql)

There are absolutely some criteria when plan stability can be predicted. For these, we will build the statement to concatenate only the non-NULL parameters and then execute normally using sp_executesql. For where it’s unknown or unstable, we’ll recompile by OPTION (RECOMPILE) to the statement. For all executions, use sp_executesql. And, of course, this is ONLY substitution for parameters. And, this should only be done for the most frequently executed procedures. You don’t have to do this for every procedure but it’s extremely beneficial for things that are heavily executed.

----------------------------------------------
-- Solution 2
----------------------------------------------

CREATE PROC [dbo].[GetCustomerInformation]
(
    @CustomerID	BIGINT = NULL
    , @LastName	VARCHAR (30) = NULL
    , @FirstName VARCHAR (30) = NULL
    , @MiddleInitial CHAR(1) = NULL
    , @EmailAddress VARCHAR(128) = NULL
    , @Region_no TINYINT = NULL
    , @Member_code TINYINT = NULL
)
AS
IF (@CustomerID IS NULL
    AND @LastName IS NULL
    AND @FirstName IS NULL
    AND @MiddleInitial IS NULL
    AND @EmailAddress IS NULL
    AND @Region_no IS NULL
    AND @Member_code IS NULL)
BEGIN
    RAISERROR ('You must supply at least one parameter.', 16, -1);
    RETURN;
END;

DECLARE @ExecStr NVARCHAR (4000),
        @Recompile  BIT = 1;

SELECT @ExecStr =
	N'SELECT COUNT(*) FROM [dbo].[Customers] AS [C] WHERE 1=1';

IF @CustomerID IS NOT NULL
	SELECT @ExecStr = @ExecStr
		+ N' AND [C].[CustomerID] = @CustID';

IF @LastName IS NOT NULL
	SELECT @ExecStr = @ExecStr
		+ N' AND [C].[LastName] LIKE @LName'; 

IF @FirstName IS NOT NULL
	SELECT @ExecStr = @ExecStr
		+ N' AND [C].[Firstname] LIKE @FName';

IF @MiddleInitial IS NOT NULL
	SELECT @ExecStr = @ExecStr
		+ N' AND [C].[MiddleInitial] = @MI';

IF @EmailAddress IS NOT NULL
	SELECT @ExecStr = @ExecStr
		+ N' AND [C].[EmailAddress] LIKE @Email';

IF @Region_no IS NOT NULL
	SELECT @ExecStr = @ExecStr
		+ N' AND [C].[Region_no] = @RegionNo';

IF @Member_code IS NOT NULL
	SELECT @ExecStr = @ExecStr
		+ N' AND [C].[Member_code] = @MemberCode';

-- These are highly limited sets
IF (@CustomerID IS NOT NULL)
    SET @Recompile = 0

IF (PATINDEX('%[%_?]%', @LastName) >= 4
        OR PATINDEX('%[%_?]%', @LastName) = 0)
    AND (PATINDEX('%[%_?]%', @FirstName) >= 4
        OR PATINDEX('%[%_?]%', @FirstName) = 0)
    SET @Recompile = 0

IF (PATINDEX('%[%_?]%', @EmailAddress) >= 4
        OR PATINDEX('%[%_?]%', @EmailAddress) = 0)
    SET @Recompile = 0

IF @Recompile = 1
BEGIN
    --SELECT @ExecStr, @Lastname, @Firstname, @CustomerID;
    SELECT @ExecStr = @ExecStr + N' OPTION(RECOMPILE)';
END;

EXEC [sp_executesql] @ExecStr
    , N'@CustID bigint, @LName varchar(30), @FName varchar(30)
    , @MI char(1), @Email varchar(128), @RegionNo tinyint
    , @MemberCode tinyint'
    , @CustID = @CustomerID
    , @LName = @LastName
    , @FName = @FirstName
    , @MI = @MiddleInitial
    , @Email = @EmailAddress
    , @RegionNo = @Region_no
    , @MemberCode = @Member_code;
GO

Solution 2 is the preferred and best method to create a balance where you don’t just recompile every time. This will give you better long-term scalability. 

Solution 3 – recompile when unstable [by using DSE], cache when stable (using sp_executesql)

In the past, if I’ve ever run into problems with OPTION (RECOMPILE), I always consider (albeit, not lightly) rewriting the statement using a dynamic string instead; this always works! But, yes, it’s a pain to write. It’s a pain to troubleshoot. And, above all, yes, you have to be careful of SQL injection. There are ways to reduce and even eliminate the potential for problems (and, not all DSE can). Above all, you do need to make sure you have clean input (limiting only to valid input for the type of column such as characters for names and emails, etc.). Check out this blog post if you want more information Little Bobby Tables, SQL Injection and EXECUTE AS.

Based on that blog post, the solution to this problem is the following code.

-- To reduce the potential for SQL injection, use loginless
-- users. Check out the "Little Bobby Tables" blog post for
-- more information on what I'm doing in this code / execution
-- to reduce the potential surface area of the DSE here.

CREATE USER [User_GetCustomerInformation]
WITHOUT LOGIN;
GO

GRANT SELECT ON [dbo].[Customers]
TO [User_GetCustomerInformation];
GO

-- You'll need this if you want to review the showplan
-- for these executions.
GRANT SHOWPLAN TO [User_GetCustomerInformation];
GO

----------------------------------------------
-- Solution 3
----------------------------------------------

CREATE PROC [dbo].[GetCustomerInformation]
(
 @CustomerID BIGINT = NULL
 , @LastName VARCHAR (30) = NULL
 , @FirstName VARCHAR (30) = NULL
 , @MiddleInitial CHAR(1) = NULL
 , @EmailAddress VARCHAR(128) = NULL
 , @Region_no TINYINT = NULL
 , @Cust_code TINYINT = NULL
)
WITH EXECUTE AS N'User_GetCustomerInformation'
AS
IF (@CustomerID IS NULL
 AND @LastName IS NULL
 AND @FirstName IS NULL
 AND @MiddleInitial IS NULL
 AND @EmailAddress IS NULL
 AND @Region_no IS NULL
 AND @Cust_code IS NULL)
BEGIN
 RAISERROR ('You must supply at least one parameter.', 16, -1);
 RETURN;
END;

DECLARE @spexecutesqlStr NVARCHAR (4000),
 @ExecStr NVARCHAR (4000),
 @Recompile BIT = 1;

SELECT @spexecutesqlStr =
 N'SELECT * FROM [dbo].[Customers] AS [C] WHERE 1=1';

SELECT @ExecStr =
 N'SELECT * FROM [dbo].[Customers] AS [C] WHERE 1=1';

IF @CustomerID IS NOT NULL
BEGIN
 SELECT @spexecutesqlStr = @spexecutesqlStr
 + N' AND [C].[CustomerID] = @CustID';
 SELECT @ExecStr = @ExecStr
 + N' AND [C].[CustomerID] = CONVERT(BIGINT, ' + CONVERT(NVARCHAR(30),@CustomerID) + N')';
END

IF @LastName IS NOT NULL
BEGIN
 SELECT @spexecutesqlStr = @spexecutesqlStr
 + N' AND [C].[LastName] LIKE @LName';
 SELECT @ExecStr = @ExecStr
 + N' AND [C].[LastName] LIKE CONVERT(VARCHAR(30),' + QUOTENAME(@LastName, '''''') + N')';
END

IF @FirstName IS NOT NULL
BEGIN
 SELECT @spexecutesqlStr = @spexecutesqlStr
 + N' AND [C].[Firstname] LIKE @FName';
 SELECT @ExecStr = @ExecStr
 + N' AND [C].[FirstName] LIKE CONVERT(VARCHAR(30),' + QUOTENAME(@FirstName, '''''') + N')';
END

IF @MiddleInitial IS NOT NULL
BEGIN
 SELECT @spexecutesqlStr = @spexecutesqlStr
 + N' AND [C].[MiddleInitial] = @MI';
 SELECT @ExecStr = @ExecStr
 + N' AND [C].[MiddleInitial] = CONVERT(CHAR(1), ' + QUOTENAME(@MiddleInitial, '''''') + N')';
END

IF @EmailAddress IS NOT NULL
BEGIN
 SELECT @spexecutesqlStr = @spexecutesqlStr
 + N' AND [C].[EmailAddress] LIKE @Email';
 SELECT @ExecStr = @ExecStr
 + N' AND [C].[EmailAddress] LIKE CONVERT(VARCHAR(128), ' + QUOTENAME(@EmailAddress, '''''') + N')';
END

IF @Region_no IS NOT NULL
BEGIN
 SELECT @spexecutesqlStr = @spexecutesqlStr
 + N' AND [C].[Region_no] = @RegionNo';
 SELECT @ExecStr = @ExecStr
 + N' AND [C].[Region_no] = CONVERT(TINYINT, ' + CONVERT(NVARCHAR(5),@Region_no) + N')';
END

IF @Cust_code IS NOT NULL
BEGIN
 SELECT @spexecutesqlStr = @spexecutesqlStr
 + N' AND [C].[Cust_code] = @MemberCode';
 SELECT @ExecStr = @ExecStr
 + N' AND [C].[Cust_code] = CONVERT(TINYINT, ' + CONVERT(NVARCHAR(5), @Cust_code) + N')';
END

-- These are highly limited sets
IF (@CustomerID IS NOT NULL)
 SET @Recompile = 0

IF (PATINDEX('%[%_?]%', @LastName) >= 4
 OR PATINDEX('%[%_?]%', @LastName) = 0)
 AND (PATINDEX('%[%_?]%', @FirstName) >= 4
 OR PATINDEX('%[%_?]%', @FirstName) = 0)
 SET @Recompile = 0

IF (PATINDEX('%[%_?]%', @EmailAddress) >= 4
 OR PATINDEX('%[%_?]%', @EmailAddress) = 0)
 SET @Recompile = 0

IF @Recompile = 1
BEGIN
 -- Use this next line for testing
 -- SELECT @ExecStr -- For testing
 EXEC (@ExecStr);
END
ELSE
BEGIN
 -- Use this next line for testing
 -- SELECT @spexecutesqlStr, @Lastname, @Firstname, @CustomerID;
 EXEC [sp_executesql] @spexecutesqlStr
 , N'@CustID bigint, @LName varchar(30), @FName varchar(30)
 , @MI char(1), @Email varchar(128), @RegionNo tinyint
 , @CustomerCode tinyint'
 , @CustID = @CustomerID
 , @LName = @LastName
 , @FName = @FirstName
 , @MI = @MiddleInitial
 , @Email = @EmailAddress
 , @RegionNo = @Region_no
 , @CustomerCode = @Cust_code;
END;
GO

This solution should NOT be used as a standard. But, it is an option for handling some cases where OPTION (RECOMPILE) isn’t giving you the plans / performance that you need. This solution is more complicated for developers to code. But, I’m not saying you have to do this for absolutely everything. Just work through the most critical procedures that are causing you the most grief.

The RIGHT way to handle multipurpose procedures

The hybrid solution (solution 2) is more complicated for developers to code. Remember, I’m not saying you have to do this for absolutely everything. Just work through the most critical procedures that are causing you the most grief (because they’re the most heavily executed procedures). The hybrid solution really gives you the best of all worlds here:

  • It reduces CPU by only recompiling for unstable combinations
  • It allows every combination executed with sp_executesql to get its own plan (rather than one per procedure)
  • It allows stable plans to be put in cache and reused for subsequent executions (which reduces your impact to CPU and gives you long-term scalability)

Above all, you need to test that this works as expected but not only will you get better plans but you’ll also scale better!

Cheers,
kt

For more information

  • Check out my PASS presentation DBA-313 here.
  • Check out my Pluralsight course: SQL Server: Optimizing Ad Hoc Statement Performance here.
  • Check out my Pluralsight course: SQL Server: Optimizing Stored Procedure Performance here.
  • Check out our Immersion Event: IEPTO2: Immersion Event on Performance Tuning and Optimization – Part 2 here.

And so it (PASS 2014) begins…

Tonight is the opening evening event for PASS 2014 and we’re (SQLskills, as a team), really looking forward to seeing familiar faces and hearing your fantastic questions, problems, and DBA-stories. I’m putting the final touches on my session (and also going over feedback from last year’s session) and I stumbled on a bunch of photos that I wanted to share from last year.

I look forward to seeing everyone and taking lots of funs pictures. But, above all – I love the stories. Be sure to stop by and see me!

Here’s a really cool shot that @SQLBalls took of my session last year:

Bradley Ball's Pano shot of Ballroom A for my session

Bradley Ball’s Pano shot of Ballroom A for my session

Here’s another shot during the session. I had lots of great questions from the group (from here down – you can click on the photos to see a larger image of the shot).

DuringSessions

SQL PASS 2013 – my session on skewed data

And how COOL is this? Because my session was broadcast on #PASStv there were folks watching from all over. I got this shot from a group that was watching together in their offices. This year I’m on #PassTV again TOMORROW so I hope to see a few of you again. Well, see you on twitter. ;-)

WatchingFromAfar

Watching #PASStv

This shot is from 45 minutes after my session and we were still going. I’m HAPPY to say that I have the last slot of the day once again and I expect that it may look like this again… 45 minutes after my session ends! I think the questions, the problems, the excitement that attendees have is what I [personally] love the most about these conferences!

AnsweringQs

45 minutes after my session and we were still going!

And now for the friends… here’s Erin and I. She’s MUCH taller than I am. See…

Erin&Kimberly

@ErinStellato is MUCH taller than I am!

 

Out for a few at the SQL Sentry party 
IsThatQwAPhotoBomb     MeStevePaulJesFromSQLAsylum

And now the attendee party – these are always great fun. I’m so glad that a few of you sent me the shots from your phones! 
NascarAttendeePhoto_@LaerteSQLDBA     NascarAttendeePhoto_@murilocmiranda     NascarAttendeePhoto_@ScottOstr     NascarAttendeePhoto_@SQLenergy     NascarAttendeePhoto_@yourSQLman

 

And last year’s AFTER PARTY was absolutely spectacular. We were in Charlotte, NC so SQL Sentry (based in Charlotte) KNOCKED IT OUT OF THE PARK with a KILLER event. They gave away some drives and a few of us (er, well – me and @PaulRandal) just HAD to tag along. One of my favorite shots is of @AaronBertrand, @KEKline and I. What a great time!! And, if you really want to see some great video action – check out SQL Sentry’s “Who’s the World’s Fastest DBA” video from the event here.

 

Aaron, me, and Kevin at Charlotte Motor Speedway… I WANNA GO FAST!

This last shot isn’t a complete shot of the team but we had most of us when @SQLAsylum passed by us and we couldn’t hesitate to get a shot from Pat. Every year he shoots the event and gets some great (and sometimes very candid) shots. I’m looking forward to seeing his pics for this year! As always, THANKS Pat!! He blogged his goals for 2014 here.

SQLskillsTeamAtPassFromSQLAsylum

Part of the SQLskills crew – we’ll get a COMPLETE group shot this year!

Looking forward to seeing everyone… I hope that everyone has a fantastic SUMMIT 2014.

Cheers,
kt

Fall SQLintersection is coming up soon and we can’t wait!

In just a couple of weeks (9-14 November 2014), we’ll be at our twice-yearly conference, SQLintersection. We’re really looking forward to it as our conference is different than many others… our sessions are hand-picked and combined to create daily “tracks” with themes. Often, I’ll even ask a speaker if they could talk about x or y or z so that it fits in well with that day’s theme. And, I only ask experts in those areas to speak. But, not just experts – truly great speakers. We really want you to walk away from a session empowered. And, nothing is more empowering than truly understanding something. Our speakers are available. If you don’t get a question answered during a session then stay after. Or, come to one of our open Q&A sessions (these are great fun). The point – we’re there to get your questions answered so that you can go back to the office with immediate solutions; immediate ROI for your time away.

SQLintersection: Great Speakers!

I’m so excited about this lineup. All of these speakers are top-rated, SQL experts that have been around in this field for years (some can even boast decades) but are still focused on consulting. Every speaker is a SQL Server MVP (with the except of the Microsoft speakers – but, I don’t think anyone’s going to question their SQL knowledge :)) and some are Microsoft Certified Masters in SQL Server. But, no matter what – ALL are focused on providing the most ROI that’s possible in their session(s) and/or their workshops. Check out this list of speakers:

  • Aaron Bertrand, Sr. Consultant, SQL Sentry, Inc. [blog | twitter]
  • Andrew J. Kelly, Mentor, SolidQ [blog | twitter]
  • Bob Ward, Principal Architect Escalation Engineer, Microsoft [blog | twitter]
  • Brent Ozar, Brent Ozar Unlimited [blog | twitter]
  • David Pless, Dedicated Support Engineer (PFE), Microsoft
  • Glenn Berry, Principal Consultant, SQLskills.com [blog | twitter]
  • Jeremiah Peschka, Brent Ozar Unlimited [blog | twitter]
  • Jonathan Kehayias, Principal Consultant, SQLskills.com [blog | twitter]
  • Joseph Sack, Sack Consulting [website | blog]
  • Kendra Little, Managing Director, Brent Ozar Unlimited [blog | twitter]
  • Kevin Farlee, Storage Engine Program Manager, SQL Server, Microsoft
  • Kevin Kline, Director of Engineering Services, SQL Sentry, Inc. [blog | twitter]
  • Kimberly L. Tripp, President/Founder, SQLskills.com [blog | twitter]
  • Paul S. Randal, CEO / Owner, SQLskills.com [blog | twitter]
  • Paul White, SQL Kiwi Limited [blog | twitter]
  • Shep Sheppard, Senior Program Manager, Microsoft AzureCAT [blog | twitter]
  • Tim Chapman, Dedicated Support Engineer (PFE), Microsoft [blog | twitter]

SQLintersection: When is it all happening?

The show officially runs from Monday, November 10 through Thursday, November 13 but there are both pre-conference and post-conference workshops that extend the show over a total of up to 6 full days. For the full conference, you’ll want to be there from Sunday, November 9 through Friday, November 14.

  • Sunday, November 9 – pre, pre-con day. There are three workshops running:
    • SQL Server Architecture: Choosing Virtualization, Clustering, AlwaysOn, and More by Jeremiah Peschka, Kendra Little, and Brent Ozar
    • Core SQL Server Fundamentals by Andrew Kelly
    • Mastering SQL Server Execution Plan Analysis by Paul White
  • Monday, November 10 – pre-con day. There are three workshops running. And, the first keynote of the conference is Monday evening.
    • Performance Troubleshooting Using Waits and Latches by Paul S. Randal
    • High Performance, Scalable Asynchronous Processing Using Service Broker by Jonathan Kehayias
    • Very Large Tables: Optimizing Performance and Availability through Partitioning by Kimberly L. Tripp
  • Tuesday, November 11 through Thursday, April 13 is the main conference. Conference sessions will run all day in multiple tracks:
    • Download our full schedule here
    • Be sure to check out some of our cross-conference events such as our Women In Tech luncheon on Tuesday!
  • Friday, November 13 is our final full day running with a few post-con workshops. There are a few workshops running.
    • Queries Gone Wild 2: Statistics and Cardinality in Versions 2008, 2008R2, 2012, and 2014 by Kimberly L. Tripp and Joe Sack
    • Developer’s Guide to Tuning Somebody Else’s SQL Server by Jeremiah Peschka, Kendra Little, and Brent Ozar

SQLintersection: Why is it for you?

If you want practical information delivered by speakers that not-only know the technologies but are competent and consistently, highly-rated presenters – this is the show for you. You will understand the RIGHT features to troubleshoot and solve your performance and availability problems now!

We hope to see you there!
kt

PS – Use the discount code ‘SQLskills’ when you register and receive $50 off registration!