Women in Tech – a great post by Stacia Misner

This "comment" started out as one but I've decided to make it a post as I hope that a few people that may follow me but not Stacia – will check out her post on WIT.

The post to read is here: http://blog.datainspirations.com/2010/07/29/maybe-its-just-me-a-perspective-from-one-woman-in-it

First – Great post.

Second – Fantastic comments from folks in the SQL community (once again reminding me of why I like it so much)!

I do love the SQL community for so many reasons and this is DEFINITELY one of them. To be honest, I've never really felt singled out as a "WOMAN" in tech. I've always felt like just another person struggling to do the tech and to do it well. I've found that you must do things well to get ahead but I (luckily) haven't been in a situation where nothing but my gender stopped me from getting ahead or getting something done. To take that even further since I'm sometimes the only woman in the room where I'm teaching (big smiles) the thing I love the most is the willingness and interest for everyone to listen, contribute and grow – regardless of the fact that their instructor is a woman. I suspect that maybe I don't meet the people that don't want to hear tech from a woman as they go to someone else's sessions instead of mine but the good news is that I've often had very well attended sessions (so it truly can't be the majority of men that can't listen to a woman OR work with a woman OR deal with women in tech). And, this week my class was the norm – in terms of men/women ratios (maybe even a bit higher than normal). We had 31 students + 2 instructors – 6 students were women (Paul Randal and I were the instructors). That's only 20% and still fairly low (overall – but again, a bit higher than some) but I didn't feel like there were any issues with men/women in the class – everyone was happy to be there and sharing information (regardless of gender, ethnicity, etc.).

Having said all of that – I do think that there have been sterotypes over the years. I remember (*VERY* clearly) something that was said to me when I was in high school (I wish I didn't have to say that it was over 20 years ago but it was roughly 25 years ago when this occurred). My favorite math teacher (I was also his TA) introduced me to a mathematician that was a woman… She was definitely an inspiration and talked to me about different fields and directions in which to go – in Math. It was a nice dinner. I remember thinking that I might want to continue my studies in Math (which I did as a Math/CompSci major in college). But, what I remember clearly is her telling me that she didn't wear makeup or paint her nails because she wouldn't be respected in the field. I've never forgotten that but I've also never lived by that. And, unless it's me (and my bad memory ;-), I really don't think it has hurt me. Maybe it's SQL. Maybe I've been sheltered – but I guess I have to say that I'm happy it's been this way as well.

Thanks for (another!) great post Stacia and some really great comments.

Once again – I'm reminded of how much I love our community!

Cheers,
kt

PS – Check out a past post about Women in Technology and some of the resources that are out there: http://www.sqlskills.com/blogs/kimberly/post/Resources-A-panel-on-HA-and-a-second-on-Women-In-Tech.aspx. Also, if you're interested in helping your daughters get more involved in technology check out:

RunAs Radio interview posted

Last week we sat down with our good friends Richard Campbell and Greg Hughes of RunAs Radio and recorded an interview about how to learn effectively. It was incented by Paul's recent post as part of TSQL Tuesday: T-SQL Tuesday #008: Top ten mistakes to make when attending a class. Additionally, I've written some related posts on types of training here.

Here's the specific link to the show: Kim Tripp and Paul Randal Talk about Learning Effectively!

Enjoy!
kt

Just added – NEW SQL Immersion Event in Bellevue, WA – in August

It's official! We're doing another 5-day SQL Immersion Event (our ever-popular 5-day Deep Dive into Internals, Performance and Maintenance) this year in the US!

We decided to have a class near Seattle because Washington is *fabulous* in August!

The class will cover our best content in these areas:

  • On-disk structures: how the data is stored
  • Index internals: how the data is organized
  • Logging and recovery: how the data is protected
  • Choosing the RIGHT Data Type
  • Table & Index Partitioning Strategies
  • Data Access
  • Indexing Strategies
  • Data and log file provisioning and management
  • Tempdb
  • Index and statistics maintenance
  • Using backup and restore (plus internals)
  • Consistency checking and disaster recovery

The event will be held in the Marriott Courtyard Bellevue Downtown and will be fully catered – there's a special room rate of $159/night and even a hotel room discount for the first 10 people to register and stay at the event hotel.

The full cost is US $3,100.00 with an early-bird special of US $2,500.00 for registrations before midnight PST June 18th.

Check out the Immersion Event page for full details, registration, and feedback from attendees of our event in Boston earlier this year (or jump straight to registration).

We hope to see you there!
kt

A new and improved sp_helpindex (jokingly sp_helpindex8)

NOTE: I've released other versions of sp_helpindex since this post. Check out the category: sp_helpindex rewrites for all of the options!

OK, I first blogged about re-writing sp_helpindex here as sp_helpindex2 (April 2008). Shortly thereafter (Aug 2008), a reader found a bug and when I went digging, so did I (a couple of others). I did a few tweaks and the version that's stuck for the past couple of years is in this blog post: Updates (fixes) to sp_helpindex2.

However, as much as I've used that – I've always been frustrated by its output. And, so, over the months+, I wrote sp_helpindex3 (this solely added the index_id as the first column and then ordered the output; I've always hated the weird way in which sp_helpindex/2 ordered the output). Then, I started writing sp_helpindex4 which listed the clustering key columns with each index so that I could more easily identify columns on large tables without a lot of scrolling. Ultimately, I didn't like this because the clustering key (or some of the columns in a multi-column clustering key – might already be in the nonclustered index and so I still had to figure things out more manually). And, so I finally sat down and created what's jokingly being called sp_helpindex8.

Geeky note: Er, you might wonder with how many geeky SQL people I've joked about sp_helpindex8 but in fact, in my last few events (both onsite with a couple of customers as well as when teaching the Indexing Strategies and Performance Tuning portion of the Microsoft Certified MastersSQL Server course (two weeks ago)), I've actually joked – and promised – that I'd have this blogged before this particular 3 week MCM rotation ends. And, well, that's this week. So, I've actually joked with quite a few folks on this and I'm close to the end of time on blogging it. So, here I am – sitting in a hotel room in Houston, after having presented at the Houston Area User Group (7-9pm) (yes, with Paul) and after having been onsite all day with a customer – blogging this darn thing. I just have to get it out there. ;-)

Having said that – I'm pretty happy with this version but it was pretty painful to write. And, before I get into what it does – and why I wanted this version – let me explain what I didn't do and why. There are a lot of other things that could be added. And, quite a few folks have said – why don't you do this or why don't you do that… Specific suggestions have been made such as adding index usage stats (from the current state shown by the DMV) or add index operational stats (to show if an index is suffering from latch IO waits, for example) or add index physical stats (to show if an index is fragmented), etc. And while many of these suggestions would make for lots of interesting uses of this output, I've stayed away from these. Why? Some are expensive to run – such as physical stats – even a limited scan can be painful on a really large table. Or, the results might not be able to be truly trusted  – such as with index_operational_stats. Why? Because the information from that DMV might only cover a short period of time. The information help by them is cleared whenever SQL Server is restarted (or if a database is taken offline or restored, etc.). So, when were those usage or operational stats last cleared? Moreover, is this single "look" at the DMV information really the best on which to base our decisions (even just in general).

In the end, I think that physical stats, operational stats and even usage stats – should be used wisely and carefully (definitely used but not by this). In fact, we often recommend that the best way to use these is by studying them over time and then analyzing their trends not just the point in time values. And, there are LOTS of posts that can help you with understanding this information; here's just a couple to get you started:

So, having said that – I've kept my sp_helpindex8 more pure ;-). What I want to see is EXACTLY what's stored in that index. In fact, this has always been my main goal. The first version (sp_helpindex2) added INCLUDED columns (2005+) as well as filter definitions (2008) to the output. So, what was missing? The added columns that SQL Server might have to add if a nonclustered is nonunique and/or when the clustering key is nonunique. The reason this is so important to me is that there are many things coming together and recommending indexes (DTA, the missing index DMVs, application tools, etc.) and while many of these tools are solely there to "help" us get better performance and "help" us create more effective indexing strategies – these tools sometimes recommend the same index over and over again (see Paul's recent blog post: Missing index DMVs bug that could cost your sanity… about the bug in the missing index DMVs) OR the tools might recommend similar indexes to what we already have.

Additionally, the output from sp_helpindex/2 only shows the columns that are EXPLICITLY defined in the key and in the leaf (using INCLUDE) but they don't show what SQL adds for you. Most folks know that SQL Server adds the clustering key to the nonclustered indexes and when you have a simple single-column clustering key it's not difficult to know that this IS "there" in your index even though it doesn't show up. But, even with that, I think there are still a few things that are lesser known. For example, if your nonclustered index is nonunique – did you know that the clustering key is not only added to the leaf level of the index (like INCLUDE does) but that the clustering key is added into the tree (as part of the key and ordering?). And, if have one index that lists this clustering key explicitly and another that doesn't – are you actually going to be able to easily tell that these are in fact the SAME. Ultimately, I want you to be able to better recognize what's REALLY in that index. What you might end up seeing is that two indexes are more similar than you first thought and as a result – maybe you can drop one of them. Or, you might be able to combine them to reduce your overall indexes (I call this index consolidation). And, all of this becomes significantly more challenging when you have more complex clustering keys (and, yes, they do exist!). The other thing I did is delimit the columns and index names properly using brackets. This is also nice because you might have identifiers that make the output of sp_helpindex/2 hard to read. This should help. However, I do have to admit that you probably have other problems if your index/column names have commas, spaces or brackets in them. But, this should still help/work.

And, that's why I wrote "sp_helpindex8" it's a MUCH better version of sp_helpindex but it requires a few other components. And, it was a lot more challenging to write. Basically, this information is more internal to SQL Server and not directly exposed anywhere. So, we had to do a bit of digging. Paul started the process (thanks!) but, I ended up spending about 5 hours getting it exactly the way I wanted it. And, I also ended up breaking it up into two parts: one sp that gets column definitions and the other that builds the final output. And, I've actually named it with a longer name (sorry!). But, I rarely type in the name anyway. What I typically do is go to SSMS, Tools | Options, then under Environment | Keyboard, I set the keyboard shortcut for Ctrl+F1 to my sp_SQLskills_SQL2008_helpindex.

ssms customkeyboardshortcuts A new and improved sp helpindex (jokingly sp helpindex8)

Now, when I want to see the true defintions of indexes on a table – I highlight the tablename and hit Ctrl+F1. But, if you want to rename this to sp_helpindex8 – I won't be offended. ;-)

So, how do you create this?

  1. You need to create sp_SQLskills_ExposeColsInIndexLevels (sp_SQLskills_ExposeColsInIndexLevels.sql (6.73 kb)). This is what gives us the tree/leaf definitions. And, this works for both SQL Server 2005 and SQL Server 2008.
  2. You need to create the version specific sp_helpindex8. These are named: sp_SQLskills_SQL2005_helpindex (sp_SQLskills_SQL2005_helpindex.sql (10.50 kb)) and  sp_SQLskills_SQL2008_helpindex (sp_SQLskills_SQL2008_helpindex.sql (10.73 kb)). Again, I know they're long names but using the keyboard shortcuts can help. And, if you do rename these make sure you do a replace ALL. There are multiple places where there are comments and/or statements that need to execute to get everything to work (for example the sp_MS_marksystemobject execution).

NOTE: The very first version of the 2005 script had a bug in it (thanks Calvin!). Funnily one I already knew about and thought I had caught (it was fixed in the 2008 version). Anyway, as of 3:15pm CT on May 19, it's fixed in the scripts here. There could be others but so far – it's looking good!

Enjoy!
kt

Determining the position of search arguments in a join

As a follow-on discussion to my SQL Server Magazine Q&A titled “Determining the position of search arguments in a join” where I was responding to a reader question about whether or not search arguments could be moved up into the FROM clause instead of placed in the WHERE clause and whether or not it would improve performance. While I was so focused on creating a scenario to highlight the problems when doing this, I missed a great point about what happened in my specific example. Fellow SQL Server MVP, Phil Brammer (blog | twitter) brought up this point first and it sparked a very passionate discussion on the SQL Server MVP forum. As a result, I thought I’d take a few minutes to create a few more examples and state a few key points. The most important of which is that search arguments should not be moved from the WHERE clause into the FROM clause unless you really know what you’re doing – you may completely change the meaning of the query… Let me explain.

Details:

In the prior Q&A, Listing 1 and Listing 2 show INNER JOINs that describe the Customers along with some of their Sales information (specifically: OrderDate, AccountNumber, FirstName, LastName, and StateProvinceCode). In the joins, we asked for only Customers with a RegionCode of US. Regardless of having the search argument in the WHERE clause or the FROM clause, SQL Server executes the same plan, has the same performance and returns the same data. While people often state that a WHERE clause applies to the entire set (and also imply order), that’s NOT the case in terms of how the join is processed. SQL Server knows that the search argument applies to the entire set but it can certainly apply the filter prior to the join if it’s selective enough and would help improve performance. Moving the search argument up into the FROM clause is not a “tip” for improving performance. However, if all of the joins are inner joins, then it really doesn’t matter. You can certainly move the search arguments up; they won’t change the result set and it’s very unlikely that they’ll change the query plan. Listing 1 and Listing 2 were both executed at the same time to produce these two plans:

queryplans Determining the position of search arguments in a join

First, notice that they are 50% each. Also, without worrying about the plan itself, notice that each plan has the same pattern, the same percentages, and the same join types. These are in fact, the same plan – Listing 1 has the search argument in the WHERE clause and Listing 2 has the search argument in the FROM clause.

To illustrate how moving a search argument could affect results, I created OUTER joins, added a somewhat random search argument and then did the same thing – Listing 3 was written as an outer join and Listing 4 was a copy of that query with the search argument moved up into the FROM clause. While my examples (solely looking at the results between Listing 3 and Listing 4) prove the point that a search argument cannot be moved from the WHERE clause into the FROM clause without understanding the effects, my example also introduced another phenomena. It was this phenomena that sparked debate. When a search argument exists on an inner table of an outer join, the purpose of the outer join is effectively lost. In fact, the query is not likely to even work the way you expect. To illustrate this, I’ve created a [quick] Venn diagram of what’s logically happening:

 

Venn%20diagrams%20 %201024x768 Determining the position of search arguments in a join

In fact, the two following queries will produce identical results:

– The original Listing 3 query
SELECT so.OrderDate, c.AccountNumber, p.FirstName
     , p.LastName, sp.StateProvinceCode
FROM Sales.Customer AS c
     LEFT OUTER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
     LEFT OUTER JOIN Person.Person AS p
          ON c.PersonID = p.BusinessEntityID
     LEFT OUTER JOIN Person.Address AS pa
          ON so.ShipToAddressID = pa.AddressID
     LEFT OUTER JOIN Person.StateProvince AS sp
          ON sp.StateProvinceID = pa.StateProvinceID
WHERE sp.CountryRegionCode = 'US'
ORDER BY sp.StateProvinceCode, so.OrderDate 

– Listing 3 is really an inner join
SELECT so.OrderDate, c.AccountNumber, p.FirstName
     , p.LastName, sp.StateProvinceCode
FROM Sales.Customer AS c
     INNER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
     INNER JOIN Person.Person AS p
          ON c.PersonID = p.BusinessEntityID
     INNER JOIN Person.Address AS pa
          ON so.ShipToAddressID = pa.AddressID
     INNER JOIN Person.StateProvince AS sp
          ON sp.StateProvinceID = pa.StateProvinceID
WHERE sp.CountryRegionCode = 'US'
ORDER BY sp.StateProvinceCode, so.OrderDate 

A better example – which preserved the purpose of the outer join – would be to place a search argument on the table which is our outer table in the query. Before I show the query itself, let’s get some background on the data here.

In the Sales.Customer table there are 3,520 Customers in TerritoryID 1 (the northwestern US):

SELECT COUNT(*)
FROM Sales.Customer AS c
WHERE c.TerritoryID = 1 – 3520 customers 

These Customers represent a total number of Sales of 4,594 orders:

SELECT COUNT(*)
FROM Sales.Customer AS c
     JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
WHERE c.TerritoryID = 1  – 4594 orders 

There are 92 customers that have never placed an order:

SELECT COUNT(*)
FROM Sales.Customer AS c
WHERE NOT EXISTS (SELECT *
                  FROM Sales.SalesOrderHeader AS so
                  WHERE so.CustomerID = c.CustomerID)
AND c.TerritoryID = 1 – 92 customers have not placed orders 

If we want to see ALL Customers in TerritoryID 1 – regardless of whether or not they have placed an order and with their order information if they have, we can use an outer join to do this. For those customers who have not placed an order, NULLs will be produced:

SELECT so.OrderDate, c.AccountNumber, p.FirstName
     , p.LastName, sp.StateProvinceCode
FROM Sales.Customer AS c
     LEFT OUTER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
     LEFT OUTER JOIN Person.Person AS p
          ON c.PersonID = p.BusinessEntityID
     LEFT OUTER JOIN Person.Address AS pa
          ON so.ShipToAddressID = pa.AddressID
     LEFT OUTER JOIN Person.StateProvince AS sp
          ON sp.StateProvinceID = pa.StateProvinceID
WHERE C.TerritoryID = 1 – 4686 rows
ORDER BY sp.StateProvinceCode, so.OrderDate 

This is different from an inner join because the INNER join will return only those customers that have a matching row in each and every joined table. In this case, all 3,520 customers have a matching row in Person.Person, Person.Address and Person.StateProvince. As a result, the inner join produces all 3,520 customers with their sales for 4,594 total rows. If you notice, the outer join above returned 4686 rows (4686 – 4594 = 92).

SELECT so.OrderDate, c.AccountNumber, p.FirstName
     , p.LastName, sp.StateProvinceCode
FROM Sales.Customer AS c
     INNER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
     INNER JOIN Person.Person AS p
          ON c.PersonID = p.BusinessEntityID
     INNER JOIN Person.Address AS pa
          ON so.ShipToAddressID = pa.AddressID
     INNER JOIN Person.StateProvince AS sp
          ON sp.StateProvinceID = pa.StateProvinceID
WHERE C.TerritoryID = 1 – 4594
ORDER BY sp.StateProvinceCode, so.OrderDate 

Using this query as our example, let’s go back to the original discussion about whether or not moving the search argument can improve performance… In the query below, I *solely* move the search argument (c.TerritoryID = 1) up into the FROM clause for the following query:

SELECT so.OrderDate, c.AccountNumber, p.FirstName
     , p.LastName, sp.StateProvinceCode
FROM Sales.Customer AS c
     LEFT OUTER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
              AND C.TerritoryID = 1 – 20986 rows
     LEFT OUTER JOIN Person.Person AS p
          ON c.PersonID = p.BusinessEntityID
     LEFT OUTER JOIN Person.Address AS pa
          ON so.ShipToAddressID = pa.AddressID
     LEFT OUTER JOIN Person.StateProvince AS sp
          ON sp.StateProvinceID = pa.StateProvinceID
ORDER BY sp.StateProvinceCode, so.OrderDate 

And it returns a drastically different result set of 20,986 rows. What’s happened here?

In this case, we’ve preserved ALL customer rows and then found the matching rows for ONLY those customers where TerritoryID = 1. To show you the numbers, it breaks down like this:

There are 19,820 customers – of which, 3,520 are in TerritoryID 1. Of these 3,520 customers in TerritoryID 1, all but 92 have placed orders – for a total of 4,594 orders. Bringing everything together:

19,820 customers – 3,520 in TerritoryID 1 = 16,300 rows

.4,594 orders + 92 customers that haven’t placed an order (remember – all customers are preserved in the outer join) = 4,686 rows

16,300 + 4,686 = 20,986 rows

Bringing this all together and to quote fellow MVP Hugo Kornelis (blog) in the discussion: "For OUTER joins, moving predicates does change the results. This makes performance comparisons moot. If a query gives incorrect results, I don't care if it's faster or slower than one that does yield the desired results."

This is absolutely the most important point. You cannot just move a search argument “up” into the FROM without repercussions. In general, I recommend search arguments to be in the WHERE clause and join conditions in the FROM clause.

But, to make it even better, I thought I’d end with an even more simplified example of the sets described above:

SELECT c.CustomerID, so.SalesOrderNumber, so.OrderDate
FROM Sales.Customer AS c
     INNER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
WHERE c.TerritoryID = 1 – Northwest US, 4594 rows
go 

– in this simple inner join
SELECT c.CustomerID, so.SalesOrderNumber, so.OrderDate
FROM Sales.Customer AS c
     INNER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
              AND c.TerritoryID = 1 – Northwest US, 4594 rows
go

– What if we want ALL Customers in the Northwest of the US,
– regardless of whether or not they placed an order:
SELECT c.CustomerID, so.SalesOrderNumber, so.OrderDate
FROM Sales.Customer AS c
     LEFT OUTER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
WHERE c.TerritoryID = 1 – Northwest US, 4594 rows
go – Northwest US, 4686 rows 

– So, now what if we thought that the performance could be
– better if we pushed this up into the FROM clause…
SELECT c.CustomerID, so.SalesOrderNumber, so.OrderDate
FROM Sales.Customer AS c
     LEFT OUTER JOIN Sales.SalesOrderHeader AS so
          ON so.CustomerID = c.CustomerID
              AND c.TerritoryID = 1 – 20986 rows

– Absolutely ALL customers, regardless of whether or not they
– placed an order. For the Northwest territory we will see their
– order information but for all other customers (even those that
– have placed an order), we will NOT see their order information.
go 

What's probably most important is for you to thoroughly review the output of the last result set (of 20,986 rows). While I've explained the number – do you understand the results? Out of the 3,520 customers in TerritoryID 1 – 3,428 of them create 4,594 sales. For ONLY these sales, we see the SalesOrderNumber and SalesOrderDate. For all other customers – even those that have placed orders – we see NULLs. Why? Because moving this search argument up into the FROM has defined the criteria for the INNER join *BEFORE* the outer join is performed. This is why we see NULLs for those that have placed orders. If you order the results by TerritoryID this may help you better visualize what's happening!

Thanks for reading and thanks for the comments and discussions… I love the SQL Community for its never ending passion!

Cheers,
Kimberly

Upcoming Events for May, June, July and August – including a new Immersion Event!

OK, it seems like when it rains it pours! I still have a few blog posts on my ToDo list and I do hope to get to them over the next few days (one that I've promissed to the SQL MCM folks in this month's rotation as well so now there's even more pressure!).

Anyway, in the midst of working on some really fun projects (mostly architecture/partitioning/tuning), there are a few upcoming public events. Here's a quick update so you can find all of the details: 

Houston, Texas (the Woodlands) – May 18, 2010

London, England – June 17, 2010

Dublin, Ireland – June 28th to July 8th, 2010

Bellevue, Washington, August 9th to August 13th, 2010

  • SQL Immersion: 5 day workshop with the same format as our popular Boston Immersion Event that was presented in March.
  • Registration details coming soon and event details are here.
  • EARLYBIRD discount offer expires on June 18th. 
  • NOTE: This will be our only public Immersion Event in the 2nd half of 2010.

We hope to see you at one of these events and feel free to email one of us (kimberlyATsqlskillsDOTcom or paulATsqlskillsDOTcom) if you have any questions!

Thanks,
Kimberly

The MOSS Show Sharepoint Podcast – Episode 21 is live!

Paul and I chatted with Hilton Giesenow about SQL Server, Sharepoint and Involuntary DBAs.  Paul blogged a bit more details about it in the post titled: Paul and Kimberly interview from MVP Summit. And, you can get the actual interview here:

Episode 21 – Diving Into SQL Server With Paul Randal & Kim Tripp

Enjoy!
kt

Clearing the cache – are there other options?

OK, I've had a bit of a tangent going on optimizing stored procedures and its all revolved around adhoc plan caching and potentially wasted cache due to the single-use plans created (see the most recent posts titled:Plan cache and optimizing for adhoc workloads and Plan cache, adhoc workloads and clearing the single-use plan cache bloat). To be more specific, these are statements that when generated (via adhoc or prepared statements) don't end up being parameterized (in general – only relatively simple statements can be effectively parameterized and when your statements complex, SQL Server won't parameterize them – this can be good and in some systems this can be terribly bad). But, now you know how bad… you know how much of your cache you're wasting but where is it going and are there any similarities to these statements? Should you (or can you?) bring them together to see if there is a commonality that can be leveraged?

In general, the answer is "it depends" (you knew that was coming!). However, I do have good news, you can definitely gauge what's actually going on and how many of the statements are in fact, similar. This might allow you to leverage another database option (forced parameterization) and/or determine that you really NEED stored procedures. However, that might end up being more complex as it will require rewriting the app (or at least some of the app). Again, the good news here is that you'll be able to tell which part of the app.

Take for example the following:

DBCC FREEPROCCACHE
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Falls';
go

SELECT cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, st.text –, *
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared')
        AND st.text LIKE '%from dbo.member%' 
        AND st.text NOT LIKE '%SELECT cp.objecttype%'
ORDER BY cp.objtype
go

objtype       cacheobjtype          size_in_bytes   refcounts  usecounts     text
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Falls'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripps'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripper'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripped'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripp'; 

The first thing to notice is that each of these queries has a *different* statement in the adhoc plan cache. The second is to recognize that each of these statements take 24K in the plan cache. If we were to change to "optimize for ad hoc workloads" then things might be a bit better:

sp_configure 'optimize for ad hoc workloads', 1
go
reconfigure
go
DBCC FREEPROCCACHE
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';
go
SELECT * FROM dbo.Member WHERE Lastname = 'Falls';
go

SELECT cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, st.text –, *
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared')
        AND st.text LIKE '%from dbo.member%' 
        AND st.text NOT LIKE '%SELECT cp.objecttype%'
ORDER BY cp.objtype
go

objtype       cacheobjtype         size_in_bytes   refcounts  usecounts     text
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Falls'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripps'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripper'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripped'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripp'; 

But, this is not the only issue. Even though each statement will only be in the cache once and will only take 320 bytes (instead of 24K) the cumulative effect of this might be significant. So, how can you see this? If you try to aggregate over these statements then you'll only be able to use the first 40-50 characters while that works here it definitely won't work for more complex statements. Until – SQL Server 2008. In 2008, you can use the query_hash to find similar statements and their overall affect on the query cache.

SELECT qs2.query_hash AS [Query Hash]
        , SUM(qs2.size_in_bytes) AS [Total Cache Size]
        , SUM(qs2.total_worker_time)/SUM(qs2.execution_count) AS [Avg CPU Time]
        , SUM(qs2.total_elapsed_time)/SUM(qs2.execution_count) AS [Avg Duration]
        , COUNT(*) AS [Number of plans] 
        , MIN(qs2.statement_text) AS [Statement Text]
FROM (SELECT qs.*, cp.size_in_bytes, SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1, ((CASE statement_end_offset WHEN -1 THEN DATALENGTH(st.text
                ELSE QS.statement_end_offset END - QS.statement_start_offset)/2) + 1) AS statement_text 
                FROM sys.dm_exec_cached_plans AS cp 
                        JOIN sys.dm_exec_query_stats AS qs ON cp.plan_handle = qs.plan_handle
                CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as qs2
GROUP BY qs2.query_hash ORDER BY [Avg Duration] DESC;

This is incredibly important because this will give you the information to determine what SIMILAR query (or queries) are consuming the cache and which ones have the highest cumulative effect. And, if you see a statement that is consistent and can be optimized using indexes (and consistently using the same indexes) what you might try first is "forced parameterization." This is a DATABASE-level option (available in 2005+) that significantly improves but still doesn't guarantee that a statement will be parameterized. See the Books Online for Forced Parameterization (SQL Server 2008 Forced Parameterization and SQL Server 2005 Forced Parameterization) for the rules on when SQL Server parameterizes. However, there are many statements that will. If you have A LOT of plan cache bloat AND you think your plans might benefit from forced parameterization – this is definitely something you can try.

Again, the good news here is that the QP will try harder to parameterize but there are still many cases where it won't. In this case, SQL Server parameterizes this statement (it becomes a Prepared statement) as:

(@0 varchar(8000))select * from dbo . Member where Lastname = @0

You can see this through the following query/results:

SELECT st.text, cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, qp.query_plan –, *
FROM sys.dm_exec_cached_plans AS cp
        CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
        CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp
WHERE cp.objtype IN ('Adhoc', 'Prepared')
        AND st.text LIKE '%member%'
ORDER BY cp.objtype
go

text objtype cacheobjtype size_in_bytes refcounts usecounts query_plan
SELECT * FROM dbo.Member WHERE Lastname = 'Falls';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Falls';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
(@0 varchar(8000))select * from dbo . Member where Lastname = @0 Prepared Compiled Plan 40960 7 10 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" StatementId="1" StatementCompId="1" StatementType="SELECT" StatementSubTreeCost="0.00657038" StatementEstRows="1" StatementOptmLevel="FULL" QueryHash="0xDEF0805C7B74A31E" QueryPlanHash="0x785C8E320D853B97" StatementOptmEarlyAbortReason="GoodEnoughPlanFound"><StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false" /><QueryPlan CachedPlanSize="24" CompileTime="1" CompileCPU="1" CompileMemory="136"><RelOp NodeId="0" PhysicalOp="Nested Loops" LogicalOp="Inner Join" EstimateRows="1" EstimateIO="0" EstimateCPU="4.18e-006" AvgRowSize="173" EstimatedTotalSubtreeCost="0.00657038" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></OutputList><NestedLoops Optimized="0"><OuterReferences><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></OuterReferences><RelOp NodeId="1" PhysicalOp="Index Seek" LogicalOp="Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="20" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="10000" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></OutputList><IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" NoExpandHint="0"><DefinedValues><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></DefinedValue></DefinedValues><Object Database="[credit]" Schema="[dbo]" Table="[member]" Index="[IX_Member_Lastname]" IndexKind="NonClustered" /><SeekPredicates><SeekPredicateNew><SeekKeys><Prefix ScanType="EQ"><RangeColumns><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></RangeColumns><RangeExpressions><ScalarOperator ScalarString="[@0]"><Identifier><ColumnReference Column="@0" /></Identifier></ScalarOperator></RangeExpressions></Prefix></SeekKeys></SeekPredicateNew></SeekPredicates></IndexScan></RelOp><RelOp NodeId="3" PhysicalOp="Clustered Index Seek" LogicalOp="Clustered Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="161" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="10000" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></OutputList><IndexScan Lookup="1" Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" NoExpandHint="0"><DefinedValues><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></DefinedValue></DefinedValues><Object Database="[credit]" Schema="[dbo]" Table="[member]" Index="[member_ident]" TableReferenceId="-1" IndexKind="Clustered" /><SeekPredicates><SeekPredicateNew><SeekKeys><Prefix ScanType="EQ"><RangeColumns><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></RangeColumns><RangeExpressions><ScalarOperator ScalarString="[credit].[dbo].[member].[member_no]"><Identifier><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></Identifier></ScalarOperator></RangeExpressions></Prefix></SeekKeys></SeekPredicateNew></SeekPredicates></IndexScan></RelOp></NestedLoops></RelOp><ParameterList><ColumnReference Column="@0" ParameterCompiledValue="'Tripp'" /></ParameterList></QueryPlan></StmtSimple></Statements></Batch></BatchSequence></ShowPlanXML>

Ideally, you'll end up with less wasted cache because all adhoc statements from here – will use the Prepared version of the statement. This also means that they'll run with a compiled plan (so you'll also save compile time) and – if the plans are consistent then all of this is good. However, if other values (for example 'Smith') require a completely different plan – then this could actually be worse (this is the "It Depends" part). But, if you still have statements that do have consistent execution plans (because you know the data and/or you know exactly how this particular data is being used) then EITHER sp_executesql or a stored procedure would effectively force the parameterization as well and it would say to me that you really understand your data/application. But, if you're wrong… anytime you force the parameterize (and increase re-use of a plan – you could end up forcing the re-use of a bad (and possibly REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY bad plan) :)).

So, while you solve one problem – you could create another one (see the blog post titled: Using the OPTION (RECOMPILE) option for a statement for more information). If the plans are NOT consistent then forcing a plan (through either forced parameterization, sp_executesql OR a regular stored procedure) can result in poor performance. When writing the stored procedure (and through good testing processes), you'll be able to create better and more effective code. This is truly my preferred method. And, I'm just about to wrap up my series with roughly 3 more posts (at least that's what I currently have planned but you guys have certainly been steering me in this direction for a bit :-)).

Thanks for reading,
kt

Plan cache, adhoc workloads and clearing the single-use plan cache bloat

In my post describing the new SQL Server 2008 configuration option [optimize for adhoc workloads] titled: Plan cache and optimizing for adhoc workloads, I asked for you to send me your plan cache numbers. In addition to the comments on the post, I received a lot of feedback in email (and lots of comments – and even a related blog posts). The general state of the cache for many was that only 10s of MB were wasted, for a couple it was 100s MB and for a few it was GBs of memory – completely wasted. I've taken some of the worst numbers and compiled them below:

 CacheType Total Plans  Total MBs  Avg Use Count   Total MBs – USE Count 1   Total Plans – USE Count 1  % of Total
Prepared

1,541

7,925.85

1,245

7,858.43

1,448

99.15

Adhoc

86,624

8,592.48

5

7,332.59

76,145

85.34

Prepared

148,527

7,428.82

444

5,074.25

94,851

68.30

Adhoc

63,471

4,565.00

1,328

4,161.54

50,737

91.16

Adhoc

1,358

2,704.27

3,676

2,673.25

425

98.85

Adhoc                

48,140

6,233.64

173

2,106.05

20,493

33.79

Prepared

18,639

2,590.82

127

2,027.72

13,614

78.27

Prepared

18,010

3,237.18

354

1,944.48

9,561

60.07

Adhoc

17,392

2,417.38

100

1,787.33

7,741

73.94

Prepared

6,276

1,875.71

59,370

1,560.66

4,020

83.20

Prepared

16,832

2,844.61

10,829

1,507.63

5,732

53.00

Prepared

3,075

1,681.56

46,262

1,427.33

604

84.88

Adhoc

29,047

1,923.84

39

1,362.09

22,827

70.80

Adhoc 

17,028

1,255.60

185

1,160.85

15,845

92.45

Adhoc

118,838

2,073.25

2,813

1,128.47

41,212

54.43

Adhoc                

13,895

1,300.72

11

875.45

9,351

67.31

Prepared 

8,266

476.90

31

462.82

8,031

97.05

Adhoc 

8,865

507.63

6

293.98

4,738

57.91

Adhoc 

10,066

401.54

9

282.87

5,717

70.45

Adhoc 

18,676

506.66

12,463

260.48

7,938

51.41

Adhoc 

2,113

223.62

142

207.36

2,024

92.73

Adhoc 

10,107

240.27

76

157.51

6,073

65.55

Prepared            

1,888

154.91

4

125.08

1,438

80.74

Prepared            

1,561

195.27

14,761

123.70

1,089

63.35

Prepared 

742

125.30

14,517

109.39

593

87.30

Prepared 

22,566

838.79

357

99.87

505

11.91

Adhoc   

478

109.01

2

92.62

423

84.96

Adhoc 

2,917

173.43

26,651

82.73

1,313

47.70

Adhoc 

2,906

122.81

24

77.08

1,911

62.76

Prepared 

609

98.06

106

75.59

360

77.09

Adhoc   

638

75.74

24

52.04

414

68.71

Prepared             

322

23.12

8

17.82

275

77.09

First, some of these do have the "optimize for adhoc workloads" option set. However, not too many. And, for those that do have this set, the single-use plan cache was for the amount of space taken by the compile plan stubs – not actual execution plans. Typically, these are significantly smaller (typically only a matter of bytes) rather than 10s-100s of kilobytes. So, while the amount of cache wasted is reduced to only a fraction of what it would have been without the option set, there's still the potential for a lot of cache wasted.

Another thing to notice is that the only interesting nmumbers came from these two types of cache: Adhoc and Prepared. This is to be expected – the numbers for single-use plan cache are worst for the scenarios where statements are auto parameterized and "prepared" but not often re-used (where the statements submitted vary wildly because of different client applications/executions). But, while it is expected, what can you do about this? Unfortunately, not much – without changing the client application interface to the data (meaning – use stored procedures).

However, what you might benefit from (even if you have the option set) is periodically clearing the cache. However, you don't necessarily need to clear all of the cache – you can just clear the Adhoc plan cache.

To clear just the "SQL Plans" from the plan cache, use:

DBCC FREESYSTEMCACHE('SQL Plans')

If you want to clear all of the cache, you can use:

DBCC FREEPROCCACHE

And, in SQL Server 2005/2008 there are a host of combinations that will allow you to clear as much or as little as you'd like – down to just a single plan (using the plan_handle) in SQL Server 2008. For more information, check out the BOL topic: DBCC FREEPROCCACHE.

But, if you want to be more clever about it, you might want to automate the process of clearing these 'SQL Plans' when they're wasting more than a certain percentage of the memory that SQL Server's using OR more simply, once the single-use plan cache reaches a certain size (depending on the amount of memory that you have you might choose something as low as 500MB or possibly as high as 2-3GB). To help you out, I decided to write the queries to do this and give you a few options from which to choose. And, it's definitely easier to write more interesting code in 2008 because of some of the new memory-related DMVs (specifically: sys.dm_os_sys_memory and sys.dm_os_process_memory). However, you can still do this fairly easily in both editions.

So, let me summarize:
1) If you're running SQL Server 2008 and you have cache being wasted by single-use plans, be sure to use the new "optimize for adhoc workloads."
2) If you still find that you're wasting 100s of MB or GB of cache, consider creating a job that programmatically checks cache and then clears the 'SQL Plans' from cache based on one of the options/code below.

  1.  
    1. This is probably the easiest and most simplfied option.
    2. This is probably overkill for most folks but probably won't hurt either, especially if you have a lot of other types of cache being wasted by single-use plans.
    3. This is probably the most interesting and uses some of the new DMVs to see how much of the ACTUAL working cache is going to single-use plans. However, this is only for 2008. Could I write it for 2005, yes, but it's not really worth it. I'd just go with option 1 (or possibly 2) for 2005.

1. Clearing *JUST* the 'SQL Plans' based on *just* the amount of Adhoc/Prepared single-use plans (2005/2008):

DECLARE @MB decimal(19,3)
        , @Count bigint
        , @StrMB nvarchar(20)

SELECT @MB = sum(cast((CASE WHEN usecounts = 1 AND objtype IN ('Adhoc', 'Prepared') THEN size_in_bytes ELSE 0 END) as decimal(12,2)))/1024/1024 
        , @Count = sum(CASE WHEN usecounts = 1 AND objtype IN ('Adhoc', 'Prepared') THEN 1 ELSE 0 END)
        , @StrMB = convert(nvarchar(20), @MB)
FROM sys.dm_exec_cached_plans

IF @MB > 10
        BEGIN
                DBCC FREESYSTEMCACHE('SQL Plans'
                RAISERROR ('%s MB was allocated to single-use plan cache. Single-use plans have been cleared.', 10, 1, @StrMB)
        END
ELSE
        BEGIN
                RAISERROR ('Only %s MB is allocated to single-use plan cache – no need to clear cache now.', 10, 1, @StrMB)
                – Note: this is only a warning message and not an actual error.
        END
go

2. Clearing *ALL* of your cache based on the total amount of wasted by single-use plans (2005/2008):

DECLARE @MB decimal(19,3)
        , @Count bigint
        , @StrMB nvarchar(20)

SELECT @MB = sum(cast((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(12,2)))/1024/1024 
        , @Count = sum(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END)
        , @StrMB = convert(nvarchar(20), @MB)
FROM sys.dm_exec_cached_plans

IF @MB > 1000
        DBCC FREEPROCCACHE
ELSE
        RAISERROR ('Only %s MB is allocated to single-use plan cache – no need to clear cache now.', 10, 1, @StrMB)
go

3. Stored Procedure to report/track + logic to go into a job based on percentage OR MB of wasted cache (2008 only):

This I put into a script that you can play with here (sp_SQLskills_CheckPlanCache.sql (3.67 kb)).

Enjoy!
kt

 

Public classes coming up in Dublin (including prior course testimonials)

Paul and I enjoyed Dublin so much in 2008 and 2009 that we're going back for two weeks this time – and teaching three classes while we're there! We'll be working with our fabulous friends (Carmel, Sandra, and Bob) at Prodata and the event will be hosted by the nice folks at Microsoft Ireland. We'll also be doing a user group presentation, as usual, and you can register for this on the MTUG website. Not sure what we'll talk about at the user group yet (probably SQL Server related :-) – suggestions?

If you register for any of the classes, be sure to use this special discount code "KTB" which gets you 5% off and we'll know you saw the class on my blog.

We decided to structure the first class as a 4-day version of our popular Immersion Event and the following week do two deeper classes focusing on performance and disaster recovery. And, even with as much content as we've recorded and mentioned online, our "live" Immersion Events are more in-depth and interactive. You'll get a lot of information in a relatively short period of time and you'll be able to go back and immediately apply it. We base our content on what really works in production and what we've learned from actual implementations/architectures.

We've taught several classes this year already and had rave reviews of the content, our knowledge, and presentation style. Check them out for yourself (with quotes extracted from their blog posts):

  • Greg Gonzalez, President and CEO of SQL Sentry wrote a long blog post about the Immersion Event he attended in Boston in April:
    • "When it comes to the instructors themselves, Kimberly and Paul simply have no equal.  Not only are they both ultimate authorities, but they have endless enthusiasm about the material, and spot on delivery.  If either ever got tired they never showed it, even after going all day and all week.  We witnessed countless demos over the course of the week, some extremely involved, multi-step processes, and I can’t recall a one that didn’t go the way it was supposed to."
    • "These are not beginner presenters, and they put an extreme amount of preparation and attention to detail into everything that they do.  Completely, utterly professional."
    • "You might think that with this extreme level of skill comes extreme levels of egotism and lack of patience.  Nothing could be further from the truth. … They simply know how to teach, and are approachable, humble, and patient."
  • Fellow-MVP Aaron Bertrand discussing the same Immersion Event:
    • "The additional insight around what's printed in the book or in Books Online is invaluable; and the experience Paul and Kimberly have had with real live customers yields a lot more information and things to watch out for than you'd ever get from the documentation alone."
  • Newly-minted MCM Robert Davis of Microsoft explaining how the previous training he'd received from us at Microsoft helped him pass his MCM certification (which we also teach):
    • "I try to take the SQLSkills classes when I can. I highly recommend to everyone that they take advantage of any training opportunities that they see from this group."
  • Newly-minted MCM Brent Ozar of Quest summing up the first week of the recent MCM class that we taught:
    • "To call them good trainers is an epic understatement.  They know how to deliver very, very technical material in ways that illustrate it well.  I had to stop Paul at one point and ask him how long it took to build a particular slide because the animations were so good at conveying a hard-to-describe process."

We also have a few testimonials and quotes from past customers and finally, here's the press release from last year's class. Phew!

4-Day Immersion Event, June 28-July 1 2010, Dublin, Ireland: See the Prodata site for full details and for registration links and options.

Day 1: SQL Server Internals

  • On-disk structures: how the data is stored
    • Records, pages, allocation bitmaps
  • Index internals: how the data is organized
  • Logging and recovery: how the data is protected
    • What is logging?
    • How recovery works
    • How the transaction log is structured
    • Recovery models 

Day 2: Designing for Performance

  • Choosing the RIGHT Data Type
    • Understanding data types
    • Character data, overflow and LOB
    • LOB considerations
    • Date and time data types
    • Heterogeneous data types
    • Sparse columns
    • Application inconsistencies in types
  • Table & Index Partitioning Strategies
    • Concepts/motivating factors in Partitioning
    • SQL Server 2000+ Partitioned Views
    • SQL Server 2005+ Partitioned Tables
    • Partitioning Design Techniques Combined
    • Partitioned and non-partitioned Indexes
    • Implementing the Sliding Window Scenario

Day 3: Indexing for Performance

  • Data Access
    • Data Access Patterns
    • Covering
    • Using INCLUDE (SQL Server 2005+)
    • Using Filters (SQL Server 2008+)
  • Indexing Strategies
    • Indexing for AND
    • Indexing for OR
    • Indexing for Joins
    • Indexing for Aggregates
    • Indexed Views (Overview)

Day 4: Essential Database Maintenance

  • Data and log file provisioning and management
    • Database layout considerations
    • Growing and shrinking
    • Instant initialization
  • Tempdb
  • Index and statistics maintenance
  • Using backup and restore
  • Consistency checking

Performance Optimization Masterclass, July 5-6 2010, Dublin, Ireland: See the Prodata site for full details and for registration links and options.

(This is *NOT* duplicate content to the 4-day class – it's deeper knowledge and requires the equivalent knowledge to the 4-day class to get the best out of it.)

Part 1: Optimizing Procedural Code

  • Module 1: Batches and Plan Cache
    • Statement execution and plan caching
    • sp_executesql and dynamic string execution
  • Module 2: Optmizing Procedural Code
  • Understanding stored procedures
    • Understanding recompilation
    • Creating an optimal plan
    • Forcing plans & plan guides

Part 2: Resource Monitoring and Troubleshooting

  • Module 1: Waits & Queues
    • What are waits and queues?
    • What should you be looking for?
  • Module 2: Locking / blocking
    • Locking overview
    • Examples of blocking scenarios
    • Troubleshooting blocking
    • Blocking avoidance strategies
  • Module 3: Resource governor
  • Module 4: Extended Events
  • Module 5: Performance Data Collection

Part 3: Are Your Index Strategies Working?

  • Module 1: index cleanup
    • Index usage and consolidation
  • Module 2: Index health
    • What is fragmentation
    • Detecting fragmentation
    • Removing fragmentation
  • Module 3: Missing indexes
    • DTA

Disaster Recovery Masterclass, July 7-8 2010, Dublin, Ireland: See the Prodata site for full details and for registration links and options.

(This is *NOT* duplicate content to the 4-day class – it's deeper knowledge and really requires the equivalent knowledge to the 4-day class to get the best out of it.)

Part I: Fundamentals

  • Module 1: Internals
    • Introduction to database structures
    • Introduction to logging and recovery
    • How recovery models affect disaster recovery
  • Module 2: Data Storage
    • Choosing a RAID level
    • Storage tuning for fast recovery

Part II: Strategy

  • Module 3: Planning a disaster recovery strategy
    • Requirements and limitations
    • Testing
  • Module 4: Architecting for recoverability
    • Partial database availability
    • Planning a backup strategy
    • Database snapshots
    • Utilizing high-availability technologies

Part III: Practicalities

  • Module 5: Recovering from a disaster with backups
    • Tail-of-the-log backups
    • Determining restore sequence
    • Point-in-time restore
    • Piecemeal restore
    • System databases
  • Module 6: Recovering from a disaster without backups
    • Interpreting DBCC CHECKDB output
    • Using database repair
    • Using EMERGENCY mode

We really hope to see you at one or more of these classes! And, if you're interested in attending two or more classes – special discounts are available. Additionally, we're also offering discounts to folks that attended last year's Dublin Immersion Event – so that they can continue where they left of and add on one or both of the 2-day classes offered in the second week. Feel free to shoot either of us an email [Kimberly OR Paul at SQLskills dot com] OR use one of the contact pages on the ProdData site. You can also find this event on LinkedIn here.

We look forward to seeing you there!
kt (and pr)