OK, I've had a bit of a tangent going on optimizing stored procedures and its all revolved around adhoc plan caching and potentially wasted cache due to the single-use plans created (see the most recent posts titled:Plan cache and optimizing for adhoc workloads and Plan cache, adhoc workloads and clearing the single-use plan cache bloat). To be more specific, these are statements that when generated (via adhoc or prepared statements) don't end up being parameterized (in general - only relatively simple statements can be effectively parameterized and when your statements complex, SQL Server won't parameterize them - this can be good and in some systems this can be terribly bad). But, now you know how bad... you know how much of your cache you're wasting but where is it going and are there any similarities to these statements? Should you (or can you?) bring them together to see if there is a commonality that can be leveraged?

In general, the answer is "it depends" (you knew that was coming!). However, I do have good news, you can definitely gauge what's actually going on and how many of the statements are in fact, similar. This might allow you to leverage another database option (forced parameterization) and/or determine that you really NEED stored procedures. However, that might end up being more complex as it will require rewriting the app (or at least some of the app). Again, the good news here is that you'll be able to tell which part of the app.

Take for example the following:

DBCC FREEPROCCACHE
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripped'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripper'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripps'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Falls'
;
go

SELECT cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, st.text --, *
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared'
)
        AND st.text LIKE '%from dbo.member%' 
        AND st.text NOT LIKE
'%SELECT cp.objecttype%'
ORDER BY cp.objtype
go

objtype       cacheobjtype          size_in_bytes   refcounts  usecounts     text
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Falls'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripps'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripper'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripped'; 
Adhoc         Compiled Plan         24576               2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripp'; 

The first thing to notice is that each of these queries has a *different* statement in the adhoc plan cache. The second is to recognize that each of these statements take 24K in the plan cache. If we were to change to "optimize for ad hoc workloads" then things might be a bit better:

sp_configure 'optimize for ad hoc workloads', 1
go
reconfigure
go
DBCC
FREEPROCCACHE
go
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripped'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripper'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Tripps'
;
go
SELECT
* FROM dbo.Member WHERE Lastname = 'Falls'
;
go

SELECT cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, st.text --, *
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared'
)
        AND st.text LIKE '%from dbo.member%' 
        AND st.text NOT LIKE
'%SELECT cp.objecttype%'
ORDER BY cp.objtype
go

objtype       cacheobjtype         size_in_bytes   refcounts  usecounts     text
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Falls'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripps'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripper'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripped'; 
Adhoc         Compiled Plan Stub  320                 2              1                 SELECT * FROM dbo.Member WHERE Lastname = 'Tripp'; 

But, this is not the only issue. Even though each statement will only be in the cache once and will only take 320 bytes (instead of 24K) the cumulative effect of this might be significant. So, how can you see this? If you try to aggregate over these statements then you'll only be able to use the first 40-50 characters while that works here it definitely won't work for more complex statements. Until - SQL Server 2008. In 2008, you can use the query_hash to find similar statements and their overall affect on the query cache.

SELECT qs2.query_hash AS [Query Hash]
        , SUM(qs2.size_in_bytes) AS [Total Cache Size]
        , SUM(qs2.total_worker_time)/SUM(qs2.execution_count) AS [Avg CPU Time]
        , SUM(qs2.total_elapsed_time)/SUM(qs2.execution_count) AS [Avg Duration]
        , COUNT(*) AS [Number of plans] 
        , MIN(qs2.statement_text) AS [Statement Text]
FROM (SELECT qs.*, cp.size_in_bytes, SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1, ((CASE statement_end_offset WHEN -1 THEN DATALENGTH(st.text) 
                ELSE QS.statement_end_offset END - QS.statement_start_offset)/2) + 1) AS statement_text 
                FROM sys.dm_exec_cached_plans AS cp 
                        JOIN sys.dm_exec_query_stats AS qs ON cp.plan_handle = qs.plan_handle
                CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as qs2
GROUP BY qs2.query_hash ORDER BY [Avg Duration] DESC;

This is incredibly important because this will give you the information to determine what SIMILAR query (or queries) are consuming the cache and which ones have the highest cumulative effect. And, if you see a statement that is consistent and can be optimized using indexes (and consistently using the same indexes) what you might try first is "forced parameterization." This is a DATABASE-level option (available in 2005+) that significantly improves but still doesn't guarantee that a statement will be parameterized. See the Books Online for Forced Parameterization (SQL Server 2008 Forced Parameterization and SQL Server 2005 Forced Parameterization) for the rules on when SQL Server parameterizes. However, there are many statements that will. If you have A LOT of plan cache bloat AND you think your plans might benefit from forced parameterization - this is definitely something you can try.

Again, the good news here is that the QP will try harder to parameterize but there are still many cases where it won't. In this case, SQL Server parameterizes this statement (it becomes a Prepared statement) as:

(@0 varchar(8000))select * from dbo . Member where Lastname = @0

You can see this through the following query/results:

SELECT st.text, cp.objtype, cp.cacheobjtype, cp.size_in_bytes, cp.refcounts, cp.usecounts, qp.query_plan --, *
FROM sys.dm_exec_cached_plans AS cp
        CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
        CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp
WHERE cp.objtype IN ('Adhoc', 'Prepared'
)
        AND st.text LIKE
'%member%'
ORDER BY cp.objtype
go

text objtype cacheobjtype size_in_bytes refcounts usecounts query_plan
SELECT * FROM dbo.Member WHERE Lastname = 'Falls';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Falls';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripps';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripper';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripped';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';   Adhoc Compiled Plan 24576 2 1 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="SELECT * FROM dbo.Member WHERE Lastname = 'Tripp';&#xD;&#xA;" StatementId="1" StatementCompId="1" StatementType="SELECT" ParameterizedPlanHandle="0x0600050088DDF510B8C0630A000000000000000000000000" ParameterizedText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" /></Statements></Batch></BatchSequence></ShowPlanXML>
(@0 varchar(8000))select * from dbo . Member where Lastname = @0 Prepared Compiled Plan 40960 7 10 <ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" Build="10.0.2531.0"><BatchSequence><Batch><Statements><StmtSimple StatementText="(@0 varchar(8000))select * from dbo . Member where Lastname = @0" StatementId="1" StatementCompId="1" StatementType="SELECT" StatementSubTreeCost="0.00657038" StatementEstRows="1" StatementOptmLevel="FULL" QueryHash="0xDEF0805C7B74A31E" QueryPlanHash="0x785C8E320D853B97" StatementOptmEarlyAbortReason="GoodEnoughPlanFound"><StatementSetOptions QUOTED_IDENTIFIER="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" NUMERIC_ROUNDABORT="false" /><QueryPlan CachedPlanSize="24" CompileTime="1" CompileCPU="1" CompileMemory="136"><RelOp NodeId="0" PhysicalOp="Nested Loops" LogicalOp="Inner Join" EstimateRows="1" EstimateIO="0" EstimateCPU="4.18e-006" AvgRowSize="173" EstimatedTotalSubtreeCost="0.00657038" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></OutputList><NestedLoops Optimized="0"><OuterReferences><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></OuterReferences><RelOp NodeId="1" PhysicalOp="Index Seek" LogicalOp="Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="20" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="10000" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></OutputList><IndexScan Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" NoExpandHint="0"><DefinedValues><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></DefinedValue></DefinedValues><Object Database="[credit]" Schema="[dbo]" Table="[member]" Index="[IX_Member_Lastname]" IndexKind="NonClustered" /><SeekPredicates><SeekPredicateNew><SeekKeys><Prefix ScanType="EQ"><RangeColumns><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="lastname" /></RangeColumns><RangeExpressions><ScalarOperator ScalarString="[@0]"><Identifier><ColumnReference Column="@0" /></Identifier></ScalarOperator></RangeExpressions></Prefix></SeekKeys></SeekPredicateNew></SeekPredicates></IndexScan></RelOp><RelOp NodeId="3" PhysicalOp="Clustered Index Seek" LogicalOp="Clustered Index Seek" EstimateRows="1" EstimateIO="0.003125" EstimateCPU="0.0001581" AvgRowSize="161" EstimatedTotalSubtreeCost="0.0032831" TableCardinality="10000" Parallel="0" EstimateRebinds="0" EstimateRewinds="0"><OutputList><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></OutputList><IndexScan Lookup="1" Ordered="1" ScanDirection="FORWARD" ForcedIndex="0" ForceSeek="0" NoExpandHint="0"><DefinedValues><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="firstname" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="middleinitial" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="street" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="city" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="state_prov" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="country" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="mail_code" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="phone_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="photograph" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="issue_dt" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="expr_dt" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="region_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="corp_no" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="prev_balance" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="curr_balance" /></DefinedValue><DefinedValue><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_code" /></DefinedValue></DefinedValues><Object Database="[credit]" Schema="[dbo]" Table="[member]" Index="[member_ident]" TableReferenceId="-1" IndexKind="Clustered" /><SeekPredicates><SeekPredicateNew><SeekKeys><Prefix ScanType="EQ"><RangeColumns><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></RangeColumns><RangeExpressions><ScalarOperator ScalarString="[credit].[dbo].[member].[member_no]"><Identifier><ColumnReference Database="[credit]" Schema="[dbo]" Table="[member]" Column="member_no" /></Identifier></ScalarOperator></RangeExpressions></Prefix></SeekKeys></SeekPredicateNew></SeekPredicates></IndexScan></RelOp></NestedLoops></RelOp><ParameterList><ColumnReference Column="@0" ParameterCompiledValue="'Tripp'" /></ParameterList></QueryPlan></StmtSimple></Statements></Batch></BatchSequence></ShowPlanXML>

Ideally, you'll end up with less wasted cache because all adhoc statements from here - will use the Prepared version of the statement. This also means that they'll run with a compiled plan (so you'll also save compile time) and - if the plans are consistent then all of this is good. However, if other values (for example 'Smith') require a completely different plan - then this could actually be worse (this is the "It Depends" part). But, if you still have statements that do have consistent execution plans (because you know the data and/or you know exactly how this particular data is being used) then EITHER sp_executesql or a stored procedure would effectively force the parameterization as well and it would say to me that you really understand your data/application. But, if you're wrong... anytime you force the parameterize (and increase re-use of a plan - you could end up forcing the re-use of a bad (and possibly REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY, REALLY bad plan) :)).

So, while you solve one problem - you could create another one (see the blog post titled: Using the OPTION (RECOMPILE) option for a statement for more information). If the plans are NOT consistent then forcing a plan (through either forced parameterization, sp_executesql OR a regular stored procedure) can result in poor performance. When writing the stored procedure (and through good testing processes), you'll be able to create better and more effective code. This is truly my preferred method. And, I'm just about to wrap up my series with roughly 3 more posts (at least that's what I currently have planned but you guys have certainly been steering me in this direction for a bit :-)).

Thanks for reading,
kt

In my post describing the new SQL Server 2008 configuration option [optimize for adhoc workloads] titled: Plan cache and optimizing for adhoc workloads, I asked for you to send me your plan cache numbers. In addition to the comments on the post, I received a lot of feedback in email (and lots of comments - and even a related blog posts). The general state of the cache for many was that only 10s of MB were wasted, for a couple it was 100s MB and for a few it was GBs of memory - completely wasted. I've taken some of the worst numbers and compiled them below:

 CacheType Total Plans  Total MBs  Avg Use Count   Total MBs - USE Count 1   Total Plans - USE Count 1  % of Total
Prepared

1,541

7,925.85

1,245

7,858.43

1,448

99.15

Adhoc

86,624

8,592.48

5

7,332.59

76,145

85.34

Prepared

148,527

7,428.82

444

5,074.25

94,851

68.30

Adhoc

63,471

4,565.00

1,328

4,161.54

50,737

91.16

Adhoc

1,358

2,704.27

3,676

2,673.25

425

98.85

Adhoc                

48,140

6,233.64

173

2,106.05

20,493

33.79

Prepared

18,639

2,590.82

127

2,027.72

13,614

78.27

Prepared

18,010

3,237.18

354

1,944.48

9,561

60.07

Adhoc

17,392

2,417.38

100

1,787.33

7,741

73.94

Prepared

6,276

1,875.71

59,370

1,560.66

4,020

83.20

Prepared

16,832

2,844.61

10,829

1,507.63

5,732

53.00

Prepared

3,075

1,681.56

46,262

1,427.33

604

84.88

Adhoc

29,047

1,923.84

39

1,362.09

22,827

70.80

Adhoc 

17,028

1,255.60

185

1,160.85

15,845

92.45

Adhoc

118,838

2,073.25

2,813

1,128.47

41,212

54.43

Adhoc                

13,895

1,300.72

11

875.45

9,351

67.31

Prepared 

8,266

476.90

31

462.82

8,031

97.05

Adhoc 

8,865

507.63

6

293.98

4,738

57.91

Adhoc 

10,066

401.54

9

282.87

5,717

70.45

Adhoc 

18,676

506.66

12,463

260.48

7,938

51.41

Adhoc 

2,113

223.62

142

207.36

2,024

92.73

Adhoc 

10,107

240.27

76

157.51

6,073

65.55

Prepared            

1,888

154.91

4

125.08

1,438

80.74

Prepared            

1,561

195.27

14,761

123.70

1,089

63.35

Prepared 

742

125.30

14,517

109.39

593

87.30

Prepared 

22,566

838.79

357

99.87

505

11.91

Adhoc   

478

109.01

2

92.62

423

84.96

Adhoc 

2,917

173.43

26,651

82.73

1,313

47.70

Adhoc 

2,906

122.81

24

77.08

1,911

62.76

Prepared 

609

98.06

106

75.59

360

77.09

Adhoc   

638

75.74

24

52.04

414

68.71

Prepared             

322

23.12

8

17.82

275

77.09

First, some of these do have the "optimize for adhoc workloads" option set. However, not too many. And, for those that do have this set, the single-use plan cache was for the amount of space taken by the compile plan stubs - not actual execution plans. Typically, these are significantly smaller (typically only a matter of bytes) rather than 10s-100s of kilobytes. So, while the amount of cache wasted is reduced to only a fraction of what it would have been without the option set, there's still the potential for a lot of cache wasted.

Another thing to notice is that the only interesting nmumbers came from these two types of cache: Adhoc and Prepared. This is to be expected - the numbers for single-use plan cache are worst for the scenarios where statements are auto parameterized and "prepared" but not often re-used (where the statements submitted vary wildly because of different client applications/executions). But, while it is expected, what can you do about this? Unfortunately, not much - without changing the client application interface to the data (meaning - use stored procedures).

However, what you might benefit from (even if you have the option set) is periodically clearing the cache. However, you don't necessarily need to clear all of the cache - you can just clear the Adhoc plan cache.

To clear just the "SQL Plans" from the plan cache, use:

DBCC FREESYSTEMCACHE('SQL Plans')

If you want to clear all of the cache, you can use:

DBCC FREEPROCCACHE

And, in SQL Server 2005/2008 there are a host of combinations that will allow you to clear as much or as little as you'd like - down to just a single plan (using the plan_handle) in SQL Server 2008. For more information, check out the BOL topic: DBCC FREEPROCCACHE.

But, if you want to be more clever about it, you might want to automate the process of clearing these 'SQL Plans' when they're wasting more than a certain percentage of the memory that SQL Server's using OR more simply, once the single-use plan cache reaches a certain size (depending on the amount of memory that you have you might choose something as low as 500MB or possibly as high as 2-3GB). To help you out, I decided to write the queries to do this and give you a few options from which to choose. And, it's definitely easier to write more interesting code in 2008 because of some of the new memory-related DMVs (specifically: sys.dm_os_sys_memory and sys.dm_os_process_memory). However, you can still do this fairly easily in both editions.

So, let me summarize:
1) If you're running SQL Server 2008 and you have cache being wasted by single-use plans, be sure to use the new "optimize for adhoc workloads."
2) If you still find that you're wasting 100s of MB or GB of cache, consider creating a job that programmatically checks cache and then clears the 'SQL Plans' from cache based on one of the options/code below.

  1.  
    1. This is probably the easiest and most simplfied option.
    2. This is probably overkill for most folks but probably won't hurt either, especially if you have a lot of other types of cache being wasted by single-use plans.
    3. This is probably the most interesting and uses some of the new DMVs to see how much of the ACTUAL working cache is going to single-use plans. However, this is only for 2008. Could I write it for 2005, yes, but it's not really worth it. I'd just go with option 1 (or possibly 2) for 2005.

1. Clearing *JUST* the 'SQL Plans' based on *just* the amount of Adhoc/Prepared single-use plans (2005/2008):

DECLARE @MB decimal(19,3)
        , @Count bigint
        , @StrMB nvarchar(20)

SELECT @MB = sum(cast((CASE WHEN usecounts = 1 AND objtype IN ('Adhoc', 'Prepared') THEN size_in_bytes ELSE 0 END) as decimal(12,2)))/1024/1024 
        , @Count = sum(CASE WHEN usecounts = 1 AND objtype IN ('Adhoc', 'Prepared') THEN 1 ELSE 0 END)
        , @StrMB = convert(nvarchar(20), @MB)
FROM sys.dm_exec_cached_plans

IF @MB > 10
        BEGIN
                DBCC FREESYSTEMCACHE('SQL Plans') 
                RAISERROR ('%s MB was allocated to single-use plan cache. Single-use plans have been cleared.', 10, 1, @StrMB)
       
END
ELSE
        BEGIN
                RAISERROR ('Only %s MB is allocated to single-use plan cache - no need to clear cache now.', 10, 1, @StrMB)
               
-- Note: this is only a warning message and not an actual error.
        END
go

2. Clearing *ALL* of your cache based on the total amount of wasted by single-use plans (2005/2008):

DECLARE @MB decimal(19,3)
        , @Count bigint
        , @StrMB nvarchar(20)

SELECT @MB = sum(cast((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(12,2)))/1024/1024 
        , @Count = sum(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END)
        , @StrMB = convert(nvarchar(20), @MB)
FROM sys.dm_exec_cached_plans

IF @MB > 1000
        DBCC FREEPROCCACHE
ELSE
        RAISERROR ('Only %s MB is allocated to single-use plan cache - no need to clear cache now.', 10, 1, @StrMB)
go

3. Stored Procedure to report/track + logic to go into a job based on percentage OR MB of wasted cache (2008 only):

This I put into a script that you can play with here (sp_SQLskills_CheckPlanCache.sql (3.67 kb)).

Enjoy!
kt

 

I mentioned that servers receiving a lot of dynamic constructs (typically those generated by client SQL generaters/ORM/LINQ, etc.) can start to consume too much plan cache and have problems with "single-use plans" in my last post titled: Statement execution and why you should use stored procedures. I also mentioned that SQL Server 2008 has an option/feature specifically to help reduce the bloat on the plan cache by only storing a query plan hash the first time a plan is created.

First - let's check to see how your plan cache is currently allocated:

(note: updated to decimal(18,2) as a few of you had overflow errors due to high use counts!)

SELECT objtype AS [CacheType]
        , count_big(*) AS [Total Plans]
        , sum(cast(size_in_bytes as decimal(18,2)))/1024/1024 AS [Total MBs]
        , avg(usecounts) AS [Avg Use Count]
        , sum(cast((CASE WHEN usecounts = 1 THEN size_in_bytes ELSE 0 END) as decimal(18,2)))/1024/1024 AS [Total MBs - USE Count 1]
        , sum(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) AS [Total Plans - USE Count 1]
FROM sys.
dm_exec_cached_plans
GROUP BY objtype
ORDER BY [Total MBs - USE Count 1]
DESC
go

This statement will show you how much of your cache is allocated to single use plans... and, I'd love to hear what your numbers are... this query works in 2005 and 2008; however, the primary solution I'm describing here (optimize for adhoc workloads) will only work in SQL Server 2008.

If you have a lot of your cache going to plans that are only executed once, then it's time to clean up the cache and take better advantage of it with plans that are more consistent and more stable. Ideally, this means using more stored procedures and writing these stored procedures effectively for better performance (I'm currently in the midst of doing this in my Optimizing Procedural Code category). However, if you absolutely must use a lot of adhoc SQL, then you should consider upgrading and turning on this new option. It's an advanced configuration option so you won't see it until you "show advanced options" and it's set using sp_configure. There have been some other really good posts out there on how to use this and what this is so I'm just going to bring together some great resources for you to read. The most important post to read (and it's especially important for those on you on versions of SQL Server 2000 or SQL Server 2005 RTM/SP1 [er... why aren't you on SP2?]) is that plan cache can get out of control. 2005 SP2 and 2008 reduce the total size but there no upper limit (which is again - a GREAT reason for the addition of "optimize for adhoc workloads"). Kalen Delaney talks about how things really work in her SQL Server 2005 SP2 post titled: Did You Know? SP2 does NOT limit the amount of plan cache you can have (key word there is LIMIT). Again, SQL Server doesn't set an upper limit but it does [drastically] reduce the total size that's possible (as of SP2). As for even more details on plan caching, recompilation and SQL Server 2008's better cache control - check out Greg Low's (blog|twitter) whitepaper titled: Plan Caching in SQL Server 2008. Taken STRAIGHT from the first section of the whitepaper:

SQL Server 2008 and SQL Server 2005 SP2
* 75% of visible target memory from 0-4GB + 10% of visible target memory from 4Gb-64GB + 5% of visible target memory > 64GB
 
SQL Server 2005 RTM and SQL Server 2005 SP1
* 75% of visible target memory from 0-8GB + 50% of visible target memory from 8Gb-64GB + 25% of visible target memory > 64GB
 
SQL Server 2000
* SQL Server 2000 4GB upper cap on the plan cache 

Finally, lots of additional posts on this topic will give you even more details:

However, I'm still really interested in seeing your numbers from the query above - let me know!

NEW/ADDITIONAL REQUEST: Let me know your max server memory setting as well as the total memory available on the box?

Thanks for reading!
kt

OK, so, this is where we're at...

When a client application makes a request to SQL Server there are a few ways in which they can do it:

  1. Submit an adhoc transact-SQL statement
    • This statement can be parameterized and the parameterized statement can be deemed:
      • Safe: These are statements where the optimizer has evaluated the plan and feels that changes to values will not affect the execution plan. And, there are quite a few rules that you must meet for this to be the case. And, as a result, this is NOT something I really rely on and/or aim for (for optimization - I'd rather use stored procedures). You can monitor this through the Perfmon counter: SQLServer:SQL Statistics (Safe Auto-Params/sec).
      • Unsafe: These are statements where the optimizer feels that subsequent values will affect the execution plan. And, for adhoc statements many will be deemed unsafe. You can monitor this through the Perfmon counter: SQLServer:SQL Statistics (Unsafe Auto-Params/sec).
    • This statement might not end up being parameterized (this is not likely and the "failed parameterized statements" count should be low - you can monitor it through the Perfmon counter: SQLServer:SQL Statistics (Failed Auto-Params/sec).
  2. Submit a statement via sp_executesql that parameterizes the statement through defined parameters - this creates a plan that is neither safe nor unsafe but is ALWAYS an exact match for subsequent executions (even when the parameter values change). This is where I generally have problems with it...
  3. Execute a stored proceudre which has one or more of the above in it and/or DSE in it. This is where I think things start to get interesting and this is really the reson for why I started this series on optimizing procedural code.

First, and foremost, I believe that stored procedures are a SIGNIFICANTLY better way to develop high-end applications that absolutely must scale. Don't get me wrong, I understand the benefits of some of the tools out there that can generate your SQL and allow more rapid development of your application. And, if the application isn't trying to handle thousands of inserts/sec then this might work out really well. However, you still have to be careful. Many of these applications (that auto-generate SQL code) rely (very heavily in some cases) on sp_executesql and this can result in poor performance. Additionally, when there's A LOT of adhoc SQL then you can end up with a lot of your cache going to "single-plan use" where quite a bit of it is wasted (NOTE: there's a SQL Server 2008 configuration option [called "optimize for adhoc workloads"] that you really need to know about IF you have a lot of adhoc statements. This can significantly reduce the amount of cache that's wasted on plans that only execute once... I'll do a post on that feature NEXT). Again, while there are quite a few benefits the performance problems can become SEVERE. OK, I've definitely said this before but why am I rehashing this?

Because this is where we're at - we've looked at some of the problems with DSE in my post titled: Little Bobby Tables, SQL Injection and EXECUTE AS. Then, we started to look at EXEC and sp_executesql - how are they different? in the second post. Finally, I started to show *statement-level* recompilation using OPTION (RECOMPILE) in the post titled: Using the OPTION (RECOMPILE) option for a statement. The real reason for why I started this series is to prove: where, why and how stored procedures really shine (we're getting close!).

Reasons to use stored procedures:

  • Centralization - Sharing common code between applications asnd only need to make your changes ONCE! And, there's no possibility of different business rules being enforced in one application and not another and/or for incorrect business rules to be "out there" in your applications.
  • Simplification - Users should not need to know your schema. They shoudn't care that you've denormalized something and they should reap the benenfits through YOUR creating an access method that is simple and works even if you end up changing back and normalizing. The database schema should be isolated from the users...
  • Change control - You should be able to change your schema without breaking applications. This about this one a bit - this is EXACTLY the tactic that SQL Server takes with it's own internal tables. We access them trhough stored procedurs, views (aka catalog views) and functions. If they need to change the underlying schema then we don't need to worry - as long as they also make the sp, function or view return the same way it always has. This has complicated things over time. First, the SQL team has bascially decided that (for backward compatiblity) the sps will never handle new features (and, yes, this is HORRIBLY annoying (and, it's the reason for why I've rewriten sp_helpindex here [and, fyi, I have an even better/newer/shinier version coming soon]). And, they've had to add functions like DATABASEPROPERTYEX for the "extended" properties that were added around recovery models in SQL Server 2000. However, as long as the funtionality of the interface stays the same then our applications don't break. That's incredibly important and probably one of my favorite reasons to create some separation between the schema and the users (and/or interface).
  • Security - This is HUGE. And, it requires an understanding of object chaining and permissions. The general idea is that when objects are all within the same schema (and therefore have the same owner) that when execution rights are given on a stored procedure the caller does not need explicit permissions to the base objects (execution is essentially granted because the ownership chain is not broken). Another way to think about it is that your granting rights to execute a "process" and one that you have control (and ownershp) over all of the object involved. So, SQL Server only requires execute rights on the sp. However, this ONLY goes for explicity statements and not statements that are built dynamically. For those, the CALLER needs explicit permission OR the procedure needs to be created with EXECUTE AS. So, if you plan to use EXECUTE AS be sure to read my post titled: Little Bobby Tables, SQL Injection and EXECUTE AS.

But, more than anything else - you CAN use them for better performance. No, it's not solely because the might have an already compiled (and therefore optimized) plan that's available at runtime (yes, that CAN help) but compilation (or recompilation) can often be a small portion of runtime compared to the execution of a bad plan. And, this is where optimizing the procedural code can help. The bad news is that you have to know what you're doing. And, I could make some arguments that SQL Server could do some things better here but what I've been doing with this series is getting you familar with the options that will finally come together in the next a post titled: Stored Procedure Performance. (NOTE: I'll update this with a corrected title/link once the post is completed. I will try to get this this within the next couple of days.)

Thanks for reading!!
kt

I started this series with the post titled: Little Bobby Tables, SQL Injection and EXECUTE AS. I then moved to discussing some of the differences with the post titled: EXEC and sp_executesql - how are they different?

Today, I want to address a few of the comments as well as continue with a few tips and tricks using these commands.

First off - could we have helped the performance of the sp_executesql statement?

Yes...

If we know that a statement returns a varying amount of data (based on the parameters supplied) then we can use the SQL Server 2005 feature WITH RECOMPILE to tell SQL Server that the statement being executed should have it's own plan created and that prior plans (if they exist) should not reuse the statement. It also tells SQL Server that this particular plan is a "single-use plan" that should not be reused for subsequent users. To see the combination of all of these things - I'll use some of the DMVs that track plan cache and plan utilization.

DBCC FREEPROCCACHE
go

SELECT st.text, qs.EXECUTION_COUNT --, qs.*, cp.*
FROM sys.dm_exec_query_stats AS
qs
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS cp
WHERE st.text like
'%FROM dbo.member%'
go

Right now, this query returns 0 rows.

I'll execute the following and then recheck the plan cache:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)', 'Tripp'
go

Now, we have a row for our parameterized query plan:

text                                                                                                                                                            EXECUTION_COUNT
(@lastname varchar(15))SELECT * FROM dbo.member WHERE lastname LIKE @lastname              1

So, what is this showing us... it's showing us that there's a plan in cache for this statement. And, if we're interested in seeing the plan, we can remove the commented out cp.* in the query above to get the cp.query_plan column. It shows the following:

<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.1" ...LOTS OF OTHER STUFF THAT I... </ShowPlanXML>  (here's the FULL text of this: query plan xml.sqlplan (11.98 kb))

OK, I know that's not overly helpful... but, you're more than welcome to copy it and save it as a .sqlplan file. Once saved as a .sqlplan file - you can open it directly into SQL Server 2005 or SQL Server 2008's Management Studio. However, the very *nice* feature of SQL Server 2008's Management Studio is that a single click on an XML showplan - will go DIRECTLY into a graphical query plan window:

 

And, once again, we see the optimal plan (to use the index and do a bookmark lookup) because this query is highly selective (only 1 row).

We'll execute the exact same statement again - using the value of Anderson just to get setup to the point where we were last week:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr = 'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)', 'Anderson'
go

And, we see that it uses the EXACT same plan (looking at showplan). In fact, we can see that from checking our plan cache as well:

text                                                                                                                                                            EXECUTION_COUNT
(@lastname varchar(15))SELECT * FROM dbo.member WHERE lastname LIKE @lastname              2

And, while we know that this plan (to use the index) is good for the highly selective value of 'Tripp' it's is not good for the value of Anderson as there are many rows that match. If we suspected this and/or knew this when we were executing (from the client) then we could use OPTION (RECOMPILE) to force SQL Server to get a new plan:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr = 'SELECT * FROM dbo.member WHERE lastname LIKE @lastname OPTION (RECOMPILE)'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)', 'Anderson'
go

The query plan as a result of this execution is:

And, this is the more optimal plan given this parameter. At ths point, the questions (IMO) are:

  1. Would someone really be able to programmatically guestimate that a parameter submitted from the client is "atypical" and/or that it warrants recompilation? And, I guess I can answer yes - for some parameters - like those with a wildcard. But, if we're just talking about two different values against a single column, this would be guessing statistics of the data.
  2. Should this particular statement always be executed to recompile and never save a plan?
  3. What did SQL Server do with the plan itself?
I'll go with answering #3 first as that one is easy to answer. Using the same statement, I'll again query the plan cache:

text                                                                                                                                                            EXECUTION_COUNT
(@lastname varchar(15))SELECT * FROM dbo.member WHERE lastname LIKE @lastname              2

Even though this is the third time we have executed this statement, this final execution was NOT put in cache. It was used solely for the execution with OPTION (RECOMPILE). And, it will NOT affect future executions. If I go back and execute without the OPTION (RECOMPILE) then I will get the prior plan (to use the index).

Now, the other two questions - these are a lot more interesting and this is where I think that stored procedures should be used. Personally, I think that developers that know the data and know the application - will be a lot better at creating the RIGHT code especially when they understand all of the options available to them.

Here's the way I think about stored procedures and recompilation:

  • If I know that a particular statement always returns the same number of rows and uses the same plan (and, I'd know this from testing), then I'll create the stored procedure normally and let the plan get cached.
  • If I know that a particular statement wildly varies from execution to execution and the optimal plan varies (again, I should know this from testing multiple sample executions), then I'll create the stored procedure normally and I'll use OPTION (RECOMPILE) to make sure that the statement's plan is not cached or saved with the stored procedure. On each execution that stored procedure will get different parameters and the particularly nasty statement will get a new plan on each execution.

However, this is also where things get more challenging. I've often seen stored procedures where people try and use conditional logic to break up the different types of parameters and this doesn't usually work out as well as expected (I'll blog this next). And, this is always where some decide that they want to dynamically build the statement that gets executed - now, we need to determine whether or not we should use sp_executesql or EXEC. And, there are really a couple of options at this point. Ultimately, in one or two more posts - I'll finally show you where EXEC is a clear winner over sp_executesql because even the OPTION (RECOMPILE) doesn't always help ALL kinds of plans (and especially one of the more common types of plans I see).

So, I'm getting closer... at least two more to go here. Thanks for the great questions/comments!
kt

In my last post: Little Bobby Tables, SQL Injection and EXECUTE AS, I wanted to highlight how to avoid SQL Injection when using EXEC. A few people brought up the point that I could have avoided SQL Injection by using sp_executesql. And, while sp_executesql can be used in some cases, it's not always the most ideal from a performance perspective. So, to really show this, I'm going to start with focusing on the similarities and differences of EXEC and sp_executesql. I'll start here with some performance details but you'll find that I'm going to just hit the tip of the iceberg with this one. I'll definitely need another post or two!

First, a quick overview:

sp_executesql (also known as "Forced Statement Caching")

  • Allows for statements to be parameterized
  • Only allows parameters where SQL Server would normally allow parameters; however, this string can be built using forms of dynamic constructs. I'll give more details on this in additional posts.
  • Has strongly typed variables/parameters - and this can reduce injection and offer some performance benefits!
  • Creates a plan on first execution (similar to stored procedures) and subsequent executions reuse this plan

EXEC  (also known as "Dynamic String Execution" or DSE)

  • Allows *any* construct to be built
  • Treats the statement similarly to an adhoc statement. This means that the statement goes through the same process that adHoc statements do - they are parsed, probably parameterized and possibly deemed "safe" for subsequent executions to re-use
  • Does not have strongly typed parameters in the adhoc statement and therefore can cause problems when the statements are executed (I have ways around this)
  • Does not force a plan to be cached.
    • This can be a pro in that SQL Server can create a plan for each execution
    • This can be a con in the SQL Server needs to recompile/optimize for each execution

Let's start with using sp_executesql to parameterize a query where SQL Server would also allow parameters:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'Tripp'
go

Because 'Tripp' is a highly selective name, SQL Server uses an index to the lookup of the data:

So, for our next execution, I'll supply a different lastname - a lastname of Anderson. In this database, Anderson is NOT highly selective:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'Anderson'
go

However, the query plan looks exactly the same:

Or does it? It turns out that this query plan does look a tiny bit different but it's not very obvious... it's in the thickness of the lines. In this second execution it clearly shows that more data is being passed between the steps. But, is this a bad thing? Maybe, maybe not. Let's drill in a bit deeper. If I hover over the Index Seek (on member.test), I can see the following tooltip:

The key point here is that it shows an "Estimated Number of Rows" of 1.96 but an "Actual Number of Rows" of 385. That's pretty far off... why? Because this statement's plan was determined by the first execution of sp_executesql. Let's try another execution:

Because the query has LIKE in it, we can use wildcards. And, let's do that! This time I'll supply a wildcard of %e%:

DECLARE @ExecStr nvarchar(4000)
SELECT @ExecStr =
'SELECT * FROM dbo.member WHERE lastname LIKE @lastname'
EXEC sp_executesql @ExecStr, N'@lastname varchar(15)',
'%e%'
go

And, again, the query plan looks the same... but with even thicker lines. Yes, I realize... this is not blindingly obvious:

And, by turning on SET STATISTICS IO ON, we can also review the [Logical] IOs performed:

For 'Tripp' 

Table 'member'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

For 'Anderson'

Table 'member'. Scan count 1, logical reads 772, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

For '%e%'

Table 'member'. Scan count 1, logical reads 10019, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

This table only has 144 pages. In the cases of 'Anderson' and '%e%', SQL Server would have been better off doing a table scan. But, because I used sp_executesql I forced SQL Server to do caching. Subsequent users use the plan whether it's optimal or not.

And, there are even more complicated scenarios than this. I plan to keep tackling these issues over the next few days and I'll add quite a bit more to this. However, it's late. So, I'll dangle the carrot for now. Within the next couple of days, I'll show a series of EXEC statements that generate both SAFE and UNSAFE plans.

Thanks for reading!
kt

OK, I know many of you have seen this before (an oldie, but a goodie!):

(image from xkcd.com, with "copy and share" license described here: License)

But, what can you do to prevent this? And, when would this even be possible?

This is possible when DSE (dynamic string execution) occurs. There are still some VERY relevant and important reasons to use DSE and some are performance related (ok, this is another post for another day) but suffice it to say - I use DSE but I also know how to prevent Little Bobby Tables from running amuk within my database.

(April 5 note: after a few comments about sp_executesql, I've modified the following paragraph to explain a bit more... I'm *very* aware of sp_executesql and while it does have some benefits, it also has some negatives! I promise, this is coming soon in another post. I really want this post to focus solely on how to minimize injection when you do NEED to use a dynamically constructed string and there are times when this is the ONLY way to solve certain performance problems. sp_executesql - while often compared to EXEC as a better choice just isn't ALWAYS better and really it's not a simple comparison as they are not equivalent!! In fact, there are some cases where I would specifically avoid sp_executesql... So, I promise... this will be coming up in a blog post soon!! Great comments! Thanks!! :))

So, while the discussion on when, where and why you might use DSE is another one - especially when compared with sp_executesql (yes, I hear another blog post coming - and, well, I'm kind of on a roll right now :)... I want to give you a couple of very cool tips/tricks to reduce the possibilities of problems if/when you do use it (and, there REALLY are places where it's necessary!). It's especially important as there are a few things that many folks just don't even know exist. And, there are 3 parts to this article... I would expect that most apps need QUOTENAME() and/or REPLACE() but ALL apps should consider my recommendations/comments for EXECUTE AS - if/when you're actually using DSE (through EXEC).

  • QUOTENAME()
  • REPLACE()
  • EXECUTE AS

Part I: QUOTENAME()

In a question/thread (this is many moons ago), the following question was asked (NOTE: I have "tweaked" the code to highlight ONLY the problem here AND I've made it so that it won't execute):

I have a procedure that creates a database (below). When people use spaces in their database names it fails. How can I make it work if they supply spaces?

CREATE PROCEDURE dbo.CreateDBProc
(
     @DBName
sysname
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'CREATE DATABASE ' + @DBName
-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

And, there was a reply to this question where the answer was (and I am NOT recommending this answer. Please do NOT stop reading here otherwise you are asking for a world of pain - and LOTS of SQL Injection):

Just go ahead and do the following. By putting brackets around the name you will solve this problem.

CREATE PROCEDURE dbo.CreateDBProc
(
    
@DBName
sysname
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'CREATE DATABASE [' + @DBName + N']'
-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

OK, this is a SCARY recommendation. Not just because it won't always work (what if someone actually wants [don't get me wrong - I don't know WHY they'd want to do this but...] to put [brackets] in the name?) but because it's prone to SQL Injection.

This won't work:

EXEC dbo.CreateDBProc N'This is the most stupid :) name I can think of with [brackets] and all sorts of junk! in the name'

And, this is just scary what I can do:

EXEC dbo.CreateDBProc N'fakedbname] DROP DATABASE foo--'

EXEC dbo.CreateDBProc N'[fakedbname] EXEC(''CREATE PROC Test AS SELECT * FROM pubs.dbo.authors'') DROP DATABASE foo --'

So, what should we do?

ANSWER: If you're building strings where identifiers are parameterized (and, there are many reasons for why this many not be the best idea - but, if you absolutely must), USE QUOTENAME()!

So, here's the correct way to code this procedure so that an identifier cannot be subjected to SQL Injection:

CREATE PROCEDURE dbo.CreateDBProc
(
    
@DBName
sysname
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @DBName = QUOTENAME(@DBName, N']'
)
SELECT @ExecStr = N'CREATE DATABASE ' + @DBName
-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

Now, no matter what messed up name someone actually WANTS to create - this will protect it such that SQL Injection is not possible. You can try any of these combinations with this version of the proc and while creating a database actually named "[fakedbname] DROP DATABASE foo --" probably isn't desirable, dropping a database isn't either (and I would argue is MUCH worse).

All of these executions are "protected" from SQL Injection:

EXEC dbo.CreateDBProc N'this is my test database name'
EXEC dbo.CreateDBProc N
'[fakedbname] DROP DATABASE foo --'
EXEC dbo.CreateDBProc N
'[fakedbname] EXEC(''CREATE PROC Test AS SELECT * FROM pubs.dbo.authors'') DROP DATABASE foo --'
EXEC dbo.CreateDBProc N'This is the most stupid :) name I can think of with [brackets] and all sorts of junk! in the name'

Note: The first and second execute (if you actually execute instead of just SELECT @ExecStr) *and* create databases with these ugly names. The third and fourth don't execute because some of the injected characters are not supported for file/directory names. However, the most important point is that *all* are protected from SQL Injection!

Part II: REPLACE()

But, what if the string you need to work with isn't an identifier? Can you still use QUOTENAME()? And, the answer is YES, IF the string is compatible with SYSNAME which is an nvarchar(128). However, if you need longer strings or if you are protecting a string which is not an identifier - what else can you do? Well... it's definitely NOT as pretty but it works. The answer here is to use REPLACE. If I had wanted to do this with the procedure above, I would have had to do the following:

CREATE PROCEDURE dbo.CreateDBProcQuotes
(
     
@DBName sysname
)
AS
DECLARE
@ExecStr nvarchar(2000)
SELECT @DBName = REPLACE(@DBName, N'''', N'''''')
SELECT @ExecStr = N'CREATE DATABASE ' + @DBName -- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

However, this will NOT work here because an identifier must be properly quoted within the string and there are only 2 ways to "quote" identifiers. One is the [bracket] and the other is "double quotes" not single. Instead, this is what the procedure will look like (nope! we don't need REPLACE):

CREATE PROCEDURE dbo.CreateDBProcQuotes
(
     
@DBName sysname
)
AS
DECLARE
@ExecStr nvarchar(2000)
SELECT @ExecStr = N'CREATE DATABASE [' + @DBName + N']'-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

And, SQL only supports brackets by default (without any special settings). To allow double quotes, you can turn on QUOTED_IDENTIFIER (SET QUOTED_IDENTIFIER ON) to allow double quotes instead of brackets but this has a few negative consequences (See SET Options That Affect Results) and personally, I always try to avoid making SET options changes in most of my stored procs.

CREATE PROCEDURE dbo.CreateDBProcQuotes
(
     
@DBName sysname
)
AS
SET QUOTED_IDENTIFIER ON
DECLARE
@ExecStr nvarchar(2000)
SELECT @DBName = REPLACE(@DBName, N'''', '''''')
SELECT @ExecStr = N'CREATE DATABASE "' + @DBName + N'"'-- + all of the other stuff to place the files, etc...
SELECT @ExecStr
--EXEC(@ExecStr)
go

Once again, REPLACE isn't needed. And, to be honest, this becomes a nightmare with " [double quotes] or actual [brackets] in the name. QUOTENAME() deals with those properly. So, with an identifier you really need QUOTENAME(). So, where does replace come in? With actual strings in your DSE. Imagine that you're wanting to pass in a first name but something about your application requires that this happen through DSE (and, this CAN definitely happen - mostly for performance purposes). Don't get me wrong, I just don't have the ability to do all of this within one post but simply put, because of how stored procedures create their execution plans there are OFTEN reasons for why you might want a string to be built dynamically. No, probably not one quite this simple but I'm trying to keep this particular example simple. This is probably what it would look like to start:

CREATE PROCEDURE GetMembers
(
      @FirstName nvarchar(50
)
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ' + @FirstName

SELECT @ExecStr
-- EXEC(@ExecStr)
go

And, on first test:

EXEC GetMembers N'Kimberly'

And, on first test it generates: SELECT * FROM dbo.member WHERE FirstName = Kimberly which gives you an error because the firstname isn't properly quoted. So, you think... oh, I need to quote these in my EXEC statement:  

ALTER PROCEDURE GetMembers
(
    
@FirstName nvarchar(50
)
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ''' + @FirstName +
N''''
SELECT
@ExecStr
--EXEC(@ExecStr)

go

EXEC GetMembers N'Kimberly'

Ah, that works... for awhile. Then, someone complains that it doesn't work for C'Anne (I just met someone named C'Anne - pronounced SeaAnne) this past week.

EXEC GetMembers N'C''Anne'

And, it generates:

SELECT * FROM dbo.member WHERE FirstName = 'C'Anne'

which gives you a syntax error.

Msg 102, Level 15, State 1, Line 1
Incorrect syntax near 'Anne'.
Msg 105, Level 15, State 1, Line 1
Unclosed quotation mark after the character string ''.

Darn. But, at least this is how you catch it. Instead of with Little Bobby Tables. What if this gets executed:

EXEC GetMembers N'''Robert''); DROP TABLE Students; --'''

Luckily, I got:

Msg 3701, Level 11, State 5, Line 1
Cannot drop the table 'Students', because it does not exist or you do not have permission.

But, what if there had been a table called Students AND the context of the execution was such that the could execute this? Well, that's what's proposed by the cartoon and well... I've *unfortunately* heard of that happening!

So, what do you do? First, you need to protect the string being inserted! This is what you need:

ALTER PROCEDURE GetMembers
(
     @FirstName nvarchar(50
)
)
AS
DECLARE
@ExecStr nvarchar(2000
)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ''' + REPLACE(@FirstName, N'''', N'''''') +
N''''
SELECT @ExecStr
--EXEC(@ExecStr)

go

And, this requires that the string be surrounded with quotes AND the string submited have all single quotes replaced with '' [single quote followed immediately by a second single quote]). Yes, I realize that this is starting to look ugly but... what happens with Little Bobby Tables? We get his row. NOTE: We do have other potential performance problems here because the parameter (when it gets thrown back out into the string - is no longer an NVARCHAR, it's only a varchar). So, we should also consider adding an N into the string like the following:

SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = N''' + REPLACE(@FirstName, N'''', N'''''') + N''''

And, what if the string being built has numerics or other types - what should we do there? To be honest, I often EXPLICITLY type the variable in what looks like a completely unnecessary CONVERT. However, for performance purposes I *often* do this. Ugh, the future posts are REALLY starting to build from here! Again, I'm not hitting performance in this one so I'm not going to take the bait of the tangent. Another time for sure!

Finally, what if our strings are REALLY complex and we need to do some strange stuff and well... we're not sure that the string can be completely protected? What if we're passing in a much more complicated "piece" of a WHERE clause or a full and customized ORDER BY? What if it's not just a SINGLE value? Ugh... this is much harder but you definitely need EXECUTE AS.

Part III: EXECUTE AS

This is the ultimate in protection for your DSE code. This is where you decide that you're going to allow virtually anything in the WHERE clause or ORDER by but you really want to protect your database. Here, what I need is EXECUTE AS. And I'm not a big fan of this kind of code - but, again, I'm not looking at perf or anything here... I'm *just* looking at the security and potential of the injection.

SQL Server 2005 introduced EXECUTE AS and it has a lot of options - in fact, 4 options:

  • EXECUTE AS Caller
  • EXECUTE AS User
  • EXECUTE AS Self
  • EXECUTE AS Owner

EXECUTE AS Caller is the default. This means that the user that executes the procedure must have the DIRECT rights associated with the Dynamically Executed String. And, so, this is good. This should protect MOST cases altogether. However, what if the users have higher privileges? Well, I'd say that there's already something wrong with your database (because this should NEVER be the case) but... well... I've seen apps (and even RECENTLY) heard of apps that connect as SA. Yes, seriously. So... what can they do... Well, much worse that DROP TABLE. How about DROP DATABASE. Or, well, let's just say that if I found vulnerabilities in that - I could possible get outside of their SQL Server and do even more damage - to files, to the network. So... this is REALLY REALLY REALLY REALLY BAD. I would STRONGLY suggest that anyone that connects as SA - IMMEDIATELY WORK HARD TO CHANGE THIS.

EXECUTE AS Owner is almost as bad (although nothing's as bad as connecting as SA) - especially when the OWNER has elevated rights. So, I'd try to avoid EXECUTE AS Owner unless you SEVERELY restrict who has permissions to the procedure in general. But, don't EXECUTE AS Owner for a stored procedure that's granted EXEC to public. That completely defeats the purpose!

EXECUTE AS Self is a bit strange. But, when a stored procedure is created in a schema, the ownership chaining and privileges follow from the OWNER of the schema. So, what if the author of the stored procedure has rights that the owner doesn't. No, I don't really recommend this but that's where "self" comes in. This is where the procedure can execute under the context of the actual creator (not the owner). However, often these are the same and often they're DBO. This is even worse... Although still, not as bad as connecting as SA.

EXECUTE AS User (where the User is a low-privileged user with very few rights) is good. And, in fact, this will be the answer to this problem!

Yes, even if you connect as SA, I can STILL protect you. Yes, we can still protect this code even when run under the connection context of SA - with EXECUTE AS.

CREATE USER User_GetMembers
WITHOUT
LOGIN
go

GRANT SELECT ON Member TO User_GetMembers
go

ALTER PROCEDURE GetMembers
(
    
@FirstName nvarchar(50)
)
WITH EXECUTE AS N'User_GetMembers'
AS
DECLARE
@ExecStr nvarchar(2000)
SELECT @ExecStr = N'SELECT * FROM dbo.member WHERE FirstName = ''' + REPLACE(@FirstName, '''', '''''') + ''''
SELECT @ExecStr
--EXEC(@ExecStr)
go

And, even if for some reason you can't use REPLACE() this procedure will be limited in what they can do because the user (User_procedurename) has limited rights within the database. In fact, I would recommend that the user ONLY have the necessary rights for THAT stored procedure and I'd even recommend that you create one user for each stored procedure (that has DSE). Yes, that's potentially a lot of users... but, better than the alternative(s)!

Summary/Resources

Finally, please be sure that you're doing some form of auditing if you're using EXECUTE AS. And, you should also do a quick check to see if anyone is ELEVATING user's rights to a more privileged user. Check out these two blog posts for additional information:

And... ah... I'm done! There's a lot here but a lot of tips. You CAN prevent SQL Injection.

Thanks for reading!
kt

OK, we were in Iceland and then Florida for our Accidental DBA workshops and both went really well. People agree that there are quite a few involuntary/accidental DBAs out there and overall, we helped quite a few to see a lot of options for better performance, availability, recovery, and/or just manageability.

So, this is our "resources post". We waited until after the SQL Connections delivery to post these as we figured we might add a few more to the list (as is typical when you deliver content more than once - it's really never the same twice!).

Also, I used a few "interactive" (or build) slides in my presentation - specifically on transaction log backups and the concepts of "clearing the log" which really only clears the inactive portion of the log. To help you visualize this, I've added these slides here: TrippRandal_ClearingTheLog-BuildSlides.zip (647.2 KB).

Finally, we've taken all of the scripts that we demo'ed and placed them on SQLskills on our Past Events page here: http://www.sqlskills.com/pastConferences.asp.

And, if you were there and you think we missed something, feel free to ping me (or Paul!) with an email and we'll make sure to update this resources post (and/or [at least] help you find it what you're looking for!!).

Next stop - Microsoft TechEd ITPro in June (we're back in Orlando again)!
kt

Hey there everyone! I know I still owe you a few Q&A entries (for sessions 7, 8 and 9) but I wanted to get this blog entry out there so that you can play a bit with some of the resources. This series was targeted at developers but really helps to "bridge the gap" between development and administration by always remembering the impacts of what you implement (and techniques to help you to see if you do). More specifically, everything you do and/or design, has the potential for a negative impact to something else - there's no free lunch, eh?

So, this series focused more on Scalability but always remembered the impact to availability and/or reliability. This last session brought together the three primary areas to remember while developing a scalable and reliable architecture:

  1. Know your data
    • Design for Performance - Sessions 1, 2, 3, 6, and 8
  2. Know your users
    • Indexing for Performance - Sessions 4, 5 and 9
    • Optimizing Procedural Code - Session 7
    • Controlling Mixed Workloads and Concurrency - Session 6
  3. Users lie
    • Profile - to make sure that you're tuning what's really happening as opposed to what you think was going to happen! - Session 9

This last session had some great questions and as a result, a few new resources were used. Here are a few of the things we talked about:

Event Notifications and DDL Triggers

DMVs

Webcast links for the entire series!

Part 1: Creating a Recoverable Database
For the MSDN Download for Part 1, click here.
For the SQLskills Blog Entry for Part 1, click here.

Part 2: Creating a Reliable and Automated Backup Strategy
For the MSDN Download for Part 2, click here.
For the SQLskills Blog Entry for Part 2, click here.

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures
For the MSDN Download for Part 3, click here.
For the SQLskills Blog Entry for Part 3, click here.

Part 4: SQL Server Indexing Best Practices
For the MSDN Download for Part 4, click here.
For the SQLskills Blog Entries for Part 4
Resource links blog entry, click here.
Q&A blog entry, click
here.

Part 5: SQL Server Index Defrag Best Practices
For the MSDN Download for Part 5, click here.
For the SQLskills Blog entry, click here.

Part 6: Mixed Workloads, Secondary Databases, Locking and Isolation
For the MSDN Download for Part 6, click here.
For the SQLskills Blog Entry for Part 6, click here.

Part 7: Understanding Plan Caching and Optimizing Procedure Performance
For the MSDN Download for Part 7, click here.

Part 8: Data Loading and Aging Strategies - Table and Index Partitioning
For the MSDN Download for Part 8, click here.

Part 9: Profiling for Better Performance
For the MSDN Download for Part 9, click here.

Part 10: Session Summary - Common Roadblocks to Scalability
For the MSDN Download for Part 10, click here.
Transcript can be found here.

So, the series comes to an end (even though I still have more work to do). I have to say that it was a lot of fun and I enjoyed everyone's questions. And /start shameless plug here/ starting in March, SQLskills will begin a 10-12 part series on TechNet. The series will include sessions from my colleague Bob Beauchemin as well as me. This will definitely be more Operations and DBA focused but for all of you developers - it may help you better understand the system, High Availability and a myriad of New Features in SQL Server 2005.

I hope to see you there - or at least your DBA... ;-)

Thanks again everyone,

Kimberly

This is a much needed and much overdue blog entry... In 8 Steps to Better Transaction Log throughput, I mentioned a customer that was helped by TWO typical optimization problems I see. In that blog entry, I said I would write two blog entries - that one on transaction log optimization and another on common tempdb optimizations. Well, I forgot...until I was reminded with an email this morning (thanks Marcus!).

First - a bit of understanding of TempDB - what goes there?

  • Internal temporary objects needed by SQL Server in the midst of other complex operations. For example, worktables created by a hash aggregate will be stored in TempDB or interim tables uses in hash joins (almost anything that shows as "hash" something in your query plan output is likely to go to tempdb).
  • User objects created with either # (for local temporary objects), ## (globabl temporary objects) or @ (table variables)
    • # = Local temporary object
      Local temp objects are objects accessible ONLY in the session that created it. These objects are also removed automatically when the session that created it ends (unless manually dropped).
    • ## = Globabl temporary object
      Global temporary objects are objects that are accessible to ANYONE who can login to your SQL Server. They will only persist as long as the user that created it lasts (unless manually dropped) but anyone who logs in during that time can directly query, modify or drop these temporary objects. These objects are also removed automatically when the session that created it ends (unless manually dropped) OR if being used by another session when the session that created it ends, when the session using it finishes using it (and it's only as long as any locks are held). If other sessions need more permanent use of a temporary object you should consider creating a permanent objects and dropping it manually.
    • @ = User-defined Table Variable
      User-defined Table Variables were introduced in SQL Server 2000 (or, wow - was it 7.0?) and provide an alternative to temporary tables by allowing you to create a variable defined as type TABLE and then you can populate and use it in a variety of ways. There has been A LOT of debate over whether or not you should always use table variables or always use temp tables. My response is that I ALWAYS avoid the word always! My point is that table variables are NOT always better nor are temp tables always better. There are key uses to each. I tend to like temp tables in scenarios where the object is used over a longer period of time - I can create non-key indexes on it and it's more flexible to create to begin with (SELECT INTO can be used to create the temp table). I also have the ability to use the temporary table in nested subprocedures because it's not local to the procedure in which it was created. However, if you don't need any of those things then a table variable might be better. When it is likely to be better - when you have smaller objects that don't need to be accessed outside of the procedure in which it was created and when you only need KEY indexes (a table variable ONLY supports the indexes created by a create table statement - meaning PRIMARY KEY and UNIQUE KEY).
  • Objects created by client applications - this is possibly a large part of your problem... Profiling can help you to determine if there's a lot of TempDB usage from your client applications.

OK, so now that you know what goes there - how do you make it optimal?

First and foremost, TempDB is in cache just as any other database is in cache. TempDB does not spill to disk unless you are low on cache and/or if you have a lot of inflight transactions in TempDB. Although TempDB is not persisted from shutdown to restart - it still needs to do some logging and therefore you should consider its optimization a lot like other databases.

Things you should do for TempDB (that are a lot like what you should do for every database):

  1. Isolate the data and log portion of TempDB.
  2. Place them on clean, defragmented disks.
  3. Pre-allocate them so they don't need to do a lot of autogrowth.
  4. Make sure you have sufficient memory to support active objects (check for disk activity to the disks that contain TempDB files).
  5. Make sure that transactions are written efficiently so that there are no unusually long running transactions that are unnecessarily holding resources (and therefore locks and therefore log activity).

And - if you need to move TempDB, you should review this KB Article: Moving SQL Server databases to a new location with Detach/Attach

Things you should do SPECIFICALLY for TempDB (especially if you're running on a multiproc machine):

Before I say what... let me tell you why? TempDB has a large number of objects being created all the time. For an object to be created, space must be allocated to it. Space allocation is determined by looking at some of the internal system pages (the GAM, and SGAM). In the end, it is these pages that start to have significant contention (with just one file) in a VERY active TempDB. To minimize that contention you can create multiple files.

  1. Consider creating multiple files for TempDB (even if on the same physical disks) so that there is less of a bottleneck when objects are being allocated. Make sure to read associated KB.
  2. Consider setting a trace flag to have object allocation grab extents rather than pages. Make sure to read associated KB.

BOTH of these last two are described in detail by a KB article: FIX: Concurrency enhancements for the tempdb database.

OK - so that should really help! Moving forward (meaning SQL Server 2005), having multiple files can still help for TempDB.

Scott Stanfield, a Microsoft Regional Director, had a great idea for Tech*Ed US (well, it was a great idea before he knew exactly how much work was involved :)...

He decided that he would get other Microsoft Regional Directors together to discuss (and record) interesting topics for just 10 minutes (of course, some of us went over... SORRY!). We could pick anything we wanted and then he would - with the help of a few other RDs such as Scott Golightly, Patrick "Beach Master" Hynds, Kate Gregory, and J. Michael Palermo - record our sessions to later edit and post on the GrokTalk site.

So, after A LOT OF HARD WORK - they're done! All 35 of the GrokTalks are posted and you can find them on the GrokTalk site: http://www.groktalk.net/

Finally, since I delivered a GrokTalk on Stored Procedures, I'm going to make sure that you have a comprehensive set of resources related to stored procedure optimizations here:

Well, now that should keep you busy...

Enjoy!!

Many of you in the .NET Community are already familar with theServerSide.NET but many of you in the SQL Server community are not...well, here's your chance to bridge the gap over to .NET and hear a bunch of great .NET interviews! My interview is more SQL-centric but there are lots of great ones on Development in general and even one on SQL Server 2005 Reporting Services by Jason Carlson.

My interview was recorded with them back in February when I was speaking at VSLive in San Francisco and they've been saving them up, editing them and then posting an interview roughly each week (mine is the 29th posted!). The interview was great fun and Paul Ballard asked some really interesting questions about tuning, indexes, stored procedures and SQL Server 2005. Mostly it's an interview about optimization tips in both SQL Server 2000 and 2005 and just in general - things to look forward to in SQL Server 2005.

Here's a link to all of theServerSide.NET interviews: http://www.theserverside.net/talks/index.tss 
Here's a link specifically to mine: http://www.theserverside.net/talks/videos/KimberlyTripp/interview.tss?bandwidth=dsl
Here's the link to Jason Carlson's Reporting Services interview: http://www.theserverside.net/talks/videos/JasonCarlson/interview.tss?bandwidth=dsl and finally, here's a TechNet Webcast that Jason did titled: Authoring Reports in SQL Server 2000 Reporting Services (Level 200)

Enjoy!

Well... the performance ramifications are... not good!

Without seeing more of the proc I'd have to suggest that you avoid this like the plague! I think I can convince you with this simple example which shows why it's better to have multiple procedures and limit the flexibility a bit OR use DSE = dynamic string execution (and unfortunately even recompilation issues don't help - which I talked about here)...

And - if we go down the path of DSE I'm sure that will start a separate thread as far as "sql injection" issues/attacks and security (the user executing the procedure will need permission to directly execute the statement which is in the DSE string). But - there are a few ways to protect parameters in a stored procedure. In this procedure, the parameters will not allow “injection” because of the addition of the QUOTENAME function. This function is INVALUABLE for protecting isolated input values (which is another trick: isolate parameters if you can).

In the following “multi-purpose” procedure it becomes obvious (after testing) that ONLY DSE solves the performance problem. I could still argue that more than one procedure could be beneficial especially as it wouldn't have the access requirements that this one has (remember - users would have to have permissions to the base table). BUT that's a lot more work to maintain/call, etc. Regardless, one procedure with “all purpose parameters” is definitely NOT a good choice (without DSE). (And fyi - a future release might help with some of the permissions issues - that's all I'll say for now! :)

OK - so see the ProcedureParameters.sql (4.3 KB) procedure to play with this within the CreditSampleDB.zip (55.79 KB). You will need to modify some data to get rows to return to these queries; use the updates in the .sql script.

CREATE PROC dbo.GetMemberInfoParam
(
 @Lastname varchar(30) = NULL,
 @Firstname varchar(30) = NULL,
 @member_no int = NULL
)
AS
SET NOCOUNT ON

SELECT m.*
FROM dbo.member AS m
WHERE (m.lastname LIKE @lastname OR @lastname IS NULL)
 AND (m.member_no = @member_no OR @member_no IS NULL)
 AND (m.firstname LIKE @firstname OR @firstname IS NULL)
go

This next procedure uses dynamic string execution to build the exact and needed string. You could have built this from the client as well... and that would introduce other sql-injection issues as well. But - this is a simpe idea and I hope this helps.

CREATE PROC dbo.GetMemberInfoParamDSE
(
 @Lastname varchar(30) = NULL,
 @Firstname varchar(30) = NULL,
 @member_no int = NULL
)
AS
SET NOCOUNT ON

IF @LastName IS NULL AND @FirstName IS NULL AND @Member_no IS NULL
 RAISERROR ('You must supply at least one parameter.', 16, -1)

DECLARE @ExecStr varchar(1000)
  , @MemberNoStr varchar(100)

SELECT @ExecStr = 'SELECT m.* FROM dbo.member AS m WHERE '

IF @LastName IS NOT NULL
 SELECT @Lastname = 'm.lastname LIKE ' + QUOTENAME(@lastname, '''')
IF @FirstName IS NOT NULL
 SELECT @Firstname = 'm.firstname LIKE ' + QUOTENAME(@firstname, '''')
IF @Member_no IS NOT NULL
 SELECT @MemberNoStr = 'm.member_no = ' + convert(varchar(5), @member_no)

SELECT @ExecStr = @ExecStr + ISNULL(@LastName, ' ')
 +
 CASE
  WHEN @LastName IS NOT NULL AND @FirstName IS NOT NULL
   THEN ' AND '
  ELSE ' '
 END
 +
 ISNULL(@FirstName, ' ')
 +
 CASE
  WHEN (@LastName IS NOT NULL OR @FirstName IS NOT NULL)
     AND @MemberNoStr IS NOT NULL
   THEN ' AND '
  ELSE ' '
 END
 +
 ISNULL(@MemberNoStr, ' ')

EXEC(@ExecStr)
go

The first procedure generates a plan based on the MOST selective criteria (there's more info in the script and you'll see this in the showplan/statistics io output that I describe - also in the script). The second procedure generates a good plan for each and every execution (as expected). I think this is a good example of what you can do!

I'm going to put together one more entry to bring all of these points together as well. Stay tuned!

Have fun!

After a group of RDs started chatting about stored procedures I finally felt like I could join in. Often they talk about .Net and client stuff and well, quite honestly, I don't care what you do with the data...only that I serve it up quickly. Ok, I'm really kidding here but my primary focus is backend server tuning and availability, so when the group started talking about stored procedures, I had to chime in. It all started because someone had blogged that stored procedures are not precompiled and are therefore of no use... which is COMPLETELY wrong. However, the irony is that I don't even believe that they should be compiled (and saved and therefore reused) ALL the time. In fact, a precompiled plan is also an optimized plan and that plan may or may not always be the best plan for every execution....... So, here's a bunch of stuff about sprocs. You definitely want to use them - but use them effectively!!!!
 
Benefits of stored procedures:
  • Centralized logic which can be changed with minimal client impact
  • Logic on the server so roundtrips are minimized
  • Compiled plans saved in cache. Not every plan is saved, not every plan should be saved. However, the default is that stored procedure plans are compiled and saved - on first exection. The plan is NOT created when the stored procedure is created (I've also seen this written up incorrectly before) NOR is this plan saved permanently. There are many reasons for why the procedure's plan may fall out of cache:
    • Server restart
    • Falls out of cache due to low re-use (and not enough cache to keep it around)
    • Specifically being removed from cache by:
      • Executing DBCC FREEPROCCACHE to kick ALL plans out of cache (fyi - you can see what's in cache by querying master.dbo.syscacheobects)
      • DATA on which the procedure depends changing enough to cause the statistics to be invalidated and therefore sql server proactively invalidate the plan
      • Executing an sp_recompile (on the object(s) on which the procedure depends. This last one is also nice and something I sometimes force (off hours when a lot of indexes are being rebuilt and/or added). You can execute sp_recompile tname and it will cause all plans which access this table to be invalidated.
Stored Procedures for Security:
  • Stored Procedures can be secure AND easier to manage - in terms of permissions.
  • If I am the owner of a table and I create a procedure based on my table - I can give people access to the procedure withOUT giving base table permissions. This will allow better security and minimize the accidental "oh darns" when someone forgets a where clause on a delete. Imagine a user who types in DELETE Sales WHERE invnum = 1234 BUT highlights ONLY the DELETE Sales...whose problem is that? Mine as the DBA. We can all argue that users should not have ad-hoc access to the server (and I agree) BUT stored procedures create a layer of abstraction which is ALWAYS good especially if you are using DYNAMIC SQL.
Dynamic sql has many connotations... There is "Dynamic String Execution" which can be WITHIN a stored procedure and there's Dynamic SQL which is what an application builds and sends off to the server. As for a client who sends an ad-hoc string to the server or for a dynamically built string within a stored procedure - BOTH REQUIRE that the user have the ability to execute the command directly - which means there's more room for error. However, if it doesn't need to be in a dynamic string (and many statements can be parameterized) then it could be in a stored procedure WITHOUT dynamic string exection withOUT the possible room for HUGE SQL injection problems/errors. Speaking of SQL Injection - IF you use stored procedures and mostly only allow for Identifiers as parameters (i.e. tablename, viewname column names, etc.) then you can use the QUOTENAME() function to significantly reduce SQL injection.
Caching (in General) for better Performance:
 
There are really three areas that need to be understood to really get the issues related to stored procedures:
  • Ad-hoc statement caching (that was new in SQL Server 7.0 and higher)
  • Forced statement caching (through sp_executesql)
  • Stored procedure caching (by creating stored procedures)
Ad-hoc Statement Caching
When a statement is deemed "safe" sql server will take EVEN ad-hoc statements and store a plan of execution in cache for subsequent users to use. For the plan to be re-used the statement has to be almost an identical match, the parameter has to be the EXACT same data type, the objects cannot be ambiguous and MOST statements won't really benefit from this. If you want to see an example of ad-hoc statement caching do the following:
-- 1) Clear Cache with
DBCC FREEPROCCACHE

-- 2) Look at what executable plans are in cache
SELECT sc.*
FROM master.dbo.syscacheobjects AS sc
WHERE sc.cacheobjtype = 'Executable Plan'

-- 3) execute the following statement
SELECT t.*
FROM pubs.dbo.titles AS t
WHERE t.price = 19.99

-- 4) Look again at what executable plans are in cache and you'll find that there's a
--     plan for a NUMERIC(4,2) (look at the "sql" column in output - far RIGHT)

SELECT sc.*
FROM master.dbo.syscacheobjects AS sc
WHERE sc.cacheobjtype = 'Executable Plan'
 
-- 5) If you execute the EXACT same statement with a 4,2 then you will get THAT plan
--     but if you execute with a 5,2 you'll get a new plan (the plan is not safe). Execute this:

SELECT t.*
FROM pubs.dbo.titles AS t
WHERE price = 199.99

-- 6) Look again at what executable plans are in cache...

SELECT sc.*
FROM master.dbo.syscacheobjects AS sc
WHERE sc.cacheobjtype = 'Executable Plan'
OK - so the fact that SQL Server can cache the plan is good... How often is something actually deemed safe - well, it's not that likely. A better way of getting better plan re-use IS either of the two following BUT in these two cases it's ALWAYS cached - which might not ALWAYS be good....
 
Forced Statement Caching (through sp_executesql)
This is good IF you know the plans are consistent (more on this coming up) and IF you want to type the data more strictly. Remember how SQL Server had to type the data above (numeric(4,2) or numeric(5,2), etc.). Well here you can type the parameters and force the plan to be cached. All subsequent uses will get the same plan:
DECLARE @ExecStr    nvarchar(4000)
SELECT @ExecStr = N'SELECT t.* FROM dbo.titles AS t WHERE t.price = @price'
EXEC sp_executesql @ExecStr, N'@price money', 19.99
BUT this falls into the category of being hard to know and there are better ways with stored procedures. Generally, I recommend stored procedures over statement caching...
 
Stored procedure caching (by creating stored procedures)
OK - I could go on for hours here and I'll make it sufficient to say that NOT all procedures should be cached but MANY should. The ONLY way to get the security and consistency desired from stored procedures is to know when to save them and when not to. It also requires that you typically have more smaller stored procedures rather than fewer larger procedures (i.e. modular code is a KEY component to better performance). Anyway, having said this... IF the first person who executes the procedure (and there isn't a plan for the procedure already in cache (and just to make this even more clear - stored procedure plans - when saved - are ONLY in cache they are NEVER saved to disk)) then a plan will be generated - and used for ALL subsequent users (unless a recompile is forced or occurs due to issues in third bullet in first section :). So now that there's a plan - will that plan be perfect...not necessarily. I have a great script that shows this and it's hard to explain over a short blog but the key point is this:  (I took this small section from yet another of my emails so forgive the duplication):
I would say that forced statement caching and stored procedure caching have the same problem(s) and that's that NOT all statements should be cached. There are certain strategies that can help to minimize costly overhead and there are certainly tips/tricks that we use to see if a plan should be saved or not. I think the number one thing I'd recommend is more modular procedures where the data returned SIGNIFICANTLY varies. The key reason is that smaller procedures tend to be more controllable. And - when a complex procedure is separately out then you can set that smaller more modular proc to be recompiled on each execution. That's probably the easiest way. There are certainly other things you can do to more effectively determine the cause of the problem (and that would be better) but I'd say that forcing recompilation on a smaller piece of code that generates widely varying sizes/sets of data is a good and generic solution.
So - having gotten through all of this... If you really want to play with this A LOT more below are a couple of scripts that I think are VERY useful. I use these in my lectures on Optimizing Stored Procedure performance. Regardless, this is the correct order to read/review/learn these scripts:
 
Create the Credit database (you can use CreditSampleDB.zip (55.79 KB) to create it)
Use "RecompilationIssues.sql (3.67 KB)" to get insight into bad plans being created/saved/re-used
Use "ModularProcedures.SQL (4.28 KB)" to get insight into what happens even with procedures that use conditional logic - not that there's anything wrong with that?!

Theme design by Nukeation based on Jelle Druyts