Monday, February 11, 2008
So there's all sorts of legacy behaviors in SQL Server that will keep me employable for years and years because it didn't make sense when Microsoft started shipping SQL Server and there's too many people to change it now ;).

RAND() is one of those things that behaves differently in the various commercial database vendors, and it takes some time to grok what's happening.

Let's say that I want to get a random set of rows from a table - that seems like a nice thing to do, right?

So I go write:

SELECT * FROM MyTable WHERE RAND() > 0.5

This seems like a great first attempt to get about half the rows back, assuming a good pseudo-random number implementation.  However, you'll notice that if you run this, the behavior isn't quite what might be expected (at least the way I've described it in this example).

So when one executes this query, you'll notice that sometimes you get all rows from your table and other times you'll get none.  But wait - functions are run per-row, right?  Well, no, at least not in all cases.  The complete details are beyond a single blog posting, but for largely historical reasons the RAND() function is actually executed per query, not per row.  That means that, logically, you get the query from the plan cache, get it ready to run, and then run RAND() once, caching the value.  Then, each and every row uses that cached value when evaluating any scalar logic. 

So, for this example, either that initial call to RAND() returned a value greater than .5 or it didn't.  As such, you get all or no rows only.

SQL Server added TABLESAMPLE as a mechanism to address part of this need - retreiving a sample from a table.  However, it isn't arbitrarily composable, and the semantics used in the default system implementation in SQL 2005 are actually intended to be used for statistics generation - a slightly different goal than quality sampling.

But wait - a lot of functions DO execute per row.  How can I tell which is which?  Well, I am not sure that there is a publicly exposed way, other than trial and error, to determine this.  I need to go do some more homework, as my memory is, well, random.

Even worse, some function computations are moved around in query trees, and so the actual number of executions may not be what you think as "per row".  For example, if you execute a filter before/after the scalar logic, then the results may be different.  Put another way, if you use an index seek vs. a table scan, how many times do you expect any old function to be executed?

It's enough to make your head spin! 

The basic rules are that, as long as the optimizer knows about the semantic issues, it will try very hard to not move around functions who have different semantics when executed different numbers of times.

So, be careful with your functions, built-in, user-defined, or whatever exotic form gets dreamed up next.  There's a lot going on under the covers, and knowing a few details can help you more deeply understand what a query is doing.


Thanks,

Conor Cunningham



Monday, February 11, 2008 9:02:29 PM (Central Standard Time, UTC-06:00)  #    Comments [2]  | 
Sunday, February 10, 2008
I previously blogged about the sequence project operator, which I helped architect/develop while working on the SQL Server QP (here).  In that same post, I promised to tell you about the Segment operator in more detail as well.

Segment is largely a "helper" operator in SQL Server.  It does very little and there isn't a direct way to create one.  They are often used to help facilitate more complex operations.

Every operator in SQL Server's QP takes rows of input and produces rows of output.  Some have multiple children (joins), while others are just unary.  Segment is a unary operator that takes rows of input and determines if some set of columns have changed from the previous row.  Usually, the input on those columns are ordered or grouped to make this meaningful.  An extra column is created, like in a compute scalar operation, which records if the value has changed.  That computed value is then used to drive other physical operators.

You can see what columns are considered in each segment by looking at showplan_xml (from the ranking query in my previous post):

                <RelOp NodeId="1" PhysicalOp="Segment" LogicalOp="Segment" EstimateRows="1000" EstimateIO="0" EstimateCPU="2e-005" AvgRowSize="23" EstimatedTotalSubtreeCost="0.0328114" Parallel="0" EstimateRebinds="0" EstimateRewinds="0">
                  <OutputList>
                    <ColumnReference Database="[t1]" Schema="[dbo]" Table="[z]" Column="col1" />
                    <ColumnReference Database="[t1]" Schema="[dbo]" Table="[z]" Column="col2" />
                    <ColumnReference Column="Segment1005" />
                    <ColumnReference Column="Segment1006" />
                  </OutputList>
                  <Segment>
                    <GroupBy>
                      <ColumnReference Database="[t1]" Schema="[dbo]" Table="[z]" Column="col2" />
                    </GroupBy>
                    <SegmentColumn>
                      <ColumnReference Column="Segment1006" />
                    </SegmentColumn>


The grouping columns are listed, and you can see these "Segment100x" columns that get fabricated along the way.  These are not surfaced outside the query, so you can't see the values of this directly.  However, you can see if the grouping columns have changed.

So, in this plan we have two segment columns on different grouping column sets, both of these passed into the sequence project (you can only actually see that they stop being propogated in the showplan_xml, but the QP is usually smart enough to not pass things along unnecessarily, so that usually means that the buck stops where it stops for a reason :).

Remember, all of the operators can change from release to release.  I don't believe any of this changed substantally from 2005 to 2008 based on what I can see in the plans.

That concludes our daily episode of the operator of the day.  You guys have a good weekend.  I am enjoying warmth and sunshine in Austin in February, which I can tell you is a substantial change from my life while in Seattle.


Sunday, February 10, 2008 11:46:38 AM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Thursday, February 07, 2008
I received a comment asking about this, so I'll write a short note on the subject.

The QP will take and expand all *'s early in the pipeline and bind them to objects (in this case, the list of columns).  It will then remove unneeded columns due to the nature of the query. 

So for a simple EXISTS subquery like this:

SELECT col1 FROM MyTable WHERE EXISTS (SELECT * FROM Table2 WHERE MyTable.col1=Table2.col2)
The * will be expanded to some potentially big column list and then it will be determined that the semantics of the EXISTS does not require any of those columns, so basically all of them can be removed.

"SELECT 1" will avoid having to examine any unneeded metadata for that table during query compilation.

However, at runtime the two forms of the query will be identical and will have identical runtimes.

As such, I typically use SELECT 1.  In my blog post I didn't.

Conor

Thursday, February 07, 2008 8:01:46 AM (Central Standard Time, UTC-06:00)  #    Comments [6]  | 
Wednesday, February 06, 2008
Had I known the number of responses I would have gotten about running SQL Server in virtual environments, I would have spent more time writing a questionnaire! 

I guess I'm also pleasantly surprised that so many of you read this blog - you must all be smarter and better looking than the average blog reader, for sure ;).

ok, I learned a few things of interest and I will sum them up here.  Then I'll write something more technical where you might learn a few things from me to help you in your day-to-day.

1. A number of you run SQL in virtualized environments.  VMWare more than VirtualPC (to be expected, given the relative state of the two products).  Perhaps the mix will change after Windows 2008 becomes available.

2. More than a few of you run production SQL Server systems on virtualized hardware.  I'm still asking a few clarifying questions, but it appears that this is not limited to tiny systems or lower-volume systems.

3. Within these production systems cases, the most common benefits related to manageability.  Specifically, moving the VM to a new machine when an old one died or needed to be reconfigured.  This makes sense.

4. A number of people run single-user systems to do development, try out some new configuration, etc.  Often you don't care about the system performance in this case.

5. For the group in between the single-user system and the deployed systems, there is frustration about VM configuration.  Typically the overhead is high unless you invest in a beefy storage solution (for example, a RAID-1 or RAID-5 array directly attached ).  Configuring storage and memory properly for the VM is critical for good performance.

I may return to this question in the future with a more systematic survey.  I didn't even ask you if you were running ESX vs. the non-ESX versions of VMWare or whether your CPUs have the VT capability in them!  Thanks for all who shared data with me - I learned a few things and have more to ponder.

I've been playing with VMWare more than average recently, and I've had some fun and some pain.  My experiences are not dissimilar - it can be good for manageability, but it can also be slower. 


OK now for the bit where I tell you a few things you might not know ;).

You probably know that MS isn't particularly interested in you doing this, and there's been press on the licensing (which SKUs you are do virtualization on, etc) around this for a bit.  MS is working hard on its own enterprise-class virtualization story, and I'm sure it will be at least reasonable. 


Within the server itself, there is suprisingly little knowledge of a lot of things in this area that are important to performance.  SQL Server's core engine assumes things like:

1. all CPUs are equally powerful
2. all CPUs process instructions at about the same rate.
3. a flush to disk should probably happen in a bounded amount of time.

These are simplifying assumptions that are often completely untrue, but SQL Server doesn't want to get into the business of forcing you to configure a bunch of numbers in order to use the product (or pay a high-priced consultant, for that matter - at least not to get started ;). 

Virtualization, from the system implemenation perspective, has a similar impact to processor hyperthreading.  Hyperthreading, for those who haven't been reading Intel's website much lately, allows a superscalar (multi-pipeline) CPU to schedule mutiple threads of work at the same time as long as they don't require the same pipelines.  This sounds awesome, right?  You can increase your instructions per cycle on the same hardware with only minimal extra transitors of scheduling overhead.  Well, this works great as long as you build an OS from the ground up that exposes threads as its primary primitive instead of processes.  Then you can just have a pile of threads that execute all over the place, usually without much care about how fast they are running, and you end up with everything running faster.  When Windows consumed this, it wasn't so simple.  The scheduler schedules threads to CPUs, and you can't have a part of a CPU.  So, we have logical and physical CPUs.  To most applications, they see 1 CPU per allowed thread.  this makes everything work but it hides the fact that these things are not in any way the same unit of processing power.  So some threads get assigned to fast CPUs and some get assigned to slow CPUs.  Sometimes, you get assigned back and forth.  ack!

Now let's throw SQL Server on top of this.  It's running lots of threads in one process.  They get scheduled essentially randomly across this matrix of real and fake CPUs.  Some user applications are one thread per user, and they work fine (OLTP applications).  In fact, some of them work a bit faster (10-20%) because there are "more" CPUs.  This is great, since you pay by the socket for Microsoft software (unlike some of those other guys ;). 

Data warehouse applications, however, are a much, much different beast.  A good deal of the benefit of buying the top-end SKU is to get parallel queries.  This speeds up queries by running one query on multiple threads.  It trades CPUs for runtime performance improvements.  The QP has lots of smarts to try to split up things into equal chunks of work and to then run each of those chunks on a different thread on a different CPU.  Usually the pattern for something like an index build or a hash join would be:

1. get everything onto different threads
2. do some work
3. wait for all those threads to complete
4. do something else in the query

So now I have some threads that finish earlier than others.  So they block until the slowest threads finish.  Even worse, I don't think that the query re-allocates those threads for other queries until the whole query finishes.  So, now you have some background as to why hyperthreading was not recommended for at least some SQL Server deployments.

Virtualization has some of the same issues.  Things don't proceed at the same rate on each CPU, assuming you define more than one CPU for the virtual machine.  When your physical machine is under a high load from other VMs, you get less CPU, memory, IO, etc.  For occasional applications, all of this is fine - you can trade this speed for consolodation, managebility, etc.  Occasionally things are slower.  That's fine.  For high-volume applications, this is usually not as good.

Other issues with Virtualization - memory is a big one.  SQL Server assumes, at least in the main server SKUs, that it is the only significant memory consumer on the machine.  It's a *server*.  (SQLExpress has different assumptions, but it's no memory slouch either).  Now, SQL Server will work in a memory constrained environment, but you often don't want to do that.  You take that away from a lot of different things - the buffer pool, the compiled plan cache, memory to execute queries (for example, hash join grants).  All of these things can add up if you aren't careful.

I/O is the area where I have the least experience in virtualization.  This is one of the reasons I asked people about production SQL Servers.  Usually they did get some storage array, and this makes sense - it ramps the I/O bandwidth and usually isolates it from any other operations on the machine (your OS, your application you are developing on top of SQL Server, etc).  I'm going to spend some more time on this, but I think the core idea is sound - as you start sharing your I/O bandwidth over several VMs, you are going to hit limits earlier with big IO consumers like SQL Server.  The same basic logic applies - isolate your database traffic onto different storage paths, especially when building a system to scale.  In a VM world, this can let you avoid the sharing penalties vs. the default config of everyone sharing the same hard drive.

Had you asked me while I worked at MS about a query performance problem on virtualized hardware (even from the MS product), the answer would have been "don't do that".  I think eventually there will be enough push for some other answer than that.  (Reminder - I don't work there anymore, and my opinion is purely my own at this point).


I hope that makes this a fair trade :)

I'm off to bed.  See ya'll.

Conor

Wednesday, February 06, 2008 11:13:08 PM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Tuesday, February 05, 2008
I'd like to do a post or two on running SQL Server 2005 within a virtual machine.  Before I get into all sorts of fun details on this, I'd like to ask you guys to write in and tell me about what whether you are doing this or not.

It's ok, I don't get paid by the man anymore, you can tell me ;).  Even if they did still pay me, I'm curious. 

So, who's got a SQL 2005 Server running in a virtual pc?  in an VMWare ESX server?  How's that working out for ya?

Email me at conor@sqlskills.com!  I want to hear from you!

Who was smart enough to sell their VMWare stock on the way up but not smart enough to sell their EMC stock? (that would be me).


Thanks,

Conor
Tuesday, February 05, 2008 9:57:21 PM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Saturday, February 02, 2008
In my previous post, I talked about the basics of partition elimination and mentioned some of the challenges.  SQL 2005 does contain logic to do partition elimination, but it is imperfect because some of the interactions with other features in the optimizer didn't always do partition elimination when customers wanted/expected it to do so.  SQL 2008 uses a somewhat different internal notation for partitioning, and it appears (from what I have seen so far) to do a better job at some of the interactions that were troublesome for 2005.  This should make partitioning work much more seamlessly in the upcoming release.

This example was motivated by a reader of the blog (thanks!) who was using a partitioned fact table for sales data and joining to another table to roll his own row-level security.  I've seen a number of customers do row-level security this way, so it's actually not a bad example for how things can break down in partition elimination in 2005 vs. 2008.  Here is the script we'll use for the example:
create partition function pf2(int) as range left for values (10, 20, 30)
create partition scheme ps2 as partition pf2 ALL to ([primary])

create table usersecuritybysite(username nvarchar(100), site int)
insert into usersecuritybysite(username, site) values ('dbo', 1)
insert into usersecuritybysite(username, site) values ('dbo', 2)

create table factsales(site int, invoice int, itemid int, col3 binary(2000)) ON ps2(site)
insert into factsales (site, invoice, itemid) values (1, 1, 1)
insert into factsales (site, invoice, itemid) values (2, 1, 1)
insert into factsales (site, invoice, itemid) values (3, 1, 1)
insert into factsales (site, invoice, itemid) values (4, 1, 1)

This obviously has far less data than an actual fact table, but I'll show you the plans that are interesting from this example and they should translate to an evaluation of a plan on a full-scale system.

So, we have two tables, both heaps.  The fact table is partitioned, and we will always join with an unpartitioned (but likely smaller) table when querying the fact table.  SQL 2005 generates the following plan:




So the optimizer generates a plan to build a hash table from usersecuritybysite and then probe it with every row from the partitioned table.  That's actually not a horrible plan, but as the cardinality of the fact table increases, this will get worse because we're scanning all partitions.  If we look at the statistics profile, we can see that all 4 partitions are scanned once in this plan.  Ideally, we'd like it to use the information from the usersecuritybysite table to eliminate partitions "almost always".  That would be safer.  (Nothing in optimization is simple, and there are actually more plans I'd consider here, but I'd like to focus on the point of not touching useless partitions in this post - in this case, I actually do want to avoid touching unneeded partitions as often as possible).



The constant scan with values 1,2,3,4 is where the partition information is encoded in SQL 2005.

Each partition has roughly the startup cost of a table, so think of this as opening 4 tables.  Databases are usually optimized so that reading lots of rows is the fastest thing - opening tables won't be as fast as opening rows, and when push comes to shove one will optimize row reads at the expense of table opens every time.

Let's look at the SQL 2008 plan:


Hey, what happened?  Well, the internal plan representation in SQL 2005 was pretty good but not perfect.  I think it's fair to say that it does a better job than partitioned views for local cases, but it wasn't as good as the non-partitioned tables in a few cases.  Many of the improvements that were driven in 2005 led to lessons that were incorporated in 2008's optimizer support.  Specifically, compilation time is higher in 2005 for partitioned tables because there are more operators in the tree.  Also, the introduction of these extra operators happened to make "trivial" plans no longer "trivial" (because partitioning is represented as a join in 2005, and joins mean cost-based choices, and therefore it is not "trivial" to choose a plan).  Finally, some of the plan choices that happen in the non-partitioned cases from years and years of tweaking queries didn't quite fit with partitioning because of the representation.  For example, index selection didn't know as much about partition selection as it needed to have to make perfect plan choices. 

So, 2008 changes things largely for the better.  The stats profile for this plan in 2008 is here:



So you'll notice a few things.  First, the constant scan is gone, and this plan looks like the non-partitioned plan.  Hey, is that a seek?  What's up with that?  This is a heap!!! Well, the optimizer now has some smarts to treat partitioning as a special leading index key, and even heaps can have them.  So, now the index selection code can do most of the partitioning applications better and avoid some of those seams I mentioned earlier.  It's harder to tell how many partitions got pruned, but if you look at the text more closely:

       |--Table Scan(OBJECT:([t1].[dbo].[factsales] AS [fs]), SEEK:([PtnId1004]=RangePartitionNew([t1].[dbo].[usersecuritybysite].[site] as [us].[site],(0),(10),(20),(30))),  WHERE:([t1].[dbo].[factsales].[site] as [fs].[site]=[t1].[dbo].[usersecuritybysite].[site] as [us].[site]) ORDERED FORWARD)

You can see that the optimizer is running some partition function on the usersecuritybysite.site column as part of the SEEK.  I'm asking my friends if this is the only indication I can see beyond looking down at the lock manager, but this is what I can see right now.

To be clear, 2005 also has the ability to partition pruning.  I'm not sure how many people store their fact tables in heaps and how often it happens to not prune.   If we create a clustered index on the fact table in 2005, you get a better plan:




This isn't quite as perfect as the 2008 plan, but it does avoid opening unneeded partitions.  In this case, the security table is scanned and joined with the list of partitions run against the partition function.  This generates the right partition to open for each security table row.  It happens to work great for this case because the table is being used as a "gate" to let the user see the partition or not.  Things get a bit more complex when the non-partitioned tables has duplicates or many rows from the fact table can be matched (you might want to pre-sort your partition requests and use a different plan in that case to avoid re-opening partitions).

So, the bottom line is that this stuff is complicated and it's a possible to get an imperfect plan at times.  You can use this technique to see if your 2005 plan is eliminating all the partitions you would expect.  If you end up with a plan that doesn't prune properly, you can do things like split the query or use a hint to force the optimizer to do what you want.  I'd recommend that you know what you are doing before going down either path.  Everyone has different levels of experience on this stuff, and it can be tricky to fully understand all of the interactions. I hope I've given you enough insight in this post to see whether you have a problem. 

Also, please be careful using statistics profile on a production machine,  and be sure to turn it off when you are done - it slows down queries, and nothing makes Conor angrier than a needlessly slow query ;).

Have a good weekend, y'all!

Conor
Saturday, February 02, 2008 3:05:08 PM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Thursday, January 31, 2008
Horizontal Table and Index Partitioning, where different rows in the same logical table are split into different physical tables, is an essential data warehousing tool.  It allows large tables to be split into multiple physical heaps or b-trees based on a key in the data.  This often makes big tables more manageable because many operations take longer as a function of the size of the table.  Splitting those tables can allow a database administrator to take the cost of an index rebuild or statistics update in pieces instead of all at once.

Partitioned tables have somewhat increased overhead because there's all these extra physical structures to handle.  They often also have different optimization characteristics.  The optimizer can often determine that whole partitions are not needed for a particular query, and it then applies
logic to eliminate those partitions from the query plan entirely.  This is conceptually similar to index matching, and it provides an order-of-magnitude performance improvement when it works.

Partition elimination relies on a fairly complex and expensive part of the optimizer that reasons about the valid domains for each column participating in a query.  For example, if you have a query like:
SELECT * FROM A INNER JOIN B on A.a=B.b where A.a < 20 and B.b > 50;




If you stare at this while touching your nose, you may eventually figure out that no rows should be returned from such a query because logically the conditions introduce a contradiction.  In this case, the query will be simplified during optimization and you'll see a plan like this:

I will cover constant tables in a future "operator of the day" post, but you could imagine that not much will be returned from this query once this simplification has occurred.

Partition elimination, at least in SQL Server 2005, is conceptually similar.  Each access method (heap, clustered index, secondary index) that contains a partitioning is tracked and considered against the valid ranges for that query.  If it is determined that one or more partitions is not needed, these are pruned from the final plan and your query just got a lot faster.

It would be nice if that were the end of it, but it gets far more complicated.  Let's say that you write the query like this:

SELECT * FROM MyTable WHERE ptncol=?

Well, I can reason that the query will touch one and only one partition in this case.  Dynamic partition pruning deals with a number of cases like this, and boy was that fun to implement. 

Once you get your head wrapped around dynamic partition pruning, you then start to worry about a range of other nasty, nasty problems that are only somewhat visible via the output plan.  The SQL 2005 query tree representation for partitioned tables unfortunately is not orthogonal with how parallelism is applied to the query tree, so there are cases when you can't get parallel plans against partitioned tables in a plan.  Additionally, the index matching and access path selection code also has to deal with backwards compatiblity when selecting the access path, and there are lots of details about costing and index selection that are very core to compilation time performance and plan quality.  Finally, there are some operators in the query tree that can't/don't flow information about the algebraic logic in enough detail to prune partitions.

While some of the details of this are beyond what is visible in the output plans (and therefore I won't be talking about it), I hope you can at least accept that it's a complex problem to make it all work perfectly.  SQL 2005 did get a lot of the pieces working, and I think everyone would say that it's a big, big step forward compared to (distributed) partitioned views from SQL 2000. 

SQL 2008 contains some additional changes to partitioning to continue to integrate it tightly into the rest of the query optimizer, and I'll do a post on the external query representation differences between SQL 2005 and SQL 2008.  I'll also be doing a post or two on some specific pruning issues that real customers have seen in SQL 2005.

Bedtime for me - happy querying!

Conor


Thursday, January 31, 2008 11:33:00 PM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Wednesday, January 30, 2008
So one of the features I had a hand in building in SQL Server 2005 is really useful for helping SQL Server work well in large data warehouse deployments.

I led a team writing the paper that was published about this, and you can read it in my previous blog post here.  The paper covers the theory but not the implementation.

The basic idea is that SQL Server has a lot of competing memory consumers within a single process space.  Often the naive approach (block incoming requests until there is enough memory) will fail under extreme circumstances because there is even incoming requests take some amount of memory, and you can starve on memory waiting for memory.  Additionally, many of the components are built to run best with fewer queries running in the system - for example, hash joins run a lot faster when they do not spill to disk.  When you add it all up, you have a bunch of different memory consumers with a bunch of different optimal policies to achieve optimal performance.

For the query optimizer, we tried a number of different approaches in the naive space before we eventually settled on a tiered set of monitors that block based on how much memory an in-process memory optimization has consumed compared to the free memory in the system.  This meant that the system would not block any trivially small query from compiling, allow a certain number of normal (OLTP) queries to compile, allow a fewer number of medium (think basic DSS) queries to compile, and have one final bucket to allow for the largest query in human history to compile.  (Note that I am not saying "execute" - there are a different set of queues for those queries).

What we found was that, as SQL Server started to be used in larger and larger deployments, the optimizer would consume more memory in those situations where large DW queries were coming into the system.  This led to out of memory errors thrown during compilation or execution.  The cost of restarting these queries was high (often the plan wasn't in the cache yet or was thrown out of the cache quickly due to memory pressure), and the system would just go into a memory-thrashing fit.

So how do I tell if my system is in a memory-thrashing fit due to compilation-overload?  Well, there is a command called DBCC MEMORYSTATUS that will tell you a bit about where you are.  If you scroll through the various rowsets returned, you'll eventually find one called "Optimization Queue".  This reflects the total memory available for all optimizations in the system, and the SQL OS layer controls this.  If lots of other things need memory, this will get clamped down.  The next 3 should be familiar by now - they are the various gateways to block queries of various sizes.  The "Threshold factor" is the amount of memory at which the optimizer will start trying to block that query from continuing until other queries in that bucket have completed compilation.

The "(default)" postfix is for normal user queries.  The "(internal)" queues are for internal queries.  There aren't a ton of internal queries.

Here's a screenshot of this in action:



Now, last I checked I didn't have a large-scale data warehouse running on my desktop, so these numbers are pretty tame right now.  You can run this command on a production system to see where you stand - it is intended to run quickly for debugging purposes on running systems, actually.

Enjoy!

Conor Cunningham
Wednesday, January 30, 2008 9:03:13 AM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Tuesday, January 29, 2008
One of the reasons I'm posting up content like this is to help all of you learn a bit of what I have bouncing around my head.  I'm hoping that I'll get a few more people writing some awesome data-driven applications.

So, I'd like to get some help from you - I'd like to hear from you.  I need some stories and/or questions about any data warehouses you've been using.  I love hearing these stories, especially about the sorts of things that you had to do to get them to work well with SQL Server.

I can't promise that I'll blog on every single thing you send me, but I will read them all and post up some interesting information I have to help you out where possible.

So, if you've hit problems with memory limits compiling or running queries on a data warehouse, I want to hear about it.  If you've got some nasty, nasty query plans, please send me a picture of your schema and the statistics profile for a query and we can talk about what isn't working well.  Send me a stats problem where the optimizer isn't cutting it.

So, please mail in to conor@sqlskills.com.  I'm happy to keep necessary details confidential, if desired.  All of this will help me focus my energies on the topics that will help you the most.  Otherwise you guys get to listen to me ramble on about whatever I want :)

Thanks,

Conor

Tuesday, January 29, 2008 8:17:30 PM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
So I'll be posting explanations for each query operator in the output query trees that you can see through "set showplan_text on" in SQL Server.  I'm hopeful that this will give you the tools to better read the query plans being generated by the QP when evaluating your system.

So I'll start with one that I added into the optimizer - the "sequence project".

create table z(col1 int identity, col2 int default rand()*10000)

declare @i int 
set @i=0
while @i < 1000
begin
insert into z default values
set @i=@i+1
end

select RANK() over(order by col2), * from z



If you look at the graphical showplan you get an output plan that looks a bit funny.  A Sequence Project is a special form of projection (the thing that evaluates the scalars in the SELECT list for each row).  By default, a regular project in SQL Server does not preserve nor guarantee any sort properties in the physical plan.  As a result, ranking functions put in regular projects would not preserve order in all cases deterministically.  People seem to complain when the query results are incorrect.

Enter the sequence project....This little beauty does do scalar computations, but they are generally limited to a very small number of operations that require order for one reason or another.  In the case of ranking functions, an order is required as part of the syntax, and you will see that this particular plan needs a sort since there are no indexes on the base table to satisfy a sort request on col2.  (I'll talk about segments in a future post, but for now ignore them and assume that they are part of the sequence project when looking at this plan).

The problem with sequence projects are that while they preserve ordering semantics, they interfere with a lot of the other optimization logic in the QP.  For example, these can interfere with index matching for predicates above the sequence project in a query tree.  Operators that block other optimizations tend to make optimizer developers very unhappy and sometimes elicit violent reactions!  So, be careful when proposing "opaque" operators to the next query optimization developer you see ;).

So, SQL Server has a small number of operations that in previous releases of the product were "incorrectly" put into regular projects that are now in sequence projects in the current versions of the product.  (By "incorrectly", I mean that there were potential optimizations that could be run that would break their semantics, not that there were actively executing plans at a customer site that were incorrect.  In optimizer terminology, they model was not correctly preserving the semantics even though there were no rules that exposed this fault).

So, the challenge for you is think of some places _beyond sequence functions_ (RANK, DENSE_RANK, NTILE, ROW_NUMBER, ...) that use the sequence projection operator.  If you mail them to me with plans, I may be able to tell you more about what is happening for each of them.

Happy hunting!

Conor Cunningham
Tuesday, January 29, 2008 8:06:04 PM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Sunday, January 27, 2008
For a number of years, the marketing for SQL Server's competitors made a large number of claims that SQL Server couldn't handle "Enterprise"-scale deployments.  Over time, those claims became much quieter as Microsoft used SQL Server extensively to run their whole business, through multi-terrabyte reference cases, and the hugely successful TerraServer project where Jim Gray led a team trying to build a huge database of satellite images using SQL Server.

This doesn't mean that there aren't challenges for all of the major vendors in adding features to help their products scale.  Each of them have some failings in some of their product offerings, so please don't interpret this as a "Microsoft SQL Server can't do the job" post.  On the contrary, I think that there is little that it can't do, but I know where the limitations get hit and I have some expertise in working around the current limitations.  Each Data Warehouse problem is a bit different, so the kinds of problems I'll discuss here may or may not happen for any given deployment.

In my previous post, I discussed how subqueries are approached by the query optimizer (as joins), and this can lead to problems when the cardinality or cost estimates are incorrect.  Often when an estimate is incorrect, "unrolling" the subquery by hand and creating a temporary table for a portion of the query will correct the mistake.  Splitting a larger query like this makes the operation more procedural and makes each piece smaller.  It also happens to give the optimizer better cardinality estimates for the portion of the query that has been unrolled - these better estimates may help the rest of the query perform more to your expectations.  This all assumes that:
1. You have the skills and desire to unroll a subquery
2. You know how to unroll it to give the optimizer information to help avoid cardinality estimate errors
3. That the query is known a priori.  Dynamically-generated queries are harder to hint

The best way to track down estimation errors is to find some time when you can run a problematic query with "set statistics profile on" enabled.  While this does make the query run more slowly, it captures useful information to help identify areas of a query where the estimated and actual cardinality are materially incorrect.  I've blogged about this previously (in previous blogs, not this one), so I'll do a post linking some of the useful backgrounders on this area.

Data Warehouses are usually the largest of the large in terms of SQL Server deployments, and usually they have a dedicated DBA to help keep it working.  This is partially because any business problem worth hundreds of thousands of dollars in hardware is also probably worth the salary of a DBA to keep it working.  However, the other side of this coin is that there aren't enough data warehouses in the world to make it truly a commodity software purchase.  As such, lots of little things can and do go wrong in data warehouse deployments that often require expertise to fix.

Let's talk about some of the reasons why the query optimizer may make incorrect guesses on data warehouse tables, starting with some of the easier ones

1. missing statistics.  I've seen a bunch of customers disable auto-create statistics because "someone told them to do so" or "I don't trust those things".  Unless you are going to hand-tune every query, you should likely leave auto-create enabled.

2. out-of-date statistics.  This is a bit trickier.  Column statistics are updated when a certain number of changes are made, and sometimes that hasn't been triggered.  You can manually update statistics in those cases.  For customers who disable auto-update statistics (<2%, but this happens a lot on the higher-end deployments), you need to have a schedule to update those stats so that the optimizer has a fighting chance.

3. a statistics sample that does not reflect the whole data set.  There are a number of assumptions made during statistics creation that are more pragmatic than mathematically proven.  To make auto-stats possible, only a small percentage of the pages are actually used to create statistics for non-indexed columns.  If those pages don't cover the whole data range well, queries over the missing ranges could get poor estimates.

4. Fundamental limits in the statistics object.  Currently there can be up to 200 steps in the histogram that is created.  If your table's data has less than 200 interesting spikes in the distribution curve, then you are probably fine.  However, your fact table may benefit from more.  Unfortunately, this can be a problem even with full stats enabled in SQL 2005 and, from what I have seen so far, in the CTPs of SQL 2008.

5. model limits in the optimizer.  Optimizers make assumptions so that they can compile a query in a reasonable amount of time.  For example, an optimizer may assume that columns are, in general, statistically independent of each other for the purposes of cardinality estimation because it's the only computationally feasible assumption to make.  This assumption is very often not true, and it can introduce errors in cardinality estimation that can lead to poor plan selection.  As databases get larger, these "out of model" errors can get larger and require more hand-holding to make sure that a given query gets a good plan from the optimizer.


I hope this gives some insight into how cardinality estimation impacts join optimization and thus subquery optimization in SQL Server 200x.

Thanks,

Conor

Sunday, January 27, 2008 2:12:04 PM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
I've been trying to think of a good way to introduce how SQL Server does subqueries based on a question from a reader.  One of the tricky pieces to query optimization is that the "language" used internally in the optimizer is actually somewhat more expressive than SQL.  The second part of the problem is that, for people who have used other products at one time or another, it's a challenge to get them to re-learn something to grok how SQL Server does it (and even to figure out what they need to re-learn).

So I'm going to do a series of posts on subqueries and the related pieces behind subqueries to put them in context.

Let's start at the beginning.  A subquery is a fancy filter condition where you can specify an query to run for each row of the outer query.  This doesn't mean that's how the optimizer will actually execute it, but you can start thinking about it like that:

SELECT * FROM MyTable WHERE productkey IN (SELECT * FROM Orders where MyTable.productkey=Orders.productkey)


One way to execute this query is to read each row from MyTable, get productkey, start up another query to look through orders, and determine if we want to keep the row from MyTable based on the result of that second query.  This is how some of the other QP's on which I have worked attack subqueries and is perfectly valid.

In the internal lanaguage of the optimizer (and you can read this in output plans), you will typically see that this is executed physically in SQL Server using a "semi-join", which you should imagine as a join operator that does what we've described within one single query instead of running an extra query per row.  A semi-join will return rows from only one side of the join (left or right semi join).  These semijoins are nice because you can actually re-use your physical join implementation to execute them.  They are also nice because you can also leverage your tree rewrite logic and cardinality estimation logic to help you think about these like joins as well.  So, you can reorder and rewrite the query into alternative, equivalent forms that might run more quickly.

The nice part about this approach is that you get the benefit of a lot of work and experience estimating cardinality for joins, which helps the SQL Server Optimizer make decisions about these rewrites.  The bad part about this is that in queries with many joins or queries with lots and lots of data, your odds of getting a mistake in join cardinality/cost estimation go up and up, and your opportunity to get a plan that is far from optimal also go up.

My next post will cover some of these things in more detail - off to get my kid up.

Conor
Sunday, January 27, 2008 7:59:43 AM (Central Standard Time, UTC-06:00)  #    Comments [3]  | 
Tuesday, January 22, 2008
This post builds upon my previous post on some of the various kinds of plans for the IN operator.  I will cover some of the query plan cache behavior associated with IN plans.

For this post we'll be using the following query which will let you look at the text of plans in the cache and also see how many plans are in the cache.  (One can use "dbcc freeproccache" to clean out the cache whenever you are testing this - please don't do this on a production system unless you know what you are doing)

(Hey, I figured out how to post code properly in the blog.  and there was much rejoicing! yeah!)
SELECT text, plan_handle, * FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(sql_handle)
CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle, qs.statement_start_offset, qs.statement_end_offset) AS qp

OK, the two things of interest are the number of rows returned and the text of each plan.  I will note that the data in these sys.dm_exec* functions is "mostly up to date", but it doesn't follow an exact transactional semantic - so, you may need to wait a second or two to get the "proper" state of the plan cache after each step.

If you run dbcc freeproccache, then this query, you'll see one entry in the result (for the query you ran to get the status).  If we run query 1 from our previous post (select * from t1 where col2 in (1)) and look at the result, we get something interesting:
(@1 tinyint)SELECT * FROM [t1] WHERE [col2]=@1

Hey, what's this?  Well, this is called "auto-parameterization" in SQL Server, and it happens on Trivial Plans (see previous post).  The idea is that SQL Server has a fancy query processor, but fancy means powerful for big queries and perhaps not the fastest on the small stuff.  In general, we're not saying that it's slow, but if you are compiling thousands of these things right in a row, you may notice a difference.  So, this autoparameterization feature is an attempt to reduce compilation time by finding plans that have no cost-based decisions and then templatize-ing them so that all other queries that look the same can re-use the plan.  This has a huge advantage for customers with query-generation components, as they often will generate many, many queries that only differ by the constants.  So, SQL Server uses this on trivial plans.

If you run the query "select * from t1 where col2 in (2)" (so, the constant is changed but nothing else), then run our big fancy cross apply query, then you see 2 query plans instead of 3.  Both of our new queries use the following plan:

(@1 tinyint)SELECT * FROM [t1] WHERE [col2]=@1

Now, running queries with more than one element in the IN list generates a plan for each of them.  (I don't recall exactly why off-hand, but I'll think about it some more)

So, you can see the plan entries for a range of variations.

select  * from t1 where col2 in (2,1)
select * from t1 where col2 in (1,2,3,4,5,6,7,8,9,10)
(@1 tinyint)SELECT * FROM [t1] WHERE [col2]=@1
select * from t1 where col2 in (2,1)
select * from t1 where col2 in (1,2)
select * from t1 where col2 =1 or col2=2

I'll point out that, except for trivial plans that are auto-parameterized, the plan text is used as the "key" into this cache.  It does not logically compare the output plans or even a normalized form of the input.  So, you'll notice that "
select  * from t1 where col2 in (2,1)" is in the cache twice because I put a space in the second time I ran that query.  So, you need to consider this when we get to the original question about plan cache pollution.  If your query generation engine (LINQ or whatever) adds comments into the query text sent to SQL Server, that could very well lead to more plans in the cache.

The more complex queries from my last post also generate unique plan entries in the same way, so I'll skip those here.

The next interesting factor to consider is forced parameterization.  This was added in SQL Server 2005, and it is both powerful and dangerous.  This feature is a database level feature, most literals are converted to paramters before query optimization in most query plans.  This is covered under the "Forced Parameterization" topic in SQL Server books online, so I'll suggest you read that if you want all the details.  I did a post or two on plan sensitivity to different parameter values, so beware that in some cases you may end up with query plans that are suboptimal and are stuck in the cache.  So, you should experiment with this setting to make sure that it really does benefit your application before you turn it on.

As I like to live dangerously, I will ignore all of these warnings and proceed with great vigor in the name of Science!

Let's try three queries that yielded separate plans without forced parameterization enabled:
select * from t1 where col2 in (2,1)
select * from t1 where col2 in (1,2)
select  * from t1 where col2 in (2,1)
Now we get one entry in the cache, and it looks similar to those auto-parameterized queries:
(@0 int,@1 int)select * from t1 where col2 in ( @0 , @1 )

Notice that even spaces are "fixed" in terms of cache hits in this mode.  The query is parsed and then the text is generated from the parsed tree, which gives one the ability to normalize the input SQL for the plan cache lookup.

Adding a 3-element IN list query generates a new plan in the cache:
select * from t1 where col2 in (1,2,3)
(@0 int,@1 int,@2 int)select * from t1 where col2 in ( @0 , @1 , @2 )
(@0 int,@1 int)select * from t1 where col2 in ( @0 , @1 )

OK, so we can get a plan per number of elements, which perhaps has its advantages in some cases.  Good to know.

I also tried our large cardinality IN list example (the one that generated the semi-join in my previous post):
(@0 int,@1 int,@2 int,@3 int,@4 int,@5 int,@6 int,@7 int,@8 int,@9 int,@10 int,@11 int,@12 int,@13 int,@14 int,@15 int,@16 int,@17 int,@18 int,@19 int,@20 int,@21 int,@22 int,@23 int,@24 int,@25 int,@26 int,@27 int,@28 int,@29 int,@30 int,@31 int,@32 int,@33 int,@34 int,@35 int,@36 int,@37 int,@38 int,@39 int,@40 int,@41 int,@42 int,@43 int,@44 int,@45 int,@46 int,@47 int,@48 int,@49 int,@50 int,@51 int,@52 int,@53 int,@54 int,@55 int,@56 int,@57 int,@58 int,@59 int,@60 int,@61 int,@62 int,@63 int,@64 int,@65 int)select * from t1 inner join t2 on t1 . col1 = t2 . col1 where t1 . col2 in ( @0 , @1 , @2 , @3 , @4 , @5 , @6 , @7 , @8 , @9 , @10 , @11 , @12 , @13 , @14 , @15 , @16 , @17 , @18 , @19 , @20 , @21 , @22 , @23 , @24 , @25 , @26 , @27 , @28 , @29 , @30 , @31 , @32 , @33 , @34 , @35 , @36 , @37 , @38 , @39 , @40 , @41 , @42 , @43 , @44 , @45 , @46 , @47 , @48 , @49 , @50 , @51 , @52 , @53 , @54 , @55 , @56 , @57 , @58 , @59 , @60 , @61 , @62 , @63 , @64 , @65 )


Yes, it seems to keep on going.  For once, I read the documentation and determine that one of the disqualifying conditions is:

Statements that contain more than 2,097 literals that are eligible for parameterization.

So, it will stop at some point based on how many literals you have in your query.

So while we're not up to LINQ yet, I think we have a pretty good basis to evaluate whether LINQ or any other query generator is going to cause plan cache issues.  I'll try to do some research on the shipping version of LINQ and see what I can determine by examining the plans that get generated.

That's enough for one evening from me.

Thanks,

Conor
Tuesday, January 22, 2008 9:02:28 PM (Central Standard Time, UTC-06:00)  #    Comments [1]  | 
In this post, I'll go over a few things about IN plans in the SQL Server Query Optimizer.  Currently I'm using the SQL Server 2008 November CTP, but this behavior is still valid on SQL 2005.

I'll be using the following script for my discussion, and you can try this on your own.

create database in1
use in1

create table t1(col1 int identity, col2 int default rand()*100000, col3 binary(2000))

declare @i int
set @i=0
while @i < 10000
begin
insert into t1 default values
set @i=@i+1
end

create table t2(col1 int identity, col2 binary(200))

-- plan 1
select * from t1 where col2 in (1)

-- plan 2
select * from t1 where col2 in (1,2,3,4,5,6,7,8,9,10)

-- plan 3
select * from t1 where col2 in (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16)

-- plan 4
select * from t1 inner join t2 on t1.col1=t2.col1
where t1.col2 in (
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,
51,52,53,54,55,56,57,58,59,60,61,62,63,64)

-- plan 5
select * from t1 inner join t2 on t1.col1=t2.col1
where t1.col2 in (
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,
51,52,53,54,55,56,57,58,59,60,61,62,63,64,65)


So, we have a database with 2 tables, and I created a bunch of random data in a column so that we can play with IN plans.


ok, plan 1 is "easy".  I use IN with a single operator.  This happens to turn into a simple plan that looks like this (using set showplan_text to view):

  |--Table Scan(OBJECT:([in1].[dbo].[t1]),WHERE:([in1].[dbo].[t1].[col2]=CONVERT_IMPLICIT(int,[@1],0)))

Hrm.  hey, the where clause is in the table scan?  no separate operator?  It's a physical optimization that I'll cover in another post (I may have covered this in my previous blog posts - I'll go check), and it's called "pushing non-SARG predicates" in SQL Server.

Let's compare with the plan for plan 2:
  |--Table Scan(OBJECT:([in1].[dbo].[t1]), WHERE:([in1].[dbo].[t1].[col2]=(1) OR [in1].[dbo].[t1].[col2]=(2) OR [in1].[dbo].[t1].[col2]=(3) OR [in1].[dbo].[t1].[col2]=(4) OR [in1].[dbo].[t1].[col2]=(5) OR [in1].[dbo].[t1].[col2]=(6) OR [in1].[dbo].[t1].[col2]=(7) OR [in1].[dbo].[t1].[col2]=(8) OR [in1].[dbo].[t1].[col2]=(9) OR [in1].[dbo].[t1].[col2]=(10)))

Same plan shape, just more conditions.

I wonder if this goes on forever?  Well, plan 3 looks different:

  |--Filter(WHERE:([in1].[dbo].[t1].[col2]=(1) OR [in1].[dbo].[t1].[col2]=(2) OR [in1].[dbo].[t1].[col2]=(3) OR [in1].[dbo].[t1].[col2]=(4) OR [in1].[dbo].[t1].[col2]=(5) OR [in1].[dbo].[t1].[col2]=(6) OR [in1].[dbo].[t1].[col2]=(7) OR [in1].[dbo].[t1].[col2]=(8) OR [in1].[dbo].[t1].[col2]=(9) OR [in1].[dbo].[t1].[col2]=(10) OR [in1].[dbo].[t1].[col2]=(11) OR [in1].[dbo].[t1].[col2]=(12) OR [in1].[dbo].[t1].[col2]=(13) OR [in1].[dbo].[t1].[col2]=(14) OR [in1].[dbo].[t1].[col2]=(15) OR [in1].[dbo].[t1].[col2]=(16)))
       |--Table Scan(OBJECT:([in1].[dbo].[t1]))

Now the filter is split out into another query operator.  (It starts happening with 16 predicates)

OK, these 3 plans have been simple query plans.  In fact, you can do "set showplan_xml on" and run the query and learn something about how much optimization was done on them:

<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" Version="1.0" Build="10.0.1075.23">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementText="select * from t1 where col2 in (1)" StatementId="1" StatementCompId="1" StatementType="SELECT" StatementSubTreeCost="2.48687" StatementEstRows="1" StatementOptmLevel="TRIVIAL" ParameterizedText="(@1 tinyint)SELECT * FROM [t1] WHERE [col2]=@1">

The end-to-end design of the optimizer is beyond the scope of this post, but there are various stages of optimization and queries can "quit early" if they find a "good enough" plan.  In this case, there aren't actually any real cost-based decisions to make about which plan to run, so these plans are called "trivial".

Once you start adding joins into the mix, there are join orders to consider, and those are cost-based.  You can see that the optimization level for plan 4 differs:
<StmtSimple StatementText="select * from t1 inner join t2 on t1.col1=t2.col1&#xD;&#xA;where t1.col2 in (&#xD;&#xA;1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,&#xD;&#xA;51,52,53,54,55,56,57,58,59,60,61,62,63,64)" StatementId="1" StatementCompId="1" StatementType="SELECT" StatementSubTreeCost="2.70289" StatementEstRows="1" StatementOptmLevel="FULL">

One would infer that FULL > TRIVIAL ;).

Plan 4:
  |--Hash Match(Inner Join, HASH:([in1].[dbo].[t2].[col1])=([in1].[dbo].[t1].[col1]))
       |--Table Scan(OBJECT:([in1].[dbo].[t2]))
       |--Filter(WHERE:([in1].[dbo].[t1].[col2]=(1) OR [in1].[dbo].[t1].[col2]=(2) OR [in1].[dbo].[t1].[col2]=(3) OR [in1].[dbo].[t1].[col2]=(4) OR [in1].[dbo].[t1].[col2]=(5) OR [in1].[dbo].[t1].[col2]=(6) OR [in1].[dbo].[t1].[col2]=(7) OR [in1].[dbo].[t1].[col2]=(8) OR [in1].[dbo].[t1].[col2]=(9) OR [in1].[dbo].[t1].[col2]=(10) OR [in1].[dbo].[t1].[col2]=(11) OR [in1].[dbo].[t1].[col2]=(12) OR [in1].[dbo].[t1].[col2]=(13) OR [in1].[dbo].[t1].[col2]=(14) OR [in1].[dbo].[t1].[col2]=(15) OR [in1].[dbo].[t1].[col2]=(16) OR [in1].[dbo].[t1].[col2]=(17) OR [in1].[dbo].[t1].[col2]=(18) OR [in1].[dbo].[t1].[col2]=(19) OR [in1].[dbo].[t1].[col2]=(20) OR [in1].[dbo].[t1].[col2]=(21) OR [in1].[dbo].[t1].[col2]=(22) OR [in1].[dbo].[t1].[col2]=(23) OR [in1].[dbo].[t1].[col2]=(24) OR [in1].[dbo].[t1].[col2]=(25) OR [in1].[dbo].[t1].[col2]=(26) OR [in1].[dbo].[t1].[col2]=(27) OR [in1].[dbo].[t1].[col2]=(28) OR [in1].[dbo].[t1].[col2]=(29) OR [in1].[dbo].[t1].[col2]=(30) OR [in1].[dbo].[t1].[col2]=(31) OR [in1].[dbo].[t1].[col2]=(32) OR [in1].[dbo].[t1].[col2]=(33) OR [in1].[dbo].[t1].[col2]=(34) OR [in1].[dbo].[t1].[col2]=(35) OR [in1].[dbo].[t1].[col2]=(36) OR [in1].[dbo].[t1].[col2]=(37) OR [in1].[dbo].[t1].[col2]=(38) OR [in1].[dbo].[t1].[col2]=(39) OR [in1].[dbo].[t1].[col2]=(40) OR [in1].[dbo].[t1].[col2]=(41) OR [in1].[dbo].[t1].[col2]=(42) OR [in1].[dbo].[t1].[col2]=(43) OR [in1].[dbo].[t1].[col2]=(44) OR [in1].[dbo].[t1].[col2]=(45) OR [in1].[dbo].[t1].[col2]=(46) OR [in1].[dbo].[t1].[col2]=(47) OR [in1].[dbo].[t1].[col2]=(48) OR [in1].[dbo].[t1].[col2]=(49) OR [in1].[dbo].[t1].[col2]=(50) OR [in1].[dbo].[t1].[col2]=(51) OR [in1].[dbo].[t1].[col2]=(52) OR [in1].[dbo].[t1].[col2]=(53) OR [in1].[dbo].[t1].[col2]=(54) OR [in1].[dbo].[t1].[col2]=(55) OR [in1].[dbo].[t1].[col2]=(56) OR [in1].[dbo].[t1].[col2]=(57) OR [in1].[dbo].[t1].[col2]=(58) OR [in1].[dbo].[t1].[col2]=(59) OR [in1].[dbo].[t1].[col2]=(60) OR [in1].[dbo].[t1].[col2]=(61) OR [in1].[dbo].[t1].[col2]=(62) OR [in1].[dbo].[t1].[col2]=(63) OR [in1].[dbo].[t1].[col2]=(64)))
            |--Table Scan(OBJECT:([in1].[dbo].[t1]))

ok, this looks like an extension of what we saw in plan 3 - a filter operator above a table scan, followed by a join.

Query 5 is only trivially different than 4 - 65 conditions instead of 64:

  |--Nested Loops(Left Semi Join, OUTER REFERENCES:([in1].[dbo].[t1].[col2]))
       |--Hash Match(Inner Join, HASH:([in1].[dbo].[t2].[col1])=([in1].[dbo].[t1].[col1]))
       |    |--Table Scan(OBJECT:([in1].[dbo].[t2]))
       |    |--Table Scan(OBJECT:([in1].[dbo].[t1]))
       |--Filter(WHERE:([in1].[dbo].[t1].[col2]=[Expr1008]))
            |--Constant Scan( VALUES:(((1)),((2)),((3)),((4)),((5)),((6)),((7)),((8)),((9)),((10)),((11)),((12)),((13)),((14)),((15)),((16)),((17)),((18)),((19)),((20)),((21)),((22)),((23)),((24)),((25)),((26)),((27)),((28)),((29)),((30)),((31)),((32)),((33)),((34)),((35)),((36)),((37)),((38)),((39)),((40)),((41)),((42)),((43)),((44)),((45)),((46)),((47)),((48)),((49)),((50)),((51)),((52)),((53)),((54)),((55)),((56)),((57)),((58)),((59)),((60)),((61)),((62)),((63)),((64)),((65))))

Hey, what's this?  A Semi Join?  Well, this is a join where rows from one side are joined with another table (in this case, an in-memory table) and only qualifying rows from the original table are kept.  To make this work, you can't return any columns from the "other side", and you have to make sure you don't have more than one match on the constant table.  So, you need to remove duplicates.  Try adding another '1' into the IN list and then look at the plan - it removes them.

Why are there different plans at these points?  Well, while I can't speak for the development team, as I'm not on it anymore, but I will say that often these are added as an attempt to deal with poor performance or some very common queries that are seen all the time (perhaps from a large ISV).  There's obviously not a ton of magic to why 16 and 64 are picked for the magic numbers, and I'm sure if we hunted we could find some cases where the transition between plans was not perfect.  Often changing the numbers, even if they aren't quite perfect, will just cause instability elsewhere in terms of plan quality so they are left to try to avoid pain to customers.  I'm hopeful that you can use this post to start to do the same kind of reasoning yourself.  While I have a pretty good idea of where the plans will change, I'm not revealing anything you can't see with some trial and error on your own.  I actually had forgotten the 64 number, as I think it changed at one point - so, I had to do figure it out again ;).

OK, this covers the basic forms of plans with INs that are available in SQL Server.  I plan to work on the plan cache next, including forced parameterization.

Thanks,

Conor
Tuesday, January 22, 2008 8:16:50 PM (Central Standard Time, UTC-06:00)  #    Comments [2]  | 
Saturday, January 19, 2008
I've already received a few comments and mails from people asking questions...

One comment I've received is about LINQ's IN operator(method?) and SQL Server plan cache pollution.  I'm researching the issue a bit before I post a complete answer, but I'll start with some background information on memory management in SQL Server.  I co-authored a paper on a new memory management algorithm that I helped develop that went into SQL Server 2005.  I don't think it has changed much in SQL Server 2008 from what I've seen of the November CTP at least.

The interesting thing about memory in SQL Server is that, unlike some of the other vendors in the database space, SQL Server runs all of its users within the same server process.  (This derives from UNIX traditionally using fork() to create another server process to service each user).  Windows has better thread support, and as a result it was possible to make SQL Server run all of the user activities within a single process.  This has advantages because you can avoid inter-process communication for a lot of things, like a plan cache.  You can also "be smart" on things like which activity to schedule next in ways that an Operating System will not do.  Often this can lead to large performance gains.

One downside of this single-process approach is that, at least on 32-bit machines, virtual memory space becomes a limited resource, especially on larger deployments.  Every component has to fit into about 1.4GB of space (with the remainder taken up by stacks for threads, address space for mapping DLLs into the process, etc).  The limitation does not exist on 64-bit machines, which is why I recommend that all new serious SQL Server deployments closely consider their architecture when purchasing new machines.

This paper was accepted at the Conference on Innovative Data Systems Research (CIDR), of which Jim Gray was one of the founding organizers.  It has some details in the kinds of problems that were solved and the performance gains that were achieved by more actively managing the memory consumption and consumption rates of each internal subcomponent.  In a sense, this is how one could build a special-purpose memory management system for a machine that only ran a database.

Memory-CIDR2007Accepted2.doc (400 KB)

I will start working on some posts on some of the interesting things that happen in the query processor to handle IN clauses (in SQL, not LINQ).  Then I'll do some work with LINQ's IN equivalent and see what interesting things I can post. 

Thanks,

Conor Cunningham
Saturday, January 19, 2008 11:28:32 AM (Central Standard Time, UTC-06:00)  #    Comments [0]  | 
Wednesday, January 16, 2008
Here is the introductory post to my new blog on SQL, data-driven applications, performance, scaling, and anything else I feel like posting.  For those of you who do not know me, Paul Randal (http://www.sqlskills.com/blogs/paul/) and I worked together on Microsoft SQL Server for a number of years.  While Paul worked in the Storage Engine, I worked on the Query Processor.  Like Paul, I left Microsoft about a year ago (all good things must come to an end...).  Paul and Kimberly have been gracious enough to host me on their site so I can help answer questions about SQL Server, teach people a bit about what I know, and force me to learn a few things as well.

While at Microsoft, I was a development lead on the Query Optimizer and had architectural dutues for much of the query pipeline.

I actually blogged in my official capacity at Microsoft.  You can read about  my previous posts  here (http://blogs.msdn.com/queryoptteam/) and here (http://blogs.msdn.com/sqlqueryprocessing/default.aspx).

I'll be posting some content about the new SQL Server 2008 features I find interesting over the coming weeks.  I'm also happy to answer general questions about the Query Processor or SQL Server in general.

So, welcome!

Sincerely,
Conor Cunningham

Wednesday, January 16, 2008 8:12:42 PM (Central Standard Time, UTC-06:00)  #    Comments [2]  | 

Theme design by Jelle Druyts

Pick a theme: