# Selectivity Guesses in absence of Statistics

Let’s say you have a heap table with 1,000,000 rows in it.  Let’s also say that your automatic creation of statistics are disabled, as well as updates to the statistics (and in this scenario, there are NO existing indexes or statistics).

What kind of selectivity guess would the optimizer make for a query like the following? (I copied over the FactInternetSales table with no indexes and put 1,000,000 rows in it – tested on SQL Server 2012, 11.0.2316)

```SELECT SalesOrderNumber
FROM dbo.FactInternetSales```
`WHERE ProductKey = 537;`

First of all, the actual plan on my system was as follows:

If I look at the properties of the table scan, I see the actual number of rows was 23,042 versus the estimated of 31,622.8.  (As an aside – execute SELECT POWER(1000000.0,.75) and you’ll see how the estimated number of rows was derived).  I also see that the table cardinality is known (and correct) at 1,000,000.

What if I have a predicate referencing another column, like OrderDateKey?  This too results in an estimate of 31,622.8 rows.

The following table shows a sampling of various queries and associated search conditions, along with their associated row estimates:

 Query Estimated Rows Percentage (based on 1,000,000 rows) SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey = 537; 31,622.8 3.16228% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE OrderDateKey = 20040224; 31,622.8 3.16228% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey BETWEEN 537 AND 600; 90,000 9% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey BETWEEN 537 AND 537; 90,000 9% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE TotalProductCost < 100.00; 300,000 30% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE TotalProductCost > 100.00; 300,000 30% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE TotalProductCost > 100.00 AND TotalProductCost < 200.00; 90,000 9% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE TotalProductCost >= 100.00 AND TotalProductCost <= 200.00; 90,000 9% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE TotalProductCost LIKE 100.00; 100,000 10% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE SalesOrderNumber LIKE ‘%A%’; 539,232 53.9232% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE SalesOrderNumber LIKE ‘A%’; 269,616 26.9616% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE SalesOrderNumber LIKE ‘%A’; 269,616 26.9616% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE TotalProductCost > (ProductStandardCost + 10.00); 300,000 30% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey IN (1); 31,622.8 3.16228% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey IN (1,2); 62,245.6 6.22456% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey = 1 OR ProductKey = 2; 62,245.6 6.22456% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey IN (1,2,3); 91,900 9.19% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey = 1 OR ProductKey = 2 OR ProductKey = 3; 91,900 9.19% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey = 1 AND ProductKey = 2 AND ProductKey = 3; 10,746.1 1.07461% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey = 1 AND ProductKey = 2; 13,335.2 1.33352% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE (TaxAmt * UnitPrice) > 10.00; 300,000 30% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey NOT IN (1); 968,377 96.8377% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey <> 1; 968,377 96.8377% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey NOT IN (1,2); 13,335.2 1.33352% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey NOT IN (1,2,3); 10746.1 1.07461% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey <> 1 AND ProductKey <> 2 AND ProductKey <> 3; 10746.1 1.07461% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE EXISTS (SELECT 1); 1,000,000 100% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE NOT EXISTS (SELECT 1); 0 0% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey IS NOT NULL; 1,000,000 100% SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey IS NULL; 0 0% ALTER TABLE dbo.FactInternetSales ALTER COLUMN ProductKey int NULL; SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey IS NOT NULL; 968,377 96.8377% ALTER TABLE dbo.FactInternetSales ALTER COLUMN ProductKey int NULL; SELECT SalesOrderNumber FROM dbo.FactInternetSales WHERE ProductKey IS NULL; 31,622.8 3.16228%

There are numerous other scenarios I could have put in this table, but you probably get the point by now that the search conditions help define the which “guess” calculation is used in absence of statistics.  When there are additional details beyond table cardinality, such as the nullability of a column or constraints, SQL Server can often leverage it accordingly.

## 4 thoughts on “Selectivity Guesses in absence of Statistics”

1. tobi says:

This table could be used to construct "cardinality hints" by introducing useless predicates which alter misestimated cardinality. A nasty hack, but it seems nicer that join hints which cause side-effects like forcing join order.

2. Chandan says:

Thanks for writing this Joe. Does it mean that in a table which is a heap and with no -nonclustered index, the actual row-counts guessed by the optimizer while compiling a plan will always be wrong?

What about a heap with some non-clustered index on some columns?

Thanks
Chandan

3. Hi Chandan,

If auto-create / auto-update statistics are enabled (which most often they should be) – the guesses don’t come in to play (for this scenario at least). Basically the guesses come into play when there are no stats involved (whether stats objects or stats associated with an index).

Cheers

## Other articles

### New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 2″

Today, Pluralsight published my latest course, “SQL Server: Common Query Tuning Problems and Solutions – Part 2“.  Part two is 2 hours and 17 minutes long

Explore

### New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 1”

Pluralsight published my latest course, “SQL Server: Common Query Tuning Problems and Solutions – Part 1“.  Part 1 is 2 hours and 31 minutes and

Explore

### SQLIntersection Post-Conference Session

I am really happy to announce that Kimberly Tripp and I will be delivering a post-conference in November’s SQLIntersection conference at the MGM Grand. The

Explore

### Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator

I’ve been working on writing a new Microsoft white paper since January covering the main changes made in the SQL Server 2014 Cardinality Estimator, and

Explore

### Finding Timeout-Prone Code with max_elapsed_time

Consider a scenario where your application timeout setting is 30 seconds and you’ve been asked to proactively start tuning workloads that are at risk of

Explore

### Deck from “Practical SQL Server Cardinality Estimation”

I uploaded the deck from my session for SQLSaturday #287 Madison 2014 and you can download it here. It was an excellent event!  Great organizers,

Explore

## Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.