Let’s say you have a heap table with 1,000,000 rows in it.  Let’s also say that your automatic creation of statistics are disabled, as well as updates to the statistics (and in this scenario, there are NO existing indexes or statistics).

What kind of selectivity guess would the optimizer make for a query like the following? (I copied over the FactInternetSales table with no indexes and put 1,000,000 rows in it – tested on SQL Server 2012, 11.0.2316)

SELECT SalesOrderNumber
FROM dbo.FactInternetSales
WHERE ProductKey = 537;

First of all, the actual plan on my system was as follows:

snaghtml1904f4f thumb Selectivity Guesses in absence of Statistics

If I look at the properties of the table scan, I see the actual number of rows was 23,042 versus the estimated of 31,622.8.  (As an aside – execute SELECT POWER(1000000.0,.75) and you’ll see how the estimated number of rows was derived).  I also see that the table cardinality is known (and correct) at 1,000,000.

snaghtml19c0871 thumb Selectivity Guesses in absence of Statistics

What if I have a predicate referencing another column, like OrderDateKey?  This too results in an estimate of 31,622.8 rows.

The following table shows a sampling of various queries and associated search conditions, along with their associated row estimates:

Query Estimated Rows Percentage (based on 1,000,000 rows)
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey = 537;

31,622.8 3.16228%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE OrderDateKey = 20040224;

31,622.8 3.16228%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey BETWEEN 537 AND 600;

90,000 9%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey BETWEEN 537 AND 537;

90,000 9%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE TotalProductCost < 100.00;

300,000 30%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE TotalProductCost > 100.00;

300,000 30%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE TotalProductCost > 100.00 AND

TotalProductCost < 200.00;

90,000 9%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE TotalProductCost >= 100.00 AND

TotalProductCost <= 200.00;

90,000 9%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE TotalProductCost LIKE 100.00;

100,000 10%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE SalesOrderNumber LIKE ‘%A%’;

539,232 53.9232%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE SalesOrderNumber LIKE ‘A%’;

269,616 26.9616%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE SalesOrderNumber LIKE ‘%A’;

269,616 26.9616%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE TotalProductCost > (ProductStandardCost + 10.00);

300,000 30%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey IN (1);

31,622.8 3.16228%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey IN (1,2);

62,245.6 6.22456%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey = 1 OR

ProductKey = 2;

62,245.6 6.22456%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey IN (1,2,3);

91,900 9.19%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey = 1 OR

ProductKey = 2 OR

ProductKey = 3;

91,900 9.19%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey = 1 AND

ProductKey = 2 AND

ProductKey = 3;

10,746.1 1.07461%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey = 1 AND

ProductKey = 2;

13,335.2 1.33352%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE (TaxAmt * UnitPrice) > 10.00;

300,000 30%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey NOT IN (1);

968,377 96.8377%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey <> 1;

968,377 96.8377%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey NOT IN (1,2);

13,335.2 1.33352%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey NOT IN (1,2,3);

10746.1 1.07461%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey <> 1 AND

ProductKey <> 2 AND

ProductKey <> 3;

10746.1 1.07461%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE EXISTS (SELECT 1);

1,000,000 100%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE NOT EXISTS (SELECT 1);

0 0%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey IS NOT NULL;

1,000,000 100%
SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey IS NULL;

0 0%
ALTER TABLE dbo.FactInternetSales

ALTER COLUMN ProductKey int NULL;

SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey IS NOT NULL;

968,377 96.8377%
ALTER TABLE dbo.FactInternetSales

ALTER COLUMN ProductKey int NULL;

SELECT SalesOrderNumber

FROM dbo.FactInternetSales

WHERE ProductKey IS NULL;

31,622.8 3.16228%

There are numerous other scenarios I could have put in this table, but you probably get the point by now that the search conditions help define the which “guess” calculation is used in absence of statistics.  When there are additional details beyond table cardinality, such as the nullability of a column or constraints, SQL Server can often leverage it accordingly.