Columnstore indexes and cardinality estimates in SQL Server 2014 CTP1

This post is a continuation of my previous post about SQL Server 2014 Columnstore and Query Plans. I’ll take a look at how the new cardinality estimates affects columnstore index query plans. I’ve written a bit about the new cardinality estimator in SQL Server 2014, as well as looked into how to tell what the estimator is doing by using XEvent trace. Today, I’ll apply that to columnstore as well as a data warehouse query against a traditional rowstore.

First off, since I used the AdventureWorks2012DW database and attached it in SQL Server 2014 rather than building it “by hand, from scratch”, I’m using the “old” cardinality estimator by default. This applies even to new tables in databases restored/attached from pre-2014 versions, as we’re using here. I can get queries to use the “new” estimator on a query-by-query basis by using the query hint “OPTION(QUERYTRACEON 2312)”. You can also set the trace flag at other levels of granularity (session and ‌instance-level) of course. But it’s easier to compare the two using the query hints. You can confirm that this works by referencing the “main” (e.g. SELECT, in this case) iterator’s CardinalityMode110 is OFF (old estimator) or ON (new estimator).

With the sample query from the previous post, all queries (traditional bitmap, nonclustered columnstore, and clustered columnstore) got the same estimate of 100.8 estimated rows v. 119 actual rows. Although the query shapes were different between “traditional” bitmap plans and columnstore plans, the estimates were consistent at the end. Because of the different plan shapes, it’s not straightforward to determine where the estimates started to diverge. SQL Server 2012 SP1 plans for traditional bitmap and nonclusted columnstore got the same eventual estimate as the analogous queries in SQL Server 2014 CTP1. Trace flag 2312 has no
effect on cardinality in SQL Server 2012, as you would expect (although yes, I did try it).

When the new cardinality estimates are turned on, the eventual estimate for all variations of the table is slightly worse at the end, 189.842 estimated rows v. 119 actual rows. With the “traditional” bitmap plan this doesn’t change the plan but does result in the estimated plan having a slightly higher cost for the new cardinality estimator (54%-46% of the batch). The memory grant is slightly higher as well, as you would expect. With the clustered and nonclustered columnstore the higher estimated rows results in a different plan choice (HashMatch (aggregate) – old v. Sort and StreamAggregate – new). This different plan choice results in a plan with much higher cost (71%-29% of the batch), although the choice of StreamAggregate results in a slightly lower memory grant for the new plan.

To summarize:
1. The new cardinality estimator in SQL Server 2014 works with queries against columnstores as well as rowstores.
2. The cardinality estimates are consistent between columnstores and rowstores, even though the plan shapes might be different.
3. The new cardinality estimator can change a plan. As in any query optimizer change, there can be regressions, but I’ve mostly seen improvements with the new estimator. As well as improvement with a different columnstore plan, coming up.

Cheers, Bob

Query plans are included.

query_plans2.zip

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

Columnstore indexes and cardinality estimates in SQL Server 2014 CTP1

Other articles

Over 1000 XEvents in SQL Server 2016 CTP2. Here are the new ones.

Taking the Azure SQL Database row-level security preview for a spin

Azure SQL Database V12 Preview – Spatial Fully Functional

Not a “me-too announcement” blog on Azure SQL Database V12 preview

AzureML: What components are used by the sample experiments?

Book Review: Chris Webb’s “Power Query for Power BI and Excel”

Imagine feeling confident enough to handle whatever your database throws at you.