Batch Mode improvement for columnstores in SQL Server 2014 CTP1

This post is a continuation of my previous posts on columnstore indexes in SQL Server 2014 CTP1 and query plans. The test bed is the same as in previous posts, based on an enlarged version of a fact table in AdventureWorks2012DW.

One of the things that improves query speed when using columnstore indexes is the usage of a new batch mode, as opposed to “traditional” row mode. Batch mode refers to the fact that individual iterators execute in batches rather than row-by-row. You can observe this in query plan iterators; each one contains the properties “Actual Execution Mode” and “Estimated Execution Mode”.

In SQL Server 2014, usage of Batch Mode will be more mainstream. We’ve actually already seen evidence of improvements from SQL Server 2012 to SQL Server 2014 in the first post. In SQL Server 2012, the hash join implementation consists of two iterators: a build iterator and an actual join iterator. In SQL Server 2014, there is no separate build iterator; everything is pushed into the join iterator. There were also other limitations on batch mode in SQL Server 2012. For example, batch mode was only supported for inner joins, and the hash table had to fit entirely in memory. In addition, query patterns such as scalar aggregation, multiple distinct aggregations, and IN clauses, among others, could not use Batch Mode. A series of query rewrites that enable common patterns to use Batch Mode was posted on the TechNet Wiki. SQL Server 2014 removes some, if not most, of these limitations.

I’ll look at one of these patterns today, the outer join, to see how things have improved. We’ll use the following query as a test:

select ProductAlternateKey, SUM(SalesAmount) as SumSales
from DimProduct
left outer join FactInternetSalesNewCCI fact
on DimProduct.ProductKey = fact.ProductKey
group by ProductAlternateKey
order by SumSales desc;
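
To check which mode each iterator ran in, one option is to capture the actual plan as XML and look at the execution mode attributes. This is a minimal sketch, assuming you run it from SSMS or sqlcmd (GO is a client-side batch separator, not T-SQL) against the same test tables:

```sql
-- Return the actual execution plan as XML alongside the query results.
set statistics xml on;
go

select ProductAlternateKey, SUM(SalesAmount) as SumSales
from DimProduct
left outer join FactInternetSalesNewCCI fact
on DimProduct.ProductKey = fact.ProductKey
group by ProductAlternateKey
order by SumSales desc;
go

-- Turn plan capture back off for the session.
set statistics xml off;
go
```

In the returned showplan XML, each RelOp element carries an EstimatedExecutionMode attribute, and its run-time counters carry an ActualExecutionMode attribute; each is either Row or Batch. These are the same values shown in the iterator properties of the graphical plan.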

In SQL Server 2012, because of the outer join, none of the iterators in the query (including the nonclustered columnstore scan) use Batch Mode. In SQL Server 2014, the left outer join iterator, the HashMatch partial aggregate that follows it, and the columnstore scan all use Batch Mode. People who invested in the nonclustered columnstore index in SQL Server 2012 will be glad to note that these improvements work with nonclustered columnstore as well as clustered columnstore.
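
For comparison, the SQL Server 2012 workaround for this kind of pattern was a query rewrite. The following is only an illustrative sketch in the spirit of those rewrites, not a copy of a published one and not verified on SQL Server 2012. The idea is to aggregate the fact table on its own first, so that the columnstore scan and grouped aggregate have a chance to run in batch mode, and then apply the outer join to the much smaller aggregated result. The CTE name SalesByProduct is hypothetical:

```sql
-- Step 1: aggregate the fact table alone, with no outer join involved.
with SalesByProduct as (
    select fact.ProductKey, SUM(fact.SalesAmount) as SumSales
    from FactInternetSalesNewCCI fact
    group by fact.ProductKey
)
-- Step 2: outer join the dimension to the pre-aggregated rows, then
-- roll up to ProductAlternateKey (several ProductKey values can share one).
select p.ProductAlternateKey, SUM(s.SumSales) as SumSales
from DimProduct p
left outer join SalesByProduct s
on p.ProductKey = s.ProductKey
group by p.ProductAlternateKey
order by SumSales desc;
```

Products with no sales still come back with a NULL SumSales, as in the original query, because SUM ignores the NULLs produced by the outer join. In SQL Server 2014 this kind of rewrite should no longer be needed for the outer join case.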

In addition, I tried this query in SQL Server 2014 using both the old and the new cardinality estimation. The new estimator was right on the money, with a final estimate of 504 rows against 504 actual rows, versus the old estimator’s 376 estimated rows against 504 actual. Of course, the higher, more accurate estimate results in a plan with a higher estimated cost and a larger memory grant, but we’ll know that we’re using the “right” plan because of the more accurate estimates. In this case, the different estimate did not change the shape of the plan.
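
One way to compare the two estimators on the same query is with query-level trace flags. This sketch assumes the trace flags documented for the released version of SQL Server 2014 (9481 for the legacy estimator, 2312 for the new one); whether they behave the same way in CTP1 is an assumption, and QUERYTRACEON requires sysadmin permissions:

```sql
-- Assumed: trace flag 9481 forces the legacy (pre-2014) cardinality estimator.
select ProductAlternateKey, SUM(SalesAmount) as SumSales
from DimProduct
left outer join FactInternetSalesNewCCI fact
on DimProduct.ProductKey = fact.ProductKey
group by ProductAlternateKey
order by SumSales desc
option (querytraceon 9481);

-- Assumed: trace flag 2312 forces the new SQL Server 2014 cardinality estimator.
select ProductAlternateKey, SUM(SalesAmount) as SumSales
from DimProduct
left outer join FactInternetSalesNewCCI fact
on DimProduct.ProductKey = fact.ProductKey
group by ProductAlternateKey
order by SumSales desc
option (querytraceon 2312);
```

Comparing the estimated and actual row counts in the two actual plans shows how close each estimator gets for this query.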

So, to summarize:
1. The cases in which Batch Mode will be used with Columnstore iterators in plans have been expanded in SQL Server 2014, leading to even faster queries.
2. The Batch Mode improvements work with the nonclustered as well as the clustered columnstore indexes.
3. The new cardinality estimator almost always results in more accurate cardinality estimates in SQL Server 2014.

Cheers, Bob

As always, query plans…

leftjoin_batchmode.zip
