sqlskills-logo-2015-white.png

Combining multiple CONTAINS predicates in SQL Server 2012

The SQL Server 2005  “Performance Tuning and Optimization (Full-Text Search)” books online topic has the following recommendation:

Combine multiple CONTAINS predicates into one CONTAINS predicate. In SQL Server you can specify a list of columns in the CONTAINS query.”

While this shows up for the SQL Server 2005 topic, there is no such statement made for the 2008 and 2012 versions of the the books online topic.

I was curious if this 2005-era recommendation still applied for more recent versions.  To test this scenario I made the following changes to the AdventureWorks2012 sample database on version Microsoft SQL Server 2012 (SP1), 11.0.3000.0 Developer Edition:

USE [AdventureWorks2012];
GO
ALTER TABLE [Production].[ProductDescription]
ADD [Description2] NVARCHAR(400) NOT NULL DEFAULT '';
GO

-- Populating the new description column
UPDATE [Production].[ProductDescription]
SET [Description2] = [Description];
GO

CREATE FULLTEXT INDEX ON [Production].[ProductDescription] KEY INDEX [PK_ProductDescription_ProductDescriptionID] ON ([AW2008FullTextCatalog])
WITH (CHANGE_TRACKING AUTO);
GO

ALTER FULLTEXT INDEX ON [Production].[ProductDescription] ADD ([Description]);
ALTER FULLTEXT INDEX ON [Production].[ProductDescription] ADD ([Description2]);
GO

Now for an initial test, I executed the following single CONTAINS predicate query:

SET STATISTICS IO ON;

SELECT  [ProductDescriptionID] ,
[Description] ,
[rowguid] ,
[ModifiedDate]
FROM [Production].[ProductDescription]
WHERE CONTAINS([Description], 'bike');

SET STATISTICS IO OFF;

The query execution plan tree (from SQL Sentry Plan Explorer) was as follows:

image

And the STATISTICS IO output was as follows:

Table ‘ProductDescription’. Scan count 0, logical reads 28, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Now for a multi-CONTAINS predicate, I executed the following example:

SET STATISTICS IO ON;

SELECT  [ProductDescriptionID] ,
[Description] ,
[rowguid] ,
[ModifiedDate]
FROM [Production].[ProductDescription]
WHERE CONTAINS([Description], 'bike') OR
CONTAINS([Description2], 'bike');

SET STATISTICS IO OFF;

The query execution plan tree (from SQL Sentry Plan Explorer) was as follows:

image

The operators remained the same, but we have a change in estimated subtree cost and also a doubling up of cardinality estimates.

As for the STATISTICS IO output, it was as follows:

Table ‘ProductDescription’. Scan count 0, logical reads 34, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Logical reads were 34 versus the original 28.

For the last example, I combined the two CONTAINS predicates as follows:

SET STATISTICS IO ON;

SELECT  [ProductDescriptionID] ,
[Description] ,
[rowguid] ,
[ModifiedDate]
FROM [Production].[ProductDescription]
WHERE CONTAINS(([Description], [Description2]), 'bike');

SET STATISTICS IO OFF;

I found that with the combined CONTAINS, the plan shape for this example remained the same as the multiple-CONTAINS version. Operator costs and estimates also remained the same.

For SQL Server 2005, if you had multiple CONTAINS predicates referencing columns from the same table, your execution plan would show separate remote scan operators for each reference instead of them being combined natively into a single operator.  This behavior seems to have been addressed in recent versions, but if you have seen a variation of this issue given other conditions please share in the comments.  Thanks!

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.