New Article on SQLPerformance.com: “Exploring SQL Server 2014 SELECT INTO Parallelism”

My 8th guest blog post was published today on SQLPerformance.com:

Exploring SQL Server 2014 SELECT INTO Parallelism

In this post I kick the tires of the new SELECT … INTO parallelism improvement in SQL Server 2014 CTP1 – and the results I saw were favorable.

New Course: “SQL Server 2012: Transact-SQL Error Handling”

Today Pluralsight published my new course, “SQL Server 2012: Transaction-SQL Error Handling”.

As the title suggests, this course steps through how to write Transact-SQL code that deals with anticipated and unanticipated errors during code execution.  The audience is for developers and DBAs – and while the title states it is for SQL Server 2012, several aspects are applicable from SQL Server 2005 and onward.

Presenting “Performance Issue Patterns” at the PASSMN Minnesota SQL Server User Group

If you’re in town (Minneapolis/St. Paul) – just a heads-up that I’ll be presenting at the next PASSMN meeting on Tuesday, August 20th (scheduled from 4PM to 6PM).  This will be located at Microsoft, 3601 West 76th Street, Suite 600 Edina, MN  55437.

Here is what I’ll be presenting:

Title: Performance Issue Patterns

Description: While a SQL Server professional may encounter various one-off situations over the course of a career, there are also several shared patterns and scenarios.  In this session we’ll walk through a few of these patterns interactively – where I’ll reproduce the issue and we’ll work through the scenario until we get to the root cause.

If you’ve seen my Pluralsight course with the same name – please note that this session is structured differently.  I’ll be setting up a scenario (with helpful and unhelpful symptoms), kicking off the pattern and then it is up to us to collectively troubleshoot.  Plenty of scenarios to choose from over 75 minutes, so I’ll likely pick and choose scenarios based on participant interaction as we go along.  Think “Video Game for DBAs”.

New Article on SQLPerformance.com: “Observer Overhead and Wait Type Symptoms”

My 7th guest blog post was published today on SQLPerformance.com:

Observer Overhead and Wait Type Symptoms

This article details seven test scenarios I performed in order to see what wait-type symptoms (if any) arise based on different types of “observer overhead” due to concurrently running SQL Trace and XE sessions.

New Course: “SQL Server: Transactional Replication Fundamentals”

Today Pluralsight published my new course, “SQL Server: Transactional Replication Fundamentals.”

This course provides a fundamental overview of how to configure, monitor, tune and troubleshoot a SQL Server transactional replication topology. Transactional replication meets a few specific data distribution requirements that no other native SQL Server feature does out-of-the-box.  Even if you’re not a fan of this feature (and I’ve met quite a few), if you are responsible for architecting SQL Server data-distribution solutions it is helpful to understand how transactional replication fits into the overall scalability, high availability and disaster recovery feature landscape. 

New Article on SQLPerformance.com: "Avoid HA/DR Solution Self-Delusion"

My sixth guest blog post was published today on SQLPerformance.com:

Avoid HA/DR Solution Self-Delusion

This article gives a few examples of how we can delude ourselves into thinking that a solution is fully meeting our high availability and disaster recovery requirements.

Control Flow vs. Data Flow Demo

I had an interesting question yesterday in class about query execution plan control flow (from the root of the plan to the leaf level) and data flow (from leaf to root) and thought I would share a small demo I put together over lunch break to walk through the discussion.

Let’s start by dropping clean buffers (on a test system, please):

DBCC DROPCLEANBUFFERS;

USE [Credit];
GO

SELECT  COUNT(*) AS [page_count]
FROM    [sys].[dm_os_buffer_descriptors] AS bd
WHERE   [bd].[database_id] = DB_ID() AND
[bd].[allocation_unit_id] = 15045483298816;
GO

This returns 0 rows for the allocation_unit_id associated with the table we’re about to query from the Credit database:

SELECT  [charge].[charge_no]
FROM    [dbo].[charge]
ORDER BY [charge].[charge_no];
GO

The actual plan shows the following (via SQL Sentry Plan Explorer):

image thumb Control Flow vs. Data Flow Demo

Nothing fancy, just a Clustered Index Scan.  And in terms of page counts from sys.dm_os_buffer_descriptors, we see 9,303 data pages in cache now.

Now let’s drop the clean buffers and execute a query returning just the top 100 rows:

DBCC DROPCLEANBUFFERS;

USE [Credit];
GO

SELECT TOP 100
[charge].[charge_no]
FROM    [dbo].[charge]
ORDER BY [charge].[charge_no];
GO

The actual plan is as follows:

image thumb1 Control Flow vs. Data Flow Demo

Re-executing the query against sys.dm_os_buffer_descriptors, this time we see just 13 data pages.

The original question/discussion was with regards to the storage engine – and whether all data pages still get loaded into memory even with a TOP.  As we see in this scenario, this was not the case.

SQLPerformance.com “Troubleshooting SQL Server CPU Performance Issues”

My fifth guest blog post was published today on SQLPerformance.com:

Troubleshooting SQL Server CPU Performance Issues

In this post I describe the general framework you can use to perform root cause analysis for high CPU or CPU-related performance issues.

New Course: “SQL Server 2012: Nonclustered Columnstore Indexes”

Yesterday Pluralsight published my new course, “SQL Server 2012: Nonclustered Columnstore Indexes”.

The intent behind this course was to provide a fast ramp-up on the aspects of this feature that you should consider when looking at whether or not this is a viable option for your SQL Server relational data warehouse workloads.  The course covers the nonclustered columnstore index feature fundamentals, batch-execution mode and segment elimination considerations, data modification techniques, and how to plan for deployment.

Even if you’re unsure of whether or not you’ll be using nonclustered columnstore indexes anytime soon, it is part of the database engine, so from a core SQL Server perspective it is a good idea to familiarize yourself with this topic. With vNext of SQL Server, Microsoft is introducing clustered columnstore indexes, so I think we’ll see a significant uptick in overall usage in the future. 

Exploring Columnstore Index Batch Sizes

In this blog post we’ll explore columnstore index batch execution mode limits.  I’m using SQL Server 2012 Developer Edition, version 11.0.3349 (SP1, CU3).

To start off with, I’ll create a table that uses six different supported nonclustered columnstore index data types:

USE [CS];
GO

CREATE TABLE dbo.[FactContrived]
(
col00 TINYINT NOT NULL DEFAULT 255,
col01 SMALLINT NOT NULL DEFAULT 32767,
col02 INT NOT NULL DEFAULT 2147483647,
col03 BIGINT NOT NULL DEFAULT 9223372036854775807,
col04 SMALLMONEY NOT NULL DEFAULT 214748.3647,
col05 MONEY NOT NULL DEFAULT 922337203685477.5807);
GO

I’ll populate this table with 1,048,576 rows, all using the same value for each row (contrived by-design):

SET NOCOUNT ON;
GO

INSERT dbo.[FactContrived]
DEFAULT VALUES;
GO

INSERT dbo.[FactContrived]
SELECT  [col00],
[col01],
[col02],
[col03],
[col04],
[col05]
FROM dbo.[FactContrived];
GO 20
 

Next I’ll create a nonclustered columnstore index on each column in the table:

CREATE NONCLUSTERED COLUMNSTORE INDEX [NCI_FactContrived]
ON dbo.[FactContrived]
([col00], [col01], [col02], [col03], [col04], [col05] )
WITH (MAXDOP = 1);
GO

Once created, I’ll check the page count statistics with the following query:

SELECT  [partition_id],
SUM([in_row_used_page_count]) AS [in_row_used_page_count],
SUM([lob_used_page_count]) AS [lob_used_page_count],
SUM([row_count]) AS [row_count]
FROM    sys.[dm_db_partition_stats]
WHERE   [object_id] = OBJECT_ID('FactContrived') AND
[index_id] = INDEXPROPERTY
(OBJECT_ID('FactContrived'),
'NCI_FactContrived',
'IndexId')
GROUP BY [partition_id];
GO

This query returns the following results:

image thumb Exploring Columnstore Index Batch Sizes

So as expected, the nonclustered columnstore index is stored within LOB pages (1,050 total).

Next I’ll check the segment metadata for the specific partition_id:

 
SELECT
[column_id],
[segment_id],
[encoding_type],
[row_count],
[primary_dictionary_id],
[secondary_dictionary_id],
[min_data_id],
[max_data_id],
[on_disk_size]
FROM    sys.[column_store_segments]
WHERE [partition_id] = 72057594039762944;

This returns the following 7 rows:

image thumb1 Exploring Columnstore Index Batch Sizes

Observations on this output:

  • We have 7 columns instead of 6 that are defined for the index.  Recall that there is no primary key or unique column for this table – and also look at column_id 7 and its associated on_disk_size and primary_dictionary_id of “-1”.
  • Our data was duplicated across each column when we populated it and notice the min_data_id and max_data_id is identical for each column as expected (except column_id 7).

Next I’ll check the dictionary metadata:

SELECT  [column_id],
[dictionary_id],
[type],
[entry_count],
[on_disk_size]
FROM    sys.column_store_dictionaries
WHERE [partition_id] = 72057594039762944;

This returns the following 6 rows (no dictionary for our column_id 7):

image thumb2 Exploring Columnstore Index Batch Sizes

Notice each dictionary just has one entry (since each column has just one unique value in the table across the 1,048,576 rows).  Also notice the very small on_disk_size is bytes – and the dictionary type of “1” which according to BOL is a “Hash dictionary containing int values.”

Now I’ll execute a query that leverages the columnstore index using batch execution mode:

SELECT    [col00],
MAX([col00]) AS [MaxCol00]
FROM [dbo].[FactContrived]
GROUP BY [col00]
OPTION (RECOMPILE);
GO

The plan (via SQL Sentry Plan Explorer) is as follows:

image thumb3 Exploring Columnstore Index Batch Sizes

The Plan Tree tab shows that batch execution mode was used for the Columnstore Index Scan and the Hash Match (Partial Aggregate) operators:

image thumb4 Exploring Columnstore Index Batch Sizes

As for thread distribution, while the plan had a degree of parallelism of 8 and and one branch with 8 reserved threads, only one thread has rows associated with it:

image thumb5 Exploring Columnstore Index Batch Sizes

Now regarding the actual number of batches, we see the following:

image thumb6 Exploring Columnstore Index Batch Sizes

So with 1,166 batches of rows across 1,048,576 rows, we’re looking at an average batch size of 899.29 rows.

Experimenting with referencing the other columns of this table, we’ll see the same number of batches each time (I’ll spare you the repeated results).

Now when columnstore indexes were first discussed a year or so ago, you would see references to batch sizes of approximately 1,000 rows.  For example, the SQL Server Columnstore Index FAQ talks about how a batch “typically represents about 1000 rows of data.”  Then a few months after SQL Server 2012 I recall seeing references to it being approximately 900 rows.  The results of this particular example show we’re closer to that 900 row value.

Now for contrast – I went ahead and dropped the nonclustered columnstore index, truncated the table, and repopulated it with random values instead of using one unique value per column.  Below is a revisiting of the metadata for the new data distributions after adding back a nonclustered columnstore index…

Partition Stats

image thumb7 Exploring Columnstore Index Batch Sizes

Notice we have more LOB pages for the random data vs. the uniform data – as expected.

Segment Metadata

image thumb8 Exploring Columnstore Index Batch Sizes

Notice the higher on_disk_size and the varying min/max data ranges per segment.  Also notice that with the random, more unique data, there only one of the columns (our tinyint column) has a primary_dictionary.

Dictionary Metadata

image thumb9 Exploring Columnstore Index Batch Sizes

And we see one dictionary entry for our tinyint column.

Revisiting our original test query, the plan shape changes somewhat (additional Parallel Gather Streams), given that we now actually have 256 rows returned based on the tinyint column:

image thumb10 Exploring Columnstore Index Batch Sizes

As for row distribution across threads, we still see one thread handling the batch-mode operations and then a spread of the row-mode rows across the other operators:

image thumb11 Exploring Columnstore Index Batch Sizes

Now as for the number of batches, we see the following:

image thumb12 Exploring Columnstore Index Batch Sizes

So 1,280 batches over 1,048,576 rows.  Averaging 819.2 rows per batch, vs. our previous test’s 899.29 rows.

There is more to explore on this subject – but that’s all for today as I’m about to jump on a plane to Chicago to help deliver IE2: Immersion Event on Performance Tuning.