This blog post applies to SQL Server 2012 CTP3 (11.0.1440).

SQL Server 2012 introduces the “indirect checkpoints” feature.  At a high level, it allows you to adjust the target recovery time of a specific database, rather than relying entirely on the SQL Server instance-level ‘recovery interval (min)’ setting. To demonstrate the impact, I’ll walk through a scenario where we’re performing large batch inserts into a data warehousing Fact table.  In this demo I’m using the Codeplex “AdventureWorksDWDenali” database.

As a first step, I’ll back up the database so that I can reset the baseline in order to compare the default checkpoint behavior with the new indirect checkpoint setting:

USE master

GO

 

BACKUP DATABASE [AdventureWorksDWDenali]

TO  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\IndirectCheckpoint_AWDW.bak'

 WITH NOFORMAT, NOINIT,  NAME = N'AdventureWorksDWDenali-Full Database Backup', CHECKSUM

GO

Now with that out of the way I’ll check the recovery interval of the SQL Server instance:

SELECT value_in_use 

FROM sys.configurations

WHERE name = 'recovery interval (min)'

 

The value in use is “0” – the default.  So basically the estimated checkpoint will be every minute for an active database (although I’m over-simplifying things here – since automatic checkpoint frequency varies based on other factors as well - but for this demo you'll still see a clear difference in the before-and-after).

 

Now I’ll check the indirect checkpoint time for the AdventureWorksDWDenali database:

 

SELECT target_recovery_time_in_seconds

FROM sys.databases

WHERE name = 'AdventureWorksDWDenali'

 

This returns “0” – meaning we haven’t configured this new SQL Server 2012 database option (yet) and so the SQL Server instance level setting is in effect.

 

Now I’ll create an extended events session to track the individual checkpoint events for the database (database_id "5" happens to be the AdventureWorksDWDenali database on my instance): 

 

CREATE EVENT SESSION [track_checkpoints] ON SERVER

ADD EVENT sqlserver.checkpoint_end(

    ACTION(sqlserver.database_id,sqlserver.database_name,sqlserver.is_system)

    WHERE ([sqlserver].[database_id]=(5)))

ADD TARGET package0.ring_buffer(SET max_memory=(2048))

GO

 

 

(By the way - I updated max_memory down to 2MB from the 100MB per Jonathan Kehayias' heads up on potential limits - not to mention that this was more memory than needed for this example).

 

 Next I’ll turn it on:

 

ALTER EVENT SESSION [track_checkpoints] ON SERVER STATE=START

GO

 

And now I’m going to push some I/O by creating a new table and populating it multiple times from the FactInternetSales table:

 

USE AdventureWorksDWDenali

GO

 

SELECT *

INTO dbo.CheckPoint_Test_FactInternetSales

FROM dbo.FactInternetSales

GO

 

INSERT dbo.CheckPoint_Test_FactInternetSales

SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, LEFT(CAST(NEWID() AS NVARCHAR(36)),20), SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount,

TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate

FROM dbo.FactInternetSales

GO 10

 

This shows the following rows affected:

 

(60398 row(s) affected)

Beginning execution loop

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

(60398 row(s) affected)

Batch execution completed 10 times.

 

Now I’ll query the track_checkpoints Extended Event session (** and thanks to my colleague Jonathan Kehayias for the following query which I just modified to query against track_checkpoints and then change a few column names **):

 

SELECT

    n.value('(event/@name)[1]', 'varchar(50)') AS event_name,

               n.value('(event/@timestamp)[1]', 'datetime') as event_datetime

FROM

(    SELECT td.query('.') as n

    FROM

    (

        SELECT CAST(target_data AS XML) as target_data

        FROM sys.dm_xe_sessions AS s   

        JOIN sys.dm_xe_session_targets AS t

            ON s.address = t.event_session_address

        WHERE s.name = 'track_checkpoints'

          AND t.target_name = 'ring_buffer'

    ) AS sub

    CROSS APPLY target_data.nodes('RingBufferTarget/event') AS q(td)

) as tab

GO

 

This returns the following checkpoint event data:

 

event_name       event_datetime

checkpoint_end 2011-10-18 20:54:46.200

checkpoint_end 2011-10-18 20:54:46.650

checkpoint_end 2011-10-18 20:54:47.063

checkpoint_end 2011-10-18 20:54:47.433

checkpoint_end 2011-10-18 20:54:47.647

checkpoint_end 2011-10-18 20:54:48.200

checkpoint_end 2011-10-18 20:54:48.597

checkpoint_end 2011-10-18 20:54:49.157

checkpoint_end 2011-10-18 20:54:49.620

checkpoint_end 2011-10-18 20:54:50.127

 

So we see 10 checkpoints for the default behavior.

 

And now I’m going to stop the event session and restore the database back to its original state to show you the comparison to the SQL Server 2012 indirect checkpoint option:

 

ALTER EVENT SESSION [track_checkpoints] ON SERVER STATE=STOP

 

-- Restoring the database back to the original format

USE master

GO

 

RESTORE DATABASE [AdventureWorksDWDenali]

FROM  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\IndirectCheckpoint_AWDW.bak'

WITH  FILE = 1

GO

 

Next I’m going to change the database’s TARGET_RECOVERY_TIME to 5 minutes (this is the SQL Server 2012 indirect checkpoints feature I was talking about):

 

ALTER DATABASE AdventureWorksDWDenali

SET TARGET_RECOVERY_TIME = 5 MINUTES

 

Just to confirm that it “took” – I’ll validate sys.databases:

 

SELECT target_recovery_time_in_seconds

FROM sys.databases

WHERE name = 'AdventureWorksDWDenali'

 

This returns 300 seconds instead of the 0 seconds we saw before – 300 representing the new target recovery time I just set in minutes.

And now I’m going to enable the event session again and re-populate the table:

 

ALTER EVENT SESSION [track_checkpoints] ON SERVER STATE=START

GO

 

USE AdventureWorksDWDenali

GO

 

SELECT *

INTO dbo.CheckPoint_Test_FactInternetSales

FROM dbo.FactInternetSales

GO

 

INSERT dbo.CheckPoint_Test_FactInternetSales

SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, LEFT(CAST(NEWID() AS NVARCHAR(36)),20), SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount,

TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate

FROM dbo.FactInternetSales

GO 10

 

How many checkpoints will we see now?  With all things equal between tests, we now only see four checkpoints:

 

event_name       event_datetime

checkpoint_end 2011-10-18 21:06:52.697

checkpoint_end 2011-10-18 21:06:53.520

checkpoint_end 2011-10-18 21:06:54.717

checkpoint_end 2011-10-18 21:06:56.277

 

I tested this before-and-after scenario three times, and each time I saw the identical change in the number of checkpoints.  So I saw a consistent result given an identical test setup.

 

That’s all for now – but two closing thoughts:

 

·        I like how we’re able to adjust this target recovery time at the database level.  This allows us to take into account the recovery time requirements per application if we need to – adjusting upward or downward based on different requirements or issues.

·        I didn’t measure the actual performance impact against the load, but if I had I imagine we would have seen some performance improvement with the second scenario of less frequent checkpoints (but with a trade-off of longer recovery time - a nontrivial consideration).

This post covers examples from Denali  SQL Server 2012 CTP3, version 11.0.1440.

In my last post I hinted towards some interesting findings regarding parallelism and columnstore indexes.  I’ll talk about what I observed in this post.

Just as a quick level-set, columnstore indexing provides two new areas of functionality within SQL Server 2012 that can potentially improve query performance for typical relational data warehouse workloads.  The features contributing to this potential performance gain include the Vertipaq columnstore technology (which uses compression and stores the data by column instead of row) and the new method of query processing that processes blocks of column data in batches.  You can benefit from from the columnstore storage alone  – but you may also see further query performance improvements if the query runs in "batch" execution mode versus "row".  

One key point to underscore– just because you add a columnstore index to a table doesn’t mean that your query will use “batch” execution mode. 

Let’s walk through an example of row vs. batch using the AdventureWorksDWDenali database.  The test server used in this illustration has four visible schedulers (not including the DAC) and 8GB of RAM.

Also – in order to increase the cost of the queries against this table and make it worth the columnstore index's time, I put together the below query to bump up the row count in FactInternetSales from 60,398 rows to 483,184 rows (SalesOrderNumber was part of the primary key, so I populated it with NEWID() and kept the other column values static from the existing rows and then used "GO 3" to populate it a few times):

INSERT dbo.FactInternetSales
(ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, SalesOrderNumber, SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount, TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate, ShipDate
)

SELECT ProductKey, OrderDateKey, DueDateKey, ShipDateKey, CustomerKey, PromotionKey, CurrencyKey, SalesTerritoryKey, LEFT(CAST(NEWID() AS NVARCHAR(36)),20), SalesOrderLineNumber, RevisionNumber, OrderQuantity, UnitPrice, ExtendedAmount, UnitPriceDiscountPct, DiscountAmount, ProductStandardCost, TotalProductCost, SalesAmount
,
TaxAmt, Freight, CarrierTrackingNumber, CustomerPONumber, OrderDate, DueDate,
ShipDate
FROM
dbo.FactInternetSales
GO 3


 

The following query is executed against a table with no columnstore index (I’ve also enabled the actual Execution Plan to show afterwards – and also execute the query twice in order to measure results with the data in cache):

SET STATISTICS IO ON
SET STATISTICS
TIME
ON


SELECT
c.CommuteDistance

       d.CalendarYear
,
       SUM(f.SalesAmount)
TotalSalesByCommuteDistance
FROM
dbo.FactInternetSales as f
INNER JOIN dbo.DimCustomer as c
ON
           f.CustomerKey =
c.CustomerKey
INNER JOIN dbo.DimDate d
ON
           d.DateKey =
f.OrderDateKey
GROUP BY c.CommuteDistance
,
         d.CalendarYear 

The results for STATISTICS IO were as follows:

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'FactInternetSales'. Scan count 1, logical reads 20289, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimCustomer'. Scan count 1, logical reads 979, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimDate'. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

And the results of the SET STATISTICS TIME ON were as follows:

SQL Server Execution Times:

   CPU time = 562 ms,  elapsed time = 625 ms.

 

As for the execution plan, if we hover over the Clustered Index Scan for FactInternetSales you’ll see the “Actual Execution Mode”.

clip_image002[4]

So we see a value of “Row” (the execution mode we’re used to) and “Actual Number of Batches” of 0.

 

Now let’s create a columnstore index:

 

CREATE NONCLUSTERED COLUMNSTORE INDEX [CSI_FactInternetSales] ON [dbo].[FactInternetSales]
(
[OrderDateKey]
,
[CustomerKey]
,
[SalesAmount]
)


GO

Re-executing the query after creating the index, I see the following STATISTICS IO output:

 

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'FactInternetSales'. Scan count 1, logical reads 597, physical reads 0, read-ahead reads 1167, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimCustomer'. Scan count 1, logical reads 979, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimDate'. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

That’s a reduction in logical reads against FactInternetSales from 20289 down to 597.  Very likeable.

What about STATISTICS TIME?

 

SQL Server Execution Times:

   CPU time = 453 ms,  elapsed time = 485 ms.

 

Okay so CPU time was 562 ms now it is 453 ms.  Elapsed time was 625 ms and now it is 485 ms. 

 

What about the execution plan, were we using batch execution mode?

 

clip_image004[4]

No.  Still using “Row” with a columnstore index scan.   So things brings me to what I observed a few days ago. 

One thing I didn’t mention is that in my test environment -  my max degree of parallelism was set to “1”. 

 

What happens if I uncap the max degree of parallelism for my 4-scheduler server and re-execute the query?

clip_image005[4]

 

After lifting the cap on max degree of parallelism, the query executes using a parallel plan and also switches from “row” to “batch” mode.  Notice also the “Actual Number of Batches” – which shows a value of 1,482.

 

This is all nice – but what about the query performance and I/O.  Any further reductions?

 

The I/O results are as follows:

Table 'DimCustomer'. Scan count 5, logical reads 1072, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'DimDate'. Scan count 3, logical reads 56, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'FactInternetSales'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

What’s this? Zero for all stats related to FactInternetSales?  I even dropped clean buffers to double-check, and it still displayed all zeros.  (More on this for another day.)

 

What about the execution statistics?

 

SQL Server Execution Times:

 

                 CPU time = 78 ms,  elapsed time = 114 ms.

 

Now we're talking big improvements. CPU time was 562 ms without columnstore, then 453 ms with columnstore + row processing and then 78 ms for columnstore + batch processing.  Elapsed time was 625 ms without columnstore and then 485 ms for columnstore + row processing and then 114 ms for columnstore + batch processing.  So more non-trivial reductions.

 

So a few key findings in closing:

 

·        Columnstore + row processing can provide a performance boost (getting column based storage and compression)

·        Columnstore + batch processing can provide additional performance improvements on top of the columnstore index I/O benefits

·        When I capped parallelism, I didn’t see batch processing – so check your execution plan and be aware that there are other factors that impact whether batch processing vs. row processing as well

This post covers examples from Denali CTP3, version 11.0.1440.  

I was working with Denali’s columnstore index feature this last week and was testing it on a virtual machine when I encountered the following error message when trying to create a new index:

The statement has been terminated.
Msg 8657, Level 17, State 5, Line 2
Could not get the memory grant of 91152 KB because it exceeds the maximum configuration limit in workload group 'default' (2) and resource pool 'default' (2). Contact the server administrator to increase the memory usage limit.'

Now at the time the VM I was using was constrained for resources. It was configured to use 1 GB of RAM.

So I shut down the VM and added in 7 GB more of RAM. Given that the “max server memory (MB)” wasn’t capped, I was then able to create the columnstore index successfully. Regarding the memory requirements, BOL states that we need [approximately] 8 MBs times the # of columns in the index times the DOP.  Also, the more string data type columns involved, the higher the memory needed for creating the index.

So I thought I would test out these requirements in a more controlled fashion and see how close the estimates were to the reality.

In this test, I used a VM with 4 processors and 8 GB of RAM, but unlike with my previous experience I capped the “max server memory (MB)” to a much lower value to reproduce the issue I had earlier.  I was adding the columnstore index to the dbo.FactInternetSales table from the AdventureWorksDWDenali database (which can be downloaded here).

For the first step, I capped the max server memory for my test Denali SQL Server instance to a very low value (300 MBs did the trick in this case):

EXEC sp_configure 'show advanced options', 1

RECONFIGURE

EXEC sp_configure 'max server memory (MB)', 300

RECONFIGURE

 

Next I attempted to create the following columnstore index:

USE [AdventureWorksDWDenali]

GO

 

CREATE NONCLUSTERED COLUMNSTORE

INDEX [NCSI_FactInternetSales]

ON [dbo].[FactInternetSales]

(

          [ProductKey],

          [OrderDateKey],

          [DueDateKey],

          [ShipDateKey],

          [CustomerKey],

          [PromotionKey],

          [CurrencyKey],

          [SalesTerritoryKey],

          [SalesOrderNumber],

          [SalesOrderLineNumber],

          [RevisionNumber],

          [OrderQuantity],

          [UnitPrice],

          [ExtendedAmount],

          [UnitPriceDiscountPct],

          [DiscountAmount],

          [ProductStandardCost],

          [TotalProductCost],

          [SalesAmount],

          [TaxAmt],

          [Freight],

          [CarrierTrackingNumber],

          [CustomerPONumber],

          [OrderDate],

          [DueDate],

          [ShipDate])

GO

 

This failed with the following error:

 

The statement has been terminated.

Msg 8657, Level 17, State 5, Line 2

Could not get the memory grant of 143560 KB because it exceeds the maximum configuration limit in workload group 'default' (2) and resource pool 'default' (2).  Contact the server administrator to increase the memory usage limit.

 

So the memory grant request was just over 140 MB.  Looking at the requirements of the columnstore index request and taking into account the equation in BOL, we have 8 MBs x 25 columns x 4 available processors = 800 MB.  Quite a bit higher than the 140 MB it needed at runtime.  But this was assuming 4 processors in the equation.  If we factor in just 1 proc being used – we have 200 MB which is closer to what was being requested and factors in the varying data types and sizes for columns defined in FactInternetSales.

What if I capped the DOP but keep the memory capped at the low value?  I gave it a try:

 

EXEC sp_configure 'max degree of parallelism', 1

RECONFIGURE

 

Attempting a creation of the same index gave the following error message and same memory grant value:

 

The statement has been terminated.

Msg 8657, Level 17, State 5, Line 1

Could not get the memory grant of 143560 KB because it exceeds the maximum configuration limit in workload group 'default' (2) and resource pool 'default' (2).  Contact the server administrator to increase the memory usage limit.

 

So in this case, capping the DOP didn’t reduce the memory requirements.

 

What about using the MAXDOP hint with the CREATE statement (WITH (MAXDOP = 1))?  Again this returned the same memory grant requirement of 143560 KB.

 

What about capping the default workload group itself (not recommending this as standard practice – but rather to further explore memory grant requirements for columnstore)?

 

ALTER WORKLOAD GROUP [default]

WITH(max_dop=1)

GO

ALTER RESOURCE GOVERNOR RECONFIGURE

GO

 

And even after this change, the memory grant requirements remained the same. So I reverted the DOP options to get back to my previous state:

 

EXEC sp_configure 'max degree of parallelism', 0

RECONFIGURE

 

ALTER WORKLOAD GROUP [default]

WITH(max_dop=0)

GO

 

ALTER RESOURCE GOVERNOR RECONFIGURE

GO

 

Regarding the maximum memory grant request itself, the error message gives us enough of a hint on where to look (workload group 'default' (2) and resource pool 'default' (2)).   So I ran the following query to return the request_max_memory_grant_percent value for the default workload group:

 

SELECT request_max_memory_grant_percent

FROM sys.resource_governor_workload_groups

WHERE name = 'default'

 

The value returned was 25%.  So with my 300 MB cap, we’re talking 75 MB.  Not the 140 MB we need. 

I then bumped up the max memory grant percent (but left the max server memory at the low value) just to see what would happen:

 

ALTER WORKLOAD GROUP [default] WITH(request_max_memory_grant_percent=70)

GO

 

ALTER RESOURCE GOVERNOR RECONFIGURE

GO

 

Sure enough – my CREATE NONCLUSTERED COLUMNSTORE INDEX was allowed to execute – but while it executed, it didn't complete.  Instead it ran for 2 minutes and 59 seconds before getting the following error message:

 

The statement has been terminated.

Msg 8645, Level 17, State 1, Line 1

A timeout occurred while waiting for memory resources to execute the query in resource pool 'default' (2). Rerun the query.

I tried creating the index a second time so that I could see what was going on in sys.dm_exec_query_memory_grants - but it executed immediately the second time around. So I set back the max memory grant to the default 25 % - dropped the index and tried to recreate and got the error again.  I then set the max memory grant back to 70% and had the timeout again – but this time I was ready for it and I executed a query against sys.dm_exec_query_memory_grants:

 

SELECT scheduler_id, dop, requested_memory_kb, required_memory_kb, ideal_memory_kb

FROM sys.dm_exec_query_memory_grants

 

The results were as follows:

 

clip_image002[4]

 

What jumped out at me was the DOP value of “1” (even though my SQL Server instance’s “max degree of parallelism” was set to “0” and I had 4 available prcs and the “default” pool dop was also set to 0).

 

Now it seems that the requested memory was remaining the same because the plan was assuming to be maxdop of “1” all along.  Which made me wonder if the requested memory would increase if I added a hint for the creation of the columnstore index to more than one processor (I was reaching, I know, but I'm a fiddler by nature)?  Before testing this, I set back the max memory grant percent to 25%:

 

ALTER WORKLOAD GROUP [default] WITH(request_max_memory_grant_percent=25)

GO

 

ALTER RESOURCE GOVERNOR RECONFIGURE

GO

 

Then I used WITH (MAXDOP = 4) for the query.  The result?  Still asking for a memory grant of 143560 KB.  And repeating the test of bumping up the max memory grant and checking the dop value in sys.dm_exec_query_memory_grants – it remained at “1”. 

So this post was more of an exploration and if I saw this case in the wild I would ask more directly about available memory for the SQL Server instance and/or increasing the max server memory if there was sufficient availability already. 

I would also ask about the necessity of each column being included in the columnstore index definition.  For example – if I needed only half of the columns from FactInternetSales, we’re talking about a 60MB memory grant requirement versus a 140 MB one.

 

During my work last week with columnstore indexes, there were also some interesting findings related to parallelism.  I’ll save this for another post.

This post covers examples from Denali CTP3, version 11.0.1440.

In my last post, “Under The Covers With Sp_server_diagnostics Data (Part I)” I started an exploration of the output surfaced from the sp_server_diagnostics stored procedure.  In this post, I’d like to continue this exploration, having a look specifically at the “query_processing”, “io_subsystem” and “events” component data.

Just a reminder of why we would be looking at this in the first place?  Quite simply, I see this output as a means of gathering out-of-the-box SQL Server health information in absence of (or in compliment of) existing scripts or processes running on the SQL Server instance.

So continuing on the exploration, the query_processing component data can actually be broken down into five useful groupings of statistics:

·        Query processing statistics

·        Top non-preemptive wait statistics by count

·        Top non-preemptive wait statistics by duration

·        Top preemptive waits by count

·        Top preemptive waits by duration

The following is an example of the query processing statistics data:

<queryProcessing maxWorkers="512" workersCreated="46" workersIdle="17" tasksCompletedWithinInterval="12" pendingTasks="0" oldestPendingTaskWaitingTime="0" hasUnresolvableDeadlockOccurred="0" hasDeadlockedSchedulersOccurred="0" trackingNonYieldingScheduler="0x0">

 

The naming conventions are intuitive – returning a point of time measurement of max workers, actual workers created, idle workers, pending tasks, oldest pending task waiting time (I’ll definitely be paying attention to this one) and scheduler deadlock and non-yielding counts.

 

For fans of evaluating wait statistics (and hopefully that’s most SQL Server folks these days) – we have a four sets of Top 10 wait stats results.  The following is the output of the top non-preemptive waits by count (and specifically – by the “waits” attribute):

 

<byCount>

        <wait waitType="HADR_FILESTREAM_IOMGR_IOCOMPLETION" waits="1884" averageWaitTime="967892" maxWaitTime="800" />

        <wait waitType="PAGEIOLATCH_SH" waits="534" averageWaitTime="1329" maxWaitTime="37" />

        <wait waitType="IO_COMPLETION" waits="71" averageWaitTime="128" maxWaitTime="4" />

        <wait waitType="THREADPOOL" waits="53" averageWaitTime="209" maxWaitTime="25" />

        <wait waitType="WRITELOG" waits="35" averageWaitTime="58" maxWaitTime="17" />

        <wait waitType="PERFORMANCE_COUNTERS_RWLOCK" waits="18" averageWaitTime="3" maxWaitTime="1" />

        <wait waitType="ASYNC_NETWORK_IO" waits="10" averageWaitTime="125" maxWaitTime="118" />

        <wait waitType="PAGEIOLATCH_EX" waits="8" averageWaitTime="14" maxWaitTime="6" />

        <wait waitType="SLEEP_DBSTARTUP" waits="8" averageWaitTime="779" maxWaitTime="175" />

        <wait waitType="CLR_AUTO_EVENT" waits="8" averageWaitTime="8527" maxWaitTime="4085" />

      </byCount>

 

We also have the non-preemptive waits by duration (and specifically – by the average wait time):

 

    <byDuration>

        <wait waitType="HADR_FILESTREAM_IOMGR_IOCOMPLETION" waits="1884" averageWaitTime="967892" maxWaitTime="800" />

        <wait waitType="CLR_AUTO_EVENT" waits="8" averageWaitTime="8527" maxWaitTime="4085" />

        <wait waitType="FT_IFTSHC_MUTEX" waits="6" averageWaitTime="4381" maxWaitTime="2977" />

        <wait waitType="SLEEP_MASTERDBREADY" waits="1" averageWaitTime="1836" maxWaitTime="1836" />

        <wait waitType="PWAIT_ALL_COMPONENTS_INITIALIZED" waits="3" averageWaitTime="1558" maxWaitTime="541" />

        <wait waitType="PAGEIOLATCH_SH" waits="534" averageWaitTime="1329" maxWaitTime="37" />

        <wait waitType="LCK_M_S" waits="5" averageWaitTime="1146" maxWaitTime="336" />

        <wait waitType="SLEEP_DBSTARTUP" waits="8" averageWaitTime="779" maxWaitTime="175" />

        <wait waitType="SLEEP_DCOMSTARTUP" waits="1" averageWaitTime="377" maxWaitTime="377" />

        <wait waitType="FFT_RECOVERY" waits="5" averageWaitTime="241" maxWaitTime="118" />

      </byDuration>

 

We have two more result sets for isolating out preemptive waits that are identical in format to the non-preemptive results, so I won’t repeat them here.  And while we can pull aggregated wait statistics from other sources – it is convenient to have this data at-the-ready.

 

Next we have the I/O component data which is far less verbose in my sample output:

 

<ioSubsystem ioLatchTimeouts="0" intervalLongIos="0" totalLongIos="0">

  <longestPendingRequests />

</ioSubsystem>

 

There we can see information on latch timeouts and long I/Os.  My test system didn’t have a value for the longest pending requests – and that is something I’ll look for later – but if you have a sample on your own Denali system, please post to the comments.   I’d also be curious to compare the value of this data against what gets exposed in sys.dm_io_pending_io_requests  and see which has more value from a troubleshooting perspective.

 

The last component to cover from sp_server_diagnostics data is the “events” output (here is the vanilla output without events for starters):

 

<events>

  <session startTime="2011-10-03T14:11:52.380" droppedEvents="0" largestDroppedEvent="0">

    <RingBufferTarget truncated="0" processingTime="0" totalEventsProcessed="0" eventCount="0" droppedCount="0" memoryUsed="0" />

  </session>

</events>

 

BOL states that this will expose events “of interest” recorded on the SQL Server instance – similar to what you could pull from the ring buffer.  So when testing a login failure by putting in a bad login identifier, I saw three events raised – one being “security_error_ring_buffer_recorded” and then the others being two “connectivity_ring_buffer_recorded” events.  The first connectivity event was for an event type of “Error” and another of “ConnClose”. 

The data exposed is pretty chatty – so you might wonder (as compared to the wait statistics) – whether it is worth sifting through.  One example is for security error ring buffer data – which in my testing did show a few bread crumbs of interesting troubleshooting data.  For example, useful data on the error number, state and remote host:

 

<data name="sni_consumer_error">

          <type name="uint32" package="package0" />

          <value>18456</value>

        </data>

<data name="state">

          <type name="uint32" package="package0" />

          <value>5</value>

        </data>

<data name="remote_host">

          <type name="ansi_string" package="package0" />

          <value>&lt;local machine&gt;</value>

        </data>

 

So from this data we know that an error 18456 and state 5 occurred.  We also know that state 5 represents an invalid user id.  Testing a bad password, I saw a state of 8, which does indeed represent a password mismatch (see the post Understanding "login failed" (Error 18456) error messages in SQL Server 2005 for more on this tangent).

So that’s an initial tour of the data exposed by sp_server_diagnostics.  As Denali starts rolling out more broadly, I anticipate several parsing scripts being generated by the SQL community.  There are several use-cases that I can imagine from this data, so I’ll keep watching this space and will report back the more valuable uses as I find them.

 

** Update - Aaron Bertrand wrote a very comprehensive 18456 reference in the following post, "Troubleshooting Error 18456".

Categories:
Denali | Performance

This post covers examples from Denali CTP3, version 11.0.1440.

In this post I’ll be discussing the SQL Server health related output from Denali’s sys.sp_server_diagnostics system stored procedure.  This system stored procedure is used to check Denali SQL Server health status behind-the-scenes for both failover cluster instances and availability groups.  

While this procedure is used in service of base functionality, you can still call sys.sp_server_diagnostics directly yourself and find some pretty useful information embedded in the [data] column.  This is information that you could gather using other methods as well, but if you’re in a situation where you don’t have access to your script toolbox or if there are no existing data collection processes running on the SQL Server instance you have a quick way to snapshot the health of the SQL Server instance at a specific point in time. 

While this procedure is documented in Books Online, the output it returns may change in the future and so you should avoid creating application dependencies on it.   With that said, I think this is a great out-of-the-box tool and I can definitely see using it in the future to supplement other data sources or serve as a quick assessment in a pinch…

For this post I’ll capture a single execution of the stored procedure into a new table in tempdb:

USE tempdb

CREATE TABLE dbo.tmp_sp_server_diagnostics

               ([create_time] datetime,

                [component_name] nvarchar(20),

                [state] int,

                [state_desc] nvarchar(20),

                [data] xml)

INSERT dbo.tmp_sp_server_diagnostics

EXEC sys.sp_server_diagnostics

As an aside, I’m using the xml data type for the “data” column in the tmp_sp_server_diagnostics table – but that is not its actual data type according to Books Online (which states that it is varchar(max)). Aaron Bertrand (b | t) actually posted a Microsoft Connect item regarding this observation several months ago, discussing the merits of using the xml data type instead (one advantage being that the format is easier to display in SSMS). And if I designate the [data] column as xml for the destination output table, it implicitly converts upon INSERT just fine.

The following query returns the 5 rows (without doing anything fancy with the XML value):

SELECT create_time, component_name, state, state_desc, data

FROM dbo.tmp_sp_server_diagnostics

This returned:

create_time                                      component_name            state                     state_desc

2011-10-04 07:57:07.503              system                                1                            clean

2011-10-04 07:57:07.503              resource                             1                            clean

2011-10-04 07:57:07.503              query_processing             1                            clean

2011-10-04 07:57:07.503              io_subsystem                    1                            clean

2011-10-04 07:57:07.503              events                                 0                            unknown

I intentionally left off the xml [data] column off the previous result set as this is what I’ll be walking through over this post and the next. Each row represents a different monitored “component” – covering system, resource, query processing, IO subsystem and events. I won’t recap BOL on what each of the rows reflects – but instead I’ll jump into the output and call out why this is useful.

Since the [data] column was captured in the xml data type, clicking on any value in SQL Server Management Studio should bring you to the output in a separate, formatted window.  So on my system, the first row was for the “system” component, which had a value as follows:

<system sqlCpuUtilization="0" systemCpuUtilization="4" pageFaults="2654" nonYieldingTasksReported="0" intervalDumpRequests="0" totalDumpRequests="0" writeAccessViolationCount="0" isAccessViolationOccurred="0" latchWarnings="0" sickSpinlockTypeAfterAv="none" sickSpinlockType="none" spinlockBackoffs="0"/>

Based on the naming conventions we’re looking at - the following categories of information for the “system” component data get surfaced:

·        CPU utilization (sqlCpuUtilization, systemCpuUtilization)

·        Page Faults (pageFaults)

·        Non Yielding tasks (nonYieldingTasksReported)

·        Dumps (intervalDumpRequests, totalDumpRequests)

·        Access Violations (writeAccessViolationCount, isAccessViolationOccurred )

·        Latches (latchWarnings)

·        Spinlock activity (spinlockBackoffs, sickSpinlockType, sickSpinlockTypeAfterAv )

So basically we’re looking at a bundling of occurrence counts and current values that would be quite helpful to a support engineer.  I’ll be interested to find out what rules are used to assess “clean” state for the system component.  I imagine this can be reversed engineered through various scenarios, but hopefully the logic will be revealed directly.  The actual text of sp_server_diagnostics is “(server internal)” if you try looking at the definition.

Moving along, for the “resource” component data there are three areas of interest in the XML output. The first is around general resource monitoring:

<resource lastNotification="RESOURCE_MEMPHYSICAL_HIGH" outOfMemoryExceptions="0" isAnyPoolOutOfMemory="0" processOutOfMemoryPeriod="0">

So the lastNotification looks to be the memory manager’s latest report, with RESOURCE_MEMPHYSICAL_HIGH telling us that SQL Server can further consume more memory (in this example).   The other attributes cover out-of-memory statistics.

The second section of information from the “resource” component data is a memory report around Process and System counters and counts:

<memoryReport name="Process/System Counts" unit="Value">

    <entry description="Available Physical Memory" value="286638080" />

    <entry description="Available Virtual Memory" value="8792557727744" />

    <entry description="Available Paging File" value="1205690368" />

    <entry description="Working Set" value="184197120" />

    <entry description="Percent of Committed Memory in WS" value="100" />

    <entry description="Page Faults" value="70284" />

    <entry description="System physical memory high" value="1" />

    <entry description="System physical memory low" value="0" />

    <entry description="Process physical memory low" value="0" />

    <entry description="Process virtual memory low" value="0" />

  </memoryReport>

Again this is an example where we are pulling data (such as available physical and virtual memory) that could be pulled via other methods and sources.  But in absence of other ready-to-use queries, using sp_server_diagnostics you have a nice sampling of the data as of a point in time.

The third memory report from the “resource” component data covers Memory Manager counter values:

  <memoryReport name="Memory Manager" unit="KB">

    <entry description="VM Reserved" value="3098256" />

    <entry description="VM Committed" value="162896" />

    <entry description="Locked Pages Allocated" value="0" />

    <entry description="Large Pages Allocated" value="0" />

    <entry description="Emergency Memory" value="1024" />

    <entry description="Emergency Memory In Use" value="16" />

    <entry description="Target Committed" value="389912" />

    <entry description="Current Committed" value="162896" />

    <entry description="Pages Allocated" value="109672" />

    <entry description="Pages Reserved" value="0" />

    <entry description="Pages Free" value="3464" />

    <entry description="Pages In Use" value="122536" />

    <entry description="Page Alloc Potential" value="873184" />

    <entry description="NUMA Growth Phase" value="0" />

    <entry description="Last OOM Factor" value="0" />

    <entry description="Last OS Error" value="0" />

  </memoryReport>

</resource>

In this case, although this summarized data is good, I do think collecting DBCC MEMORYSTATUS would still be useful (more detailed) along with perfmon data so I can see memory trends over time. As with any new feature, the real test of utility is in using it during an actual issue and seeing if it helps.

In Part II of this post I’ll cover the “query_processing” (my favorite out of the bunch), “io_subsystem” and “events” component data.  At a future point I’d also like to share  queries for parsing out the data and also comparing samplings over time, particularly since sp_server_diagnostics has a built-in repeat interval capability, enabling it to sample data every X number of seconds.

Categories:
Denali | Performance

Theme design by Nukeation based on Jelle Druyts