Updated wait stats script for performance and 2014

Very short blog post to let you all know that I’ve updated my wait stats script so that it works on 2014 and also now runs very fast. If you’re using my script, please replace it with the new one.

Check it out on the original post: Wait statistics, or please tell me where it hurts.

Enjoy!

When is fast recovery used?

It’s been a bit light on technical posts here over the last few months but now that summer’s over I’ll be ramping up again with lots of cool stuff planned.

First up is a question that came up on the MCM distribution list this morning. There was a discussion of fast recovery (which I explained in detail in the post Lock Logging and Fast Recovery back in 2009), but in a nutshell is the ability of Enterprise Edition to allow access to a database after the REDO (rolling forward committed transactions) phase of crash recovery has completed and before the UNDO (rolling back uncommitted transactions) phase of crash recovery has completed. The idea is that UNDO can take much longer than REDO, so early access to the database is a good thing, hence it being an Enterprise Edition feature (from SQL Server 2005 onward).

The question essentially became: when is fast recovery used?

The answer is that it’s used whenever a database is started up and needs to have recovery run on it. This means fast recovery will be used:

  • When SQL Server starts up after a crash or shutdown where a database was not cleanly shut down
  • After a cluster failover
  • After a database mirroring failover
  • After an availability group failover
  • When a database state is changed to ONLINE and crash recovery needs to be run

Note that I did not include:

  • When restoring a database from backups
  • When bringing a log shipping secondary database online (this is restoring from backups)

Fast recovery is NOT used during a restore operation. You’ll read in some places online that it is, but those places are incorrect.

So why isn’t it used during a restore sequence?

It’s to do with the underlying mechanism that allows fast recovery. Operations that make changes to a database are logged, and the log record includes a bitmap of what locks were held at the time (examples of this are in the blog post I referenced above). When crash recovery runs, the REDO phase also acquires all the locks necessary to do the UNDO phase, as the REDO phase knows which transactions in the log being recovered need to be rolled back. At the end of the REDO phase, access can be given to the database because recovery can guarantee that no user can block the UNDO phase, as the UNDO phase locks are already held.

So why doesn’t that mechanism work for restores? Well restore doesn’t do one REDO and one UNDO like crash recovery does. For each backup that is restored in the restore sequence, the REDO phase of recovery is performed. This avoids having a really long REDO phase at the end of the restore sequence (which could be, say, a week’s worth of transactions spread over tens or hundreds of backups), and having to have a huge transaction log to hold all those log records.

At the end of the restore sequence, all necessary REDO has already been performed, but the REDO operations have NOT been acquiring UNDO locks. The UNDO locks aren’t acquired because UNDO isn’t likely to be the next phase during a restore sequence. It’s likely to be another restore operation. In that case, it’s likely that some of the transactions that were uncommitted at the end of the last restore become committed during the next restore, so if UNDO locks had been acquired, they would have to be released again. This would involve either rescanning the log records involved or keeping track of which in-restore transactions had acquired which locks. Either of these would be complicated and time consuming, so the benefit hasn’t been deemed worthwhile for the engineering effort involved.

So no fast recovery during restores.

But hold on, I hear you say, database mirroring is just a constant REDO process so how come fast recovery works for that? Back in SQL Server 2005, when a database mirroring failover occurred, the database was momentarily set offline so that full crash recovery would be run when the database came back online, hence allowing fast recovery to work. From SQL Server 2008 onward, that doesn’t happen any more, so there is a mechanism that figures out what UNDO locks are necessary when a mirroring failover occurs, allowing fast recovery behavior. I guess technically that same mechanism could be ported over to the restore code base, but I think it would be difficult to do, and I don’t think there’s enough demand to make the engineering effort and possible destabilization of the restore code worthwhile.

Hope this helps explain things – let me know if you have any questions.

New SQLskills Pluralsight courses

We’ve just released two new courses on Pluralsight:

The first new course is the conclusion of Joe Sack’s query-tuning course: SQL Server: Common Query Tuning Problems and Solutions – Part 2.

The second new course is Kimberly’s log-awaited SQL Server: Optimizing Stored Procedure Performance. It’s more than seven hours long and the abstract is:

When using stored procedures, their performance is dependent on creating the optimal plan for the execution environment and making sure that it remains cached. This comprehensive course will show you all of that and more, and is applicable to SQL Server developers and anyone responsible for ensuring performance of stored procedures, from SQL Server 2005 onward.

The module list is:

  • Introduction
  • Why Use Stored Procedures?
  • Creation, Compilation, and Invalidation Section 1
  • Creation, Compilation, and Invalidation Section 2
  • Optimization and Recompilation Section 1
  • Optimization and Recompilation Section 2
  • Optimization and Recompilation Section 3
  • Optimization and Recompilation Section 4
  • Other Concerns and Considerations

Check them out!

Also, some people have questioned the Basic/Intermediate/Advanced ratings and how they relate to our in-person classes. Any courses rated Intermediate on Pluralsight are the same level as our in-person content.

We’ve got a lot more courses in production and planned for the remainder of 2014 around DMVs, more stored procedure performance, Service Broker, Change Data Capture, fragmentation, and more. Watch this space (and the Insider Newsletter) for details.

Thanks!

How to ask questions politely and correctly

This post is a much-expanded version of the editorial in my last bi-weekly SQLskills Insider newsletter, including a bunch of the feedback I got.

Recently I’ve been complaining on Twitter about some of the ridiculous questions I receive over email (I get a bunch of SQL Server questions every day from random people across the world), where it was clear that the senders had done no research at all before sending me an email, or they demanded a response ASAP, or they didn’t say please or thank you, or a combination of all three. It gets pretty frustrating and it bugs me when people are demonstrably lazy and/or rude (usually without meaning to be). I see the same thing in forum questions too and occasionally on #sqlhelp on Twitter.

Don’t get me wrong, I like getting SQL Server questions in email, and I encourage class students to do it. I’m not being a grumpy old curmudgeon: I think people should learn how to ask questions properly. If you’re old enough to be responsible for SQL Server, you should have basic human-human communication skills.

I had feedback from the newsletter that some people might not know how to use Google/Bing. In this day and age, if someone is responsible for SQL Server in any way, shape, or form, and they don’t know how to search online, they shouldn’t be allowed near any I.T. infrastructure IMHO. I had a few people suggest sending Let Me Google That For You links as replies to emails where the answer can easily be found online, but I think that’s too snarky of a response and I don’t like to be rude.

I came up with a handy rule: if my 14-yr old smart daughter could find the answer on Google from your question, you’d shouldn’t post or ask in email.

But what about the concern of over-use of Google, so people never remember anything and rely on finding the answer online? And what if the answer is wrong? (Sometimes my life is this! :-) Yes, these are legitimate concerns. So maybe do the research and then ask a clarifying question or ask for corroboration from an authoritative source. And research the methodology for solving your problem, not just ‘the answer’. Teach yourself to fish rather than just finding the fish (mangling the old saying…)

And what about the concern about people wasting hours searching on Google rather than spending 5 minutes asking a question? That’s between you and your manager to work out how your company wants you to spend your time. But if you’re sending a random email question to someone *outside* your company, due diligence is not optional unless you’re paying for help.

This all brings to mind a training course from my days at Microsoft entitled Precision Questioning. The course taught how to hone your questioning technique for maximum efficacy, saving time for the questioner and the person/group being questioned. It also stressed the need to do due diligence in trying to find the answer to your question before taking up someone else’s time to help. I think something similar would be of great benefit to technical communities on the Internet where people ask questions.

Here are some quick guidelines for asking SQL Server questions I came up with (many of these apply to asking questions in general):

  • State your problem as unambiguously as possible.
  • State what research you’ve done before asking the question (this show’s you’ve done due diligence). You could even say what you searched for on Google/Bing.
  • State the SQL Server version/SP/CU/build you’re using.
  • State what you’ve done to try to alleviate your problem so far, if anything.
  • Ideally, include all Transact-SQL code to reproduce the phenomena (or, at least the details of the schema, indexes, query, query plan etc.) – just use your best judgment and include these as an annotated/commented script.
  • Include whatever results are pertinent.
  • Try to anticipate what people would ask to get more information, and provide answers in your question.
  • If you’re sending an unsolicited email to someone like me, consider this: if it’s going to take more than a minute or two to read the question, the answer’s going to be something like: this is too long to do in a quick email, please post on forum XYZ or let me know if you’d like some formal consulting help. Free questions are cool (to me), but don’t abuse the privilege. Note that this doesn’t apply if you know me or you’ve been invited to send questions (e.g. from being in a class). You might still get that same answer though, but it won’t be because you’ve abused the privilege.
  • Mandatory: say please and thank you.
  • Don’t demand urgent or immediate help. The only people who can ask for urgent assistance or use ASAP in their email are colleagues and people paying for help. Unless you’re in one of those categories, you’re not entitled to help – doesn’t matter how urgent your problem is. If your problem really is urgent, put yourself into one of those two categories. And even if you’re in one of those categories, make sure you frame the request so it isn’t too demanding as that can come across as rude.
  • DO NOT SEND THE SAME EMAIL QUESTION TO MULTIPLE PEOPLE INDIVIDUALLY. Seriously, this wastes people’s time by having multiple people respond to you. That’s what a forum is for. This is the #1 thing that annoys me and 100% guarantees a response stating that they will not get help from us (unless they’re paying, of course :-) Some people also expand this rule to include not posting the same question to multiple forums at the same time.

You might say to all of this that I have to consider the culture of the person asking the question and cut them some slack if they don’t say please or thank you, for instance. NO!!!! *They* need to consider *my* culture. That’s how communication works. The onus is on the person asking for help to make sure they’re communicating appropriately. This is *especially* true when sending random email. Along these lines, there’s a really great book that I recommend to anyone who interacts in a business setting with people outside their own culture: Kiss, Bow, or Shake Hands (and all its more focused companion titles).

If you’re using #sqlhelp on Twitter, most of this isn’t possible, and it’s accepted to not take up valuable tweet space with please and thank you. But asking a question over two tweets, using 1/2 and 2/2 is perfectly acceptable, and putting ‘Looked on Google already’ shows due diligence. You might be asked to post a longer question on a forum, in which case you should do so.

Whatever your chosen method for asking, make sure you respond to those who respond to you. Nothing puts people off from answering questions more than a complete lack of any thanks, or response to a clarifying question. And it’s just common courtesy (sadly lacking these days, especially in the relative anonymity of the internet).

I could sum all of this up by just asking people to be polite, conscientious, and grateful when asking questions.

If you’re reading this and thinking ‘OMG I sent a question to Paul and didn’t do those things, now he hates me’, you can stop panicking. If you really weren’t polite, demanded urgent help, or showed lack of any research, I would have let you know in my reply.

And don’t forget, this isn’t just for asking questions online, these guidelines should help within your company as well. Feel free to use this list, forward this post, or use it in a way that will help you or your company. I don’t mean this to be a marketing post, but a lot of what I have here is covered in my Communications: How to Talk, Write, Present, and Get Ahead! course on Pluralsight.

Now, where can you ask your question online?

And a whole bunch of other places where people give out help for free, but these are the main ones as far as I’m concerned.

To finish off, I leave you with two blog posts I wrote three years ago:

The more information you can provide, show you’ve tried on your own to find an answer, and are polite and grateful, the more likely you’ll get useful answers quickly. And if you’re sending direct email, you’d better do all of this if you want a reply. Often you’ll find that going through the rigmarole of putting all this into your question will help you get to the answer on your own, so problem solved!

You’ll be amazed at the great help you can get online for free… if you ask properly!

PS Happy to hear your thoughts on this. If I’ve missed anything I’ll update the post based on your comment.

New SQLskills Pluralsight courses

We’ve just released a new course on Pluralsight, and there were a couple of other courses released earlier this year that you may have missed.

The new course is Joe Sack’s SQL Server: Common Query Tuning Problems and Solutions – Part 1. It’s almost three hours long and the abstract is:

There are a wide variety of common performance problems that you will encounter when query tuning. Part 1 of this two-part course begins by showing how to validate query execution assumptions around statistics and cardinality in order to ensure you’re troubleshooting the true root cause of a query performance issue. You’ll then learn about common areas where bad assumptions can occur that negatively impact overall query performance, and about common problems and solutions related to being imprecise when querying a database. The course is very demo-centric, with lots of practical tips and tricks you’ll be able to apply in your environment. This course is perfect for developers, DBAs, and anyone responsible for query performance tuning on SQL Server, from SQL Server 2005 onward.

The other recent courses are:

  • Glenn’s SQL Server 2014 DMV Diagnostic Queries: Part 1 (1hour)
    • It is very common for SQL Server instances to be configured incorrectly in one or more ways, and this often has a detrimental effect on workload performance. This course will describe and demonstrate more than 20 DMV queries you can use to easily examine SQL Server instance configuration information and identify problems. This course is perfect for anyone who is responsible for a SQL Server instance, with most topics also applicable to SQL Server 2012 and earlier versions.
  • Glenn’s Scaling SQL Server 2012 – Part 1 (4 hours)
    • When considering how to improve the scalability of a SQL Server workload, many people jump to the conclusion that scaling up or scaling out are the only answers. This course will show you that there are a huge number of things you can do to improve performance and scalability before you should even think of scale-up or scale-out. The bulk of the course describes architecture and design issues in both the application and the database that can affect scalability, along with demonstrations of some of the issues and detailed scripts to help you find the issues in the first place. The course ends with a section on common indexing problems that can drastically limit a workload’s scalability, along with how to identify and fix them. Save your company a bunch of money by postponing the scaling decision through fixing your existing scalability bottlenecks, not ignoring them! This course is perfect for anyone who is responsible for improving the performance and scalability of a SQL Server workload, with most topics applicable to SQL Server 2005 onward.
  • Erin’s Supporting SQL Server ISV Applications (2 hours)
    • This course follows on from the Developing and Deploying SQL Server ISV Applications course and describes how to effectively support your application and provide the best experience for your customers and your company, whether you work for a large or small ISV and are creating a complex or simple application. The course discusses how to give customers recommendations, how to best support customer implementation of your application, and how to maintain the application database. It then explains how to deal with customer performance issues and overly-persistent customers, plus the importance of capturing usage information from customers and how to make use of it. The course wraps up with a discussion of how to build good relationships and a community with your customers, and how to create meaningful and useful documentation. This course is perfect for anyone involved in supporting applications that use SQL Server for data storage. The information in the course applies to all versions from SQL Server 2005 onward.

Check them out!

We’ve got a lot more courses in production and planned for the remainder of 2014 around DMVs, stored procedure performance, Service Broker, Change Data Capture, fragmentation, and more. Watch this space (and the Insider Newsletter) for details.

Thanks

SQL Server 2005 survey from the SQL Server Team

SQL Server 2005 will end Extended Support on 4/12/2016 (see the Support Lifecyle page for details) and the SQL Server Team in Microsoft is conducting a survey with customers to gain some insight that will help with development of the next version of SQL Server. I volunteered to host a public version of the survey on my blog, take responses from you, and forward them on to the relevant SQL Server Team members.

If you’d like to participate, please cut-and-paste the survey below and email me with your answers. I’ll pass them on, stripping out your contact details if you specify that.

I’ve made the survey into plain text to ease c&p issues. Please don’t put your answers in a comment, just send them in email. I’ll delete your email once I’ve processed it.

[Edit 9/18/14: Please don't send any more answers - the SQL Server team has enough results. Thanks!]

Thanks!

Do you want to be anonymous?

Questions from the SQL Server Team:
1) Is your company still using SQL 2005 for applications in production?
   a) If so, how important are the apps (mission-critical, LOB, etc.)
   b) If so, which SKU (Express, Standard, Enterprise)?

2) In the following list of features, are you or your company still relying on them?
   a) SQL Mail
   b) CREATE/ALTER TABLE or CREATE INDEX syntax without parentheses around table options
   c) sp_configure options 'user instances enabled' and 'user instance timeout'
   d) SQL Server Database Management Objects (SQL-DMO)
   e) sp_dboption
   f) DATABASEPROPERTY
   g) UPDATE table1, table2, ... SET syntax
   h) Database compatibility level 80

3) Are there any features in SQL Server 2005 that has been announced deprecated that you still rely on?
   a) The deprecated features list is: http://technet.microsoft.com/en-us/library/ms143729(v=sql.90).aspx

4) If you're looking to upgrade from SQL Server 2005:
   a) To which SQL Server version are you looking at upgrading to?
   b) What are the biggest upgrade blockers you see?

Many thanks for any information you can provide!

Australia classes in Sydney and Canberra in December

I’ve just released our final classes of the year for registration!

We will be coming to Sydney and Canberra in Australia and teaching our new IEPTO1: Immersion Event on Performance Tuning and Optimization – Part 1 :

  • From the curriculum: As well as optimization techniques, this course will also help with design and architecture so you can prevent performance and scalability problems from happening. The cores of this class are comprehensive coverage of indexing and statistics strategies: a SQL Server workload will not perform well unless these are designed, implemented, and tuned correctly. You will also learn why and how to optimize transaction log operations, tempdb and data file configuration, transactions and isolation levels, and locking and blocking.

Our classes in Australia will make use of the excellent Cliftons facilities in both cities, and the price includes breakfast, lunch, and coffee/tea/snacks through the day.

The dates of the classes are:

  • Sydney, NSW, December 8-12
  • Canberra, ACT, December 15-19

The class cost is US$3,995 plus GST, with an early-bird price of US$3,495 plus GST (cut-off dates vary by class).

For both classes, you can register for US$3,195 plus GST through July 31st only – SUPER-early-bird price!

You can get all the logistical, registration, and curriculum details by drilling down from our main schedule page.

We hope to see you there!

July pricing special for new October classes

To celebrate our newly revamped IE1 and IE2 classes becoming IEPTO1 and IEPTO2, through the month of July you can register for either October class in Chicago at the super-early-bird price of $2,995. This price is usually reserved just for our alumnus students. At the end of July, the price will revert to the normal early-bird price of $3,295.

You can get all the logistical, registration, and curriculum details by drilling down from our main schedule page.

Hope to see you there!

PS The classes in Australia in December will be published on July 1st next week!

Revamped IE1 and IE2 classes open for registration in October

I’ve just released our final US classes this year for registration!

We’ve revamped our IE1 and IE2 classes so they’re now both focused on performance and together they form a comprehensive, 10-day performance tuning and optimization course.

  • IE1 is now IEPTO1: Immersion Event on Performance Tuning and Optimization – Part 1 (see here for the revamped curriculum)
    • From the curriculum: As well as optimization techniques, this course will also help with design and architecture so you can prevent performance and scalability problems from happening. The cores of this class are comprehensive coverage of indexing and statistics strategies: a SQL Server workload will not perform well unless these are designed, implemented, and tuned correctly. You will also learn why and how to optimize transaction log operations, tempdb and data file configuration, transactions and isolation levels, and locking and blocking.
  • IE2 is now IEPTO2: Immersion Event on Performance Tuning and Optimization – Part 2 (see here for the revamped curriculum)
    • From the curriculum: The core of this class is understanding resource usage and we will cover in-depth all the areas of concern for a SQL Server workload: I/O, CPU usage, memory usage, query plans, statement execution, parameter sniffing and procedural code, deadlocking, and the plan cache. You will learn how to use specific tools and techniques for analyzing SQL Server: creating and using performance baselines, benchmarking tools, wait and latch statistics, Extended Events, DMVs, and PerfMon. These techniques will be highly adaptable to whatever situation you encounter and you will understand not just how to capture performance data but also how to interpret it, so you can derive answers to your own performance problems rather than relying on someone giving you the answer.

As before, these classes both stand alone perfectly well, but we strongly recommend taking IEPTO1 before IEPTO2 as all its knowledge is assumed in IEPTO2.

If you’ve taken IE1 previously, you don’t need to go back and take IEPTO1. If you’ve already taken IE1, IEPTO2 is the next course for you. If you’ve already taken IE2 but not IE1, we recommend you take IEPTO1.

Our classes in October will be in Chicago, at our usual hotel in Oakbrook Terrace:

  • IE0: Immersion Event for Junior/Accidental DBAs
    • October 6-8
  • IEPTO1: Immersion Event on P.T.O. – Part 1
    • October 6-10 – special super-early-bird pricing through July!
  • IEPTO2: Immersion Event on P.T.O. – Part 2
    • October 13-17 - special super-early-bird pricing through July!

You can get all the logistical, registration, and curriculum details by drilling down from our main schedule page.

We hope to see you there!

Are mixed pages removed by an index rebuild?

This is a question that came up this morning during our IE1 class that I thought would make an interesting blog post as there are some twists to the answer.

The first 8 pages that are allocated to an allocation unit are mixed pages from mixed extents, unless trace flag 1118 is enabled.

See the following blog posts for more info:

Assuming that mixed pages are not disabled with trace flag 1118, does an index rebuild remove all mixed pages or not?

Let’s investigate. First I’ll create a clustered index with 1,000 data pages:

CREATE TABLE [MixedTest] ([c1] BIGINT IDENTITY, [c2] CHAR (8000) DEFAULT 'a');
CREATE CLUSTERED INDEX [MixedTest_CL] ON [MixedTest] ([c1]);
SET NOCOUNT ON;
GO
INSERT INTO [MixedTest] DEFAULT VALUES;
GO 1000

And then make sure that we have mixed pages be examining the first IAM page in the clustered index’s IAM chain. You can get the sp_AllocationMetadata proc here.

EXEC [sp_AllocationMetadata] N'MixedTest';
GO
Object Name   Index ID   Alloc Unit ID       Alloc Unit Type   First Page   Root Page   First IAM Page
------------  ---------  ------------------  ----------------  -----------  ----------  ---------------
MixedTest     1          72057594046185472   IN_ROW_DATA       (1:987)      (1:1732)    (1:988)
DBCC TRACEON (3604);
DBCC PAGE (N'master', 1, 988, 3);
GO

(I’m just including the relevant portion of the DBCC PAGE output here…)

<snip>
IAM: Single Page Allocations @0x00000000227EA08E

Slot 0 = (1:987)                    Slot 1 = (1:989)                    Slot 2 = (1:990)
Slot 3 = (1:991)                    Slot 4 = (1:1816)                   Slot 5 = (1:1817)
Slot 6 = (1:1818)                   Slot 7 = (1:1819)
<snip>

Now I’ll do an offline index rebuild of the clustered index, and look again at the IAM page contents (assume I’m running the proc and DBCC PAGE after the rebuild):

ALTER INDEX [MixedTest_CL] ON [MixedTest] REBUILD;
GO
<snip>
IAM: Single Page Allocations @0x0000000023B0A08E

Slot 0 = (1:1820)                   Slot 1 = (1:446)                    Slot 2 = (1:1032)
Slot 3 = (0:0)                      Slot 4 = (1:1035)                   Slot 5 = (1:1034)
Slot 6 = (1:1037)                   Slot 7 = (1:1036)
<snip>

So the answer is no, an index rebuild does not remove mixed page allocations. Only trace flag 1118 does that.

But this is interesting – there are only 7 mixed pages in the singe-page slot array above. What happened? The answer is that the offline index rebuild ran in parallel, with each thread building a partial index, and then these are stitched together. The ‘stitching together’ operation will cause some of the non-leaf index pages to be deallocated as their contents are merged together. This explains the deallocated page that was originally tracked by entry 3 in the slot array.

Let’s try an offline index rebuild that forces a serial plan.

ALTER INDEX [MixedTest_CL] ON [MixedTest] REBUILD WITH (MAXDOP = 1);
GO
<snip>
IAM: Single Page Allocations @0x0000000023B0A08E

Slot 0 = (1:1822)                   Slot 1 = (1:1823)                   Slot 2 = (1:291)
Slot 3 = (1:292)                    Slot 4 = (0:0)                      Slot 5 = (0:0)
Slot 6 = (0:0)                      Slot 7 = (0:0)
<snip>

In this case there is only one index (i.e. no parallel mini indexes) being built so there are no pages being deallocated in the new index as there is no stitching operation. But why aren’t there 8 mixed pages? This is because during the build phase of the new index, the leaf-level pages are taken from bulk-allocated dedicated extents, regardless of the recovery model in use. The mixed pages are non-leaf index pages (which you can prove to yourself using DBCC PAGE).

For parallel and single-threaded online index operations, the same two patterns occur as for offline index rebuilds, even though the algorithm is slightly different.

Enjoy!