Statistics Starters Presentation: Scripts, the Database, and Some Answers

This past Tuesday I presented “Statistics Starters” for the PASS DBA Fundamentals Virtual Chapter.  You can read the abstract here, and as you may have guessed from the title, it was a 200 level session on statistics appropriate for anyone who knew of statistics in SQL Server, but wasn’t exactly sure how they were created, how they were updated, how to view them, etc.  Over 300 people attended (thank you!) and I had some great questions.  I plan to answer the questions in a series of posts, starting with this one.

Question: Can we get a copy of the scripts in your demo?  And where can we get a copy of the database you used?

Answer: The scripts, slide deck, and database can be downloaded from the SQLskills demos and databases resource page.  The database I used for these demos, which I plan to continue to use for presentations, is the Lahman baseball database.  While the AdventureWorks database is well known and widely-used, I admit that I have a hard time thinking of good Sales and Product examples in my demos.  I know baseball a lot better than I know sales 🙂

Question: Are we able to rollback newly created statistics if the plans created after an update are bad?

Answer: (edited 2014-01-23 2:45 pm) Great question.  The answer is no kind of.  This is one feature that exists in Oracle that I would be interested in seeing in SQL Server.  Oracle provides the ability to save and restore statistics.  You can even export statistics from one database and import them into another.  Pretty cool…potentially dangerous, but still cool; however, it is not possible to restore statistics in SQL Server if you save out the stat stream first, and then update the statistic with the stream.  Thanks to my colleague Bob Beauchemin (b) for pointing out how it can be done (I learn something new every day).  Johan Bijnens also messaged me to point out that you can script out statistics – which I always forget.  The next step is to update statistics with stats_stream that you script out.  Take note: it is a hack.  Thomas Kejser blogged the steps here, and he has a fantastic disclaimer at the beginning because the method described is unsupported.  Before I write any more about the “feature”, I’m going to do a little testing and hacking of my own.  More to come!

Question: Why should I use the UPDATE STATISTICS command…isn’t sp_updatestats always the best option?

Answer: See my post Understanding What sp_updatestats Really Updates to see why I don’t recommend using sp_updatestats.

Question: Is it good to update statistics after rebuilding an index?

Answer: This is not recommended.  Remember that rebuilding an index updates statistics with a full scan – if you run a command to update statistics after a rebuild, you are wasting resources and the statistics may update with a smaller sample.   This is sometimes not ideal, because depending on the sample, it can provide less accurate information to the optimizer (not always, but it’s possible).

Question: Is it good practice to update statistics if I reorganize the index?

Answer: In general, yes, because reorganizing an index does not update statistics.  I recommend that you pro-actively manage your statistics, and not rely solely on automatic updates (assuming you have the AUTO UPDATE STATISTICS option enabled for your database).  If you are only reorganizing your indexes, make sure that you have another step or job that does update statistics.  If you either rebuild or reorg (or do nothing) based on the level of fragmentation, then you need to make sure you manage statistics accordingly (e.g., don’t update if a rebuild has occurred, do update if you’ve reorganized).

I’ll answer a few more questions in my next post, thanks for reading!

2 thoughts on “Statistics Starters Presentation: Scripts, the Database, and Some Answers

  1. Hi Erin,
    I have went through your articles on statistics. Very helpful to understand it. Now i would like to know two things. how histogram will create, what algorithm it will follow to create histogram in statistics? And also would like to know, how default sampling rate will calculate by sql server while updating statistics automatically?

    Please help me to know above things.

    Thanks

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

A Fond Farewell

If you haven’t guessed from the title, I’m writing this post because I am leaving SQLskills. This Friday, January 14th, is my last day, and

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.