Wednesday, April 16, 2008

In my blog post on my new sp_helpindex proc (sp_helpindex2), I mentioned that the indexes in my sample were not necessarily a recommended set of indexes - just a test set of indexes. So... in this post, I thought I'd start a series on indexes, limitations and best practices/uses... Especially, why/how to best choose when to use INCLUDE v. having columns in your key. To start, I thought I'd give some background, as well as limitations that exist in various releases from 2005 to 2008 CTP6 (Feb CTP), plus what's expected in the SQL Server 2008 RTM (ah... I did say "expected" so don't come back and yell at me if/when I'm wrong ;-))

First, let's go through a few rules and limitations and background:

SQL Server 2005

  • 250 total indexes per table: 1 clustered index and up to 249 nonclustered indexes (no, this is not a goal!)
  • The index key can be up to 16 columns OR 900 bytes - whichever comes first...
  • The leaf level is limited just as a table is limited to 1024 columns (and, all column types are acceptable in the leaf level of an index - even LOB columns)
  • Statistics are kept for every index (so, up to 250 index-related stats) and there can also be statistics on columns or sets of columns in addition to the index-related stats. In earlier releases, statistics used index ids and as a result, the number of statistics were limited to 250 total minus the statistics used by indexes... in SQL Server 2005, they changed to having statistics kept/managed separately (use sys.stats to see them). As a result of using sys.stats, you can now have 2000 statistics on a table, in addition to the 250 (total) indexes and their statistics. If you want to test this out (and check it on various versions of SQL Server), use this script to setup a test database, a test table and then use dynamic string execution to loop through (until it errors) with creating nonclustered indexes and statistics: testindexmax.zip (.47 KB)

SQL Server 2008 CTP6

  • So far, it seems as though most of the maximums have not yet been lifted...
  • 250 total indexes per table: 1 clustered index and up to 249 nonclustered indexes (and this number  - for CTP6 - includes filtered indexes AND spatial indexes too!)
  • The index key limit hasn't changed (it can be up to 16 columns OR 900 bytes - whichever comes first)
  • The leaf level is still limited just as a table is limited to 1024 columns (and, all column types are still acceptable in the leaf level of an index)
  • Statistics in CTP6 seem to be limited to only 2000 stats per table...

SQL Server 2008 RTM (expected/target... no guarantees on this one :)

  • 30,000 columns per table (mostly to allow sparse columns)
  • 1,000 total indexes per table: 1 clustered index and up to 999 nonclustered indexes. This is also not a goal BUT, it makes sense because of both sparse columns and filtered indexes. Both Paul and I will try to post some entries about sparse columns and filtered indexes in the coming days...
  • The index key limit won't change
  • The leaf level is will be limited just as a table is limited to 30,000 columns (and, all column types are still acceptable in the leaf level of an index)
  • Statistics are also said to be increasing and likely to 30,000... And, for having extra statistics just sitting around and possibly not being used - well, outside of a minimal amount of disk space taken by the stat blob (which does start to get interesting at 1,000s I suppose), even stats that don't get used don't really create much of a problem. So, I'm OK with this one increasing - even significantly - but I have to admit I'm somewhat nervous about the significan't increase in indexes.........

So... you can have A LOT more indexes in SQL Server 2008 but just because you can - DOES it mean that you should?!

And on that - I'll leave you hanging for my next post where I start to talk about WHY they're increasing this (hint: sparse columns and filtered indexes = more columns/more indexes)....

Have fun,
kt

Wednesday, April 16, 2008 1:44:41 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, April 02, 2008

OK - so this has been frustrating me for many months... when you create indexes with included columns (which was a new feature of SQL Server 2005), they're not shown by sp_helpindex or by DBCC SHOW_STATISTICS. I understand this not showing for statistics because included columns are not factored into the histogram (that's only the high order element which is the first column in the index) OR the density vector (which only shows the densities (or averages) for the left-based subsets of the key). So, why doesn't sp_helpindex show it? Well... I guess it just didn't get updated for SQL 2005. So, in SQL 2008, I was hoping I'd not only see included columns but also filtered indexes... well, neither is there and sp_helpindex is still the same old proc. Don't get me wrong, you can use SSMS to see all of the index properties for a single index (pane, by pane for each property) OR you can run queries to find the included columns for a given index:

SELECT
(CASE ic.key_ordinal WHENTHEN CAST(AS tinyint) ELSE ic.key_ordinal END) AS [ID],
clmns.name AS [Name],
CAST(COLUMNPROPERTY(ic.object_id, clmns.name, N'IsComputed') AS bit) AS [IsComputed],
ic.is_descending_key AS [Descending],
ic.is_included_column AS [IsIncluded]
FROM sys.tables AS tbl
   
INNER JOIN sys.indexes AS
      
ON (i.index_id >AND i.is_hypothetical = 0) AND (i.object_id = tbl.object_id)
   INNER
JOIN sys.index_columns AS ic 
      
ON (ic.column_id >AND (ic.key_ordinal >OR ic.partition_ordinal =OR ic.is_included_column != 0)) 
         
AND (ic.index_id = CAST(i.index_id AS int) AND ic.object_id = i.object_id)
   INNER
JOIN sys.columns AS clmns 
   
ON clmns.object_id = ic.object_id AND clmns.column_id = ic.column_id
WHERE (i.name = N'[MyIndex]') AND ((tbl.name = N'[MyTable]' AND SCHEMA_NAME(tbl.schema_id) = N'[MySchema]'))
ORDER BY IsIncluded, [ID] ASC

but, there isn't a nice clean way to show all of the included columns for all indexes for a particular table... until now :)

A couple of weeks ago I sat down and rewrote sp_helpindex. I was actually on a plane from Hyderabad to Frankfurt or from Frankfurt to San Fran or from San Fran to Seattle (it was a long day :) and I was using (and well, forcing myself to learn how to use :) my new Vista laptop. OK, that's a HUGE story in and of itself and it definitely warrants its own post but I'll sum up the story with the fact that I had to purchase a new laptop while in Hyderabad because BOTH my primary laptop (T61p) AND my backup laptop (T60p) BOTH (yes, BOTH!!!) suffered catastrophic disk failures on their boot drives within 24 hours of each other. In the end, I really cannot believe the "coincidence" of two laptops crashing within 24 hours of each other. Yes, I thought MTBF too (at first) but the laptops were two Lenovos - one Lenovo (the T60p) was purchased in Feb 2007 and the second, a Lenovo T61p was purchase in Oct 2007. And, it was the T61p that went first. The only thing I can even begin to speculate about and/or think to attribute it to (as I was in India for 17 days from Mar 3 through Mar 20 and this all started on Mar 17) was an overactive metal detector at the hotel at which I was staying (or something related to St. Patrick but I've since ruled that out - and no, I wasn't drinking green beer either...). OK, I really need to do another post to give you all of the details about this trip BUT, I did get a new laptop... and, having just bought it only shortly before I flew back I felt like I really needed to get my money's worth so I just *had* to work on the flights home (ah, security with *3* laptops was NOT fun and I'm *VERY* glad that none of them asked me to "boot" my laptops to prove they were working... that could have been a VERY bad situation... lol).

OK - so back to the story... I was working on the flights and I was preparing to deliver some content on the Friday after I returned (yes, I taught a full day in India on Wednesday then flew back leaving India at 2:15am Thursday morning so that I could arrive back in Redmond at roughly 7pm Thursday night - about 30 hours later - and then teach Friday morning for an 8:30 start time... ah, I was *really* tired on Friday night :). Anyway, in preparing, I decided that I finally needed to re-write sp_helpindex. When I was first writing it, I was only thinking of SQL Server 2005. So, here's the 2005 version that I wrote: sp_helpindex2_2005.zip (2.71 KB).

So, I had wanted to blog that when I got back to Redmond but in preparing for the trip we're on now AND rebuilding my primary and backup laptops, well, it got tabled. So now, today, Paul and I are in Iceland (working with our great friends at Miracle Iceland) and we're teaching "the Accidental DBA" (this past Monday) and SQL Server 2008 New Features in Database Infrastructure and Scalability (Tue through Thursday)... I was giving a lecture on Filtered Indexes in SQL Server 2008 and I, once again, found myself needing a better sp_helpindex. So, when Paul got up to talk about Compression (which is no short lecture for him :), I had time to rewrite sp_helpindex... again. And, here's what I ended up with...

exec sp_helpindex2 'member'

index_name index_description index_keys included_columns filter_definition
member_corporation_link nonclustered located on PRIMARY corp_no NULL NULL
member_ident clustered, unique, primary key located on PRIMARY member_no NULL NULL
member_region_link nonclustered located on PRIMARY region_no NULL NULL
NCIndexCoveringLnFnMiIncludePhone nonclustered located on PRIMARY lastname, firstname, middleinitial phone_no NULL
NCIndexCoversAll4Cols nonclustered located on PRIMARY lastname, firstname, middleinitial, phone_no NULL NULL
NCIndexLNinKeyInclude3OtherCols nonclustered located on PRIMARY lastname firstname, middleinitial, phone_no NULL
NCIndexLNOnly nonclustered located on PRIMARY lastname NULL NULL
QuickFilterTest nonclustered located on PRIMARY lastname phone ([lastname]>'S' AND [lastname]<'T')

So, in the end, I can quickly see whether or not my index has included_columns and/or a filter_definition. Don't get me wrong, these indexes above are NOT necessarily a good combination of indexes (or recommendation of ANY kind) to have - these were just created to make sure that my code works. And, as my good friend Gunnar would say - "it's not my best code but it's not my worst code either" <G>. And, so, here it is: sp_helpindex2_2008.zip (2.75 KB).

Pretty darn useful for sure! Oh, and I used the undoc'ed sp_MS_marksystemobject so that I could still create the sp_ in master but then execute it in all other databases. It's frustrating that this behavior (with sp_ named objects) no longers works in 2005/2008 but at least the sp_MS_marksystemobject still sets the behavior so that we can create this one proc in master but use it in all other databases.

Have fun!
kt

Wednesday, April 02, 2008 10:22:40 AM (Pacific Standard Time, UTC-08:00)  #    Comments [4]  | 
Wednesday, March 12, 2008

A couple of weeks ago, Paul and I recorded two interviews with TechNet Radio... both are ready for download and in multiple formats! 

Our specific interviews can be downloaded from the following links/formats:
  SQL 2008 Part 1 of 2: Security and Availability WMA | MP3 High | MP3 Low
  SQL 2008 Part 2 of 2: Management, Troubleshooting and Throttling  WMA | MP3 High | MP3 Low
  More TechNet Radio interviews (and *lots* of other shows), can be found on Channel 9.

Enjoy! 
kt

Wednesday, March 12, 2008 8:11:17 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, March 10, 2008

(Cross-posted on Paul and Kimberly's blogs)

With the Spring SQL Server Connections show coming up next month, it's time to start planning for the Fall show. Once again, Kimberly and I will be co-Chairs of the Conference and we'd like to invite you to submit abstracts for the Fall 2008 SQL Connections conference, to be held in Las Vegas, November 10-14th, 2008.

The conference will take place shortly after the SQL Server 2008 launch (when it actually RTMs, not the 'launch' that happened February 27th), and will focus heavily on SQL Server 2008. Abstracts are still welcome on best practices for SQL Server 2005 and how to upgrade and migrate applications from SQL Server 200x to SQL Server 2008. However, we will consider all topics - from design to performance to troubleshooting to architectures to programming. Also, we're likely to closely follow our past conference format with three primary themes/tracks: Infrastructure and Scalability, Programming, and Business Intelligence. You should aim for the intermediate to advanced audience for any SQL Server 2005 materials and for SQL Server 2008 we'll consider any level - especially depending on the newness of the release.

Note: You will not be able to speak at a DevConnections show if you are also presenting at a competitor’s show, in the same state, within 30 days of the DevConnections show. To be more specific, if you wish to speak at SQL Connections or any other DevConnections show this Fall in Las Vegas, you are precluded from speaking at any competing conferences in the state of Nevada between October 10, 2008 and December 13, 2008.

For submitting session abstracts, please use this URL:  http://www.deeptraining.com/devconnections/abstracts

The tool will be open from March 10th to midnight EST April 9th, after which we won't accept any abstracts. If you have used this site before and have forgotten your password, you can have your secret word mailed to you. It's better to do this then to create a new account. Please keep the abstracts under 200 words each and in one paragraph. Please do not use bulleted items or line breaks, and please use a spell-checker. If you have an issue with the site, please follow up with an email to paul@sqlskills.com.

Please create a new speaker record (if you haven't submitted before) and submit at least 3 abstracts; however, it will help your chances of being selected if you submit 5 or 6 abstracts. There's one slight change for this conference only - we won't have a fourth track for speakers just delivering a single conference session, so if you only submit a single session abstract, you're unlikely to be picked.

What you will get if selected:

  • $500 per conference talk. (Additional compensation for pre/post conference workshops.)
  • Coach airfare and hotel stay paid by the conference
  • Free admission to all of the co-located conferences
  • Speaker party
  • The adoration of attendees <img src="http://www.sqlskills.com/blogs/paul/smilies/happy.gif">
  • etc.

Thanks for your continued support for our SQL Connections conference and we look forward to seeing your bright and shiny new abstracts!

Thanks,
Kimberly L. Tripp & Paul S. Randal
SQL Connections Conference Chairs

Monday, March 10, 2008 1:32:15 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, February 05, 2008

I know that Paul and I recommended that you subscribe to Conor's blog... but have you? He's posted some great details on Partitioning (Part 1 and Part 2) as well as statistics and it always reminds me of how much I can learn from other people's perspectives!

And, just to dove-tail on some of his statistics comments... I, too, have found that as tables get significantly larger AND have non-standard distributions of more than 200 distinct values (and un-even distribution between those values as well), that the optimizer just cannot possibly do a perfect job. The only way an optimizer can be good is when it can "find a good plan fast" (which I first heard from Nigel Ellis (former Development Manager of the Query Processor team) - back when he delivered a Pacific Northwest SQL Server User Group meeting many moons ago). The most important thing to realize is that it's just not possible to waste time to find the absolutely best plan... mathematically analyzing all permutations would be prohibitive - you'd have to take a vacation between query executions (wait, that's not a bad idea... I digress :).

The point:

  1. Make sure that statistics are up-to-date (either through the database option: auto update stats OR by manually updating statistics)
  2. Consider re-evaluating statistics over large tables (and, when poor performance occurs - look at the estimated rows v. the actual rows - if the estimate/actual are off by a fact of 10, then it could be the statistics). I'd try updating stats first and then if that doesn't work, updating with a fullscan. If neither of those work, I'd also re-evaluate other possible indexes (there are some distributions between tables being joined that just can't show a correct correlation between the values when in multiple indexes... sometimes the best index is a multi-column (ie. composite index)). 
  3. Consider breaking very large tables down into smaller chunks (not just table index partitioning but possibly Partitioned Tables AND Partitioned Views) as this can give the optimizer additional details about partiticularly interesting data sets. Even in SQL Server 2008, statistics are still table-level (filtered indexes can provide some, but not complete, relief... I'll give more details in a later post) but I'd often argue that some of the best table designs are not just for a single table. Consider the statistical, locking, and indexing implications for mixed workloads against a single table (and the tremendous amount of blocking that could occur in addition to varying access patterns). And, even while 2008 will offer Partition-level lock escalation, well-designed tables may not need it! I know I've mentioned this before but different perspectives on statistics, optimizers and the fact that a good optimizer has to be efficient in-and-of itself, remind me of some of the most basic things that are also the most common problems contributing to poor query performance.

Returning to the basics and optimizing a system from the ground up always leads to better scalability!

Enjoy!
kt

 

Tuesday, February 05, 2008 12:06:01 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Sunday, October 21, 2007

OK, I still have a way to go in learning about data compression in SQL Server 2008 but one thing that I do know is that nothing is free. So, the trade-off will be performance (i.e. CPU) v. space. And, that's not really a new trade-off wrt to compression. Sometimes that trade-off has other benefits that still minimize the overall cost (for example, backup compression compresses in-memory and before it goes to disk... this actually makes the overall backup process faster because the actual backup written to disk is smaller). However, if we're talking about data and data access, then we need to think more about how the data is going to be used as well as the impact on performance. I can definitely think of many reasons to compress older (and read-mostly, if not read-only) data (mostly due to volume) but depending on the queries and the impact to uncompress it (based on the volume of data being accessed), I'm going to do a lot of testing before I compress high performance/OLTP data. To help estimate the savings on space, SQL Server 2008 offers a stored-proc: sp_estimate_data_compression_savings.

Compression in SQL Server 2005
SQL Server 2005 offers the ability to have read-only data compressed using Windows NTFS file compression. File compression is only supported for secondary non-primary data files and only when they're set to read-only. If the entire database is set to read-only then all files (incl. the primary and log) can be on compressed drives. While supported, and it can make sense to do this when you have large amounts of historical data, it's still not very granular.

The other form of compression in SQL Server 2005 was introduced in SP2 as data compression for the decimal/numeric data types, called vardecimal. First, you enable compression at the database level and then you turn it on at the table level. The primary form of compression used by vardecimal is when your actual values are generally much smaller than the defined/declared decimal/numeric column. For example, if you've chosen to define a lot of columns as precision/scale (38,4) then as a decimal column each value (per column, per row) will take 17 bytes whether you use all of it or not. If you only store the value 87.5 (which would normally take only 5 bytes as a decimal(3,1)) then you're wasting 12 bytes. This form of compression will still be supported in SQL Server 2008 so if you're interested in how the vardecimal type works, check out this whitepaper. As for the new forms of compression... row-level compression is similar to vardecimal, but the other forms are quite different, and very interesting (especially the page-level dictionary compression)!

Compression in SQL Server 2008
In addition to offering support for NTFS file compression and vardecimal, SQL Server 2008 offers row-level compression or page-level compression (which includes row-level compression) AND it offers the ability to turn these on at the partition-level or at the table-level for all partitions. While I think the per-partition option is excellent, you might still want to separate your OLTP and read-only data into separate tables for other benefits (like online index operations which I mentioned here) but, the "table-level only" options are certainly starting to decrease! And, more granular options always means better manageability.

So, how does compression work in SQL Server 2008:
   Paul wrote about backup compression here.
   Sunil wrote about data compression here and here.
   Chad Boyd wrote about both here.

Paul and I will post more on compression... I really want to get some numbers regarding performance and Paul will dive into all of the internals using DBCC PAGE (go figure! :).

Enjoy!
kt

Sunday, October 21, 2007 6:53:12 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Saturday, October 06, 2007

Does it sound too good to be true: transparent data encryption? Well, it kind of is and kind of isn't. Let me explain. Transparent means that the application developer doesn't do anything (and if you have a third party app in which you can't do anything anyway, then this is even more important). Once enabled, the data is seemlessly protected.

But, protected from what? That's the part that needs a bit of explaining...

We all talk about the importance of physical security (I hope this doesn't sound new)? We all talk about the importance of protecting the physical media to which you've backed up your database(s)? We all talk about protecting the server/machine from physical access and of course, protecting the mdf/ldf files from unauthorized access if we copy them to other machines for attach. And, we all know that with all releases of SQL Server, up to and including SQL Server 2005 (with one exception - details coming up), that our databases are susceptible to unauthorized access if someone gains access to the physical files (whether the actual data files and/or the backup). The one exception is if data within the database has been encrypted using SQL Server 2005 column level encryption. As long as your database master key (created to allow encryption within the database) is strong and protected, then the data that is encrypted with it is also protected (you're only as good as your key protection mechanism). However, any data that is not encrypted is fair game to any system administrator who solely attaches the database or restores it. This is not good. However, it is reality in general. Physical security is incredibly important.

Now, having said all of that, SQL Server 2008 is going to change the landscape a bit. SQL Server 2008 is going to offer a way of protecting this "resting" data - transparently. Basically, once this is turned on, your data is protected from an unathorized attach OR when backed up, unauthorized restore.

This feature is not in the current CTP but I hope to see it in the next publicly available CTP (which will hopefully be out soon). So, when we can finally see it, I'll give you more details such as how to implement it, whether or not it negatively/positively impacts other features and the impact on performance.

Thanks for reading!
kt

Saturday, October 06, 2007 5:31:34 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, October 04, 2007

With the November 5-9th SQL Connections conference in Las Vegas around the corner, Shirley Brothers, the Connections Conference Manager, would like to start planning the Spring show. And, for the Spring show forward, Paul and I will be the co-chairs of the SQL Connections conference. We would like to invite you to submit abstracts for the Spring 2008 SQL Connections conference, to be held in Orlando, April 20-24th, 2008.

The conference will take place shortly after the SQL Server 2008 launch (Feb 27, 2008), and will focus on best practices for SQL Server 2005, how to upgrade and migrate applications from SQL Server 200x to SQL Server 2008 and new features to leverage in SQL Server 2008. However, we will consider all topics - from design to performance to troubleshooting to architectures to programming. Also, we're likely to closely follow our past conference format with three primary themes/tracks: Infrastructure and Scalability, Programming, and Business Intelligence. You should aim for the intermediate to advanced audience for any SQL Server 2005 materials and for SQL Server 2008 - we'll consider any level - especially depending on the newness of the feature.

For submitting session abstracts, please use this URL: 
http://www.deeptraining.com/devconnections/abstracts

Please keep the abstracts under 200 words each and in one paragraph. Please do not use bulleted items or line breaks, and please use a spell-checker. If you have an issue with the site, please follow up with an email as well to
paul@sqlskills.com.

Please create a new speaker record (if you haven't submitted before) and submit at least 3 abstracts; however, it will help your chances of being selected if you submit 5 or 6 abstracts. We need to have all your abstracts by October 26th. 

Finally, please see some important notes about your sessions at Connections conferences. These comments are from our Connections Conference Manager - Shirley Brothers:

=========================================================================
I would like to thank each and every one of our Connections speakers for helping us make this event so successful. Last year in Vegas we had just over 4,700 attendees; we hope to do that or better this fall. By the end of September we had over 3,000 registrations for our Fall show.

A successful show happens when you have a combination of things come together: great speakers, good venue, great partners, fun events, and ongoing relationship building. I hope we can all continue to work together to make Connections the very best event outside of Tech Ed, for years to come. As a more successful show, our competitors are trying to compete by moving their shows closer to our dates and in some cases, to the same cities.

It’s very disheartening for me to see our speakers presenting similar topics at competing shows that are scheduled so close to our own shows. So I want to make a small change in how we handle speakers. Essentially, I don’t want to schedule a speaker at a Connections show who is also presenting at one of our competitor’s shows, in the same state, within 30 days of a Connections show. Many of you have known me for years and you know that I would never discourage a speaker from doing something that is good for his/her career and company. I have never asked a speaker not to speak for a competitor and I am NOT saying that now. I am saying that if another show sits within 30 days of ours in the same state, that the speaker should choose to speak at one or the other conference, but not both. If for whatever reason a speaker does not choose Connections during that time period, there will NOT be hard feelings and they can speak for one of our shows at another time. 

I want Connections conferences to be different and unique in the minds of our attendees, our sponsors, and our speakers. I have given this a lot of thought and I think the best way to produce our Connections shows is not to have any of our speakers presenting at our show and at a competitor’s show a few weeks earlier or later in the same place.

Shirley
=========================================================================

Thanks for your continued support for our SQL Connections conference and we look forward to seeing your bright and shiny new abstracts!

Thanks,
Kimberly L. Tripp & Paul S. Randal
SQL Connections Conference Chairs
Thursday, October 04, 2007 5:27:26 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

ITForum.gifITForum.gif

 

 

 

OK, so after SQL Connections in Las Vegas, Paul and I head off to Barcelona for the second week of TechEd's two week event (week one for developers and week two for IT professionals). November's definitely a busy month. So, if you're in the US - we hope to see you at Connections and if you're in Europe, we hope to see you at Microsoft TechEd ITForum 2007.

 

Here's what we're doing:

Sessions

  • (DAT205) The Next Release of Microsoft SQL Server: Manageability Overview

The next release of SQL Server will contain exciting new manageability features targeted at reducing total cost of ownership. Come learn more about what's in store in areas like policy-driven administration and performance data collection and analysis. The session focuses on the database engine.

  • (DAT301) SQL Server Indexing - Unravelling the Unknown

Knowing tips and tricks to indexing is extremely helpful and will help you to solve "known" query problems. But what's lurking in the unknown? Is SQL Server using your indexes? Or, do you have a bunch of indexes sitting around wasting space and negatively impacting performance? Finally, SQL Server 2005 has an answer! SQL Server 2005 DMVs (Dynamic Management Views) can provide you with valuable information about your current indexing strategies, what should be removed, and even what's missing. Do you know how to find this information, leverage it, and then programmatically respond to it? Come to this session to learn a few tips and tricks as well as how to figure out the unknown indexing problems!

  • (DAT305) Secrets to Fast Detection and Recovery from Database Corruptions

How can you tell whether your data is corrupt? If you have corruption, how do you work out what’s wrong with the database? How do you ensure you have a valid backup? If you don’t have a valid backup, how and what do you repair? If you do have a backup, how do you work out whether you should restore or repair? And at what granularity? How do you go about determining what went wrong in the first place? It’s all about limiting downtime and data-loss when a corruption occurs - from knowing the tools to understanding the choices to planning a successful strategy. Some of the features discussed:

  • Torn-page detection and page checksums
  • IO read-retry
  • Backup checksums
  • Consistency checks (DBCC CHECKDB and related commands)
  • Database repairs

Facing database corruption is almost inevitable in every DBAs career - make sure you're prepared when it happens to you.

Chalk-talks

  • (DAT01-IS) SQL Server Upgrade Best Practices, Tips, and Tricks

Even though SQL Server 2005 has been out for a while, many companies are only just getting ready to upgrade. Come to this session to learn best practices, tips, and tricks distilled from two years of customer experiences. We'll also touch on some issues you'll face going to SQL Server 2008 when it's released next year. Come to this session to learn and share - bring your questions and experiences!

  • (DAT02-IS) SQL Server 2005 Database Mirroring: Setup to Implementation to Monitoring

Database Mirroring is one of the most exciting technologies in SQL Server 2005. With more and more people including it in their Disaster Recovery Strategies, it's important to know when to implement Database Mirroring as well as the implications of the architecture you choose. In this session there will be no slides just demos that will explore how Database Mirroring works in its various configurations and how that may effect your performance. Join us and see database mirroring in action and get your questions answered!

  • (DAT07-IS) DBCC Internals

All DBAs should have heard of (and used) DBCC – especially for consistency checking. Get down deeper than 400-level with this session on how some of the most important DBCC commands work. Topics covered include CHECKDB, SHRINKFILE, INDEXDEFRAG, and more.

Instructor-led labs

  • (DAT03-ILL) SQL Server Always On Technologies Instructor-Led Lab: Part 1 - Database Mirroring

See Database Mirroring in action! From implementation to monitoring to failover, database mirroring provides an ideal solution for many disaster recovery scenarios and this session will prepare you to handle them with minimal downtime or data loss.

  • (DAT04-ILL) SQL Server Always On Technologies Instructor-Led Lab: Part 2 - Database Snapshots

Database Snapshots are useful in many situations: database maintenance, data recovery, and point-in-time data access. You can even create a snapshot on a mirror database to get better return-on-investment (ROI) on your high-avaialbility (HA) investments. In this session we will explain how database snapshots work as well as go through several exercises, including working with multiple database snapshots and creating database snapshots on a mirror database.

  • (DAT05-ILL) SQL Server Always On Technologies Instructor-Led Lab: Part 3 - Online Operations

The bane of any DBA's life is to have to take data offline to perform maintenance or recover from a disaster. The various Online Operations in SQL Server 2005 alleviate much of this frustration. This session will show you how to move a table online for better isolation and control, partition a table online, access a database that's partially damaged, and perform online piecemeal restore.

Lunchtime Demos

  • (DAT01-PD) Database Recovery Techniques

In this fast-paced demo session nasty things will be done repeatedly to database. Then the methods and approaches to recovery will be shown. Not for the faint-hearted!

So, just like SQL Connections the week before, serious amounts of info with tips and tricks for you to take home and implement!

It's going to be a great week. We hope to see you there!!

Kimberly (and Paul)

Thursday, October 04, 2007 8:46:31 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, October 03, 2007

120x240_SQLConn_IBT.jpg120x240_SQLConn_IBT.jpg

 

 

OK, it's about that time again - the Fall conference season - is here! Building on our co-presented Database Maintenance workshop at SQL Connections in Orlando, Paul and I are doing a *ton* of stuff at SQL Connections this Fall. The conference is back at the Mandalay Bay hotel and officially runs from November 5th to 8th, with pre-con workshops on the 5th. But, after Spring, we decided to significantly add recovery content to our maintenance content that we decided to have our maintenance content as a pre- pre-conference workshop on the 4th and then on the 5th, we have all new content on Disaster Recovery and Lessons Learned. AND, after the week of sessions, we decided to add a HANDS-ON (bring your own laptop) post-con workshop on the 9th! Our day off is Tuesday the 6th as it's Microsoft day... with a session line-up looks great with lots of juicy details about SQL Server 2008 - as well as some best-practices sessions for those of you who are happy with SQL Server 200x for now.

 

 

Here's what we're doing:

Workshops

  • November 4th - Pre-pre-con: SPR301: SQL Server Database Maintenance: From Planning to Practice to Post-Mortem

No matter how much effort you spend on the design of your database, if you don't maintain it in production then it will suffer from performance and manageability problems. The key to continued performance and smooth operations is a well thought-out and automated database maintenance plan. This full-day workshop has three sections: planning, practice, and post-mortem. Planning for database maintenance actually starts with database design, so one of the things covered will be how to avoid design choices that limit database maintenance or contribute to maintenance problems. We'll discuss a laundry-list of maintenance problems and then explore how to tell if you need to mitigate them, strategies and best-practices for doing so, and how to avoid having your mitigation choices cause unforeseen and undesirable side-effects. Topics covered will include database files (shrink, grow, virtual log files, log size/management), consistency checks and corruption detection, fragmentation, statistics, backup/restore (options, granularity, strategies) and recovery models. The workshop will vary between 200-400 level covering ALL the key concepts of maintenance features. Finally, we'll spotlight some real-world examples where people made good and bad choices and discuss how you can repeat or avoid them, respectively. If you're wondering how to bring your database back under control, and keep it there, then this full-day workshop will help you tame maintenance problems whether you're a full-time system administrator or a reluctant DBA.

  • November 5th - Pre-con: SPR303: SQL Server Disaster Recovery: From Planning to Practice to Post-Mortem

Every DBA's nightmare is having down time and data loss and not knowing how to recover. However, designing and implementing a successful disaster recovery strategy is easier said than done. It's about asking all the right questions and figuring out all the best answers for your situation. This full-day workshop has three sections: planning, practice and post-mortem. Planning is a critical part of disaster recovery, but the most-often disregarded. Topics we'll cover here include: How do you choose technologies to fit requirements and effectively use key features of SQL Server 2005? How do technology choice affect workload performance? Putting a well-thought out plan into practice requires even more planning and in this section we'll discuss technology implementation, building step-by-step recovery/operation guides for when disasters happen, and, most importantly, testing your plan by simulating real problems. In the final section, we'll spotlight some real-world examples where people made costly mistakes and show you how they could have been avoided with a little planning and practice. If you've ever had nightmares about disaster recovery (or actually had a disaster!) and been at a loss for what to do, then this full-day workshop will give you the direction and technical details you need for success!

  • November 9th - Post-con: SPS302: SQL Server - Put Your Knowledge Into Action (Bring Your Own Laptop)

After a week of learning and watching demos - spend your last conference day putting your knowledge into action and diving deeper into the implementation details. Bring your own laptop to install our VPC environment setup with hands-on lab exercises to walk you through some of our most important features in Database Maintenance and Disaster Recovery. All labs will be ILLs (instructor-led labs) with supporting hands-on lab content *and* you will walk away with your own copy of the DVD to continue the exercises back at your office. You can attend without a laptop but your experience will be significantly better with one! This is meant as an advanced workshop and you're expected to bring a reasonable laptop configuration in order to participate: * Virtual Server or Virtual PC - already installed * At least 1GB of physical memory w/512MB dedicated to the VPC environment (2GB is preferred w/1GB dedicated to VPC) * 12 GB of physical disk space (20+ GB is preferred) * DVD Drive

Sessions

  • SDB351: Follow the Rabbit - Interactive Q&A on Availability

In this session, Kimberly Tripp and Paul Randal will have only 5-10 slides. The focus of this session is on mixing availability technologies to create the best overall architecture to minimize downtime and data loss. In general, we're going to focus on best practices and then open up to your questions so that you can drive the discussion! This session might not seem as structured as other sessions, but you'll be surprised at how informative and fun it is! Grab your lunch and come back - we'll probably still be hanging out!

  • SDB350: SQL Server Table Strategies - Designing for Performance and Availability

Often tables are designed based solely on the data that needs to be tracked (here's a column name, here's a data type - done!). Unfortunately, design does not usually take into account how the data is going to be used OR how SQL Server uses the data. Knowing the internals of table structures as well as the optmizations that come with good design will make your database truly scalable. Come to this session to learn some internals as well as various design strategies such as vertical and horizonal partitioning. Additionally, are there any other features that require changes in your design and thinking? For example, online index operations impact design because of the limitations that exist with partitioning and LOB columns. If you want to scale, you need to be here!

  • SDB347: SQL Server Indexing for Performance - Finding the Right Balance

In terms of performance tuning, there are few silver bullets. If I had to choose ONE area that improves performance the most (when designed appropriately!), it's indexing. However, indexing strategies depend on the data and even more so, the usage of the data. Come to this session to see what indexing strategies help the base table the most as well as how to optimize your worst performing queries.

  • SDB348: SQL Server Indexing Strategies - Are You Sure?

Knowing tips and tricks to indexing is extremely helpful and will help you to solve "known" problems. But what's lurking in the unknown? Is SQL Server using your indexes? Or, do you have a bunch of useless indexes? Finally, SQL Server 2005 has an answer! SQL Server 2005 DMVs (Dynamic Management Views) can provide you with valuable information about your current indexing strategies, what should be removed, and even what's missing. Do you know how to find this information, leverage it, and then programmatically respond to it? Come to this session to figure it out!

  • SDB349: Follow the Rabbit - Interactive Q&A on the Storage Engine and the Relational Engine

In this session, Kimberly Tripp and Paul Randal will have only 5-10 slides. Each slide cover topics for discussion as well as the reason(s) for why something might be behaving badly and/or things to try to solve your problems. In general we're going to focus on best practices and then open up to your questions so that you can drive the discussion! Paul will focus on the SE (Storage Engine) and internals and Kimberly will focus on the RE (Relational Engine) and query tuning/performance. This session might not seem as structured as other sessions, but you'll be surprised at how informative and fun it is!

At this event, you'll be able to get what most conferences offer (and that's breadth - in terms of session choices, etc.) but with the large number of workshops and the detailed planning that went into sequencing the conference sessions, you'll also get depth that no other conference offers. Serious amounts of tips and tricks for you to geek-out on with us and take home to immediately apply.

It's going to be a great week. We hope to see you there!!

Kimberly (and Paul)

Wednesday, October 03, 2007 8:49:08 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

OK, let me start by saying that I absolutely love when a feature improves in granularity options. Better granularity in locks means that contention is reduced and concurrency improved. And even though the overhead to manage smaller locks (and typically more of them) is usually higher - the improved concurrency benefits often significantly outweight the costs. Additionally, design is often simplified as more granular locks typically means you don't have to work as hard to minimize contention. Let me give you some history...

In the old days (ok, remember, I started working with SQL Server when I was 12 :) :), SQL Server used to have page-level locking (all releases prior to SQL Server 6.5 sp3). In SQL Server 6.5 sp3 they made an internal change to allow "insert row locks" but that was very targeted in what it improved (in terms of locking). However, in SQL Server 7.0 the locking architecture completely changed (as well as the SE and most of the RE) and that's where they introduced true row-level locking. This resulted in a significantly reduced complexity in table design. No longer did we have to choose clustered indexes to remove page-level locking (and therefore insert hotspots). And, in fact, some designs solely improved their performance by upgrading. The nice thing about internal changes like these is that they mean you can get away without knowing all of the internals, not worry as much about design and yet still get gains in performance. All of which is good.

However, if you do know the internals and you leverage this knowledge then you might be able to see even greater gains. With the change in locking from page to row (as well as based on other changes to the internal dependencies of non-clustered indexes on the clustering key), databases whose indexing strategies changed between 6.5 to 7.0 made the greatest gains in performance. How did they change - I've blogged about "the clustered index debate" a few times so I'll stay away from that one here... but, the key point is that while these changes might allow you to do more with less work - a bit more work to truly leverage the new features/changes might result in the best combination!

And so, that's what brings me to partition-level lock escalation. This is an absolutely necessary step to truly allowing SQL Server to treat partitions like mini-tables. Here are a few of the concerns I've had with regard to SQL Server 2005 table and index partitioning:

* lock escalation can still occur between the read-only and read-write portions of your partitioned table if the read-only portion is accessed by large queries that escalate (in SQL Server 2005 escalation is either row to table or page to table)
* indexes must be exactly the same for all partitions (not related to escalation but it does have bearing on my solution)
* index rebuilds are supported at the partition level; however, partition-level rebuilds must be performed OFFLINE. Only table-level index rebuilds can be performed online (again, not directly related to escalation but it's another problem around blocking)

And, this last one is very frustrating to me in general as I'm finding more and more environments moving to "real-time analysis" where they want to do queries on as-close-to-current data as possible. In fact, real-time data warehousing in a relational database is one of the primary areas of improvements for SQL Server 2008 with features such as partition-level lock escalation, improvements in indexed views, grouping sets, and star join optimizations - just to name a few.

So, in terms of partition-level lock escalation. Am I happy that it's going to be there - for sure! However, the other two issues mentioned above might not change. Having different indexes at the partition level is likely through a feature called "Filtered Indexes" which has not yet appeared in any CTPs but it has been discussed at conferences/events. So, we might solve 2 out of 3 but what about online index rebuilds at the partition level? At this point, I'm pretty sure that they won't be able to solve that for SQL Server 2008... As a result, I would suggest a slightly different architecture. Instead of using only a single partitioned table for both read-only and read-write data, use at least two tables. One table for read-only data and another for read-write data. If you think this might be defeating the purpose of partitioning... then look at these benefits:

* the read-only portion of the table (which is typically the *much* larger portion of the table - can still be managed with partitioning)
* the read-only portion - once separated from the read-write - can have additional indexes for better [range] query performance
* the read-only portion of the table can actually be partitioned into multiple partitioned tables - to give better per-table statistics (statistics are still at the table-level only so even if your partitioning scheme is "monthly" you might want to have tables that represent a year's worth of data...especially if your trends seem to change year to year)
* large range queries against the read-only portion of the data will only escalate to the "table" (which is now separated from the read-write data)
* the read-write portion of the data can have fewer indexes
* the read-write portion of the data can be placed on different disks (MORE fault tolerant disks) due to the importance/volatility of the data
* finally, and most importantly, the read-write portion of the data can be maintained completely separately from the read-only portion with regard to index rebuilds

So, then how do you make it appear as one table? Use partitioned views over partioned tables and consider using a synonym for the hot/insert table. At the end of each month (or whatever your partitioning strategy uses - daily, weekly, monthly, etc.), "switch" the read-write portion of the table into the read-only portion of the table. You should be able to do all of this with no data movement and the synonym used for inserts will mean that your applications don't need to change either.

In summary, I do like the partition-level lock escalation feature especially as it doesn't require rearchitecting your solution/design. However, by creating two or more tables where read-only data is isolated from read-write, you can leverage many other features (like online index rebuilds).

If you're more interested in hearing about lock escalation at the partition-level check out Paul's recent blog entry on it: SQL Server 2008: Lock escalation changes.

Have fun and thanks for reading!
kt

Wednesday, October 03, 2007 6:01:47 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Wednesday, August 01, 2007

Yesterday, Microsoft released the July Community Technical Preview (CTP) for SQL Server 2008. The release makes available several performance and feature enhancements designed to support mission-critical platform and enable pervasive insight across the enterprise. (ok, those last two sentences weren't mine :)

Check out