Saturday, November 03, 2007

OK, so for those of you who watch our (Paul's and my) blogs... you know it's time for SQLConnections. We're in Vegas (and it's Paul's first time here!) and the conference is the largest it's ever been with over 5000 people!!! We arrived a couple of days early so that Paul could experience a bit of Vegas... last night we went out to dinner with Gert and Karen (Gert's wife) and Michele and Andres (Michele's husband) and we went to Mix. I have to admit that I didn't like my main course all that much (I had only one bite of the Cod dish) but everyone else loved theirs (especially the steak/foie gras special)... the 24,000 sphere chandelier was VERY cool and the views of the Strip are outstanding. Tonight we're (MLB/Andres/us) off to see Elton John. Gert and Karen are off celebrating their anniversary... CONGRATS!!

And today, we're getting ready for the conference - having just picked up our shirts (which Paul - yes PAUL - is about to iron...lol) - we swung by to checkout how things are going in terms of setup. It's always amazing to me to see how much goes on behind the scenes of some of these big events (I blogged about some of the behind the scenes of TechEd here) and well, we caught the 30+ people who were setting up and "building bags" for the 5000+ attendees.......

And, so begins our first conference as a married couple... and all of our sessions (for the most part) are together. It's actually really fun to do sessions together as we're able to bounce things off of each other, take notes on things we want to change and/or questions that are asked AND it allows us to keep things moving without a lot of stops. And, to be honest, it's not quite as tiring... but, with 2 pre-conference workshops, 5 sessions at the conference, multiple meetings/dinners/side-events and then a hands-on post-conference workshop on Friday - we'll both be pretty tired!!! We get back home late Friday night and then we have about 30 hours in Redmond to relax (NOT!) before we head to TechEd ITForum in Barcelona next week. From Barcelona we head to Zurich for a TechNet DeepDive session on Database Maintenance...... November is quite a month for getting around!!

Hope to see you here (or in Spain... or in Zurich... or next year),
kt

Saturday, November 03, 2007 1:59:33 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Sunday, October 21, 2007

OK, I still have a way to go in learning about data compression in SQL Server 2008 but one thing that I do know is that nothing is free. So, the trade-off will be performance (i.e. CPU) v. space. And, that's not really a new trade-off wrt to compression. Sometimes that trade-off has other benefits that still minimize the overall cost (for example, backup compression compresses in-memory and before it goes to disk... this actually makes the overall backup process faster because the actual backup written to disk is smaller). However, if we're talking about data and data access, then we need to think more about how the data is going to be used as well as the impact on performance. I can definitely think of many reasons to compress older (and read-mostly, if not read-only) data (mostly due to volume) but depending on the queries and the impact to uncompress it (based on the volume of data being accessed), I'm going to do a lot of testing before I compress high performance/OLTP data. To help estimate the savings on space, SQL Server 2008 offers a stored-proc: sp_estimate_data_compression_savings.

Compression in SQL Server 2005
SQL Server 2005 offers the ability to have read-only data compressed using Windows NTFS file compression. File compression is only supported for secondary non-primary data files and only when they're set to read-only. If the entire database is set to read-only then all files (incl. the primary and log) can be on compressed drives. While supported, and it can make sense to do this when you have large amounts of historical data, it's still not very granular.

The other form of compression in SQL Server 2005 was introduced in SP2 as data compression for the decimal/numeric data types, called vardecimal. First, you enable compression at the database level and then you turn it on at the table level. The primary form of compression used by vardecimal is when your actual values are generally much smaller than the defined/declared decimal/numeric column. For example, if you've chosen to define a lot of columns as precision/scale (38,4) then as a decimal column each value (per column, per row) will take 17 bytes whether you use all of it or not. If you only store the value 87.5 (which would normally take only 5 bytes as a decimal(3,1)) then you're wasting 12 bytes. This form of compression will still be supported in SQL Server 2008 so if you're interested in how the vardecimal type works, check out this whitepaper. As for the new forms of compression... row-level compression is similar to vardecimal, but the other forms are quite different, and very interesting (especially the page-level dictionary compression)!

Compression in SQL Server 2008
In addition to offering support for NTFS file compression and vardecimal, SQL Server 2008 offers row-level compression or page-level compression (which includes row-level compression) AND it offers the ability to turn these on at the partition-level or at the table-level for all partitions. While I think the per-partition option is excellent, you might still want to separate your OLTP and read-only data into separate tables for other benefits (like online index operations which I mentioned here) but, the "table-level only" options are certainly starting to decrease! And, more granular options always means better manageability.

So, how does compression work in SQL Server 2008:
   Paul wrote about backup compression here.
   Sunil wrote about data compression here and here.
   Chad Boyd wrote about both here.

Paul and I will post more on compression... I really want to get some numbers regarding performance and Paul will dive into all of the internals using DBCC PAGE (go figure! :).

Enjoy!
kt

Sunday, October 21, 2007 6:53:12 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Thursday, October 18, 2007

OK, so Paul could be doing worse things while I'm away... but, what was he doing? I guess you'll have to read here and listen here.

Enjoy!
kt

Thursday, October 18, 2007 7:52:39 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, October 06, 2007

Does it sound too good to be true: transparent data encryption? Well, it kind of is and kind of isn't. Let me explain. Transparent means that the application developer doesn't do anything (and if you have a third party app in which you can't do anything anyway, then this is even more important). Once enabled, the data is seemlessly protected.

But, protected from what? That's the part that needs a bit of explaining...

We all talk about the importance of physical security (I hope this doesn't sound new)? We all talk about the importance of protecting the physical media to which you've backed up your database(s)? We all talk about protecting the server/machine from physical access and of course, protecting the mdf/ldf files from unauthorized access if we copy them to other machines for attach. And, we all know that with all releases of SQL Server, up to and including SQL Server 2005 (with one exception - details coming up), that our databases are susceptible to unauthorized access if someone gains access to the physical files (whether the actual data files and/or the backup). The one exception is if data within the database has been encrypted using SQL Server 2005 column level encryption. As long as your database master key (created to allow encryption within the database) is strong and protected, then the data that is encrypted with it is also protected (you're only as good as your key protection mechanism). However, any data that is not encrypted is fair game to any system administrator who solely attaches the database or restores it. This is not good. However, it is reality in general. Physical security is incredibly important.

Now, having said all of that, SQL Server 2008 is going to change the landscape a bit. SQL Server 2008 is going to offer a way of protecting this "resting" data - transparently. Basically, once this is turned on, your data is protected from an unathorized attach OR when backed up, unauthorized restore.

This feature is not in the current CTP but I hope to see it in the next publicly available CTP (which will hopefully be out soon). So, when we can finally see it, I'll give you more details such as how to implement it, whether or not it negatively/positively impacts other features and the impact on performance.

Thanks for reading!
kt

Saturday, October 06, 2007 5:31:34 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, October 04, 2007

With the November 5-9th SQL Connections conference in Las Vegas around the corner, Shirley Brothers, the Connections Conference Manager, would like to start planning the Spring show. And, for the Spring show forward, Paul and I will be the co-chairs of the SQL Connections conference. We would like to invite you to submit abstracts for the Spring 2008 SQL Connections conference, to be held in Orlando, April 20-24th, 2008.

The conference will take place shortly after the SQL Server 2008 launch (Feb 27, 2008), and will focus on best practices for SQL Server 2005, how to upgrade and migrate applications from SQL Server 200x to SQL Server 2008 and new features to leverage in SQL Server 2008. However, we will consider all topics - from design to performance to troubleshooting to architectures to programming. Also, we're likely to closely follow our past conference format with three primary themes/tracks: Infrastructure and Scalability, Programming, and Business Intelligence. You should aim for the intermediate to advanced audience for any SQL Server 2005 materials and for SQL Server 2008 - we'll consider any level - especially depending on the newness of the feature.

For submitting session abstracts, please use this URL: 
http://www.deeptraining.com/devconnections/abstracts

Please keep the abstracts under 200 words each and in one paragraph. Please do not use bulleted items or line breaks, and please use a spell-checker. If you have an issue with the site, please follow up with an email as well to
paul@sqlskills.com.

Please create a new speaker record (if you haven't submitted before) and submit at least 3 abstracts; however, it will help your chances of being selected if you submit 5 or 6 abstracts. We need to have all your abstracts by October 26th. 

Finally, please see some important notes about your sessions at Connections conferences. These comments are from our Connections Conference Manager - Shirley Brothers:

=========================================================================
I would like to thank each and every one of our Connections speakers for helping us make this event so successful. Last year in Vegas we had just over 4,700 attendees; we hope to do that or better this fall. By the end of September we had over 3,000 registrations for our Fall show.

A successful show happens when you have a combination of things come together: great speakers, good venue, great partners, fun events, and ongoing relationship building. I hope we can all continue to work together to make Connections the very best event outside of Tech Ed, for years to come. As a more successful show, our competitors are trying to compete by moving their shows closer to our dates and in some cases, to the same cities.

It’s very disheartening for me to see our speakers presenting similar topics at competing shows that are scheduled so close to our own shows. So I want to make a small change in how we handle speakers. Essentially, I don’t want to schedule a speaker at a Connections show who is also presenting at one of our competitor’s shows, in the same state, within 30 days of a Connections show. Many of you have known me for years and you know that I would never discourage a speaker from doing something that is good for his/her career and company. I have never asked a speaker not to speak for a competitor and I am NOT saying that now. I am saying that if another show sits within 30 days of ours in the same state, that the speaker should choose to speak at one or the other conference, but not both. If for whatever reason a speaker does not choose Connections during that time period, there will NOT be hard feelings and they can speak for one of our shows at another time. 

I want Connections conferences to be different and unique in the minds of our attendees, our sponsors, and our speakers. I have given this a lot of thought and I think the best way to produce our Connections shows is not to have any of our speakers presenting at our show and at a competitor’s show a few weeks earlier or later in the same place.

Shirley
=========================================================================

Thanks for your continued support for our SQL Connections conference and we look forward to seeing your bright and shiny new abstracts!

Thanks,
Kimberly L. Tripp & Paul S. Randal
SQL Connections Conference Chairs
Thursday, October 04, 2007 5:27:26 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

OK, so thought I'd do a follow up to the post I did a couple of days ago titled: The perils of case-insensitive data (and our life in tangent-land). The reason I'd like to followup on it is that I received some excellent comments and I want to make sure that you're all aware of the tips/tricks and recommendations that there were (some of you may not have returned to see all of the comments). Really, I was impressed by the speed at which people responded as well as the great comments (and things I learned!). It just reminds me of the fact that none of us can know everything AND that our SQL community is awesome in its willingness to share and communicate.

As for the tips/tricks and "yes, duh!" realizations I came to... here are the interesting points from the comments:

First - why did my comparison work for a single character (e.g. '%A%') but not when I did a character range (e.g. '%[A-Z]%')? Well, it was because it was unicode! This was a "right! duh!" realization that I think I dreamed after I wrote this BUT, Hugo Kornelis is exactly right in his comment. Thanks Hugo! Here is a direct cut/paste of his comment:

The reason [A-Z] doesn't work, is that a collation doesn't just govern case sensitive vs case insensitive but also (amongst others) the sort order of letters. And most case sensitive collations sort like A - a - B - b - ... - Z - z. So [^A-Z] would include all letters except the lowercase z.

You can use [A-Z] to find uppercase characters in a binary collation (since all uppercase characters are in one range of ASCII, and all lowercase characters in another), but not in any other collation.

And, you can check out more from Hugo on his blog: http://sqlblog.com/blogs/hugo_kornelis/default.aspx

Second - the comparison query that I wrote all together (where I stated each letter individually in the WHERE clause) only took a few seconds to write (thanks to cut and paste :) AND it did work...And, sometimes getting something to work and moving on is all we can do (come on - you've ALL been there, eh? :). However, my main comment was that "it wasn't pretty". A much more elegant and unbelievably simple solution came from David R Buckingham (aren't the great answers always the really simple ones :)). Here is a direct/cut/paste of his comment:

The following query will return any fully lower case names in the table:

SELECT LastName
FROM Person.Contact
WHERE LastName COLLATE Latin1_General_CS_AS_KS_WS = LOWER( LastName ) COLLATE Latin1_General_CS_AS_KS_WS

I don't believe that David has a blog... maybe he should :).

Third - a very cool and clever trick that came in from Denis Gobo is related to the performance of repeatedly doing case-sensitive searches on a case-insensitive column. I suggested that creating an additional column (preferably a computed column that uses the case sensitive collation) would be an easy and optimal solution. This is still definitely true when the case-insensitive values are NOT selective enough to warrant using an index and the case-sensitive values are... However, if both the case-sensitive AND the case-insensitive values are reasonably selective then the trick that helps is from his comment. Here is a direct cut/paste of his comment:

Kimberly, the way to force an index seek is to do this

SELECT *
FROM MyTestContacts
WHERE Lastname = N'adams'
AND Lastname COLLATE Latin1_General_CS_AS_KS_WS = N'Adams'

The WHERE might return more than one row but the AND will return only the case sensitive one

I wrote about that a while back here:
http://sqlservercode.blogspot.com/2007/05/make-your-case-sensitive-searches-1000.html 

And, you can check out more from Denis on his blog: http://sqlservercode.blogspot.com/

Now, as for the issues related to creating a view in a database that has a different collation from the server's collation... Here, I'm fairly certain that there's still a bug. However, I'm happy to say that I don't think that it's the most likely situation that exists for collations. I think the two most likely situations are:

  1. The server has one collation. The database inherits that collation. The database developer makes column level collation changes throughout the db. This seems to work well. OR
  2. The server has one collation. The database has a different collation. The database developer consistently uses that collation throughout their app. A good example of this is where people have case-sensitive databases on case-insensitive servers. This works fairly well (although there are some issues wrt to temp tables, etc. and default_collation is a good thing to know).

I guess there's even a third one where column level changes are made in a database whose collation is different from the server but where there aren't any views that also change the collation to yet a different collation (and this is where there seems to be a bug).

So, this was an excellent (and reasonably fun :) :) exercise to go through wrt collations. And, this is how I (we?) learn! I really want to thank everyone for reading - and commenting/sharing! - the things they learned/knew. That's part of why I love the SQL Server community. And, speaking of which, I thought I'd end this entry with a few community links - as a reminder to everything that's out there:

Thanks for reading! Thanks for commenting!
kt

Thursday, October 04, 2007 11:57:05 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

ITForum.gifITForum.gif

 

 

 

OK, so after SQL Connections in Las Vegas, Paul and I head off to Barcelona for the second week of TechEd's two week event (week one for developers and week two for IT professionals). November's definitely a busy month. So, if you're in the US - we hope to see you at Connections and if you're in Europe, we hope to see you at Microsoft TechEd ITForum 2007.

 

Here's what we're doing:

Sessions

  • (DAT205) The Next Release of Microsoft SQL Server: Manageability Overview

The next release of SQL Server will contain exciting new manageability features targeted at reducing total cost of ownership. Come learn more about what's in store in areas like policy-driven administration and performance data collection and analysis. The session focuses on the database engine.

  • (DAT301) SQL Server Indexing - Unravelling the Unknown

Knowing tips and tricks to indexing is extremely helpful and will help you to solve "known" query problems. But what's lurking in the unknown? Is SQL Server using your indexes? Or, do you have a bunch of indexes sitting around wasting space and negatively impacting performance? Finally, SQL Server 2005 has an answer! SQL Server 2005 DMVs (Dynamic Management Views) can provide you with valuable information about your current indexing strategies, what should be removed, and even what's missing. Do you know how to find this information, leverage it, and then programmatically respond to it? Come to this session to learn a few tips and tricks as well as how to figure out the unknown indexing problems!

  • (DAT305) Secrets to Fast Detection and Recovery from Database Corruptions

How can you tell whether your data is corrupt? If you have corruption, how do you work out what’s wrong with the database? How do you ensure you have a valid backup? If you don’t have a valid backup, how and what do you repair? If you do have a backup, how do you work out whether you should restore or repair? And at what granularity? How do you go about determining what went wrong in the first place? It’s all about limiting downtime and data-loss when a corruption occurs - from knowing the tools to understanding the choices to planning a successful strategy. Some of the features discussed:

  • Torn-page detection and page checksums
  • IO read-retry
  • Backup checksums
  • Consistency checks (DBCC CHECKDB and related commands)
  • Database repairs

Facing database corruption is almost inevitable in every DBAs career - make sure you're prepared when it happens to you.

Chalk-talks

  • (DAT01-IS) SQL Server Upgrade Best Practices, Tips, and Tricks

Even though SQL Server 2005 has been out for a while, many companies are only just getting ready to upgrade. Come to this session to learn best practices, tips, and tricks distilled from two years of customer experiences. We'll also touch on some issues you'll face going to SQL Server 2008 when it's released next year. Come to this session to learn and share - bring your questions and experiences!

  • (DAT02-IS) SQL Server 2005 Database Mirroring: Setup to Implementation to Monitoring

Database Mirroring is one of the most exciting technologies in SQL Server 2005. With more and more people including it in their Disaster Recovery Strategies, it's important to know when to implement Database Mirroring as well as the implications of the architecture you choose. In this session there will be no slides just demos that will explore how Database Mirroring works in its various configurations and how that may effect your performance. Join us and see database mirroring in action and get your questions answered!

  • (DAT07-IS) DBCC Internals

All DBAs should have heard of (and used) DBCC – especially for consistency checking. Get down deeper than 400-level with this session on how some of the most important DBCC commands work. Topics covered include CHECKDB, SHRINKFILE, INDEXDEFRAG, and more.

Instructor-led labs

  • (DAT03-ILL) SQL Server Always On Technologies Instructor-Led Lab: Part 1 - Database Mirroring

See Database Mirroring in action! From implementation to monitoring to failover, database mirroring provides an ideal solution for many disaster recovery scenarios and this session will prepare you to handle them with minimal downtime or data loss.

  • (DAT04-ILL) SQL Server Always On Technologies Instructor-Led Lab: Part 2 - Database Snapshots

Database Snapshots are useful in many situations: database maintenance, data recovery, and point-in-time data access. You can even create a snapshot on a mirror database to get better return-on-investment (ROI) on your high-avaialbility (HA) investments. In this session we will explain how database snapshots work as well as go through several exercises, including working with multiple database snapshots and creating database snapshots on a mirror database.

  • (DAT05-ILL) SQL Server Always On Technologies Instructor-Led Lab: Part 3 - Online Operations

The bane of any DBA's life is to have to take data offline to perform maintenance or recover from a disaster. The various Online Operations in SQL Server 2005 alleviate much of this frustration. This session will show you how to move a table online for better isolation and control, partition a table online, access a database that's partially damaged, and perform online piecemeal restore.

Lunchtime Demos

  • (DAT01-PD) Database Recovery Techniques

In this fast-paced demo session nasty things will be done repeatedly to database. Then the methods and approaches to recovery will be shown. Not for the faint-hearted!

So, just like SQL Connections the week before, serious amounts of info with tips and tricks for you to take home and implement!

It's going to be a great week. We hope to see you there!!

Kimberly (and Paul)

Thursday, October 04, 2007 8:46:31 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, October 03, 2007

120x240_SQLConn_IBT.jpg120x240_SQLConn_IBT.jpg

 

 

OK, it's about that time again - the Fall conference season - is here! Building on our co-presented Database Maintenance workshop at SQL Connections in Orlando, Paul and I are doing a *ton* of stuff at SQL Connections this Fall. The conference is back at the Mandalay Bay hotel and officially runs from November 5th to 8th, with pre-con workshops on the 5th. But, after Spring, we decided to significantly add recovery content to our maintenance content that we decided to have our maintenance content as a pre- pre-conference workshop on the 4th and then on the 5th, we have all new content on Disaster Recovery and Lessons Learned. AND, after the week of sessions, we decided to add a HANDS-ON (bring your own laptop) post-con workshop on the 9th! Our day off is Tuesday the 6th as it's Microsoft day... with a session line-up looks great with lots of juicy details about SQL Server 2008 - as well as some best-practices sessions for those of you who are happy with SQL Server 200x for now.

 

 

Here's what we're doing:

Workshops

  • November 4th - Pre-pre-con: SPR301: SQL Server Database Maintenance: From Planning to Practice to Post-Mortem

No matter how much effort you spend on the design of your database, if you don't maintain it in production then it will suffer from performance and manageability problems. The key to continued performance and smooth operations is a well thought-out and automated database maintenance plan. This full-day workshop has three sections: planning, practice, and post-mortem. Planning for database maintenance actually starts with database design, so one of the things covered will be how to avoid design choices that limit database maintenance or contribute to maintenance problems. We'll discuss a laundry-list of maintenance problems and then explore how to tell if you need to mitigate them, strategies and best-practices for doing so, and how to avoid having your mitigation choices cause unforeseen and undesirable side-effects. Topics covered will include database files (shrink, grow, virtual log files, log size/management), consistency checks and corruption detection, fragmentation, statistics, backup/restore (options, granularity, strategies) and recovery models. The workshop will vary between 200-400 level covering ALL the key concepts of maintenance features. Finally, we'll spotlight some real-world examples where people made good and bad choices and discuss how you can repeat or avoid them, respectively. If you're wondering how to bring your database back under control, and keep it there, then this full-day workshop will help you tame maintenance problems whether you're a full-time system administrator or a reluctant DBA.

  • November 5th - Pre-con: SPR303: SQL Server Disaster Recovery: From Planning to Practice to Post-Mortem

Every DBA's nightmare is having down time and data loss and not knowing how to recover. However, designing and implementing a successful disaster recovery strategy is easier said than done. It's about asking all the right questions and figuring out all the best answers for your situation. This full-day workshop has three sections: planning, practice and post-mortem. Planning is a critical part of disaster recovery, but the most-often disregarded. Topics we'll cover here include: How do you choose technologies to fit requirements and effectively use key features of SQL Server 2005? How do technology choice affect workload performance? Putting a well-thought out plan into practice requires even more planning and in this section we'll discuss technology implementation, building step-by-step recovery/operation guides for when disasters happen, and, most importantly, testing your plan by simulating real problems. In the final section, we'll spotlight some real-world examples where people made costly mistakes and show you how they could have been avoided with a little planning and practice. If you've ever had nightmares about disaster recovery (or actually had a disaster!) and been at a loss for what to do, then this full-day workshop will give you the direction and technical details you need for success!

  • November 9th - Post-con: SPS302: SQL Server - Put Your Knowledge Into Action (Bring Your Own Laptop)

After a week of learning and watching demos - spend your last conference day putting your knowledge into action and diving deeper into the implementation details. Bring your own laptop to install our VPC environment setup with hands-on lab exercises to walk you through some of our most important features in Database Maintenance and Disaster Recovery. All labs will be ILLs (instructor-led labs) with supporting hands-on lab content *and* you will walk away with your own copy of the DVD to continue the exercises back at your office. You can attend without a laptop but your experience will be significantly better with one! This is meant as an advanced workshop and you're expected to bring a reasonable laptop configuration in order to participate: * Virtual Server or Virtual PC - already installed * At least 1GB of physical memory w/512MB dedicated to the VPC environment (2GB is preferred w/1GB dedicated to VPC) * 12 GB of physical disk space (20+ GB is preferred) * DVD Drive

Sessions

  • SDB351: Follow the Rabbit - Interactive Q&A on Availability

In this session, Kimberly Tripp and Paul Randal will have only 5-10 slides. The focus of this session is on mixing availability technologies to create the best overall architecture to minimize downtime and data loss. In general, we're going to focus on best practices and then open up to your questions so that you can drive the discussion! This session might not seem as structured as other sessions, but you'll be surprised at how informative and fun it is! Grab your lunch and come back - we'll probably still be hanging out!

  • SDB350: SQL Server Table Strategies - Designing for Performance and Availability

Often tables are designed based solely on the data that needs to be tracked (here's a column name, here's a data type - done!). Unfortunately, design does not usually take into account how the data is going to be used OR how SQL Server uses the data. Knowing the internals of table structures as well as the optmizations that come with good design will make your database truly scalable. Come to this session to learn some internals as well as various design strategies such as vertical and horizonal partitioning. Additionally, are there any other features that require changes in your design and thinking? For example, online index operations impact design because of the limitations that exist with partitioning and LOB columns. If you want to scale, you need to be here!

  • SDB347: SQL Server Indexing for Performance - Finding the Right Balance

In terms of performance tuning, there are few silver bullets. If I had to choose ONE area that improves performance the most (when designed appropriately!), it's indexing. However, indexing strategies depend on the data and even more so, the usage of the data. Come to this session to see what indexing strategies help the base table the most as well as how to optimize your worst performing queries.

  • SDB348: SQL Server Indexing Strategies - Are You Sure?

Knowing tips and tricks to indexing is extremely helpful and will help you to solve "known" problems. But what's lurking in the unknown? Is SQL Server using your indexes? Or, do you have a bunch of useless indexes? Finally, SQL Server 2005 has an answer! SQL Server 2005 DMVs (Dynamic Management Views) can provide you with valuable information about your current indexing strategies, what should be removed, and even what's missing. Do you know how to find this information, leverage it, and then programmatically respond to it? Come to this session to figure it out!

  • SDB349: Follow the Rabbit - Interactive Q&A on the Storage Engine and the Relational Engine

In this session, Kimberly Tripp and Paul Randal will have only 5-10 slides. Each slide cover topics for discussion as well as the reason(s) for why something might be behaving badly and/or things to try to solve your problems. In general we're going to focus on best practices and then open up to your questions so that you can drive the discussion! Paul will focus on the SE (Storage Engine) and internals and Kimberly will focus on the RE (Relational Engine) and query tuning/performance. This session might not seem as structured as other sessions, but you'll be surprised at how informative and fun it is!

At this event, you'll be able to get what most conferences offer (and that's breadth - in terms of session choices, etc.) but with the large number of workshops and the detailed planning that went into sequencing the conference sessions, you'll also get depth that no other conference offers. Serious amounts of tips and tricks for you to geek-out on with us and take home to immediately apply.

It's going to be a great week. We hope to see you there!!

Kimberly (and Paul)

Wednesday, October 03, 2007 8:49:08 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

OK, let me start by saying that I absolutely love when a feature improves in granularity options. Better granularity in locks means that contention is reduced and concurrency improved. And even though the overhead to manage smaller locks (and typically more of them) is usually higher - the improved concurrency benefits often significantly outweight the costs. Additionally, design is often simplified as more granular locks typically means you don't have to work as hard to minimize contention. Let me give you some history...

In the old days (ok, remember, I started working with SQL Server when I was 12 :) :), SQL Server used to have page-level locking (all releases prior to SQL Server 6.5 sp3). In SQL Server 6.5 sp3 they made an internal change to allow "insert row locks" but that was very targeted in what it improved (in terms of locking). However, in SQL Server 7.0 the locking architecture completely changed (as well as the SE and most of the RE) and that's where they introduced true row-level locking. This resulted in a significantly reduced complexity in table design. No longer did we have to choose clustered indexes to remove page-level locking (and therefore insert hotspots). And, in fact, some designs solely improved their performance by upgrading. The nice thing about internal changes like these is that they mean you can get away without knowing all of the internals, not worry as much about design and yet still get gains in performance. All of which is good.

However, if you do know the internals and you leverage this knowledge then you might be able to see even greater gains. With the change in locking from page to row (as well as based on other changes to the internal dependencies of non-clustered indexes on the clustering key), databases whose indexing strategies changed between 6.5 to 7.0 made the greatest gains in performance. How did they change - I've blogged about "the clustered index debate" a few times so I'll stay away from that one here... but, the key point is that while these changes might allow you to do more with less work - a bit more work to truly leverage the new features/changes might result in the best combination!

And so, that's what brings me to partition-level lock escalation. This is an absolutely necessary step to truly allowing SQL Server to treat partitions like mini-tables. Here are a few of the concerns I've had with regard to SQL Server 2005 table and index partitioning:

* lock escalation can still occur between the read-only and read-write portions of your partitioned table if the read-only portion is accessed by large queries that escalate (in SQL Server 2005 escalation is either row to table or page to table)
* indexes must be exactly the same for all partitions (not related to escalation but it does have bearing on my solution)
* index rebuilds are supported at the partition level; however, partition-level rebuilds must be performed OFFLINE. Only table-level index rebuilds can be performed online (again, not directly related to escalation but it's another problem around blocking)

And, this last one is very frustrating to me in general as I'm finding more and more environments moving to "real-time analysis" where they want to do queries on as-close-to-current data as possible. In fact, real-time data warehousing in a relational database is one of the primary areas of improvements for SQL Server 2008 with features such as partition-level lock escalation, improvements in indexed views, grouping sets, and star join optimizations - just to name a few.

So, in terms of partition-level lock escalation. Am I happy that it's going to be there - for sure! However, the other two issues mentioned above might not change. Having different indexes at the partition level is likely through a feature called "Filtered Indexes" which has not yet appeared in any CTPs but it has been discussed at conferences/events. So, we might solve 2 out of 3 but what about online index rebuilds at the partition level? At this point, I'm pretty sure that they won't be able to solve that for SQL Server 2008... As a result, I would suggest a slightly different architecture. Instead of using only a single partitioned table for both read-only and read-write data, use at least two tables. One table for read-only data and another for read-write data. If you think this might be defeating the purpose of partitioning... then look at these benefits:

* the read-only portion of the table (which is typically the *much* larger portion of the table - can still be managed with partitioning)
* the read-only portion - once separated from the read-write - can have additional indexes for better [range] query performance
* the read-only portion of the table can actually be partitioned into multiple partitioned tables - to give better per-table statistics (statistics are still at the table-level only so even if your partitioning scheme is "monthly" you might want to have tables that represent a year's worth of data...especially if your trends seem to change year to year)
* large range queries against the read-only portion of the data will only escalate to the "table" (which is now separated from the read-write data)
* the read-write portion of the data can have fewer indexes
* the read-write portion of the data can be placed on different disks (MORE fault tolerant disks) due to the importance/volatility of the data
* finally, and most importantly, the read-write portion of the data can be maintained completely separately from the read-only portion with regard to index rebuilds

So, then how do you make it appear as one table? Use partitioned views over partioned tables and consider using a synonym for the hot/insert table. At the end of each month (or whatever your partitioning strategy uses - daily, weekly, monthly, etc.), "switch" the read-write portion of the table into the read-only portion of the table. You should be able to do all of this with no data movement and the synonym used for inserts will mean that your applications don't need to change either.

In summary, I do like the partition-level lock escalation feature especially as it doesn't require rearchitecting your solution/design. However, by creating two or more tables where read-only data is isolated from read-write, you can leverage many other features (like online index rebuilds).

If you're more interested in hearing about lock escalation at the partition-level check out Paul's recent blog entry on it: SQL Server 2008: Lock escalation changes.

Have fun and thanks for reading!
kt

Wednesday, October 03, 2007 6:01:47 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

OK, have you ever been working on one thing...that led you to another (and another and another) and then you seem to have lost hours? OK, I know. That's our life [in the computer industry and I'm sure others!] - putting out fires and chasing strange behaviors that we eventually call "gremlins" when we really can't figure them out (especially when they don't repro). And, I know that we all want to be incredibly prepared but, sometimes bugs happen. And, sometimes bugs lead to serious problems possibly even data corruption/loss (which I've seen) and NO, I'm not directly relating this to anything about SQL Server. I'm just wanting to stress the necessity of a backup strategy (ah, a *tested* backup strategy) but, the bugs I've run into today are really not all that serious (they do NOT corrupt data). But, they do lead me to believe that far fewer changes are made to collations than I had thought? Or, that many of you change collations at the most granular level (probably at the column level?) and that database collation changes are done but without additional column level changes later.  

And, that's really the point of this blog post... for now, I'm going to recommend that you make changes at the column-level OR you don't make additional changes AFTER you've changed a particular database's collation. In other words, if you have a case insensitive server and a case sensitive database then things will probably work well. You can even leverage things like default_collation for temp tables. However, if you try to make addition changes to collations in other objects - such as views - it doesn't seem to work. Basically, I've run into problems creating views with different collations only when the database's collation is different than the server's collation. So far, that's the only thing that I've found that's wrong with what I've been doing. And, I didn't even figure this one out on my own - I did a live search on the error and found this: http://cc.msnscache.com/cache.aspx?q=72171562874629&mkt=en-US&lang=en-US&w=286a60c3&FORM=CVRE which seems like the same problem I'm having (and sorry for the cached page, I couldn't seem to get to the live page).

Regardless of this issue (is it a bug?), the real reason for this blog post is that a great discussion came up on the Regional Director tech alias (it's an internal thing we use to leverage each other's skills). The original question led to a few discussions and in the end, I think there are really two questions that I thought I'd discuss here:

#1) do you want a ONE-TIME way of checking a bunch of data to find rows that are lower-case (and shouldn't be)
#2) do you want to REPEATEDLY find rows based based on a case-sensitive search (where the data is stored case-insensitive).

In my first response, I answered #2. And, I'm going to start with that one here too. If you want to query a case-insensitive column with a case-sensitive search then changing the collation on the fly (with a where clause) works (although there are some performance issues related to this). So, I took an old example of mine (which was against pubs) and I decided that it needed a refresh (meaning, I wanted to update this to work against AdventureWorks). And, that's where half of my fun today started since this is where I've run into what I think is a bug. Anyway, let's start with what works:

-- First, I'll create a test database. Without a collation specified,
-- it will use the server's default collation.

CREATE DATABASE TestAdventureWorks
go

-- Verify the database collation
SELECT DATABASEPROPERTYEX('TestAdventureWorks', 'Collation')
go

-- database is set to SQL_Latin1_General_CP1_CI_AS as expected
-- this is a case-insensitive database

USE TestAdventureWorks
go

SELECT LastName collate database_default AS LastName
, FirstName collate database_default AS FirstName
, MiddleName collate database_default AS MiddleName
INTO MyTestContacts
FROM Adventureworks.Person.Contact
go

SELECT *
FROM MyTestContacts
WHERE Lastname = N'Adams'
go -- (86 row(s) affected)

SELECT *
FROM MyTestContacts
WHERE Lastname = N'adams'
go -- (86 row(s) affected)

SELECT *
FROM MyTestContacts
WHERE Lastname COLLATE Latin1_General_CS_AS_KS_WS = N'Adams'
go -- (86 row(s) affected)

SELECT *
FROM MyTestContacts
WHERE Lastname COLLATE Latin1_General_CS_AS_KS_WS = N'adams'
go -- (0 row(s) affected)

-- Next, create a view:
CREATE VIEW ContactLastNameCaseSensitive
AS
SELECT
LastName COLLATE Latin1_General_CS_AS_KS_WS AS CSName
FROM MyTestContacts
go

SELECT *
FROM ContactLastNameCaseSensitive
WHERE CSName = N'Adams'
go -- (86 row(s) affected)

SELECT *
FROM ContactLastNameCaseSensitive
WHERE CSName = N'adams'
go -- (0 row(s) affected)

And, everything works... in TestAdventureworks. In the *real* AdventureWorks, I get an error when I try to create the view:
Msg 2791, Level 16, State 5, Procedure ContactLastNameCaseSensitive, Line 3
Could not resolve expression for schemabound object or constraint.

So, this is the first issue. It seems as though you can't create the view if your database has a different collation than the server collation. Well, (again), I haven't spent all that much time on this one but I did repro what the chain on the forum seemed to have found.

Now, as for the second issue... the query can be EXTREMELY painful and slow if you run this against a large set of data. See, changing the collation on the fly will need to perform a row-by-row comparison of the data. So, to minimize that HUGE hit on performance - you have two options.

1) actually consider changing the column's collation so that it matches your queries AND then create an index (note: the actual use of the index will be determined by the selectivity of the data).

SELECT LastName collate database_default AS LastName
, FirstName collate database_default AS FirstName
, MiddleName collate database_default AS MiddleName
INTO MyTestContacts2
FROM Adventureworks.Person.Contact
go

ALTER TABLE MyTestContacts2
ALTER COLUMN LastName nvarchar(100) COLLATE Latin1_General_CS_AS_KS_WS
go

CREATE INDEX CSNameInd on MyTestContacts2 (LastName)
go

-- let's use a query that's highly selective (selective enough to use the index):

SELECT *
FROM MyTestContacts2
WHERE LastName = N'Barlow'
go -- (1 row(s) affected)

SELECT *
FROM MyTestContacts2
WHERE LastName = N'barlow'
go -- (0 row(s) affected)

2) create an index with a different collation... but this is harder than it sounds as the CREATE INDEX statement doesn't directly allow changing collation (however, it should!). But, you can do this by either creating another column (real or computed) with the case-sensitive collation and then indexing it OR you could do this through an indexed view (but that adds a few complexities as well). I think the computed column that's indexed is GREAT if the searches are generally highly selective. If they are not, then it is probably better to create a real column - as a computed copy of the inserted value - that is case sensitive. However, at that point, I'm not entirely sure why you're keeping the case-insensitive version around...unless it's to keep the actual inserted value (maybe for printing and/or display?). Regardless, here's how you can create an indexed computed column.

ALTER TABLE MyTestContacts
ADD
CSName
AS LastName COLLATE Latin1_General_CS_AS_KS_WS
go

SELECT *
FROM MyTestContacts
WHERE CSName = N'Adams'
go -- (86 row(s) affected)

SELECT *
FROM MyTestContacts
WHERE CSName = N'adams'
go -- (0 row(s) affected)

CREATE INDEX CSNameInd ON MyTestContacts (CSName)
go

SELECT *
FROM MyTestContacts
WHERE CSName = 'Barlow'
go -- (1 row(s) affected)

SELECT *
FROM MyTestContacts
WHERE CSName = 'barlow'
go -- (0 row(s) affected)

And, the index will be used if the query is highly selective.

OK, so that ends the answer to part 2 of the question (see how tangents can take us a bit off track :)....

Now, let's get back to question #1.

What if you want to do a one-time search through your data to find all of the lower case data? Well, there are a few thoughts here.... First, let's modify the ONE Barlow row to be lowercase barlow so that we have something to find:

UPDATE MyTestContacts2
SET LastName = N'barlow'
WHERE LastName = N'Barlow'
go -- (1 row(s) affected)

NOTE: This is an ABSOLUTELY horrible quiery as I didn't use any key to point to the exact row I wanted to modify. Had there been a lot of Barlow's I would have modified them all. This worked here because I knew there was only one row. But, all of your tables should have a primary key, etc. (not even going to begin this tangent :).

Now, having said that... let's see if we can find this row easily? You should be able to do this using Transact-SQL and using some type of wildcard pattern matching such as:

SELECT * FROM MyTestContacts2
WHERE Lastname like N'b%'
go -- (1 row(s) affected)

And, that works without any problems.

So, what about NOT an upper case B.

SELECT * FROM MyTestContacts2
WHERE Lastname NOT LIKE N'B%'
go -- (18768 row(s) affected)

SELECT * FROM MyTestContacts2
WHERE Lastname NOT LIKE N'%B%'
go -- (18765 row(s) affected)

tangent number 87 <g>: if you're wondering what the 3 rows are (as was I :)... they are 1 row of O'Brien and 2 rows of Smith-Bates. Here's that query:

SELECT * FROM (SELECT * FROM MyTestContacts2
WHERE Lastname NOT LIKE N'B%') AS Bs
WHERE Lastname like '%B%'
go

OK, so, I thought we were there... I thought we could go to what I thought was the next logical step.....

SELECT * FROM MyTestContacts2
WHERE Lastname NOT LIKE N'%[A-B]%'
go

And... well, we lose barlow from the result set. For some reason...when you do ranges of characters it seems to lose the case??? I remember that [A-Z] and [a-z] were different in some release? Is this a regression? Someone help me out with this one as I'm without a clue. In the end, the ONLY way I could get this to work is to do this:

SELECT * FROM MyTestContacts2
WHERE Lastname not like N'%A%'
AND Lastname not like N'%B%'
AND Lastname not like N'%C%'
AND Lastname not like N'%D%'
AND Lastname not like N'%E%'
AND Lastname not like N'%F%'
AND Lastname not like N'%G%'
AND Lastname not like N'%H%'
AND Lastname not like N'%I%'
AND Lastname not like N'%J%'
AND Lastname not like N'%K%'
AND Lastname not like N'%L%'
AND Lastname not like N'%M%'
AND Lastname not like N'%N%'
AND Lastname not like N'%O%'
AND Lastname not like N'%P%'
AND Lastname not like N'%Q%'
AND Lastname not like N'%R%'
AND Lastname not like N'%S%'
AND Lastname not like N'%T%'
AND Lastname not like N'%U%'
AND Lastname not like N'%V%'
AND Lastname not like N'%W%'
AND Lastname not like N'%X%'
AND Lastname not like N'%Y%'
AND Lastname not like N'%Z%'
go

And, well, that works. But, it is NOT pretty! The query's going to require a table scan anyway AND it is a one-time query. I'm OK with this as a solution to this problem BUT, am I missing something here? Please tell me there's something more clever here? Is this a bug?

I'm definitely interested in feedback on this one!
kt

Wednesday, October 03, 2007 1:01:52 AM (Pacific Standard Time, UTC-08:00)  #    Comments [6]  | 
Tuesday, September 04, 2007

OK, if you've been reading at all... you know that Paul's leaving Microsoft (ok, he's left... yippie!) and joining SQLskills (ok, he's joined). And, well, that's the reason for this blog entry. He's going to work on some updates to his prior posts (in the very near term) on the SQL Server Storage Engine blog as well as many new ones. And, he's threatening me to blog more as well.

So, be sure to subscribe to Paul S. Randal "In Recovery" (very well named :): http://www.sqlskills.com/blogs/Paul/ and here's his rss: http://www.sqlskills.com/blogs/paul/SyndicationService.asmx/GetRss.

We have a TON of exciting things planned, including some upcoming events that we're delivering together:

And, that's it for now...........but stay tuned - there's a lot more coming from us as SQLskills expands!

THANKS for reading,
kt

Tuesday, September 04, 2007 7:38:42 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Wednesday, August 01, 2007

Yesterday, Microsoft released the July Community Technical Preview (CTP) for SQL Server 2008. The release makes available several performance and feature enhancements designed to support mission-critical platform and enable pervasive insight across the enterprise. (ok, those last two sentences weren't mine :)

Check out http://www.microsoft.com/sql/prodinfo/futureversion/default.mspx for all of the latest info. You can download the SQL Server 2008 July CTP at: http://connect.microsoft.com/sqlserver/

     Thanks and have fun!
kt

Wednesday, August 01, 2007 10:27:27 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, July 31, 2007

OK, I'm not normally the one to blog a lot of personal stuff but there's definitely a reason for a lot of my silence over the past couple of months....... I was planning a wedding. OK, Paul helped out more than you would expect - he met with the officiant (Annemarie of Annemarie Juhlian), the photographer (John Mitchell of John Mark Photography), the ceremony/reception venue (the Woodmark Hotel), the florist (Nathan of Fena Flowers), the amazing baker (Mike's Amazing Cakes) and Paul did so much more too! So, I can't say it was all me but that's the excuse I'm using (for my lack of blogging :) and I'm sticking to it! It was a crazy amount of work in general and now I can truly understand why people hire wedding coordinators!!

Anyway, the weekend could NOT have gone better. So many of our friends and family were able to make it - even our one remaining Grandmother (Doris McDonnell, my Father's Mother - who is 84) made the journey out here. Which, btw, was no small feat for her... it was a train to my Mother's house to spend the night, then together they took a train to Chicago's grand central station where they then caught a cab to O'Hare to catch a 4 hour flight to Seattle. And, that was a fairly short flight compared to Goksin Bakir's flight(s) from Istanbul, Gunther Beersaerts' flight(s) from Belgium, Olga Londer's flights from London, Paul's entire family from Glasgow, Scotland, the Tripp/Fussell/Bostick clan's flight(s) from Tampa, Carl Franklin's flight(s) from Connecticut and Richard Campbell's drive down from Vancouver (ok, that was short compared to Peter Bennett's drive from Racine, Wisconsin). And, of course, we had many of the usual suspects: Bob Beauchemin, Brian Randell, Clemens Vasters, Euan Garden, Gert Drapers, Gunther Beersaerts, Juval Lowy, Liz Vitt, Matt Nunn, Michele Leroux Bustamante, and last, but not least, Timmie Huckaby. And, those were only a few of the more public geeks :) :). Speaking of public... I'm *very* afraid of the recorder that Carl (of the infamous DNR) had running during the late night hours of the BBQ on Friday night.

Speaking of the BBQ, here's our favorite photo (so far!) from the BBQ on Friday.

Paul already blogged our favorite (so far!) from the wedding. Oh, and why only "so far?"... John took over 40GB of photos from the weekend (the BBQ was on the 27th and the wedding was on the 29th). We cannot wait to see all of what he's done. There are photos that we've seen now that show us that we never even knew he/they (John/Katie/Boon) were there...capturing moments rather than just posed shots. With how fun, professional, timely and creative they were - it's obvious that they all love what they do. In fact, John feels more like a friend than a vendor. It was truly a pleasure having them attend the wedding...and not just for the photos!

So, I thought I'd blog a tad of the personal stuff today and let you know about the best SQL union of which I'm aware <g>... we look forward to all of the exciting things that SQLskills will be doing as we move forward with Paul coming over from the Microsoft mothership as "Managing Director" starting on September 1st. Stay tuned with us as I know good things are still yet to come.

Cheers,
kt

Tuesday, July 31, 2007 10:24:19 AM (Pacific Standard Time, UTC-08:00)  #    Comments [7]  | 
Thursday, July 26, 2007

I had a customer ask me about the debate of skipping SQL Server 2005 entirely and just moving straight to SQL Server 2008... I haven't thought much about this as most of my direct customers have moved already BUT, I know there are a few out there that have not yet moved.

So, I definitely need to spend some time on SQL Server 2008 upgrade requirements, restrictions, etc. - and that's my plan in the not-too-distant-future BUT, I haven't thought much about it so far.

Your thoughts???

Cheers,
kt