Effectively Designing a Scalable and Reliable Database

A Primer to Proper SQL Server Development
SQL Server Mixed Workloads, Secondary Databases, Locking and Isolation, Part 6 of 10

Presented by Kimberly L. Tripp, SQLskills.com

 

Q: Can I view a recording of this webcast?

Part 6 can be replayed by clicking here.

All “on-demand” webcasts can be found here.

 

Q: Where can we get the demo scripts AND the sample database: Credit?

The demo scripts are in this zip (20050916MSDNDemoScripts.zip (6.11 KB)); here in this blog entry. However, at the series completion, I will also create an entry under Past Event Resources for the entire webcast series.

 

To download the ZIP of the Credit Database Backup click here. Once unzipped, restore this backup to SQL Server 2000 or SQL Server 2005. The backup is a SQL Server 2000 backup and can be restored to either version! If restoring to SQL Server 2005, you might want to change the destination for the data and log file as the path will probably be different.

 

Q: Where are the links to all prior Webcast Q&As from this series?

Part 1: Creating a Recoverable Database

      For the MSDN Download for Part 1, click here.

      For the SQLskills Blog Entry for Part 1, click here

 

Part 2: Creating a Reliable and Automated Backup Strategy

      For the MSDN Download for Part 2, click here.

      For the SQLskills Blog Entry for Part 2, click here.

 

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures

      For the MSDN Download for Part 3, click here.

      For the SQLskills Blog Entry for Part 3, click here.

 

Part 4: SQL Server Indexing Best Practices

      For the MSDN Download for Part 4, click here.

      For the SQLskills Blog Entries for Part 4

         Resource links blog entry, click here.

         Q&A blog entry, click here.

 

Part 5: SQL Server Index Defrag Best Practices

      For the MSDN Download for Part 5, click here.

      For the SQLskills Blog entry, click here.

Q: How can I replay previous sessions? I thought we were going to get emails for replaying -- but I haven't received any replay emails.

You will receive replay emails ONLY when you register for these sessions through MSDN. We’ve come to find out that there are other ways to register but it’s only through MSDN that we know for sure you will receive the replay information. Regardless, you can always find the “on-demand” version of the sessions here

 

Technical Questions

Q: I know you have covered indexes and backups in other webcasts, but here is my question I use heavily temporary tables. My TempDB grows up to 5 GIG. Should I backup or truncate the transaction log in order to bring it back to its normal size?

No, there is no need to specifically maintain the transaction log of the TempDB database. If the transaction log (and subsequently, the database) grows large – there could be multiple reasons for that and instead of thinking in terms of trying to manage the log, I’d look at long running transactions and/or large transaction. You can use Profiler to help you see long running and/or large transactions.

 

Q: Can I perform a database snapshot to another server?

No, database snapshots must be created on the same server as the database on which the snapshot is being based.

 

Q: Can I snapshot by filegroup?

No, however if what you want to do is create a snapshot which does NOT include certain files – you can take those filegroups offline and then create the snapshot. In the snapshot the only file/filegroups available will be those which were online when the snapshot was created…even if those files/filegroups are brought online after the snapshot was created.

 

Q: Could a reader be blocked on the snapshot DB while SQL updates the changed page?

No. The copy on write mechanism is really a copy before write mechanism and the pages will be copied before the write and essentially before the locks, etc. The only possible “blocking” could be caused by the excess I/Os that need to be performed. However, the I/Os are performed only on the FIRST change to the page after the snapshot is created – so it’s minimal!!

 

Q: Are DMVs in SQL Server 2005 only?

Yes, DMVs = Dynamic Management Views and these are a feature of SQL Server 2005.

 

Q: Is read uncommitted the lowest/least in terms of data consistency?

Yes, read uncommitted is also known as “dirty read.” A dirty read is a read against an “in-flight” transaction; this transaction could be rolled back. As a result, the query that read that data would be inaccurate.

 

Q: What is the effect of versioning on fragmentation and performance (I'm assuming I create a split of the page is full)?

Actually, I’m not sure I’m following this one… But – I think I can answer it by just giving you some insight into how things work. Versioning – in terms of the data overhead added to the data row – does add a 14 byte value to help store the offset. This overhead is added ONLY once, to each row, after one of the snapshot isolation options is turned on (either or both – the READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION). When this 14 byte value is added to each of the rows, the additional 14-bytes might cause the page to split. Again, this is only a one-time addition. The trick to optimizing this structural change is to change the database option and then rebuild your indexes. This will make the data contiguous and versioning will have no additional affects on the data row.

 

Now, if what you were thinking is that the versions were stored in the data row – then this is NOT the case. The version store comes from the TempDB and as a result, there is no additional overhead (over the 14-bytes) needed within the data row.

 

Q: What if we're not using transactions? Will repeatable read still lock the table during the read (particularly if the select is long)?

Repeatable reads locks – and holds – the resources as they are read. So, YES, in the case of a select statement, you will acquire and hold the read locks for the life of the transaction.

 

Q: How much additional overhead does versioning require from the SQL Server engine?

Most of the overhead comes from TempDB but there’s also a bit of overhead in looking up the version. There are no direct numbers associated with the overhead but in a lot of cases you should think in terms of roughly 10% additional costs for your transaction… So, I guess the best point is that you will have slower overall performance when implementing row version; however, you might solve a lot of your blocking problems. Slower but not blocked is better than not running at all – even when it does run at all quickly. J  In all seriousness though, if blocking is NOT your primary problem, you will add overhead without a possible benefit.

 

Q: So, is it the new transaction data or the old transaction data held in the snapshot store (seems like it might be different for statement vs. transaction level snapshotting)?

It’s always the BEFORE image. The general process of the write is called “copy on write” but I think of it better as copy before write.

 

Q: Can we optimize the snapshot store (different physical device, file group(s), etc.)?

No. However, you should look at optimizing TempDB. There are multiple things that you might want to consider. I discuss those thing in this blog entry here.

 

Q: If I don’t need locking why shouldn’t I use read-uncommitted?

Hmm, you can… you just need to be aware of the fact that the data is “dirty” and is not guaranteed to persist.

 

Q: Where does SQL store all the row versions (with snapshot isolation turned on)?

The version store is in TempDB.

 

Q: How do I view all of the Report options from the summary page? I am looking at Adventureworks (compatibility level = 90), but all that I see is the General report.

Ah, ha! The new summary windows were added to after the beta II April CTP. So, what this tells me is that you’re running a build lower than 9.00.1187.07. At this point, I’d go for the September CTP which is build 9.001314.06.

 

Q: This question is from previous webcast... Is there anything new with SQL 2005 that does datetime support data types? Time datatype or Date datatype only?

No, SQL Server 2005 only includes the datetime datatype for date/time data. However, by using “custom types” you can create your own types which are date only or time only (just for one example). There were separate SQLCLR types of date only and time only in SQL Server 2005; however these were non-native types and subsequently removed. Instead, they will be shipped as examples in a resource kit which ships after RTM.

 

Q: Kimberly, the downloadable zip file from your blog for at least the first session will only unzip to a "C:" drive (which my system doesn't have ;^) Could you please re-zip it to allow election of the drive to which it should unzip?

This one still perplexes me. I didn’t set any options that would restrict this…

 

For the next session, we’re going to cover how SQL Server keeps plans, where you can look to see what's in cache AND how you can know better if the stored procedure's plan should be kept...or not? If you’re interested in hearing more - here’s the registration link:

MSDN Webcast: A Primer to Proper SQL Server Development (Part 7 of 10): Understanding Plan Caching and Optimizing Procedure Performance

 

See you on Friday!

kt

Categories:
Events | Resources | SQL Server 2005

Brian A. Randell's Blog: http://www.mcwtech.com/CS/blogs/brianr/default.aspx
MCWTechnologies Website: http://www.mcwtech.com/

Kimberly L. Tripp's Blog: http://www.SQLskills.com/blogs/Kimberly
SQLskills Website: http://www.SQLskills.com

Presentation Resources
Presentation in PDF form
Kimberly's Demo Scripts
Brian's Demo Scripts and Code

Running SQL Server 2000 tools and SQL Server 2005 tools side-by-side
We talked about re-registering all of your COM components and I didn't have a slide for this. So, if SQL Server 2000 Enterprise Manager crashes when you try to access database properties OR SQL Server 2000 Query Analyzer doesn't seem to do the color coding correctly, then you need to re-register your COM componenets in BOTH of the following directories:
   c:\program files\microsoft sql server\80\tools\binn 
   c:\program files\microsoft sql server\mssql\binn

To re-register the components, execute: FOR %i IN (*.dll) DO regsvr32 /s %i

Resources and Presentations on Indexing Best Practices
First, start by reviewing the blog entries listed in the Indexes category here.
As for the Webcasts - there are 6 from which to choose! Each webcast has an associated Q&A posted to my blog - make sure to look for the Q&As. Usually they are posted within 1 week (give or take :) from the actual webcast.
   MSDN Webcast: Indexing for Performance - Finding the Right Balance (SQL Server 2000), recorded 11 June 2004
   MSDN Webcast: Indexing for Performance - Index Maintenance Best Practices (SQL Server 2000), recorded 19 July 2004
   TechNet It’s Sh0wtime Webcast: Index Creation Best Practices with SQL Server 2005, recorded at Tech Ed Amsterdam, July 2005
   TechNet It’s Sh0wtime Webcast: Index Defragmentation Best Practices with SQL Server 2005, recorded at Tech Ed Amsterdam, July 2005
   MSDN Webcast Series: Part 4 of 10, Best Practices in Indexing, recorded 26 August 2005
   MSDN Webcast Series: Part 5 of 10, New Features in Indexing and Index Maintenance Best Practices, recorded 2 September 2005

MSDN Webcast Series: Building Highly Reliable and Available Systems with SQL Server 2005
Watch one on-demand and/or sign up to attend one of the remaining!

Part 1: Creating a Recoverable Database

      For the MSDN Download for Part 1, click here.

      For the SQLskills Blog Entry for Part 1, click here

 

Part 2: Creating a Reliable and Automated Backup Strategy

      For the MSDN Download for Part 2, click here.

      For the SQLskills Blog Entry for Part 2, click here.

 

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures

      For the MSDN Download for Part 3, click here.

      For the SQLskills Blog Entry for Part 3, click here.

 

Part 4: SQL Server Indexing Best Practices

      For the MSDN Download for Part 4, click here.

      For the SQLskills Blog Entries for Part 4

         Resource links blog entry, click here.

         Q&A blog entry, click here.

         Part 5: SQL Server Index Defrag Best Practices
         
      For the MSDN Download for Part 5, click here.
               
For the SQLskills Blog entry, click here.

         Part 6: SQL Server Mixed Workloads, Secondary Databases, Locking and Isolation
         
      For the MSDN Download for Part 6, click here.

         Part 7: Understanding Plan Caching and Optimizing Procedure Performance 
         
      To register to attend, click here.

         Part 8: Data Loading and Aging Strategies 
         
      For the MSDN Download for Part 6, click here.

         Part 9: Profiling for Better Performance 
         
      For the MSDN Download for Part 6, click here.

         Part 10: Most Common Roadblocks to Scalability and Reliability 
         
      For the MSDN Download for Part 6, click here.

Profiling SQL Server and Creating a Server-side Trace
INF: How to Create a SQL Server 2000 Trace (283790)
HOW TO: Programmatically Load Trace Files into Tables (270599)
How To: Stop a Server-Side Trace in SQL Server 2000 (822853)
INF: How to Monitor SQL Server 2000 Traces (283786)
INF: Stored Procedure to Create a SQL Server 2000 Blackbox Trace (281671)
BUG: BOL Incorrectly States That Users Do Not Need to Be Sysadmin to Use Profiler or SQL Profiler SPs (310175) 
   NOTE: This is ONLY a SQL Server 2000 limitation.
INF: Job to Monitor SQL Server 2000 Performance and Activity (283696)
Support WebCast: SQL Server 2000 Profiler: What's New and How to Effectively Use It

Great KB to Start with for Troubleshooting
HOW TO: Troubleshoot Application Performance with SQL Server

What about Whitepapers - we referenced quite a few!
Get a Lean, Mean Dev Machine with the Express Editions of Visual Basic and SQL Server 2005 by Brian A. Randell 
SQL Server 2005 Beta 2 Transact-SQL Enhancements by Itzik Ben-Gan 
SQL Server 2005 Partitioned Tables and Indexes by Kimberly L. Tripp
SQL Server 2005 Snapshot Isolation by Kimberly L. Tripp
SQL Server 2005: the Database Administrator’s Guide to the SQL Server Database Engine .NET Common Language Runtime Environment by Kimberly L. Tripp

Other Whitepapers, Websites, and Webcasts
Blog Entry: 8 Steps to Better Transaction Log Throughput
MSDN Whitepaper: An Overview of SQL Server 2005 for the Database Developer
MSDN Whitepaper: Processing XML Showplans Using SQLCLR in SQL Server 2005
MSDN Whitepaper: Using CLR Integration in SQL Server 2005
MSDN Whitepaper: XML Support in Microsoft SQL Server 2005
MSDN Whitepaper: XML Options in Microsoft SQL Server 2005
MSDN Whitepaper: What's New in FOR XML in Microsoft SQL Server 2005
MSDN Whitepaper: XML Best Practices for Microsoft SQL Server 2005
MSDN Whitepaper: Usage Scenarios for SQL Server 2005 Native Web Services
MSDN Whitepaper: Managed Data Access Inside SQL Server with ADO.NET and SQLCLR
MSDN On-demand Webcasts 
MSDN Live Webcasts 
SQL Server 2005 Hands-On Labs
         
SQLCLR Hands-On Lab Manual
Microsoft SQL Server TechCenter on TechNet
Sample Book Chapters for SQL Server 2005 is a list of chapters posted from a variety of authors for books related to SQL Server 2005.
Hosting the .NET Runtime in Microsoft SQL Server on the Association for Computing Machinery (www.ACM.org). To access this article you need membership in SIGMOD, the ACM, or you can purchase just this article for download.
Service Oriented Database Architecture by David Campbell, also on the Association for Computing Machinery (www.ACM.org). To access this article you need membership in SIGMOD, the ACM, or you can purchase just this article for download.

Gert E.R. Drapers' website
Microsoft SQL Server Development Customer Advisory Team
PDC Information Site

Well, if that doesn't keep you busy, I don't know what will!

Enjoy!
Kimberly

Categories:
Events | Resources | SQL Server 2005

This is a much needed and much overdue blog entry... In 8 Steps to Better Transaction Log throughput, I mentioned a customer that was helped by TWO typical optimization problems I see. In that blog entry, I said I would write two blog entries - that one on transaction log optimization and another on common tempdb optimizations. Well, I forgot...until I was reminded with an email this morning (thanks Marcus!).

First - a bit of understanding of TempDB - what goes there?

  • Internal temporary objects needed by SQL Server in the midst of other complex operations. For example, worktables created by a hash aggregate will be stored in TempDB or interim tables uses in hash joins (almost anything that shows as "hash" something in your query plan output is likely to go to tempdb).
  • User objects created with either # (for local temporary objects), ## (globabl temporary objects) or @ (table variables)
    • # = Local temporary object
      Local temp objects are objects accessible ONLY in the session that created it. These objects are also removed automatically when the session that created it ends (unless manually dropped).
    • ## = Globabl temporary object
      Global temporary objects are objects that are accessible to ANYONE who can login to your SQL Server. They will only persist as long as the user that created it lasts (unless manually dropped) but anyone who logs in during that time can directly query, modify or drop these temporary objects. These objects are also removed automatically when the session that created it ends (unless manually dropped) OR if being used by another session when the session that created it ends, when the session using it finishes using it (and it's only as long as any locks are held). If other sessions need more permanent use of a temporary object you should consider creating a permanent objects and dropping it manually.
    • @ = User-defined Table Variable
      User-defined Table Variables were introduced in SQL Server 2000 (or, wow - was it 7.0?) and provide an alternative to temporary tables by allowing you to create a variable defined as type TABLE and then you can populate and use it in a variety of ways. There has been A LOT of debate over whether or not you should always use table variables or always use temp tables. My response is that I ALWAYS avoid the word always! My point is that table variables are NOT always better nor are temp tables always better. There are key uses to each. I tend to like temp tables in scenarios where the object is used over a longer period of time - I can create non-key indexes on it and it's more flexible to create to begin with (SELECT INTO can be used to create the temp table). I also have the ability to use the temporary table in nested subprocedures because it's not local to the procedure in which it was created. However, if you don't need any of those things then a table variable might be better. When it is likely to be better - when you have smaller objects that don't need to be accessed outside of the procedure in which it was created and when you only need KEY indexes (a table variable ONLY supports the indexes created by a create table statement - meaning PRIMARY KEY and UNIQUE KEY).
  • Objects created by client applications - this is possibly a large part of your problem... Profiling can help you to determine if there's a lot of TempDB usage from your client applications.

OK, so now that you know what goes there - how do you make it optimal?

First and foremost, TempDB is in cache just as any other database is in cache. TempDB does not spill to disk unless you are low on cache and/or if you have a lot of inflight transactions in TempDB. Although TempDB is not persisted from shutdown to restart - it still needs to do some logging and therefore you should consider its optimization a lot like other databases.

Things you should do for TempDB (that are a lot like what you should do for every database):

  1. Isolate the data and log portion of TempDB.
  2. Place them on clean, defragmented disks.
  3. Pre-allocate them so they don't need to do a lot of autogrowth.
  4. Make sure you have sufficient memory to support active objects (check for disk activity to the disks that contain TempDB files).
  5. Make sure that transactions are written efficiently so that there are no unusually long running transactions that are unnecessarily holding resources (and therefore locks and therefore log activity).

And - if you need to move TempDB, you should review this KB Article: Moving SQL Server databases to a new location with Detach/Attach

Things you should do SPECIFICALLY for TempDB (especially if you're running on a multiproc machine):

Before I say what... let me tell you why? TempDB has a large number of objects being created all the time. For an object to be created, space must be allocated to it. Space allocation is determined by looking at some of the internal system pages (the GAM, and SGAM). In the end, it is these pages that start to have significant contention (with just one file) in a VERY active TempDB. To minimize that contention you can create multiple files.

  1. Consider creating multiple files for TempDB (even if on the same physical disks) so that there is less of a bottleneck when objects are being allocated. Make sure to read associated KB.
  2. Consider setting a trace flag to have object allocation grab extents rather than pages. Make sure to read associated KB.

BOTH of these last two are described in detail by a KB article: FIX: Concurrency enhancements for the tempdb database.

OK - so that should really help! Moving forward (meaning SQL Server 2005), having multiple files can still help for TempDB.

Effectively Designing a Scalable and Reliable Database

A Primer to Proper SQL Server Development
New Features in Indexing and Index Maintenance Best Practices, Part 5 of 10

Presented by Kimberly L. Tripp, SQLskills.com

 

Q: Can I view a recording of this webcast?

Part 5 can be replayed by clicking here.

All “on-demand” webcasts can be found here.

 

Q: Where can we get the demo scripts AND the sample database: Credit?

The demo scripts are in this zip (20050902MSDNDemoScripts.zip (8.52 KB)); here in this blog entry. However, at the series completion, I will also create an entry under Past Event Resources for the entire webcast series.

 

To download the ZIP of the Credit Database Backup click here. Once unzipped, restore this backup to SQL Server 2000 or SQL Server 2005. The backup is a SQL Server 2000 backup and can be restored to either version! If restoring to SQL Server 2005, you might want to change the destination for the data and log file as the path will probably be different.

 

Q: Where are the links to all prior Webcast Q&As from this series?

Part 1: Creating a Recoverable Database

      For the MSDN Download for Part 1, click here.

      For the SQLskills Blog Entry for Part 1, click here

 

Part 2: Creating a Reliable and Automated Backup Strategy

      For the MSDN Download for Part 2, click here.

      For the SQLskills Blog Entry for Part 2, click here.

 

Part 3: Designing Tables that Scale, Best Practices in Data Types and Initial Table Structures

      For the MSDN Download for Part 3, click here.

      For the SQLskills Blog Entry for Part 3, click here.

 

Part 4: SQL Server Indexing Best Practices

      For the MSDN Download for Part 4, click here.

      For the SQLskills Blog Entries for Part 4

         Resource links blog entry, click here.

         Q&A blog entry, click here.

Q: How can I replay previous sessions? I thought we were going to get emails for replaying -- but I haven't received any replay emails.

You will receive replay emails ONLY when you register for these sessions through MSDN. We’ve come to find out that there are other ways to register but it’s only through MSDN that we know for sure you will receive the replay information. Regardless, you can always find the “on-demand” version of the sessions here

 

Related Resources

MSDN Webcast: Indexing for Performance – Proper Index Maintenance

MSDN Whitepaper: Microsoft SQL Server 2000 Index Defragmentation Best Practices

TechNet It’s ShOwtime Webcast: Index Defragmentation with SQL Server 2005 

Technical Questions
Q: In your script, what is "HA Requirements"?
HA = High Availability. This is the requirement that your table stay online and available. Some companies are trying to achieve 99.999% uptime, this is especially challenging when even maintenance operations take a table offline.

Q: If you create extra indexes is there a easy to configure utility that you can run across an application after it has run for a few months to list keys that were never or infrequently used?
Use one of the new DMVs: sys.dm_db_index_usage_stats. To see the complete list of DMV objects, use the following query:

SELECT * FROM sys.system_objects
WHERE [name] LIKE 'dm[_]%'

Q: What is DMV again?
Dynamic Management View. These are new objects which give information about in-memory objects and state information.

Q: What are the parameters and their usage – for sys.dm_db_index_physical_stats?

   (DatabaseID, ObjectID, IndexID, PartitionNumber, Mode)

   DatabaseID = [ NULL | 'DatabaseID' ]

NULL: returns information for ALL databases, if NULL is used no other options can be supplied. This returns ALL indexes for all objects in all databases. Easy but possibly slow.
DatabaseID: smallint type. Refers to the ID for a specific database. DB_ID() or DB_ID('DatabaseID') can be used. The latter allows you to run this from ANY database as long as you have access. However, 3-part naming must be used.

   ObjectID = [ DEFAULT | NULL | 'ObjectID' ]

DEFAULT/NULL: return ALL base data: CL, Heap, LOB for the specified database.
ObjectID: int type. Refers to the ID for a specific object. OBJECT_ID('TableName') can be used. When using OBJECT_ID, you can use 1/2/3-part naming. Be sure to use 3-part when executing outside of database.

   IndexID = [ DEFAULT | NULL | 'IndexID' ]

DEFAULT/NULL: All indexes
IndexID: tinyint type. Refers to the ID of a specific index.

   PartitionNumber = [ DEFAULT | NULL | # ]

DEFAULT/NULL/0: return ALL partitions
#: returns only the details about specific partition. When a PartitionNumber is specified then an IndexID must also be specified.

   Mode = [ DEFAULT | NULL | 'SpecificMode' ]

DEFAULT/NULL/LIMITED: return FAST scan and use only an IS (Intent Shared) Table-level lock. This lock blocks ONLY eXclusive TABLE-level locks and schema changes. Excellent, relatively unobtrusive way to get fragmentation details.

LIMITED: IS Lock. Same as SQL 2000 WITH FAST, only page counts and EXTERNAL fragmentation displayed. Does not detail INTERNAL fragmentation and page density.

SAMPLED:  IS Lock. For tables less than 10,000 pages (~80MB), all details are produced. For tables of more than 80MB, two samples are done (1% and 2%) at every nth page. The samples are compared and if close, 2% sampling output returned. If not close, then up to 10% will be sampled.

DETAILED: S Lock. Entire table analyzed for both internal and external fragmentation. Returns one row for each level of the index from the leaf level (level 0) all the way up to the root level. This can help you determine fragmentation in the non-leaf levels but at the expense of holding a shared table level lock.

Q: How often should you run DEFRAG on your SQL server box? Should this be a part a regular schedule? Taking down SQL is their any other consideration?

First, the only thing that’s not available is the table being REBUILT. Defragging an index does not take that table/index offline. So, more than anything, it depends on what you’re trying to achieve. If you want achieve better availability on SQL Server 2000 then you might choose to defrag rather than rebuild – to keep your tables available.

Q: How often do you get such perfect tables in practice?
A table is always completely clean and contiguous after a rebuild. To periodically fix a table, you should use consistent and automated rebuild strategies.

Q: Do you have suggestions for developers using MSDE when customer’s demands can vary? Vary from few transactions to a large customer with many transactions.
The general best practices in database and table design scale from the low end all the way up to the high-end and in the end – helps your database scale!

Q: Can you touch on rules of thumb for "pad index"?
If fragmentation in the leaf level is minimized through proper index maintenance and fillfactor – then fragmentation in the non-leaf levels should be low as well. You rarely need to specify padindex unless you have widely varying distribution of data and really want to leave larger gaps because of strange densities of data.

Q: Do most of these "Index Rules" apply to Indexed-Views?
Yes! All indexes can become fragmented after data modifications... Your scripts should always look for fragmentation across all scripts.

Q: Can you discuss fragmentation WRT horizontal partitioning, especially range partitioning on the primary key?
SQL Server 2005 offers more granular rebuild options –but not necessarily online. In many cases, you might want to design a read-only partitioned table and keep the volatile portion of the table (especially if only one partition), in its own separate table – possibly using a partition view (or an inline table valued function) over these two tables.

Q: If I'm selecting from a table with a where FirstName = ... and LastName = ... and I have 2 indexes, one on LastName and another on FirstName. Are they both used?
With an AND – maybe. The optimizer will look at the Index statistics to determine if either of them selective enough to use only one index. If neither is selective alone and a better index does not exist (a better index for AND would be one that includes BOTH of the columns in the SAME index – as a composite index), then SQL Server may choose to join the indexes (index intersection).

Q: URLs on the Resources slide can't be read. Could you type then into the Q&A, please?
When the session is available for download (which is what happens when MSDN put this online), then you can access the URLs there as well. Typically, I place all of the links at the beginning of the Q&A – resources section. I’ll make sure to do this consistently!

Q: How does an uniqueidentifier used as a clustered primary key effect performance?
This is best answered by session 4. In short, a non-sequential GUID can cause a lot of fragmentation.

Q: What is ExtentFragmentation as reported by DBCC SHOWCONTIG and is it less important than Logical Fragmentation?
Extent Fragmenation refers to how many extents are next to each other. This is a bit more important than Logical Fragmentation as logical fragmentation shows whether or not the pages are next to each other.

Q: How much danger is there in the defrag processes? What kind of backup procedures do you suggest when you defrag?
More frequent transaction log backups. A defrag generates a lot of log information. However, it does so in mini transactions. As a result, transaction log backups can occur concurrently with the defrag process and even though the defrag is not complete, the transaction can still be cleared because the defrag process runs as small transactions instead of one long running transaction. This also improves concurrency because the locks are released throughout the process.

Q: Defraging a large index can cause the log file to grow quite large. Is there a way to minimize this other than frequently log backups?
Yes, you’re correct – defraging a large index WILL grow the log file quite large! As for minimizing this activity in the log – no way to do that. But – you’re correct in increasing the frequency of log backups!

Q: With very large tables, how much available disk space (both transaction logs and data drive) do you need to have to rebuild? Does it take less space to defrag than to rebuild?
Well, this is really a multipart question… First, log space for rebuilds is mostly dependent on the recovery model. If you are running the FULL recovery model then creating and/or rebuilding indexes will take enough log space for the entire rebuild to complete. If you are running in the BULK_LOGGED or SIMPLE recovery models then this operation will run as a bulk operation and will be minimally logged. While this will take less time and significantly less log space, you are giving up some features when switching recovery models. I would strongly suggest reviewing the second session to see if this is appropriate.

Now, as for data space – a rebuild will always require at least the table size in free space and possibly as much as double (if an online rebuild is being performed). Typically, when space estimates are being done (when capacity planning the database) I always recommend taking the largest table size and multiplying it by 2 or 3 – in order to make sure you have enough space for rebuilds. There is space needed for sorting as well – this can come from the database OR from tempdb (using the SORT_IN_TEMPDB option).

Defraging doesn’t move an object so it doesn’t take additional data space BUT it does require more overall log space because it runs as mini transactions instead of just one.

Q: Should we look at different fragmentation stats if there are multiple files in the same filegroup?
No, you s