It's amazing the things you learn about people the more time you spend with them... the first thing I want to share is about a great longtime friend of mine - Gert Drapers. It's based on the fact that one of the best parts about a regional event is that you often end up in someone's neighborhood... Tech Ed Europe is in the Netherlands this week and it's his home... Who would have known I'd learn something new about him here - he cooks!

Yes, after 9 years I've learned that he can cook.... Mostly hanging out with him at conferences we've been to a fair share of restaurants together. In his home town we decided to have dinner and what I didn't know is that he would be cooking it! So - I've decided I have to share this... Gert & Vera in the Kitchen (26.06 KB).

And I happen to be posting this while Gert's lecturing on SQLDTS in 2005....

And who else do I need to mention - Pat Helland, David Chappell and Don Box. Last night at dinner Don mentioned that they were performing a new song at the end of Pat's session today (Metropolis: Envisioning the Service-Oriented Architecture). Not only was the session fantastic but their song was some of the best fun I've had at a session in a long time - if ever!!! They did a version of Bye Bye Miss American Pie called Bye Bye Mr CIO Guy.... The words were great (written by Pat) and a video will soon follow. You have GOT to see it - watch on Pat's site www.PatHelland.com as he says it should be there shortly (hopefully this week). (And - from a comment - check out this link: http://channel9.msdn.com/ShowPost.aspx?PostID=11950#11950)

And I'll leave you with a couple of final pictures from dinner last night!

Rocky Lhotka, Darren Green, Don Box and Brian Randell - RockyDarrenDonBrian.JPG (21.46 KB)
Michele Leroux Bustamante, Don Box and me - making Don Blush! MicheleDonKimberly.JPG (25.45 KB)

More fun coming - more pictures to post..... and maybe some technical info too :)

Categories:
Events

Well, I'm not one to rant really but I have to tell you that you wouldn't believe the comedy of errors that all fell together during my 31 hours of travel from my house in Redmond, WA to the hotel in the Babin Kuk area of Dubrovnik, Croatia. My original itinerary was SEA (Seattle) to IAD (WA DC) to VIE (Vienna, Austria) to DBV (Dubrovnik)... it really wasn't too bad but the connections weren't great with 3-4 hours in each city. I was to arrive Sunday afternoon around 3pm. I figured I'd jump in the Adriatic to wake up, have dinner and go to bed.... However, the travel gods had other plans.

I originally booked my ticket to ONLY Amsterdam but later ammended it for the extra week to jet down to Croatia (fyi - it's absolutely beautiful here - I never would have chosen it on my own but I have to admit I hope to come back here again and again - I've added a few pictures that I took this year as well as some I took last year.... it's truly a spectacular treasure).

TheOldTown.jpg (126.29 KB) - view from a road side tourist spot where many pictures are taken!
TheWater.jpg (180.29 KB) - as a diver I really appreciate blue water! This is an amazingly clear shot of the beautiful blue waters off the rocky coast. Sunset.JPG (17.82 KB) - this was taken two days ago - from my deck... Sunset, June 22..... this made all of the travel nightmares seem worth it!

Speaking of those... I'm not in the mood to write all of the details... now that I just looked at the picture again it just doesn't seem to matter :). But - I'll just tell you the route I ended up taking... SEA to IAD to LHR to CDG to DBV and - just to add to that - my bags took another route.... They got here Monday.

So - when all is said and done business travel isn't always that much fun. I met some great people along the way (a young couple going to Paris for the first time on the LHR-CDG flight) and a gentleman from Istanbul who's working on his pHD in CA... and some very helpful United counter agents (yes, I spent 5 hours at United counters on this trip and yes they were in fact, pleasant and helpful!). Anyway, not sure how I kept a smile on my face on this trip and I was shattered when I got here BUT - it is a truly delightful destination. In fact, I ran into a couple of locals Tuesday night (people I had met last year) and we're planning to go out after tonight's dinner party.... Pretty amazing that I just ran into them on the main strada in the Old Town.....

OK - so need to run but wanted to tell you that the party invites are already coming in for Amsterdam. If you're going to be there make sure you go to the SQL Server 2005 party on Wednesday, June 30th. The party runs from 8 to 1am and you need to get a ticket from the ATE booth on Tuesday or Wednesday. In fact, make sure to bother Darren Green (the infamous www.SQLDTS.com guy - and a good friend of mine :) at the ATE. He's awesome!

Speaking of people - can't wait for Tech Ed Amsterdam.... tons of fun people all converging on a great city - for a great week.......

To get more details about the party check out Matt Nunn's blog... (it's in my blogroll)

Categories:

The new Seattle Public Library opened on May 23, 2004 and it's an interesting structure that people seem to love or hate. I've always loved architecture (and I really wanted to be an architect when I was growing up :)) so more than anything I just wanted to learn more about the structure. I didn't expect to hear all about it last night; I thought we were just going to wander...but in fact, last night's event was great! A Principal from the LMN Architects firm discussed the ideas and principals behind the library's design and I was SOLD! The concepts behind the design make a ton of sense. They created “platforms,” each with a function and each with room to grow. And that's what surprised me the most; they started talking about “fillfactor.” Ok, they didn't use that term but there it was....

One of the biggest problems facing libraries is their ever expanding collection of books... and as we know they don't just place books at the “end” of the shelves. They have to put the books into the proper section, etc. And - as sections EXPLODE (think of the size of the computer section in 1960 and think of it now) and others shrink they need to “shift” books to make room. When a library is broken into defined floors it makes it very challenging to expand/shrink. Instead, libraries tend to create “forwarding pointers” (no, they didn't use this word either)... These forward references get you to where the “overflow” books are - on another floor, in another building or in the basement. This becomes confusing and eventually a ton of movement needs to be done and/or the library just doesn't work any longer and needs to expand. 

So - when they were designing the SPL (Seattle Public Library) they created a “books spiral” which cleverly creates a rotating set of floors as if they were one really long floor just spiraling down (about the equivalent of seven city blocks)... OK, I'm doing a horrible job explaining it but the bottom line is that they currently house somewhere around 900,000 books and with this new spiral they can expand selections fairly easily and still have a uniform feel to the library (they expect to be about to hold roughly 1.45 million books). They also (and this is where the fillfactor component comes in) kept free space on the top shelves as well as the bottom shelves so that they had room to grow without a lot of restructuring.

There are so many other interesting design techniques used I've love to spend hours learning more... In fact what they did with the glass to prevent heat is so simple yet effective (the building's entire exterior “net” is made of glass). They inserted thin slices of metal at a parallel to the ground (and so that you could still see through the glass) but at an angle to the glass (because the glass pieces are angled - you'll have to look at the pictures to really see what I mean) and what happens is that the sun reflects off the metal and the library doesn't get overly heated and they still allow in light without having to go with UV film, etc.

Check out the Seattle Times for their slide gallery here and the SPL's slide gallery here. If you're in Seattle you must make a stop at the SPL. There's a lot of info about the library on the web but the best place for details and lots of photos is the main site for SPL, specifically here. Here are a few others of interest:

Seattle Public Library - Main Site
Seattle Public Library: Design is fun on a grand scale
Projects of Rem Koolhaas

Here are a few pics I took:

NeonEscalator.JPG (24.16 KB) - the colors are used to help guide people where to go... I found some of the colors a bit overwhelming (especially the neon green elevators - neon green on the inside)
ReadingArea.JPG (30.67 KB) - lots of defined spaces for reading, relaxing and having a latte (yes, there are coffee stands within the library... of course!)
ReadingAreaTopFloor.JPG (35.13 KB) - this was my favorite area. Probably the equivalent of 12 stories high... you could see Eliott Bay and the colors were just stunning!

Enjoy.

Categories:

Well... the performance ramifications are... not good!

Without seeing more of the proc I'd have to suggest that you avoid this like the plague! I think I can convince you with this simple example which shows why it's better to have multiple procedures and limit the flexibility a bit OR use DSE = dynamic string execution (and unfortunately even recompilation issues don't help - which I talked about here)...

And - if we go down the path of DSE I'm sure that will start a separate thread as far as "sql injection" issues/attacks and security (the user executing the procedure will need permission to directly execute the statement which is in the DSE string). But - there are a few ways to protect parameters in a stored procedure. In this procedure, the parameters will not allow “injection” because of the addition of the QUOTENAME function. This function is INVALUABLE for protecting isolated input values (which is another trick: isolate parameters if you can).

In the following “multi-purpose” procedure it becomes obvious (after testing) that ONLY DSE solves the performance problem. I could still argue that more than one procedure could be beneficial especially as it wouldn't have the access requirements that this one has (remember - users would have to have permissions to the base table). BUT that's a lot more work to maintain/call, etc. Regardless, one procedure with “all purpose parameters” is definitely NOT a good choice (without DSE). (And fyi - a future release might help with some of the permissions issues - that's all I'll say for now! :)

OK - so see the ProcedureParameters.sql (4.3 KB) procedure to play with this within the CreditSampleDB.zip (55.79 KB). You will need to modify some data to get rows to return to these queries; use the updates in the .sql script.

CREATE PROC dbo.GetMemberInfoParam
(
 @Lastname varchar(30) = NULL,
 @Firstname varchar(30) = NULL,
 @member_no int = NULL
)
AS
SET NOCOUNT ON

SELECT m.*
FROM dbo.member AS m
WHERE (m.lastname LIKE @lastname OR @lastname IS NULL)
 AND (m.member_no = @member_no OR @member_no IS NULL)
 AND (m.firstname LIKE @firstname OR @firstname IS NULL)
go

This next procedure uses dynamic string execution to build the exact and needed string. You could have built this from the client as well... and that would introduce other sql-injection issues as well. But - this is a simpe idea and I hope this helps.

CREATE PROC dbo.GetMemberInfoParamDSE
(
 @Lastname varchar(30) = NULL,
 @Firstname varchar(30) = NULL,
 @member_no int = NULL
)
AS
SET NOCOUNT ON

IF @LastName IS NULL AND @FirstName IS NULL AND @Member_no IS NULL
 RAISERROR ('You must supply at least one parameter.', 16, -1)

DECLARE @ExecStr varchar(1000)
  , @MemberNoStr varchar(100)

SELECT @ExecStr = 'SELECT m.* FROM dbo.member AS m WHERE '

IF @LastName IS NOT NULL
 SELECT @Lastname = 'm.lastname LIKE ' + QUOTENAME(@lastname, '''')
IF @FirstName IS NOT NULL
 SELECT @Firstname = 'm.firstname LIKE ' + QUOTENAME(@firstname, '''')
IF @Member_no IS NOT NULL
 SELECT @MemberNoStr = 'm.member_no = ' + convert(varchar(5), @member_no)

SELECT @ExecStr = @ExecStr + ISNULL(@LastName, ' ')
 +
 CASE
  WHEN @LastName IS NOT NULL AND @FirstName IS NOT NULL
   THEN ' AND '
  ELSE ' '
 END
 +
 ISNULL(@FirstName, ' ')
 +
 CASE
  WHEN (@LastName IS NOT NULL OR @FirstName IS NOT NULL)
     AND @MemberNoStr IS NOT NULL
   THEN ' AND '
  ELSE ' '
 END
 +
 ISNULL(@MemberNoStr, ' ')

EXEC(@ExecStr)
go

The first procedure generates a plan based on the MOST selective criteria (there's more info in the script and you'll see this in the showplan/statistics io output that I describe - also in the script). The second procedure generates a good plan for each and every execution (as expected). I think this is a good example of what you can do!

I'm going to put together one more entry to bring all of these points together as well. Stay tuned!

Have fun!

OK - I've requested that MSDN create a new “support webcast” page that has all of these links added to it (for a more complete archive/reference item) but it's not yet been completed.

For completeness - here are ALL of the resources:

So - I think that should do it. I hope you enjoy these resources. And - if you have comments, please post them to the specific blog entry for that subject area.

Have fun and I hope you'll be there in July and I hope to meet you in October!

Categories:
Events | Indexes | Resources

I just got off the phone with the event site... and it's confirmed! We will add wireless internet access in the meeting room, along with extended labs hours - so you can stay connected as well as keep up with the interesting content! On Monday we will distribute CDs with the course demo database, all lab scripts and all lab materials. Official lecture will run from 9-5 but we will stay in the class with open lab until 7pm.

General details are on www.SQLskills.com under Services, SQL Immersion Events! I'll be posting these changes shortly.

NOTE: If three or more employees are attending from the same company - special corporate discounts will apply. Additionally, if you're a Microsoft Regional Director, Microsoft MVP or MCS Consultant who would like to attend (possibly with customers), please email me for special Microsoft-related and group-related discounts.

Categories:
Events

Index Management/Maintenance Questions:

Q: Is there a tool or method for monitoring a database for awhile to identify indexes that are rarely or never used – that should be considered for removing?

I think I mentioned this one late in the presentation but it certainly warrants an answer. Unfortunately, the answer is both yes and no. To start with the “no’s” – there’s nothing inside of SQL Server [today] that keeps track of index usage. One option is to create a trace and run it through ITW. ITW has an option on the first dialog that asks if you want to “Keep all existing indexes.” If you deselect this it will give recommendations to drop indexes that are not used by that workload. Be careful! You might drop an index that someone else needs. In SQL Server 2005, the ITW is being replaced by DTA (the Database Tuning Advisor) and it has a "drop-only" mode that allows DTA to only check for the unused indexes.

 

But – this still doesn’t really give you a list of what gets used, what doesn’t. I can certainly think of a way that will be expensive (in terms of profiler cost and your time J) to implement and given a bit of time this might be somewhat useful but not sure?! Here’s my idea, you could capture showplan information into a profiler trace, load it into a sql table and then look for your indexes by name (make sure to use the fully qualified name such as [Credit].[dbo].[member2].[member2Cl] as index names only need to be unique per table). You could even do an outer join with sysindexes to get the complete list of indexes related to the workload. It won’t be pretty but it can certainly be done and it’s still ONLY for the time of the workload. Overall, I think this is too expensive to do. Good news coming though… In SQL Server 2005 there will be a virtual system table that will keep track of index usage since the last server restart or since you reset the values.

 

Q: What is the best way to analyze both the space used and fragmentation of each table? More specifically, is there one procedure that will check all tables in a database, output the results and show me which tables/indexes are foobar?

I don’t think any particular command actually outputs the level of “foobar-ness” but… this one will get you very close. J Using DBCC SHOWCONTIG WITH TABLERESULTS returns almost everything you’re looking for. Use the “pages” column to calculate space ((convert(decimal(10,3), pages) * 8K)/1024 = MB), and then the AveragePageDensity (for internal fragmentation) and then ScanDensity, LogicalFragmentation and ExtentFragmentation (for different forms of external fragmentation). One of the best parts of having returned a tabular set is that you can put the data into a table, verify if/change it and even walk it to analyze it as well as run certain operations as a result of the information you learn. In fact, this is what my sp_RebuildIndexes script does. And check out DBCC ShowContig with TableResults.sql (another demo script) for the definition of the table. Both of these scripts can be found with the demo scripts: MSDNWebcast-Tripp-20040611-SCRIPTS.zip (73.19 KB). Additionally, you can find the scripts on www.SQLskills.com under Event resources.

 

Q: Does SHOWCONTIG keep a table lock? and I was told not to use DBCC SHOWCONTIG on a production server as it can adversely affect performance… Is that true?

First, yes, there is a shared table level lock acquired for the length of the scan. If you run DBCC SHOWCONTIG with the default options that scan will not only scan for external fragmentation but also calculate internal fragmentation. To reduce the time for which these locks are held you can use the WITH FAST option on DBCC SHOWCONTIG. This will ONLY do the scan (returning less fragmentation details) but it's good if all your looking for is Scan Density! So – this can create blocking and you should consider running it off hours OR consider running it on another server where you restore a backup of your production database. All fragmentation that exists when the database is backed up - will be restored since the restore does not physically change any data/pages, etc.

 

Q: Could you tell us how FILLFACTOR affects performance?

The short answer is that it helps IMPROVE performance by leaving free space on the pages when the index is built, rebuilt or defragged. When building or rebuilding the index you can specify (using FILLFACTOR) the percent fullness for each page. By leaving free space you can minimize splits, reduce fragmentation and therefore minimize the need to defrag as frequently. So – you should set the fillfactor based on the frequency with which you can rebuild… Combining the frequency with which you can rebuild and the rate at which your table becomes fragmented between rebuilds you can start to find the most optimal setting. The more fragmented it becomes then you should lower the fillfactor (on the next rebuild)... and if the table doesn’t become overly fragmented (and there’s still free space by the time of the next rebuild) then you might want to increase the fillfactor... so over time you can find the most optimal setting... but it does take a bit of work! Start with the “monster” tables first – the ones that give you the most problems and the ones that are the largest! With a bit of time you’ll have a more consistently balanced system.

 

Q: On a very large table (32 GB, 20 mil rows) with a clustered index and 10 non-clustered indexes – where would you start? If the length of time it takes to execute is long and you can only reindex or defrag a few indexes per day, which ones would you do and in what order?

Well, this is a great question! But – I’d like to change the focus a bit; if you focus on minimizing fragmentation then that will in turn keep things more optimal. Minimizing fragmentation in the clustered index is done by creating the recommended type of clustering key (unique, narrow, static and ever-increasing). If you meet all of these then the base table should only be getting fragmented due to updates to varchar. If you have lots of these then setting a fillfactor can help. The fillfactor defines the amount the pages should be filled to when the index is rebuilt or defraged. If your index is becoming very fragmented between builds then you should consider increasing this frequency or lowering the fillfactor. So – this is really what you should look into: verifying that you’re setting the fillfactor appropriately between rebuilds/reindexes and then adjusting that schedule appropriately!

 

But – when you still have a lot of fragmentation and still need to do frequent rebuilds – focus on the most important tables first, then the most important indexes first (i.e. start with the clustered). From there it’s hard to say but the wider indexes are probably getting used more so I’d look at those next.

 

And – one other tip – if some of you are getting a lot of fragmentation due to updates, think about: fixed length fields (which is rarely an option), default values (GREAT choice if the updates come in with no value and later get updated) and rebuilds either more frequently or with a more appropriate fillfactor.

Categories:
Events | Indexes | Resources | Tips

Index Creation Questions:

Q: In an OLTP server do clustered indexes create a negative performance impact to the constant inserts and updates?

The wrong clustered index can for sure… In most environments, having the right clustered index (an index on an ever-increasing key) will be better than a heap (a table without a clustered index) and much better than the wrong clustered index (a clustered index that creates wildly random access patterns for insert – i.e. LastName). So – the idea with the clustered index debate (as I often refer to this as J) is that the right clustered index balances an improvement in performance for inserts (by minimizing splits as well as having the needed pages already in cache – and by needing less cache overall) with keeping the table more compact (i.e. less fragmentation). However, if you have a high volume of inserts (typically more than 400 per second) then you might want to add additional hot spots (cluster on a composite key like “state, id” which creates as many hot spots as you deal with states) to better balance this increased volume. I still wouldn’t go with something random (like name) and if the table has a lot of scans/reads I certainly wouldn’t go with a heap. You do have to be careful with updates though – if there’s a high volume of updates and the updates are to varchar then there are a few things I’d think about: fixed length fields (which is rarely an option), default values (GREAT choice if the updates come in with no value and later get updated) and rebuilds with a more appropriate fillfactor.

 

Q: Is a unique key the same as a primary key?

Almost… SQL Server has two ways of enforcing entity integrity: the Primary Key (of which there can only be one per table) and Unique Key(s) (of which there can be many – 249 is the maximum number of non-clustered indexes so unique keys and other non-clustered indexes combined can be no more than 249).

 

Q: I thought you said the primary key constraint automatically gets a clustered index, how can you make it non-clustered key?

When specifying index-based constraints (Primary Key and Unique Key) you can specify index type. The default index type (when not specified) is Clustered for the Primary Key and non-clustered for a unique key. Here’s complete syntax for adding these keys to existing tables.

 

ALTER TABLE dbo.Employee
  ADD CONSTRAINT EmployeePK
    PRIMARY KEY CLUSTERED (EmployeeID)

 

ALTER TABLE dbo.Employee
  ADD CONSTRAINT EmployeeSSNUK
    UNIQUE NONCLUSTERED (SSN)

 

In either case you can CHANGE the underlined part of the syntax to change the index type.

 

Q: How is a unique key created? Explicitly or is it Implied via Indexes?

Well, I have to admit I’m not entirely sure of the question here but I think I do… When you create a unique key constraint SQL Server enforces that uniqueness by automatically adding a unique non-clustered index to the table.

 

Q: Are GUIDs as efficient as identity for the clustering key?

Not really. They do meet the criteria of being static and unique however, they’re not as narrow (they’re 16 bytes) and if the newid function is used to populate the values (which is the common approach) then the values are not ever-increasing. With the values populated by the newid function you can end up with a lot of fragmentation quickly. There are alternatives to the newid function for generating the GUID value as an incrementing value through a windows call. Gert Drapers has an extended procedure to help you do this on www.SQLDev.net. I’m not sure of the specific link but I will update this here when I do!

 

Q: When you say to avoid a “volatile” clustering key, do you mean the data value actually changing/being updated?

Yes! Since the clustering key is in EVERY nonclustered index you want to choose it wisely and make sure that it’s relatively static. If it changes that will require all of the redundant versions of the value to change as well…

 

Q: If you create an identity column in a table just for the sake of creating a clustered index on it, how does this improve performance (especially if most of the queries use other fields in the table)?

Another great question! Remember two things: the internals of SQL Server rely heavily on the clustering key (even if you don’t) and if the table stays more compact (by not becoming fragmented) then queries [especially large scan queries] improve. So – it’s a better balance. Now to add a bit more – in many cases having the right nonclustered index can yield better performance for most queries anyway!

 

Q: You said in the power point slides that generally a few wide indexes are better than several narrow; However, later you said that one narrow clustered index on an id column + non clustered indexes is the way to go. So which is it? Or did I not really understand?

You’ve got it – it’s a combination of the two. The clustered should be narrow/static/unique but non-clustered indexes tend to be more effective when they are slightly wider (i.e. they contain a couple to a few columns rather than just one) so that they can service more types of query requests.

 

Q: Do I need to change my indexes if I use the like operator? I noticed that I get Index Scans instead of seeks when I use LastName like '%johnson%' v. LastName = 'johnson'?

Well, this one is kind of mixed. But – let me give you something to think about (which might help)… Think about the phone book for a second – if you have someone’s last name it’s pretty easy to find them. However, if you only know they have “johnson” somewhere in their name it’s a lot harder, right. This could yield ‘Cajohnson’, ‘Ejohnson’, ‘Smith-Johnson’ and so forth. So – there’s really no effective way to get at that data… SQL Server will always need to scan for that type of request (now I could add that covering indexes might help but quite honestly I think there’s a better option here). SO - changing index strategy really won’t help. But – I often see applications that just automatically add the % before and after the requested data. If this is the case, can you ask the user if it’s a real value or a pattern value? Or can you even allow them to supply the exact request? When the value is specific (i.e. Johnson) then the query will be fast, when the value is pattern matching (i.e. %johnson%) then SQL Server will need to scan and the query won’t be as fast – but it will be correct! If you want to search searching for more wildcards and more complex pattern matching into descriptions, etc. then you’ll probably want to look into Full-text Indexing.

Categories:
Events | Indexes | Resources | Tips

General Questions:

Q: If you want to know the value of the key prior to your insert statement, how can you use identity?

You could insert a “place-holder” row (i.e. a row that uses only defaults and/or just basic information so that you can get the @@identity of the row) and then come back later to update the data values. However, be very cautious with this approach. If you insert a lot of NULL values and have a very narrow row inserted then when you come back to update it (and therefore widen the variable width columns, if any) you can end up creating a lot of fragmentation. To minimize this, make sure to use default values (i.e. DEFAULT constraints) to pre-allocate space and reduce fragmentation.

 

Q: Can you give me a brief definition of a doubly-linked list?

Speaking directly to the “leaf-level” of an index the doubly linked list refers to the pages that contain the index data – in indexed order. Since there’s likely to be more than one row per page and since an index implies order; the pages are “linked” to provide that order logically.

The NIST (National Institute of Standards) also has a series of “data structures” definitions and doubly-linked list is here.

 

And for a bit more about SQL internals – SQL Server 2000 (actually 7.0 and higher) uses 8K pages. Each page has a 96 byte header – [a very small] part of which is used to store this previous-page and next-page page pointers. As for how many rows SQL Server will store; that depends on the width of the row. You can divide 8096 by your average row size to get an idea of how many rows you will be storing and you can use DBCC SHOWCONTIG (‘tablename’) WITH TABLERESULTS to see information columns labeled MinimumRecordSize, MaximumRecordSize, and AverageRecordSize for more specific table-related values. There is a maximum limit of 8060 bytes for a single inserted row.

 

Q: Well, this is a live meeting question and I should have reminded you all! But - for future reference… How do I see the demo in full-screen mode?

Press Control + H when the webcast begins a demo.

 

Q: Could you please give an example of a foreign key constraint?

A foreign key constraint is used to enforce referential integrity between two columns of the same or different tables. For example, you might create two tables: Employees and Departments. Because each employee can only be in one department, you can enforce this relationship with keys. First, make the DepartmentID (of the Departments table) a Primary Key and then make the DepartmentID column of the Employee Table reference Departments. Here’s a small snippet of pseudo code.

 

There are lots of good references out there on database design and constraints and I’ve heard good things about Database Design for Mere Mortals by Mike Hernandez. And, here’s an article on MSDN titled: Implementing Referential Integrity and Cascading Actions by Itzik Ben-Gan.

 

Q: Why does SQL Server allocate up the all of the memory? Even if I stop the service in short time it’s the same?
Since SQL Server is typically run as a dedicated service in production environments, SQL Server typically allocates memory to have it available when it needs the memory. If it detects memory pressure SQL Server will release those pages. If you’d like to configure memory for a specific max server memory and/or min server memory, you can check out the books online as well as this MSDN article titled: Inside SQL Server 2000's Memory Management Facilities by Ken Henderson.

Categories:
Events | Indexes | Resources

Tools Questions:

Q: Isn't there some issue with profiler where it does not recognize DBName but only the DBID?

Well, this is a good one – and a frustrating one as well. It is true that Profiler doesn’t always generate a data column value for every type of event. So – if you setup filters you will only filter rows which have a value for that data column. If you ONLY want to see things with a very specific value it can take quite a bit of time to remove all of the uninteresting events to narrow it down to only what you need. In the SQL Server Books Online there is a topic for each category – for example: Stored Procedures Event Category has a listing of all of the data columns that are produced for each event. This is a bit tedious to go through though and there’s a nice full html version that’s available on Gert Draper’s website: www.SQLDev.net. Checkout this link specifically: http://sqldev.net/misc/SQLTraceEventMatrix.htm

 

Q: Is it true that ITW is beneficial because it can predict how the optimizer will benefit most from the indexes it recommends? Sometimes I find that the ITW recommends statistics that hinder the query performance. Do you feel comfortable using it for the most part?

Ok, so there are really two questions here – first “how does ITW work?” and second “is it always a good thing?”

 

As for how it works – yes, you’re correct… ITW hypothesizes about indexes by creating statistics about your data, looking at the queries in the workload and then based on how the optimizer would optimize – it tells you which indexes would be best serve your data!

 

As for whether or not it’s always a good thing… the first thing I’ll jokingly say is – Is there ever an ALWAYS good, good thing? J Joking aside, there is some truth to this (although rare). I do feel comfortable using ITW and because the statistics are based on real data, not updated if they’re not used (there’s a good whitepaper that explains statistics here), so in general, statistics don't usually hinder query (or server) performance. However, having an extremely large number of statistics and indexes can increase optimization time (in analyzing all of these choices) and this may in fact be something you’re seeing.

 

Q: When statistics exists on some of my client tables, SQL Server sometimes chooses to use an index that is less beneficial (even after we update stats with full scan). In some cases, we've had to force the index using optimizer hints. As a result, we disable "AUTO CREATE" and only create indexes when performance issues arise and are needed. Granted this isn't the best method but until the tables become really large we find that some queries work fine.

Ok, well there’s a lot to this one. First, I’d say that there’s a difference between just having statistics and having the right indexes. There are some cases where ONLY having statistics isn’t good enough to give SQL Server enough information to proceed and instead they make incorrect decisions. It’s generally when SQL Server only has statistics for individual columns and doesn’t have better information for a combination of columns (wow, this is hard to describe in a short Q&A). But – I’ll sum it up to say that statistics aren’t always perfect – some assumptions may need to be made and sometimes those assumptions don’t work out because the data isn’t evenly distributed… So, if the indexes that exist are not "perfect" for the query SQL Server ends up not doing as good of a job as it might if it had the right indexes... If it helps, I have a class where I get into more details on statistics (a whole module on just statistics J) that explains a lot more details... It's hard to sum up in a sentence or two BUT I would guess that a bit more tuning (i.e. more real indexes and not just statistics) might help solve the problem... Really, hard to say in a quick chat but that really shouldn't be the majority of the time.

Categories:
Events | Indexes | Resources | Tips

Theme design by Nukeation based on Jelle Druyts