Well, the person then asked if we could sitdown… I said sure…


We sat down at a table and they proceeded to rummage through their bag, pull out a bunch of paper (which turned out to be a single, very, large piece of plotter paper which was folded up) and proceeded to unfold it and smooth it out on the table…


It was their ERD – yes, all 500 tables (or whatever)…


They smoothed it all out (I was getting nervous as to the length of the still pending “question” :) and then here it came…..


How would you index this?


Literally, that was it. Here’s my ERD – all 500 tables… what indexes do I need…….


How do I even begin? Database design (and indexing) are NOT something with an easy and single “right” answer. In fact, one of the reasons I love working with SQL Server is that there’s almost always multiple ways to handle a problem – each having different pros and cons. So – understanding more than just the data is ciritical. For example, I could have noticed a customer table in their ERD and I could have said that an index to aid in customer lookups – by name - would be a good idea. However, if they responded with “well, we don’t really use the customer name – this is our demographics analysis database,” then my recommendation would have been not only useless but a negative in terms of overhead, maintenance and therefore performance.


So – what’s my point – there are hundreds of things that lead up to good design, optimal indexing, and effective procedural code… a design that gets better cache utilization and one that keeps the data pages compact, logically ordered or grouped, minimizes maintenance and has efficient lookups/modifications, etc… There isn’t a single right answer in design…. and I think that’s good! It means that you can design for the usage patterns you anticipate (within reason).


I promise – if there were a sqlserver /faster switch or a single silver bullet – I would tell you (well, I’d have you hire me as a consultant, surf the web and IM with friends for a week, and then set the switch on Friday :)….


Seriously, how does one design for performance?


You need to know the data and more important know how it’s going to be used (which I realize might NOT be perfect – in fact, I’ve sometimes been surprised by usages patterns once something goes into production so you might not always be right). However, the more you know about usage patterns the better you will be able to design for them and proactively reduce the possibly negative impact of them (for example data modifications can cause fragmentation…which not only creates potentially excessive freespace but also makes the modification take longer to process – and well, there’s so much more to this one…).


Having said that, over my next few blog entries I’ll give you a bunch of ideas and best practices for which to look…