This post is an example of the "bring your own problem" approach in the classroom, inspired by upcoming SQLskills Immersion Events in August. See previous posting for the reason behind the blog post title.
Figured that I couldn't wrap up this series without mentioning T-SQL, as most of the "bring your own problem"-type problems have to do with SQL, as you might guess. Rather than go after specific problems that have shown up in teaching/consulting gigs, I thought I'd mention a class of problems related to what you might call "generalizing esoteric optimization techniques".
The SQL Server query optimizer does a pretty darn good job of coming up with a reasonable query plan most of the time, and of adapting as the distribution of data amongst tables and within a table changes. In fact, you could cite that dynamic adaptation as one of its best features. That said, there are times when it doesn't come up with the very best plan. My favorite example is "Reason For Early Termination" = "Good enough plan found" in a query plan. And we've all seen parameter sniffing behavior, where the "sniffer" in the optimizer sniffed a non-representative value and came up with a plan that doesn't work well with any other set of values.
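Parameter sniffing is easy to reproduce. Here's a minimal sketch; the Sales.Orders table and the skew in its CustomerID column are hypothetical, just to illustrate the shape of the problem:

```sql
-- Hypothetical table: Sales.Orders, where CustomerID is heavily skewed
-- (a few customers own most of the rows).
CREATE PROCEDURE dbo.GetOrdersForCustomer
    @CustomerID int
AS
BEGIN
    -- The @CustomerID value from the first execution is "sniffed" and the
    -- resulting plan is cached. A plan built for a rare value (seek plus
    -- lookups) can be terrible for a common value, and vice versa.
    SELECT OrderID, OrderDate, TotalDue
    FROM Sales.Orders
    WHERE CustomerID = @CustomerID;

    -- One hedge: OPTION (RECOMPILE) builds a fresh plan per execution,
    -- trading compile-time CPU for a distribution-appropriate plan:
    -- SELECT ... WHERE CustomerID = @CustomerID OPTION (RECOMPILE);
END;
```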
Folks who have been consultants or teachers for long enough are used to discovering the subtle anomalies in optimization and coming up with neat workarounds for specific cases. These usually take the form of query syntax rewrites or query hints. The results, based upon a known set of static test data, can often be dramatic. However, once you've seen the dramatic results (possibly demonstrated somewhere, but especially when you think of it yourself), there's a natural inclination to apply the neat trick to EVERY case that even looks vaguely related. But because the distribution of the data may not be the same as the original, or may change over time, it won't really be an optimization in every situation. Sometimes with disastrous consequences. That's when, as a consultant, you might get to consult on it again.
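For instance, a hard-coded hint is really a point-in-time decision. This sketch (again with a hypothetical Sales.Orders table and Status column) shows how a hint that was a win at tuning time can turn into a liability:

```sql
-- Hypothetical: at tuning time, the @Status value matched ~0.1% of rows,
-- so forcing a seek beat the scan the optimizer chose.
SELECT OrderID, OrderDate
FROM Sales.Orders WITH (FORCESEEK)
WHERE Status = @Status;

-- If that status later grows to 40% of the table, the forced seek (plus
-- its key lookups) can be far slower than the scan the optimizer would
-- now pick on its own. The hint froze a decision the optimizer normally
-- re-evaluates as statistics change.
```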
Another situation is relying on extreme parallelization, which may work well in a controlled single-user situation. And parallelism in general ISN'T esoteric. But in an OLTP system, that plethora of processors may not always be available, or may never be. Someone in a class recently showed me two SQL stored procedures that logically should have been close to equivalent. One ran over 1000 times faster than the other (that's wall-clock time). The faster one broke the problem into a series of queries, each of which could be parallelized; the other used a single query with a construct that couldn't be parallelized. The big difference was that the parallelizable process used 48 processors (all the processors on the box) in each step. That's a situation that's unlikely in an OLTP system. If it's the only sproc running on the box, and you run one of them at a time, this may be fine. Maybe it would still be better if there were only 2 processors available. You can play with MAXDOP in a test situation to try this out.
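A per-query MAXDOP hint is an easy way to run that experiment without reconfiguring the server. A sketch, using hypothetical Sales.Orders and Sales.Customers tables:

```sql
-- Simulate a busier, more constrained OLTP box in a test environment by
-- capping the degree of parallelism for this query only:
SELECT c.CustomerID, SUM(o.TotalDue) AS TotalSpend
FROM Sales.Orders AS o
JOIN Sales.Customers AS c ON c.CustomerID = o.CustomerID
GROUP BY c.CustomerID
OPTION (MAXDOP 2);   -- try 1, 2, 4, ... and compare elapsed times
```

OPTION (MAXDOP 1) forces a serial plan, so the gap between the MAXDOP 1 timing and the all-processors timing tells you how much of that "1000 times faster" depends on having idle processors to burn.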
I'm not saying that there aren't some general rules or optimization opportunities that almost always work. Or that you shouldn't try to squeeze every last bit of performance out of your queries, especially the one that's running 1000 times per second. But if you do use a neat trick, especially one that seems counterintuitive, document it in your stored procedure, along with any data distributions that could make a difference. And prepare to go back and revisit these queries every once in a while, to ensure that your optimization still does what it did when you chose it. Especially with hard-coded query hints.
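That documentation can live right next to the hint. A sketch of what I mean (procedure, table, and the numbers in the comment are all hypothetical):

```sql
CREATE PROCEDURE dbo.GetRecentPendingOrders
AS
BEGIN
    /* PERF NOTE, documented at tuning time:
       OPTION (RECOMPILE) added because Status is highly skewed
       ('Pending' was ~0.5% of rows when this was tuned) and cached
       plans built for the common statuses performed badly here.
       REVISIT if the Status distribution changes materially. */
    SELECT OrderID, OrderDate
    FROM Sales.Orders
    WHERE Status = N'Pending'
      AND OrderDate >= DATEADD(DAY, -7, SYSDATETIME())
    OPTION (RECOMPILE);
END;
```

The comment gives the next person (possibly you) both the reason for the hint and the condition under which it should be reconsidered.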
Does everybody get that?