This evening I was reading an article in The Washington Post called Twelve Things The World Should Toss Out and it inspired me to start a new blog chain about the top-5 things in SQL Server we all wish would just be removed from the product once and for all.
Here are mine in order of evil-ness:
1. Auto-shrink. Grrr. See Why you should not shrink your data files for why shrink is bad. Auto-shrink is even worse. I tried to have it removed during SQL 2005 and SQL 2008 development, but to no avail. It needed to stay for backwards compatibility…
2. The FULL recovery model being the default. I would guess that this is responsible for the majority of cases of transaction logs growing out of control. See Importance of proper transaction log size management.
3. RECOVERY being the default RESTORE option. The default should be NORECOVERY, to stop people screwing up their restore sequences (there's a quick sketch of the right pattern after this list). I hope you can hear the exasperation from wherever you are!
4. GUIDs being allowed as a clustered index key. I shouldn't even have to explain this one. SQL Server should warn you that a PK is a clustered index by default, and that GUID clustered PKs just suck.
5. The default data and log file growth rates of 1 MB and 10% respectively. Seriously, why did they change the default data file auto-growth to 1 MB? However, the problem is: what should the defaults be? Answers on a postcard please…
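To make #3 concrete, here's the restore pattern I wish the syntax nudged people towards: NORECOVERY on every step until you're absolutely sure there's nothing left to restore. This is a minimal sketch; the database name and backup file paths are made up for illustration.

-- Restore the full backup, leaving the database in the RESTORING state
RESTORE DATABASE SalesDB
    FROM DISK = N'D:\Backups\SalesDB_Full.bak'
    WITH NORECOVERY, REPLACE;

-- Restore each log backup in sequence, still WITH NORECOVERY
RESTORE LOG SalesDB
    FROM DISK = N'D:\Backups\SalesDB_Log1.trn'
    WITH NORECOVERY;

RESTORE LOG SalesDB
    FROM DISK = N'D:\Backups\SalesDB_Log2.trn'
    WITH NORECOVERY;

-- Only when you're certain nothing else needs restoring, recover explicitly
RESTORE DATABASE SalesDB WITH RECOVERY;

RECOVERY should be something you have to ask for explicitly, not something you get by accident.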
These are my top-5. I hereby tag 5 of my good friends (in no particular order):
Please link back here so I see when you blog.
Thanks!
14 thoughts on “What 5 things should SQL Server get rid of?”
Paul, I'm curious how the discussions about removing auto-shrink tended to go. Did some people actually think it was a good feature, or were people just concerned that too many people were using it in production (no matter how misguided) for it to be removed without an entire migration path? This seems like one of those classic millstone features that you'd really like to get rid of because it was a bad idea in hindsight, but that has gotten too big and prominent to just axe, so simply telling people "don't do that" is the cheaper option.
I agree with all except #4, probably because I have a hard time with losing the ability to do something even if I'd never do it. Besides, sometimes you want to create an example fragmented table (for articles, learning, etc.) and putting a clustered index on a GUID is one way to do that. Regarding #5, perhaps there shouldn't be a default, so you'd have to think about what value to use. However, I suppose this would create a perception that SQL Server is harder to use, and marketing isn't going to like that. Same with auto-shrink (#1): there's probably a good chunk of the user base that sees it as a valuable feature even if it's evil. For #2, if SIMPLE became the default recovery model, I doubt many users would even notice. Finally, #3 is a great idea, although this is another case where I wonder whether there should be a default at all; maybe you should be required to explicitly state RECOVERY or NORECOVERY.
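For anyone who wants to build that kind of deliberately fragmented demo table, here's a minimal sketch (the table and column names are just placeholders): random NEWID() values as the clustered key splinter the index almost immediately.

-- Demo only: a random GUID as the clustered PK fragments very quickly
CREATE TABLE dbo.FragDemo (
    Id      UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
    Payload CHAR(400)        NOT NULL DEFAULT 'x',
    CONSTRAINT PK_FragDemo PRIMARY KEY CLUSTERED (Id)
);
GO

INSERT INTO dbo.FragDemo DEFAULT VALUES;
GO 10000   -- batch-repeat shorthand in SSMS/sqlcmd

-- See how badly it fragmented
SELECT index_level, avg_fragmentation_in_percent, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.FragDemo'), NULL, NULL, N'LIMITED');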
In my opinion, #4 should be the first thing implemented. If you really want to create fragmentation for an "instructional" database, there are other ways to do it. A previous employer bought a small software company mainly for its government contracts/contacts. Since they did not want to just fire everyone in the purchased company, management decided the "new" devs should take the lead on a new application being considered. Said application turned out to be a dog performance-wise. Worse than anything I had ever seen. EVER! Believe it or not, not only did they use GUIDs for PKs, but rather than update records, they did delete/inserts. Yes, you read that correctly. We ("old" devs) were not part of the development effort and were only brought in after the fact to see if we could help speed up the app. Even when we told management what we found, and explained in excruciating detail why this was ludicrous, the "new" devs said that other apps they had written worked just fine with this kind of approach, so it must be something wrong with our hardware, and management bought it. That is why you can't just "count on" people not to do stupid things; you should PREVENT them from doing so.
Regarding #5, I suggested via Connect that the defaults always come from the model database. I doubt there is a perfect default, but this would give the DBA some control. I would also like to see setup prompt the installer for the model settings (recovery model, file growth, and initial size).
https://connect.microsoft.com/SQLServer/feedback/details/531593/change-default-behavior-for-filegrowth-argument-in-create-database-statement
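In the meantime you can get most of the way there yourself, because new databases pick up their recovery model, database options, and file size/growth settings from model. A minimal sketch (the logical file names modeldev and modellog are the usual defaults, and the sizes here are just example numbers; adjust to taste):

-- Give model sane settings so every new database inherits them
ALTER DATABASE model SET RECOVERY SIMPLE;
ALTER DATABASE model SET AUTO_SHRINK OFF;
ALTER DATABASE model MODIFY FILE (NAME = N'modeldev', SIZE = 256MB, FILEGROWTH = 256MB);
ALTER DATABASE model MODIFY FILE (NAME = N'modellog', SIZE = 128MB, FILEGROWTH = 128MB);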
#1
I think that Microsoft should get rid of the gap between the Office teams and the SQL teams. When Access 2000 ships, it should work with SQL 2000. When Access 2003 ships, it should work with SQL 2005. When SQL 2008 ships, it should work with Access 2007.
Get it? Products that actually work together, that would be a great place to start.
#2
SQL Server Analysis Services has gotten too ugly for most developers. SSAS 2000 was fantastic.
#3
SQL Server Reporting Services is not really capable of consuming OLAP data. SSRS + SSAS should be ten times easier and ten times more powerful.
I would have to say #4 should be the biggest! GUIDs should NEVER be usable as clustered index keys without jumping through a lot of hoops and passing an on-the-spot IQ test that proves you know what it means. There should be no way to do it through the GUI in SSMS.
How about just making all the SQL Server tools actually work with ADO.NET? Most of them are still hard-coded to MS ODBC and OLE DB. You can't actually import data into SQL Server from an ADO.NET provider; you have to go and build a complicated SSIS package that is about 100 times too hard to build.
The problem is that we keep putting bandaids on stab wounds. Instead of deprecating features, the features should be repaired to provide incredible value. Removing auto-shrink would do nothing more than cover up a symptom of the real problem in what could be a very useful tool. Instead of removing auto-shrink, the database shrink method should be rewritten to work correctly. What the hell… Peter Norton has been doing it right since the early versions of DOS.
Instead of doing a shrink as it’s currently done, here’s how it should work at a 120,000 foot level…
1. Prescan the structure of the file to determine what work needs to be done to shrink the file so that all pages for all tables will be left in a contiguous sequential order in the fewest number of steps.
2. Starting with the "first" out of order page in the file, move it to the first available blank page in the file. This could, of course, be a new single page/extent allocation at the "end" of the file.
3. Find the next correct page for the file and move it to the space opened up by step #2. This, of course, opens up a new blank page for repeats of step 2.
4. I don't know the exact algorithm that Norton used, but let's pretend that steps 2 and 3 are it… continue steps 2 and 3 until complete.
5. Leave space at the end of the file for additional growth according to a user-definable setting which, of course, should have a reasonable default based on the size of the data/index space already assigned.
When the run is complete, you'll have the same effect as what Norton does to a hard drive. Not only will the database be shrunk to the smallest possible size, but all table data will be sequentially contiguous and require comparatively little disk head movement to traverse the rows of a table, even if a table scan occurs. The logical side effect would be the equivalent of having rebuilt all clustered and non-clustered indexes without any table interleaving whatsoever. Even heaps would become contiguous according to their internal row numbering.
Some options that should be included are (with separate options for non-clustered indexes and clustered indexes/data)…
1. Compact all data and indexes to remove all page and extent splits for maximum compaction and disk space recovery.
2. The opposite setting of 1 would be "Do not compact page or extent splits." Just put them in sequential order. This would be a "self indicating" fill factor for tables and indexes that suffer a lot of inserts/modifications and would prevent a lot of page/extent splits in the near future.
3. Fill factor at the database level. This setting could have several options:
a. No fill factor. Rely on existing page/extent splits.
b. Use original fill factor as originally assigned by index.
c. Fill factor for static tables/indexes (those that have not been written to for xx days/months)
d. Fill factor for frequently inserted tables/indexes.
e. Fill factor for frequently modified tables/indexes.
f. "By table" override for all the above settings.
4. Have the option for all of these actions to run in a "crawler" mode during "quiet times" like most of the disk defragmentation programs have.
Like I said… let's stop trying to cover up stab wounds with bandaids. Let's fix the bloody underlying problem and turn things from a "fault" into an incredibly useful feature. If Peter Norton and the folks at DiskKeeper can do it, the boys in Redmond should be able to do it.
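For comparison, the closest thing available today is the clumsy two-step dance: shrink the file, then rebuild the indexes the shrink just scrambled, and accept that the rebuilds may grow the file straight back. A minimal sketch with placeholder file and table names:

-- Shrink the data file to a target size (in MB)
DBCC SHRINKFILE (N'MyDB_Data', 10240);

-- Then repair the fragmentation the shrink just caused, table by table
ALTER INDEX ALL ON dbo.SomeBigTable REBUILD;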
Hi Jeff – you're absolutely right. Problem is, I designed a comprehensive replacement for shrink in early 2001 (when I owned all of DBCC) that did everything you say above and more. There wasn't enough time and it wasn't a high enough priority, so it didn't happen. That was the window of opportunity for doing something about it, when someone who cared was in control. Now that that's not likely to happen, I'd like to see auto-shrink removed.
Knowing your background and seeing how much you care on your blogs and forum responses, I kind of figured that was the case and have a much better appreciation for why you want it removed. I agree… if they’re not going to fix it, at least take it out of the hands of babes. Thanks for the feedback, Paul.
I would like to make it impossible for SQL reserved words (and T-SQL extensions) to be used as the names of objects. As far as I am concerned, square brackets around object names just make it easier for bad coding practices to be used.
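For example, all of this is perfectly legal today; a throwaway illustration with deliberately awful names:

-- Reserved words as object and column names, made "legal" by the brackets
CREATE TABLE dbo.[Table] ([Select] INT, [From] NVARCHAR(50));
SELECT [Select], [From] FROM dbo.[Table];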
Having read your earlier articles and blog posts, I must say this about number 5 and your comment "what should the defaults be?":
It depends ;-)
Kind regards
Stefan
There are cases where GUID keys make sense, but they’re cases that a DBA won’t see, because performance is generally not an issue.
Metadata controlling a system is the most common. It's small, and using GUID keys means you can cleanly migrate it to production after you've tested it. The alternative is to handle the keys manually.
They can also make sense on an Express installation where individual users have their own local database (usually with no network access).
There are a few narrow cases, but in general having a random GUID as the cluster key just causes more problems than most people realize.
See my comment above: https://www.sqlskills.com/blogs/paul/what-5-things-should-sql-server-get-rid-of/#comment-16905