Do yourself a favor… Trust No One

Cue spooky music… I used to love the X-Files, where one of the catch-phrases was Trust No One.

Their other major catch-phrase was The Truth Is Out There.

Yes, it is – but you're likely going to come up against all kinds of obstacles to finding it, just like Agent Mulder did.

Now, searching for true answers to your SQL Server conundrums is unlikely to make you the assassination target of shadowy government agencies or lead to you or your family being abducted by aliens – although the list of undocumented trace flags is a very closely guarded secret…

The problem is that there's so much crap out there on the internet. And I'm not being dramatic here – there really is some *utter rubbish* being written about SQL Server that only serves to confuse and mislead people.

But how do you know who to trust?

What about Microsoft? You'd think that Microsoft itself is trustworthy all the time, right? Wrong.

Case in point: the Microsoft whitepaper that was published in May about performing SQL Server maintenance for SharePoint 2010. It was full of all kinds of nonsense such as:

  • Run CHECKDB, and if it shows problems, run repair to fix them (without any mention whatsoever of backups or the problems repair can cause).
  • After shrinking a database, grow it again and then rebuild all the indexes.

I couldn't believe it when it was published! I offered to rewrite it and it's since been republished with correct info in it.

Anyone inside Microsoft can publish a blog on MSDN or a whitepaper without any kind of peer review by the SQL team and then it becomes the Microsoft official word on the subject. This has caused no end of problems for DBAs, consultants and others who have to work really hard to correct ingrained misinformation from well-meaning Microsoft people.

Don't get me wrong, Microsoft is great – but being The Source itself, it's deemed inherently trustworthy, so how do you tell whether the blog or whitepaper you're reading is written by someone qualified to disseminate information/guidance to a wide audience? Stick to those whitepapers and blogs published by (or through) the SQL Server team, the SQL CAT team, and to the information in Books Online (which is improving in leaps and bounds). Take anything else with a pinch of salt. Even from the excellent CAT team, for instance, some of their advice is based on them dealing with very large enterprise customers running SQL Server on honking big hardware – and doesn't translate well down to smaller SQL Servers with lower workloads.

Also consider the publication date of a whitepaper/blog post, and which version of SQL Server the whitepaper/blog post is referencing – behavior may have changed drastically since then.

What about 3rd-party tools? You have to be very careful with these too. Look at the problems you can have if you create all the indexes DTA advises. Read my rant against one of the tools that was recommending dropping indexes because it wasn't considering INCLUDEd columns in the index definitions. Tools and their recommendations are only as good as the people who write them – if *they* are incorrect in their knowledge or assumptions, so will the tool be.

What about blogs? There are a *ton* of people blogging about SQL Server, across all levels of knowledge and experience. You have to be careful. Everyone starts with zero knowledge and experience about SQL Server, and no-one is an expert in all areas of SQL Server. For instance, I only blog about the areas I consider myself an expert in – and I only ever blog about things I'm 100% certain of, or I blog about how I'm finding things out, like in my benchmarking series. But nobody's perfect – I've made mistakes in the past and posted corrections, as have many people I consider experts. Look for blogs that are referenced a lot by people you've heard of in the community, and then still check what you're reading if you can. There are lots of people blogging and writing articles that contain poor advice, perpetuate myths and misconceptions, and are just plain wrong.

Enthusiasm about SQL Server is great, but accuracy of the information being presented is paramount.

What about forums? These are the most dangerous areas of the Internet for getting poor or incorrect advice about SQL Server. Even on the most reputable of forum sites (e.g. MSDN or SQL Server Central), anyone can join and anyone can reply to forum threads. And it doesn't matter how many experience points a person has either – I know people with lots of points but often post misleading or incorrect replies. But saying that, there are many, many people who give good advice – just take it with a pinch of salt again.

Here's a classic example from the MSDN forums back in 2008. I just *love* the retort about autoshrink that "Nevertheless all high-priced SQL server expert consultants warn against using it.'

What about twitter? Very interestingly I find that the advice on twitter on the #sqlhelp alias is some of the best I've seen – a sort of collaborative, real-time debugging forum. I'm not sure why that is – maybe because there are so many well-informed people in the SQL twitter community.

What about training classes? This is a hard one for me to comment on as I don't want this to seem a plug for our classes. Anyone can set themselves up as a trainer and claim to be a SQL Server expert. I've heard horror stories of trainers reading slides, having demos that don't work and they don't know what to do, parroting marketing info or Books Online, and class attendees having to correct inaccuracies in material. I've also heard of instructors that cannot answer anything but the most basic questions about the material being taught. Before signing up for a class, do some research about the company and the instructors – look for good and bad reviews. Looks for their blogs and how well they're known in the community. There's good stuff out there, and there's also poor quality stuff too.


Here are some things to look out for that should make you wary of the advice:

  • Someone saying that "this is *always* what you should do". E.g. if you have CXPACKET waits, set MAXDOP to 1. No!
  • Someone citing a configuration as a best practice because it worked for them (e.g. using one data file per processor core in all filegroups for user databases)
  • Someone referencing commands or syntax that have been removed from the product (e.g. advice about using DBCC REBUILD_LOG to fix a suspect database on 2005 or 2008)
  • Anyone recommending auto-shrink :-)

One of the worst things I'm seeing recently is people referencing a Page Life Expectancy value of 300 as being a threshold. That number was published in a whitepaper 6 years ago by Microsoft and is a nonsensical number today. IMHO if you see someone recommending this, they're parroting information from the old whitepaper and they don't really know what they're talking about and what PLE is actually saying. Ask them to explain *why* 300 is the threshold they recommend… sometimes expertise is only skin-deep. (Here's my explanation of PLE over on our SQL Server Magazine blog.)

Ok – so Trust No One is a bit too restrictive – but be careful who you trust. Try things out on a test system before doing them in production. Get validation of the info you've read from a few other sources if possible. Don't just blindly follow advice you get on forums – especially for critical operations like disaster recovery.

I'm sure some people will take this post as some sort of self-aggrandizement – that's not my aim in any way.

I'm just fed up of seeing people posting garbage and of having to defend the truth in the face of misleading misinformation.

Remember – The Truth Is Out There. Just be careful who you trust.

Survey: nonclustered index counts (code to run)

Here's a survey I've been meaning to run for quite a while – that Kimberly and I are really interested in.

For all the databases on your server, how many nonclustered indexes do you tables have, plus is the table a heap or a clustered index?

The code below will return the following result set (taken from a random client system):

BaseType   NCIndexes TableCount
———- ——— ———–
Clustered  0         1645
Clustered  1         832
Clustered  2         417
Clustered  3         68
Clustered  4         18
Clustered  5         13
Clustered  6         9
Clustered  7         5
Clustered  8         3
Clustered  9         3
Clustered  10        2
Heap       0         303
Heap       1         19
Heap       2         51
Heap       3         4
Heap       4         1 

I think it'll make some really interesting reading, and you may be really surprised at the results for your system – how many tables with no indexes at all and how many tables with large numbers of nonclustered indexes.

And yes, I'm sure there's a slightly better way to write the code below, but I'm not an expert T-SQL developer (and yes Aaron, I know about your (seriously) fabulous sp_MSforeachdb replacement :-)

If someone wants to bang out a quick post with the PowerShell to run this on multiple servers, I'll link to it. (Update: The PowerShell script can be downloaded from the bottom of this post)

Feel free to send the results in email, in a spreadsheet, in a comment below, or even written on papyrus – anything works.

The more results the better – thanks!

Here's the code:

IF EXISTS (SELECT * FROM msdb.sys.objects WHERE [name] = 'SQLskillsPaulsIndexCounts')
    DROP TABLE msdb.dbo.SQLskillsPaulsIndexCounts;
CREATE TABLE msdb.dbo.SQLskillsPaulsIndexCounts (
    BaseType CHAR (10),
    IndexCount SMALLINT);

EXEC sp_MSforeachdb
    FROM sys.databases WHERE [state_desc] = ''ONLINE''
        AND [database_id] > 4
        AND [name] != ''pubs''
        AND [name] != ''Northwind''
        AND [name] != ''distribution''
        AND [name] NOT LIKE ''ReportServer%''

        AND [name] NOT LIKE ''Adventure%'') AS names WHERE [name] = ''?'')
USE [?]
INSERT INTO msdb.dbo.SQLskillsPaulsIndexCounts
SELECT ''Heap'', COUNT (*)-1
FROM sys.objects o
JOIN sys.indexes i
    ON o.[object_id] = i.[object_id]
WHERE o.[type_desc] IN (''USER_TABLE'', ''VIEW'')
    AND o.[is_ms_shipped] = 0
        SELECT *
        FROM sys.indexes
        WHERE [index_id] = 0
            AND [object_id] = o.[object_id])
GROUP BY i.[object_id];

INSERT INTO msdb.dbo.SQLskillsPaulsIndexCounts
SELECT ''Clustered'', COUNT (*)-1
FROM sys.objects o
JOIN sys.indexes i
    ON o.[object_id] = i.[object_id]
WHERE o.[type_desc] IN (''USER_TABLE'', ''VIEW'')
    AND o.[is_ms_shipped] = 0
        SELECT *
        FROM sys.indexes
        WHERE [index_id] = 1
            AND [object_id] = o.[object_id])
GROUP BY i.[object_id];

SELECT DISTINCT [BaseType], [IndexCount] AS [NCIndexes], COUNT (*) AS [TableCount]
FROM msdb.dbo.SQLskillsPaulsIndexCounts
GROUP BY [BaseType], [IndexCount]
ORDER BY [BaseType], [IndexCount];

DROP TABLE msdb.dbo.SQLskillsPaulsIndexCounts;

Pauls Index Survey.ps1 (2.64 kb)

Survey: do you have SharePoint in your SQL environment?

This question came up on the #sqlhelp tag on Twitter today, and it's something I'm interested in too: what percentage of SQL Server environments include SharePoint somewhere too?

This is interesting because it has implications for standardization of SQL Server maintenance and HA/DR procedures across your company.

If you don't have SharePoint, please take 5 seconds to click No so the results make some sense.

I'll report on the results in a week or two.