Where do the Books Online index fragmentation thresholds come from?

I made them up. Yup.

I’m talking about the guidance which is:

  • if an index has less than 1000 pages and is in memory, don’t bother removing fragmentation
  • if the index has:
    • less than 5% logical fragmentation, don’t do anything
    • between 5% and 30% logical fragmentation, reorganize it (using DBCC INDEXDEFRAG or ALTER INDEX … REORGANIZE)
    • more than 30% logical fragmentation, rebuild it (using DBCC DBREINDEX or ALTER INDEX … REBUILD)

These numbers are made up. They can and will vary for you, but they’re a good starting point to work from.

There’s been some discussion since PASS, when I confessed publicly on Twitter (during Grant Fritchey‘s session) to making them up, about whether I really said that etc etc. Yes – I really did make them up.

Back in 1999/2000 when I wrote DBCC INDEXDEFRAG and DBCC SHOWCONTIG for SQL Server 2000, customers wanted *some* guidance on what the thresholds should be where they should care about fragmentation or not, and how to remove it. We had to put *something* into Books Online (my favorite “it depends!” wouldn’t have been too helpful), so I talked to some customers, inside and outside Microsoft, did a bunch of experimentation, and chose these numbers as most appropriate at the time. So they’re not *really* made up – they were chosen carefully.

They’re not set in stone – they’re a big generalization, and there are a ton of other factors that may affect your choice of threshold and fragmentation removal method (e.g. recovery model, high-availability technologies in use, log backup schedule, query workload, disk space, buffer pool memory, and so on). I wish Microsoft would update the old whitepaper on fragmentation – they keep promising me they’ll get around to it.

In the meantime, take those numbers with a pinch of salt and don’t treat them as absolute.

15 thoughts on “Where do the Books Online index fragmentation thresholds come from?

  1. Paul,

    Although I don’t discount your numbers, I have heard different numbers from Microsoft engineers in the recent past. Two engineers have said not to bother with rebuilding indexes until fragmentation hits 75% as anything below that will not have a performance benefit if the index is rebuilt. Both engineers stated the same as you in regards to ignoring any indexes with less than 1000 pages.

    ~Tara

  2. Hey Tara – that’s just BS. It totally depends. I’d love to know why they were so I can contact them using my Microsoft email account. Cheers.

  3. Does the size of an index influence your thoughts on those recommendations? In my experience the larger the index the lower the threashold should be. I am just curious to hear your thoughts.

  4. I understand your point. Let’s say you have any index that is 26 gigabytes that has 30% fragmentation compared to a 2 megabyte index with 65% fragmentation the much larger index will put more stress on the disk systems and should (in my thinking) lower the threshold. How do you tell the distribution of the fragments in the index?

  5. It’s a bit hard to say – if the fragmentation is spread through the index, then a higher number could be fine. If the fragmentation is limited to a small section of a large index, but that section is the one being scanned, then a lower number should be the threshold. That gets a bit tricky to explain to people so I tend not to go into details. Sounds like something for my perf benchmarking series though!

  6. Indeed. There’s no easy way to tell the distribution. Various commercial tools will give you a ‘map’ like a disk defrag tool does, but there’s no T-SQL on the net to do it AFAIK. If you know the DML pattern on your table you should be able to work it out with a bit of thought.

  7. Is there any point on Rebuilding Indexes on tables with < 5% fragmentation ? It's recommended by a vendor that we rebuild indexes on certain large tables regardless of the fragmentation level (this is usually well below 5% over a week, around 1%). These tables are between 2-20 GB. It doesn't make sense to me to do that, but I'd appreciate a feedback from the experts.

  8. Sorry, in your post your say “if an index has less than 1000 pages and is in memory, don’t bother removing fragmentation”, but I don’t believe Memory-optimized Indexes contain Pages as such do they?

    Can you confirm that you mean:
    “if an index has less than 1000 pages OR is in memory, don’t bother removing fragmentation”

    else, can you tell me how to find the number of pages of a Memory-optimized Index, as sys.dm_db_xtp_index_stats doesn’t seem to return that?

    Thanks!

  9. Thank you for the informative post. We have a situation where our VAR is holding fast to the 1000 page rule and we have been able to demonstrate that re-indexing some of our smaller highly fragmented tables have drastically helped performance. To that end, if a table is hitting 70% fragmentation and is 300 pages or larger, is there any detriment to the system in re-indexing those indexes?

    1. That rule isn’t a rule, it’s a generalization. Are you sure it’s the removal of fragmentation that’s fixing performance, or the consequent recompile of query plans involving that index? I would bet it’s the latter. Fragmentation of a tiny table shouldn’t cause a drastic performance drop.

      1. Thank you Paul for you comment, I just implemented Ola’s scripts into my OLtP environment, I think it has worked wonferful; however my boss wants to change pagecountlevel parameter to 1 ! for a 2 TB Database. I keep telling him is an overkill for such a big environment. I will direct him to your site.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Some thoughts on courage

(This is also the Ponderings – editorial – in today’s SQLskills newsletter.) I want to start out this post by sincerely thanking everyone who emailed

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.