I've blogged about ghost records and the ghost cleanup task a couple of time before (the only place it is really explained AFAIK), but one of my fellow MVPs was asking me some questions about it today for a customer of theirs and couldn't find the trace flag to turn it off.
My previous blog posts about this are:
These explain what ghost records are and how the ghost cleanup process works.
On big systems it's possible for the ghost cleanup process to fall behind the rest of the system, with no hope of it catching up. It's a single-threaded task – so imagine a 16-way box with lots of deletes occuring, and a single CPU spending a few seconds every 5 seconds trying to remove all the ghosted records resulting from the deletes of all the other CPUs. It's pretty obvious that the ghost cleanup process is going to lag behind.
The problem with this scenario is that the ghost cleanup process will still pop up every 5 seconds (every 10 on 2008) and start removing ghost records, potentially causing performance issues by keeping pages in the buffer pool, generating log records, and causing physical IOs. The ghost cleanup task is also one of the background processes that can cause IOs to occur on what looks like a totally quiescent system.
There is a way to turn off the ghost cleanup task, using trace flag 661, as documented in KB 920093. Be careful though!!! If you disable the ghost cleanup task, the space taken up by deleted records will *NOT* be released for reuse by SQL Server until you do something else to remove it, like rebuilding an index.
One method people sometimes consider is to force ghost cleanup to clean everything by performing a table or index scan (thus queuing all the deleted records up for the ghost cleanup task). Although this is an alternative, it still uses the ghost cleanup task to do the work, and on a very busy system with a very large number of deletes (warning: generalization! :-) it can be much more efficient to remove the deleted-but-not-yet-reclaimed records using index reorganize or index rebuild.
Turning on this trace flag can be useful as a performance gain on systems with very heavy delete workloads, but only as long as you're careful about it. It's not something that's generally recommended but it may be useful to you.
Enjoy!
18 Responses to Turning off the ghost cleanup task for a performance gain
Thanks for the info, Paul. This would have been REALLY useful for a project I did two years ago. Oh well, better late than never!
Hi Paul,
Instead of disabling the ghost cleanup with a trace flag, one should be able to force the Ghost cleanup with an index hint and be done with it instead of waking up automatically every few second(s) if generating the log records is ok. Would you or would you NOT recommend this method? Is generating too much log the only reason to turn off this feature?
Adam,
What was the scenario that would have benefitted from turning off the Ghost cleanup if you can recall? I am trying to convince myself that there is a valid use case for turning off this feature.
You could do that, but it still forces the ghost cleanup task to get rid of the deleted records. For a very busy system where such a method is necessary, it can be way more efficient to simply turn off ghost cleanup and allow index reorg or index rebuild to remove the deleted-but-unreclaimed space. Thanks
Good idea, thanks for the tip. Is there an easy way to tell what % of resources the ghost rec cleanup is taking up on the system?
Thanks,
Scott
Hey Scott – not that I know of.
The easiest way is, of course, to disable the ghost rec cleanup and while measuring the resource usage of the system before and after. Will using the dbcc traceon(661, -1) disable it on the live system such that you could see the difference immediately or does it require a reboot (and to be set in the startup paramenters)?
Would using dbcc traceon (661, -1) disable the ghost cleanup process without requiring a reboot? That way we could have some idea if it was going to help before we permanently add the traceflag to the startup parameters.
Yes, you can turn it on dynamically. Just be careful :-)
[...] ghost cleanup background task. It's documented in KB 920093 and I've blogged about it here (and in a few other posts). It's a very useful trace flag which is safe to use, is [...]
[...] Randal’s blog posts Inside the Storage Engine: Ghost cleanup in depth, Ghost cleanup redux, and Turning off the ghost cleanup task for a performance gain. To my knowledge Paul’s posts are the only things that cover Ghost Cleanup at any level [...]
Paul,
I’m seeing a lot of Error: 1222, Severity: 16, State: 18 when doing tracing in our system. We currently have ~40 customer databases on this physical server, each with the same schema but different (their own) data. I see 1222 pop up randomly, not always after the same Stored Procedure call, so I’m not even sure that what I’m seeing is the Ghost Cleanup process running, or if it is normal in the system. How do I determine that it is indeed the Ghost Cleanup or not? Is there anything else it could be with the 1222,16,18 (ex. I’ve read maybe very long running queries are blocking/locking)? And finally, if it is the Ghost Cleanup, what impact, if any, is there to the users in the app? I was able to produce my own 1222 exceptions while testing in our app and could see the errors in the trace file go flying by, but I did not get an exception in the app layer or notice any lag or delay in the UI. But yet I have customers claiming they do get these freezes/lags. I’m just trying to isolate the issue and reproduce so we can then fix/tune accordingly. Any insight?
Thanks!
Why do you think the lock timeouts are from the ghost cleanup task? One way to prove yes/no is to enable trace flag 661 using dbcc traceon (661, -1) to disable ghost cleanup. If you still see the 1222s, nothing to do with ghost cleanup.
As I said, I’m not even sure it is Ghost Cleanup but I’ll have to work with our DBA to see if he is willing to enable trace flag 661 to disable ghost cleanup. Assuming I still see 1222s, in your experience, what else typically causes them and why don’t they seem to generate (from some brief testing) any impact to the UI if they are timeout related? I guess we might have to start trying to catch them specifically in the UI code to help learn more? Any suggestions as to help determine what would be causing them?
Unless you’ve changed the default lock timeout (infinite) what you’re seeing must be a background task.
I am seeing timeouts randomly and trace leads me to SPID that maps to task manager.
Is ghost clean the only task that task manager handles? if trace flag is turned on, how will it affect the performance if index rebuilds are done once a week?
Nope – there are a bunch of background tasks. Turning on the trace flag will not directly affect index rebuilds.
Paul, May I know why heap table doesn’t support ghost records?
Take a look at my Ghost Cleanup In Depth post – I explain in the comments there. Thanks