(Posted with permission of the dev team)
Here's an interesting bug that people are hitting. I found out about this while here in Barcelona at TechEd when I got roped into a discussion with a couple of Microsoft colleagues. Their customer was seeing errors like the following:
Msg 8914, Level 16, State 1, Line 1
Incorrect PFS free space information for page (1:35244) in object ID 1683128146, index ID 1, partition ID 223091033422352, alloc unit ID 81405523118118176 (type LOB data). Expected value 0_PCT_FULL, actual value 100_PCT_FULL
This error says that the PFS page (see this blog post) has the wrong free-space tracking bits for a text page. In SQL Server 2000, the algorithm to keep track of free space in the PFS pages wasn't perfect so CHECKDB never reported these errors – it would silently fix them. In 2005 we fixed the algorithm (supposedly), so I turned on the reporting of these errors again in CHECKDB.
When I first saw the description from the customer, my first reaction was that it was an I/O subsystem problem causing corruption, but the customer has page checksums turned on, so a corruption would result in an 824 error before an 8914 error. My conclusion then was that there's a bug in the new free-space tracking algorithm. After checking with the dev team, it turns out my suspicions were correct – there is a bug in 2005 SP2. It's fixed in 2005 SP3 and in 2008, but you may see these 8914 errors if you're not running one of those.
Here are the technical details of the problem (slightly edited from the dev team explanation):
The issue was that when minimal logging for LOBs was used (under the SIMPLE recovery model, during BULK INSERT/BCP/Large insert with TABLOCK), extents are being pre-allocated and the pages were being marked 100% full in the PFS page when the pages were allocated. The idea was that all pages eventually will be filled up with LOB data, and by marking them full during allocation we avoid an extra update to the PFS pages when the data is actually put on the page.
Suppose now that 64 pages are pre-allocated, and only 40 or so pages are used and have rows on them. When the transaction commits, the Storage Engine is supposed to deallocate the 24 extra pages that were pre-allocated, and marked 100% full, even though they don’t have any rows on them yet. There was a bug where in a certain case the deallocation wouldn't happen, so you end up with empty pages that have a PFS state of 100%, but don’t have any rows on them.
In general, even though this is a bug in the code, functionality wise, there is nothing wrong with the database, besides a number of additional pages that are empty and allocated to the LOB tree, so as long as the message tells you that the page is supposed to be empty, but is actually marked 100%, nothing can really go wrong with that page. Unfortunately, DBCC will keep reporting these errors.
Although I haven't tried this, my guess is that you can get rid of the empty LOB pages using ALTER INDEX … REORGANIZE WITH (LOB_COMPACTION = ON). So, if you see some of these errors with no other errors, you may have hit this bug and have nothing really to worry about.
8 Responses to Corruption bug that people are hitting: Msg 8914 – PFS free space
Is there a kb article for this bug?
Not that I know of.
Got the same problem on a SQL 2K5 EE, with SP3…
The ALTER INDEX command you suggested had no effect.
Ok – in that case you’ll need to either run repair to fix it up (which is a DB offline operation) or extract the data into a new table. Hopefully this has been fixed in the latest CUs for SP3.
Paul-
I see this error in SQL Server 2008 with a SharePoint Services database SPS02_Config_db. The database was created in SQL Server 2000; we skipped over 2005 and jumped into 2008. I just noticed the problem when I setup a new maintenance plan including the integrity check, but I am confused on how best to fix it. Our Intranet powered by SharePoint Services is an important resource so I am fearful about screwing up while trying to fix it. Any ideas? Is the repair_allow_data_loss my only hope?
FYI, I am very green about SQL Server administration. So I apologize if I ask lots of questions or some (lets be honest, most or all) are obviously newbie. Thanks for your help!
Executing the query "DBCC CHECKDB(N’SPS02_Config_db’) WITH NO_INFOMSGS…" failed with the following error: "Incorrect PFS free space information for page (1:1383) in object ID 834102012, index ID 1, partition ID 336138686169088, alloc unit ID 71830782770675712 (type LOB data). Expected value 0_PCT_FULL, actual value 100_PCT_FULL.
Incorrect PFS free space information for page (1:1463) in object ID 834102012, index ID 1, partition ID 336138686169088, alloc unit ID 71830782770675712 (type LOB data). Expected value 0_PCT_FULL, actual value 100_PCT_FULL.
….REPEATE 49 times….
CHECKDB found 0 allocation errors and 50 consistency errors in table ‘Binaries’ (object ID 834102012).
CHECKDB found 0 allocation errors and 50 consistency errors in database ‘SPS02_Config_db’.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (SPS02_Config_db).". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
Sounds like you had the problem with 2000 (which could get out-of-whack free space too) but didn’t discover it until 2008. You’ll need to run repair with REPAIR_ALLOW_DATA_LOSS to fix it – based on the errors above, it will only fix the PFS free space and not lose any data.
Paul, I am running DBCC CHECDB on SQL Server 2005 SP4. But still i can see these errors in TEMPDB and ‘WSS_Content_SharepointMain’ DB.
Looks like this issue hasn’t been fixed yet. Any suggestions?
Paul, I am running DBCC CHECDB on SQL Server 2005 SP4. But still i can see these errors in TEMPDB and ‘WSS_Content_SharepointMain’ DB[compatibility - 80].
Looks like this issue hasn’t been fixed yet. Any suggestions?