Migrating Legacy LOB Data Types to Current Ones – A Big Gotcha!

Recently, Erin (Blog|Twitter) and I encountered a bug in SQL Server that can affect some scenarios in a very negative manner.  When working with a database that was designed in SQL Server 2000 using the legacy LOB data types, text, ntext, and image, it is possible to encounter a problem when the ‘text in row’ table option has been configured and the data types for the columns were later migrated to the newer varchar(max), nvarchar(max), or varbinary(max) data types in SQL Server.  If you have a table that fits the specific scenario in question, and you try to analyze the table for data compression using the sp_estimate_data_compression_savings stored procedure, the end result is a hung session with a Schema Modification lock being held in tempdb that requires a restart of the instance to clear.  NOT GOOD!!!!!

We were able to reproduce this problem in SQL Server and we’ve subsequently filed a bug internally with Microsoft as well.  The good thing is that there is a work around, which I’ll cover later in this blog post, if you happen to run into this specific scenario and bug.

To start off with, if you create a table using varchar(max) and try to use sp_tableoption to set ‘text in row’ for the table it will error out:

DROP TABLE dbo.Table_1

CREATE TABLE dbo.Table_1
(
RowID int NOT NULL,
TestColumn varchar(max) NULL,
OtherColumn varchar(50) NULL,
ChangeDate datetime NOT NULL
)

exec sp_tableoption 'Table_1', 'text in row', '7000';

 

Msg 2599, Level 16, State 1, Procedure sp_tableoption, Line 102
Cannot switch to in row text in table "Table_1".

This is expected behavior by the Engine since the ‘text in row’ option only applies to the legacy data types and cannot be used with the newer LOB data types in SQL Server 2005+. For these, you instead must use the ‘large value types out of row’ option.

However, if you have a table that was created in SQL 2000 or before, and used text, image, or ntext columns, and the ‘text in row option’ was set:

DROP TABLE dbo.Table_1

CREATE TABLE dbo.Table_1
(
RowID int NOT NULL,
TestColumn text NULL,
OtherColumn varchar(50) NULL,
ChangeDate datetime NOT NULL
)

EXECUTE sp_tableoption 'Table_1', 'text in row', '7000';</blockquote>

and then later you upgraded to 2005+ and changed the column data type to varchar(max) or one of the other LOB data types:

ALTER TABLE Table_1;
ALTER COLUMN TestColumn varchar(max);

The ‘text in row’ value remains configured for the table.

SELECT text_in_row_limit
FROM sys.tables
WHERE name = 'Table_1';

If you attempt to estimate the compression for this table it will generate an error because the compression code creates a copy of the table in tempdb and tries to call sp_tableoption to set the option on the temp table used for sampling.

WARNING: Running the below code will result in a locked session that requires restarting the SQL instance to clear! DO NOT DO THIS ON A PRODUCTION SQL SERVER!

EXECUTE sp_estimate_data_compression_savings 'dbo','Table_1', 0, 1, 'row'

This command will result in error 2599 occurring, however, the SP doesn’t handle the error correctly for some reason, and it ends up with the session waiting on a SCH-M lock in tempdb that is being blocked by itself and never goes away.  The proc will actually complete and return the error for 2599 in SQL Server 2008 R2 + SP1, but it results in a stuck session for SQL Server 2008R2 RTM, SQL Server 2008R2 + SP2, and all versions of SQL Server 2012 that we’ve tested so far (we are waiting for further information from Microsoft at this point about why this might occur, so this may be updated in the near future, but I wouldn’t rely on this behavior currently).
In addition to this, you can’t clear the ‘text in row’ value from the table because executing sp_tableoption will return the above error.  The only work around that I’ve found so far for this issue is to add a dummy column to the table with one of the legacy types, turn off the option, and then drop the column.

ALTER TABLE Table_1 ADD [dummy] text;
EXECUTE sp_tableoption 'Table_1', 'text in row', 'OFF';
ALTER TABLE Table_1 DROP COLUMN [dummy];

Once you do this, sp_estimate_data_compression_savings  can estimate the compression savings. If you have migrated data types from legacy LOB types to the current LOB types, you should check for this scenario before attempting to run sp_estimate_data_compression_savings against the tables that changed.

I’ve done a bit of testing and I don’t think that this problem is prevalent in SQL Server instances in the field for a couple of reasons.  Firstly, it requires that you are attempting to estimate the compression savings for a table affected by the problem, and in my experience, data compression is not really widely used.  Secondly, it requires that you migrated from one of the legacy LOB data types to a current one through the use of ALTER TABLE DDL explicitly (the SSMS UI will generate a script to create a new table, copy all the data into it, drop the existing tables indexes and constraints, then drop the old table, and the rename the new table to the old table name, then build all the necessary constraints and indexes, so the table option won’t be in effect after the designer based change).

You really need the perfect storm of conditions to hit this specific problem inside of SQL Server, but it is still something to be wary of.

Hopefully you won’t hit it, but I’m sure someone else will at some point.

5 thoughts on “Migrating Legacy LOB Data Types to Current Ones – A Big Gotcha!

  1. I have encountered problems with SQL 2000 and compression before on upgraded databases.

    I forget the specifics, but scenario was something like following:

    Upgrade in place (on a test server :)) from 2000 to 2008. Used space ~800 GB
    Apply page and row compression. Used space~768 GB
    Not very effective then
    Check each object. Massive discrepancy.

    Script DB, BCP data, ~200 odd GB

    There seemed to be some structure issues when I used undocumented procs to look at pages, but I truly can’t recall any details

  2. Well, you were right that “someone else will run into this obscure issue at some point”.
    I just ran exactly into this scenario.

    Microsoft SQL Server 2012 (SP1) – 11.0.3128.0 (X64)
    We have a Microsoft Dynamics CRM database with some custom code (and maybe custom tables) from a 3rd party vendor. I was reviewing the largest tables and was considering data compression via SSMS. the largest table is about 2.4 GB and has a bit over 200,000 rows.
    In SSMS, I went to Storage / Manage Compression for that table and requested an estimate for page compression.
    After a good while, not getting any refresh on this estimate, I clicked Cancel. The Manage Compression popup window closed that the operation seemed cancelled.
    Nevertheless, I noticed a SQL session left, blocking itself, exactly as described in this article.
    Trying to kill that session never worked.

    As expected, the details of the self-blocking session are:
    exec sys.sp_estimate_data_compression_savings ‘dbo’,’AsyncOperationBase’,NULL,1,PAGE

    executing KILL WITH STATUSONLY
    always returns: “transaction rollback in progress. Estimated rollback completion: 0%. Estimated time remaining: 0 seconds.”

    It did not seem to eat much CPU but I could see the waits related to LCK_M_SCH_M cropping-up.

    Simply restarting just the SQL instance did not help. The hanged session was still there after the restart.
    In my second attempt I:
    1. Applied your advice to correcting the root of the issue, by using your work around to temporarily add a dummy column.
    2. Stopped and restarted ALL services for this SQL Server instance, including the Agent, Browser and the [SQL Full-text Filter Daemon Launcher]

    This successfully eliminated the self-blocking session and sp_estimate_data_compression_savings works fine now on that table.
    I had concerns that the work around solution might leave the table with a slightly different structure. I scripted the CREATE Table definition before and after applying the work around solution. I compared then and am happy to report that they are identical – so, there should be no impact on existing 3rd party code that is using that table. Please let me know if this risk exists.

    Thank you so much for this article!
    I did not find any more to the point article and it saved me a considerable amount of time, not to mention the stress levels 🙂

    Cheers,
    Radu

  3. Hi Jonathan,

    I am facing a weird issue and finally I ended up looking at this blog. Though my issue was different, this is an excellent blog. Hope I get some solution here.
    we have an application built on legacy classic ASP. The application is hosted on windows 2003 OS. We recently upgraded our database from SQL 2000 to SQL 2008 R2.We are facing an issue where some of the report generation is failing with error ‘Script time out’. The backend query runs fine on SSMS but the code is failing to bind the query output to html table. The weird part is this works fine for some other search criteria even though the data is more.
    Any idea on this.

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

Bitten by SSD Bit Rot

Back in 2012, I wrote a post titled Looking at External Disk Performance using USB 3.0 and eSATA with SSD, where I tested a number

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.