Here's an interesting bug that was just recently fixed, and someone actually hit it today, so it's time to blog it.

Bottom line is that if you're doing NOLOCK scans of tables involving LOB data, the perfmance might suck. 2005 SP3 CU3 has the bug fix, but unfortunately the KB article was written by someone who didn't seem to know what they were talking about, so I suggested it was rewritten, which it has been (but not republished yet). The original KB article is at http://support.microsoft.com/kb/961049/. It's also fixed in 2008, but I don't know which build (I believe CU4 at least).

Here's my explanation of the problem.

LOB data can be stored in-row or off-row (my previous LOB post Importance of choosing the right LOB storage technique has more details). When it's stored off-row, it must be accessed by first reading the pointer to the LOB data from the data/index record, and then following the pointer (remember that 'record' is synonymous with 'row'). When a LOB data value is updated, the off-row value is updated first, and then if the off-row link changed, the data/index record is updated with the new link. There's obvioulsy a window here, where someone reading the data/index record might see the wrong/non-existent off-row pointer. This is exactly what can happen when a NOLOCK scan occurs.

To mitigate the possibility of a NOLOCK scan trying to follow a bad off-row link, the old behavior was to scan all the IAM pages (see Inside the Storage Engine: IAM pages, IAM chains, and allocation units) for the table/index to make sure that the off-row link actually pointed to a page allocated to the table/index. If there are lots of IAM pages, this means lots of logical IOs, and poorly performing SELECT queries. And it does the scan once for *every row*. The person that hit it today had a 500 row select of ~20KB per row taking 20 seconds – 10MB of physical IOs and 30MB of logical IOs!

The fix is to make further use of an already existing in-memory cache of IAM pages to do a quick lookup of the right IAM page covering the GAM interval of the LOB page being read, without having to scan the whole IAM chain.

Hope this helps explain things, and track down perf problems for some of you.