(The Curious Case of… used to be part of our bi-weekly newsletter but we decided to make it a regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)
Last week I had a question from someone who wondered about the 8060-byte row size limit and why data file pages sometimes show more than 8,060 bytes free when looked at using DBCC PAGE.
Before I explain, let me make it clear that the 8,060-byte limit is just for the part of the record that’s stored on the data or index page in the ‘in-row data’ allocation unit; a record can have many LOB columns that are stored off-row in a ‘LOB data’ allocation unit (e.g. varchar(max) or FILESTREAM columns) and/or non-LOB, variable-length columns that have been pushed off-row to be stored in a ‘row overflow’ allocation unit (e.g. char(1-8000), nchar(1-4000), or sqlvariant columns), and so the actual row size is essentially unlimited.
I could rephrase the original question as this: given that a page is 8,192 bytes, and only 96 bytes are used for the page header, why is the in-row record size limited to 8,060 bytes? What about the other 36 bytes of available space?
The answer is that an 8,060-byte row can actually take up a lot more space.
Firstly, there’s the 2-byte row offset that’s stored at the end of the page and tells the Storage Engine where the row starts on the page. So that’s 8,062 bytes used, 34 bytes remaining.
Secondly, if the row is on a heap data page, and it’s a forwarded record, there’s a 10-byte back-pointer (to the original location of the record) tacked on to the end of the record. This is used in case the record has to move again, and then the forwarding stub record (in the original location of the record) is updated to point to the new location of the forwarded record. So in that case, that’s 8,072 bytes used, 24 bytes remaining.
Thirdly, if the row has a prior version, because it’s been updated since either read committed snapshot isolation or snapshot isolation were enabled for the database, it will have a 14-byte versioning tag tacked on to the end of the record, containing the location of the previous version of the record in the version store in tempdb and the versioning timestamp when the current version of the record was created. So in that case, that’s 8,076 bytes used, 20 bytes remaining.
Lastly, it may be a heap forwarded record that also has a prior version, so has the 10-byte back-pointer and the 14-byte versioning tag tacked on the end. So in that case, that’s 8,086 bytes used, 10 bytes remaining.
And those 10 bytes are available for any future features to use. Some of them are probably used by the on-page versioning that Accelerated Database Recovery does in SQL Server 2019, but I haven’t investigated that in any depth yet.
So there you have it: an 8,060 byte record may actually need 8,086 bytes or more of physical storage because of various Storage Engine features.