I’ve been interested in exploring what changes may be coming to SQL Azure (Windows Azure “SQL Database,” as it is now called) around query execution cost models and the cardinality estimates that feed them. This was motivated by a couple of things…
First of all, I saw a blog post from Grant Fritchey where he noticed that the estimated costs for a few plan operators were different.
Secondly, I read a paper called “Testing Cardinality Estimation Models in SQL Server” by Campbell Fraser, Leo Giakoumakis, Vikas Hamine, and Katherine F. Moore-Smith. This was a for-fee article, but the non-member price of $15.00 was worth it. One particularly interesting quote was as follows:
“The new CE model is planned as a future service release of the Microsoft SQL Azure service.”
That quote was a tipping point for further investigation, so I collaborated with Jonathan Kehayias: we discussed a testing approach and set up two different Azure databases, one on Web Edition and the other on Business Edition. The intention wasn’t to perform “formal” tests, but I did want to sniff around and see what variations in cost and cardinality estimates I could find (if any) between SQL Azure (version 11.0.2006) and SQL Server 2012 (version 11.0.2316) across various types of queries. I used the Credit database for a variety of test queries – with identical schema and data in all three databases (one engine DB and two Azure DBs).
One thing I’ve learned so far is to watch out for misinterpreting cost differences between “identical” databases. Even if you load the exact same schema and rows, you will likely end up with a different data page count between the on-premises engine and Azure (think of Azure’s fill factor and RCSI behavior). For example, after loading my dbo.member table in SQL Azure, it had 159 pages versus the 142 pages in my SQL Server 2012 copy. An initial Clustered Index Scan query against that table showed an estimated I/O cost of 0.1075694 in SQL Server 2012 versus 0.120162 in SQL Azure. Assuming one random I/O and the rest sequential, the SQL Azure cost is still calculated the same way for the Clustered Index Scan:
-- Random I/O: 0.003125
-- Sequential: 0.000740741
SELECT 0.003125 +
       0.000740741 * (159 - 1);
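The same arithmetic can be sketched as a small function for checking other page counts. The 0.003125 and 0.000740741 constants are the ones shown above; the function name and rounding are mine, and SQL Server’s own displayed value may differ in the last decimal place due to internal rounding:

```python
def clustered_index_scan_io_cost(pages: int) -> float:
    """Estimated I/O cost for a Clustered Index Scan:
    one random I/O plus a sequential I/O for each remaining page."""
    RANDOM_IO = 0.003125         # cost of the initial random I/O
    SEQUENTIAL_IO = 0.000740741  # cost per additional (sequential) page
    return RANDOM_IO + SEQUENTIAL_IO * (pages - 1)

# 142 pages (SQL Server 2012 copy) vs. 159 pages (SQL Azure copy)
cost_engine = clustered_index_scan_io_cost(142)  # plan shows 0.1075694
cost_azure = clustered_index_scan_io_cost(159)   # plan shows 0.120162
```

Plugging in each database’s actual page count reproduces both estimated I/O figures, which is why the raw cost numbers alone aren’t evidence of a different cost model.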
So the key will be to make sure I’m looking at true apples-to-apples comparisons here. I’ll be testing when I have a few spare moments between other tasks – but in the meantime I’m very interested to learn more about what new changes will come to SQL Azure in the future. I’ll share anything interesting I find on the blog – and if you find noteworthy query optimizer items, please share them in the comments of this post as well.
Responses to What Does the Future Hold for Cardinality Estimates and Cost Models in Windows Azure SQL Database?
Another thing to watch out for is failovers in Windows Azure SQL Database. On busy systems they happen a lot more regularly than you may think, and the secondary does not necessarily have the same number of data pages for a table or index (due to factors such as the secondary being recreated behind the scenes, which rebuilds its indexes). I have seen overall database sizes jump up or down by as much as 10GB after a failover. This affects cardinality estimates and is not predictable.
You can monitor for it though by taking regular snapshots of sys.dm_db_partition_stats.
Interesting! Thanks for that comment, Michael.
Are you able to share more details on the data page variances that you encounter?
I was discussing this with Jonathan Kehayias, and that behavior seems inconsistent with the requirement that at least one secondary replica confirm log records have been written. Assuming log records are what is sent, that points to identical physical structures (similar to database mirroring and availability groups).
There may be something else that can explain what you are seeing.
You’ll notice this behaviour more on databases with a lot of indexes, and also on those large in size. I’m privileged enough to have been told some of the inner workings of secondaries and replication in Azure, but unfortunately it is covered by an NDA. What I will say, though, is that it has to do with a combination of how data is replicated (not necessarily transactional replication) and what happens when secondaries are rebuilt or moved and extra space is reclaimed.
Just running a basic query every hour, capturing the results, will show you this behaviour over time:
select object_name(object_id), used_page_count, reserved_page_count
from sys.dm_db_partition_stats;
It’s a very common query, as you’ll know, but it allows you to track the number of pages used by each index over time. If you also take on the overhead of capturing index fragmentation levels, that helps back up what I’m saying: at some point (on a busy system) the primary will fail over to a secondary, and you’ll see a jump (up or down) in fragmentation and pages used for one or more indexes.
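The snapshot-and-compare approach described above can be sketched roughly as follows. This is my own illustration, not Michael’s actual monitoring code: the snapshots are assumed to be simple name-to-page-count mappings captured from sys.dm_db_partition_stats, and the 10% threshold is an arbitrary choice:

```python
def page_count_jumps(prev: dict, curr: dict, threshold: float = 0.10) -> list:
    """Compare two snapshots of {object_name: used_page_count} and return
    the objects whose page count moved by more than `threshold` between
    captures. A sudden jump (up or down) often coincides with a failover
    to a secondary that has different physical structures."""
    jumps = []
    for name, pages in curr.items():
        before = prev.get(name)
        if before and abs(pages - before) / before > threshold:
            jumps.append((name, before, pages))
    return jumps

# Example: a table grew from 142 to 159 pages (~12%) between hourly captures
flagged = page_count_jumps({"member": 142}, {"member": 159})
```

Run against hourly captures, anything flagged here is a candidate for the failover-related drift Michael describes, rather than ordinary data growth.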
As a result of this behaviour, when monitoring database sizes and “percent full”, we treat 90% as 100% in case a secondary is larger in size.
I’d be happy to send some example (real) data for the above, if you like. Obfuscated of course… 🙂