What Does the Future Hold for Cardinality Estimates and Cost Models in Windows Azure SQL Database?

I’ve been interested in exploring what changes may be coming through in SQL Azure (Windows Azure – “SQL Database” now) around query execution cost models and cardinality estimates being fed to them. This was motivated by a couple of things…

First of all, I saw a blog post from Grant Fritchey where he noticed that the estimated costs for a few plan operators were different.

Secondly, I read a paper called “Testing Cardinality Estimation Models in SQL Server” by Campbell Fraser, Leo Giakoumakis, Vikas Hamine, and Katherine F. Moore-Smith. This was a for-fee article, but the non-member price of $15.00 was worth it. One particularly interesting quote was as follows:

“The new CE model is planned as a future service release of the Microsoft SQL Azure service.”

That quote was a tipping point for further investigation, so, collaborating with Jonathan Kehayias, we discussed a testing approach and set up two different Azure databases, with one database on Web Edition and the other Business Edition. The intention wasn’t to perform “formal” tests, but I did want to sniff around and see what variations in cost and cardinality estimates I could find (if any) between SQL Azure (version 11.0.2006) and SQL Server 2012 (version 11.0.2316) across various types of queries. I used the Credit database for a variety of test queries – with identical schema and data in all three databases (one engine DB and two Azure DBs).

One thing I’ve learned so far is that you should watch out for is misinterpreting cost differences for “identical” databases. Even if you load the same exact schema and rows, you will likely have a different data page count between Engine and Azure (think of the Azure fillfactor and RCSI behavior). For example, after loading my dbo.member table in SQL Azure, it had 159 pages versus the 142 pages in my SQL Server 2012 version. So testing an initial Clustered Index Scan query against that table showed me an estimated IO of 0.1075694 in SQL Server 2012 versus 0.120162 for SQL Azure. So assuming one random I/O and the rest sequential, I see that my SQL Azure cost is still calculated the same for the Clustered Index Scan:

— Random I/O – 0.003125
— Sequential – 0.000740741
SELECT 0.003125 +
0.000740741 * (159-1);

So the key will be to make sure I’m looking at true apples-to-apples comparisons here. I’ll be testing when I have a few spare moments between other tasks – but in the meantime I’m very interested to learn more about what new changes will come in to SQL Azure in the future. I’ll share anything interesting I find on the blog – and if you find noteworthy QO items, please share on the comments of this blog as well.

Thanks!

0 thoughts on “What Does the Future Hold for Cardinality Estimates and Cost Models in Windows Azure SQL Database?”

Mike Lewis says:

September 11, 2012 at 9:21 pm

Another thing to watch out for is failovers in Windows Azure SQL Databases. On busy systems it happens a lot more regularly than you may think, and the secondary does not necessarily have the same number of data pages for a table or index (due to factors such as recreating the secondary behind the scenes therefore rebuilding indexes). I have seen overall database sizes jump up or down by as much as 10GB due to a failover. This affects cardinality estimates and is not predictable.

You can monitor for it though by taking regular snapshots of sys.dm_db_partition_stats.
joe says:

September 11, 2012 at 11:02 pm

Interesting! Thanks for that comment, Michael.
joe says:

September 11, 2012 at 11:57 pm

Hey Michael,

Are you able to share more details on the data page variances that you encounter?

I was discussing this with Jonathan Kehayias and that behavior seems inconsistent with the requirement to have at least one secondary replica confirm log records have been written. Assuming log record sends, that points to identical physical structures (similar to DBM and AGs).

http://social.technet.microsoft.com/wiki/contents/articles/1695.inside-windows-azure-sql-database.aspx

There may be something else that can explain what you are seeing.

Thanks!
Mike Lewis says:

September 19, 2012 at 12:17 am

Hey,

You’ll notice this behaviour more on databases with a lot of indexes, and also those large in size. I’m privileged enough to have been told some of the inner workings on secondaries and replication in Azure, but unfortunately it is covered by a NDA. What I will say though, is that it’s to do with a combination of the way data is replicated (not necessarily transactional replication) as well as when secondaries are rebuilt or moved and extra space is reclaimed.

Just running a basic query every hour, capturing the results, will show you this behaviour over time:
select object_name(object_id), used_page_count, reserved_page_count
,convert(money,used_page_count*8192)/(1024*1024) UsedSpaceMB
,convert(money,reserved_page_count*8192)/(1024*1024) ReservedSpaceMB
,row_count
,*
from
sys.dm_db_partition_stats

It’s a very common query, as you’ll know, but it allows you to track the number of pages used by each index over time. If you also take the overhead of capturing the index fragmentation levels, then it helps back up what I’m saying. At some point (on a busy system) the primary will failover to a secondary, and you’ll see a jump (up or down) in fragmentation and pages used for one or more indexes.

As a result of this behaviour, when monitoring for database sizes and "percent full", we treat 90% as 100% in case a secondary is larger in size.

I’d be happy to send some example (real) data for the above, if you like. Obfuscated of course… 🙂

Comments are closed.

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

What Does the Future Hold for Cardinality Estimates and Cost Models in Windows Azure SQL Database?

0 thoughts on “What Does the Future Hold for Cardinality Estimates and Cost Models in Windows Azure SQL Database?”

Other articles

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 2″

New Course: “SQL Server: Common Query Tuning Problems and Solutions – Part 1”

SQLIntersection Post-Conference Session

Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator

Finding Timeout-Prone Code with max_elapsed_time

Deck from “Practical SQL Server Cardinality Estimation”

Imagine feeling confident enough to handle whatever your database throws at you.