Last week in Azure SQL Database – Part 4 – Performance Levels

In this post I’ll address perhaps the most important of all the announcements, performance levels.

The six new ASD tiers provide different levels of performance and different resource guarantees/reservations. There is a chart here: http://msdn.microsoft.com/library/azure/dn741340.aspx that lists performance levels (among other things), and there is a different chart here: http://msdn.microsoft.com/en-us/library/azure/dn741336.aspx that gives more details on performance, predictability, and resource guarantees (like max worker threads and max sessions).

The part concerning predictability is useful because ASD servers (the physical servers in the datacenters, not the SQL Server master databases they call “servers”) are shared. Because the servers are shared, predictability suffers from the “noisy neighbor” syndrome. Imagine that someone who shares a physical server with you is running a database stress test…maybe during your business’ peak time…

Performance is defined in a new, curious unit of measure called a DTU, or Database Throughput Unit. DTUs are meant to allow comparison between the new tiers. Currently, there is no direct comparison between the new tiers and the old (Web/Business) tiers, possibly because there was no performance SLA at all in the old tiers.

DTUs are based on throughput in a documented benchmark (the Azure SQL Database Benchmark, see http://msdn.microsoft.com/en-us/library/azure/dn741327.aspx). The benchmark and how it’s run are described in a nice level of detail. However, it would be better if the source code and running instructions were released in the future. Unless they’ve already been released and I missed it.

For now, the DTU is a nice way to compare the tiers, and a known benchmark is a nice thing, but, to reiterate, there’s no way to ensure that you’re getting your bang for the buck. And remember, at this point, any “smoke tests” you do on your own are being performed against a preview, not production. With Basic, it sometimes seems to take “a long while” (a nebulous term) to connect the first time to a new Basic tier database. After that, it’s faster (another nebulous term). Others have reported that Basic/Standard is slower than Web/Business on self-invented performance tests. It would be nice, before the new tiers go GA, if Microsoft ran the benchmark against a traditional Web/Business database (maybe a few times, taking the average, but posting all the detail runs) just to assuage folks’ fears before they need to convert to the new tiers. MHO… After everyone’s converted, we can start talking about DTUs again, and they’ll become more interesting and meaningful.

BTW, to get information about your databases (new or old tiers) in T-SQL, just use the query below against the master database. It has all of the tier and objective information. There is some redundancy in the metadata, so start with SELECT * and choose the information you’d like to see:

SELECT o.name, o.description, st.*, do.*
FROM dbo.slo_objective_setting_selections sl
JOIN dbo.slo_service_objectives o
  ON sl.objective_id = o.objective_id
JOIN dbo.slo_dimension_settings st
  ON sl.setting_id = st.setting_id
JOIN dbo.slo_database_objectives do
  ON o.objective_id = do.current_objective_id
ORDER BY o.name;

Cheers, @bobbeauch

Last week in Azure SQL Database – Part 3 – HADR preview service for premium

This post is about a new Azure SQL Database “business continuity” feature called “Disaster recovery/geo-replication”. This feature was announced last week as a preview. For the Premium tiers, this is a lovely feature that includes “Active geo-replication” (their term) and cmdlets (and portal support) for controlling it. For Basic and Standard tiers, you can use “Database copy + Manual export” according to the chart here: http://msdn.microsoft.com/library/azure/dn741340.aspx. By now, you know what “Database copy + Manual export” means (BACPACs, CREATE DATABASE AS COPY OF).

Terms concerning geo-replication are documented here: http://msdn.microsoft.com/en-US/library/azure/dn741339.aspx and there’s an entire docs section devoted to how to use it. A few points need to be called out:
1. Secondaries are readable.
2. You can create secondaries in the same region/different server or different region.
3. The target server must have an available Premium database quota to create the active secondary. You start with a quota of 2 and can “request” (from Azure support?) to have your quota increased.
4. The secondaries are always transactionally consistent but can run behind the primary.
5. Secondaries must be the same performance level (P1, P2, P3) as the primary. This has charge repercussions, as you could infer (nowhere is it directly stated, but it seems obvious) that each secondary costs the same as the primary.
6. Each primary and secondary consists of a database that is itself replicated (passively and unseen by you) three times, because this is the design of ASD. And there is no indication that this internal design has changed.

You start a continuous copy to the secondary using the portal or the (soon-to-be-released) PowerShell cmdlet Start-AzureSqlDatabaseCopy. This starts a seeding process, which eventually catches up to close to the primary. You can monitor this. You can stop or cancel a continuous copy by using the portal, the DROP DATABASE command on the copy, or Stop-AzureSqlDatabaseCopy. You need to use the -ForcedTermination parameter of Stop-AzureSqlDatabaseCopy to cancel the operation while it is still in progress or to stop the replication immediately. See http://msdn.microsoft.com/en-us/library/azure/dn741337.aspx.
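If you’d rather check on the copy from T-SQL than from the portal, something like this should work. It’s a minimal sketch, assuming the documented DMVs (sys.dm_database_copies in master, sys.dm_continuous_copy_status in the user database) are available in the preview:

-- Minimal monitoring sketch; assumes the documented geo-replication DMVs are present.
-- In the master database of the primary server: one row per copy relationship.
SELECT * FROM sys.dm_database_copies;

-- In the primary (user) database: seeding/catch-up state of the continuous copy.
SELECT * FROM sys.dm_continuous_copy_status;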

There are two ways to fail over: planned or forced. A planned failover is performed only on the primary. A forced failover can be performed on the primary or the secondary. Their terms for the types of disaster recovery topologies that can be designed are:
1. Active-passive compute with coupled failover
2. Active-active compute with decoupled failover
3. Active-passive compute with decoupled failover

I know some folks in the disaster recovery biz have a problem with the terms active-active and active-passive, so perhaps these terms will change in the future. There are a few recovery scenarios documented; I’d like to try these with the PowerShell cmdlets when they appear.

This is an excellent feature for the Premium tier but, to reiterate, nothing new in this space is currently announced for Basic and Standard beyond the original, built-in and internal, 3 database replicas that ASD has always had. A change to that internal single-database design was not announced.

In the next post (which will be the last post, if I can make it short enough), I’ll talk about performance levels in the tiers and comment on the current state of ASD functionality.

Cheers, @bobbeauch

Last week in Azure SQL Database – Part 2 – New preview services

Note: Well that was quick. I’ve updated this blog entry (same day) to reflect clarifications provided by a member of the Azure SQL Database team. Thanks for these excellent clarifications. For now (I may go back and change this later) changes from the original blog post are indicated with italics.

The last post in this series was about the new tiers/pricing in Azure SQL Database (ASD). This post will be more exciting, as it covers the new services that come with the new tiers. I’m talking about what the announcement (and docs) call “Business continuity features”. To summarize, these features are Self-Service Restore and Disaster Recovery – Geo-Replication.

Although the docs and the chart at http://msdn.microsoft.com/library/azure/dn741340.aspx show these features as available on all new tiers, currently these services only appear on Premium. And the PowerShell cmdlets mentioned in the docs aren’t in Azure PowerShell 2.3. I was told the cmdlets “will be out this (Apr 28) week”. See the chart for how the new features are implemented on the different tiers.

One final thing about using “CREATE DATABASE .. AS COPY OF” and the preview. Last year’s Premium preview created a copy that was in a “disabled premium” state. The new preview will create a copy at the same level, so, for example, “CREATE DATABASE .. AS COPY OF” with a P2 database will create a P2 database. This has charge repercussions.
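For reference, a database copy is just T-SQL run in the master database of the destination server; the names here are made up:

-- Hypothetical names: copy the P2 database 'mydb' on server 'sourceserver' to 'mydb_copy'.
-- Under the new preview the copy comes up at the same performance level (P2 here),
-- and is billed accordingly.
CREATE DATABASE mydb_copy AS COPY OF sourceserver.mydb;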

First, Self-Service Restore. Microsoft keeps (and has always kept) database backups at their data centers. I’m guessing these are “traditional” database backups (not BACPACs). BUT, you can’t use their backups directly, because Backup and Restore are not supported on ASD; Export and Import are. Self-service restore is a way you can have them use THEIR backups to restore an ASD database. There are two flavors of self-service restore:
1. Restore a copy of a currently existing database as of a point-in-time. Perhaps you deleted a table, for example, or some data with a miscoded SQL DELETE statement. It happens…
2. Restore a copy of a database you deleted by mistake, or that you simply want back. The database doesn’t exist now.

I’ve heard both of these referred to as “oops recovery”. I’m thrilled with this service, even though you and I have never made a mistake, right? ;-)

Use the portal (see http://msdn.microsoft.com/en-us/library/azure/dn715779.aspx) or the PowerShell cmdlets Start-AzureSqlDatabaseRestore (for a Standard or Premium database) or Start-AzureSqlDatabaseRecovery (for a Basic database, because it doesn’t have point-in-time recovery), and a restore request will be submitted for you. To restore a deleted Standard or Premium database, just restore to a point in time before you deleted it. There is no SLA on *how long* the request will take to process. I couldn’t even get a ballpark figure, because it depends on the size of the database and the amount of recent activity. You can, however, get information about the status of the restore operation. You can even get this in T-SQL with the sys.dm_operation_status metadata view.
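For example, something like this in the master database should show the progress of a restore request (the restored database name below is made up):

-- Status of service-side operations, including restore requests.
-- Run in the master database of the logical server; 'mydb_restored' is a hypothetical name.
SELECT *
FROM sys.dm_operation_status
WHERE major_resource_id = 'mydb_restored'
ORDER BY start_time DESC;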

Unlike “CREATE DATABASE .. AS COPY OF”, self-service restore produces a database of the same tier but the lowest performance level in that tier. For example, restoring a P3 database creates a P1 database as the copy. This lessens the charge repercussions, but you do, of course, pay for at least one day of the copy. If you’re just using it to recover a table, don’t forget to delete the restored copy when you’re done. The database that’s created by a restore request can have the same name as the original or a different one, and is always created on the same logical server in the same data center. So to use a restored copy, you’ll need to change connection strings to point to the new database name, or choose the same name when you submit the restore request. You may also want to increase the performance level if you want to use the copy in place of the original afterwards.

If you’re the kind of person who wants their own backup (to Import the data on-premises, for example), you’ll still need to use Export/Import and BACPACs. The backup/restore capability is not available to you. BTW, if you used the Automated Export service (in preview itself) with Web/Business (it produces BACPACs to Azure Storage on a schedule), this is NOT currently available on the new tiers (at least not that I could see on the portal). No announcement of when/if it will be.

To reiterate, the level of self-service restore (the length of backup retention and whether you get point-in-time restore) depends on the service tier. See the chart referred to above. Also, here’s a clarification of what “Most recent daily DB backup in past 24 hours” means for the Basic tier. For each database, the service manages several types of backups: a full backup created once a week, a differential backup created once a day, and transaction log backups created every 5 minutes. The first two are also copied to Azure Storage, and that is what’s referred to as “daily backups”. The actual times those backups are created differ, so the only guarantee is that they will not be older than 24 hours. Consequently, if a database is recovered using Start-AzureSqlDatabaseRecovery, the data loss (RPO) will be less than 24 hours.

As this post is getting too long, I’ll save disaster recovery – geo-replication for another post.

Cheers. @bobbeauch

Last week in Azure SQL Database – Part 1

This blog post is especially dedicated to those who attended my SQLIntersection post-conference talk on WASD just over one week ago. The announcements were made after the talk, so the information here is mostly a delta for those who want to catch up. I think the post series will be useful above and beyond that, however. I’ll try not to paraphrase the announcements, but I may need to quote from them or provide URLs. I’ll also tell you where I’m currently a bit unclear about things.

Last week, Microsoft made some major announcements about the future of Azure SQL Database. As an aside, the announcements referred to it as “Azure SQL Database” (although there is one usage of the name Microsoft Azure SQL Database), so I’m going to refer to it by a new made-up acronym, ASD, rather than my previous acronym (WASD) for its previous name, Windows Azure SQL Database. “Azure SQL Database” is just too long. For folks who get the two confused, ASD is a platform-as-a-service (PaaS) database offering based on the SQL Server code, as opposed to “SQL Server on an Azure VM”, which is an IaaS offering. With ASD, you don’t need to maintain a guest OS or maintain the SQL Server software yourself. I’ve written about this previously.

The first announcement was a change to the tiering and pricing structure. ASD goes from 2 tiers (Web and Business) to 6 tiers (Basic, Standard S1 and S2, Premium P1, P2, and P3). Even more important, the charges go from being database-size-based to tier-based. To see the new pricing, go here (http://azure.microsoft.com/en-us/pricing/details/sql-database/), scroll down, and click the “Basic, Standard, and Premium” button to see the chart. The difference between the tiers (besides the size limit) is the level of service. More on that later. The chart mentions “Database size limit (included)”, but I’m not certain there isn’t SOME charge based on database size (as was the case with Web and Business). I can’t find one if there is, and the word “included” tends to suggest that there isn’t an additional per-GB data size charge. One (1) day is the minimum charge and charges are daily. I guess 1 day means “24 hours after you create the DB”, but I haven’t tested that yet.

You’ll have 1 year to convert from Web/Business to Basic/Standard/Premium. The announcement doesn’t say what will happen if you don’t. The new tier structure is in “preview mode”, which has two main consequences. First, to use preview mode you forgo your SLA until the new tiers become GA. Second, to try it out, you need to sign up/opt in at http://azure.microsoft.com/en-us/services/preview/. Scroll until you see “New Service Tiers for SQL Databases” and click “Try it”. If you have multiple Azure subscriptions, you must do this once per subscription. It was almost immediately visible for me on both the old and new Azure portals.

The new service tiers use different sets of hardware, so you can’t use a new tier on an existing “server”; you need a new server. The new tier choices also show up when you Import a BACPAC into a database (on the portal, +New/Data Services/SQL Database/Import). On “Import” and “New/Custom Create” I get choices of all the old and new tiers. They don’t show up on “Quick Create” in the same path, however; when I tried it (after making the new tiers available), “Quick Create” (on a new server) created a Basic (new-tier) database. I’m not sure I like that, because the preview tiers have no SLA.

You can also convert an existing database between tiers. On the portal, choose your database; it’s under “Scale”. Scale (on a Basic database) didn’t give me Web/Business as a choice, only the new tiers, as you’d expect. Be careful with this because, if I read the charges right, switching from Basic -> P1 Premium -> Basic will cost you $15 (the cost for 1 day of P1). I haven’t tried this yet to find out.

There’s much more to come, but I do want to close this blog entry by quickly mentioning two other things:
1. When the new tiers go GA, the uptime SLA goes from 99.9% to 99.95% for all new tiers.
2. On the same day, it was announced that the Azure SQL Database Federations feature will also be discontinued after a year (April 2015). More on this later.

Cheers, @bobbeauch

SQL Server 2014 In-Memory OLTP: What exactly is a “dusty corner”?

There’s a set of performance counters for the new In-Memory OLTP in SQL Server 2014. You might have overlooked them because they’re not with the other SQL Server counters, but are in their own group that begins with “XTP” (eXtreme Transaction Processing).

Looking over those counters, both the XTP Garbage Collection and XTP Phantom Processor groups contain a counter that refers to “Dusty corner scan retries”. So what, exactly, is a dusty corner? Searching both Books Online and the “SQL Server 2014 In-Memory OLTP TDM White Paper” yields no references, but the academic SIGMOD 2013 paper on Hekaton contains a helpful description.
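If you want to look at these counters from T-SQL rather than PerfMon, a query along these lines should find them (the LIKE patterns are deliberately loose, since the exact object names carry an instance-specific prefix):

-- Find the XTP dusty corner counters on a SQL Server 2014 instance.
-- Loose LIKE patterns, because object_name includes an instance-specific prefix.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%XTP%'
  AND counter_name LIKE '%Dusty corner%';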

The term has to do with how the In-Memory OLTP feature stores data in memory and how its garbage collection works. Because a single copy of a row’s data is stored along with multiple index pointers (both hash and Bw-tree indexes are ultimately pointer-based), all of a row’s index pointers must be “unlinked” before an old row version can be garbage collected. Since threads can unlink old pointers while running queries, any index ranges with a lot of activity will quickly unlink the appropriate pointers. However, if an index or index range is rarely used, special scans by the system garbage collector are needed to find these pointers. They’re “hiding” in the dusty corners (apparently dust forms on unused index ranges; I envision Carol Burnett, duster in hand ;-).

So the presence of dusty corner scans means some index ranges aren’t being used much. If, by looking at index usage, you can determine that an entire index is almost never used (Database Engine Tuning Advisor doesn’t tune indexes for in-memory tables, as far as I’m aware), that index would be a candidate for removal. However, in-memory tables don’t support filtered indexes, so if another part of the index range is frequently used, you’ll have to decide whether it’s worth letting old versions hang around longer, until those dusty corner scans unlink that last pointer.
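One way to look at index usage for memory-optimized tables is the per-database XTP index DMV. Here’s a minimal sketch, assuming sys.dm_db_xtp_index_stats and its documented columns:

-- Rough sketch: which indexes on memory-optimized tables are rarely scanned?
-- Run in the database that contains the memory-optimized tables.
SELECT OBJECT_NAME(s.object_id) AS table_name,
       i.name AS index_name,
       s.scans_started,
       s.rows_returned
FROM sys.dm_db_xtp_index_stats AS s
JOIN sys.indexes AS i
  ON i.object_id = s.object_id AND i.index_id = s.index_id
ORDER BY s.scans_started;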

I’ll be doing preconference seminars covering all the SQL Server 2014 features in detail at SQLIntersection in Orlando and DevWeek London in April, as well as a performance-centered new-and-old-feature seminar at AddSkills in Stockholm later this month. If you’re at any of these locations and would like to see how these features would help you and your clients, I’ll see you there.

Cheers, @bobbeauch

Managing Windows Azure SQL Database’s newest feature – Premium Database

The latest feature to make an appearance in Windows Azure SQL Database (actually, it’s still a preview program) is the Premium database. Physically, when you have a WASD database, you’re sharing the server that hosts it with others. It’s possible that if one of your database “neighbors” decides to run a stress test, your database (and you) could be stressed as well. This is known as the “noisy neighbor” syndrome. Just like with noisy neighbors in a hotel, you can call the management and complain, and they may even “move” you, just like a hotel would. To guarantee a certain level of resources, WASD made the Premium database available.

There are two different reservation sizes, P1 and P2, and these sizes are defined in terms of CPU cores, concurrent active workers, number of sessions, data IOPS, and memory. The relevant reservation size guarantees you these resources. The reservations are defined using the term “service level objective” (SLO).

Once you’ve been accepted into the preview program, you can create a new Premium database using the Azure portal or the Azure PowerShell cmdlets. You can also upgrade an existing database to Premium, downgrade from Premium to “regular” (subject to limitations), and change the reservation size. BTW, the cmdlets are a boon for those of us who would like to manage WASD but don’t want to perform repetitive operations via a web-based GUI. I strongly suggest you give them a try if you’re doing any Azure work.

One of the things you need to get used to when moving from SQL Server to WASD is that WASD does not expose all the DMVs and metadata views that SQL Server does, BUT ALSO, WASD contains some DMVs and metadata views that SQL Server does not have. To work with the Premium database feature, there are six new metadata tables, all beginning with “slo”, in the dbo schema of the master database. For example, if I have a premium database named “consistant_db”, I can monitor the premium change history with this SQL query:

SELECT *
FROM dbo.slo_assignment_history
WHERE database_name = 'consistant_db'
ORDER BY operation_start_time DESC;

I can also monitor resource activity by looking at worker_counts in sys.resource_stats, to make sure I’m getting my money’s worth or to decide when it’s best to upgrade to (or downgrade from) Premium.
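For example, something like this (same hypothetical database name as above; I start with SELECT * because the exact column list has changed over time):

-- Recent resource usage history for the database, as recorded by the service.
-- Run in the master database; worker_counts is the column of interest here.
SELECT *
FROM sys.resource_stats
WHERE database_name = 'consistant_db'
ORDER BY start_time DESC;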

Keeping on top of WASD changes, as well as knowing how and when to use WASD can be, like keeping up with anything in the Azure space, chasing a moving target. At the SQLIntersection conference in Orlando in April, I’ll be presenting a full day post conference workshop on “Windows Azure SQL Database from A to Z”. Join me there for the whole story on managing, developing on, and interacting with this beast.

Cheers, @bobbeauch

SQL Intersection in April, I’ll be there!

In mid-April, April 13-16 to be exact, the spring installment of SQL Intersection comes to Orlando. I’ll actually be there a little earlier and leave a little later than that. This conference has a whole lineup of speakers who like to add a little depth to how you’re used to thinking about your databases. There are just too many excellent speakers to list them all; you’ll need to peruse the lineup for yourself.

On the Saturday before the conference, I’m doing a full day of SQL Server 2014 material. This includes some in-depth material on the new storage engine/execution engine/multi-version concurrency engine that was previously known as Hekaton. But it’s not just about Hekaton; I’ll cover everything database-related that’s in the new release, like the new columnstore and cardinality estimation enhancements.

During the conference I’ll compare and contrast Windows Azure SQL Database (WASD) and SQL Server in an Azure VM, for those of you investigating going “to the cloud” and trying to decide on the proper vehicle. I’ll also have a session on that age-old debate about ORM-based vs. stored procedure-based applications, and a look at what Microsoft is contributing to Hadoop, along with their Azure-based implementation.

After the conference, I’ll have a WHOLE DAY to tell you all about WASD for folks who want to learn about that in more depth. It’s called “Windows Azure SQL Database from A-Z”.

Sounds like a fun week. If you’re at the conference, don’t forget to stop by and say hi.

Cheers, Bob

@bobbeauch

Expertseminarium in Stockholm on performance and SQL Server 2014

This March, I’ll be presenting a 2-day seminar in Stockholm, part of the AddSkills Expertseminarium series, on “Performance and new developer features in SQL Server 2014”. I’ll start with techniques to measure and monitor query performance in general, as well as troubleshooting query performance problems. From there, I’ll cover the features in SQL Server 2014 that were designed to give your SQL the best performance. Which features are those? It turns out that it’s just about all of them!

There’s more information and a more detailed outline at the AddSkills website, or you can contact them directly. Hope to see you there!

Cheers, @bobbeauch

Getting close to SQLskills developer immersion event in February

I’m gearing up for my first SQLskills Immersion Event of the year, Immersion Event for Developers (IEDEV) in Tampa on Feb 17-21.

This one is getting kind of jam-packed with material because, in addition to coverage of T-SQL and its associated topics like query and procedural code performance and transactions and locking, this event will round out your toolbox. There’ll be coverage of specialized techniques such as coding with SQLCLR (.NET procedures, types, and aggregates), SQL and XML, Service Broker, spatial, and full-text search, as well as designing for tracking changes. As with T-SQL, when using these special techniques the emphasis is always on performance and using the right tool for the job. We’ll see a lot of query plans and delve into details about specialized index usage. I’ve also included some information and demos on how SQL Server 2014 features fit into the big picture.

Registering soon will give you time to get geared up too, for the sheer amount of information we’ll cover. And, as always, there’ll be some after-hours events and giveaways. Arrive well-rested… If you have specific questions about the coverage, feel free to send me mail beforehand. Hope to see you in sunny Tampa in February!

Cheers, Bob
@bobbeauch

Compiled T-SQL? Don’t throw away all your SQLCLR code just yet

The speed increase from memory-optimized tables in SQL Server 2014 is nice, but I thought that, to get the most bang for your buck, you really need to use compiled stored procedures in addition. After looking at some of the C code these produce, it almost looked like I was doing direct vector branching into the table’s specific C-based access routines. In the meantime, I’d worked with a friend of mine who was trying to convert some custom SQLCLR code to compiled T-SQL. He was using table variables in place of CLR objects and, although the code ran way faster than the non-compiled T-SQL code, it was still 90% slower than his SQLCLR equivalent.

In general, you’ll get the biggest speed increase from a compiled sproc if you do lots of logic in code, or use nested loops in plans that return lots of rows. I thought I had just the use case.

Once upon a time, someone had given me code that did a lot of calculations in both SQLCLR and T-SQL. The SQLCLR code uses 2- and 3-dimensional arrays as well as some data access. The T-SQL code was an unoptimized, line-by-line port of the SQLCLR code that used multiple temporary tables and a cursor as the outer iterator. You could remove the outer cursor, and still SQLCLR performed 100 iterations (100 products in this case) 7-8 times faster than T-SQL performed 1 iteration (1 product). In other words, SQLCLR was about 700-800x faster than T-SQL. Folks used to tell me this was “cheating” because of the large number of SQL queries the T-SQL code did (100s of queries, many using INSERT INTO a temp table). Nevertheless, it was worth a try.

Converting the code was pretty straightforward:
- Replace temp tables with strongly-typed in-memory table variables
- Make sure each table variable has a key and index (they’re required in in-memory table variables)
- Change a subquery into another select into a table variable + main query (compiled sprocs don’t support subqueries yet)
- Make sure everything follows the rules for schemabinding
- Don’t use the outer cursor, so the compiled T-SQL code either performed a single iteration or called the compiled proc from within non-compiled T-SQL cursor code
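To make the pattern concrete, here’s a minimal sketch of what that conversion looks like. All object names are made up, and it assumes a SQL Server 2014 database that already has a MEMORY_OPTIMIZED_DATA filegroup and a memory-optimized table (dbo.Sales here); it illustrates the syntax, not the actual code from the test.

-- Strongly-typed, memory-optimized table type to replace a #temp table.
-- An index is required; here a hash index doubles as the primary key.
CREATE TYPE dbo.CalcWorkTable AS TABLE
(
    ProductId int NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024),
    Amount    decimal(18,2) NOT NULL
) WITH (MEMORY_OPTIMIZED = ON);
GO

-- Natively compiled procedure skeleton: schemabound, owner-executed, and the body
-- is a single atomic block with the required isolation and language settings.
-- dbo.Sales is a hypothetical memory-optimized table (native procs can only
-- reference memory-optimized tables).
CREATE PROCEDURE dbo.RunCalcForProduct @ProductId int
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    DECLARE @work dbo.CalcWorkTable;

    -- Subqueries aren't supported, so intermediate results go into the table variable...
    INSERT INTO @work (ProductId, Amount)
    SELECT ProductId, Amount
    FROM dbo.Sales
    WHERE ProductId = @ProductId;

    -- ...and the calculation is then done against the table variable.
    SELECT SUM(Amount) AS TotalAmount
    FROM @work;
END;
GO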

I also had to wait until CTP2, because the procedure used DATEADD. That wasn’t supported until CTP2 (good choice of what to support sooner, SQL team; thanks for adding the date functions).

The results for switching to in-memory tables were encouraging:
SQLCLR – standard tables – 5 sec
TSQL – standard tables – 37 sec (x100 = 3700 sec) – 740x slower
TSQL – in-memory tables – 10 sec (x100 = 1000 sec) – 200x slower
SQLCLR doesn’t support in-memory tables through the context connection. I didn’t try it with an external connection back into the same instance.

Switching to a compiled sproc (and pre-compiling the sprocs to factor out the first-time compile overhead for each):
TSQL – in-memory tables – 10 sec (x100 = 1000 sec)
Compiled TSQL (which only supports in-memory tables) – 10 sec (x100 = 1000 sec)

So for this particular use case (just a straight port; I didn’t try to optimize the code), the difference was minimal. BTW, I’m not pretending this is a benchmark or anywhere near one, just a quick test and some round numbers for comparison.

Then I took a harder look at my code. The SQL code was doing a large *number* of SQL statements, each of which used and returned a small number of rows. The computations were done not in program logic, but with iterations over SQL statements (not a good way to do custom computations in ANY case). A horrible use case, I guess. Lessons learned:
-In-memory tables are not multidimensional arrays or custom “objects” as far as speed is concerned.
-SQL statements still have overhead compared to in-memory operations, even with simple SQL in a compiled sproc.
-Iterating with SQL statements doesn’t substitute for aggregations and computations done in a single SQL statement (I knew that one already).
-Remember where the sweet spot is for compiled stored procedures.

So folks, don’t throw away your custom SQLCLR sprocs that do custom computations just yet. But do give it a test. And push as much of the computation as possible into the SQL statement and optimize those SQL statements.

Cheers, Bob
@bobbeauch