Query Store Questions: 24HOP follow up


A couple weeks ago I presented for the 24HOP Summit Preview and I had a lot of great general questions about how Query Store works.  My session title was “Why You Need Query Store” (you can watch it here) and I only had about 45 minutes.  As you can probably guess – since I have a full day pre-con on the topic – I can talk about Query Store for a loooong time 🙂  The 24HOP session was really focused on getting folks to understand THE most important things that Query Store provides to show why it’s needed.  I left out a lot of details about HOW Query Store works because talking through it is the fun stuff that I’ll dive into during the pre-con.  I did have a good number of questions from attendees related to specific functionality, and I promised to write a post answering them.  Questions* and answers are below…if you need clarification on anything, please leave a comment and I’ll follow up!

*Questions copied exactly as they were shared with me, I did not try to re-word or make any inferences about what was being asked.


1. Can I get Query Store data for a production database deployed on a client site, that I don’t actually have access to myself? Can the DBA send me something I can use in my own development environment?

A: You provide instructions to the client that explain how to enable Query Store, either through the UI or with T-SQL.  If you want to view that Query Store data, the client can either send you a backup of the database or create a clone with DBCC CLONEDATABASE and share that.


2. If a user executes a stored procedure from ‘master’ it is not captured?

A: If you have Query Store enabled for a user database, and you execute a query against that user database from the context of the master database, it is not supposed to be captured in Query Store for that user database.  But in my testing, it is.  But it is not supposed to be, so there is no guarantee it will always work that way.


3. If your database is part of an AG the data you look at can be different based on the server it is running on at that time, correct?

A: I’m not quite clear what’s being asked, but I wrote a post about Query Store and Availability Groups, which will hopefully answer the question.


4. Is it easy to remove the forcing of a given plan?

A: Yes, just use the “Unforce Plan” button in the UI, or use the stored procedure sp_query_store_unforce_plan (you supply the query_id and plan_id).


5. If you have 3+ plans how does SQL Server decide which plan to use?

A: I assume this is specific to the Automatic Plan Correction feature, and if so, it will force the last good plan (most recent plan that performed better than the current plan).  More details in my Automatic Plan Correction in SQL Server post.


6. What equivalent options we have for lower versions?

A:  There is an open-source tool called Open Query Store for versions prior to SQL Server 2016.


7. Why are the trace flags not on by default? given the issues with AlwaysOn and QS

A: Great question.  Trace flag 7752 will be default functionality in SQL Server 2019.  TF 7745 is not default functionality because, I suspect, of the potential for losing Query Store data…and SQL Server wants you to make a conscious choice about that.  More details in Query Store Trace Flags.


8. How would you use Query Store to troubleshoot Views?

A: Query Store does not differentiate between a query that references a view and a query that references a table.  It does not capture the object_id of the view and store that in Query Store (as happens for a stored procedure), so you have to look specifically for the view name in the query_sql_text column (within sys.query_store_query_text) to look for queries that reference the view.


9. Is there any way to make use of Query Store in readonly secondary AG replicas?

A: You can read data from the Query Store views on a read-only replica, but you cannot capture data in Query Store about queries executing against the read-only replica.  See my post referenced in #3, and then please up-vote this request: Enable Query Store for collection on a read-only replica in an Availability Group.


10. Is it possible that a query store run from one instance to another instance for example I want check the queries of production from dev instance?

A: If you can connect to the production instance from the dev instance, and have appropriate permissions, then you can query the Query Store data on the production instance (but the data exist in the production database).


11. If I execute a parameterized query with OPTION (RECOMPILE), will Query Store have the parameter values of every execution?

A: No.  The plan will have the values used for the initial execution that generated said plan, but values for every individual execution are not captured (it would generate excessive overhead to capture every execution).


12. Can Query Store supply the T-SQL to force plan?

A: The UI does not provide an option to script forcing a plan for a query, but if you are using Automatic Plan Correction, the T-SQL to force it can be found in sys.dm_db_tuning_recommendations.


13. Will there be any significant performance overhead by using query store?

A: See Query Store Performance Overhead: What you need to know


14. How does it function when queries span multiple databases?

A: As alluded to in question #2, cross-database queries are tricky.  You should work under the assumption that if you execute a query from Database_A, where Database_A has Query Store enabled, it will be captured.  If you execute a query from Database_A that queries both Database_A and Database_B, and both databases have Query Store enabled, it will ONLY be captured in Database_A.


15. It seems to be working for me, but sometimes not

A: I would love to help you out, just not sure of the behavior you’re seeing and what your question is.


16. How do you get the full query text from inside the ‘Top Resource Consuming Queries’ windows?

A: Click on the button with the grid and magnifying glass, which says “View the query text of the selected query in a query editor window” when you hover over it.

Button to display query text to help understand how query store works

Button to “View the query text of the selected query in a query editor window”


17. Is the data will be stored in Query Store After the adhoc/SP completed or it will do while is running?

A: Once the plan has been compiled for a query, the query text and plan are sent to Query Store.  When execution completes, the runtime statistics are sent to Query Store.


18. If we change the compatibility to SQL 2012 or lower, will that affect Query store?

A: No, Query Store functions in SQL Server 2016 and higher, and Azure SQL Database, regardless of your compatibility mode.


19. If we drop a SP, will that clear the history of that SP plans in the query store?

A: No, but…If you use DROP PROCEDURE syntax, then the object_id column in sys.query_store_query will no longer reference an existing object (in sys.objects).  The query and plans will stay in Query Store until they are aged out based on the retention policy.


Follow Up

Again, if any answers are unclear, leave a comment and I can clarify.  If you are interested in learning more about Query Store I would love to see you in my full day session at the PASS Summit!  It’s on Monday, November 5th, and you get more details here: Performance Tuning with Query Store in SQL Server

SQLskills at the PASS Summit

The PASS Summit is coming up this fall in Seattle, and it will be here before you know it.  Though I want to mention that I am thoroughly enjoying summer and really not ready for pumpkin anything!  I’m returning to Summit this year, as are Jonathan and Glenn.  We are all looking forward to the conference and with the schedule announced this week, I wanted to include some more details about  the full day pre-conference sessions that Jonathan and I are each presenting.  Lucky Jonathan, we are presenting together as part of the Developer Learning Pathway along with fellow Clevelander Bert Wagner (one of my favorite peeps, ask him about coffee!).

I’ve included links to all of our sessions below, and since Jonathan and I are teaching on consecutive days, you can attend both our workshops 😊

My pre-conference session will dive deep into how to leverage Query Store to detect and correct performance issues, both with existing SQL Server 2016 and newer environments, and Jonathan’s session will teach you how to properly use the new features in SQL Server 2017 and SQL Server 2019 to get better performance.  If you’re planning an upgrade OR moving to Azure, both of these workshops are applicable.  We think these sessions complement each other very well, and even though we aren’t presenting together, we will reference content/concepts discussed in the other session.  Do you have to attend both to make sense of it all?  Definitely not.  But we do have a ton of great information to cover and are already working on new demos to showcase the latest and greatest in SQL Server.

If you have questions, please email us!  You can email me or Jonathan directly to ask about our pre-conference sessions if you don’t have a clear picture after reading the abstract.  We know that there are a lot of great workshops at Summit, and we definitely want to make sure you learn what will help you most.  Glenn, Jonathan and I look forward to seeing you in November!

Erin’s workshop on Monday, November 4, 2019: Performance Tuning with Query Store in SQL Server

Jonathan’s workshop on Tuesday, November 5, 2019: Bigger Hardware or Better Code and Design?

Erin’s general session on Wednesday, November 6, 2019 at 3:15PM: Understanding Execution Plans

Glenn’s half-day session on Wednesday, November 6, 2019 at 3:15PM: Dr. DMV’s Troubleshooting Toolkit

Jonathan’s general session on Thursday, November 7, 2019 at 10:45 AM: Eliminating Anti-Patterns and RBAR for Faster Performance

Glenn’s general session on Friday, November 8, 2019 at 9:30AM: Hardware 301: Choosing Database Hardware for SQL Server 2019



PASS Summit 2017: Day 2

Day 2 is starting here at PASS Summit in Seattle – watch this space for updates the next hour and a half!

Grant Fritchey is kicking off day 2!  I’m a big fan of Grant.  He talks about the community and how people should take advantage of what’s available here at Summit.  I couldn’t agree more – I wouldn’t be where I am in my career without this conference, nor would I have the friends I do all over the world.  Grant hands it over to Denise McInerney, VP of Marketing, and Denise shares her journey within PASS (she’s a 6 year Board Member).  Denise continues driving the point about the value of this conference and PASS.  She then talks about the volunteers for PASS, and announces the winner of the PASSion Award.  This year’s winner is Roberto Fonseca.

Denise is talking about the survey feedback from the past couple years, and session evaluations at PASS.  *PLEASE* complete these by the end of next week (if not earlier) – both PASS and the speakers truly value this feedback.  If you provide additional feedback there is a Board Q&A tomomrrow at Friday, 2PM. Today is the WIT lunch, and Denise announces that next year’s Summit is  November 6 – November 9, 2017.  Denise introduce Rimma Nehme, a Group Product Manager/Architect for Azure Cosmos DB and Azure HDInsight at Microsoft.  Today is going to uncover CosmosDB.  It will be technical!  Let’s go!!

 Do we need another database?

This is the wrong question to ask.  If you look at the rising data problems and needs, most of the production systems today (particularly the ones designed in the 70s and 80s), the modern calls and needs are addressing these problems.  90% of the world’s data was created in the last 2 years alone.  The estimated growth in the next 3-5 years is 50x.  Another trend is global, and another is data is big.  Not just a few TB, but trends of companies processing hundreds of TB to pedabytes.  Every 60 seconds 200 million emails are generated.  Rimma is throwing out ridiculous numbers about the amount of data being generated.  I can’t keep up!

Data is also interconnected.  What you do in Seattle can be connected to another location in the world.  This the butterfly affect.  We are experiencing about 125 exabytes of data (that’s a lot of zeroes).  Companies are looking at ways of extracting that data and monetize that information.  Another trend is the balance of power continues to shift from structured to unstructured data.  About 80% of data originates in unstructured data. Never push the data to computation – push the computation to the data.

When dealing with distributed, you need to deal with a lot of differences.  For example, different architectures.  In 2010 an engineer inside Microsoft observed this and identified that we need a different architectures to deal with these fundamental differences in data at scale.  This is how Project Florence was born, which is the base of what is now CosmosDB.  It was one of the exemplary partnerships between Microsoft Research and the Azure Data team.

At the time they were working to address the problem of the data for large scale applications within Microsoft (e.g. XBox).  They tried the “earthly databases”, building something on their own, and these options weren’t working.  Hence project Florence to meet the internal needs.  A basic set of requirements were laid out:

  1. Turnkey global distribution
  2. Guaranteed low latency at the 99th percentiles, worldwide
  3. Guaranteed HA within region and globally
  4. Guaranteed consistency
  5. Elastically scale throughput and storage, at any time, on demand, and globally
  6. Comprehensive SLAs (availability, latency, throughput, consistency)
  7. Operate at low cost (this is crucial and important! – first customers were Microsoft departments)
  8. Iterate and query without working about schemas and index management (applications evolve over time and rapidly))
  9. Provide a variety of data model and API choices

This manifests into three principals that have evolved

  1. Global distribution from the ground up
  2. fully resource governed stack
  3. Schema-agnostic service

It is very hard to build any service (particularly with such requirements).

If it was easy, everyone would do it (via NASA).  So this is how CosmosDB was built.  This is used internally by Microsoft.  It is one of the fastest services in the cloud.  It is a ring-0 service, meaning it is available in all regions by default.  It is millions of lines of C++ code.  It is 7 years in the making, it is truly not a new services.  Here is what it looks like (a bit of marketing here).

The foundations of the service for a globally distributed, massively scale-able multi–model database service are

  1. comprehensive SLA
  2. five well-defined consistency model
  3. guaranted low latency at t the99th percentile
  4. elasticscale out of storage and throughput
  5. and…

Fine grained multi-tenancy.  This cannot be an after thought.  From left to right, you can take a physical resource like a cluster and dedicate to a single tenant (e.g. customer or database).  You can take an entire machine and dedicate.  You can go another step and take a machine to homogeneous customers.  The final level of granularity is taking that machine and dividing between heterogeneous tenants and providing performance and scalability.

In terms of global distribution, Azure has 42 regions world wife…36 are available, 6 are still being  built out.  You can span your CosmosDB across all of those regions.

Within a database account you have a database.  Within that you have users and permissions.  Within that CosmosDB is a container.  A container of data with a particular data model.  Below that are other user defined code.  The database may span multiple clusters and regions and you can scale it in terms of these containers.  It is designed to scale throughput and storage INDEPENDENTLY.  How is the system designed by the scene (10K foot view)?  Within regions there are data centers, with data centers there are stamps, within that there are fault domains, within that there are containers and within that the replicas.  Within the replicas are the data.  On the database engine this is where the secret sauce comes in – bw-indexes, resource manager, log manager, IO manager, etc.  On any cluster will see thousands or hundreds of tenants.  Need to make sure that none of the tenants are noisy.

Another tenant that is import is the concept of partitioning.  How does CosmosDB solve this?  The tenants create containers of data and behind the scenes these are partitions.  The partitions are comprised are 4 replicas.  This is consistent and reliable.  Each one is a smart construct.  Out of those partitions, you can create partition sets.  These can then span clusters, federations, data centers, regions.  You can overlay topologies to implement solutions that span across multiple regions across the planet.  You need to make sure that the applications then work really well (particularly when merge or split partitions set).  You have the partition which is a data block and then you can build the partition set of various topological.

What are some of the best practices?

  1. Always want to select a partition key that provides even distribution
  2. user location aware partition key for access locally
  3. Select a partition key that can be a transaction scope
  4. Don’t want to use the timestamp for write-heavy workloadso

The resource model summary : Resources, Resource model, partitioning model

Core capabilities Turnkey global distribution – this is adding regions with a click.  Yu can come to an Azure portal, you can see the map of the entire world and pick the regions where you want your data to be.  The data is replicated behind the scenes and then its available for access.  You’re not dealing with VMs, cores.  You can add and remove regions at any time and the application does not need to be re-deployed.  The connection between application and database is logical.  This is enabled by multi-homing capability API.  You can connect to physically to the end point.  Another capability is that you can associate priorities with each of the regions.  If there is an outage or failover in a region, the failover will occur in the order of priority, and that can be changed at any time.  Something added for customers is to simulate a regional outage (but don’t go crazy with this says Rimma!).  You can test HA of the entire application stack.

Another capability is being able to provide geo-fencing.  If you come from any other part of the world there can be regulations where data has to present in particular regions, so if data needs to stay withing a location for requirements, that capability is required.

How does AlwaysOn work?  By virtue of partitioning have multiple locations.  One replica goes down, the application will be unaffected.  If partition goes down, the application will go t partition in another region.  If an entire region goes down, the application will go to another region.  The data is always available.

Another area of interest is active writers and active readers in any region.  Right now turnkey provided at database level, but looking to push this down to the partition key level (a true Active Active topology).  Online backups are available, they are stored in Azure blob storage.  The key capability is that it’s intended for “oops I deleted my data”, it”s’ not for a data center going down (that’s hwy you have multiple regions).

Second capability is elastic scale out.  As data size, scale throughput independently.  Could start out with small amount of data and keep adding more and more.  Back end will seamlessly scale.  Transnational data tends to be small, web and content data is medium sized, and social data/machine generated data is much larger.  As data size grows or throughput grows, scale occurs and this happens seamlessly behind the scenes.  This is done with SSDs behind the scenes.

Resource governance is the next component.  As operations occur, they occur RUs.  You provision RUs that you need (how many transactions/sec to you need?).  All replicas (just a partitioning of data) get a certain budget of RUs.  If you exceed, you’ll get rate limited.  At any time can increase provision throughput.  Can then support more transactions/sec.  Can also decrease at any time.

RU is a read-based currency partitioned at granularity of a 1 second.  It is normalized across DB operations.  Cost the operations via machine learning pipelines that cost queries (e.g. scans, lookups, etc.).  Have run machine learning on models on telemetry data, and then calibrate the cost model accordingly for RUs.  ((DATA DRIVEN).  Back to partitioning model: at any time can change throughput and behind the scenes you can specify the throughput (RUs) you want.  Behind the scenes the re-partitioning will occur, and each one will get more RUs to provide the throughput asked for.  This is where splitting/merging partitions matters, but it happens behind the scenes and you don’t have to worry about it.

What about when you add regions?  You want to add more RUs so you don”t starve existing regions.  Those RUs are spread across all partitions and regions.  Rimma shows how one customer elastically provisioned resources during the holiday season to size up to meet demand, and then size down when no-longer needed.  In a 3 day period, Rimma shows a graph of RUs.  At the top end there are 3 trillion RUs.  (THREE TRILLION IN THREE DAYS PEOPLE)

Various nodes have a various number of RUs serving different workloads, and you can look at the different tenants and partitions in there.  Multi-tenancy and global distribution at that level is incredible.

Another tenant: Guaranteed low-latency at 99%.  This was a core requirement because time is money.  From the business part of view, twice as much revenue lost to slowdowns.  So the system is designed.  At 99th percentile, less than 10ms for the reads measured at 1KB document (which is =80-90% of workload).  At average, will observe lower latency (less than 2 ms for reads and 6ms for writes.  How is this accomplished?  Reads and writes from local region and SSDs done.  The database designed to be write optimized and using latch-free database engine.  All data is indexed by default.  This is a fundamental difference from relational databases, here we have automatically indexed SSD storage.  Customer example: application in California and data in far east.  Added another region and then latency dropped.  Over black Friday/cyber Monday latencies less than 10ms for reads and 15ms for writes.  Azure Cosmos DB allows you to be the speed of light.  If have a database in Chicago and have friends in Paris who want to read your data.  If this  was a central database they would request to read the data from Paris and getting that data from Chicago to Paris takes 80-100 ms.  With CosmosDB you get it in less than 10ms because of those regional locations.

The last here is the consistency model in CosmosDB.  How does it go?  When you are dealing with any distributed system, whether databases or other sytem, typically you are faced with fundamental trade off of latency, availability, consistency and throughput.  If centralized database all requests against primary copy.  With global distribution, get geo–replication get HA and low latency.  But what happens if one replica can’t communicate with others and updates are being made?  What kind of consistency guarantees are made?  This can be a problem!  Do you need to wait for data to be synchronized before you serve it?  Do you want strong consistency or eventual consistency?  Do you want the red pill or blue pill?  With a relational database you get a perfect consistency.  They won’t serve the data until quorum is an agreement. The price there is latency.  On the other hand, the whole movement of no consistency guarantees means low latency.  But real-world consistency is not a binary choice as just described.

What about something in between?  The research literature talks about the wild west of consistency models (not one or the other).  A parper recommended is Replicated Data Consistency Explained Through Baseball by Doug terry, a Microsoft Research individual.  Uses real-world examples from baseball.  Depending on who you are in the game, you might get value out of different consistency models.  The engineers asked the question: can we pick out an intermedicate consistency model and is easy to configure, programmable, presents clear trade-offs?  Most real-life applications don”t fall into those two extremes.  Bounded-stateless, session (monotonic reads and writes local to geo location), and consistent prefix (when updates applied, the prefixes are guaranteed to be consistent).

How is this accomplished?  use TLA+ specifications to specify consistency models.  If you don”t know, check out video by Leslie Lampert who is an important figure in how the system was designed.  Leslie was a Touring award winner for Paxis (sp?) algorithm and founding father of what is used in the stack.

Operationalized the five different consistency models.  Using telemetry to see then how those models are used.  Only 3% use strong consistency, 4% use eventual, and 93% are using the three models in between.  Depending on consistency model specified, might need more computational work, which requires RU.  Have to make trade offs accordingly, and you can monetize this and decide what’s’ best.

Comprehensive SLAs…high availability SLAs are not enough.  Microsoft is in the service business, not just the software business.  Typically services don’t give any SLA, or give for HA.  When tried to approach this problem, asked “What are all the things that developers and customers really care about?”  They care about performance, throughput, consistency, availability and latency.  SO this is the first service in the market that has published comprehensive SLAs that are backed up by Microsoft financially.  Can now see that guaranteed if come in and run workload, will get guaranteed performance, throughput, consistency, availability and latency .  Availability tracked at the tenant and partition level in 5 minute granularity.   Customers can see their run time statistics against their SLA.

Service is also multi-model.  Wanted to enable native integration with different data models.  Behind scenes just ARS model (atom-record-sequence).  All models get translated to ARS model.  Very easy for the service to then on-board other data models now and in the future.  If want document and graph, do not need two copies of data, it can be handled by the same set of data.  This is a powerful combination — to look at data through different lenses.

Why schema agnostic approach?  Modern applications that are built in the cloud are not static.  Can start with one schema, add more tables/new columns…need a robust approach to handle these scenarios.  The object model is schema-free.  The data gets stored as-is.  How do you query this data?  Behind the scenes the data is ARS.  At the global scale, dealing with indexes, schema, etc. is a nonstarter.  In CosmosDB there is schema agnostic indexing.  The indexes are a union of all document trees, and can then consolidate into one and only keep unique values.  All of this structure info is then normalized.  It is an inverted index which gives optimal write performance.  Can identify where documents located and then serve up.  The index overhead in practice is pretty small.  There is a published paper, Schema-Agnostic Indexing with Azure Cosmos DB, go read it!

The Bw-tree in Azure Cosmos DB is highly concurrent, non-blocking, optimized for SSDs.  Avoids random writes.  There are three layers, the b-tree, cache and log structured store (see paper).  Rhema is going faster now. I’m not kidding.  Bw-tree is implemented as delta updates.  There is a mapping table to the updates and updates stored as deltas (sounds like in-memory index structure?).

Rimma shows architecture of Query Processing, there are different semantics but have the same underlying components (compiler, optimizer, etc.).  The engine is very flexible and expect that in the future will host other run-times.  The multi-API approach allows native support for multiple APIs.  If want to store data in cloud but not re-write your app, you can.  There are more APIs coming in the future.  What does this strategy enable?

  • No recompilation needed
  • Better SLAs, lower TCO
  • leverage the existing OSS tool-chain and ecosystem and development IT expertise
  • Life and shift from on-premises to cloud
  • No vendor lock-in
  • Symmetric on-premises and cloud database development

All happy databases are alike, each unhappy database is unhappy in its own way (Kyle Kingsbury via Leo Tolstoy).

How run service?  Weekly deployments worldwide.  16 hours of stress testing every day.  It’s like magic factory across the globe.  Also run variant and invariant checks.  Bots that are fixing nodes that might have issues.  Consistency checking and reporting going over the data continually.

In conclusion…wanted to put herself in our shoes. It’s a lot of information to digest…particularly if not invested in this technology.  Why should you care?  Rimma brings up a quote from Darwin:

It is not the strongest species that survive, nor the most intelligent, but the ones most responsive to change.

Can try CosmosDB for free, no need to credit card info, etc.  Childhood dream of going to Cosmos (space) will be fulfilled.

Key points to remember:

  • global distribution, horizontal partitioning and fine grained multi-tenancy cannot be an afterthought
  • schema agnostic database engine design is crucial for a globally distributed database
  • intermediate consistency models are extremely useful
  • globally distributed database must provide comprehensive SLAs beyond just HA

This is a hidden gem, but the bigger message remember the entire NoSQL movement is a counter-culture movement. But the question is how would we build databases if we started today? Without the legacy that we know, would we look at things differently?  Would we focus on limitations or focus on the bigger picture?  Sometimes it is ok to break the rules and try to different things.  Life is too short to build something that nobody wants.  If we focus on real pain points, not the “nice to have things”, but really look at the issues and abandon our constraints and self–imposed limits, we can create a modern SQL.  Rimma ends by thanking Dharma Shukla and entire CosmosDB team.