{"id":4993,"date":"2019-05-30T14:19:28","date_gmt":"2019-05-30T21:19:28","guid":{"rendered":"http:\/\/3.209.169.194\/blogs\/paul\/?p=4993"},"modified":"2019-05-30T15:37:56","modified_gmt":"2019-05-30T22:37:56","slug":"the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload","status":"publish","type":"post","link":"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/","title":{"rendered":"The Curious Case of&#8230; very long failover times with a large ad hoc workload"},"content":{"rendered":"<p style=\"text-align: justify;\"><em>(The Curious Case of\u2026<\/em>\u00a0used to be part of our bi-weekly\u00a0<a href=\"https:\/\/www.sqlskills.com\/join-the-sqlskills-insider-community\/\" target=\"_blank\" rel=\"noopener noreferrer\">newsletter<\/a>\u00a0but we decided to make it a\u00a0regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.)<\/p>\n<p style=\"text-align: justify;\">Erin worked with a client recently who was experiencing very long failover times with a large ad hoc workload and needed to understand why. By very long, we&#8217;re talking\u00a0more than 30 minutes for an availability group failover, and they&#8217;d tested failovers under load before and never had it take more than a few minutes. They wondered if their workload had changed or there was some configuration setting that was off because they&#8217;d also recently upgraded.<\/p>\n<p style=\"text-align: justify;\">Obviously Erin didn&#8217;t want them to do another failover and take more downtime and luckily they have a process in place to capture wait statistics all the time so Erin could see what waits happened while the failover was in progress. One stood out to her right away: <a href=\"https:\/\/www.sqlskills.com\/help\/waits\/qds_loaddb\/\" target=\"_blank\" rel=\"noopener noreferrer\"><em>QDS_LOADDB<\/em><\/a>.<\/p>\n<p style=\"text-align: justify;\">The <em>QDS_LOADDB<\/em> wait happens when a database is bring brought online, Query Store is enabled, and the Query Store data is being loaded from disk. No queries can execute until that data has finished loading. In this case, the client had enabled Query Store after upgrading and with their large workload of ad hoc queries, there was 100GB of Query Store data to load. That was the cause of the long failover time and they didn&#8217;t realize that Query Store has this behavior.<\/p>\n<p style=\"text-align: justify;\">Luckily there is a workaround other than disabling Query Store. You can enable documented trace flag 7752 which makes the Query Store data load asynchronous with the database startup. This means that queries can run before the data load completes, but Query Store won&#8217;t capture any information about them (Query Store is essentially read-only until the data load completes). Erin discusses this and other Query Store trace flags in her post <a href=\"https:\/\/www.sqlskills.com\/blogs\/erin\/query-store-trace-flags\/\" target=\"_blank\" rel=\"noopener noreferrer\">here<\/a>, and in general about Query Store settings and data size in <a href=\"https:\/\/www.sqlskills.com\/blogs\/erin\/query-store-settings\/\" target=\"_blank\" rel=\"noopener noreferrer\">this post<\/a>.<\/p>\n<p style=\"text-align: justify;\">Bottom line: make sure you understand how a feature changes behavior before enabling it, and HA\/DR testing should be performed regularly to catch unexpected behaviors such as this.<\/p>\n<p style=\"text-align: justify;\">\n","protected":false},"excerpt":{"rendered":"<p>(The Curious Case of\u2026\u00a0used to be part of our bi-weekly\u00a0newsletter\u00a0but we decided to make it a\u00a0regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.) Erin worked with [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[116,73,115],"tags":[],"class_list":["post-4993","post","type-post","status-publish","format-standard","hentry","category-query-store","category-query-tuning","category-the-curious-case-of"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v26.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>The Curious Case of... very long failover times with a large ad hoc workload - Paul S. Randal<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"The Curious Case of... very long failover times with a large ad hoc workload - Paul S. Randal\" \/>\n<meta property=\"og:description\" content=\"(The Curious Case of\u2026\u00a0used to be part of our bi-weekly\u00a0newsletter\u00a0but we decided to make it a\u00a0regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.) Erin worked with [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/\" \/>\n<meta property=\"og:site_name\" content=\"Paul S. Randal\" \/>\n<meta property=\"article:published_time\" content=\"2019-05-30T21:19:28+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-05-30T22:37:56+00:00\" \/>\n<meta name=\"author\" content=\"Paul Randal\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Paul Randal\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/\",\"name\":\"The Curious Case of... very long failover times with a large ad hoc workload - Paul S. Randal\",\"isPartOf\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#website\"},\"datePublished\":\"2019-05-30T21:19:28+00:00\",\"dateModified\":\"2019-05-30T22:37:56+00:00\",\"author\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce\"},\"breadcrumb\":{\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"The Curious Case of&#8230; very long failover times with a large ad hoc workload\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#website\",\"url\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/\",\"name\":\"Paul S. Randal\",\"description\":\"In Recovery...\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce\",\"name\":\"Paul Randal\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g\",\"caption\":\"Paul Randal\"},\"sameAs\":[\"http:\/\/3.209.169.194\/blogs\/paul\"],\"url\":\"https:\/\/www.sqlskills.com\/blogs\/paul\/author\/paul\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"The Curious Case of... very long failover times with a large ad hoc workload - Paul S. Randal","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/","og_locale":"en_US","og_type":"article","og_title":"The Curious Case of... very long failover times with a large ad hoc workload - Paul S. Randal","og_description":"(The Curious Case of\u2026\u00a0used to be part of our bi-weekly\u00a0newsletter\u00a0but we decided to make it a\u00a0regular blog post instead so it can sometimes be more frequent. It covers something interesting one of us encountered when working with a client, doing some testing, or were asked in a random question from the community.) Erin worked with [&hellip;]","og_url":"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/","og_site_name":"Paul S. Randal","article_published_time":"2019-05-30T21:19:28+00:00","article_modified_time":"2019-05-30T22:37:56+00:00","author":"Paul Randal","twitter_misc":{"Written by":"Paul Randal","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/","url":"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/","name":"The Curious Case of... very long failover times with a large ad hoc workload - Paul S. Randal","isPartOf":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#website"},"datePublished":"2019-05-30T21:19:28+00:00","dateModified":"2019-05-30T22:37:56+00:00","author":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce"},"breadcrumb":{"@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/the-curious-case-of-very-long-failover-times-with-a-large-ad-hoc-workload\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.sqlskills.com\/blogs\/paul\/"},{"@type":"ListItem","position":2,"name":"The Curious Case of&#8230; very long failover times with a large ad hoc workload"}]},{"@type":"WebSite","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#website","url":"https:\/\/www.sqlskills.com\/blogs\/paul\/","name":"Paul S. Randal","description":"In Recovery...","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.sqlskills.com\/blogs\/paul\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/ffcec826c18782e1e0adf173826a7fce","name":"Paul Randal","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.sqlskills.com\/blogs\/paul\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/0b6a266bba2f088f2551ef529293001bd73bf026bc1908b9866728c062beeeb6?s=96&d=mm&r=g","caption":"Paul Randal"},"sameAs":["http:\/\/3.209.169.194\/blogs\/paul"],"url":"https:\/\/www.sqlskills.com\/blogs\/paul\/author\/paul\/"}]}},"_links":{"self":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/posts\/4993","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/comments?post=4993"}],"version-history":[{"count":0,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/posts\/4993\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/media?parent=4993"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/categories?post=4993"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.sqlskills.com\/blogs\/paul\/wp-json\/wp\/v2\/tags?post=4993"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}