Checking Your Intel Processor Features Regarding the Meltdown Exploit

By now, you have probably heard plenty about the Spectre and Meltdown exploits that affect many modern processors. The Meltdown exploit (CVE-2017-5754) in particular only affects Intel processors.

Microsoft has already patched most of their client and server operating systems to mitigate against Meltdown. Depending on the exact model of Intel processor you are using, the Microsoft operating system patch will take advantage of processor support for two specific features and instructions which help minimize the performance impact of the operating system patch for Meltdown by using kernel page-table isolation.

These two instructions are process-context identifiers (PCID) and invalidate process-context identifiers (INVPCID). If your Intel processor supports both of these instructions, then the patched version of Windows will take advantage of that to give you better performance than you would otherwise get without those two instructions.

I have previously written about how to use PowerShell to check your Meltdown and Spectre mitigation status in Windows. Using that PowerShell module will give you a complete view of your Meltdown and Spectre mitigation status.

If your Intel processor supports both PCID and INVPCID, then you will see this message at the bottom of the PowerShell results:

Windows OS support for PCID performance optimization is enabled: True [not required for security]

Another way to check what features that your processor supports (whether it is AMD or Intel) is the Coreinfo v3.31 tool from Microsoft. You need to download the zip file, and then unzip it and copy the coreinfo.exe file to where you want it.

You can then simply use a command prompt to run Coreinfo –f > CoreInfoResults.txt to dump the core feature information for the processor to a text file in the same directory as the coreinfo.exe file as you see in Figure 1.


image

Figure 1: Running Coreinfo with the –f flag and saving the results to a text file


If your Intel processor supports the PCID feature and the INVPCID instruction, you will see an asterisk in the result (which is towards the bottom of the file), as you see below:

PCID            *    Supports PCIDs and settable CR4.PCIDE
INVPCID       *    Supports INVPCID instruction

If your Intel processor does not support either or both of the PCID feature and the INVPCID instruction, you will see a dash in the result, as you see below:

PCID            –   Supports PCIDs and settable CR4.PCIDE
INVPCID       –   Supports INVPCID instruction

It is possible to for a processor to support PCID and not support INVPCID. My understanding from the Microsoft Powershell code is that you need both to get Windows OS support for the PCID performance optimization.

From what information that I have been able to dig up (and from also asking a source at Intel), PCID is supported in some Westmere family and most Sandy Bridge family processors, with server processors getting support before client processors. According to Intel documentation, INVPCID support was introduced in the Haswell family processors.

This means you will get Windows OS support for the PCID performance optimization in Intel Xeon E5-2600 v3 product family (Haswell-EP) and later processors, along with the Intel Xeon E7 v3 product family (Haswell-EX) and later processors.

In my mind, this is even more justification for a new server or hopefully a complete data platform upgrade, where you get a brand new server running Windows Server 2016, along with SQL Server 2017. If you are thinking along the same lines, SQLskills can help you plan and implement an upgrade and migration.


Update: 1-12-2018

I ran CoreInfo on every Intel client machine that I own (a total of 16 machines, which is way too many)! The oldest is a 2008-vintage Intel Core2 Quad Q9550 (Yorkfield), which as expected, has neither PCID or INVPCID support. The next oldest is a 2009-vintage Intel Core i7-720QM (Clarksfield) which also has neither PCID or INVPCID support. I don’t have any Nehalem or Westmere machines.

After that, my oldest machine is a Q1 2011-vintage Intel Core i7-2600K (Sandy Bridge), which has PCID support, but does not have INVPCID support. The situation is the same for a Q1 2012-vintage Intel Core i7-3770K (Ivy Bridge) which has PCID support, but does not have INVPCID support.

With a Q2 2013-vintage Intel Core i5-4670K (Haswell), we finally get both PCID and INVPCID support. My newest Intel machine has a Q4 2017-vintage Intel Core i7-8700K (Coffee Lake) which also has both PCID and INVPCID support.




SQL Server Diagnostic Information Queries for January 2018

This month, there are more minor updates to the all of the versions of the queries, primarily in the comments and documentation. There are also links for the Spectre/Meltdown hotfixes for SQL Server 2008 SP4 and SQL Server 2008 R2 SP3.

I often make additional minor updates to the queries periodically during the month, so if you are in doubt, downloading the latest version is always a good idea.

Rather than having a separate blog post for each version, I have just put the links for all eight major versions in this single post. There are two separate links for each version. The first one on the top left is the actual diagnostic query script, and the one below on the right is the matching blank results spreadsheet, with labeled tabs that correspond to each query in the set.

Here are links to the latest versions of these queries for Azure SQL Database, SQL Server 2017, 2016, and 2014:

Azure SQL Database Diagnostic Information Queries

Azure SQL Database Blank Results Spreadsheet

SQL Server 2017 Diagnostic Information Queries

SQL Server 2017 Blank Results Spreadsheet

SQL Server 2016 Diagnostic Information Queries

SQL Server 2016 Blank Results Spreadsheet

SQL Server 2014 Diagnostic Information Queries

SQL Server 2014 Blank Results Spreadsheet

Here are links to the most recent versions of these scripts for SQL Server 2012 and older:

Since SQL Server 2012 and older are out of Mainstream support from Microsoft (and because fewer of my customers are using these old versions of SQL Server), I am not going to be updating the scripts for these older versions of SQL Server every single month going forward.  I started this policy a while ago, and so far, I have not heard any complaints.

SQL Server 2012 Diagnostic Information Queries

SQL Server 2012 Blank Results Spreadsheet

SQL Server 2008 R2 Diagnostic Information Queries

SQL Server 2008 R2 Blank Results Spreadsheet

SQL Server 2008 Diagnostic Information Queries

SQL Server 2008 Blank Results Spreadsheet

SQL Server 2005 Diagnostic Information Queries

SQL Server 2005 Blank Results Spreadsheet

The basic instructions for using these queries is that you should run each query in the set, one at a time (after reading the directions for that query). It is not really a good idea to simply run the entire batch in one shot, especially the first time you run these queries on a particular server, since some of these queries can take some time to run, depending on your workload and hardware. I also think it is very helpful to run each query, look at the results (and my comments on how to interpret the results) and think about the emerging picture of what is happening on your server as you go through the complete set. I have quite a few comments and links in the script on how to interpret the results after each query.

After running each query, you need to click on the top left square of the results grid in SQL Server Management Studio (SSMS) to select all of the results, and then right-click and select “Copy with Headers” to copy all of the results, including the column headers to the Windows clipboard. Then you paste the results into the matching tab in the blank results spreadsheet.

About half of the queries are instance specific and about half are database specific, so you will want to make sure you are connected to a database that you are concerned about instead of the master system database. Running the database-specific queries while being connected to the master database is a very common mistake that I see people making when they run these queries.

Note: These queries are stored on Dropbox. I occasionally get reports that the links to the queries and blank results spreadsheets do not work, which is most likely because Dropbox is blocked wherever people are trying to connect. I am not planning on moving these to Github any time soon.

I also occasionally get reports that some of the queries simply don’t work. This usually turns out to be an issue where people have some of their user databases in 80 compatibility mode, which breaks many DMV queries, or that someone is running an incorrect version of the script for their version of SQL Server.

It is very important that you are running the correct version of the script that matches the major version of SQL Server that you are running. There is an initial query in each script that tries to confirm that you are using the correct version of the script for your version of SQL Server. If you are not using the correct version of these queries for your version of SQL Server, some of the queries are not going to work correctly.

If you want to understand how to better run and interpret these queries, you should consider listening to my three related Pluralsight courses, which are SQL Server 2014 DMV Diagnostic Queries – Part 1SQL Server 2014 DMV Diagnostic Queries – Part 2 and SQL Server 2014 DMV Diagnostic Queries – Part 3. All three of these courses are pretty short and to the point, at 67, 77, and 68 minutes respectively. Listening to these three courses is really the best way to thank me for maintaining and improving these scripts…

Please let me know what you think of these queries, and whether you have any suggestions for improvements. Thanks!

Performance Effects of Meltdown and Partial Spectre Fixes on Intel Core i7-7500U Laptop

I have a fairly recent vintage HP Spectre x360-13w023dx laptop (slightly over a year old) that has an Intel Core i7-7500U (Kaby Lake-U) processor, 16GB of DDR4 RAM, and a 512GB Samsung PM961 M.2 NVMe SSD that is running Windows 10 Professional Version 1709.

Last night, I installed the Windows 10 January 2018 Security Update (KB4056892) on this system, and then used PowerShell to check my status, with the results as shown in Figure 1. These are the relevant results:

Speculation control settings for CVE-2017-5715 [branch target injection] (This is Spectre variant 2)

Hardware support for branch target injection mitigation is present: False

Windows OS support for branch target injection mitigation is present: True

Windows OS support for branch target injection mitigation is enabled: False

Windows OS support for branch target injection mitigation is disabled by system policy: False

Windows OS support for branch target injection mitigation is disabled by absence of hardware support: True

Speculation control settings for CVE-2017-5754 [rogue data cache load] (This is Meltdown)

Hardware requires kernel VA shadowing: True

Windows OS support for kernel VA shadow is present: True

Windows OS support for kernel VA shadow is enabled: True

Windows OS support for PCID performance optimization is enabled: True [not required for security]

Suggested actions

* Install BIOS/firmware update provided by your device OEM that enables hardware support for the branch target injection mitigation mitigation


image

Figure 1: After Windows 10 Security Update, but before BIOS Update


After I had installed the Windows 10 January 2018 Security Update (KB4056892) on this system, I ran both the CPU-Z 1.82.1 CPU benchmark and the CrystalDiskMark 6.0.0 synthetic disk benchmark three times, to get a performance baseline before I installed the new BIOS (that has the microcode update to enable the Spectre variant 2 fixes). Some results from these test runs are shown in Figures 3 and 5 below.


Following Microsoft’s guidance, I located and installed the latest BIOS for my HP Spectre x360 laptop, which is version F.42. Then, I again used PowerShell to check my status, with the results as shown in Figure 2. These are the relevant results:

Speculation control settings for CVE-2017-5715 [branch target injection] (This is Spectre variant 2)

Hardware support for branch target injection mitigation is present: True

Windows OS support for branch target injection mitigation is present: True

Windows OS support for branch target injection mitigation is enabled: True

Speculation control settings for CVE-2017-5754 [rogue data cache load] (This is Meltdown)

Hardware requires kernel VA shadowing: True

Windows OS support for kernel VA shadow is present: True

Windows OS support for kernel VA shadow is enabled: True

Windows OS support for PCID performance optimization is enabled: True [not required for security]


image

Figure 2: After Windows 10 Security Update and BIOS Update


Figure 3 shows one of the CPU-Z 1.82.1 benchmark run results before the BIOS update, showing a CPU Single Thread score of 373.9, and a CPU Multi Thread score of 1000.2.


image

Figure 3: After Windows 10 Security Update, but before BIOS Update


Figure 4 shows one of the CPU-Z 1.82.1 benchmark run results after the BIOS update, showing a CPU Single Thread score of 317.5, and a CPU Multi Thread score of 971.8. That is about a 15.1% reduction in single-threaded CPU performance and a 2.8% reduction in multi-threaded CPU performance, at least on this quick synthetic CPU benchmark.


image

Figure 4: After Windows 10 Security Update and BIOS Update


Figure 5 shows one of the CrystalDiskMark 6.0.0 benchmark run results before the BIOS update.


image

Figure 5: After Windows 10 Security Update, but before BIOS Update


Figure 6 shows one of the CrystalDiskMark 6.0.0 benchmark run results after the BIOS update.


image

Figure 6: After Windows 10 Security Update and BIOS Update


What these limited synthetic benchmark test results show is that installing the CPU microcode updates (which are a required part of the Spectre variant 2 fix) seems to have a pretty significant effect on single-threaded CPU performance in the CPU-Z benchmark. We also see a very significant effect on random I/O performance in CrystalDiskMark 6.0.0 using the default test settings with a 4GB test file.

Microsoft’s Terry Myerson has a pretty detailed post on this subject

Understanding the performance impact of Spectre and Meltdown mitigations on Windows Systems

Intel has published some client benchmark results for 6th, 7th, and 8th Generation Core processors, as shown here:

Intel Security Issue Update: Initial Performance Data Results for Client Systems




SQL Server 2017 CU3 Released on January 4, 2018

On January 4, 2018, Microsoft released SQL Server 2017 CU3, which is Build 14.0.3015.40. By my count, this CU has sixteen public hotfixes, many of which are for the SQL Engine or SQL performance. There are also some new manageability and programmability features that have been added, such as support for the MAXDOP option for CREATE STATISTICS and UPDATE STATISTICS.

In addition, Microsoft has included the security fixes from the January 3 SQL Server security update in this Cumulative Update. Microsoft has very detailed guidance on how this may affect SQL Server here. One important item to note is that SQL Server 2017 CU3 may have been offered and deployed as an important update by Microsoft Update, depending on how your Microsoft Update settings are configured or whether you are using WSUS.

Since SQL Server 2017 won’t be using Service Packs as part of its servicing mechanism, you will need to start testing and deploying Cumulative Updates on a schedule that makes sense for your organization.

As always, I think it is a good idea to make an effort to stay current on Cumulative Updates, as does Microsoft.

Checking Your Meltdown and Spectre Mitigation Status in Windows

As I have previously discussed, there has been an explosion of information and speculation about the Spectre and Meltdown vulnerabilities. Here is the official information about all three vulnerabilities.


Spectre

CVE-2017-5753 (Bounds Check Bypass)

CVE-2017-5715 (Branch Target Injection)


Meltdown

CVE-2017-5754 (Rogue Data Cache Load)


Checking Your Meltdown and Spectre Mitigation Status in Windows

Microsoft has released a PowerShell module that you can easily use to check whether you have the operating system patch that helps mitigate CVE-2017-5715 (Spectre – Branch Target Injection) installed and enabled, along with a BIOS/CPU microcode update that is also required as part of the mitigation.

The script also shows whether you have the operating system patch to mitigate CVE-2017-5754 (Meltdown – Rogue Data Cache Load) installed and enabled, along with whether you have OS support for the PCID performance optimization, which relies on hardware support in your processor for the process-context identifiers (PCID) feature to reduce the performance impact of the CVE-2017-5754 patch.

This Microsoft KB article explains this in more detail and has a link to download the PowerShell Module for operating systems prior to Windows Server 2016.

Windows Client Guidance for IT Pros to protect against speculative execution side-channel vulnerabilities

I have a Dell Precision 5520 laptop with an Intel Xeon E3-1505M v6 processor, running Windows 10 Professional Version 1709 (OS Build 16299.192) with the January 2018 Security Update (KB4056892) installed. I have also installed BIOS version 1.7.0 for the Dell Precision 5520, which has the firmware update to help mitigate CVE-2017-5715.

Figure 1 shows the results I get after installing both of these updates.


clip_image002

Figure 1: Get-SpeculationControlSettings Results on Fully Patched Dell Precision 5520


I also have a ASRock Fatal1ty X399 Professional Gaming desktop system with an AMD Ryzen Threadripper 1950X processor, running Windows 10 Professional Version 1709 (OS Build 16299.192) with the January 2018 Security Update (KB4056892) installed. I have the latest BIOS version 2.00 for the ASRock Fatal1ty X399 Professional Gaming (which does not include the CVE-2017-5715 firmware fix). ASRock has not released an updated BIOS for this issue yet.

Figure 2 shows the results after installing the Windows 10 January 2018 Security Update. It is also notable that the Get-SpeculationControlSettings module does not think that this modern AMD processor needs kernel VA shadowing (meaning that it is not vulnerable to CVE-2017-5754).

image

Figure 2: Get-SpeculationControlSettings Results on Partially Patched ASRock Fatal1ty X399 Professional Gaming

This Microsoft KB article explains in great detail how to interpret the results of Get-SpeculationControlSettings

Understanding Get-SpeculationControlSettings PowerShell script output

Microsoft SQL Server Updates for Meltdown and Spectre Exploits

Over the last couple of days, you have probably heard quite a bit of chatter and speculation about some newly disclosed ways to attack various processors. The initial reports were that only Intel processors were affected, but some sources indicate that some AMD and ARM processors are also vulnerable.

Security researchers at Graz University (who were involved with the initial discovery of these issues) have put up a site, complete with cute logos, with some useful information about these two exploits. The most detailed information so far about the attack methods comes from Google Project Zero, as shown here: Reading privileged memory with a side-channel. Their testing shows some limited vulnerability for some older AMD processors.

AMD is pretty adamant that their processors are not vulnerable to these exploits, as shown by this statement from AMD’s Tom Lendacky:

“AMD processors are not subject to the types of attacks that the kernel
page table isolation feature protects against. The AMD microarchitecture
does not allow memory references, including speculative references, that
access higher privileged data when running in a lesser privileged mode
when that access would result in a page fault.
Disable page table isolation by default on AMD processors by not setting
the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI
is set.”

Linus Torvalds also seems pretty confident that AMD is not affected, as witnessed by his comments in a recent code check-in:

“Exclude AMD from the PTI enforcement. Not necessarily a fix, but if AMD is so confident that they are not affected, then we should not burden users with the overhead”

Paul Alcorn has a pretty good write-up about this issue here. Yesterday, Phoronix published some early benchmark results against a patched version of Linux that were pretty alarming for some use cases (synthetic IO benchmarks and PostgreSQL database performance).

Redhat has published some information about the performance impact of OS fixes on several different workload types. The most notable is what they define as

“Measureable: 8-19% – Highly cached random memory, with buffered I/O, OLTP database workloads, and benchmarks with high kernel-to-user space transitions are impacted between 8-19%. Examples include OLTP Workloads (tpc), sysbench, pgbench, netperf (< 256 byte), and fio (random I/O to NvME)”

More details about these findings and some mitigation methods for RHEL are available in these links:

Speculative Execution Exploit Performance Impacts – Describing the performance impacts to security patches for CVE-2017-5754 CVE-2017-5753 and CVE-2017-5715

Controlling the Performance Impact of Microcode and Security Patches for CVE-2017-5754 CVE-2017-5715 and CVE-2017-5753 using Red Hat Enterprise Linux Tunables

It seems like the various fixes for these issues are going to hit database and virtualization performance harder than most other use cases.  I wonder whether it will be possible for Intel to at least partially fix the issue with a stepping change on any Intel processors that are still in production (i.e. they make an actual hardware fix using the same existing processor design) that lets them send out replacement processors that work in some existing servers.

If you are old enough to remember the old Pentium FDIV bug in 1994, Intel initially tried to minimize the issue, saying that it was very rare. Then, they tried to make people prove that they were hitting the bug by running an Intel utility. Finally, they caved in to bad PR and ended up sending out replacement CPUs to a lot of people, no questions asked, which cost them $475 million back in the day. I remember swapping out my CPU, because I was a geek back then too!

Early this morning, Microsoft published this KB article: SQL Server Guidance to protect against speculative execution side-channel vulnerabilities. According to Microsoft, the following versions of SQL Server are impacted when running on x86 and x64 processor systems: SQL Server 2008, SQL Server 2008R2, SQL Server 2012, SQL Server 2014, SQL Server 2016, SQL Server 2017.

Microsoft has already issued two Cumulative Updates that include fixes to help mitigate this issue (along with the other important hotfixes included in each CU).

Cumulative Update 3 for SQL Server 2017

Cumulative Update 7 for SQL Server 2016 SP1

I suspect that there will be an out of band CU or hotfix for SQL Server 2014 SP2 relatively soon, since it is still in Mainstream support. Even though SQL Server 2012 and older are out of Mainstream support, Microsoft will probably develop and release hotfixes for those releases relatively soon since this is a security issue.

Microsoft has also started pushing out an out of band OS update for Windows 10 (KB4056892) that is meant to mitigate this issue. There are similar updates for most other supported Microsoft operating systems. Here is the current information for Windows Server:

Windows Server guidance to protect against speculative execution side-channel vulnerabilities

Here is Microsoft’s current security advisory advice:

ADV180002 | Guidance to mitigate speculative execution side-channel vulnerabilities

Microsoft has also released this statement about how they have been handling this for Microsoft Azure

Securing Azure customers from CPU vulnerability

Here is what I plan on doing over the next couple of weeks as this starts to shake out:

Here is what I think you should be doing:

Plan on getting your database servers patched as soon as possible, which will include OS patches, SQL Server patches, and possible firmware or BIOS/UEFI updates as they become available.

Be ready to do some workload and query tuning as necessary if your workload performance is negatively affected by these various patches and updates.

Think harder about upgrading to new hardware, a newer version of your OS, and a newer version of SQL Server that is still fully supported.

For personal and client workstation systems, you should be checking to see if there are any firmware or BIOS/UEFI updates that become available, both for these issues and as a general best practice.

                  Windows Client Guidance for IT Pros to protect against speculative execution side-channel vulnerabilities


I am collecting some resources about this issue from the server vendors as shown in the links below:

Cisco

CPU Side-Channel Information Disclosure Vulnerabilities

Dell

Microprocessor Side-Channel Attacks (CVE-2017-5715, CVE-2017-5753, CVE-2017-5754): Impact on Dell EMC products (Dell Enterprise Servers, Storage and Networking)

Microprocessor Side-Channel Attacks (CVE-2017-5715, CVE-2017-5753, CVE-2017-5754): Impact on Dell products  (This is for client hardware)

Fujitsu

CPU hardware vulnerable to side-channel attacks (CVE-2017-5715, CVE-2017-5753, CVE-2017-5754)

HPE

Side Channel Analysis Method allows information disclosure in Microprocessors (CVE-2017-5715, CVE-2017-5753, CVE-2017-5754)

Huawei

Security Notice – Statement on the Media Disclosure of the Security Vulnerabilities in the Intel CPU Architecture Design

IBM

Potential CPU Security Issue

Lenovo

Reading Privileged Memory with a Side Channel


 

 

SQL101: Avoiding Mistakes on a Production Database Server

As Kimberly blogged about earlier this year, SQLskills has an ongoing initiative to blog about basic topics, which we’re calling SQL101. We’re all blogging about things that we often see done incorrectly, technologies used the wrong way, or where there are many misunderstandings that lead to serious problems. If you want to find all of our SQLskills SQL101 blog posts, check out SQLskills.com/help/SQL101.

One reason that it is relatively difficult to get your first job as a DBA (compared to other positions, such as a developer) is that it is very easy for a DBA with Production access to cause an enormous amount of havoc with a single momentary mistake.

As a Developer, many of your most common mistakes are only seen by yourself. If you write some code with a syntax error that doesn’t compile, or you write some code that fails your unit tests, usually nobody sees those problems but you, and you have the opportunity to fix your mistakes before you check-in your code, with no one being any the wiser.

A DBA doing something like running an UPDATE or DELETE statement without a WHERE clause, running a query against a Production instance database when you thought you were running it against a Development instance database, or making a schema change in Production that is a size of data operation (that locks up a table for a long period) are just a few examples of common DBA mistakes that can have huge consequences for an organization.

A split-second, careless DBA mistake can cause a long outage that can be difficult or even impossible to recover from. In SQL Server, Cntl-Z (the undo action) does not work, so you need to be detail-oriented and careful as a good DBA. As the old saying goes: “measure twice and cut once”.

Here are a few basic tips that can help you avoid some of these common mistakes:


Using Color-Coded Connections in SSMS

SQL Server Management Studio (SSMS) has long had the ability to set a custom color as a connection property for individual connections to an instance of SQL Server. This option is available in legacy versions of SSMS and in the latest 17.4 version of SSMS. You can get even more robust connection coloring capability with third-party tools such as SSMS Tools Pack.

The idea here is to set specific colors, such as red, yellow, or green for specific types of database instances to help remind you when you are connected to a Production instance rather than a non-Production instance. It is fairly common to use red for a Production instance. This can be helpful if you don’t have red green color blindness, which affects about 7-10% of men, but is much less common among women.

Figure 1 shows how you can check the “Use custom color” checkbox, and then select the color you want to use for that connection. After that, as long as you use the exact same connection credentials for that instance from your copy of SSMS, you should get the color that you set when you open a connection to that instance.

I would not bet my job on the color always being accurate, because depending on exactly how you open a connection to the instance, you may not always get the custom color that you set for the connection. Still, it is an extra piece of added insurance.


image

Figure 1: Setting a custom color for a connection


Figure 2 shows a red bar at the bottom of the query window (which is the default position for the bar) after setting a custom connection color. This would help warn me that I was connected to a Production instance, so I need to be especially careful before doing anything.


image

Figure 2: Query window using red for the connection


Double-Checking Your Connection Information Before Running a Query

Something you should always do before running any query is to take a second to glance down to the bottom right of SSMS Query window to verify your current connection information. It will show the name of the instance you are connected to, your logon information (including the SPID number), and the name of the database you are connected to.

Taking the time to always verify that you are connected to the database and instance that you think you are BEFORE running a query will save you from making many common, costly mistakes.


Wrap Queries in an Explicit Transaction

One common safety measure is to wrap your queries (especially potentially dangerous ones that update or delete data) in an explicit transaction as you see in Figure 3. You open an explicit transaction with a BEGIN TRAN statement, then run just your query, without the COMMIT TRAN statement. If the query does what you expect (which the xx rows affected message can often quickly confirm), then you commit the explicit transaction by executing the COMMIT TRAN statement.

If it turns out that you just made a horrible mistake (like I did in the example in Figure 3) by omitting the WHERE clause, you would execute the ROLLBACK TRAN statement to rollback your explicit transaction (which could take a while to complete).


image

Figure 3: Using an explicit transaction as a safety measure


Test your Update/Delete Queries as Select Queries Before You Run Them

Another common safety measure is to write and run a test version of any query that is designed to change data, where you simply SELECT the rows that you are planning on changing before you actually try to change them with an UPDATE or DELETE statement. You can often just have the query count the number of rows that come back from your test SELECT statement, but you might need or want to to browse the data that comes back to be 100% sure that you don’t have a logic error in your query that would end up deleting or updating the wrong result set.


These are just a few of the most common measures for avoiding common DBA mistakes. The most important step is to always be detail-oriented and very careful when you are making potentially dangerous changes in Production, which is easier said than done. If you do make a big mistake, don’t panic, and don’t try to cover it up. Taking a little time to think about what you did, and the best way to quickly and correctly fix the problem is always the best course of action.





SQL Server 2014 SP2 CU9

On December 18, 2017, Microsoft released SQL Server 2014 SP2 CU9, which is Build 12.05563.0. By my count, this CU has seven public hotfixes, nearly all of which are for the SQL Engine of SQL performance.

Since SQL Server 2014 SP1 and earlier are no longer “supported service packs”, there is no corresponding CU for the SP1 or RTM branches of SQL Server 2014.

As always, I think it is a good idea to make an effort to stay current on cumulative updates, as does Microsoft.

SQL Server Missing Indexes Feature and Trace Flag 2392

Background

Since the SQL Server 2005 release, SQL Server has had the default ability to collect metrics about what the query optimizer thinks are “missing” row store indexes, i.e. indexes that the query optimizer thinks would reduce the cost of particular individual queries that have been executed since SQL Server was last restarted.

The word missing is a little confusing to some people, since I have had many questions over the years about what could have happened to these “missing” indexes (implying that they used to be there, but somebody must have dropped them). Perhaps a better word would have been desired or candidate indexes.

At any rate, while far from perfect, I have found this feature to be a very useful method for helping to get the appropriate row store indexes in place for the overall SQL Server workload on a database. If you blindly add every new index that the query optimizer asks for, you will quickly have a large number of similar, narrow indexes on a table that probably could have been consolidated into fewer, wider indexes.

Using your common sense and experience as a DBA, your knowledge of your overall database workload and by running some queries to better characterize the volatility of a table, it is possible to do a pretty good job of index tuning using this feature. You also want to be on the lookout for indexes that can be dropped as part of your index tuning efforts.

These “missing indexes” are visible as the green missing index warnings in the graphical execution plan for a query, you can also find them by querying the plan cache looking for missing index warnings, and by running DMV queries to find them.


Recent Hotfix Reveals Trace Flag 2392

Microsoft has released a hotfix for the latest round of cumulative updates for SQL Server 2014, SQL Server 2016, and SQL Server 2017 that revealed trace flag 2392. This same fix is in SQL Server 2014 SP2 CU8, SQL Server 2016 SP1 CU6 and SQL 2017 CU2.

FIX: Access violation may occur when you cancel a pending query if the missing indexes feature is enabled in SQL Server 2014

Just to be clear, this hotfix has nothing to do with how the missing index feature functions. It just reveals a previously undocumented trace flag (TF 2392) that can be used to turn the missing index feature off completely (as a workaround to the issue that is corrected by the hotfix). This trace flag has been in the product since SQL Server 2005.

Once I discovered this trace flag, I was curious as to how it actually behaves, which is somewhat different than I expected. First, it is a startup-only trace flag, in terms of what it actually does. You can enable/disable it to your hearts content while SQL Server is running, and it will show as being enabled/disabled (with no error message about only being a startup-only TF).

The problem is, it will not disable/enable missing index stats collection unless it is enabled at startup. If you set it as a startup TF and restart SQL Server, then no missing index stats are collected. If you then subsequently disable TF 2392 while SQL Server is running, it still won’t collect any missing index stats (despite what you may expect).

You have to remove the startup TF 2392 and then restart SQL Server for the change to actually take effect (both ways, enabling and disabling). This is from my testing on SQL Server 2017 CU2.

Personally, I don’t think this is a big deal (unless you really need to enable the workaround described in the KB article). I think most people are going to want to leave the missing index feature enabled. The performance overhead of the feature is probably pretty insignificant.

You just need to be aware of how TF 2392 actually works and know that a restart of SQL Server is required to actually turn the missing index feature off or back on. It would be nice if you did get a message about it being a startup-only trace flag if you tried to enable/disable it while SQL Server was running.

SQL Server Diagnostic Information Queries for December 2017

This month, there are more minor updates to the all of the versions of the queries, primarily in the comments and documentation. There is also a new query for the SQL Server 2017 set of queries. I often make additional minor updates to the queries periodically during the month, so if you are in doubt, downloading the latest version is always a good idea.

Rather than having a separate blog post for each version, I have just put the links for all eight major versions in this single post. There are two separate links for each version. The first one on the top left is the actual diagnostic query script, and the one below on the right is the matching blank results spreadsheet, with labeled tabs that correspond to each query in the set.

Here are links to the latest versions of these queries for Azure SQL Database, SQL Server 2017, 2016, and 2014:

Azure SQL Database Diagnostic Information Queries

Azure SQL Database Blank Results Spreadsheet

SQL Server 2017 Diagnostic Information Queries

SQL Server 2017 Blank Results Spreadsheet

SQL Server 2016 Diagnostic Information Queries

SQL Server 2016 Blank Results Spreadsheet

SQL Server 2014 Diagnostic Information Queries

SQL Server 2014 Blank Results Spreadsheet

 

Here are links to the most recent versions of these scripts for SQL Server 2012 and older:

Since SQL Server 2012 and older are out of Mainstream support from Microsoft (and because fewer of my customers are using these old versions of SQL Server), I am not going to be updating the scripts for these older versions of SQL Server every single month going forward.  I started this policy a while ago, and so far, I have not heard any complaints.

SQL Server 2012 Diagnostic Information Queries

SQL Server 2012 Blank Results Spreadsheet

SQL Server 2008 R2 Diagnostic Information Queries

SQL Server 2008 R2 Blank Results Spreadsheet

SQL Server 2008 Diagnostic Information Queries

SQL Server 2008 Blank Results Spreadsheet

SQL Server 2005 Diagnostic Information Queries

SQL Server 2005 Blank Results Spreadsheet

The basic instructions for using these queries is that you should run each query in the set, one at a time (after reading the directions for that query). It is not really a good idea to simply run the entire batch in one shot, especially the first time you run these queries on a particular server, since some of these queries can take some time to run, depending on your workload and hardware. I also think it is very helpful to run each query, look at the results (and my comments on how to interpret the results) and think about the emerging picture of what is happening on your server as you go through the complete set. I have quite a few comments and links in the script on how to interpret the results after each query.

After running each query, you need to click on the top left square of the results grid in SQL Server Management Studio (SSMS) to select all of the results, and then right-click and select “Copy with Headers” to copy all of the results, including the column headers to the Windows clipboard. Then you paste the results into the matching tab in the blank results spreadsheet.

About half of the queries are instance specific and about half are database specific, so you will want to make sure you are connected to a database that you are concerned about instead of the master system database. Running the database-specific queries while being connected to the master database is a very common mistake that I see people making when they run these queries.

Note: These queries are stored on Dropbox. I occasionally get reports that the links to the queries and blank results spreadsheets do not work, which is most likely because Dropbox is blocked wherever people are trying to connect. I am not planning on moving these to Github any time soon.

I also occasionally get reports that some of the queries simply don’t work. This usually turns out to be an issue where people have some of their user databases in 80 compatibility mode, which breaks many DMV queries, or that someone is running an incorrect version of the script for their version of SQL Server.

It is very important that you are running the correct version of the script that matches the major version of SQL Server that you are running. There is an initial query in each script that tries to confirm that you are using the correct version of the script for your version of SQL Server. If you are not using the correct version of these queries for your version of SQL Server, some of the queries are not going to work correctly.

If you want to understand how to better run and interpret these queries, you should consider listening to my three related Pluralsight courses, which are SQL Server 2014 DMV Diagnostic Queries – Part 1SQL Server 2014 DMV Diagnostic Queries – Part 2 and SQL Server 2014 DMV Diagnostic Queries – Part 3. All three of these courses are pretty short and to the point, at 67, 77, and 68 minutes respectively. Listening to these three courses is really the best way to thank me for maintaining and improving these scripts…

Please let me know what you think of these queries, and whether you have any suggestions for improvements. Thanks!