Balancing Your Available SQL Server Core Licenses Evenly Across NUMA Nodes

One issue I recently ran into with a customer was a case where they were using a new, two-socket server with two Intel Xeon E5-2697 v2 Ivy Bridge-EP processors. This particular 22nm processor has twelve physical cores, plus hyper-threading, so you can have 24 logical cores for one entire physical processor, and 48 logical cores for a two-socket server with both sockets populated.

The reason why we had an issue is because both SQL Server 2012 and 2014 Standard Edition have an artificial license limit of either 16 physical cores or 32 logical cores (along with a limit of four sockets, whichever is lower). This meant that SQL Server 2014 Standard Edition could only use 32 out of the 48 logical cores in this new server. To make matters worse, the available cores were spread unevenly across the two NUMA nodes in the server. There were 24 logical cores on one NUMA node and 8 logical cores on the other NUMA node in the server, which was not an ideal situation.

Fortunately, it is pretty easy to fix this issue by using the ALTER SERVER CONFIGURATION command that was introduced in SQL Server 2008 R2. On the negative side, fixing it was not completely intuitive, because of these dumb Standard Edition license limits.

You can use the queries in the code below to detect and correct this issue.

-- Balancing your available SQL Server core licenses evenly across two NUMA nodes
-- Glenn Berry
-- SQLskills.com


-- Get socket, physical core and logical core count from SQL Server Error Log
EXEC sys.xp_readerrorlog 0, 1, N'detected', N'socket';


-- SQL Server NUMA node information 
SELECT node_id, node_state_desc, memory_node_id, processor_group, online_scheduler_count, 
       active_worker_count, avg_load_balance, resource_monitor_state
FROM sys.dm_os_nodes WITH (NOLOCK) 
WHERE node_state_desc <> N'ONLINE DAC' OPTION (RECOMPILE);


-- SQL Server schedulers by NUMA node
SELECT parent_node_id, 
  SUM(current_tasks_count) AS [current_tasks_count], 
  SUM(runnable_tasks_count) AS [runnable_tasks_count], 
  SUM(active_workers_count) AS [active_workers_count], 
  AVG(load_factor) AS avg_load_factor
FROM sys.dm_os_schedulers WITH (NOLOCK) 
WHERE [status] = N'VISIBLE ONLINE'
GROUP BY parent_node_id;



-- SQL Server NUMA node and cpu_id information
SELECT parent_node_id, scheduler_id, cpu_id
FROM sys.dm_os_schedulers WITH (NOLOCK) 
WHERE [status] = N'VISIBLE ONLINE';


-- Fixing the problem

-- Unfortunately, this does not work, due to the license limits in SQL 2012/2014 Standard Edition
ALTER SERVER CONFIGURATION SET PROCESS AFFINITY NUMANODE = 0,1;

-- Msg 5833, Level 16, State 2, Line 7
-- The affinity mask specified is greater than the number of CPUs supported or licensed on this edition of SQL Server.


-- This command spreads your available 32 logical core licenses across two NUMA nodes 
-- This is valid for an Intel processor, with HT enabled
ALTER SERVER CONFIGURATION
SET PROCESS AFFINITY CPU = 0 TO 15, 25 TO 40;
 

 

A similar issue with “grandfathered” SQL Server 2012 Enterprise Server+CAL customers was discussed by my colleague, Jonathan Kehayias here.

SQL Server Diagnostic Information Queries for February 2015

I revised several of the queries this month in all five versions of the script.  The main enhancements are additional comments and documentation about the queries, including more information about how to interpret the results. 

Rather than having a separate blog post for each version, I have just put the links for all five major versions in this single post. There are two separate links for each version. The first one on the top left is the actual query script, and the one below on the right is the matching blank results spreadsheet.  

SQL Server 2005 Diagnostic Information Queries

SQL Server 2005 Blank Results

SQL Server 2008 Diagnostic Information Queries

SQL Server 2008 Blank Results

SQL Server 2008 R2 Diagnostic Information Queries

SQL Server 2008 R2 Blank Results

SQL Server 2012 Diagnostic Information Queries

SQL Server 2012 Blank Results

SQL Server 2014 Diagnostic Information Queries

SQL Server 2014 Blank Results

The basic idea is that you should run each query in the set, one at a time (after reading the directions for that query). It is not really a good idea to simply run the entire batch in one shot, especially the first time you run these queries on a particular server, since some of these queries can take some time to run, depending on your workload and hardware. I also think it is very helpful to run each query, look at the results and think about the emerging picture of what is happening on your server as you go through the complete set.

You need to click on the top left square of the results grid in SSMS to select all of the results, and then right-click and select “Copy with Headers” to copy all of the results, including the column headers to the Windows clipboard. Then you paste the results into the matching tab in the blank results spreadsheet. There are also some comments on how to interpret the results after each query.

About half of the queries are instance specific and about half are database specific, so you will want to make sure you are connected to a database that you are concerned about instead of the master system database. Running the database-specific queries while being connected to the master database is a very common mistake that I see people making when they run these queries.

Note: These queries are stored on Dropbox. I occasionally get reports that the links to the queries and blank results spreadsheets do not work, which is most likely because Dropbox is blocked wherever people are trying to connect.

I also occasionally get reports that some of the queries simply don’t work. This usually turns out to be an issue where people have some of their user databases in 80 compatibility mode, which breaks many DMV queries.

It is also very important that you are running the correct version of the script that matches the major version of SQL Server that you are running. There is an initial query in each script that tries to confirm that you are using the correct version of the script for your version of SQL Server. If you are not using the correct version of these queries for your version of SQL Server, some of the queries are not going to work correctly.

If you want to understand how to better run and interpret these queries, you should consider listening to my three latest Pluralsight courses, which are SQL Server 2014 DMV Diagnostic Queries – Part 1SQL Server 2014 DMV Diagnostic Queries – Part 2 and SQL Server 2014 DMV Diagnostic Queries – Part 3. All three of these courses are pretty short and to the point, at 67, 77, and 68 minutes respectively. Listening to these three courses is best way to thank me for maintaining and improving theses scripts…

Please let me know what you think of these queries, and whether you have any suggestions for improvements. Thanks!

Windows Power Plan Effects on Newer Intel Processors

I recently was working with a customer who had purchased a new Lenovo ThinkServer RD640 with two 22nm Intel Xeon E5-2697 v2 Ivy Bridge-EP processors. This processor was introduced in Q3 of 2013, and it is one generation behind the latest Intel Xeon E5-2600 v3 series processors.

This server had a new, default installation of Windows Server 2012 R2 Standard Edition, which meant that it was using the default Windows Balanced Power Plan. Running CPU-Z 1.71.1 showed the actual core speed of Core #0 while the system was at rest, with the Balanced Power Plan (Figure 1) and with the High Performance Power Plan (Figure 2).

clip_image002

Figure 1: CPU-Z Results with Balanced Power Plan

Changing the Power Plan to High Performance had an immediate effect on the processor core speed, as shown in Figure 2.

clip_image002[5]

Figure 2: CPU-Z Results with High Performance Power Plan

 

Here are the Geekbench results for the default Balanced Power Plan (Figure 3) and the High Performance Power Plan (Figure 4). The Single-Core score is more relevant here, since the 32-bit GUI version of Geekbench 3.3 only uses 32 total cores (and there are 48 logical cores in this server).

clip_image002[7]

Figure 3: Geekbench 3.3 Results with Balanced Power Plan

 

clip_image002[9]

Figure 4: Geekbench 3.3 Results with High Performance Power Plan

You need to keep in mind that your BIOS power management settings will override your Windows Power Plan settings, so it is very important to check what is going on with CPU-Z. For you virtualization users, you need to check what your hypervisor power management settings are doing, because they will override what is happening with the guest OS Windows Power Plan settings.

Not only does power management affect your processor core speed (and CPU/memory benchmarks like Geekbench), but it also affects things like the performance of your PCIe expansion slots. In case you want some official Microsoft guidance on this subject, the Microsoft KB article linked below discusses it in quite a bit of detail.

Slow Performance on Windows Server 2008 R2 when using the “Balanced” Power Plan

This issue has been around since Windows Server 2008, and still exists with Windows Server 2012 R2.