A SQL Server Hardware Tidbit a Day – Day 3

Today, I am going to talk about the existing Intel Westmere-EX processor family. Instead of being called the Intel Xeon 7600 series (as some originally expected), this processor series is called the Intel Xeon E7 series, with separate model numbers for two-socket, four-socket, and eight-socket servers. This includes the E7-2800 series for two-socket servers, the E7-4800 series for four-socket servers, and the E7-8800 series for eight-socket (or larger) servers.

The Intel Xeon E7 family processors have up to ten physical cores (plus hyper-threading in most models). They have four QuickPath Interconnect (QPI) 1.0 links and two memory controllers, each of which has two dual-channel interfaces. Their memory controllers support 32GB DIMMs and low-power memory modules. This means that a four-socket system can support up to 2TB of RAM, while an eight-socket system can support up to 4TB of RAM (which is the current operating system limit for Windows Server 2012). Of course, you will need pretty deep pockets to do that, because 32GB DDR3 RDIMMs are still very expensive in early 2013 compared to 16GB DDR3 RDIMMs. Using the on-board memory buffer, the E7 processors can run DDR3-1333 memory at data rates of 800, 978 and 1066 MHz. The E7 processor family supports AES instructions, Trusted Execution Technology, and the VT-x, VT-d and VT-c virtualization features. They only have PCI-E 2.0 support.
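The memory capacity figures above are easy to sanity-check. Here is a minimal sketch, assuming 16 DIMM slots per socket (a number I am inferring from the 2TB figure for four sockets with 32GB DIMMs, so check it against your actual server model) and assuming every slot is populated:

```sql
-- Sketch: maximum RAM by socket count for a Xeon E7 based server
-- Assumption: 16 DIMM slots per socket (inferred, not a vendor specification)
DECLARE @DimmsPerSocket int = 16;
DECLARE @DimmSizeGB int = 32;      -- 32GB DDR3 RDIMMs

SELECT s.SocketCount AS [Socket Count],
       s.SocketCount * @DimmsPerSocket AS [DIMM Count],
       s.SocketCount * @DimmsPerSocket * @DimmSizeGB AS [Max RAM (GB)],
       s.SocketCount * @DimmsPerSocket * @DimmSizeGB / 1024 AS [Max RAM (TB)]
FROM (VALUES (4), (8)) AS s(SocketCount);
```

Under those assumptions, the four-socket row comes out to 2048GB (2TB) and the eight-socket row to 4096GB (4TB), which matches the limits described above. Changing @DimmSizeGB to 16 shows how much capacity you give up by choosing the cheaper 16GB DIMMs.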

Intel claims up to 40% better database performance for the top of the line E7-4870 model in comparison to the previous generation Xeon X7560 model for four-socket servers. Performance of the E7-4870 CPU in integer and floating-point applications is better than the X7560 by up to 22% and 19% respectively. The E7 processors are socket compatible with the earlier Xeon 7500 processors, which means that existing systems from your favorite server vendor were able to use them as soon as they became available back in Q2 of 2011.

While these processors may sound impressive, they are actually not the best choice for most OLTP workloads, due to their older architecture, slower clock speeds, and lack of PCI-E 3.0 support compared to the newer Intel Xeon E5-2600 and E5-4600 series (Sandy Bridge-EP). They are also quite expensive, with the E7-8870 model going for $4616.00 each. On top of that, their ten physical cores per processor make them very expensive to license for SQL Server 2012 Enterprise Edition. They are well-suited to data warehouse workloads and non-database virtualization workloads because of their high core counts, large L3 caches, and high memory capacity.

The upcoming E7 v2 (Ivy Bridge-EX) processors, due in Q3 of 2013, will be a significant improvement over the current E7 (Westmere-EX) line, jumping a full Tock release forward. They will have up to 15 physical cores, better memory controllers with higher memory capacity, and PCI-E 3.0 support.

What is the Difference Between Physical Sockets, Physical Cores, and Logical Cores?

I witnessed an interesting conversation on Twitter today where someone was talking about how he uses the terms sockets and cores interchangeably, since everyone else does, or words to that effect. This made me think that there may still be some confusion about how these terms are used and what they mean in relation to SQL Server 2012 hardware selection and SQL Server 2012 licensing considerations.

The hierarchy works like this:

Physical socket on a motherboard where a physical processor fits (used for licensing before SQL Server 2012)

Physical core within a physical processor (multi-core, used for licensing with SQL Server 2012 Enterprise Edition)

Logical core within a physical core (hyper-threading)

Back in the prehistoric days of processor technology (around 2001) all Intel and AMD processors had only one core. If you wanted multiple threads of execution, you needed additional physical processors, since one socket = one physical processor = one physical core. Back then, the primary way to increase single-threaded performance was to increase the clock speed of the processor. Both Intel and AMD started running into problems with heat dissipation and power consumption as clock speeds approached 4.0GHz (on air cooling).

In 2002, Intel introduced the first processor with hyper-threading. Hyper-threading creates two “logical processors” within each physical core of a physical processor, and both are visible to the operating system. Depending on the application, hyper-threading can improve total CPU capacity by anywhere from 5-30%. The initial implementation of hyper-threading on the Pentium 4 Netburst architecture did not work well on many server workloads (such as SQL Server), so the standard advice back then was to disable hyper-threading on database servers. The 2nd generation hyper-threading in the Intel Nehalem, Westmere, and Sandy Bridge processors works much better for SQL Server OLTP workloads, so I always leave it enabled by default.

In 2005, AMD introduced their first dual-core processor, the Athlon 64 X2. This processor had two discrete physical cores, which provided better multi-threaded performance than hyper-threading. A single dual-core processor would have two processor cores visible to Windows. It is important to remember that Windows Server 2008 R2 Task Manager (and some SQL Server DMVs) cannot easily tell the difference between hyper-threaded logical processors and true dual-core or multi-core processors.

In late 2006, Intel introduced the first Core 2 Quad, which was a processor with four physical cores (but no hyper-threading). One of these processors would have four cores visible to Windows. Since then, both AMD and Intel have been rapidly increasing the physical core counts of their processors. AMD has the Opteron 63xx processor family, which has up to 16 physical cores in a single physical processor. Intel has the Xeon E7 family (Westmere-EX), which has up to ten physical cores plus 2nd generation hyper-threading, which means that you have a total of 20 logical cores visible to Windows and SQL Server for each physical processor.

Before SQL Server 2012, SQL Server licensing was only concerned with physical processor sockets, not physical cores, or logical cores. Knowing this, you wanted to always buy processors with as many cores as possible in order to maximize your overall processor performance per processor license. You should also be aware that SQL Server 2008 is limited to 64 logical processors. In order to use more than 64 logical processors, you must be running SQL Server 2008 R2 on top of Windows Server 2008 R2, which will raise your limit to 256 logical processors.

SQL Server 2008 R2 Enterprise Edition also had a license limit of eight physical processors (which would let you go up to 160 logical processors with eight Intel Westmere-EX processors). If you needed more than eight physical processors, you had to run SQL Server 2008 R2 Datacenter Edition. Microsoft got rid of the Datacenter Edition SKU for SQL Server 2012.
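To make the socket, core, and logical core arithmetic concrete, here is a small sketch. It assumes ten-core Westmere-EX processors with hyper-threading enabled, and it only checks the 64 logical processor limit described above:

```sql
-- Sketch: logical processor counts for Westmere-EX servers
-- Assumes ten physical cores per socket with hyper-threading enabled
DECLARE @CoresPerSocket int = 10;
DECLARE @ThreadsPerCore int = 2;

SELECT s.SocketCount AS [Socket Count],
       s.SocketCount * @CoresPerSocket AS [Physical Cores],
       s.SocketCount * @CoresPerSocket * @ThreadsPerCore AS [Logical Processors],
       CASE WHEN s.SocketCount * @CoresPerSocket * @ThreadsPerCore > 64
            THEN 'Over 64: needs SQL Server 2008 R2 on Windows Server 2008 R2'
            ELSE 'Within the 64 logical processor limit'
       END AS [Limit Check]
FROM (VALUES (2), (4), (8)) AS s(SocketCount);
```

The eight-socket row works out to 80 physical cores and 160 logical processors, which is the figure quoted above. Note that even a four-socket server (80 logical processors) is already past the 64 logical processor limit.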

With SQL Server 2012, Microsoft completely changed their licensing model compared to previous releases. With SQL Server 2012 Enterprise Edition, in a non-virtualized environment, you must use core-based licensing, which is based on physical cores (not logical cores). Each processor socket must have at least four processor core licenses (even if there are actually only one or two physical cores). This means you need to be much more thoughtful about which exact processor model you select for your database server. Having lots of physical cores can add up to a very large amount of money for your SQL Server 2012 Enterprise Edition license costs. If you are running SQL Server 2012 Enterprise Edition on top of Windows Server 2012 Standard Edition, you can now have up to 640 logical cores, along with 4TB of RAM in your system.

This new licensing system really penalizes AMD processors, which can have up to 16 physical cores in each processor, but unfortunately, have pretty mediocre single-threaded performance. To try and level the playing field a little bit, Microsoft released something called the SQL Server Core Factor Table, which gives a 25% discount for most modern AMD processors that have six or more physical cores. Even with this discount, it is far more expensive to buy your SQL Server 2012 core licenses for an AMD system compared to an equivalent Intel system.
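The licensing rules above can be sketched as a quick calculation. This is only an illustration, not an official licensing calculator: the server configurations are examples I made up (the dual-core row is purely hypothetical), and I am assuming the four-core minimum is applied per socket after the core factor is applied. Always confirm against the current Microsoft licensing documents.

```sql
-- Sketch: SQL Server 2012 core licenses required for a physical server
-- Rules taken from the text: license physical cores, minimum of four core
-- licenses per socket, 0.75 core factor for qualifying AMD processors
-- Assumption: the per-socket minimum is applied after the core factor
SELECT c.Processor,
       c.Sockets,
       c.CoresPerSocket AS [Cores Per Socket],
       c.CoreFactor AS [Core Factor],
       CAST(CEILING(c.Sockets *
            CASE WHEN c.CoresPerSocket * c.CoreFactor < 4 THEN 4
                 ELSE c.CoresPerSocket * c.CoreFactor
            END) AS int) AS [Core Licenses Required]
FROM (VALUES
        ('Intel Xeon E7-4870',     4, 10, 1.00),
        ('AMD Opteron 63xx',       4, 16, 0.75),
        ('Hypothetical dual-core', 2,  2, 1.00)
     ) AS c(Processor, Sockets, CoresPerSocket, CoreFactor);
```

With these example rows, the four-socket Xeon E7-4870 server needs 40 core licenses, the four-socket 16-core Opteron server needs 48 even with the 25% discount, and the two-socket dual-core server needs 8 because of the four-license minimum per socket.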

Remember that the sys.dm_os_sys_info DMV cannot tell the difference between physical and logical cores. Running the query below will tell you how many logical cores are visible and how many physical CPUs you have.

-- Hardware information from SQL Server 2012
-- (Cannot distinguish between HT and multi-core)
SELECT cpu_count AS [Logical CPU Count], hyperthread_ratio AS [Hyperthread Ratio],
       cpu_count/hyperthread_ratio AS [Physical CPU Count],
       physical_memory_kb/1024 AS [Physical Memory (MB)], committed_target_kb/1024 AS [Committed Target Memory (MB)],
       max_workers_count AS [Max Workers Count], affinity_type_desc AS [Affinity Type],
       sqlserver_start_time AS [SQL Server Start Time], virtual_machine_type_desc AS [Virtual Machine Type]
FROM sys.dm_os_sys_info WITH (NOLOCK) OPTION (RECOMPILE);

-- Gives you some good basic hardware information about your database server

Two New TPC-E Submissions for SQL Server 2012

Just when I was not looking, two new official TPC-E results have been posted in the last week. IBM has a 3218.46 TPC-E score for an IBM System x3850 X5 that has four Intel Xeon E7-4870 processors, while HP has an 1881.76 TPC-E score for an HP ProLiant DL380p Gen8 system with two Intel Xeon E5-2690 processors.

What is notable about this is that the 3218.46 score for a four-socket Xeon E7-4870 system is significantly higher than we have seen for similar four-socket Xeon E7-4870 systems in the past. An especially good comparison is between an IBM System x3850 X5 that was submitted on June 27, 2011 and this latest result for an IBM System x3850 X5 system that was submitted on November 28, 2012.  As you can see in Table 1, the newer submission for the same model server has a 12.4% higher score than the older submission. This is for the exact same model server, with the exact same number and model of processors.  The first big difference that jumps out is that the newer submission is running SQL Server 2012 Enterprise Edition on top of Windows Server 2012 Standard Edition, while the older submission is running SQL Server 2008 R2 Enterprise Edition on top of Windows Server 2008 R2 Enterprise Edition.

| Date | Model | Processor | Operating System | SQL Server Version/Edition | TPC-E Score |
|---|---|---|---|---|---|
| 6/27/2011 | System x3850 X5 | Xeon E7-4870 | Windows Server 2008 R2 Enterprise | SQL Server 2008 R2 Enterprise | 2862.61 |
| 11/28/2012 | System x3850 X5 | Xeon E7-4870 | Windows Server 2012 Standard | SQL Server 2012 Enterprise | 3218.46 |

Table 1: Comparing Two IBM System x3850 X5 TPC-E Submissions
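If you want to verify the 12.4% figure yourself, here is a minimal sketch that computes the percentage increase from the two TPC-E scores in Table 1 (the variable names are just mine):

```sql
-- Sketch: percentage gain of the newer TPC-E result over the older one
DECLARE @OldScore decimal(10,2) = 2862.61;   -- 6/27/2011 submission
DECLARE @NewScore decimal(10,2) = 3218.46;   -- 11/28/2012 submission

SELECT CAST((@NewScore - @OldScore) * 100.0 / @OldScore AS decimal(5,1)) AS [Percent Increase];
-- Returns 12.4
```

The same calculation works for any pair of TPC-E scores, as long as you put the older (baseline) score in @OldScore.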

Could this 12.4% performance jump be simply due to the newer operating system and the newer version of SQL Server? It is very possible that there were some low-level improvements in Windows Server 2012 that work in conjunction with SQL Server 2012 to improve performance (similar to what we saw with Windows Server 2008 R2 combined with SQL Server 2008 R2). With Windows Server 2008 R2, Microsoft did some low-level optimizations so that they could scale from 64 logical processors to 256 logical processors. This work also benefited smaller systems with fewer logical processors. I think it is likely that some similar work was done with Windows Server 2012 so that it could scale from 256 logical processors to 640 logical processors, which might explain some of the performance increase. I have sent some questions to some of my friends at Microsoft, trying to get more detailed information about this possibility.

It is also possible that there were improvements in SQL Server 2012 all by itself that contributed to the performance increase. Another possibility is that the TPC-E team at IBM just did a much better job on this newer system. If you dive deeper into the two submissions, you will notice some other differences in the hardware and the environment for the test. The newer submission is a system with 2048GB of RAM and (126) 200GB SAS SSDs for database storage, with a 13.3TB initial database size, while the older submission is a system with 1024GB of RAM and (90) 200GB SAS SSDs for database storage, with an 11.6TB initial database size. As long as you have sufficient I/O capacity to drive the TPC-E workload, the TPC-E score is usually limited by processor performance, so I don’t really think that the RAM and I/O differences are that significant here.

What do you think about this?  I would love to hear your opinions and comments!