The post A SQL Server Hardware Tidbit a Day – Day 3 appeared first on Glenn Berry.
The Intel Xeon E7 family processors have up to ten physical cores (plus hyper-threading in most models). They have four QuickPath Interconnect (QPI) 1.0 links and two memory controllers, each with two dual-channel interfaces. Their memory controllers support 32GB DIMMs and low-power memory modules. This means that a four-socket system can support up to 2TB of RAM, while an eight-socket system can support up to 4TB of RAM (which is the current operating system limit for Windows Server 2012). Of course, you will need pretty deep pockets to do that, because 32GB DDR3 RDIMMs are still very expensive in early 2013 compared to 16GB DDR3 RDIMMs. Using the on-board memory buffer, the E7 processors can run DDR3-1333 memory at data rates of 800, 978 and 1066 MHz. The E7 processor family supports AES instructions, Trusted Execution Technology, and the VT-x, VT-d and VT-c virtualization features. They only have PCI-E 2.0 support.
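The memory figures above are simple multiplication, and a short Python sketch makes the relationship explicit. The function names are illustrative only, and the per-socket DIMM slot count is not stated in the text; it is derived here from the quoted 2TB and 4TB maximums.

```python
# Values taken from the text: 32GB DIMMs, 2TB on four sockets, 4TB on eight sockets.
DIMM_SIZE_GB = 32


def implied_dimm_slots_per_socket(total_ram_gb, sockets, dimm_gb=DIMM_SIZE_GB):
    """Return how many DIMM slots per socket a quoted maximum capacity implies."""
    slots, remainder = divmod(total_ram_gb, sockets * dimm_gb)
    if remainder:
        raise ValueError("capacity is not a whole number of DIMMs per socket")
    return slots


def max_ram_gb(sockets, slots_per_socket, dimm_gb=DIMM_SIZE_GB):
    """Maximum RAM for a fully populated system."""
    return sockets * slots_per_socket * dimm_gb


four_socket_slots = implied_dimm_slots_per_socket(2 * 1024, sockets=4)
eight_socket_slots = implied_dimm_slots_per_socket(4 * 1024, sockets=8)
print(four_socket_slots, eight_socket_slots)

# The same four-socket system populated with cheaper 16GB DIMMs instead.
print(max_ram_gb(sockets=4, slots_per_socket=four_socket_slots, dimm_gb=16), "GB")
```

Both quoted maximums imply the same number of slots per socket, so halving the DIMM size simply halves the capacity.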
Intel claims up to 40% better database performance for the top of the line E7-4870 model in comparison to the previous generation Xeon X7560 model for four-socket servers. Performance of the E7-4870 CPU in integer and floating-point applications is better than the X7560 by up to 22% and 19% respectively. The E7 processors are socket compatible with the earlier Xeon 7500 processors, which means that existing systems from your favorite server vendor were able to use them as soon as they became available back in Q2 of 2011.
While these processors may sound impressive, they are actually not the best choice for most OLTP workloads, due to their older architecture, slower clock speeds, and lack of PCI-E 3.0 support compared to the newer Intel Xeon E5-2600 and E5-4600 series (Sandy Bridge-EP). They are also quite expensive, with the E7-8870 model going for $4616.00 each. They are very expensive to license for SQL Server 2012 Enterprise Edition, with their ten physical cores for each processor. They are well-suited to data warehouse workloads and non-database virtualization workloads because of their high core counts, large L3 caches, and high memory capacity.
The upcoming E7 v2 (Ivy Bridge-EX) processors, due in Q3 of 2013, will be a significant improvement over the current E7 (Westmere-EX) line, jumping a full Tock release forward. They will have up to 15 physical cores, better memory controllers with higher memory capacity, and PCI-E 3.0 support.
The post What is the Difference Between Physical Sockets, Physical Cores, and Logical Cores? appeared first on Glenn Berry.
The hierarchy works like this:
- Physical socket on a motherboard where a physical processor fits (used for licensing before SQL Server 2012)
- Physical core within a physical processor (multi-core, used for licensing with SQL Server 2012 Enterprise Edition)
- Logical core within a physical core (hyper-threading)
Back in the prehistoric days of processor technology (around 2001) all Intel and AMD processors had only one core. If you wanted multiple threads of execution, you needed additional physical processors, since one socket = one physical processor = one physical core. Back then, the primary way to increase single-threaded performance was to increase the clock speed of the processor. Both Intel and AMD started running into problems with heat dissipation and power consumption as clock speeds approached 4.0GHz (on air cooling).
In 2002, Intel introduced the first processor with hyper-threading. Hyper-threading creates two “logical processors” within each physical core of a physical processor, both of which are visible to the operating system. Depending on the application, hyper-threading can improve total CPU capacity by anywhere from 5% to 30%. The initial implementation of hyper-threading on the Pentium 4 NetBurst architecture did not work well on many server workloads (such as SQL Server), so the standard advice back then was to disable hyper-threading on database servers. The 2nd generation hyper-threading in the Intel Nehalem, Westmere, and Sandy Bridge processors works much better for SQL Server OLTP workloads, so I always leave it enabled by default.
In 2005, AMD introduced their first dual-core processor, the Athlon 64 X2. This processor had two discrete physical cores, which provided better multi-threaded performance than hyper-threading. A single, dual-core processor would have two processor cores visible to Windows. It is important to remember that Windows Server 2008 R2 Task Manager (and some SQL Server DMVs) cannot easily tell the difference between hyper-threaded logical processors and true dual-core or multi-core processors.
In late 2006, Intel introduced the first Core 2 Quad, which was a processor with four physical cores (but no hyper-threading). One of these processors would have four cores visible to Windows. Since then, both AMD and Intel have been rapidly increasing the physical core counts of their processors. AMD has the Opteron 63xx processor family, which has 16 physical cores in a single physical processor. Intel has the Xeon E7 Family “Westmere-EX”, which has up to ten physical cores, plus 2nd generation hyper-threading, which means that you have a total of 20 logical cores visible to Windows and SQL Server for each physical processor.
Before SQL Server 2012, SQL Server licensing was only concerned with physical processor sockets, not physical cores, or logical cores. Knowing this, you wanted to always buy processors with as many cores as possible in order to maximize your overall processor performance per processor license. You should also be aware that SQL Server 2008 is limited to 64 logical processors. In order to use more than 64 logical processors, you must be running SQL Server 2008 R2 on top of Windows Server 2008 R2, which will raise your limit to 256 logical processors.
SQL Server 2008 R2 Enterprise Edition also had a license limit of eight physical processors (which would let you go up to 160 logical processors with eight Intel Westmere-EX processors). If you needed more than eight physical processors, you had to run SQL Server 2008 R2 Datacenter Edition. Microsoft got rid of the Datacenter Edition SKU for SQL Server 2012.
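To make the socket, core, and logical core arithmetic concrete, here is a minimal Python sketch. The `Server` class and its field names are hypothetical; the core counts and the 64 and 256 logical processor limits are the ones quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    sockets: int
    cores_per_socket: int
    threads_per_core: int  # 2 with hyper-threading enabled, otherwise 1

    @property
    def physical_cores(self):
        return self.sockets * self.cores_per_socket

    @property
    def logical_processors(self):
        return self.physical_cores * self.threads_per_core


# Logical processor limits quoted in the text.
LOGICAL_LIMITS = {
    "SQL Server 2008": 64,
    "SQL Server 2008 R2 on Windows Server 2008 R2": 256,
}

# Eight ten-core Westmere-EX processors with hyper-threading enabled.
westmere_ex = Server(sockets=8, cores_per_socket=10, threads_per_core=2)
print(westmere_ex.physical_cores, "physical cores")
print(westmere_ex.logical_processors, "logical processors")

for product, limit in LOGICAL_LIMITS.items():
    fits = westmere_ex.logical_processors <= limit
    print(f"{product}: limit {limit}, all logical processors usable: {fits}")
```

The eight-socket example exceeds the older limit and fits under the newer one, which is the situation the paragraph above describes.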
With SQL Server 2012, Microsoft completely changed their licensing model compared to previous releases. With SQL Server 2012 Enterprise Edition, in a non-virtualized environment, you must use core-based licensing, which is based on physical cores (not logical cores). Each processor socket must have at least four processor core licenses (even if there are actually only one or two physical cores). This means you need to be much more thoughtful about which exact processor model you select for your database server. Having lots of physical cores can add up to a very large amount of money for your SQL Server 2012 Enterprise Edition license costs. If you are running SQL Server 2012 Enterprise Edition on top of Windows Server 2012 Standard Edition, you can now have up to 640 logical cores, along with 4TB of RAM in your system.
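The core-counting rule described above (physical cores only, with a minimum of four core licenses per socket) can be sketched in a few lines of Python. The function name is made up for illustration, and the sketch covers only the non-virtualized case mentioned in the text.

```python
MIN_CORE_LICENSES_PER_SOCKET = 4  # minimum stated in the text


def core_licenses_required(sockets, physical_cores_per_socket):
    """Core licenses for a non-virtualized server; hyper-threading is ignored."""
    per_socket = max(physical_cores_per_socket, MIN_CORE_LICENSES_PER_SOCKET)
    return sockets * per_socket


# Two dual-core processors still need four core licenses per socket.
print(core_licenses_required(sockets=2, physical_cores_per_socket=2))

# Two eight-core processors and four ten-core processors.
print(core_licenses_required(sockets=2, physical_cores_per_socket=8))
print(core_licenses_required(sockets=4, physical_cores_per_socket=10))
```

Because logical cores never enter the calculation, enabling hyper-threading adds capacity without adding license cost.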
This new licensing system really penalizes AMD processors, which can have up to 16 physical cores in each processor, but unfortunately, have pretty mediocre single-threaded performance. To try and level the playing field a little bit, Microsoft released something called the SQL Server Core Factor Table, which gives a 25% discount for most modern AMD processors that have six or more physical cores. Even with this discount, it is far more expensive to buy your SQL Server 2012 core licenses for an AMD system compared to an equivalent Intel system.
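A rough Python sketch shows why the discount does not close the gap. It assumes the 25% discount is applied as a 0.75 multiplier on the physical core count, and the choice of a two-socket, eight-core Intel system as the comparison point is an assumption made for illustration.

```python
AMD_CORE_FACTOR = 0.75  # the 25% discount described in the text
DEFAULT_CORE_FACTOR = 1.0


def licenses_after_core_factor(sockets, cores_per_socket, core_factor):
    """Core licenses needed once the core factor is applied."""
    return sockets * cores_per_socket * core_factor


# Two-socket AMD system with 16 physical cores per processor.
amd = licenses_after_core_factor(2, 16, AMD_CORE_FACTOR)

# Two-socket Intel system with 8 physical cores per processor (assumed comparison).
intel = licenses_after_core_factor(2, 8, DEFAULT_CORE_FACTOR)

print(f"AMD: {amd:g} core licenses, Intel: {intel:g} core licenses")
print(f"AMD needs {amd / intel - 1:.0%} more licenses")
```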
Remember that the sys.dm_os_sys_info DMV cannot tell the difference between physical and logical cores. Running the query below will tell you how many logical cores are visible and how many physical CPUs you have.
```sql
-- Hardware information from SQL Server 2012
-- (Cannot distinguish between HT and multi-core)
SELECT cpu_count AS [Logical CPU Count], hyperthread_ratio AS [Hyperthread Ratio],
       cpu_count/hyperthread_ratio AS [Physical CPU Count],
       physical_memory_kb/1024 AS [Physical Memory (MB)], committed_target_kb/1024 AS [Committed Target Memory (MB)],
       max_workers_count AS [Max Workers Count], affinity_type_desc AS [Affinity Type],
       sqlserver_start_time AS [SQL Server Start Time], virtual_machine_type_desc AS [Virtual Machine Type]
FROM sys.dm_os_sys_info WITH (NOLOCK) OPTION (RECOMPILE);

-- Gives you some good basic hardware information about your database server
```
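To see why these columns cannot separate hyper-threading from true multi-core, here is a small Python model of the values involved. It is a simplification that assumes `hyperthread_ratio` reports the number of logical cores exposed by one physical processor package; the function name is hypothetical.

```python
def dmv_view(sockets, cores_per_socket, threads_per_core):
    """Model the CPU values the query would report for a given topology."""
    logical_per_socket = cores_per_socket * threads_per_core
    cpu_count = sockets * logical_per_socket
    return {
        "cpu_count": cpu_count,
        "hyperthread_ratio": logical_per_socket,
        "physical_cpu_count": cpu_count // logical_per_socket,
    }


# Two sockets, eight physical cores each, hyper-threading enabled.
with_ht = dmv_view(sockets=2, cores_per_socket=8, threads_per_core=2)

# Two sockets, sixteen physical cores each, no hyper-threading.
without_ht = dmv_view(sockets=2, cores_per_socket=16, threads_per_core=1)

print(with_ht)
print(without_ht)
print("Indistinguishable:", with_ht == without_ht)
```

Two very different processors produce the same three numbers, so the physical core count has to come from knowing the exact processor model.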
The post Two New TPC-E Submissions for SQL Server 2012 appeared first on Glenn Berry.
What is notable about this is that the 3218.46 score for a four-socket Xeon E7-4870 system is significantly higher than we have seen for similar four-socket Xeon E7-4870 systems in the past. An especially good comparison is between an IBM System x3850 X5 that was submitted on June 27, 2011 and this latest result for an IBM System x3850 X5 system that was submitted on November 28, 2012. As you can see in Table 1, the newer submission for the same model server has a 12.4% higher score than the older submission. This is for the exact same model server, with the exact same number and model of processors. The first big difference that jumps out is that the newer submission is running SQL Server 2012 Enterprise Edition on top of Windows Server 2012 Standard Edition, while the older submission is running SQL Server 2008 R2 Enterprise Edition on top of Windows Server 2008 R2 Enterprise Edition.
| Date | Model | Processor | Operating System | SQL Server Version/Edition | TPC-E Score |
| --- | --- | --- | --- | --- | --- |
| 6/27/2011 | System x3850 X5 | Xeon E7-4870 | Windows Server 2008 R2 Enterprise | SQL Server 2008 R2 Enterprise | 2862.61 |
| 11/28/2012 | System x3850 X5 | Xeon E7-4870 | Windows Server 2012 Standard | SQL Server 2012 Enterprise | 3218.46 |
Table 1: Comparing Two IBM System x3850 X5 TPC-E Submissions
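The 12.4% figure follows directly from the two scores in Table 1, as this short Python check shows (the variable names are illustrative only):

```python
older_score = 2862.61  # 6/27/2011 submission
newer_score = 3218.46  # 11/28/2012 submission

improvement_pct = (newer_score - older_score) / older_score * 100
print(f"Improvement: {improvement_pct:.1f}%")
```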
Could this 12.4% performance jump be simply due to the newer operating system and the newer version of SQL Server? It is very possible that there were some low level improvements in Windows Server 2012 that work in conjunction with SQL Server 2012 to improve performance (similar to what we saw with Windows Server 2008 R2 combined with SQL Server 2008 R2). With Windows Server 2008 R2, Microsoft did some low-level optimizations so that they could scale from 64 logical processors to 256 logical processors. This work also benefitted smaller systems with fewer logical processors. I think it is likely that some similar work was done with Windows Server 2012, so that it could scale from 256 logical processors to 640 logical processors, so that might explain some of the performance increase. I have some questions in to some of my friends at Microsoft, trying to get some more detailed information about this possibility.
It is also possible that there were improvements in SQL Server 2012 all by itself that contributed to the performance increase. Another possibility is that the TPC-E team at IBM just did a much better job on this newer system. If you dive deeper into the two submissions, you will notice some other differences in the hardware and the environment for the test. The newer submission is a system with 2048GB of RAM and (126) 200GB SAS SSDs for database storage, with a 13.3TB initial database size, while the older submission is a system with 1024GB of RAM and (90) 200GB SAS SSDs for database storage, with an 11.6TB initial database size. As long as you have sufficient I/O capacity to drive the TPC-E workload, the TPC-E score is usually limited by processor performance, so I don’t really think that the RAM and I/O differences are that significant here.
What do you think about this? I would love to hear your opinions and comments!
The post Deciding What Processor to Choose for SQL Server 2012 appeared first on Glenn Berry.
Since 2006, Intel has been using a Tick-Tock release model for their processors. What this means is that every two years, they have a Tock release that uses a completely new microarchitecture, which is followed a year later by a Tick release that has a manufacturing process technology shrink, but uses the same microarchitecture as the previous Tock release. Using a smaller process technology typically allows the processor to use less energy and have slightly better performance than the previous Tock release, but the performance jump is not nearly as great as you get with a Tock release. Tick releases are usually pin-compatible with the previous Tock release, so that lets the hardware systems vendors start using the Tick release processor in their existing models much more quickly, usually with just a BIOS update.
Table 1 shows the Tick-Tock release cadence for Intel processors from 2008 through 2016. The dates are obviously more speculative as we go further into the future, since Intel may decide to slow down their release cycle if AMD is unable to give them more viable competition in the next few years.
| Year | Type | Process | Code Name |
| --- | --- | --- | --- |
| 2008 | Tock | 45nm | Nehalem |
| 2010 | Tick | 32nm | Westmere |
| 2011 | Tock | 32nm | Sandy Bridge |
| 2012 | Tick | 22nm | Ivy Bridge |
| 2013 | Tock | 22nm | Haswell |
| 2014 | Tick | 14nm | Broadwell |
| 2015 | Tock | 14nm | Skylake |
| 2016 | Tick | 10nm | Skymont |
Table 1: Tick-Tock Release Listing
Figure 1 shows how the Tick-Tock model works, with the Tock release (in blue) using the existing manufacturing process technology, while the Tick release (in orange) moves to a new, smaller manufacturing process technology. New Intel processors are released first for the desktop market and then for the mobile market, followed later by the single-socket server market, the two-socket server market, and finally the four-socket (and above) server market. The four-socket server market does not always get every release because of its lower sales volume and slower release cycle. This explains why there has not been a Sandy Bridge-EX release for the four-socket market.
Figure 1: Tick-Tock Model
As you can see from Table 1 and Figure 1, Sandy Bridge is a Tock release that came after the Westmere Tick release. The Xeon E5 product family is Sandy Bridge-EP, which is a newer microarchitecture compared to the Xeon E7 product family, which is Westmere-EX. This difference is very important for SQL Server 2012 core-based licensing purposes! Sandy Bridge has significantly better single-threaded performance compared to Westmere and it also has lower physical core counts. Sandy Bridge-EP is available for both two-socket and four-socket servers, while Westmere-EX is available for two-socket, four-socket, and eight-socket servers.
Currently, we have the Intel Xeon E5-2600 product family (Sandy Bridge-EP) for the two-socket space, the Intel Xeon E5-4600 product family (Sandy Bridge-EP) for the four-socket space, along with the older Intel Xeon E7-2800 product family (Westmere-EX) for the two-socket space, the Intel Xeon E7-4800 product family (Westmere-EX) for the four-socket space, and the Intel Xeon E7-8800 product family (Westmere-EX) for the eight-socket space. The Intel Xeon E7 family was released in Q2 2011, the Xeon E5-2600 family was released in Q1 2012, and the Xeon E5-4600 family was released in Q2 2012. On November 5, 2012, Fujitsu published a new TPC-E OLTP benchmark result for a four-socket, Intel Xeon E5-4650 PRIMERGY RX500 S7 system with a score of 2651.27. This is the first published TPC-E result for the newer, four-socket capable Intel Xeon E5-4600 series, so I think it merits some comparison and discussion.
Table 2 shows the TPC-E scores for five systems that use the five different Sandy Bridge and Westmere processors that I have been discussing so far. It shows that the two-socket Xeon E5-2690 system has the best single-threaded performance (when you divide the raw score by the number of physical cores) and that the four-socket Xeon E5-4650 system comes in second place. We also see that the scaling goes down quite a bit as we move from two sockets to four sockets with the Xeon E5 family. If we had perfectly linear scaling, you would expect a four-socket system to have twice the score of a two-socket system that was using the same processor, which is not the case here: the four-socket score is only about 1.42 times the two-socket score. Part of this can be attributed to the clock speed difference between the 2.9GHz Xeon E5-2690 and the 2.7GHz Xeon E5-4650.
We can also see that the Intel Xeon E5 family does quite a bit better on TPC-E than the Intel Xeon E7 family does, which is no surprise, since we are comparing the newer Sandy Bridge-EP to the older Westmere-EX. From a performance perspective, the two-socket Xeon E5-2690 does much better than the two-socket Xeon E7-2870. In my opinion, you really should not be using the two-socket Xeon E7-2870 for SQL Server 2012 because of its lower single-threaded performance and higher physical core counts (which means a higher SQL Server 2012 licensing cost).
The four-socket Xeon E7-4870 system has a higher raw score than the four-socket E5-4650 system, but it has 40 physical cores compared to 32 physical cores, which means it will cost significantly more for SQL Server 2012 core licenses, while it will have lower single-threaded performance. Again, I would prefer a Xeon E5-4650 based system over a Xeon E7-4870 based system for an OLTP workload. You can also see that scaling takes a pretty big hit when you go from four-socket systems to eight-socket systems, even though these are all NUMA-based systems here.
| System | Sockets | Total Cores | Processor Model | TPC-E Score | TPC-E Score/Core |
| --- | --- | --- | --- | --- | --- |
| Fujitsu PRIMERGY RX300 S7 | 2 | 16 | Intel Xeon E5-2690 | 1871.81 | 116.99 |
| Fujitsu PRIMERGY RX500 S7 | 4 | 32 | Intel Xeon E5-4650 | 2651.27 | 82.85 |
| IBM System x3690 X5 | 2 | 20 | Intel Xeon E7-2870 | 1560.70 | 78.04 |
| IBM System x3850 X5 | 4 | 40 | Intel Xeon E7-4870 | 2862.61 | 71.57 |
| NEC Express5800/A1080a-E | 8 | 80 | Intel Xeon E7-8870 | 4614.22 | 57.68 |
Table 2: TPC-E Score Comparisons for Selected Intel Processors
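The per-core scores and the scaling drop-off can be recomputed from the raw numbers in Table 2. This Python sketch defines scaling efficiency as the actual score divided by the score that perfectly linear scaling would predict from the smaller system; that definition and the function names are my own framing rather than anything TPC-E itself specifies. Keep in mind that the E5 comparison mixes two processor models with different clock speeds.

```python
# Processor model -> (sockets, physical cores, TPC-E score), copied from Table 2.
RESULTS = {
    "E5-2690": (2, 16, 1871.81),
    "E5-4650": (4, 32, 2651.27),
    "E7-2870": (2, 20, 1560.70),
    "E7-4870": (4, 40, 2862.61),
    "E7-8870": (8, 80, 4614.22),
}


def score_per_core(model):
    _, cores, score = RESULTS[model]
    return score / cores


def scaling_efficiency(smaller, larger):
    """Actual score of the larger system relative to perfectly linear scaling."""
    small_sockets, _, small_score = RESULTS[smaller]
    large_sockets, _, large_score = RESULTS[larger]
    ideal_score = small_score * (large_sockets / small_sockets)
    return large_score / ideal_score


for model in RESULTS:
    print(f"{model}: {score_per_core(model):.2f} per core")

print(f"E5, 2 to 4 sockets: {scaling_efficiency('E5-2690', 'E5-4650'):.1%}")
print(f"E7, 2 to 4 sockets: {scaling_efficiency('E7-2870', 'E7-4870'):.1%}")
print(f"E7, 4 to 8 sockets: {scaling_efficiency('E7-4870', 'E7-8870'):.1%}")
```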
| System | Sockets | Total Cores | Processor Model | TPC-E Score | SQL 2012 License Cost | Cost/TPC-E |
| --- | --- | --- | --- | --- | --- | --- |
| Fujitsu PRIMERGY RX300 S7 | 2 | 16 | Intel Xeon E5-2690 | 1871.81 | $109,984 | $58.76/TPC-E |
| Fujitsu PRIMERGY RX500 S7 | 4 | 32 | Intel Xeon E5-4650 | 2651.27 | $219,968 | $82.97/TPC-E |
| IBM System x3690 X5 | 2 | 20 | Intel Xeon E7-2870 | 1560.70 | $137,480 | $88.09/TPC-E |
| IBM System x3850 X5 | 4 | 40 | Intel Xeon E7-4870 | 2862.61 | $274,960 | $96.05/TPC-E |
| NEC Express5800/A1080a-E | 8 | 80 | Intel Xeon E7-8870 | 4614.22 | $549,920 | $119.18/TPC-E |
Table 3: SQL Server 2012 Enterprise Edition License Cost Comparisons by TPC-E Score
Table 3 shows the same five systems with the SQL Server 2012 Enterprise Edition license cost information added. This shows that a two-socket system with Xeon E5-2690 processors gives you the lowest licensing cost per TPC-E score, while Table 2 shows that it also gives you the best TPC-E score per physical processor core. Unless you must have more than 384GB of RAM (with affordable 16GB DIMMs) or more than 768GB of RAM (with much more expensive 32GB DIMMs), there are not too many reasons to go with a higher core-count system for an OLTP workload.
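The license cost columns in Table 3 can be rebuilt from the core counts and scores. In this Python sketch, the per-core price is not quoted anywhere in the text; it is the value implied by dividing the first row's license cost by its core count ($109,984 / 16), and the function names are illustrative.

```python
PRICE_PER_CORE = 6874  # USD per core license, implied by Table 3 ($109,984 / 16 cores)

# (processor model, physical cores, TPC-E score), copied from Table 3.
SYSTEMS = [
    ("Xeon E5-2690", 16, 1871.81),
    ("Xeon E5-4650", 32, 2651.27),
    ("Xeon E7-2870", 20, 1560.70),
    ("Xeon E7-4870", 40, 2862.61),
    ("Xeon E7-8870", 80, 4614.22),
]


def license_cost(cores):
    return cores * PRICE_PER_CORE


def cost_per_tpce(cores, score):
    return license_cost(cores) / score


for model, cores, score in SYSTEMS:
    print(f"{model}: ${license_cost(cores):,} total, "
          f"${cost_per_tpce(cores, score):.2f} per TPC-E point")
```

Sorting the systems by the last figure gives the same ranking as the per-core scores, which is why lower core counts with higher per-core performance win on license cost.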
One possible reason is a concern that a two-socket Xeon E5-2690 system, with two processors and a total of 16 physical cores, simply does not have enough computing capacity for your total database workload. Depending on the magnitude of your workload, that may be true. If you are currently running a four-socket or larger system that is more than a couple of years old, it may not be true. Bigger systems are not faster systems, and the total load capacity of two-socket systems has increased dramatically in the last year with Sandy Bridge-EP. If you are convinced that a two-socket Xeon E5-2690 cannot handle your workload, I would look at a four-socket Xeon E5-4650 system, which also lets you go up to 1.5TB of RAM with 32GB DIMMs. Keep in mind that both Xeon E5-2690 and Xeon E5-4650 systems have PCI-E 3.0 support, which gives you twice the I/O bandwidth of the older PCI-E 2.0 standard found in Westmere-EX servers.
If all of this has made your head hurt, you can always contact us for some deeper hardware consulting!
The post Memory Error Recovery in SQL Server 2012 appeared first on Glenn Berry.
There was a presentation at TechEd 2012, called “The Path to Continuous Availability with Windows Server 2012” that talked about this being a new feature in Windows Server 2012, which implies that you will need to be running SQL Server 2012 on top of Windows Server 2012 to get this functionality.
In Windows Server 2012, the feature is called Application Assisted Memory Error Recovery, and it requires the application (such as SQL Server 2012) to register for notifications of bad memory page events using CreateMemoryResourceNotification(). It also requires SQL Server 2012 to use the API QueryWorkingSetEx() to scan the memory for bad pages.
It is likely an Enterprise Edition-only feature, but I have not confirmed this assumption yet.
You will also need ECC RAM, and a processor with a memory controller that supports this. I don’t have a list of processors that support this feature yet, but I am working on it. If I had to guess, I would assume that Intel Nehalem and newer, and AMD Magny-Cours and newer will probably be required.
If you have the hardware support, along with both Windows Server 2012 and SQL Server 2012, you will see a message like this in your SQL Server error log:
Machine supports memory error recovery. SQL memory protection is enabled to recover from memory corruption.
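If you have a copy of the error log as text, a few lines of Python are enough to check for that message. This is only a sketch: the function name is made up, the sample lines are placeholders rather than real log output, and the file path and UTF-16 encoding in the trailing comment are assumptions you should verify against your own server.

```python
# Message text quoted above; matching on the first sentence only.
MESSAGE = "Machine supports memory error recovery"


def memory_error_recovery_enabled(log_lines):
    """Return True if any line contains the memory error recovery message."""
    needle = MESSAGE.lower()
    return any(needle in line.lower() for line in log_lines)


# Placeholder lines for illustration, not real error log output.
sample_lines = [
    "some unrelated startup message",
    "Machine supports memory error recovery. SQL memory protection is "
    "enabled to recover from memory corruption.",
]
print(memory_error_recovery_enabled(sample_lines))

# Hypothetical usage against a copied log file (path and encoding are assumptions):
# with open("ERRORLOG", encoding="utf-16") as log_file:
#     print(memory_error_recovery_enabled(log_file))
```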
There are a few prerequisites that you must satisfy, but this is still an interesting feature. It is one more argument that you can use when you are trying to make the case to upgrade to SQL Server 2012, on a new server with the latest version of Windows Server.