hardware consulting Archives - Glenn Berry

Intel Cascade Lake-SP Processor Analysis for SQL Server

Glenn Berry — Wed, 03 Apr 2019 03:43:16 +0000

Introduction

On April 2, 2019, Intel held its Data-Centric Innovation Day, where it announced and described a number of new products for data center use. Most relevant from a SQL Server perspective is the 2nd Generation Intel Xeon Scalable Processor family, aka Cascade Lake-SP. This line of 14nm processors is the successor to the existing 14nm Intel Xeon Scalable Processor family (Skylake-SP) that was released in Q3 2017. These new processors will work with existing server models (with a BIOS update), so there should be no delay waiting on server vendors to do a model refresh.

Cascade Lake-SP Improvements

This new family of processors has minor base and turbo clock speed improvements (typically 200 MHz). These processors also support DDR4-2933 RAM (at two DIMMs per channel) and 256GB LRDIMMs. This means you can have up to 1.5TB of RAM per socket with the base SKUs (those without an M or L suffix). That is a doubling of memory capacity compared to Skylake-SP. Some of the mid-range Cascade Lake-SP SKUs have larger L3 cache sizes compared to the equivalent Skylake-SP SKUs. Cascade Lake-SP also has Optane DC Persistent Memory support and hardware-level Spectre and Meltdown mitigations. Unfortunately, there is no PCIe 4.0 support with Cascade Lake-SP.

Cascade Lake-SP Regressions

Cascade Lake-SP has some issues for SQL Server usage, not from a technical or performance perspective, but from a product segmentation perspective. First, Intel has introduced a number of new model number letter suffixes, which make processor selection more complicated and potentially much more expensive.

The complete list of SKU suffix letters is as follows:

No letter = Normal Memory Support (1.5 TB)
M = Medium Memory Support (2.0 TB)
L = Large Memory Support (4.5 TB)
Y = Speed Select Models
N = Networking/NFV Specialized
V = Virtual Machine Density Value Optimized
T = Long Life Cycle/Thermal
S = Search Optimized

Confused yet? Suffice it to say, you will want to avoid those specialized SKUs for most SQL Server usage, with the possible exception of the M or L models if you need higher memory density. Another exception might be the “Y”, Speed Select (SST) SKUs, which let you pin workloads to specific cores (which can have an increased base clock speed) while the other cores have a reduced base clock speed. Another variant of Speed Select (SST-PP) lets you vary the number of cores and clock speeds at boot time. This feature would probably be in violation of current SQL Server licensing, where Microsoft expects you to pay for all of the physical cores in a machine, whether they are enabled or not.

If you decide to use Intel Optane DC Persistent Memory, your maximum memory speed will be reduced to DDR4-2666. Intel has not released pricing for Intel Optane DC Persistent Memory yet, which means that it will be expensive (but less expensive/GB than DDR4 RAM).

Missing SKUs

A bigger issue for SQL Server usage is the fact that Intel has apparently dropped at least two of their frequency-optimized SKUs from the previous generation. Based on today’s information, I don’t see a 12-core Intel Xeon Gold 6246 or a 6-core Intel Xeon Gold 6228. These would replace the previous Intel Xeon Gold 6146 and Intel Xeon Gold 6128. There don’t appear to be any 6-core SKUs outside of the Intel Xeon Bronze 3204 (which would be a terrible choice for SQL Server usage).

In fact, there are only five specific Cascade Lake-SP SKUs that I really like for SQL Server usage (if you want the best single-threaded performance possible). These include:

Intel Xeon Platinum 8280 (28 cores)
Intel Xeon Platinum 8268 (24 cores)
Intel Xeon Gold 6254 (18 cores)
Intel Xeon Gold 6244 (8 cores)
Intel Xeon Gold 5222 (4 cores)

All of these SKUs have slightly higher base and turbo clock speeds than their direct Skylake-SP predecessors. All of them (except the Platinum 8268) are the same price as their predecessors. The Platinum 8268 has a larger L3 cache than the Platinum 8168, which somewhat justifies a price increase. The problem is that missing 6-core SKU and the big gap between the 8-core and the 18-core SKUs. That gap represents about $142K in SQL Server 2017 Enterprise Edition license costs in a two-socket server.
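The $142K figure can be checked with a quick calculation. The sketch below assumes a list price of $7,128 per core for SQL Server 2017 Enterprise Edition (sold as two-core packs at $14,256). That price is not stated in this post, and the function name is just for illustration; actual pricing depends on your licensing agreement.

```python
# Assumed list price per core for SQL Server 2017 Enterprise Edition.
# This figure is an assumption for illustration; it is not given in the post.
PRICE_PER_CORE_USD = 7128


def license_gap(cores_low, cores_high, sockets=2, price_per_core=PRICE_PER_CORE_USD):
    """Extra license cost of moving from the lower to the higher core-count SKU."""
    extra_cores = (cores_high - cores_low) * sockets
    return extra_cores * price_per_core


# Gap between the 8-core Gold 6244 and the 18-core Gold 6254, two sockets.
gap = license_gap(8, 18)
print(f"Extra cores to license: {(18 - 8) * 2}")
print(f"Extra license cost: ${gap:,}")
```

Under that assumed price, the 20 additional cores work out to roughly the $142K quoted above.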

Figures 1 and 2 show the main specifications for my preferred SKUs for Cascade Lake-SP and Skylake-SP (for SQL Server usage).

Figure 1: Preferred Cascade Lake-SP SKUs

As you can see, there were fewer large gaps in the core counts of these “good” processor choices for SQL Server usage with Skylake-SP.

Figure 2: Preferred Skylake-SP SKUs

Initial TPC-E Results

We already have the first TPC-E submission for a system using Cascade Lake-SP processors. Lenovo recently submitted a result for a two-socket Lenovo ThinkSystem SR650 with two Intel Xeon Platinum 8280 processors. This system had a score of 7012.53. If you divide that score by 56 physical cores, you get a result of 125.22/core.

Lenovo previously submitted a result for an essentially identical Lenovo ThinkSystem SR650 with two Intel Xeon Platinum 8180 processors. This system had a score of 6779.53. If you divide that score by 56 physical cores, you get a result of 121.06/core.

That is about a 3.4% improvement. The difference in base clock speed is 8%. Both systems are running SQL Server 2017 Enterprise Edition on Windows Server 2016 Standard Edition. There may be some minor configuration differences between the two systems, but I have not spelunked into the full disclosure reports to determine that yet.
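The per-core arithmetic above is easy to reproduce. This is a minimal sketch using the two published scores quoted in this post; the base clock speeds (2.7GHz for the Platinum 8280 and 2.5GHz for the Platinum 8180) come from Intel's specifications rather than from the text, and the variable names are my own.

```python
CORES = 56  # two 28-core processors in each system


def per_core(score, cores=CORES):
    """TPC-E score divided by physical core count."""
    return score / cores


cascade_lake = per_core(7012.53)  # two Xeon Platinum 8280
skylake = per_core(6779.53)       # two Xeon Platinum 8180

improvement_pct = (cascade_lake / skylake - 1) * 100

# Base clocks are an assumption taken from Intel's spec sheets, not the post.
base_clock_gain_pct = (2.7 / 2.5 - 1) * 100

print(f"Cascade Lake-SP: {cascade_lake:.2f}/core")
print(f"Skylake-SP:      {skylake:.2f}/core")
print(f"Score gain: {improvement_pct:.1f}%, base clock gain: {base_clock_gain_pct:.0f}%")
```

The score gain is well under the base clock gain, which is consistent with there being no meaningful IPC improvement.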

Conclusion

Cascade Lake-SP will give you marginally better performance at the same core counts compared to Skylake-SP. This is primarily due to the higher base and turbo clock speeds. Higher memory bandwidth and hardware-level Spectre/Meltdown protection will also help in some scenarios. Most of the other Cascade Lake-SP improvements are focused on HPC and AI workloads, and will not be beneficial to SQL Server 2017/2019. Intel is not claiming any significant IPC improvements in Cascade Lake-SP, which seems to be confirmed by the first TPC-E result. Intel Optane DC Persistent Memory may be useful, depending on how much you can leverage it with SQL Server 2019.

Honestly, I am pretty underwhelmed by Cascade Lake-SP so far, at least for SQL Server. It is slightly better than Skylake-SP, assuming the frequency-optimized core count gaps don’t force you to license more cores than you wanted to. Intel should be very concerned about the upcoming 7nm AMD EPYC “Rome” server processors. These AMD processors will have up to 64C/128T, higher memory density, and more PCIe lanes (with PCIe 4.0 instead of PCIe 3.0). They also may have higher single-threaded performance than Cascade Lake-SP. This is especially likely if AMD decides to offer more frequency-optimized SKUs, like the existing AMD EPYC 7371 from the “Naples” generation.

The post Intel Cascade Lake-SP Processor Analysis for SQL Server appeared first on Glenn Berry.

Using TPC-E OLTP Benchmark Scores to Compare Processors

Glenn Berry — Thu, 18 Jul 2013 18:57:24 +0000

One of the things I do at SQLskills is paid consulting for customers who are looking to upgrade their database servers to new hardware, a new operating system, and a new version of SQL Server. Part of this process is a comparison of the estimated TPC-E score of the existing system to the estimated TPC-E score of the new system. Here is an example of the type of analysis that I do as part of that process.

Imagine a legacy system that is a Dell PowerEdge 2950 with one 45nm, quad-core, 3.0GHz Intel Xeon X5450 “Harpertown” processor, along with 64GB of RAM. That processor has a 1333MHz FSB and a 12MB L2 cache. It has the 45nm Core2 Quad “Harpertown” microarchitecture, which means that it does not support Intel hyper-threading or Intel Turbo Boost, and it uses the older symmetric multiprocessing (SMP) architecture instead of the newer non-uniform memory access (NUMA) architecture.

Nearest TPC-E Comparable Result for Existing System

There is a TPC-E result from 12/11/2007 for a Dell PowerEdge 2900 system with one 65nm, quad-core, 2.66GHz Intel Xeon X5355 “Clovertown” processor, along with 48GB of RAM. That processor has a 1333MHz FSB and an 8MB L2 cache. It has the 65nm Core2 Quad “Clovertown” microarchitecture, which means that it also does not support Intel hyper-threading or Intel Turbo Boost, and it also uses the older SMP architecture. The Intel Xeon 5300 series is one Intel Tick release older than the Intel Xeon 5400 series, so there is a relatively small difference in their relative performance. This actual TPC-E score is 144.88. The Dell system from 2007 was running SQL Server 2005 on Windows Server 2003.

Comparing that Dell TPC-E system to the existing system, we have to make some adjustments to account for the clock speed difference, the L2 cache size difference, and the Intel Tick release difference. A 3.0GHz clock speed is 12.4% higher than 2.66GHz, and I estimate that the combination of a larger L2 cache and the newer Tick release would be another 10% difference. If we multiply 144.88 times 1.224, we get a result of 177.33 as an estimated TPC-E score for the current legacy system.
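As a quick check, the adjustment can be written out as below. This is only a sketch of the arithmetic in the paragraph above, using the figures exactly as given in the text; the constant names are my own.

```python
BASELINE_SCORE = 144.88           # published result for the Xeon X5355 system
CLOCK_ADJUSTMENT = 0.124          # clock speed figure used in the text
CACHE_AND_TICK_ADJUSTMENT = 0.10  # estimate for larger L2 cache plus newer Tick

# The text adds the two adjustments together to get a factor of 1.224.
factor = 1 + CLOCK_ADJUSTMENT + CACHE_AND_TICK_ADJUSTMENT
legacy_estimate = BASELINE_SCORE * factor

print(f"Adjustment factor: {factor:.3f}")
print(f"Estimated TPC-E score for the Xeon X5450 system: {legacy_estimate:.2f}")
```

Note that the two adjustments are summed here, as in the text, rather than compounded (1.124 × 1.10 would give a slightly higher factor of about 1.236).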

Nearest TPC-E Comparable Result for New System

There is also a TPC-E result from 11/21/2012 for an HP ProLiant DL380p Gen8 system with two 32nm, eight-core, 2.9GHz Intel Xeon E5-2690 “Sandy Bridge-EP” processors, along with 256GB of RAM. This has the 32nm Sandy Bridge-EP microarchitecture, which means that it supports both Intel hyper-threading and Intel Turbo Boost, and it uses the newer NUMA architecture. It also has PCI-E 3.0 support. The actual TPC-E result for this system is 1881.76. This system is running on Windows Server 2012 and SQL Server 2012.

Since we want to minimize our SQL Server 2012 core-based license costs, we are considering only using one actual Xeon E5-2600 series processor in the new server, possibly with a lower core count. The best choices for SQL Server 2012 are the four-core 3.3GHz Intel Xeon E5-2643, the six-core 2.9GHz Intel Xeon E5-2667, and the eight-core 2.9GHz Intel Xeon E5-2690. These three processors have slightly different base and Turbo clock speeds and different L3 cache sizes (although the size per core is the same) and different core counts that must be accounted for. We also need to account for the fact that we will only have one physical processor in the system instead of two.

With a NUMA architecture in a two-socket machine, you will get quite good scaling as you go from one processor to two processors. I believe we should use an estimate of 55% (i.e., one processor will deliver about 55% of the performance of two identical processors in the NUMA architecture system). We will have to adjust for the core-count difference in the six-core and quad-core processors. We also need to adjust for the higher base clock speed in the quad-core Xeon E5-2643 system.

The two-socket Xeon E5-2690 system has an actual TPC-E score of 1881.76. If we multiply that by .55 we get an estimated TPC-E score of 1034.97 with one Xeon E5-2690. If we multiply that by .75, we get an estimated TPC-E score of 776.23 with one Xeon E5-2667.

If we take the 1034.97 estimate for a single eight-core Xeon E5-2690 and multiply that by .50, we get a result of 517.49 for the four-core Xeon E5-2643. We also need to multiply that by 1.138 to account for the 3.3GHz base clock speed compared to the base 2.9GHz clock speed. This gives us an estimated TPC-E score of 588.90 for a single Xeon E5-2643 processor.

The table below summarizes these TPC-E score estimates.

Processor	Physical Cores	Estimated TPC-E Score
Xeon X5450	4	177.33
Xeon E5-2643	4	588.90
Xeon E5-2667	6	776.23
Xeon E5-2690	8	1034.97
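The new-system estimates in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses the scaling factors stated above (55% for one socket, linear scaling by core count, and the rounded 1.138 clock speed ratio); the variable names are illustrative only.

```python
TWO_SOCKET_SCORE = 1881.76  # published result, two Xeon E5-2690 processors
ONE_SOCKET_FACTOR = 0.55    # estimated share of the score from one processor
CLOCK_RATIO_2643 = 1.138    # 3.3GHz vs 2.9GHz base clock, as rounded in the text

# One eight-core E5-2690.
e5_2690 = TWO_SOCKET_SCORE * ONE_SOCKET_FACTOR
# One six-core E5-2667: same base clock, 6/8 of the cores.
e5_2667 = e5_2690 * (6 / 8)
# One four-core E5-2643: 4/8 of the cores, higher base clock.
e5_2643 = e5_2690 * (4 / 8) * CLOCK_RATIO_2643

for name, cores, score in [
    ("Xeon E5-2643", 4, e5_2643),
    ("Xeon E5-2667", 6, e5_2667),
    ("Xeon E5-2690", 8, e5_2690),
]:
    print(f"{name}\t{cores}\t{score:.2f}")
```

The sketch carries full precision through each step instead of rounding intermediate results, which still lands on the same two-decimal values shown in the table.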


A SQL Server Hardware Tidbit a Day – Day 14

Glenn Berry — Sun, 14 Apr 2013 22:18:18 +0000

For Day 14 of this series, I want to give my current recommended Intel Xeon server processors for different sizes of database servers and different workload types.

My basic premise is that for a database server running SQL Server 2008 R2 or earlier, you want the very best processor available for each physical socket in the server (since SQL Server 2008 R2 Processor licenses are relatively expensive). With SQL Server 2012 Enterprise Edition, you need to worry about the physical core counts in your processors, so there are some situations where you might want to choose a “frequency-optimized” model processor that has fewer physical cores but a higher base clock speed than the top-tier processor that has a higher number of physical cores. An example would be choosing a four-core Intel Xeon E5-2643 instead of an eight-core Xeon E5-2690 processor.

Unlike with a laptop or web server, you usually don’t want to pick a processor for a database server that is one or two models down from the most expensive, “top of the line” model. With SQL Server 2012 Enterprise Edition, you certainly don’t want to select a slower, less expensive processor that has the same number of physical cores as a slightly more expensive processor from that same processor family and generation.

You will most likely be stuck with whatever processor you choose for the life of the server, since it rarely makes economic sense to upgrade the processors in an existing server. You can also use any “excess” processor capacity for things like data compression or backup compression, to reduce the pressure on your I/O subsystem. Trading CPU utilization for I/O utilization is usually a net win, especially if you have a modern, multi-core processor that can shrug off the extra work.

These recommendations will change when the Xeon E3-1200 v3 series is released in June 2013, and again when the E5-2600 v2 series is released in Q3 of 2013 and the E7-2800, 4800 and 8800 v2 series are released in Q4 of 2013.

So here is my recommended Intel Xeon server processor list:

One-socket server (OLTP workloads)
Xeon E3-1290 v2 (22nm Ivy Bridge)
•    3.7GHz, 8MB L3 Cache, 5.0 GT/s Intel QPI 1.1
•    Four cores plus hyper-threading, Turbo Boost 2.0 (4.1GHz)
•    Two memory channels, 32GB max memory capacity

One-socket server (DW/DSS workloads)
Xeon E5-2470 (32nm Sandy Bridge-EN)
•    2.3GHz, 20MB L3 Cache, 8.0 GT/s Intel QPI 1.1
•    Eight cores plus hyper-threading, Turbo Boost 2.0 (3.1GHz)
•    Three memory channels, 96GB max memory capacity

Two-socket server (OLTP workloads)
Xeon E5-2690 (32nm Sandy Bridge-EP)
•    2.9GHz, 20MB L3 Cache, 8.0 GT/s Intel QPI 1.1
•    Eight cores plus hyper-threading, Turbo Boost 2.0 (3.8GHz)
•    Four memory channels, 384GB max memory capacity (16GB DIMMs)

Two-socket server (DW/DSS workloads)
Xeon E7-2870 (32nm Westmere-EX)
•    2.40GHz, 30MB L3 Cache, 6.40 GT/s Intel QPI 1.0
•    Ten cores plus hyper-threading, Turbo Boost 2.0 (2.8GHz)
•    Four memory channels, 512GB max memory capacity (16GB DIMMs)

Four-socket server (OLTP workloads)
Xeon E5-4650 (32nm Sandy Bridge-EP)
•    2.7GHz, 20MB L3 Cache, 8.0 GT/s Intel QPI 1.1
•    Eight cores plus hyper-threading, Turbo Boost 2.0 (3.3GHz)
•    Four memory channels, 768GB max memory capacity (16GB DIMMs)

Four-socket server (DW/DSS workloads)
Xeon E7-4870 (32nm Westmere-EX)
•    2.40GHz, 30MB L3 Cache, 6.40 GT/s Intel QPI 1.0
•    Ten cores plus hyper-threading, Turbo Boost 2.0 (2.8GHz)
•    Four memory channels, 1TB max memory capacity (16GB DIMMs)

Eight-socket server (Any workload type)
Xeon E7-8870 (32nm Westmere-EX)
•    2.40GHz, 30MB L3 Cache, 6.40 GT/s Intel QPI 1.0
•    Ten cores plus hyper-threading, Turbo Boost 2.0 (2.8GHz)
•    Four memory channels, 2TB max memory capacity
