TPC-E Single-Threaded Performance Leaderboard

Fujitsu recently posted a new TPC-E benchmark result of 3777.08 for SQL Server 2014, using a two-socket server with the 18-core, 22nm Intel Xeon E5-2699 v3 (Haswell-EP) processor. This is the highest raw TPC-E score ever posted for a two-socket server, which sounds quite impressive on the surface.

One thing that I have been doing for years is to take the actual, raw TPC-E score and divide it by the number of physical cores in the system (which is how SQL Server 2012/2014 is licensed on physical servers) to come up with a “Score/Core” figure, as shown in Table 1. This simple calculation helps you evaluate the single-threaded performance of a particular processor, which is very relevant for OLTP workloads. Looking at the TPC-E results like this, the Intel Xeon E5-2699 v3 comes in at seventh place on the TPC-E Single-Threaded Performance Leaderboard. Why is this?

The server vendors (who put together these official TPC-E submissions) will always use the “top of the line” processor for a particular model server for one of these benchmark efforts. This top-level SKU is going to have the highest core count available from a particular CPU family and generation. Unfortunately, the highest core count processors from a particular CPU family and generation will run at lower base and turbo clock speeds than the lower core count, “frequency optimized” models from that same CPU family and generation. This means that the Score/Core result tends to decrease as the number of cores increases. This is partially offset by the architectural improvements that are added to each new generation processor, but those improvements usually don’t make up completely for the lower clock speeds.

So what relevance does this have for the average database professional?

Well, think about how much it would cost to purchase 36 processor core licenses for SQL Server 2014 Enterprise Edition. The answer is about $247,392.00, which is about ten times what a fully-loaded two-socket server would cost. If you were to choose the eight-core Intel Xeon E5-2667 v3 processor, with its much higher 3.2GHz base clock speed, you would need only 16 core licenses, which would cost about $109,952.00. You would also probably get 30-35% better single-threaded performance than with the 18-core model, while losing perhaps 35-40% of your total processor capacity.
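The license arithmetic above can be checked with a minimal Python sketch. It assumes the per-core price implied by the $247,392.00 figure (that total divided by 36 cores); the variable and function names are illustrative only.

```python
# Per-core license price implied by the 36-core total quoted in the text.
TOTAL_36_CORE_COST = 247_392.00
per_core = TOTAL_36_CORE_COST / 36


def license_cost(cores):
    """Cost of core licenses for a server with the given physical core count."""
    return cores * per_core


cost_18_core_cpus = license_cost(2 * 18)  # two-socket server, 18-core E5-2699 v3
cost_8_core_cpus = license_cost(2 * 8)    # two-socket server, 8-core E5-2667 v3
savings = cost_18_core_cpus - cost_8_core_cpus

print(f"Price per core license:      ${per_core:,.2f}")
print(f"36 core licenses (2 x 18):   ${cost_18_core_cpus:,.2f}")
print(f"16 core licenses (2 x 8):    ${cost_8_core_cpus:,.2f}")
print(f"License savings:             ${savings:,.2f}")
```

The implied price works out to $6,872.00 per core, so dropping from 36 to 16 cores saves $137,440.00 in licenses on a single server.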

If you are worried about total capacity, you could even buy a second server (if you could split your workload), and save enough on the license costs (32 core licenses vs. 36 core licenses) to pay for the second server. If you did this, you would have more total processor capacity, double the RAM, and much better OLTP performance. Remember, the actual raw TPC-E score is a gauge of the total processor capacity of the system, while the Score/Core helps you evaluate single-threaded processor performance.
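Here is a rough Python sketch of that two-server comparison. The per-core price is the one implied by the license figures quoted earlier, the server price is only the "about one tenth of the 36-core license cost" estimate from the text, and the capacity range comes from the 35-40% capacity loss estimate for a single 16-core server. All of these are approximations, not quotes or measurements.

```python
PRICE_PER_CORE = 6872.00  # implied by $247,392.00 / 36 core licenses

one_big_server = 36 * PRICE_PER_CORE         # one server, 2 x 18 cores
two_small_servers = 2 * 16 * PRICE_PER_CORE  # two servers, 2 x 8 cores each
license_savings = one_big_server - two_small_servers

# Rough server price: the text says the 36-core license bill is about
# ten times the cost of a fully-loaded two-socket server.
approx_server_cost = one_big_server / 10

# One 16-core server is estimated to lose 35-40% of the capacity of the
# 36-core server, so two of them have roughly 1.2x to 1.3x the capacity.
capacity_low = 2 * (1 - 0.40)
capacity_high = 2 * (1 - 0.35)

print(f"License savings (36 vs. 32 cores): ${license_savings:,.2f}")
print(f"Approximate cost of one server:    ${approx_server_cost:,.2f}")
print(f"Relative capacity of two servers:  {capacity_low:.2f}x to {capacity_high:.2f}x")
```

Under these assumptions the four-license difference roughly covers the price of the second server, which is the point of the comparison.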

I really wish the server vendors would take the relatively easy and inexpensive step of testing their benchmark configurations with different model processors. Once they had everything set up and tuned for the high core count flagship processor, they could simply repeat the test runs and validation process for some of the more interesting lower core count “frequency optimized” processor models, and submit those results. TPC could help by listing the Score/Core results for all of the TPC-E benchmark submissions.

 

tpsE     Score/Core  System                       Processor              Total Cores  Sockets
-------  ----------  ---------------------------  ---------------------  -----------  -------
1881.76  117.61      HP ProLiant DL380p Gen8      Intel Xeon E5-2690     16           2
1871.81  116.99      PRIMERGY RX300 S7            Intel Xeon E5-2690     16           2
1863.23  116.45      IBM System x3650 M4          Intel Xeon E5-2690     16           2
2590.93  108         IBM System x3650 M4          Intel Xeon E5-2697 v2  24           2
1284.14  107.01      HP ProLiant DL380 G7 Server  Intel Xeon X5690       12           2
1268.3   105.69      PRIMERGY RX300 S6 12×2.5     Intel Xeon X5690       12           2
3777.08  104.92      PRIMERGY RX2540 M1           Intel Xeon E5-2699 v3  36           2
1246.13  103.84      PRIMERGY RX300 S6            Intel Xeon X5680       12           2
2472.58  103.02      PRIMERGY RX300 S8            Intel Xeon E5-2697 v2  24           2
817.15   102.14      IBM System x3650 M2          Intel Xeon X5570       8            2

Table 1: TPC-E Single-Threaded Performance Leaderboard
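If you want to reproduce the Score/Core ranking yourself, here is a minimal Python sketch. The data is copied from Table 1; the function and variable names are just illustrative.

```python
# (tpsE, system, processor, total physical cores), copied from Table 1
results = [
    (1881.76, "HP ProLiant DL380p Gen8", "Intel Xeon E5-2690", 16),
    (1871.81, "PRIMERGY RX300 S7", "Intel Xeon E5-2690", 16),
    (1863.23, "IBM System x3650 M4", "Intel Xeon E5-2690", 16),
    (2590.93, "IBM System x3650 M4", "Intel Xeon E5-2697 v2", 24),
    (1284.14, "HP ProLiant DL380 G7 Server", "Intel Xeon X5690", 12),
    (1268.30, "PRIMERGY RX300 S6 12x2.5", "Intel Xeon X5690", 12),
    (3777.08, "PRIMERGY RX2540 M1", "Intel Xeon E5-2699 v3", 36),
    (1246.13, "PRIMERGY RX300 S6", "Intel Xeon X5680", 12),
    (2472.58, "PRIMERGY RX300 S8", "Intel Xeon E5-2697 v2", 24),
    (817.15, "IBM System x3650 M2", "Intel Xeon X5570", 8),
]


def score_per_core(tpse, cores):
    """Raw TPC-E score divided by the number of physical cores."""
    return tpse / cores


# Rank by Score/Core, highest first.
leaderboard = sorted(
    results, key=lambda r: score_per_core(r[0], r[3]), reverse=True
)

for rank, (tpse, system, cpu, cores) in enumerate(leaderboard, start=1):
    print(f"{rank:2d}  {score_per_core(tpse, cores):7.2f}  "
          f"{tpse:8.2f}  {cores:2d}  {cpu}  ({system})")
```

Sorting this way puts the 36-core Intel Xeon E5-2699 v3 result in seventh place, even though it has by far the highest raw score.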

SQL Server 2014 RTM CU4

Microsoft has released SQL Server 2014 RTM CU4, which is Build 12.0.2430. This cumulative update has 54 hotfixes in the public fix list, which is a fairly large number. As usual, I think you should take a close look at the list of hotfixes, and then think pretty seriously about going through the testing and planning needed to get this CU deployed on your servers.

If you are working on a new SQL Server 2014 instance, I think it is almost a no-brainer to deploy CU4 before you go to Production.
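If you track instance builds in a script, a check like the following Python sketch can flag SQL Server 2014 instances that are below the CU4 build. It assumes you already have the version string in hand (for example, the four-part value that SERVERPROPERTY('ProductVersion') returns); the function names are hypothetical and the check only applies to the 12.0 (SQL Server 2014) branch.

```python
CU4_BUILD = (12, 0, 2430)  # SQL Server 2014 RTM CU4, per the text above


def parse_build(version):
    """Turn a string like '12.0.2430.0' into (major, minor, build)."""
    return tuple(int(part) for part in version.split(".")[:3])


def has_cu4_or_later(version):
    """True if a SQL Server 2014 (12.0.x) build is at or above RTM CU4."""
    build = parse_build(version)
    return build[:2] == CU4_BUILD[:2] and build >= CU4_BUILD


# Example version strings; 12.0.2000.8 is the original RTM build.
for version in ("12.0.2000.8", "12.0.2430.0"):
    status = "OK" if has_cu4_or_later(version) else "needs CU4"
    print(f"{version}: {status}")
```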

Getting the Best Performance From an Intel DC P3700 Flash Storage Card

I recently had the opportunity to work on a new Dell PowerEdge R720 system that has two 2TB Intel DC P3700 PCIe Flash Storage Cards installed. This particular card is the largest capacity model of the high-end P3700 series (Intel has lower-end P3600 and P3500 cards in this same family). As with most flash storage, larger capacity devices typically have much better performance than lower capacity devices from the same product family, because there are more NAND dies to read from and write to, and more channels to use.

Initially, I was somewhat disappointed by the CrystalDiskMark results for this device, as shown in Figure 1. These results are not terrible, especially compared to most SANs or a single 6Gbps SAS/SATA SSD, but they were not nearly as good as I was expecting.

It turns out that Windows Server 2012 R2 has native NVMe support, with some generic, default drivers. These drivers let Windows recognize and use an NVMe device, but they do not give the best performance. Installing the native Intel drivers makes a huge difference in performance from these cards.

You will need to download and install the drivers first (which will require a reboot), and then you will want to download and install the Intel Solid State Drive Data Center Tool (which is a command-line only tool), so you can check out the card and update the firmware if necessary. The links for those two items are below:

Intel Solid-State Drive Data Center Family for PCIe Drivers
Intel Solid-State Drive Data Center Tool

You should also confirm that you are using the Windows High Performance Power Plan and that your BIOS is not using any power management settings that affect the voltage supplied to the PCIe slots in your server. Setting the BIOS power management to OS control or high performance is usually what you need to do, but check your server documentation.
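One way to confirm the Windows power plan from a script is to look at the output of the built-in powercfg /getactivescheme command. The Python sketch below is only an illustration: it checks for the well-known GUID of the built-in High performance plan, the helper names are hypothetical, and it cannot tell you anything about the BIOS settings, which you still have to check separately.

```python
import subprocess
import sys

# GUID of the built-in Windows "High performance" power plan.
HIGH_PERFORMANCE_GUID = "8c5e7fda-e8bf-4a96-9a85-a6e23a8c635c"


def is_high_performance(powercfg_output):
    """True if the powercfg output names the High performance plan GUID."""
    return HIGH_PERFORMANCE_GUID in powercfg_output.lower()


def active_scheme_output():
    """Run 'powercfg /getactivescheme' (Windows only) and return its output."""
    completed = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout


if __name__ == "__main__" and sys.platform == "win32":
    if is_high_performance(active_scheme_output()):
        print("High performance power plan is active.")
    else:
        print("WARNING: High performance power plan is NOT active.")
```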


Figure 1: CrystalDiskMark Results with Default Microsoft Driver

Here are the relevant results in text form:

Sequential Read :   682.778 MB/s
Sequential Write :   700.335 MB/s
        
Random Read 4KB (QD=32) :   381.311 MB/s [ 93093.6 IOPS]
Random Write 4KB (QD=32) :   282.259 MB/s [ 68910.9 IOPS]

After installing the native Intel drivers and updating the firmware, CrystalDiskMark looks much better! This is SAN-humbling performance from a single PCIe card that is relatively affordable.


Figure 2: CrystalDiskMark Results with Native Intel Driver

Here are the relevant results in text form:

Sequential Read :  1547.714 MB/s
Sequential Write :  2059.734 MB/s
        
Random Read 4KB (QD=32) :   646.816 MB/s [157914.2 IOPS]
Random Write 4KB (QD=32) :   419.740 MB/s [102475.6 IOPS]
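To put numbers on the difference between the two sets of results, here is a small Python sketch that computes the speedup for each test. The throughput values (in MB/s) are copied from the text results for Figure 1 and Figure 2; the dictionary names are just illustrative.

```python
# CrystalDiskMark throughput in MB/s, copied from the results above.
default_driver = {
    "Sequential Read": 682.778,
    "Sequential Write": 700.335,
    "Random Read 4KB (QD=32)": 381.311,
    "Random Write 4KB (QD=32)": 282.259,
}
intel_driver = {
    "Sequential Read": 1547.714,
    "Sequential Write": 2059.734,
    "Random Read 4KB (QD=32)": 646.816,
    "Random Write 4KB (QD=32)": 419.740,
}

# Ratio of native Intel driver throughput to default Microsoft driver throughput.
speedup = {test: intel_driver[test] / default_driver[test]
           for test in default_driver}

for test, ratio in speedup.items():
    print(f"{test:26s} {default_driver[test]:9.3f} -> "
          f"{intel_driver[test]:9.3f} MB/s  ({ratio:.2f}x)")
```

The sequential results improve by roughly 2.3x (read) and 2.9x (write), and the random 4KB results by roughly 1.7x (read) and 1.5x (write).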

This is a pretty dramatic difference in performance, and it is another reason why database professionals should be paying attention to the details of their hardware and storage subsystem. Little details like this are easy to miss, and I have seen far too many busy server administrators overlook them.