Intel Speed Shift Support in Windows Server 2016

If you are gathering evidence to help make the case for a complete data platform upgrade in 2017, you want to find as much information as possible to bolster your argument. This post is meant to assist you in your efforts.

By early-mid 2017, Windows Server 2016 will have been GA long enough to convince most skeptics that it is safe and stable. Windows Server 2016 has many tangible advantages over previous versions of Windows Server, which I will be discussing in future blog posts.

SQL Server 2016 already has its first Service Pack available, with many very valuable enhancements (especially for Standard Edition). Some people who are on older versions of SQL Server Enterprise Edition may be able to migrate to SQL Server 2016 Standard Edition SP1.

Finally, Intel is due to release the next generation two-socket server processor (which will require new model servers from the server vendors). One specific new improvement you will get with a new server/processor and Windows Server 2016 is called Intel Speed Shift.

Intel Speed Shift (which is different from the older Intel SpeedStep technology) was added in the Intel Skylake microarchitecture. It requires operating system support in order to work, and the first OS to enable Speed Shift was Windows 10, after an update in November 2015. This feature lets the OS hand control of processor P-states back to the hardware, with the operating system supplying performance requests rather than selecting each P-state itself. This lets the processor increase the clock speed of individual cores much more quickly in response to an OS request for more performance.

The upcoming Intel Xeon E5-2600 v5 processors (Skylake-EP) for two-socket servers are due to ship in mid-2017 (although rumor has it that they are already shipping to some cloud data center providers). I don’t know for sure whether Intel Skylake-EP will have Speed Shift like the mobile and desktop Skylake processors do, but it probably will. Supposedly, the flagship Xeon E5-2699 v5 will have 32 physical cores, although you will likely want a lower core count model with a higher base clock speed in most cases, to minimize your SQL Server licensing costs.

This matters for SQL Server because many short-duration OLTP queries might run a little faster with Speed Shift enabled. Legacy power management techniques on older processors can take up to 120ms to ramp up to full Turbo Boost clock speed, while Speed Shift does it in 35ms, with most of the frequency increase happening within 3-4ms.

Figure 1 shows an example of the difference in how quickly the processor can increase the clock speed with Speed Shift Technology.

 

Figure 1: Intel Speed Shift Technology Performance Example

 

If your operating system supports Intel Speed Shift (Windows 10 or Windows Server 2016) and if you have an Intel processor with Speed Shift Technology (Skylake or newer), then you will be able to get the performance benefits of Intel Speed Shift with virtually no effort beyond perhaps changing a BIOS/UEFI setting on your server. Figures 2 and 3 have some more information about how Intel Speed Shift works.

 

Figure 2: Intel Speed Shift Technology Introduction

 


Figure 3: Intel Speed Shift Technology Details

 

Over the past year, there has been a decent amount of information regarding Intel Speed Shift support in Windows 10, but nothing regarding Windows Server 2016. Part of the problem here is that we already have Skylake processors available in the mobile and desktop space, and in the single-socket server space, but not on the higher socket count server platforms.

Just as a quick experiment, I installed Windows Server 2016 Standard Edition on a brand new HP Spectre X360 laptop that has an Intel Core i7-7500U Kaby Lake-U processor, to see if Intel Speed Shift would be enabled or not. As you can see in Figure 4, the green SST lettering in the Features box on the left side of the System Summary screen in HWiNFO64 shows that Intel Speed Shift is enabled in Windows Server 2016, at least for this processor (which is the next release after Skylake).

 


Figure 4: Intel Speed Shift Enabled on Windows Server 2016 Standard Edition

 

Now, this is not definitive proof that Intel Speed Shift will be enabled with two-socket Skylake-EP processors, but I would say the chances are pretty good (especially given some conversations I have had with some people who actually know the answer)…

The outstanding question is how much this will actually help SQL Server performance. At this point, we simply don’t know without actually doing some testing. Based on some reading I have been doing about Windows 10, I believe there is a way to temporarily disable Intel Speed Shift in Windows 10. Once I figure that out, it should not be that difficult to do some initial testing with short-duration queries (150 ms or less) to see whether they have lower average durations when Intel Speed Shift is enabled than when it is disabled.
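Once the durations have been captured for both configurations, the comparison itself is simple. Here is a minimal sketch in Python, assuming you have already collected per-execution durations in milliseconds (however you choose to capture them) into two lists. The function name, the dictionary keys, and the sample values are hypothetical placeholders, not test results.

```python
from statistics import mean, median


def compare_durations(enabled_ms, disabled_ms, cutoff_ms=150.0):
    """Compare short-duration query timings captured with Speed Shift on and off.

    Only executions at or below cutoff_ms are considered, since the expected
    benefit is for short queries. All names here are illustrative.
    """
    on = [d for d in enabled_ms if d <= cutoff_ms]
    off = [d for d in disabled_ms if d <= cutoff_ms]
    if not on or not off:
        raise ValueError("need at least one short-duration sample per run")
    return {
        "mean_on_ms": mean(on),
        "mean_off_ms": mean(off),
        "median_on_ms": median(on),
        "median_off_ms": median(off),
        # Positive means the Speed Shift run was faster on average.
        "mean_improvement_pct": 100.0 * (mean(off) - mean(on)) / mean(off),
    }


if __name__ == "__main__":
    # Placeholder values purely to show the call shape; not real measurements.
    print(compare_durations([10.0, 20.0, 30.0], [20.0, 20.0, 40.0, 500.0]))
```

Looking at the median alongside the mean is a deliberate choice here, since a handful of slow outliers can easily hide a small improvement in the typical query.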

There are many other variables to consider regarding your workload and configuration that will likely affect the results in real production usage, but I expect there will be some positive benefit from this feature. As I like to say in presentations, nobody has ever told me that their database server is “too fast”, so I will take any performance improvement that I can find!

 

 

CPU-Z 1.78 is Available

On November 21, CPU-Z 1.78 was released. This is a great tool for getting all the technical details about your processors and checking on their current clock speed.

The main improvement in this version is support for Intel Kaby Lake processors, which are already available in the mobile space. It looks like the desktop version of Kaby Lake will be released at CES in January. Tom’s Hardware did some benchmarking of an early sample of a Core i7-7700K that someone supplied to them, as detailed here.

 


Figure 1: CPU-Z 1.78 CPU Tab

 

Recent versions of CPU-Z have added a quick CPU benchmarking function that measures both single-threaded and multi-threaded CPU performance. Each test only takes about 7-8 seconds, and is useful for a number of reasons.

 


Figure 2: CPU-Z 1.78 Bench Tab For Intel Core i7-6700K System

 

First, you can get a quick gauge of your single-threaded CPU performance (which equates to the “speed” of the processor), and your multi-threaded CPU performance (which equates to the CPU capacity of the entire system). This is useful for comparing different processors and systems, whether they are physical or virtual. You can measure the performance of a VM versus running bare metal on the host, or you can measure different VM configurations. You can also compare your numbers to the built-in reference processors, or submit your results and compare them to other systems’ results that are stored online.

Second, you can use the Bench CPU button to briefly stress your processors, and then quickly switch to the main CPU tab while the test is running, to see what happens to your CPU core clock speeds, in order to understand whether you have power management configured correctly to get the performance benefits of Intel Turbo Boost.

Building a Desktop Workstation for SQL Server Development and Testing (August 2016)

Back in March of 2014, I wrote a fairly long blog post called Building a Workstation for SQL Server 2014 Development and Testing, with an updated version in September of 2015, both of which still generate quite a few hits and e-mails. Since it is now about twelve months later, I thought it was a good time to update this information to cover the latest available hardware choices.

With the current selection of high-performance and very affordable desktop computer components, it is not very difficult to assemble an extremely high performance workstation for SQL Server development and testing at a very reasonable cost. Depending on how much performance you want and what your available budget is, you can take several different routes to get this accomplished.

At the high end of the spectrum, you can get a Socket 2011 v3 motherboard with an Intel Xeon E5-2600 v4 product family “Broadwell-EP” processor or an Intel High-End Desktop Processor (HEDT) product family “Broadwell-E” processor, up to 128GB of ECC DDR4 RAM, and multiple PCIe flash storage cards, and spend a pretty significant amount of money, depending on your exact hardware and storage choices. There are also workstation-class motherboards that let you have two Intel Xeon E5-2600 v4 processors and ECC RAM, so that you can have even more memory and total processor cores.

In the middle, you can build a very powerful system using the current 14nm Intel Core i7-6700K “Skylake” processor that uses a current generation LGA 1151 motherboard and DDR4 memory. Skylake processors require a newer Intel chipset, and many of the Skylake motherboards that are available are using the current high-end Intel Z170 Express chipset. Aside from DDR4 support, the most interesting improvement to Z170 Express is an increased number of PCIe lanes. The Z170 chipset supports a total of 20 PCI-Express lanes at the PCH in conjunction with the CPU’s 16 PCIe 3.0 lanes for a platform total of 36. Last generation’s Z97 Express chipset coupled with Devil’s Canyon or Haswell CPUs only allowed for a platform total of 24 PCIe lanes, and the eight lanes from the PCH were only PCIe 2.0 compliant. In contrast, all of the Z170’s PCIe lanes are Generation 3.0 compliant while retaining backwards compatibility with PCIe 2.0 and 1.0 specifications. This means you have a lot more I/O bandwidth available for storage. You can also have up to 64GB of RAM with a Z170-based system.
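To put some rough numbers on that bandwidth difference, here is a small Python sketch that computes the theoretical per-lane throughput of PCIe 2.0 and PCIe 3.0 from their signalling rates and line encodings (5 GT/s with 8b/10b, and 8 GT/s with 128b/130b). The function and variable names are my own for illustration, and these are theoretical ceilings in one direction, not what you would measure on a real device.

```python
# Per-lane signalling rate (GT/s) and line-encoding efficiency by PCIe generation.
PCIE_GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def lane_bandwidth_mb_s(generation):
    """Theoretical one-direction bandwidth of a single lane in MB/s."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s -> usable Gbit/s -> MB/s (1000 MB per GB, 8 bits per byte)
    return gt_per_s * efficiency * 1000 / 8


def link_bandwidth_mb_s(generation, lanes):
    """Theoretical one-direction bandwidth of a link with the given lane count."""
    return lane_bandwidth_mb_s(generation) * lanes


if __name__ == "__main__":
    z170_platform_lanes = 20 + 16  # PCH lanes + CPU lanes, from the text above
    print("Z170 platform lanes:", z170_platform_lanes)
    for gen in ("2.0", "3.0"):
        print(f"PCIe {gen}: {lane_bandwidth_mb_s(gen):.0f} MB/s per lane, "
              f"{link_bandwidth_mb_s(gen, 4):.0f} MB/s for an x4 link")
```

The x4 case is included because that is the link width typically used by M.2 and add-in-card flash storage, which is where the extra PCH lanes matter most for a SQL Server workstation.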

At the lower end of the spectrum, you can put together a system with a single Intel Core i7-4790K “Devil’s Canyon” processor using an Intel Z97 chipset, 32GB of non-ECC DDR3 RAM, and a single high-performance, 6Gbps consumer-class SSD, and still have a system with more processing power than many existing Production database servers. A system that uses this older generation architecture will still be quite powerful, but will be more economical than ever, since the newer Skylake family has been available for over a year now.

The upcoming 14nm 7th generation Kaby Lake desktop processors and Z270 chipsets probably won’t be available until at least January 2017, so the currently available hardware is the best you are going to be able to get for a little while longer.

If you are going to build a desktop system from scratch, you need eight basic components:

  1. Computer Case
  2. Power Supply
  3. Motherboard
  4. Processor
  5. Memory (RAM)
  6. Storage (magnetic or flash)
  7. Discrete Video Card (optional, not really necessary in most cases)
  8. Optical Drive (optional, becoming much less important)

This assumes that you have a keyboard, mouse, and one or more monitors. I’ll discuss each one of these components, with some tips for what you should consider as you are choosing them.

Computer Case

You will need some sort of case to hold your components (unless you want to leave them running on a test bench). Personally, I like mid-range, mid-tower cases from companies like Fractal Design, Antec, Cooler Master, and Corsair. Mid-Tower cases give you plenty of room for common ATX motherboards, and they usually have at least four to six internal 3.5” or 2.5” drive bays. Newer designs have special 2.5” mounting points for SSDs and front or top mounted USB 3.0 or USB 3.1 ports. Better cases are much easier to work with, and they often have much better cable management features (so you can route most of your cables in a separate space under the motherboard). This not only looks much nicer, but it gives you better airflow inside the case. You probably don’t really need a fancy, gaming-oriented case with LED lighting and a huge number of case fans (unless you like that sort of thing).

A decent case in the $50-100 range will usually have good quality components (such as quieter, larger diameter case fans), along with good thermal and noise management features. The Fractal Design Core 3300 is a good example of an affordable, good quality case for about $80.00. I also like the slightly more expensive Fractal Design Define R5. You can spend a little less on a case, or quite a bit more. Just make sure that the case will allow you to install the size of motherboard that you will be using. Another thing you might want to do in some situations is to replace the original case fans with something quieter and better such as the new Corsair ML120 or ML140 Magnetic Levitation fans.

Power Supply

You should invest in a decent quality power supply as opposed to the cheapest one you can find. You don’t want to go overboard and get a 1200 watt behemoth gaming-oriented power supply (unless you are building an extreme gaming rig with multiple, high-end video cards that really need that much power). For the kind of system that I am recommending, you can use a high quality 400-600 watt 80 PLUS (or better) modular power supply and have plenty of reserve power. Modular power supplies have detachable cables for things like SATA power, MOLEX power, PCIe power, etc., so you only need to plug in and use the cables you actually need.

Power supplies are much less efficient when they are only supplying a very small portion of their rated output. Getting a 1200 watt power supply because you think it must be “better” than a good 500-600 watt power supply is actually a waste of money, both for the initial cost of your power supply and the electrical power costs over the life of your machine. The components that I am recommending will end up drawing about 30-40 watts at idle. I really like Seasonic power supplies, especially their fan-less, modular models such as the SS-400FL and the newer SS-520FL2. They are both completely silent and highly efficient, 80+ Platinum rated power supplies. I also really like their latest Prime Titanium line of power supplies. Another less expensive alternative that I like is Corsair, with models such as the Corsair CX550M modular power supply.
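To put numbers on the oversizing argument, this rough Python sketch computes what fraction of a supply's rated output an idle system would draw. It uses the 30-40 watt idle figure and the supply sizes mentioned above. The function name is made up for illustration, and the calculation says nothing about the actual efficiency curve of any particular model.

```python
def load_fraction_pct(draw_watts, rated_watts):
    """Percentage of a power supply's rated output that is in use."""
    if rated_watts <= 0:
        raise ValueError("rated_watts must be positive")
    return 100.0 * draw_watts / rated_watts


if __name__ == "__main__":
    idle_draw = 35  # midpoint of the 30-40 watt idle range quoted above
    for rated in (400, 520, 1200):
        pct = load_fraction_pct(idle_draw, rated)
        print(f"{rated:>4} W supply at idle: {pct:.1f}% of rated output")
```

At idle, the 1200 watt supply is loaded to only about a third of what the 400 watt supply sees, which is exactly the "very small portion of rated output" region described above.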

Motherboard

The motherboard is what all of your other components plug into, so it is a critical component. You need to consider which processor you are going to be using, since there are several different processor socket types available, which will dictate your motherboard choices. The most common type in late 2016 is still the LGA 1150, which will work with the 4th generation, 22nm Intel Core processors (Haswell and Devil’s Canyon). You also need to consider the form factor of your motherboard. You can choose from ATX, micro-ATX, and mini-ITX, which refer to the size of the motherboard. You also need to think about the chipset used on your motherboard.

The Intel Z97 Express chipset is their best chipset for an LGA 1150 motherboard. As you are looking at motherboards, you should be looking at the low-to-mid range Z97-based motherboards instead of the high-end, gaming motherboards. The high-end gaming Z97 motherboards can be quite expensive, and they will probably have features (such as support for three discrete video cards), that you don’t really need for a SQL Server workstation or test server. Instead, make sure you choose a model that has four DDR3 RAM slots, and at least six 6Gbps SATA III ports. You also might want a model that has one or more M.2 slots. A good example Z97 motherboard is the ASRock Z97 Extreme6/3.1.

A newer, slightly more expensive choice is a 6th generation, 14nm Intel Core processor (Skylake), combined with a Z170 chipset. Right now, there are two Skylake processor choices that are widely available, the Core i7-6700K and the Core i5-6600K. Skylake processors use the current Socket LGA 1151 and DDR4 RAM, so you will need a new motherboard and new memory if you are upgrading from an older platform. Unlike older Intel processors, these new Skylake processors do not come with a stock cooling fan bundled with the processor, so you will have to buy some type of decent CPU cooler. A good example Z170 motherboard is the ASRock Z170 Extreme7+. The reason I really like this motherboard is all of the I/O capacity and flexibility that it offers, with three PCIe 3.0 x16 slots and three Ultra M.2 PCIe 3.0 x4 slots.

If you are going to run Windows Server 2012 R2 for your host operating system, you should be aware that most of the Intel embedded NICs that you will find on desktop motherboards will refuse to let you install the NIC drivers with a Microsoft server operating system. In that case, you can buy an inexpensive, non-Intel ($15-20) PCIe Gigabit Ethernet card that works just fine. If you are running Windows 10 for your host operating system, you won’t have this issue.

Processor

You can choose a modern Intel desktop processor that may well have much more raw processing power than many older two or four-socket production database servers. This is not an exaggeration, although it depends on the age of your production database server. You are far more likely to run into memory or I/O bottlenecks as you push a modern Intel desktop system than processor bottlenecks. For most people, an Intel Core i7-4790K processor will be your best choice (especially if you live near a Micro Center). It is a quad-core processor with hyper-threading (so you have eight logical cores) that runs at a base clock speed of 4.0GHz, with the ability to Turbo Boost to 4.4GHz. It runs very cool, and is easy to overclock even with the stock Intel processor cooler. It is not really necessary to overclock this processor to get good performance. You can have a maximum of 32GB of DDR3 RAM with this processor, and it supports both VT-x and VT-d for better virtualization performance.

Most 4th generation Intel Core processors (Haswell) have pretty good integrated graphics built-in to the CPU package. The better models have HD4600 graphics which give you more than enough performance for normal desktop usage and even some moderate gaming. There was a pretty big improvement in the integrated graphics performance between the Ivy Bridge and Haswell processors, so it is much more feasible to simply use the integrated graphics instead of buying a separate, discrete video card. This will save you money and reduce your electrical power usage.

One big variable in the cost of using this processor is whether you live near a Micro Center computer store or not. Micro Center has 25 locations in the Continental United States, and they sell a few specific models of Intel processors at prices that no other company seems willing to match. They have been doing this for years, and it is their regular practice (so it is not a special sale or promotion). The only catch is that those processors are only available for in-store pickup (so no mail-order).

For example, Micro Center is currently selling the Intel Core i7-4790K processor for $289.99, while NewEgg is selling the exact same Intel Core i7-4790K processor for $339.99. Micro Center quite often does promotions where they will reduce the price of a motherboard by $30-$50 if you buy the motherboard with a qualifying processor. Their prices on motherboards, cases, memory, hard drives and SSDs are also quite competitive.

A newer Intel Core i7-6700K processor will cost $309.99 at Micro Center, and it will cost $349.99 at NewEgg. The other good thing about Micro Center is that the people who work in their computer components department are generally very knowledgeable and helpful. I have seen them patiently help many customers select appropriate components for the type of system they are trying to build.

Memory

If you select an LGA 1150, Z97 Express motherboard with four RAM slots, you can have up to 32GB of non-ECC DDR3 RAM in your system. You can get two 8GB sticks of 240-pin PC3 12800 DDR3 RAM for about $138.00, so it would be about $276.00 to get 32GB of RAM. This should be plenty for most development and testing workloads (including running multiple VMs), but if you really need more, you could make the jump to the LGA 2011 v3 platform that uses the more expensive six, eight and ten-core Intel Broadwell-E processors where you can have up to 128GB of DDR4 RAM.

DDR4 RAM is actually less expensive than DDR3 RAM now. For example, you can get two 8GB sticks of 288-pin PC4 17000 DDR4 RAM for about $74.00, so it would be about $148.00 to get 32GB of RAM. Eventually, you will also be able to get 16GB DDR4 DIMMs, so you will be able to have 64GB of RAM in a Z170 Express system.

One thing you will want to do as you are configuring your system is to go into your BIOS setup and turn on Extreme Memory Profile (XMP), so that you will get better memory performance. This can occasionally cause stability problems, depending on the type of memory that you have, but if that happens, you can always turn it back off.

Storage

You will need some type of storage for your system. Traditional magnetic hard drive prices continue to decline, so that you can get a high-performance 4TB, 7200rpm SATA III drive with 128MB of cache, such as a 4TB Western Digital WD Black WD4004FZWX for $221.99. For just a little more money, you can also get a much smaller, but much higher performance 6Gbps SATA III consumer-grade SSD, such as a 1TB Samsung 850 EVO SSD for $307.19. Solid State Drive prices have come down a lot (as performance has increased slightly) over the past couple of years, but they still cost about six times as much as conventional magnetic storage, per gigabyte.
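The per-gigabyte comparison is easy to check for yourself as prices change. This is a quick Python sketch using the two example drives and prices quoted above; the function names are mine, and I am treating 1TB as 1000GB (the way drive manufacturers do).

```python
def cost_per_gb(price_usd, capacity_gb):
    """Price in dollars per gigabyte of capacity."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return price_usd / capacity_gb


def price_ratio(ssd_price, ssd_gb, hdd_price, hdd_gb):
    """How many times more the SSD costs per gigabyte than the hard drive."""
    return cost_per_gb(ssd_price, ssd_gb) / cost_per_gb(hdd_price, hdd_gb)


if __name__ == "__main__":
    hdd = cost_per_gb(221.99, 4000)  # 4TB WD Black
    ssd = cost_per_gb(307.19, 1000)  # 1TB Samsung 850 EVO
    print(f"HDD: ${hdd:.3f}/GB  SSD: ${ssd:.3f}/GB")
    print(f"SSD costs {price_ratio(307.19, 1000, 221.99, 4000):.1f}x as much per GB")
```

For these two specific drives the ratio comes out at roughly 5.5, which is in line with the rough figure above.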

I really encourage you to use a modern, fast 6Gbps SATA III SSD for your boot drive since it will have an extremely dramatic, positive effect on how fast your system performs and “feels” in everyday use. It will boot faster, shut down faster, programs will load nearly instantly, and it will take much less time to install new software and Windows Updates. It is similar to the difference between a dial-up modem and a fast broadband connection. Once you start using a fast SSD, you will never want to go back to a conventional magnetic hard drive.

You want to make sure your fast 6Gbps SSD is plugged into a 6Gbps SATA III port (not one of the slower 3Gbps SATA II ports) on your motherboard. Otherwise, your fast SATA III SSD will be limited to about 275MB/sec for sequential reads and writes (which is still about twice as fast as a very fast traditional 7200rpm SATA hard drive). You also want to avoid the smallest capacity 128GB SSD models, since their performance is usually much lower than the larger capacity models from the same manufacturer and product line. This is because the smaller capacity models have fewer NAND chips and fewer data channels. Ideally, you would want a 250GB (or larger) 6Gbps SATA III SSD plugged into each SATA III port that you have available on your motherboard. This would give you lots of options for how to lay out your SQL Server data files, log files, tempdb files and SQL Server backup files.
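The reason the port matters comes down to simple arithmetic on the link speed. Here is a short Python sketch of the theoretical ceiling for each SATA generation, based on the fact that SATA uses 8b/10b encoding (only 8 of every 10 bits on the wire carry data). The function name is my own, and real drives land somewhat below these ceilings because of protocol overhead.

```python
def sata_max_mb_s(line_rate_gbps):
    """Theoretical payload ceiling of a SATA link in MB/s.

    SATA uses 8b/10b encoding, so 80% of the raw line rate is usable data.
    """
    raw_bits_per_s = line_rate_gbps * 1_000_000_000
    data_bits_per_s = raw_bits_per_s * 8 / 10
    return data_bits_per_s / 8 / 1_000_000  # bits -> bytes -> MB


if __name__ == "__main__":
    for name, rate in (("SATA II", 3), ("SATA III", 6)):
        print(f"{name} ({rate}Gbps): about {sata_max_mb_s(rate):.0f} MB/s maximum")
```

The roughly 275MB/sec quoted above sits just under the SATA II ceiling, while a SATA III port doubles the headroom available to the same SSD.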

Of course, you may not want to spend that much money, so it is still common to have one or two SSDs, along with one or more conventional magnetic drives in a desktop system. One of the luxuries with a desktop system compared to any laptop is that you have a very high number of internal drive bays and up to ten or twelve SATA III ports on the motherboard. You can also buy inexpensive PCIe SATA III cards to add even more SATA III ports to a desktop system.

If you want even more storage performance, you may want to consider a PCIe NVMe flash storage card such as the Intel 750 Series, which comes in 400GB, 800GB, and 1.2TB capacities. These cost slightly less than $1.00/GB, and their performance is about 4-5 times better than SATA III SSDs for reads and about 2-3 times better for writes. Another alternative is the Samsung 950 Pro Series of M.2 PCIe NVMe flash devices, which are available in 256GB and 512GB capacities. If you have one or more M.2 slots that support PCIe 3.0 x4, then you will also get tremendous performance from these.

Discrete Video Card

There are some situations where the Intel integrated graphics might not be enough for your needs. An example would be if you were running applications such as AutoCAD that really place a lot of stress on your graphics performance. Another example is if you wanted to run multiple, large monitors on your system. Most motherboards that support the Intel integrated graphics only have two or three video connectors (such as a DP connector, DVI connector and an HDMI connector), so that would limit how many monitors you could connect to the system. Depending on your processor choice, you may not have integrated graphics at all.

If you do decide to go with one or more discrete video cards, you can get quite decent performance for about $100-150.00 each (but you can spend much, much more). You may also need a power supply with multiple, supplemental PCIe power connectors, and you might even need a higher capacity power supply.

Optical Drive

Even though they are becoming much less important over time, it can still be useful to have a DVD recorder optical drive in a desktop system. It just makes it easier to install the operating system and other software (although you can certainly install from a USB drive). It is also becoming much more common to simply mount an .iso file for doing something like installing SQL Server 2016. You can get bare, OEM optical DVD drives for about $15-20. Personally, I like to use an external USB optical drive to install the operating system, so I don’t have to have an internal optical drive taking up space and power in the system.

So, after all of this, how much money am I trying to convince you to spend?  Well, here is one example:

  1. Case             $80.00
  2. Power Supply     $100.00
  3. Motherboard      $180.00          (ASRock Z170 Extreme7+ from Micro Center)
  4. Processor        $309.00          (Intel Core i7-6700K from Micro Center)
  5. RAM              $148.00          (32GB of DDR4 RAM)
  6. Storage          $308.00          (One 1TB Samsung 850 EVO SSD)

Total System         $1125.00
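If you want to play with different component choices, the parts list above is easy to turn into a few lines of Python. This is just a sketch with the rounded prices from the list; the `PARTS` structure and function name are my own naming for illustration.

```python
# Rounded prices from the example parts list above.
PARTS = [
    ("Case", 80.00),
    ("Power Supply", 100.00),
    ("Motherboard", 180.00),
    ("Processor", 309.00),
    ("RAM", 148.00),
    ("Storage", 308.00),
]


def total_cost(parts):
    """Sum the price column of a list of (component, price) pairs."""
    return sum(price for _, price in parts)


if __name__ == "__main__":
    for name, price in PARTS:
        print(f"{name:<14}${price:>8.2f}")
    print(f"{'Total System':<14}${total_cost(PARTS):>8.2f}")
```

Swapping in a different price for any component and re-running the script gives you a quick way to see how a cheaper processor or a smaller SSD changes the total.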

This system would have much better performance than a laptop that could cost several times as much. It would also have better performance than many production SQL Server database servers. It would be pretty easy to slice $300.00-$400.00 off of this system cost by choosing some different components, and still have a very capable system.

Of course, a desktop system like this does not have redundant, server-class components or ECC RAM, so you would not want to use it in a production situation. It would probably be much better (in terms of performance) than some ancient, out of warranty, retired server for development and testing.