The post Two Recent Laptops Compared appeared first on Glenn Berry.
The newer machine is an HP Spectre x360 13-w023dx, which has a 14nm Intel Core i7-7500U Kaby Lake-U processor, 16GB of RAM, a 512GB Samsung SM961 M.2 NVMe SSD, one USB 3.0 port, two USB-C Thunderbolt 3 ports and a 1080p touch display.
The high-level processor specifications and CPU-Z benchmark results for these two systems are shown below:
| Processor | Base Clock | Turbo Clock | Single-threaded CPU | Multi-threaded CPU |
| Intel Core i7-6500U | 2.5GHz | 3.1GHz | 1467 | 3391 |
| Intel Core i7-7500U | 2.7GHz | 3.5GHz | 1743 | 3958 |
These Skylake-U and Kaby Lake-U processors are quite similar, with the Kaby Lake having an optimized “14nm plus” process technology that lets Intel set the clock speeds slightly higher at the same power usage levels. Kaby Lake also has improved integrated graphics and an improved version of Intel Speed Shift technology that lets Windows 10 throttle up the clock speed of the processor cores even faster than with a Skylake processor.

Figure 1: Improved Intel Speed Shift in Kaby Lake
The single-threaded CPU-Z 1.78.1 benchmark result is 18.8% higher with the new system, while the multi-threaded CPU-Z benchmark result is 16.7% higher on the new system. I attribute this increase to the higher base and turbo clock speeds, the optimized process technology, and the effect of the improved Intel Speed Shift. The results are shown in figures 2 and 3.
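The percentages quoted above follow directly from the CPU-Z scores and clock speeds in the table. Here is a minimal Python sketch of that arithmetic; the helper name `percent_gain` is mine and purely illustrative.

```python
def percent_gain(old, new):
    """Return how much higher `new` is than `old`, in percent."""
    return (new - old) / old * 100.0

# (Core i7-6500U value, Core i7-7500U value) taken from the table above
measurements = {
    "single-threaded score": (1467, 1743),
    "multi-threaded score": (3391, 3958),
    "base clock (GHz)": (2.5, 2.7),
    "turbo clock (GHz)": (3.1, 3.5),
}

# Round to one decimal place, matching the precision used in the text
gains = {name: round(percent_gain(old, new), 1)
         for name, (old, new) in measurements.items()}

for name, gain in gains.items():
    print(f"{name}: {gain}% higher on the Core i7-7500U")
```

The base and turbo clocks rise by 8.0% and 12.9% respectively, so the benchmark gains are somewhat larger than the clock increases alone would suggest, which fits with the process and Speed Shift improvements also contributing.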
Figure 2: Intel Core i7-6500U CPU-Z Benchmark Results
Figure 3: Intel Core i7-7500U CPU-Z Benchmark Results
Honestly, these current generational CPU performance improvements are slightly better than nothing (but not much), and are certainly not a good enough reason to upgrade from an equivalent Skylake-U system to a Kaby Lake-U system. Where we see a big improvement is with basic storage performance and peripheral connectivity between these two systems.
I was happily surprised that the new HP system came with a very fast 512GB Samsung SM961 M.2 NVMe OEM SSD that is equivalent to a Samsung 960 PRO. I was surprised because some reviews I had read indicated that these HP machines had a much slower Samsung OEM M.2 NVMe SSD. This probably varies by when your machine was manufactured, so perhaps the earliest review machines had the older, slower drives.
As you can see, the difference in the CrystalDiskMark performance between these drives is pretty dramatic.
Figure 4: 512GB Samsung SM961 M.2 NVMe SSD
Figure 5: 512GB Samsung PM871 SATA 3 SSD
For day to day average PC usage, you probably won’t really notice the difference between a fast SATA 3 SSD and an M.2 PCIe NVMe SSD, but if you are using SQL Server on a laptop, having that extra sequential bandwidth and much better random I/O performance is really noticeable. It is also very nice to have Thunderbolt 3 support, which will allow you to have really fast transfer performance to an appropriate external drive.
So the moral of all this is that, for many people, the best reason to consider upgrading to a new laptop or desktop machine is the additional storage and peripheral connectivity options that you can get with a new machine.
The post Intel Xeon E7 Processor Generational Performance Comparison appeared first on Glenn Berry.

Figure 1: Speedup from Successive Processor Generations
This workload is described as “AsiaInfo ADB Database OCS k-tpmC”, and AsiaInfo ADB itself is described as “a scalable OLTP database that targets high performance and mission critical businesses such as online charge service (OCS) in the telecom industry”. It runs on Linux.
I put performance in quotes above because what they are really measuring is closer to what I would call capacity or scalability. Their topline result is “Thousands of Transactions per Minute”, as measured with these different hardware and storage configurations.
The key point to keep in mind with these types of benchmarks is whether they are actually comparing reasonably comparable systems. In this case, the systems are quite similar, except for the core counts of the successive processor models (and the DDR3 vs. DDR4 memory support). Here are the system components, as listed in the footnotes of the document:
Baseline: Four-sockets, 15-core Intel Xeon E7-4890 v2, 256GB DDR3/1333 DIMM, Intel DC S3700 SATA for OS, (2) 2TB Intel DC P3700 PCIe NVMe for storage, 10GbE Intel X540-AT2 NIC
Next Generation: Four-sockets, 18-core Intel Xeon E7-8890 v3, 256GB DDR4/1600 LVDIMM, Intel DC S3700 SATA for OS, (2) 2TB Intel DC P3700 PCIe NVMe for storage, 10GbE Intel X540-AT2 NIC
New: Four-sockets, 24-core Intel Xeon E7-8890 v4, 256GB DDR4/1600 LVDIMM, Intel DC S3700 SATA for OS, (2) 2TB Intel DC P3700 PCIe NVMe for storage, 10GbE Intel X540-AT2 NIC
The baseline system has a total of 60 physical cores, running at 2.8GHz, using the older Ivy Bridge-EX microarchitecture. The next generation system has a total of 72 physical cores, running at 2.5GHz, using the slightly newer Haswell-EX microarchitecture. Finally, the new system has a total of 96 physical cores, running at 2.2GHz, using the current Broadwell-EX microarchitecture. These differences in core counts, base clock speeds, and microarchitecture make it a little harder to fully understand their benchmark results in a realistic manner.
Table 1 shows some relevant metrics for these three system configurations. The older generation processors have fewer cores, but run at a higher base clock speed. The newer generation processors would be faster than the older generation processors at the same clock speed, but the base clock speed is lower as the core counts have increased with each successive generation flagship processor. The improvements in IPC and single-threaded performance are obscured by lower base clock speeds as the core counts increase, which makes the final score increase less impressive.
| Processor | Base Clock | Total System Cores | Raw Score | Score/Core |
| Xeon E7-4890 v2 | 2.8GHz | 60 | 725 | 12.08 |
| Xeon E7-8890 v3 | 2.5GHz | 72 | 1021 | 14.18 |
| Xeon E7-8890 v4 | 2.2GHz | 96 | 1294 | 13.48 |
Table 1: Analysis of ADB Benchmark Results
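The Score/Core column in Table 1 is simply the raw score divided by the total core count. The sketch below reproduces it and also divides by base clock as a rough, illustrative normalization of my own. It ignores turbo behavior, memory speed, and everything else that differs between these systems, so treat the per-GHz figure as a hint rather than a measurement.

```python
# (processor, base clock in GHz, total system cores, raw ADB score) from Table 1
systems = [
    ("Xeon E7-4890 v2", 2.8, 60, 725),
    ("Xeon E7-8890 v3", 2.5, 72, 1021),
    ("Xeon E7-8890 v4", 2.2, 96, 1294),
]

def per_core_metrics(base_ghz, cores, score):
    """Return (score per core, score per core per GHz of base clock)."""
    score_per_core = score / cores
    return score_per_core, score_per_core / base_ghz

results = {}
for name, base_ghz, cores, score in systems:
    per_core, per_core_ghz = per_core_metrics(base_ghz, cores, score)
    results[name] = (per_core, per_core_ghz)
    print(f"{name}: {per_core:.2f} per core, {per_core_ghz:.2f} per core per GHz")
```

The per-core-per-GHz figure rises with each generation even though Score/Core dips from v3 to v4, which is consistent with the point that lower base clocks are masking the per-clock improvements.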
Table 2 shows some metrics from an analysis of some actual and estimated TPC-E benchmark results for those same three system configurations, plus an additional processor choice that I added. The results are pretty similar, which supports the idea that both of these benchmarks are CPU-limited. From a SQL Server 2016 perspective, you are going to be better off from a performance/license cost perspective if you purposely choose a lower core count “frequency-optimized” processor (at the cost of less total system capacity per host).
This is somewhat harder to do with the Intel Xeon E7 v4 family, because of your limited SKU choices. A good processor choice for many workloads would be the 10-core Intel Xeon E7-8891 v4 processor, which has a base clock speed of 2.8GHz and a 60MB L3 cache that is shared by only 10 cores.
If you could spread your workload across two database servers, you would be much better off with two four-socket servers with the 10-core Xeon E7-8891 v4 rather than one four-socket server with the 24-core Xeon E7-8890 v4. You would have more total system processor capacity, roughly 27% better single-threaded CPU performance, twice the total system memory capacity, and twice the total number of PCIe 3.0 expansion slots. You would also only need 80 SQL Server 2016 Enterprise Edition core licenses rather than 96 core licenses, which would save you about $114K in license costs. That license savings would probably pay for both database servers, depending on their exact configuration.
| Processor | Base Clock | Total System Cores | Est TPC-E Score | Score/Core |
| Xeon E7-4890 v2 | 2.8GHz | 60 | 5576.27 | 92.94 |
| Xeon E7-8890 v3 | 2.5GHz | 72 | 6964.75 | 96.73 |
| Xeon E7-8890 v4 | 2.2GHz | 96 | 9068.00 | 94.46 |
| Xeon E7-8891 v4 | 2.8GHz | 40 | 4808.79 | 120.22 |
Table 2: Analysis of Estimated TPC-E Benchmark Results
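The two-server comparison can be checked with a few lines of Python using the Table 2 estimates. The per-core license price is an assumption that the post does not state: I am using the SQL Server 2016 Enterprise Edition list price of $14,256 per two-core pack, or $7,128 per core. The `option` helper is hypothetical.

```python
# Assumption (not stated in the post): SQL Server 2016 Enterprise Edition
# list price of $14,256 per two-core pack, i.e. $7,128 per core license.
PRICE_PER_CORE_USD = 7128

def option(servers, cores_per_server, tpce_per_server):
    """Summarize one deployment option using the Table 2 estimates."""
    total_cores = servers * cores_per_server
    return {
        "total_cores": total_cores,
        "total_tpce": servers * tpce_per_server,
        "score_per_core": tpce_per_server / cores_per_server,
        "license_cost": total_cores * PRICE_PER_CORE_USD,
    }

# One four-socket server with 24-core Xeon E7-8890 v4 processors
one_big = option(servers=1, cores_per_server=96, tpce_per_server=9068.00)
# Two four-socket servers with 10-core Xeon E7-8891 v4 processors
two_small = option(servers=2, cores_per_server=40, tpce_per_server=4808.79)

license_savings = one_big["license_cost"] - two_small["license_cost"]
per_core_ratio = two_small["score_per_core"] / one_big["score_per_core"]
capacity_ratio = two_small["total_tpce"] / one_big["total_tpce"]

print(f"Core licenses: {one_big['total_cores']} vs {two_small['total_cores']}")
print(f"License savings: ${license_savings:,}")
print(f"Per-core advantage: {(per_core_ratio - 1) * 100:.1f}%")
print(f"Total capacity advantage: {(capacity_ratio - 1) * 100:.1f}%")
```

Under that pricing assumption the 16 fewer core licenses come to $114,048, which matches the "about $114K" figure, and the per-core ratio matches the roughly 27% single-threaded advantage.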
The Intel document also discusses the “performance” increases seen from moving from Intel DC S3700 SATA drives to Intel DC P3700 PCIe NVMe drives. This is going to be primarily influenced by the advantages of being connected directly to the PCIe bus and the lower latency and overhead of the NVMe protocol compared to the older AHCI protocol.
Finally, they talk about the “performance” increases they measured from enabling the Intel Transactional Synchronization Extensions (TSX) instruction set and the Intel AVX 2.0 instruction set on current generation Intel E7-8800 v4 series processors.
SQL Server 2016 already has hardware support for older SSE/AVX instructions as discussed here and here. I really hope that Microsoft decides to add even more support for newer instruction sets (such as TSX) in SQL Server vNext.
The post Some Quick Comparative CrystalDiskMark Results appeared first on Glenn Berry.
A few weeks ago, I built a new Intel Skylake desktop system that I am going to start using as my primary workstation in the near future. I describe this system in more detail in Building a Z170 Desktop System with a Core i7-6700K Skylake Processor. By design, this system has several different types of storage devices, so I can take advantage of the extra PCIe bandwidth in the latest Intel Z170 Express chipset, and do some comparative testing.
The latest addition to the storage family is a brand new 512GB Samsung 950 PRO M.2 PCIe NVMe card that just arrived from Amazon yesterday afternoon. As of now, here is the available storage in this system:
Since I have an NVIDIA GeForce GTX 960 video card in one of the PCIe 3.0 x16 slots, both that slot and the PCIe 3.0 x16 slot that the Intel 750 is using will go down to x8 (which means 8 lanes instead of 16 lanes). The Intel Z170 Express chipset provides 26 high-speed I/O lanes, of which at most 20 can be used as PCIe 3.0 lanes, so you need to think about what devices you are trying to use. This system has Windows 10 Professional installed, so it has native NVMe drivers available from Microsoft.
I did some quick and dirty I/O testing today with CrystalDiskMark 5.02. The two NVMe devices are both using the native Microsoft NVMe drivers from Windows 10. As you can see below, both the Samsung 950 PRO and the Intel 750 PCIe NVMe cards have tremendous sequential and random I/O performance!
| Device | Sequential Reads | Sequential Writes | Random Reads | Random Writes |
| 512GB Samsung 950 Pro | 2595 MB/s | 1526 MB/s | 171755.6 IOPS | 104801.3 IOPS |
| 400GB Intel 750 | 2369 MB/s | 1081 MB/s | 177938.0 IOPS | 151642.1 IOPS |
| 512GB Samsung 850 Pro | 1104 MB/s | 532 MB/s | 100420.4 IOPS | 60765.1 IOPS |
| 6TB WD Red HD | 176 MB/s | 170 MB/s | 386.7 IOPS | 448.2 IOPS |
Table 1: Sequential and Random Results (Queue Depth 32, 1 Thread)
Keep in mind that the two Samsung 850 PRO SSDs are using hardware RAID1, which seems to help their sequential read performance, and that the two NVMe devices are both using the native Microsoft NVMe drivers, which may be hurting their performance somewhat.
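To compare the random I/O results in Table 1 with the sequential throughput numbers, it can help to convert IOPS into MB/s. The sketch below assumes a 4 KiB transfer size for the random tests and decimal megabytes (1 MB = 1,000,000 bytes). Neither is stated in the post, so treat both as assumptions; the function name is hypothetical.

```python
# Assumption: the random tests use 4 KiB transfers (not stated in the post)
BLOCK_SIZE_BYTES = 4096

def iops_to_mb_per_s(iops, block_size=BLOCK_SIZE_BYTES):
    """Convert an IOPS figure to MB/s, using 1 MB = 1,000,000 bytes."""
    return iops * block_size / 1_000_000

# (random read IOPS, random write IOPS) from Table 1
random_iops = {
    "512GB Samsung 950 PRO": (171755.6, 104801.3),
    "400GB Intel 750": (177938.0, 151642.1),
    "512GB Samsung 850 PRO (RAID 1)": (100420.4, 60765.1),
    "6TB WD Red": (386.7, 448.2),
}

throughput = {
    device: (iops_to_mb_per_s(reads), iops_to_mb_per_s(writes))
    for device, (reads, writes) in random_iops.items()
}

for device, (read_mb, write_mb) in throughput.items():
    print(f"{device}: {read_mb:.1f} MB/s random read, {write_mb:.1f} MB/s random write")
```

Under these assumptions the WD Red hard drive manages under 2 MB/s of random reads, while both NVMe devices deliver around 700 MB/s, which puts the gap between magnetic and flash storage in perspective.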
Figure 1: 512GB Samsung 950 Pro M.2 PCIe NVMe Results
Figure 2: 400GB Intel 750 PCIe NVMe Results
Figure 3: 512GB Samsung 850 Pro SATA 3 (RAID 1) Results
Figure 4: 6TB Western Digital Red Results
The post Samsung 950 PRO M.2 PCIe NVMe SSD appeared first on Glenn Berry.
According to the press release:
The 950 PRO will be available in 512 gigabyte (GB) and 256GB storage capacities. The 512GB version delivers sequential read/write speeds of up to 2,500 MB/s and 1,500 MB/s. Random read performance is up to 300,000 IOPS, with write speeds of up to 110,000 IOPS.
Both capacities come with a 5-year limited warranty up to 200 terabytes written (TBW) for the 256GB and 400TBW for the 512GB. The 950 PRO will be available beginning in October 2015, with an MSRP of $199.99 for the 256GB capacity and $349.99 for the 512GB capacity.
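The warranty and pricing figures from the press release are easier to interpret when converted to writes per day and cost per gigabyte. This is a rough sketch assuming 365-day years and decimal units (1 TB = 1000 GB); the function names are mine.

```python
DAYS_PER_YEAR = 365  # simplification: ignores leap days

def endurance(capacity_gb, tbw, warranty_years=5):
    """Return (GB written per day, drive writes per day) allowed by the warranty."""
    days = warranty_years * DAYS_PER_YEAR
    gb_per_day = tbw * 1000 / days
    return gb_per_day, gb_per_day / capacity_gb

def price_per_gb(msrp_usd, capacity_gb):
    """Return the MSRP in dollars per gigabyte."""
    return msrp_usd / capacity_gb

# (capacity in GB, TBW rating, MSRP in USD) from the press release
models = [(256, 200, 199.99), (512, 400, 349.99)]

for capacity_gb, tbw, msrp in models:
    gb_per_day, dwpd = endurance(capacity_gb, tbw)
    print(f"{capacity_gb}GB: {gb_per_day:.1f} GB/day, {dwpd:.2f} drive writes/day, "
          f"${price_per_gb(msrp, capacity_gb):.2f}/GB")
```

Both capacities work out to the same endurance rating relative to their size, a little over 0.4 drive writes per day for five years, while the 512GB model is cheaper per gigabyte.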
TheSSDReview has a good story about this drive here.
In case you are wondering, NVMe or NVM Express (Non-Volatile Memory Express) is an optimized, high performance, scalable host controller interface with a streamlined register interface and command set designed for enterprise and client systems that use PCIe SSDs. It typically offers much better performance than the legacy AHCI (Advanced Host Controller Interface) interface used by some PCIe solid state drives (and all SATA solid state drives). You can read more about NVM Express here.
I am getting close to buying the parts for a new Z170-based, Core i7-6700K desktop system to replace my current Z77-based Core i7-3770K system that I built in early 2012. I am going to be using an ASRock Z170 Extreme7+ motherboard for this new system, mainly because of all the I/O capacity that it offers, including four PCIe 3.0 x16 slots, and three “Ultra” M.2 PCIe 3.0 x4 slots. It also has ten SATA 3 ports, three SATA Express ports, and USB 3.1 Type A and C support.
Even with the new Z170 chipset, you won’t be able to use all of this I/O capacity, since you only have 26 high-speed I/O lanes available, but you should be able to put three Samsung 950 PRO M.2 drives into the three available slots on this motherboard.

Figure 1: ASRock Z170 Extreme7+ Motherboard
If you want to be able to use one of these very fast M.2 solid state drives, you will need to make sure that your system has an M.2 slot that is also long enough (80mm) to accommodate the card. You will also want your M.2 slot to support PCIe 3.0 x4 (meaning four lanes), which is sometimes called “Ultra M.2”.
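To see why PCIe 3.0 x4 matters for a drive like this, here is a small sketch of the raw link bandwidth for different PCIe generations and lane counts. The transfer rates and line encodings are the standard PCIe 2.0 and 3.0 values; the results are raw link rates before protocol overhead, so real-world throughput will be lower. The function name is hypothetical.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per PCIe generation
GENERATIONS = {
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}

def link_bandwidth_mb_s(generation, lanes):
    """Return raw one-direction link bandwidth in MB/s (1 MB = 1,000,000 bytes)."""
    gt_per_s, efficiency = GENERATIONS[generation]
    bytes_per_s_per_lane = gt_per_s * 1e9 * efficiency / 8
    return bytes_per_s_per_lane * lanes / 1e6

for generation, lanes in [("2.0", 2), ("2.0", 4), ("3.0", 2), ("3.0", 4)]:
    bandwidth = link_bandwidth_mb_s(generation, lanes)
    print(f"PCIe {generation} x{lanes}: {bandwidth:.0f} MB/s")
```

A PCIe 2.0 x4 slot tops out at 2,000 MB/s raw, which is already below the 950 PRO's rated 2,500 MB/s sequential read speed, while PCIe 3.0 x4 offers roughly 3,900 MB/s of headroom.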
The post Beware of the Native Microsoft NVMe Driver! appeared first on Glenn Berry.
The use case from nvmexpress.org is that
“NVM Express is architected from the ground up for Non-Volatile Memory (NVM). NVM Express significantly improves both random and sequential performance by reducing latency, enabling high levels of parallelism, and streamlining the command set while providing support for security, end-to-end data protection, and other Client and Enterprise features users need. NVM Express provides a standards-based approach enabling broad ecosystem adoption and PCIe SSD interoperability.”
NVMe is being pushed as a modern replacement for the old Advanced Host Controller Interface (AHCI) that most flash storage devices are still using, and all indications are that NVMe will really start to become more popular and more affordable in 2015/2016.
Windows Server 2012 R2 and Windows 8.1 have a native NVMe driver that allows NVMe devices to be automatically recognized by Windows. This driver works, but does not offer the best performance. I wrote about my experiences with the native NVMe driver last October. Microsoft has also released a hotfix to Windows Server 2008 R2 and Windows 7 that gives native NVMe support to the operating system.
Anandtech has had similar results with several different NVMe devices. Their information (from Samsung) was that
“the performance difference was due to the Microsoft NVMe driver creating FUA (Force Unit Access) I/O write commands. These FUA commands bypass the DRAM cache on the SSD and directly write to the flash, increasing the response time and also lowering bandwidth. For the same access traces, this situation does not happen with the Microsoft AHCI driver.”
This sounds pretty similar to the difference between write-back and write-through caching for RAID controllers. If you have any NVMe storage devices, you should make absolutely sure that you are using the vendor-supplied NVMe driver rather than the generic Microsoft NVMe driver. My fear is that many server administrators will simply install their NVMe device, start the server, and then think everything is OK, since Windows recognized the device and it seems to be working.
There are a lot of recent tests of new NVMe storage devices to whet your appetite for this technology. Here are some reviews and tests of client devices:
PCIe SSD Roundup – Samsung SM951 NVMe vs. AHCI, XP941, SSD 750 and More!
Intel 750 series SSD review: Storage so fast, only the highest-end PCs can keep up
Here are some reviews of server devices:
Intel SSD DC P3700 Review: The PCIe SSD Transition Begins with NVMe
Intel SSD DC P3700 800GB and 1.6TB Review: The Future of Storage
Intel SSD DC P3700 Review (800GB) – NVMe for Enterprise…and Enthusiasts?
Hopefully, Microsoft will improve the performance of their native NVMe driver in a future update for Windows Server 2012 R2 and Windows 8.1. I certainly hope the native NVMe driver performs better in Windows 10 and “Windows Server 2016”. I would love to see Microsoft’s Jose Barreto weigh in on this subject!