Our man pops the hood on Intel's v4 engine: Broadwell Xeons

Taking new chips for a test drive


Sysadmin Blog Recently, I reviewed Supermicro's Microblade system. One of the goals of this review was to compare the new Intel v4 (Broadwell) Xeons to their predecessor v3 (Haswell) Xeons. This was not as easy as it should have been.

My front-line tool for benchmarking CPUs is Prime95. Supermicro provided me with two Intel Xeon E5-2695 v4 CPUs in the Broadwell blade, which gives that blade 72 logical cores. Prime95 cannot benchmark this because it can only address 64 logical cores.

Intel broke Prime95 through sheer core count. Achievement unlocked. I suppose turnabout is fair play given that Prime95 can break Intel's Skylake CPUs.
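The arithmetic behind that wall can be sketched in a few lines. This is a minimal illustration, not anything taken from Prime95 itself: it assumes the E5-2695 v4's published configuration of 18 cores with two hardware threads each, takes the 64-core ceiling as stated above, and uses made-up names (`logical_cores`, `PRIME95_LIMIT`).

```python
# Assumption: each E5-2695 v4 has 18 physical cores with 2 hardware
# threads per core (Hyper-Threading enabled).
# PRIME95_LIMIT is the 64-logical-core ceiling described in the article;
# the name is hypothetical and not a Prime95 setting.
PRIME95_LIMIT = 64


def logical_cores(sockets: int, cores_per_socket: int, threads_per_core: int = 2) -> int:
    """Return the number of logical cores the operating system will see."""
    return sockets * cores_per_socket * threads_per_core


total = logical_cores(sockets=2, cores_per_socket=18)
over_limit = total > PRIME95_LIMIT
print(f"{total} logical cores, over the limit: {over_limit}")
```

By the same sum, a single socket (36 logical cores) would stay under the ceiling, which is why the problem only shows up on the dual-socket blade.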

The other challenge was that the Haswell Xeons provided weren't directly comparable to the Broadwell Xeons. They weren't the same core count or frequency, though they were the same TDP. I tried to overheat the chassis so that I could play silly buggers with thermal failure, but I did not succeed: there were only two blades in a chassis designed for 28, and that chassis is possessed of possibly the most aggressive fans in existence.

Fortunately, I have a lab with lots of different CPUs and the ability to do lovely things like over- and underclock them. After a great deal of benchmarking, I essentially verified the incomparable coverage of the Broadwell Xeons done by Timothy Prickett Morgan at The Register's sister site The Next Platform.

When not using AVX2, Broadwell seems to provide a 4 per cent to 5 per cent increase in performance clock for clock over Haswell. This seems slightly higher with AVX2, with Broadwell showing an 8 per cent increase over Haswell. If this doesn't seem like much, that's perfectly normal. Broadwell is a die shrink of Haswell and not much really changed in terms of microarchitecture.

One set of benchmarks did stand out from the rest: crypto. Some performance tests showed an 80 per cent improvement, with many showing 20 to 25 per cent. It turns out that one of the architecture changes Intel made with the Broadwell line was to make the PCLMULQDQ (aka carry-less multiplication) instruction suck less. The result is faster crypto.
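For anyone wondering what carry-less multiplication actually is, here is a minimal software sketch. It is ordinary long multiplication with the additions replaced by XOR, so no carries propagate between bit positions; this is the polynomial arithmetic that GCM-style authenticated encryption and CRC calculations lean on. The function name `clmul` is made up for illustration, and the code models only the arithmetic, not how the hardware instruction is implemented or how fast it runs.

```python
def clmul(a: int, b: int) -> int:
    """Carry-less product of two non-negative integers.

    Each set bit of b contributes a shifted copy of a, and the copies
    are combined with XOR instead of addition.
    """
    result = 0
    shift = 0
    while b:
        if b & 1:
            result ^= a << shift  # XOR in place of add: no carries
        b >>= 1
        shift += 1
    return result


# 0b11 * 0b11 is 9 with ordinary multiplication, but the carry-less
# product is 0b101, because the two middle partial-product bits cancel.
print(bin(clmul(0b11, 0b11)))
```

PCLMULQDQ does this for two 64-bit operands in a single instruction and returns a 128-bit result, which is why a faster implementation shows up so clearly in crypto benchmarks.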

Value for dollar

If Broadwell versus Haswell seems a little mediocre, let's compare Broadwell to Intel's v1 (Sandy Bridge) Xeons. A lot of organisations are looking at upgrading v1 Xeons to v4 Xeons and the jump in speed is actually worth it. Clock for clock, the Broadwells seem to be a little under 20 per cent faster for non-AVX workloads. AVX workloads were considerably faster.

What I hadn't known before was that the introduction of AVX2 with the Haswell Xeons doubled the speed of the AVX instructions. Sandy Bridge and v2 (Ivy Bridge) Xeons are capable of 8 double precision Floating Point Operations (FLOPs) per core per cycle. Haswell and Broadwell Xeons can do 16 double precision FLOPs/core/cycle. There were a huge number of other improvements that came along with AVX2 as well.
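Those per-cycle figures can be reconstructed with a little arithmetic. The sketch below is an illustration under stated assumptions rather than anything measured: it assumes 256-bit vectors holding four 64-bit doubles, one add unit plus one multiply unit per core on Sandy Bridge and Ivy Bridge, and two fused multiply-add (FMA) units per core on Haswell and Broadwell, with each FMA counted as two operations. The function names, and the 8-core 2.0GHz part used in the example, are hypothetical.

```python
def dp_flops_per_cycle(vector_bits: int, units: int, ops_per_unit: int) -> int:
    """Peak double-precision FLOPs per core per cycle."""
    lanes = vector_bits // 64  # 64-bit doubles per vector register
    return lanes * units * ops_per_unit


def peak_dp_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak for a whole chip, in GFLOPS."""
    return cores * ghz * flops_per_cycle


# Assumption: one add unit and one multiply unit, one operation each.
sandy_bridge = dp_flops_per_cycle(256, units=2, ops_per_unit=1)
# Assumption: two FMA units, each FMA counted as a multiply plus an add.
haswell = dp_flops_per_cycle(256, units=2, ops_per_unit=2)

# Hypothetical 8-core, 2.0GHz part, used only to compare like with like.
old_peak = peak_dp_gflops(8, 2.0, sandy_bridge)
new_peak = peak_dp_gflops(8, 2.0, haswell)
print(sandy_bridge, haswell, new_peak / old_peak)
```

These are theoretical ceilings. Real code only approaches them when it can keep both vector units fed every cycle.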

The net result is that many AVX workloads will easily more than double their clock-for-clock performance on Broadwell compared with Sandy Bridge. Now, that is restricted to only a few workloads that make use of all the enhancements, but some workloads – CRC crypto, for example – show a quadrupling of clock-for-clock performance.

Considering that Intel has more or less kept the prices steady across the lines, the value for dollar has risen significantly with each successive generation. A lot of that value is in getting more cores for your dollar – great for virtualization – but single-threaded performance isn't being neglected.

If you're running Haswell, there isn't a huge incentive to upgrade to Broadwell unless you do rather a lot of crypto. If, however, you're running Sandy Bridge or Ivy Bridge Xeons, Broadwell is probably worth your time. ®

Biting the hand that feeds IT © 1998–2022