All the sauce on Big Blue's hot chip: More on Power7+

Clock crank, cache bump ... and maybe on-chip memory compression too

The Hot Chips 24 conference hosted by Stanford University is next week, and IBM, Oracle, Advanced Micro Devices, Fujitsu, and Intel are expected to talk tech relating to just-announced or impending processors. But Big Blue seems unable to contain its enthusiasm for the Power7+ chip that it will talk about alongside its next-generation zNext processors for its System z mainframes.

We have already told you about some of the details on the forthcoming chip that could be scrounged from poking around the Intertubes. From a die shot of the Power4 through Power7+ families of processors that IBM has shown to customers and partners, we were able to discover that the Power7+ chip had eight cores, just like the Power7 chip that precedes it.

It wasn't clear from looking at the die how much on-chip, shared embedded DRAM L3 cache memory was on that chip, but it was clearly more. Now, thanks to a performance document published on IBM's developerWorks site (PDF), we know that IBM is boosting the L3 cache size from 4MB for each local core segment on the Power7 chip (for a total of 32MB) to 10MB per core on the Power7+ chip (for a total of 80MB). This is a tremendous amount of cache memory and is four times what Intel has put on its latest "Sandy Bridge" Xeon E5 server processors.

All that extra cache memory, which should have a dramatic effect on performance, is enabled because of the shrink from the 45 nanometer processes used to etch the Power7 chips to the 32 nanometer processes used for the Power7+ chips. But there are some other changes to the chip in addition to making the cores smaller (the cores are basically the same) and wrapping more cache around them. IBM's roadmaps have been talking about accelerators, and if you poke around patches to the Linux kernel, you can see what some of them are. As previously reported:

  • In this post at the Linux kernel driver database, we see that the Power7+ will have an in-nest cryptographic accelerator that supports the Advanced Encryption Standard (AES) encryption algorithm as well as the Secure Hash Algorithm-2 (SHA-2) functions developed by the National Security Agency in the United States. (Hash functions are used all over the place in code and microcode alike.)
  • This link at the Linux-Crypto site talks about driver support for an on-chip AES accelerator. (Intel's Xeon 5600, E5, and E7 processors support AES encryption and decryption, and Oracle's Sparc T4 supports both AES encryption and SHA-1 and SHA-2 hashing functions.)
  • This link suggests there will be a random number generator etched onto the Power7+ processor. RNGs are also an important part of many applications, particularly in financial services or physics simulations that require randomness.

IBM's chipheads were talking to the Wall Street Journal about the upcoming Hot Chips conference, and Satya Sharma let slip that the clock speeds on Power7+ chips would be 10 to 20 percent higher than those on the Power7. Sharma is an IBM Fellow and CTO of the Power Systems line who leads the development of the Power7 and Power7+ processors.

Power7 clock speeds range from a low of 3GHz – on a four-core chip used in the Power 720 entry server – to a high of 3.92GHz in the Power 780 with all eight cores turned on, and a high of 4.14GHz in that chip running in turbo boost mode with half the cores turned off. You'd also get 4GHz in an eight-core chip used in the Power 795 and 4.25GHz in a four-core variant also used in that big box. Apply a 10 percent bump to the slowest part and a 20 percent bump to the fastest, and the possible range of clock speeds for Power7+ chips runs from 3.3GHz to 5.1GHz, but there could be some wiggle room there as IBM might get more clocks on the smaller chips and less on the larger ones. (Traditionally, IBM revs the processors on its biggest boxes faster to boost single-thread performance, so this would be a departure.)
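For those who want to check the sums, here is a back-of-the-envelope sketch of that range in Python. The clock speeds are the Power7 figures quoted above and the 10 and 20 percent uplifts are Sharma's; the function name and the choice to pair the low uplift with the slowest part and the high uplift with the fastest are our assumptions, not anything IBM has said about how it will bin the chips.

```python
# Power7 clock speeds quoted in the article, in GHz.
POWER7_CLOCKS_GHZ = [3.0, 3.92, 4.14, 4.0, 4.25]


def projected_range(clocks, low_uplift=0.10, high_uplift=0.20):
    """Return (lowest, highest) plausible clocks after an uplift.

    Assumption: the smallest uplift lands on the slowest part and the
    largest uplift lands on the fastest part, which gives the widest
    possible spread.
    """
    low = min(clocks) * (1 + low_uplift)
    high = max(clocks) * (1 + high_uplift)
    return round(low, 2), round(high, 2)


low_ghz, high_ghz = projected_range(POWER7_CLOCKS_GHZ)
print(f"Power7+ could land between {low_ghz}GHz and {high_ghz}GHz")
```

This is only the outer envelope. If IBM applies a flat uplift across every bin, the real spread will be narrower.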


Die shot of the IBM Power7+ processor for Power Systems iron

I was guessing that IBM would boost the clock speed on the Power7+ chips by between 25 and 30 percent, with the top bin parts spinning at above 5GHz and in the same range as the current z11 engines used in the System zEnterprise 114 and 196 machines, a quad-core chip that spins at 5.2GHz. (IBM will also apparently be boosting the clock speed on the zNext processor to 5.5GHz, up from the 5.2GHz used on the top-end z11 processor used in the current System z line.) We'll find out about the clock speeds in a week from the presentations at Hot Chips.

El Reg asked Big Blue for clarification on the statements made about the Power7+ chip in the WSJ, and this is what came back from Big Blue:

"Power7+ leverages 32 nanometer technology to provide increased frequency, 2.5X L3 cache, security enhancements, and memory compression with no increased power over previous generation Power7 chips."

The interesting bit in that statement is a reference to "memory compression." The AIX 6.1 operating system from 2010 was given a feature called Active Memory Expansion, a data compression algorithm implemented in software and tied to the Power7 processors that could do 2:1 squeezing on main memory. This data compression does two things: it allows more stuff to live in main memory, and it also allows for CPU utilization to be driven up in the system, pushing more work through it.

On one benchmark test (PDF) running SAP ERP applications on a 12-core Power7 server with 18GB of physical memory, the memory was maxed out but the CPU was only at 46 per cent, and the machine only handled 1,000 SAP users and delivered 99 transactions per second of performance. With Active Memory Expansion turned on running AIX 6.1 on this system, the box was able to boost effective main memory by 37 per cent to 24.7GB. The SAP workload could then push CPU utilization up to 88 per cent (some of it from the memory compression itself), but now the machine supported 1,700 users and did 166 transactions per second. That's 70 per cent more users doing about 68 per cent more work.
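The gains in that benchmark are easy enough to recompute from the raw figures. The snippet below is a minimal sketch using only the numbers quoted above; the helper name `pct_gain` is ours and is not taken from IBM's benchmark document.

```python
def pct_gain(before, after):
    """Percentage increase going from `before` to `after`."""
    return (after - before) / before * 100


# Figures quoted from the SAP ERP benchmark on the 12-core Power7 box.
users_gain = pct_gain(1000, 1700)      # SAP users supported
tps_gain = pct_gain(99, 166)           # transactions per second
effective_mem_gb = 18 * (1 + 0.37)     # 18GB physical, 37 per cent expansion

print(f"Users: +{users_gain:.1f} per cent")
print(f"Throughput: +{tps_gain:.1f} per cent")
print(f"Effective memory: {effective_mem_gb:.1f}GB")
```

Run as written, the user gain comes out at 70 per cent, the throughput gain at roughly 67.7 per cent, and the effective memory at about 24.7GB.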

Active Memory Expansion imposed overhead on the Power7 CPU, but it is possible that IBM has etched the algorithms for crunching memory into the Power7+ chip, thereby eliminating the overhead on the cores in the processor. Also, if this memory compression is etched onto the chip, then it presumably could be used by Linux and IBM i operating systems, which do not currently support it. It will also presumably be a free feature instead of a charged feature, as it was with the AIX-Power7 combo.

"There should be nothing surprising here, as IBM has always followed a model of mapping processor architectures in the next generation of silicon to improve the value to the customer," explained Ron Kalla, chief engineer at IBM for both the Power7 and Power7+ processors, in an email exchange. "If you go back all the way to the RS64 processors, we mapped those into multiple technologies, adding a few new features along the way. This time, between Power7 and Power7+, we used the technology slightly differently. We decided to hold the power envelope and die area constant so we can easily plug upgrade existing systems while providing increased frequency."

So the Power7+ chips will slide into the current Power7 sockets, which is a good thing for customers and IBM alike.

"We also invested the additional transistors provided by 32nm technology in a few ways," continued Kalla. "We added eDRAM cache, which provides a high performance return on area and added on chip accelerators to offload work from the processor cores so more workload can be done by the existing cores – this has the same effect as adding cores. We also made security enhancements to provide higher levels of protection for our customers' data."

IBM doesn't publish thermal ratings for its various Power processors, which come with four, six, or eight cores with varying clock speeds. (There may also be differences in L3 cache, but IBM has never said so.) We will try to get some sense of where they are in terms of power consumption and heat dissipation at Hot Chips next week. ®
