Power9: Google gives Intel a chip-flip migraine, IBM tries to lure big biz

The CPU arch that refuses to die

OpenPower Summit IBM's Power9 processor, due to arrive in the second half of next year, will have 24 cores, double that of today's Power8 chips, it emerged today.

Meanwhile, Google has gone public with its Power work – confirming it has ported many of its big-name web services to the architecture, and that rebuilding its stack for non-Intel gear is a simple switch flip.

There was a lot announced at this morning's OpenPower Summit in San Jose, California. Here's what went down:

Big Blue teases Power9 details

Talk about core war. Intel announces a bunch of 22-core Xeon E5 v4 server chips, and a week or so later, IBM says its next big-iron chip – the Power9 – will have 24 cores.

Big Blue eked out a few more details about its processor for the first time today (we last saw a roadmap for the Power family way back in August). The Power9 will be a 14nm high-performance FinFET product fabbed by GlobalFoundries. It is directly attached to DDR4 RAM, talks PCIe gen-4 and NVLink 2.0 to peripherals and Nvidia GPUs, and can chuck data at accelerators at 25Gbps.

IBM says the design is optimized for two-socket scale-out servers, hence the name Power9 SO, and includes on-chip acceleration for compression and encryption.


The chip is aimed at big biz and supercomputers crunching analytics, big data, machine learning, and that sort of stuff. Make no mistake: Intel has the data center compute market crushed; Power is still plugging away as a niche architecture. The Power9 is due to arrive in 2017, and be the brains in the US Department of Energy's Summit and Sierra supercomputers.

Don't forget about IBM's OpenPower Foundation, which licenses blueprints to the CPU's architecture, server hardware and software out to the world. Chinese companies are preparing to launch their own Power8 and 9 chips – dubbed "partner chips" – using the OpenPower blueprints in 2018 to 2020. Those will be built out of 7nm to 10nm gates.

So Uncle Sam is spinning up Power9 supercomputers next year, and then the year after China will have its own supply of Power8 or 9 processors to fill up its racks. And yet, there's a ban on supplying high-end Intel Xeons to Chinese supercomputer builders. Either the US government hasn't thought to outlaw the export of CPU blueprints, or Big Blue's technology in foreign hands isn't seen as a strategic threat to national security.

Summit's peak performance should be 300 PFLOPS, thrashing China's leading 55 PFLOPS Tianhe-2, but a good chunk of the American system's performance will come from the Nvidia Volta GPUs rather than the Power9s.
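For a sense of scale, here is a back-of-the-envelope sketch of that gap. It uses only the two peak figures quoted above; the constant and function names are ours, and peak numbers say nothing about sustained performance or how the work splits between CPUs and GPUs.

```python
# Peak figures as quoted in the article, in petaFLOPS.
SUMMIT_PEAK_PFLOPS = 300
TIANHE2_PEAK_PFLOPS = 55


def peak_ratio(new_pflops: float, old_pflops: float) -> float:
    """Return how many times larger the new peak figure is than the old one."""
    return new_pflops / old_pflops


ratio = peak_ratio(SUMMIT_PEAK_PFLOPS, TIANHE2_PEAK_PFLOPS)
print(f"Summit's quoted peak is about {ratio:.1f}x Tianhe-2's")
# prints: Summit's quoted peak is about 5.5x Tianhe-2's
```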


Google ports its big-name web services to Power

Google loves to keep its options for suppliers open, and like any other moneybags hyper-scale cloud provider, it has the cash to splash on experiments with non-Intel-x86 architectures.

We know it's toying with 64-bit ARMv8 cores, and now Power chips. This isn't too much of a surprise because Google is a founding member of the OpenPower Foundation.

Google says it has ported many of its big-name web services to run on Power systems; its toolchain has been updated to output code for x86, ARM or Power architectures with the flip of a configuration flag. We can imagine a shedload of Google's internal source code is rather portable, and cross-compiling it isn't beyond its programmers. Indeed, Google senior director Gordon MacKean said in 2015 that the cloud goliath strives to keep its software platform-agnostic. For one thing, targeting multiple architectures prevents bit rot by weeding out esoteric bugs.
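Google hasn't published how that configuration flag works, so what follows is purely an illustrative sketch of the general idea: one setting selects the cross-compiler. The `target_arch` key, the `compiler_command` helper and the file names are hypothetical. The GNU target triplets are real ones used by common Linux cross toolchains, and we assume little-endian Linux on Power.

```python
import shlex

# GNU-style target triplets for the three architectures named in the article.
# Assumption: Power runs little-endian Linux (powerpc64le).
TARGET_TRIPLETS = {
    "x86": "x86_64-linux-gnu",
    "arm": "aarch64-linux-gnu",
    "power": "powerpc64le-linux-gnu",
}


def compiler_command(config: dict, source: str) -> list:
    """Build a cross-compiler invocation from a (hypothetical) build config."""
    arch = config.get("target_arch", "x86")  # default to x86 when unset
    if arch not in TARGET_TRIPLETS:
        raise ValueError(f"unsupported target_arch: {arch!r}")
    triplet = TARGET_TRIPLETS[arch]
    # Debian-style cross toolchains name the compiler <triplet>-gcc.
    return [f"{triplet}-gcc", "-O2", "-c", source, "-o", f"{source}.{arch}.o"]


if __name__ == "__main__":
    # Same source file, three targets: only the config value changes.
    for arch in TARGET_TRIPLETS:
        print(shlex.join(compiler_command({"target_arch": arch}, "service.c")))
```

The point of the sketch is that the application source never mentions the architecture; only the build configuration does, which is what makes a switch of silicon "a recompile away".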

Given the rate of increase in use of Google's services, the ad giant knows it has to try out competing technologies to ensure it's using the best possible combinations of hardware and software to meet demand – it has to be sure it's getting the best bang for the buck, and that requires testing and experimentation.

"A lot has changed at Google since I joined nine years ago," said Maire Mahony, a Google engineering manager and an OpenPower Foundation director.

"Search could find just under a trillion web addresses, now that's up to 60 trillion web addresses. Gmail has more than a billion active users, more than double the users we had in 2012. YouTube had seven hours of video uploaded every minute, now YouTube has 400 hours of video uploaded per minute. The demand on compute has been relentless, and I can't see it abating any time soon.

Scaling problems ... Mahony's slides at the OpenPower Summit

"Compute technology development is at a crossroads. The cost of making transistors smaller is increasing, and all of this overhead makes it more challenging for us to deliver on that equation of performance per TCO dollar. We need to have a different approach. Google is backing the vision that underpins the OpenPower Foundation.

"That vision is to build scale-out server solutions based on OpenPower. We're really excited where this platform will take us."

You could hear the screams from Intel's campuses in Oregon all the way down here in San Jose.

"We have ported our infrastructure onto the Power architecture. What that means is that our toolchain supports Power; for our Google developers, enabling Power for their software applications is simply a matter of modifying a config file and off they go," she added.

"Everyone needs a second source," shrugged an Intel staffer over coffee when it emerged Google was testing Qualcomm's ARM server-grade chips. Well, here's a third source. Now it must be said that Google appears to be assessing the Power architecture at this stage – the vast majority of its systems are Intel-driven.

However, IBM's architecture is enough of a draw for the web giant that it's added support for the chips into its toolchain, so that a shift from Intel is a recompile away. Rarely is the highly secretive Google so open about its internal structures.

Which leads us into news that broke an hour before Mahony took to the stage: Google and Rackspace are working together on Power9 server blueprints for the Open Compute Project. These designs are compatible with the 48V Open Compute racks Google and Facebook are working on.

The blueprints can be given to hardware factories to turn out machines relatively cheaply, which is the point of the Open Compute Project: driving down costs and designing hardware to hyper-scale requirements. Rackspace will use the systems to run Power9 workloads in its cloud.

The system itself is codenamed Zaius: a dual-socket Power9 SO server with 32 DDR4 memory slots, two NVLink slots, three PCIe gen-4 x16 slots, and a total core count of 44. And what's to like? For one thing: high-speed NVLink interconnects between CPUs and Nvidia GPU accelerators, which Google likes to throw its deep-learning AI code at.

Rackspace also announced the arrival of its Power8 Barreleye servers – you can find out more on our sister site, The Next Platform.

Intel Inside meets Power

The OpenPower Foundation has rolled out an "OpenPower Ready" branding for Power systems that meet certain criteria, so that buyers know what they're getting into. It sorta reminded us of Intel Inside.

A vendor requests the right to stick the badge on its gear, and either claims it meets all the necessary requirements, demonstrates it meets them at an event, or gets someone to verify this on its behalf. Then, if accepted, it gets the badge and goes into the foundation's online catalog of gear that's been given the thumbs up. And now you know. ®
