Blueprints revealed: Oracle crams Sparc M7 and InfiniBand into cheaper 'Sonoma' chips
Reg gets its hands on Big Red's designs
Hot Chips 2015: Oracle revealed on Monday the details of its new bargain-basement Sparc processor code-named Sonoma. The blueprints were shown at this year's Hot Chips semiconductor conference in Cupertino, California – and we've got a copy of the slides.
Sonoma is billed as a "low-cost Sparc processor for enterprise workloads" that you're supposed to cram into loads of servers and scale out across a data center. It is expected to power low-end high-density systems that Oracle hopes cloud providers will snap up. (Oracle's definition of low cost may not match yours.)
Why on Earth would a cloud provider use a ton of "low-cost" Sparc nodes over a ton of actually cheap Intel x86 nodes? Good question, and one that Oracle has tried to answer by cramming as many hardware and IO features into the Sonoma silicon as possible. That means – in Oracle's mind – fewer components and peripherals on the motherboard, thus smaller boxes, and thus less rack space, less power used, and lower costs.
Sparc. Lovely CPU architecture. Completely crushed in the data center. Intel chips power 99 per cent of server warehouses worldwide. That's a lot of software built for Intel processors. That's a lot of recompiling (or purchasing, if possible) if you want to use Sparc over x86.
Oracle genuinely thinks that, because its new design has InfiniBand and directly attached system RAM, thus minimizing external controller chips, you'll go for it. I don't know, perhaps you will?
All right, already. Enough cynicism, how about the facts?
Sonoma is essentially 2014's Sparc M7 design with DDR4, PCIe and InfiniBand interfaces all bundled into a single package. Just compare Sonoma, described here, with the M7's specs [PDF] revealed at last year's Hot Chips.
Sonoma has eight 4th generation Sparc cores. (The M7 has 32 4th-gen cores, the M6 (2013) had 12 3rd-gen cores, the M5 (2012) had 6 3rd-gen cores, the T5 (2012) had 16 3rd-gen cores, and the T4 (2011) had 8 3rd-gen cores. Oracle says Sonoma is ahead of the T5 in most benchmarks.)
The new chip is also built on a 20nm process with 13 layers of metal, just like the M7.
We're told that it features two DDR4 memory controllers, with four direct-attached DDR4-2133/2400 channels, up to two DIMMs per channel, and up to 1TB of memory per socket. The peak RAM bandwidth is 77GB/s. Sonoma does speculative memory reads just like the M7, which has 16 DDR4 channels and up to 2TB of memory per processor. The M7 mostly has Sonoma beat, which is why the latter is the cheaper version.
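As a rough sanity check on that 77GB/s figure, here is a back-of-the-envelope sketch in Python. It assumes the standard 64-bit (8-byte) DDR4 data bus per channel, which Oracle's slides as reported here don't spell out, and the function and variable names are ours, purely for illustration.

```python
# Back-of-the-envelope check of Sonoma's quoted peak memory bandwidth.
# Assumption: each DDR4 channel has the standard 64-bit (8-byte) data bus.

def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (decimal gigabytes)."""
    return channels * mega_transfers_per_s * bus_bytes / 1000

# Four direct-attached channels at the two supported speed grades.
sonoma_fast = peak_bandwidth_gb_s(channels=4, mega_transfers_per_s=2400)
sonoma_slow = peak_bandwidth_gb_s(channels=4, mega_transfers_per_s=2133)

# Capacity: four channels, two DIMMs per channel, 1TB (1024GB) per socket.
dimm_slots = 4 * 2
gb_per_dimm = 1024 / dimm_slots

print(f"DDR4-2400: {sonoma_fast:.1f} GB/s")
print(f"DDR4-2133: {sonoma_slow:.1f} GB/s")
print(f"{dimm_slots} DIMM slots, {gb_per_dimm:.0f}GB per DIMM to reach 1TB")
```

The DDR4-2400 case comes out at 76.8GB/s, which rounds to the 77GB/s quoted, and hitting 1TB across eight slots implies 128GB DIMMs.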
The Sonoma package has a shared 8MB L3 cache, shared L2 caches totaling 512KB per core pair (cores 0 and 1 are in one cluster, cores 2 and 3 in another, and so on), and private 32KB L1 caches. The M7 has 64MB of L3 cache, and four cores per cluster.
Sonoma, according to Oracle, has the same 4th-gen processor architecture as the M7: one to eight dynamic hardware threads, dual-issue out-of-order execution, and so on. The new chip has an "integrated cryptographic unit" that performs various encryption, decryption and hashing algorithms in hardware, which is faster than doing the math in software: the routines include AES, 3DES, RSA, DH, DSA, ECC, SHA-256, and SHA-512. It also supports unsafe algorithms such as MD5, SHA-1, and DES, weirdly.
We're told these hardware-accelerated routines can be accessed from applications, and "provide security and transparent encryption across Oracle software stack."
"Sonoma contains a crypto-unit with user-level crypto instructions," Basant Vinaik, Oracle's senior principal engineer of CPU and I/O verification, told the Hot Chips conference.
"The cache has been optimized to reduce latency and increase throughput," he added. "Sonoma achieves low latency with its integrated memory controller. We use speculative memory read to do this. Software can tune this using threshold registers."
To InfiniBand and ... that'll do for now
Also on the Sonoma die are controllers for InfiniBand, the networking standard for high-performance computing systems. There are two InfiniBand FDR links shifting 56Gbps, plus two PCIe 3 links shifting 64Gbps, and four coherence links shifting a total of 128Gbps. Coherence links are Oracle's fancy interconnects for joining two processors together.
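For the curious, those link rates can be roughly reconstructed from the published signalling rates of each standard. The sketch below assumes each InfiniBand FDR link is a 4x port and each PCIe 3 link is eight lanes wide; neither lane count is stated in the material above, so treat both as guesses. The helper name is ours.

```python
# Rough link-rate arithmetic for the interfaces quoted above.
# Assumptions (NOT confirmed by the article):
#   - each InfiniBand FDR link is a 4x (four-lane) port
#   - each PCIe 3 link is eight lanes wide (x8)

def link_gbps(lanes, gbps_per_lane, payload_bits, coded_bits):
    """Return (raw, usable) Gbps for a link with the given line coding."""
    raw = lanes * gbps_per_lane
    return raw, raw * payload_bits / coded_bits

# InfiniBand FDR signals at 14.0625 Gbps per lane with 64b/66b encoding.
fdr_raw, fdr_usable = link_gbps(4, 14.0625, 64, 66)

# PCIe 3.0 signals at 8 GT/s per lane with 128b/130b encoding.
pcie_raw, pcie_usable = link_gbps(8, 8.0, 128, 130)

print(f"FDR 4x:    {fdr_raw:.2f} Gbps raw, {fdr_usable:.2f} Gbps usable")
print(f"PCIe 3 x8: {pcie_raw:.2f} Gbps raw, {pcie_usable:.2f} Gbps usable")
```

On those assumptions the raw rates land on roughly 56Gbps and 64Gbps per link, in line with the figures quoted, with usable throughput a shade lower once the line coding has taken its cut.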
The key thing here is the InfiniBand: it's a fast interconnect that can hook together nodes in a cluster, and join them up to compatible mass storage systems. There's no need for separate InfiniBand cards or motherboard components – it's all in the Sonoma package.
Sonoma's integrated InfiniBand host channel adapter can virtualize itself so that it appears as multiple physical devices to host operating systems. Apparently, 16,000 software processes can simultaneously reach the hardware directly, with an IOMMU maintaining security by keeping the accesses in check.
"The [InfiniBand Host Channel Adapter] is compliant with the OpenFabric spec, and Oracle Database," added Rahoul Puri, a senior architect of networking and low-latency I/O at Oracle.
There's also the hardware acceleration for Oracle Database, similar to the M7's, as well as Oracle's Application Data Integrity (ADI) mechanism that tries to trap buffer overrun bugs, and the use of old pointers, to stop miscreants from exploiting them. It does this by tagging a version number on each memory pointer and the same number on the data it references.
When memory is accessed through a pointer, the CPU checks to see if the version number in the pointer matches the version number for the block of data being accessed. If there's a mismatch, the pointer was out of date, or is trying to access memory it shouldn't – so an exception is raised to halt the errant software. The M7 also features this mechanism.
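To make the version-matching idea concrete, here is a toy software model in Python. It is a sketch of the check as described above, not Oracle's implementation: the 4-bit version width, the 64-byte block size, and every class and method name are assumptions made for illustration.

```python
# Toy model of the pointer/memory version check described above.
# NOT Oracle's implementation: the version width, block size and all
# names here are assumptions made for illustration only.

BLOCK = 64          # assumed size of a version-tagged block, in bytes
VERSION_BITS = 4    # assumed width of the version tag


class VersionMismatch(Exception):
    """Stands in for the hardware exception raised on a bad access."""


class TaggedMemory:
    def __init__(self, size):
        self.data = bytearray(size)
        self.versions = [0] * (size // BLOCK)   # one version per block

    def allocate(self, addr, length, version):
        """Tag every block in [addr, addr+length) and return a tagged pointer."""
        version &= (1 << VERSION_BITS) - 1
        for block in range(addr // BLOCK, (addr + length - 1) // BLOCK + 1):
            self.versions[block] = version
        return (version, addr)

    def load(self, pointer, offset=0):
        """Read one byte, checking the pointer's version against the block's."""
        version, addr = pointer
        target = addr + offset
        stored = self.versions[target // BLOCK]
        if stored != version:
            raise VersionMismatch(f"pointer v{version} vs memory v{stored}")
        return self.data[target]


mem = TaggedMemory(256)
buf = mem.allocate(0, 64, version=3)          # one block, version 3
neighbour = mem.allocate(64, 64, version=5)   # adjacent block, version 5

mem.load(buf, 10)                             # in bounds: versions match

try:
    mem.load(buf, 64)                         # overrun into the neighbour's block
except VersionMismatch as err:
    print("overrun trapped:", err)

stale = buf
buf = mem.allocate(0, 64, version=4)          # reallocate with a new version
try:
    mem.load(stale)                           # old pointer still carries version 3
except VersionMismatch as err:
    print("stale pointer trapped:", err)
```

Both failure cases from the description show up here: the overrun trips because the neighbouring block carries a different version, and the stale pointer trips because its block was retagged on reallocation. In the real chip the comparison is done by the CPU on each access rather than in software.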
Vinaik and Puri both believe that the cryptographic acceleration, the ADI protection, the direct-attach RAM, the InfiniBand support, and the lower price tag (Sonoma is, after all, a cut-price M7) will make the new chip attractive to someone.
"When you're closer to memory, we have much lower latency. We can optimize the workloads, reduce cost and power ... it's beneficial for us," added Puri, when pushed to explain how Sonoma will make life easier with all the above stuff integrated in its system-on-chip. "We'll publish more numbers [on performance] when we're closer to a product."
When Sonoma will become an actual thing people can buy, we don't know: a formal launch hasn't happened yet. This could all be paperware. We've asked Oracle for a timetable of availability, but it was not available for immediate comment. ®