Oak Ridge changes Jaguar's spots from CPUs to GPUs

Nuke lab's super upgraded to a 20-petaflops Titan


The mystery surrounding the architecture of the future "Titan" supercomputer to be installed at Oak Ridge National Laboratory and funded by the US Department of Energy is over.

Oak Ridge, which is the US nuke lab most associated with open (rather than classified and often military) science, has been working with Cray and Nvidia on future hybrid CPU-GPU designs for several years and is Cray's keystone HPC customer. So it is no surprise that the Titan supercomputer will marry Cray's Opteron-based supers with Nvidia's GPU coprocessors to make a more energy-efficient super – weighing in at a top speed of 20 petaflops when (and if) it is fully extended with GPUs.


Rather than just throwing out the existing "Jaguar" supercomputer, Oak Ridge is doing what it has done in the past – and what the Cray supercomputers are designed to undergo – a timely upgrade. Oak Ridge did a two-step upgrade from XT3 to XT4 systems for around $200m to get to the initial Jaguar configuration, which was rated at 263 teraflops of sustained performance on the Linpack Fortran benchmark test commonly used to rank the performance of supercomputers.

The current Jaguar machine consists of a mix of Cray XT4 and XT5 cabinets using six-core Opteron HE processors with a total of 224,256 cores and 362TB of main memory. The Jaguar server nodes are linked by the "SeaStar2+" XT interconnect and deliver 2.33 peak theoretical petaflops and 1.76 petaflops sustained on the Linpack test. This machine has 18,688 nodes, each with two Opteron processors.
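The Jaguar figures quoted above can be cross-checked with a little arithmetic. The sketch below is only a sanity check of the article's numbers, not anything published by Oak Ridge or Cray; the variable names are made up for illustration.

```python
# Sanity check of the Jaguar figures quoted in the article.
# All variable names are illustrative.

nodes = 18_688            # Jaguar compute nodes
sockets_per_node = 2      # two Opteron processors per node
cores_per_socket = 6      # six-core Opteron HE

total_cores = nodes * sockets_per_node * cores_per_socket
print(f"Total cores: {total_cores:,}")  # Total cores: 224,256

peak_pf = 2.33            # peak theoretical petaflops
sustained_pf = 1.76       # sustained petaflops on Linpack

# Linpack efficiency: the share of theoretical peak actually sustained.
efficiency = sustained_pf / peak_pf
print(f"Linpack efficiency: {efficiency:.1%}")  # Linpack efficiency: 75.5%
```

The core count matches the 224,256 quoted, and the machine sustains roughly three-quarters of its theoretical peak on Linpack.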

The Titan system, also known as the OLCF-3 system in Oak Ridge-ese, will be based on the Cray XK6 ceepie-geepie, announced in May. The XK6 swaps out one of the Opteron processor sockets on each node of the four-node XE system blade, putting four of Nvidia's Tesla X2090 GPU coprocessors on the blade along with a chipset that implements a PCI-Express 2.0 link between the Opteron G34 sockets and the Nvidia GPUs.

Each Opteron socket has four memory slots sporting DDR3 main memory. (Cray currently supports 2GB and 4GB sticks, but can use fatter memory if customers wish to do so.) Cray is not shipping the XK6 machines with the existing 12-core "Magny-Cours" Opteron 6100 processors, but is rather waiting for Advanced Micro Devices to get the 16-core "Interlagos" Opteron 6200 chips out the door. AMD has begun shipping them to OEM customers (and presumably Cray is at the front of the line), but the Opteron 6200s have not been formally announced yet.


Cray's Tesla X2090-equipped XK6 system board

The XK6 is based on the "Gemini" XE interconnect, just like the XE6 Opteron-only supers upon which it is based. That interconnect, which debuted in May 2010, is the heart of the XE6 and XE6m supers, implemented in 3D torus and 2D torus interconnects respectively on those two families of machines.

The Gemini interconnect has a lot more bandwidth than the SeaStar2+ interconnect and about one-third the latency for hops between adjacent nodes, which lets the XE6 machine pass about 100 times as many messages between nodes. This is key for the message passing interface (MPI) protocol underlying most parallel supercomputer apps. And as you can now see, it will be key for the XK6 ceepie-geepie machines, since the GPUs are going to be able to do more math than the CPUs they are lashed to. A lot more messages are going to be flying around as calculations get done more quickly.

Under the plan Oak Ridge has put together to upgrade the Jaguar system to the Titan machine, the first step is to replace all of those 18,688 two-socket nodes with 4,672 four-socket hybrid XK6 blade servers. The net result will be a machine with 299,008 Opteron cores and – depending on the clock speeds AMD can deliver – probably somewhere between 25 and 30 per cent more raw x86 flops in the system ... and presumably better performance given the faster interconnect. The machine will also have 600TB of main memory, which will not hurt performance, either.
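The Titan core count follows from the blade count in the same way. The sketch below assumes each of the four Opteron sockets on an XK6 blade holds a 16-core "Interlagos" chip, and it treats a terabyte as 10^12 bytes; both are assumptions made for illustration, and the variable names are hypothetical.

```python
# Rough check of the Titan upgrade figures quoted in the article.
# Assumptions (not stated outright in the article): every Opteron socket
# holds a 16-core Interlagos chip, and 1 TB = 10**12 bytes.

blades = 4_672
opteron_sockets_per_blade = 4    # "four-socket hybrid XK6 blade servers"
cores_per_socket = 16            # Interlagos Opteron 6200

titan_sockets = blades * opteron_sockets_per_blade
titan_cores = titan_sockets * cores_per_socket
print(f"Opteron sockets: {titan_sockets:,}")  # Opteron sockets: 18,688
print(f"Opteron cores:   {titan_cores:,}")    # Opteron cores:   299,008

jaguar_cores = 224_256
core_growth = titan_cores / jaguar_cores
print(f"Core count growth: {core_growth - 1:.1%}")  # Core count growth: 33.3%

# Memory per Opteron socket, assuming decimal terabytes and gigabytes.
memory_tb = 600
mem_per_socket_gb = memory_tb * 1_000 / titan_sockets
print(f"Memory per socket: about {mem_per_socket_gb:.1f} GB")
```

Under these assumptions the core count rises by a third, so the article's estimate of 25 to 30 per cent more raw x86 flops implies somewhat less work per core, which is consistent with the caveat about clock speeds.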

Before the end of this year, 960 of the current Tesla X2090 GPUs will be added to the Titan system (in about one-twentieth of the system nodes). Eventually the future GPU coprocessors from Nvidia based on the "Kepler" GPUs – due later this year if all goes well – will also be added to the machine, according to Steve Scott, who is the new CTO of the Tesla GPU line at Nvidia.
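The "one-twentieth" figure can be checked against the node count. This is a minimal sketch that assumes one GPU per node and the 18,688-node count from the upgrade plan; the names are illustrative.

```python
# Share of Titan nodes that get a Tesla X2090 in the first phase.
# Assumes one GPU per node and 18,688 nodes (4,672 blades x 4 nodes).

initial_gpus = 960
total_nodes = 18_688

gpu_fraction = initial_gpus / total_nodes
nodes_per_gpu = total_nodes / initial_gpus

print(f"GPU-equipped share of nodes: {gpu_fraction:.1%}")   # 5.1%
print(f"Roughly one node in {nodes_per_gpu:.1f} has a GPU")  # 19.5
```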


Biting the hand that feeds IT © 1998–2021