SC09 Intel CTO Justin Rattner has a stark warning for the HPC community: Come up with a killer application or the business will stagnate.
As the person who spearheaded the creation of the ASCI Red massively parallel system for the US Department of Energy's Sandia National Laboratories - the first machine to break through the 1 teraflops performance barrier and the top of the Top 500 charts from 1997 through 2000 - Rattner is no stranger to the HPC space. The Intel senior fellow and chief technology officer is responsible for a 40,000-node Xeon cluster that the chip maker uses to simulate future chip designs before it creates the masks that will be used to etch the real chips on silicon.
The HPC business has been good to Intel in the decade since ASCI Red came out, but according to Rattner it is now stagnating, with relatively anemic growth.
Kicking off the SC09 supercomputing event in his home town of Portland, Oregon, Rattner flashed up a chart showing spending on HPC hardware, software, and services from 2008 through 2013, and the business is basically stagnant. The projections that Rattner cited showed HPC sales growing at a compound annual growth rate of 3.6 per cent in those years, rising from a little bit below $8bn a year in 2008 to around $9bn in 2013.
"This is not a healthy business," Rattner declared. "If this is what we have to look forward to, we are all in for a tough time."
Intel, of course, doesn't want to have a flatlined HPC business - not with commercial server customers figuring out all kinds of ways to do more with fewer servers. And Rattner and his peers at Intel think they have the answer, a little something called the 3D Web. Basically, it is the World Wide Web redux, only this time with a standard 3D interface backed by complex and continuous simulations, creating simulated worlds where people design and sell products. In short, companies will need to design their products and scientists will need to do their research in simulated worlds.
To do this, the kind of exotic computing that even the world's supercomputer centers don't take for granted has to go mainstream. And rather than looking for a killer app in particular, Intel wants to help foster a killer application framework. "HPC needs a killer application - it needs to be simple, it needs to be elegant."
Rattner showed the standard block diagram of the functional capabilities of the 3D Web architecture Intel is trying to foster, but basically the idea is to marry HPC technologies - those that load-balance applications across a cluster of machines and that provide the underlying physics of simulations - to the identity management, content distribution, and commerce and payment systems of emerging cloud infrastructure.
With such a combination, Rattner says that companies will be able to provide continuous, realistic simulations not only of objects in virtual worlds and how they interact, but also of virtual users as they come into these worlds and interact with objects and each other. Basically, Rattner is talking about Second Life in high rez and with much more sophisticated simulations.
"Behind these three characteristics," Rattner said, referring to continuous simulation, multi-view 3D animation, and immersion and collaboration, "I see enormous demands for computing power." Yes, there was a bit of a glint in his eyes. In tests Intel has done in the labs, it has demonstrated that the computing power needed to simulate worlds climbs steeply - on a log-scale chart, no less - as users are added, their interactions increase, and the realism of the simulations grows. "It is an n² relationship, and n² problems always warm the cockles of my heart because n² means revenue."
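Where does the n² come from? If every avatar in a shared world can in principle interact with every other avatar, the number of pairwise interactions the simulator must consider grows quadratically with the user count. A minimal back-of-the-envelope sketch (the function name is ours, not Intel's):

```python
# Hypothetical illustration of the n-squared scaling Rattner describes:
# with n users, there are n*(n-1)/2 unique avatar pairs whose possible
# interactions a world simulator has to track each tick.
def pairwise_interactions(n_users):
    """Number of unique avatar pairs in a world with n_users avatars."""
    return n_users * (n_users - 1) // 2

for n in (10, 100, 1000):
    print(n, pairwise_interactions(n))
# 10x the users means roughly 100x the pairs: 45, 4950, 499500
```

Real engines prune this with spatial partitioning, of course, but the raw growth curve is what makes a CTO's eyes glint.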
To make the point of how this 3D Web might work, Rattner linked remotely to Aaron Duffy, a biology researcher at Utah State University who models fern populations in a virtual world created in an environment called ScienceSim. Duffy's simulation models the land (including soil conditions), water, and weather of an environment in which simulated ferns grow.
The simulation is sophisticated enough to model real weather, and how wind patterns in the environment will affect the distribution of fern spores. With a click of a button, Duffy was able to show the genetic diversity of the ferns in the simulated population.
3D in fashion
To make the 3D Web a little more human and a little more appealing to businesses, Rattner trotted out Shenlei Winkler, a fashion designer and chief technology officer at the Fashion Research Institute, which has created a 3D immersive clothing design simulator. Winkler said that the $1.7 trillion apparel industry is largely still uncomputerized, and designers still work with sketches that they send overseas to factories (usually in Asia) that mock up prototypes.
Then designers see how the clothing looks, make changes, and get another set of prototypes made. By doing a better job simulating human bodies and cloth types, FRI's 3D clothing design system has been able to slash design time by 75 per cent and reduce physical clothing sample costs by 65 per cent. Winkler said that what clothing designers really want to do is create clothes in real time with actual customers, simulating all of the sophisticated movement of cloth on avatars of specific people, and do a virtual catwalk.
Here's the problem. Rattner showed a pretty slick simulation of a piece of silk cloth falling onto a wood pedestal and then slipping to the floor. While not a simulation that would trick the human eye, this was nonetheless much slicker than anything you will see in a video game or a virtual world. But on a cluster of servers, it took six minutes to calculate each frame of the simulation.
"That is pretty damned slow," Rattner said. "HPC community, how are we going to do that in real time?"
One answer that you can expect Intel to give is the coupling of its Xeon processors to Larrabee graphics co-processors. This being a tech show, and Rattner being CTO, there had to be some chip to show off, and Rattner brought a workstation equipped with a single Larrabee co-processor and put it through its paces on some benchmarks.
On the SGEMM single precision, dense matrix multiply test, Rattner showed Larrabee running at a peak of 417 gigaflops with half of its cores activated (presumably the 80-core processor the company was showing off last year); and with all of the cores turned on, it was able to hit 805 gigaflops. As the keynote was winding down, Rattner told the techies to overclock it, and was able to push a single Larrabee chip up to just over 1 teraflops, which is the design goal for the initial Larrabee co-processors.
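For context, figures like these come from a simple flop count: a dense M×N×K matrix multiply performs roughly 2·M·N·K floating-point operations, so dividing by the elapsed time gives the gigaflops rate. A sketch of the arithmetic (the matrix sizes and timing below are hypothetical, not the ones Intel used):

```python
# Hypothetical flop-rate arithmetic for a dense GEMM benchmark.
# An M x K times K x N multiply does about 2*M*N*K flops
# (one multiply and one add per inner-loop step).
def gemm_gflops(m, n, k, seconds):
    """Sustained gigaflops for a dense matrix multiply of the given shape."""
    return 2.0 * m * n * k / (seconds * 1e9)

# e.g. a 4096-cubed multiply finishing in 0.33 seconds lands near the
# 417 gigaflops figure quoted in the keynote:
print(round(gemm_gflops(4096, 4096, 4096, 0.33), 1))  # → 416.5
```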
Here's the next problem. Sparse matrix math is what is commonly needed in simulations involving cloth and water. And on that test, a Larrabee chip that was not overclocked was able to do between 7.9 and 8.1 gigaflops, depending on the test and the size of the matrices.
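The gap between 805 dense gigaflops and 8 sparse gigaflops is no mystery: sparse kernels spend their time chasing indices through memory rather than streaming through contiguous data. A minimal compressed-sparse-row (CSR) matrix-vector multiply - our own sketch of the kind of kernel behind such benchmarks, not Intel's code - shows the irregular gather that starves the arithmetic units:

```python
# A minimal CSR (compressed sparse row) matrix-vector multiply.
# data:    the nonzero values, row by row
# indices: the column of each nonzero
# indptr:  where each row's nonzeros start and end in data/indices
def csr_matvec(data, indices, indptr, x):
    y = []
    for row in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]  # gather: x is read out of order
        y.append(acc)
    return y

# The 3x3 matrix [[2,0,1],[0,3,0],[4,0,5]] in CSR form:
data    = [2.0, 1.0, 3.0, 4.0, 5.0]
indices = [0, 2, 1, 0, 2]
indptr  = [0, 2, 3, 5]
print(csr_matvec(data, indices, indptr, [1.0, 1.0, 1.0]))  # → [3.0, 3.0, 9.0]
```

That indexed read of `x` defeats the wide vector units and caches that make dense GEMM fly, which is why sparse codes sustain only a sliver of a chip's peak.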
How many Larrabee chips will we all need to buy to simulate ourselves in virtual worlds? How many will be needed to simulate those virtual worlds? Rattner did not say.
But what he did say is that the Ct dialect of C++ that Intel has created will be going into beta soon to help with the parallelization of C++ code to run on multicore and multithreaded processors, and more importantly, to spread code across CPUs and GPU-based co-processors in workstations and servers to maximize performance as transparently as possible. Ct will work in conjunction with Nvidia's CUDA environment for its GPUs and with the OpenCL environment being pushed by Advanced Micro Devices and others.
Intel is also cracking the issue of sharing data between Core and Xeon CPUs and Larrabee GPU co-processors. Future Core and Xeon chips will be able to create a virtual shared memory pool that both the CPU and GPU can access so datasets are not crunched down, serialized, and moved over the PCI-Express bus from the CPU to the GPU and then back again after calculations are done. The shared virtual memory allows the CPU and GPU to work off the same data in sequence without any movement, which should radically improve performance and smooth out simulations.
The 3D Web, says Rattner, will also require open standards. People will want to create an avatar once and teleport it to any world and be able to bring all their virtual stuff with them.
"There is no standard to move between virtual worlds, and this should give you a touch of deja vu," Rattner said. In the late 1980s, when online services like CompuServe, AOL, and Prodigy were being launched and the Web as we know it did not exist as a commercial entity, it was HPC researchers like Tim Berners-Lee at CERN and Marc Andreessen at the University of Illinois who cooked up the interfaces and browsers that made the Web useful.
"It can be our job again to bring order to this chaos," Rattner declared.
Crista Lopes, a researcher at the University of California at Irvine, has created an open source simulation environment that is extensible and modular (meaning you can yank out and replace components to, say, swap in the physics engine necessary for simulating cloth instead of some other engine). This simulation environment, called OpenSim, has already been used to create an interconnected set of worlds called HyperGrid that allows avatars to move from world to world, retaining their identities and virtual stuff.
Welcome to the World Wide Waste. Or the Matrix. I think I prefer to fight SkyNet. ®