Amazon rejigs EC2 to run parallel HPC apps
A veritable cluster
Online retailer and IT disrupter Amazon is getting its high performance computing act together on its Elastic Compute Cloud (EC2) service by allowing customers to spin up tightly coupled virtual server nodes to run real-world, parallel supercomputing applications.
On Tuesday, Amazon Web Services launched a new service called Cluster Compute Instances, which takes a bunch of x64 servers using Intel's Xeon processors and links them together using 10 Gigabit Ethernet interfaces and switches. As you can see from the Cluster Compute Instances sign-up page, the EC2 virtual server slices function just like any other sold by Amazon, except that the HPC variants have 10 Gigabit Ethernet links and also have a specific hardware profile so propellerheads can seriously tune their applications to run well.
With other EC2 slices, you never know what specific iron you are going to get when you buy a small, medium, large, or extra large virtual slice rated at a certain number of EC2 compute units.
In the case of HPC-specific slices, Amazon is providing a slice that has a two-socket x64 server based on Intel's Xeon X5570, which has a clock speed of 2.93 GHz and 8 MB of on-chip cache memory. Those processors are in the quad-core "Nehalem-EP" family that was announced by Intel in March 2009, not the latest six-core "Westmere-EP" Xeon 5600s that debuted in March of this year. (Amazon could easily plug six-core Xeon 5600s in these machines, since they are socket compatible with the Xeon 5500s).
This server represents an aggregate of 33.5 EC2 compute units and presents 23 GB of virtual memory to the HPC application running atop it. This is four times the extra large EC2 slice in terms of compute units, according to Amazon. The chips run in 64-bit mode, which is necessary to address more than 4 GB of memory in a node.
HPC shops are not generally keen on hypervisors, which eat CPU cycles and add network and storage I/O latencies. But at a certain price, some people will try anything and make do, and so the Cluster Compute Instances on EC2 run atop Amazon's Xen hypervisor in its hardware-assisted mode (Hardware Virtual Machine, or HVM) to virtualize the server's hardware. Amazon requires that the cluster nodes be loaded with an Amazon Machine Image (AMI) stored on its Elastic Block Store (EBS) storage cloud.
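For the curious, firing up these HPC slices looks much the same from the API as firing up any other EC2 instance. Here is a minimal sketch using the boto Python library; the AMI ID and key pair name are placeholders, and the instance type name is our assumption rather than anything spelled out in the announcement.

import boto

# Connect using credentials from the environment or ~/.boto; boto was the
# de facto Python binding for EC2 at the time.
conn = boto.connect_ec2()

# Launch eight of the HPC slices from an EBS-backed AMI in one request.
# The AMI ID is a placeholder and the instance type name is our assumption.
reservation = conn.run_instances(
    image_id='ami-00000000',
    min_count=8,
    max_count=8,
    instance_type='cc1.4xlarge',
    key_name='my-hpc-key',        # hypothetical SSH key pair
)

for instance in reservation.instances:
    print("%s %s" % (instance.id, instance.state))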
At the moment, Amazon is restricting the cluster size to eight instances, for a total of 64 cores. This is not a particularly large cluster, probably something on the order of 750 gigaflops of peak theoretical number-crunching oomph before you take out the overhead of virtualization. But it is more than a lot of researchers have on their workstations and PCs, and that is the point. If you want to get more oomph, you can request it.
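To see where that 750 gigaflops figure comes from, here is the back-of-the-envelope arithmetic, assuming the usual peak rating of four double-precision floating point operations per clock per Nehalem core and ignoring virtualization overhead:

# Back-of-the-envelope peak flops for the eight-node EC2 cluster cap.
# Assumes 4 double-precision flops per clock per core, the usual peak
# rating for Nehalem-EP parts; virtualization overhead is not counted.
nodes = 8
cores_per_node = 2 * 4            # two sockets, four cores per Xeon X5570
clock_ghz = 2.93
flops_per_clock = 4

peak_gflops = nodes * cores_per_node * clock_ghz * flops_per_clock
print("Peak: %.1f gigaflops" % peak_gflops)   # about 750 gigaflops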
Clearly, larger configurations will not only be available, but will be necessary. In the announcement, Lawrence Berkeley National Laboratory, which had been testing HPC applications on the EC2 cloud, said that the new Cluster Compute Instances delivered 8.5 times better performance than the other EC2 instances it had been testing. LBNL was not specific, but presumably it was using slow Gigabit Ethernet and perhaps less impressive iron. (Amazon had better hope that was the case).
Peter De Santis, general manager of the EC2 service at Amazon, said that an 880-server sub-cluster was configured to run the Linpack Fortran benchmark used to rank supercomputer power, and was able to deliver 41.82 teraflops (presumably sustained performance, not peak). If by "server" De Santis meant a physical server, then roughly half of the peak flops in the machines are going up the chimney on the EC2 slices.
That sounds pretty awful, but if you sift through the latest Top 500 rankings to find an x64 cluster using 10 Gigabit Ethernet interconnects, you'll see the fattest one is the "Coates" cluster at Purdue University, which is based on quad-core Opterons with 7,944 cores running at 2.5 GHz. It is rated at a peak of 79.44 teraflops but delivers only 52.2 teraflops on the Linpack test, so 34 per cent of the flops on that unvirtualized cluster go up the chimney.
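Working both sets of numbers through, and assuming each of the 880 "servers" in Amazon's run was a full two-socket, eight-core physical node, the arithmetic looks like this:

# Linpack efficiency, assuming each of the 880 "servers" in Amazon's run
# was a full two-socket, eight-core physical node at 2.93 GHz.
ec2_peak_tf = 880 * 8 * 2.93 * 4 / 1000.0      # about 82.5 teraflops peak
ec2_linpack_tf = 41.82
print("EC2 cluster efficiency: %.0f%%" % (100 * ec2_linpack_tf / ec2_peak_tf))

# Purdue's unvirtualized "Coates" cluster, from the Top 500 list.
coates_peak_tf = 79.44
coates_linpack_tf = 52.2
print("Coates efficiency: %.0f%%" % (100 * coates_linpack_tf / coates_peak_tf))

That works out to roughly 51 per cent Linpack efficiency for the virtualized EC2 run against about 66 per cent for Coates.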
InfiniBand networks deliver a much better ratio because of their higher bandwidth and lower latency, which is why HPC shops prefer them and why Amazon will eventually have to offer InfiniBand too, if it wants serious HPC business. Eventually, Amazon will also have to offer GPU co-processors, because codes are being adapted to use their relatively cheap teraflops.
As you can see from Amazon's EC2 price list, the Cluster Compute Instances cost $1.60 per hour for on-demand slices, which is actually quite a bit less than the $2.40 per hour Amazon is charging for generic quadruple extra large instances with fat memory. So it looks like Amazon understands that HPC shops are cheapskates compared to other kinds of IT organizations. If you want to reserve an HPC instance, you're talking $4,290 up front for a one-year term or $6,590 for a three-year term, plus 56 cents per hour of usage.
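For those doing the math on reservations, here is a rough sketch of how the one-year reserved price stacks up against on-demand; the around-the-clock utilization figure is our assumption, since Amazon only quotes the rates:

# Rough cost comparison for a single HPC slice, assuming it runs 24x7 for a
# year; the utilization figure is our assumption, Amazon only quotes rates.
hours_per_year = 365 * 24

on_demand_cost = 1.60 * hours_per_year                    # about $14,000
one_year_reserved_cost = 4290 + 0.56 * hours_per_year     # about $9,200

# Hours of use per year at which the one-year reservation starts to pay off.
breakeven_hours = 4290 / (1.60 - 0.56)                    # about 4,125 hours

print("On demand:         $%.0f" % on_demand_cost)
print("One-year reserved: $%.0f" % one_year_reserved_cost)
print("Breakeven:         %.0f hours" % breakeven_hours)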
The HPC slices on EC2 are available running Linux operating systems and are, for the moment, restricted to the Northern Virginia region of Amazon's distributed data centers in the United States. (Right next to good old Uncle Sam). There is no word on when the other US regions get HPC slices, or when they will be available in other geographies. Amazon had not returned calls for more insight into how the service will be rolled out in its Northern California, Ireland, and Singapore data centers as El Reg went to press. ®