Amazon sounds death knell for rocket-science grids
Clustered instances semi standard
Comment Amazon's Cluster Compute Instances officially sounded the death knell for grid computing efforts that once held promise as the "next big thing".
Cluster Compute Instances take multiple x64 servers and link them together using 10 Gigabit Ethernet interfaces and switches. The EC2 virtual server slices function just like any other sold by Amazon, except that the HPC variants have 10 Gigabit Ethernet links and also have a specific hardware profile that allows for fine-tuning of applications.
This semi-standardization of clustered instances reduces not only the cost to run a grid or high-performance computing (HPC) application, but also the vast complexity associated with building grids and the associated applications.
And it's all thanks to the cloud. No, really, grid applications are one of the best use-cases for cloud services yet. Not only does the cloud have scale, but there are simple deployment methods and far fewer operational concerns. And the cloud has market momentum versus Grid's scientific and academic connection.
Much in the same way that Linux usurped the marketing crown from Unix - as well as eventual market share - cloud computing took away all the glory from grid computing, which circa 2004/2005 was the term used to describe large-scale distributed computing systems - unless of course you listened to pundit Nicholas Carr and called it Utility Computing. Either way, cloud won.
And while the technological approaches underlying grid and cloud are a bit different - an oversimplified explanation involves the fact that most clouds run stacks atop virtual machines whereas grids tend to use whole machines for processing - the underlying notion of elasticity and pay-as-you-go consumption is roughly the same, although the implementation and operations require different approaches and skillsets.
So why cloud and not grid? Grid computing has tended to focus on computationally intense operations, whereas cloud is more oriented toward scale and ease of deployment. Most HPC applications are designed to perform one specific set of functions on a specific set of hardware, whereas new-school data processing tools like Hadoop were developed to run on distributed systems that care much less about the underlying infrastructure.
I'm not suggesting that new-school applications would or should only run in the cloud. What I am saying is these new architectural patterns mean that developers can mimic a distributed environment much more easily, and that data can cross enterprise and data center boundaries in new ways. There are also many more deployment options when you are targeting clouds than when you are targeting your own data center.
With the exception of very specific privacy and security issues - which can arguably be addressed anyway - there are fewer and fewer reasons why any organization would want or need to run their own massive server farm.
This is not to suggest that grid and HPC will become completely obsolete but rather that, going forward, they will exist in the context of cloud and will be prime candidates to parcel out to providers who can provide a vast amount of on-demand compute capacity.
In place of large numbers of servers that have to be procured and managed, cloud-based grid application deployments will look a lot more like XML and a lot less like rocket science.
Perhaps what matters most is the way developers and system administrators interact with a large amount of computing resources. It's not so much the specific code or application infrastructure that makes the cloud more appealing but the methods and capabilities that make the cloud significantly easier to use and manage.
To be clear, the new AWS offering is not a "complete" solution. Just as AWS lacks tooling for standard AMIs, so too do you need the proper tooling to manage your HPC applications on the new cluster instances. But it doesn't matter. You no longer have to own, deploy and manage hundreds of boxes to run an HPC application. You simply deploy a bunch of AMIs and kill them when the job is done.
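The "deploy a bunch of AMIs and kill them when the job is done" pattern can be sketched in a few lines. What follows is a rough illustration only, not Amazon's API: `FakeProvider`, `ephemeral_cluster`, `run_job` and the image name `ami-hypothetical` are all invented stand-ins, and the "work" is a toy sum of squares. The point is the shape of the lifecycle, where instances exist only for the duration of the job and are torn down even if the job fails.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeProvider:
    """In-memory stand-in for a cloud API (hypothetical, not the AWS SDK)."""
    running: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1))

    def launch(self, image_id, n):
        # Pretend to boot n instances from the given machine image.
        ids = [f"i-{next(self._ids):04d}" for _ in range(n)]
        self.running.update(ids)
        return ids

    def terminate(self, instance_ids):
        # Pretend to shut the instances down again.
        self.running.difference_update(instance_ids)


@contextmanager
def ephemeral_cluster(provider, image_id, n):
    """Launch n instances, yield their ids, and always terminate them."""
    ids = provider.launch(image_id, n)
    try:
        yield ids
    finally:
        provider.terminate(ids)


def run_job(provider, image_id, n, work_items):
    """Spread work round-robin over a short-lived cluster and combine results."""
    with ephemeral_cluster(provider, image_id, n) as nodes:
        assignments = {node: [] for node in nodes}
        for i, item in enumerate(work_items):
            assignments[nodes[i % len(nodes)]].append(item)
        # Stand-in for the real computation each node would perform.
        return sum(x * x for items in assignments.values() for x in items)


if __name__ == "__main__":
    provider = FakeProvider()
    print(run_job(provider, "ami-hypothetical", 4, range(10)))
    print(len(provider.running), "instances still running")
```

Swapping `FakeProvider` for a real cloud client would be where the missing tooling mentioned above comes in; the calling code would not need to know how many boxes sit behind it.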
The last iteration of grid computing required too much hardware, too much software and way too much money to reach its true potential. Clouds, both public and private, are a giant step on the data processing evolutionary scale. ®