Re:Invent Retailing giant Amazon runs AWS as a subsidiary at arm's length and in a somewhat stealthy manner at that.
Having AWS be a little removed from Amazon is necessary because Amazon often competes with some of the companies that want to host applications on its cloud, and the stealth is just a way for AWS to preserve some mystique in terms of just how large the cloud computing unit is and how pervasive its use is.
AWS is the first mover and undisputed leader in cloud computing, and what we want to know about it is exactly the kind of data that Rackspace Hosting provides in its quarterly reports: server count, customer count, employee count, revenue per customer, and revenue by category for the collection of infrastructure and platform services that are known as the AWS cloud.
We also want to know what percentage of Amazon's own compute capacity is met by AWS and what percent is not, and how much of the AWS revenue and capacity pie is sliced out by Amazon itself.
And what El Reg really wants to know is this: Are revenues and profits at AWS sufficient that the rest of us using the AWS cloud are basically paying the bill for Amazon's IT infrastructure?
At some point, if AWS is large enough (earlier this year we estimated that AWS revenues for 2012 would be somewhere north of $2.1bn), we are helping Bezos & Co build a much better IT mousetrap than it might otherwise have done: not only by helping it cut its IT costs as it scales its systems up and lets us share them, but also through the operational benefits of cloudy infrastructure and the revenue it gains back on its unused capacity. It is like Amazon tricked the world's data centers into paying its own IT bill.
This is a funny turnaround. Historically, IT system makers have been among the worst users of systems, and while Amazon is a platform supplier in its own right thanks to AWS, the need to stay out in front and that cut-throat retail instinct that Bezos & Co has mean it has to be among the best of IT users, notwithstanding an occasional crash in an availability zone that brings much weeping and gnashing of teeth among the digital natives.
AWS SVP Andy Jassy
When AWS gets large enough, and its financial results are material enough, Amazon will have to disclose more, but for now, AWS is just in the "Other" category. But just to keep us interested, Amazon floats out some statistics here and there about the AWS business.
Here's an interesting one that Andy Jassy, senior vice president of the AWS subsidiary, tossed out during his keynote address at the re:Invent partner and customer conference in Las Vegas on Wednesday. Every day this year, on average, AWS added the same server capacity to its public cloud as it took to run the Amazon.com retail business back in 2003, when it had $5.2bn in revenues. Amazon had "a whole lot of servers" back then in 2003, but Jassy did not give any precise figures.
This time last year, AWS execs were bragging that the public cloud, on average through 2011, was adding enough server capacity each day to run Amazon.com when it was a $2.76bn business in 2000.
It took Amazon.com nearly three years to double its revenues and, presuming (fairly enough for an online retailer) that server capacity tracks more or less with revenue, three years to double its server capacity as well. Amazon's AWS cloud has nearly doubled its daily capacity additions in a year, not just because it now has hundreds of thousands of companies in 190 countries using it, but because companies are using AWS for more and more services.
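For those who want to check the sums, here is a rough Python sketch of that comparison. It uses only the two revenue figures quoted above and leans on the same presumption the article makes, namely that server capacity tracks revenue; the function names are ours, and treating 2000 to 2003 as exactly three years is an assumption.

```python
# Amazon.com revenue figures quoted in the article, in billions of US dollars.
REVENUE_2000_BN = 2.76
REVENUE_2003_BN = 5.2


def growth_ratio(start: float, end: float) -> float:
    """How many times larger the end value is than the start value."""
    return end / start


def annualized_rate(start: float, end: float, years: float) -> float:
    """Compound annual growth rate implied by going from start to end."""
    return (end / start) ** (1.0 / years) - 1.0


# Ratio between the two yardsticks AWS used for its daily capacity additions.
ratio = growth_ratio(REVENUE_2000_BN, REVENUE_2003_BN)

# Assumption: the retail business took three full years to get there.
retail_rate = annualized_rate(REVENUE_2000_BN, REVENUE_2003_BN, years=3)

print(f"2003 revenue is {ratio:.2f}x 2000 revenue")
print(f"Implied retail growth: {retail_rate:.1%} per year")
```

If capacity really does follow revenue, the yardstick for AWS daily additions grew by nearly a factor of two in one year, while the retail business needed three years of compound growth to cover the same ground.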
The S3 object store continues to grow, even as AWS delivered a feature a year ago that can automagically retire and delete old objects, thus helping companies cut down on their capacity needs on the S3 storage cloud. In the first quarter of this year, Amazon said that the S3 cloud had 905 billion objects and was handling up to 650,000 requests for objects per second during peaks. But now the pace has picked up again.
As the third quarter came to a close, AWS had 1.3 trillion objects stored in the S3 service, and it was fielding 835,000 requests per second for objects under peak loads. As you can see, there was a slight pause earlier in 2012 as the expiration and deletion service took hold, but exponential growth is now resuming as more companies adopt S3.
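As a back-of-the-envelope check on those S3 figures, the following Python sketch works out the percentage growth between the first-quarter and end-of-third-quarter numbers quoted above. The variable and function names are illustrative, and since Amazon did not say exactly when in the first quarter its figures were taken, no attempt is made to annualize the result.

```python
# S3 statistics quoted in the article.
OBJECTS_Q1_2012 = 905_000_000_000        # objects stored, first quarter 2012
OBJECTS_Q3_2012 = 1_300_000_000_000      # objects stored, end of third quarter 2012
PEAK_REQUESTS_Q1_2012 = 650_000          # peak requests per second, first quarter
PEAK_REQUESTS_Q3_2012 = 835_000          # peak requests per second, third quarter


def percent_growth(before: int, after: int) -> float:
    """Percentage increase from the earlier value to the later one."""
    return (after / before - 1.0) * 100.0


object_growth_pct = percent_growth(OBJECTS_Q1_2012, OBJECTS_Q3_2012)
request_growth_pct = percent_growth(PEAK_REQUESTS_Q1_2012, PEAK_REQUESTS_Q3_2012)

print(f"Objects stored grew by {object_growth_pct:.1f}%")
print(f"Peak request rate grew by {request_growth_pct:.1f}%")
```

On these numbers the object count has grown noticeably faster than the peak request rate over the same stretch.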
Amazon does not provide similar stats for the EC2 compute cloud and the EBS block storage service, which is what a lot of applications use, but obviously everyone is dying to know how many virtual machines there are out there on the AWS cloud, how much block storage they use, and how both are trending over time.
Amazon did not provide any data on the new DynamoDB NoSQL data storage service, announced earlier this year, but Jassy said it was the fastest growing service in AWS history. He did trot out some stats on the Elastic MapReduce service, which runs open source Apache Hadoop or the MapR M3 Hadoop distro on the AWS cloud:
This chart shows the cumulative number of Hadoop clusters that were fired up on the EMR service since it was launched in May 2010. This data does not show the number of current server nodes on AWS that are running the EMR service at any given time, which might be a more interesting figure. Quarterly cluster starts are obviously on the rise, but it is hard to say by how much.
In May 2011, a year after EMR had launched, it looks like there were around 800,000 cumulative EMR cluster starts, and in May 2012, it looks like something on the order of 2.3 million. That averages around 125,000 starts per month in that one year's time. The pace for the trailing twelve months averages around 200,000 starts per month.
So clearly this is a nice linear curve even if it is not exponential. And for all we know, the server count under EMR is growing exponentially as customers fire up larger and larger virtual Hadoop clusters.
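The monthly averages for EMR can be reproduced with a few lines of Python. The cumulative totals below are our eyeball readings from Jassy's chart as given above, not figures Amazon published, so treat the output as approximate; the names are illustrative.

```python
# Approximate cumulative EMR cluster starts, read off the keynote chart.
CUMULATIVE_STARTS_MAY_2011 = 800_000      # one year after the May 2010 launch
CUMULATIVE_STARTS_MAY_2012 = 2_300_000    # two years after launch
MONTHS_PER_YEAR = 12


def average_monthly_starts(start_total: int, end_total: int, months: int) -> float:
    """Average number of new cluster starts per month over the interval."""
    return (end_total - start_total) / months


# First year of the service: starts from zero at launch.
first_year_rate = average_monthly_starts(0, CUMULATIVE_STARTS_MAY_2011, MONTHS_PER_YEAR)

# Second year: May 2011 through May 2012.
second_year_rate = average_monthly_starts(
    CUMULATIVE_STARTS_MAY_2011, CUMULATIVE_STARTS_MAY_2012, MONTHS_PER_YEAR
)

print(f"First year:  about {first_year_rate:,.0f} starts per month")
print(f"Second year: about {second_year_rate:,.0f} starts per month")
```

The second-year figure matches the 125,000 per month cited above, and the first-year average comes out at roughly half that.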
There are two ways to look at this: Either AWS is presenting the data that best reflects its growth, or it is downplaying its growth to keep the competition guessing. If I were Jeff Bezos, once I got back from the Moon, I would be downplaying AWS growth and letting the competition eat static.
AWS now operates in nine regions and has 25 availability zones (where data center capacity is isolated to keep failures to a minimum), with 38 edge locations on its CloudFront content caching service.
"We are adding a lot of data centers and a lot of servers every day, and it is a pretty formidable operation," says Jassy.
One of the things that helps drive the AWS business is the relentless pursuit of better operational efficiencies for IT gear and services, which allows Amazon to cut prices. Since Amazon Web Services was launched in 2006, Amazon has cut prices 26 times on various services as they matured and it could pass on savings to customers. The 26th one will take place on December 1, when S3 prices will be cut:
Amazon thinks it is the Wal-Mart of cloud computing--and it is
The price cut will be applied across all regions. Jassy just showed the data from the US East region as an example. The price cuts may or may not be in reaction to price cuts that Google just unleashed for its raw compute and storage on its own infrastructure and platform cloud services. Microsoft, it is your turn. ®