re:Invent 2013 The more data you put in a cloud, the harder it is to migrate away. And so Amazon's new "Kinesis" data ingester is a neat piece of technology, and at the same time a canny way to turn Amazon Web Services into the Hotel California of the cloud.
Kinesis was announced by the web bazaar's chief technology officer Werner Vogels in a speech at the company's re:Invent conference today. It's essentially Amazon's attempt to fire up a commercial variant of open-source data processing and messaging engines Storm, Spark Streaming, and Kafka.
The difference between Kinesis and these systems is that Amazon handles all the pesky infrastructure management and provisioning, and simply exposes the system to a developer as a service that lets the programmer pick what data to ingest, how much, and where to feed it to.
Kinesis will also compete with commercial systems, like Google's BigQuery – though the ad-slinger's streaming capabilities are rudimentary in comparison to Kinesis's data-huffing tech.
Amazon's service can "collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources," the biz wrote in a document discussing the tech. It can then stream this out rapidly to other complementary AWS technologies such as DynamoDB or Redshift for analysis or presentation.
Some of the apps this makes possible could include social media data-mining software, live data analytics around financial markets, or a way of feeding changing inventory information from massive stores into software to re-order stock.
"In a few clicks and a couple of lines of code, you can start building applications which respond to changes in your data stream in seconds, at any scale, while only paying for the resources you use," the company wrote.
Admins can set up a Kinesis service by saying how much input and output they need in blocks of 1MBps named 'Shards', via the AWS console, API, or SDKs. The size of the stream can be adjusted without needing a restart, and data is loaded in with
Data placed in Kinesis is available for analysis "within seconds," the company said, and is stored across several bit barns for 24 hours, during which time it can be "read, re-read, backfilled, and analyzed," or moved into storage services like S3 and Redshift.
Amazon has also produced a client library for the service that automates how Kinesis adapts to "changes in stream volume, load-balancing streaming data, coordinating distributed services, and processing data with fault-tolerance," the company said.
However, Kinesis is an ambitious project and perhaps more prone to performance wobbles than other AWS services. "Real-time is definitely one of the areas where we have to crack a lot of nuts still," Vogels told El Reg. We will be watching its performance closely for any wobbles.
Kinesis costs $0.015 per 'Shard' per hour, along with $0.028 per 1,000,000 PUT transactions. Initially, the tech is available from the US East Region as a "Limited Preview" that developers will need to apply for. Inbound data transfer is free and there's no cost to transmit data from Kinesis into other AWS apps.
An AWS pricing example says Kinesis could be used to create an app ingesting 10MBps of data while feeding info out to two real-time processing applications for $4.22 a day.
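For the curious, here is a rough back-of-the-envelope sketch of how that daily figure might break down, using only the prices quoted above. It assumes one Shard per 1MBps of input (so 10 Shards for 10MBps), that the two reading applications need no extra Shards, and that whatever is left over after Shard-hours goes on PUT transactions. The function names are ours for illustration, not Amazon's, and AWS's own example may make different assumptions about record sizes.

```python
import math

# Figures quoted in the article.
SHARD_MBPS = 1.0                # capacity of one Shard
SHARD_PRICE_PER_HOUR = 0.015    # USD per Shard per hour
PUT_PRICE_PER_MILLION = 0.028   # USD per 1,000,000 PUT transactions
QUOTED_DAILY_TOTAL = 4.22       # USD, from the AWS pricing example


def shards_needed(ingest_mbps: float) -> int:
    """Assumption: one Shard per 1MBps of input, rounded up."""
    return math.ceil(ingest_mbps / SHARD_MBPS)


def daily_shard_cost(shards: int, hours: int = 24) -> float:
    """Cost of keeping the given number of Shards running for a day."""
    return shards * SHARD_PRICE_PER_HOUR * hours


shards = shards_needed(10)
shard_cost = daily_shard_cost(shards)

# Assumption: the rest of the quoted total is PUT charges.
implied_put_cost = QUOTED_DAILY_TOTAL - shard_cost
implied_puts = implied_put_cost / PUT_PRICE_PER_MILLION * 1_000_000

print(f"Shards: {shards}")
print(f"Shard-hours per day: ${shard_cost:.2f}")
print(f"Left over for PUTs: ${implied_put_cost:.2f}")
print(f"Implied PUTs per day: {implied_puts / 1e6:.1f} million")
```

On those assumptions, Shard-hours account for $3.60 of the day's bill, leaving about $0.62 for PUTs, which works out to roughly 22 million transactions a day.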
Kinesis looks like the outcome of a big-data service which El Reg spotted in February, and dubbed the Mystery-Amazon-Data-Service (MADS).
MADS was going to be capable of "highly available, highly reliable processing of data in near-realtime". Recruitment adverts at the time said the service would have to slurp between two and five million database records per second at launch, and eventually scale to deal with hundreds of millions – which is exactly the sort of capability Kinesis requires.
With Kinesis, Amazon is giving companies a way to analyze streams of changing web data, and to do so without operating any complex hardware or infrastructure. It also gives Amazon an app that will pour more data than ever before into its cloud, fattening its margins and letting it buy more storage gear at lower prices. That allows Bezos & Co to make money twice: first from the customers paying them, and then from the discounts they can get from their equipment suppliers. ®