Devops

Amazon finally opens doors to its serverless analytics

Still managing app servers by hand? What is this, 2012?

Thu 2 Jun 2022 // 19:42 UTC

If you want to run analytics in a serverless cloud environment, Amazon Web Services reckons it can help you out, all while reducing your operating costs and simplifying deployments.

As is typical for Amazon, the cloud giant previewed this EMR Serverless platform – EMR once meaning Elastic MapReduce – at its Re:Invent conference in December, and only opened the service to the public this week.

AWS is no stranger to serverless, with products like Lambda. However, its EMR offering specifically targets analytics workloads, such as those using Apache Spark, Hive, and Presto.

Amazon’s existing EMR platform already supported deployments on VPC clusters running in EC2, Kubernetes clusters in EKS, and on-prem deployments running on Outposts. And while this provided greater control over the application and compute resources, it also required the user to manually configure and manage the cluster.

What’s more, the compute and memory resources needed for many data analytics workloads are subject to change depending on the complexity and volume of the data being processed, according to Amazon.

EMR Serverless promises to eliminate this complexity by automatically provisioning and scaling compute resources to meet the demands of open-source workloads. As more or fewer resources are required to accommodate changing data volumes, the platform automatically adds or removes workers. This, Amazon says, ensures that compute resources aren’t underutilized or over-committed. And customers are only charged for the time and number of workers required to complete the job.
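To make the pay-for-what-you-use idea concrete, here is a minimal Python sketch that totals worker-hours for a job whose worker count changes as it runs. The function names, the interval format, and the flat per-worker-hour rate are all assumptions made for illustration; the article does not describe Amazon's actual pricing dimensions or rates.

```python
from typing import Iterable, Tuple


def worker_hours(intervals: Iterable[Tuple[int, float]]) -> float:
    """Sum worker-hours over (worker_count, duration_seconds) intervals."""
    total_worker_seconds = sum(count * seconds for count, seconds in intervals)
    return total_worker_seconds / 3600.0


def estimate_cost(intervals: Iterable[Tuple[int, float]],
                  rate_per_worker_hour: float) -> float:
    """Multiply worker-hours by a hypothetical flat rate (not a real AWS price)."""
    return worker_hours(intervals) * rate_per_worker_hour


# A job that starts on 2 workers, scales out to 10, then winds down to 4.
usage = [(2, 600), (10, 1200), (4, 300)]
print(worker_hours(usage))  # 14400 worker-seconds, i.e. 4.0 worker-hours
```

The point of the sketch is only that idle capacity never appears in the sum: intervals with no workers running contribute nothing to the bill.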

Customers can further control costs by specifying a minimum and maximum number of workers and the virtual CPUs and memory allocated to each worker. Each application is fully isolated and runs within a secure instance.
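The effect of those limits can be sketched in a few lines of Python. This is a toy model, not Amazon's scaling algorithm: the `WorkerLimits` class, the `target_workers` function, and the notion of a fixed number of tasks per worker are hypothetical and exist only to show how a floor and a ceiling bound the worker count, and with it the bill.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerLimits:
    """Customer-chosen bounds and per-worker sizing (hypothetical names)."""
    min_workers: int
    max_workers: int
    vcpus_per_worker: int
    memory_gb_per_worker: int


def target_workers(pending_tasks: int, tasks_per_worker: int,
                   limits: WorkerLimits) -> int:
    """Return the worker count demand calls for, clamped to the limits."""
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    wanted = math.ceil(pending_tasks / tasks_per_worker)
    return max(limits.min_workers, min(limits.max_workers, wanted))


limits = WorkerLimits(min_workers=2, max_workers=20,
                      vcpus_per_worker=4, memory_gb_per_worker=16)

for pending in (0, 80, 1000):
    workers = target_workers(pending, tasks_per_worker=8, limits=limits)
    print(pending, workers, workers * limits.vcpus_per_worker)
```

However large the backlog grows, the worker count in this model never exceeds `max_workers`, so the worst-case spend per hour is known in advance.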

According to Amazon, these capabilities make the platform ideal for a number of data pipeline, shared cluster, and interactive data workloads.

By default, EMR Serverless workloads are configured to start when jobs are submitted and stop after the application has been idle for more than 15 minutes. However, customers can also pre-initialize workers to reduce the time required to start processing.
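As a rough illustration of how those settings might be expressed, the Python sketch below builds a request dictionary without calling AWS. The field names (`autoStartConfiguration`, `autoStopConfiguration`, `idleTimeoutMinutes`, `initialCapacity`) follow our understanding of the `create_application` call in the boto3 `emr-serverless` client and should be checked against the current documentation. The helper name, the release label, and the worker sizes are assumptions chosen for the example.

```python
from typing import Any, Dict


def build_application_request(name: str, release_label: str,
                              pre_initialized_executors: int = 0,
                              idle_timeout_minutes: int = 15) -> Dict[str, Any]:
    """Build keyword arguments for a Spark application (hypothetical helper)."""
    request: Dict[str, Any] = {
        "name": name,
        "releaseLabel": release_label,
        "type": "SPARK",
        # Start the application when a job is submitted.
        "autoStartConfiguration": {"enabled": True},
        # Stop it once it has sat idle for the given number of minutes.
        "autoStopConfiguration": {
            "enabled": True,
            "idleTimeoutMinutes": idle_timeout_minutes,
        },
    }
    if pre_initialized_executors > 0:
        # Keep workers warm so jobs skip the provisioning delay.
        # The sizes below are illustrative, not recommendations.
        request["initialCapacity"] = {
            "DRIVER": {
                "workerCount": 1,
                "workerConfiguration": {"cpu": "2 vCPU", "memory": "4 GB"},
            },
            "EXECUTOR": {
                "workerCount": pre_initialized_executors,
                "workerConfiguration": {"cpu": "4 vCPU", "memory": "16 GB"},
            },
        }
    return request


# "emr-6.6.0" is an assumed release label; substitute a supported one.
request = build_application_request("nightly-etl", "emr-6.6.0",
                                    pre_initialized_executors=4)
print(request["autoStopConfiguration"])
```

With credentials configured, a dictionary like this could presumably be passed as `boto3.client("emr-serverless").create_application(**request)`. Pre-initialized workers trade a standing charge for faster job start-up, so leaving `initialCapacity` out is the cheaper default.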

EMR Serverless also supports shared applications using Amazon’s identity and access management roles. This enables multiple tenants to submit jobs using a common pool of workers, the company explained in a release.

At launch, EMR Serverless supports applications built using the Apache Spark and Hive frameworks.

Regardless of how the application is deployed, workloads are managed centrally from Amazon’s EMR Studio. The control plane also allows customers to spin up new workloads, submit jobs, and review diagnostic data. The service also integrates with Amazon S3 object storage, enabling Spark and Hive logs to be saved for review.

EMR Serverless is available now in Amazon’s Northern Virginia, Oregon, Ireland, and Tokyo regions. ®
