Databases

Cloudera launches SaaS platform for the lakehouse crowd

And it IS a crowd – marketplace is busy, so it's hoping open approach sets it apart

Thu 18 Aug 2022 // 10:22 UTC

Former Hadoop stalwart Cloudera has announced a fully managed software-as-a-service (SaaS) version of its data platform, which it claims is more open than rivals in the over-crowded market.

With the product Cloudera Data Platform (CDP) One — initially available only on AWS — Cloudera promises analytics and data exploration in a single platform.

Adopting the term "lakehouse" (coined by Databricks to bring together the messy world of data lakes and the ordered approach of data warehouses), Cloudera also claims the new product offers a set of low-code data engineering and exploration tools to improve efficiency for expert business users.

Cloudera merged with Hortonworks in 2018 in a deal worth $5 billion after both firms had ridden the wave of the big-data-on-Hadoop hype.

The merger coincided with the emergence of cloud-based object storage technologies such as AWS S3, Azure Blob Storage and Google Cloud Storage, which solve many of the same problems as the Hadoop Distributed File System (HDFS).

In September 2019, the company launched its Cloudera Data Platform (CDP) designed to produce an integrated approach to how organizations deploy, manage and consume data across on-premises, hybrid cloud and private cloud infrastructure.

While the cloud version of CDP was available on AWS, Google Cloud and Azure, CTO Ram Venkatesh told The Register it was a platform-as-a-service offering that Cloudera operated jointly with customers. CDP One, by contrast, is a fully managed service.

It does, however, enter a crowded market. Snowflake has been trying to bring together structured and unstructured data in its SaaS data platform, while Databricks — which shares Cloudera's Hadoop heritage — has brought SQL analytics to its data lake.

But one difference, Venkatesh said, is Cloudera's openness to giving customers choice over the tools they use to manage and analyse their data.

"The cardinal sin that was in previous attempts [at combining data lakes and data warehouses] the mapping was always tied to one engine. If it was built on Hive, then Spark would be a second-class citizen. If Spark came up with it, which is [Databricks'] Delta, it is not so great for Impala," he said.

But Venkatesh said Cloudera had avoided this trap by adopting the Apache Software Foundation's Iceberg, an open table format designed for high performance on big data workloads and supported by query engines including Spark, Trino, Flink, Presto, Hive and Impala.

"The middle layer, if it's independent, it's not tied to one master. It's been designed from the ground up to work with cloud storage (not just HDFS) on the bottom end, and on the top end, it is Spark, Hive, Impala and Presto, things Cloudera may not even support.

"When you have so much data under management, it's just hubris to think that one engine can solve it all," Venkatesh said.

CDP One is now available to customers that sign up and will be widely available later this year. ®
