Cloudera execs grab the mic at ex-Hortonworks gig, dish details on new data platform

Go off-cluster if you wanna, plus 'batteries-included' Kubernetes containers

Cloudera, fresh from the uneven merger with former Hadoop distro competitor Hortonworks, used its first major public event to thrust a new data platform hard at the enterprise.

Execs at the not-so-new-look firm (a revamped logo and branding will be revealed soon) are this week taking to the stage at the DataWorks Summit, which was Hortonworks' annual shindig before the corporate mash-up.


At a press and analyst event yesterday ahead of the main conference, top brass fleshed out the firm's new flagship product, the Cloudera Data Platform (CDP), which was trailed when the Hadoop-flinger recently reported what were widely seen as lacklustre financials.

CMO Mick Hollinson in typically understated fashion said Cloudera's "enterprise data cloud" was solely focused on "solving the big data problem for the largest companies on the planet".

Hollinson outlined four elements: multi-function analytics anywhere; the ability to support "every conceivable cloud mechanism"; common support and governance; and a platform that remains open. Having been historically open-core, he said, Cloudera's distro will now be 100 per cent open source, as Hortonworks' was.

The first incarnation of CDP will be delivered in summer. This will run on two public clouds, Azure and AWS, and support both data engineering and data warehousing. Later in the year, or possibly early 2020, there will be a second release supporting private cloud containerization and further analytics functions.

As we pointed out last week, CEO Tom Reilly has said Cloudera can use CDP to compete against AWS because it can sell multi-vendor clouds and hybrid clouds too.

What's in the box?

The technical spec of the platform was filled in by Fred Koopmans, veep for product management, who touted the service as an answer to all of Cloudera's customers' prayers.

For instance, a "huge driver" for many of its customers is to ensure products are open source and "provide no dead ends", while a hybrid, multi-cloud platform allows them to be prepared for rapid and dramatic changes in infrastructure.

Koopmans noted that both companies last summer introduced major new versions of their platforms for the first time in about five years, but that the "vast majority" of customers were still running previous versions. Those customers will be given a direct upgrade path to CDP; whether there would be one, he said, was a common question.

However, CDP, he said, was a much bigger leap – "a new kind of platform."

It will include a shiny interaction model that means not everyone in a firm has to share the same base clusters or upgrade cycles. Koopmans said this was in response to a common question of how firms can speed up internal access to data, and how they can "be more agile".

CDP, he claimed, addresses this by allowing biz users to deploy new applications off-cluster. There are two options for the application experience: a flexible approach called Distro-X, or a self-service experience that is meant to be simpler and more constrained. The latter trades flexibility for a lot more automation.

A biz, for example, can build a self-serve data-mart that can be shared with a particular team for a few weeks, and then "throw the whole environment away". This, Koopmans said, was perfect for people who don't want to invest a lot of time and scripting for something "ephemeral".

CDP also brings with it a new computing model. Rather than deploying on bare metal or, in the cloud, running on IaaS from Amazon, Google or Microsoft, Koopmans said it will now run on a container platform.

"First off it's virtualized by default, rather than as an afterthought; second it's elastic, so you can grow and shrink these resources much more simply, much more efficiently," he said. This means storage and compute don't have to scale at the same rate.

In the data centre, there will be two deployment models: in the first, the customer provides its own Kubernetes environment; in the second, Cloudera offers a "batteries-included" version. "Most customers don't yet have a general-purpose Kubernetes environment we can run a container on, or if they do it's not really optimised for big data applications," he said.

There will also be a new management framework that Koopmans said would enable much greater scale for applications. With potentially thousands of applications sharing a data lake and computing environment, the exec said it was crucial for customers to have unified control of that, along with automated management of their life cycle. There will also be unified metadata management for common security and governance.

Hollinson previously claimed this common security and governance model helped Cloudera's platform stand out from those of other vendors, as it didn't create a "competitive moat".

"There are many companies that may offer one workload, they offer that with their own set of security and governance models. Then if you buy another workload from another company, you need another [model]," he said. "This is true even inside the large public cloud vendors."

Other elements are new form factors for simplified operation in the cloud, new portability and integration tools, and a new development model that Koopmans claimed will allow faster execution, with updates pushed out twice a month.

There will be expanded data warehouse tools, taking the best elements of each of the distros' toolsets, which had diverged, with the eventual aim of automating the choice so that the best tool is selected for each use case.

Koopmans also pointed to new capabilities available for existing customers now, "without any major surgery". For CDH customers, these include a remote cluster management service, an operational element that Hortonworks already had. HDP customers will get Cloudera's integrated machine learning model development platform, which aims to help data scientists be more productive.

Hollinson also said that the firm was trying to "teach customers how to fish" – by which he meant that Cloudera would sell them professional services and training – adding that the two Hadoop-flingers would continue their respective business relationships with other firms. ®

