Cloudera blows sawdust off new data science workbench

Still keeping schtum on IPO, though

Tue 14 Mar 2017 // 17:01 UTC

Cloudera is adding a data science workbench to its enterprise product, based on the offerings of Sense.io, the startup it bought last year.

The product addition comes as the Palo Alto-based company reportedly prepares for a $4.1bn initial public offering later this year, though official channels are keeping quiet about the matter.

Not about everything, however: the self-service tool for data scientists is being announced as an add-on to Cloudera Enterprise. The workbench is currently in beta, and its price has not yet been announced. It will allow data scientists to use their preferred languages – including R, Python and Scala – and libraries within the Spark- and Hadoop-integrated platform.

Speaking to The Register, Cloudera's head of data science, Sean Owen, declined to start the “holy war” over which of the languages the workbench offers is best, but explained that “all are relevant for data science”. The workbench brings together the data and compute platform of Hadoop for large-scale production efforts, typically written in Java, Scala or other JVM languages, while also giving data scientists access to their tools of choice in R and Python within the same ecosystem.

Asked whether the workbench could be used to analyse whether to buy shares in a company's initial public offering, Owen chuckled and told The Register: “That depends on what company we're talking about. Other than that I'd say no comment.”

Workbench will be accessible like the web-based notebook tools Zeppelin and Jupyter, letting users access code and scripts, edit them, and execute them without fiddling with the command line or firing up an IDE, Owen explained. He said this is enjoyable to do on-cluster “because I don't have to copy the data out of the secured parameter” of Cloudera Enterprise.

This will also open up hybrid possibilities for data analytics, according to Owen, who acknowledged that, for instance, combining Hadoop and deep learning has been difficult. “That's easier with Data Science Workbench,” said Owen, with the notebook running on customers' clusters and allowing both Hadoop-native tooling and libraries and tools developed elsewhere, such as Google's TensorFlow.

Projects to tackle distributed deep learning already exist, including Deeplearning4j, which is commercially supported by Skymind, while Yahoo!'s interest in Hadoop has seen it open-source its TensorFlowOnSpark offering.

Owen told us “Cloudera doesn't have special secret plans” to develop its own machine learning tools. Just like it doesn't have any special secret plans to IPO this year… ®
