Cloudera blows sawdust off new data science workbench

Still keeping schtum on IPO, though

Cloudera is adding a data science workbench to its enterprise product, based on the technology of a startup the company acquired last year.

The product addition comes as the Palo Alto-based company reportedly prepares for a $4.1bn initial public offering later this year, though official channels are keeping quiet about the matter.

Cloudera is not quiet about everything, however: the self-service tool for data scientists has been announced as an add-on to Cloudera Enterprise. The tool is currently in beta, and its price has not yet been announced. It will allow data scientists to use their preferred languages – including R, Python and Scala – and libraries within the Spark- and Hadoop-integrated platform.

Speaking to The Register, Cloudera's head of data science, Sean Owen, declined to start the “holy war” over which of the languages Workbench offers is best, but explained that “all are relevant for data science”. The tool brings together the data and compute platform of Hadoop for large-scale production efforts, typically written in Java, Scala or other JVM languages, while also giving data scientists access to their tools of choice in R and Python within the same ecosystem.

Asked whether the workbench could be used to analyse whether to buy shares in a company's initial public offering, Owen chuckled and told The Register: “That depends on what company we're talking about. Other than that I'd say no comment.”

Like the web-based notebook tools Zeppelin and Jupyter, Workbench will let users open code and scripts, edit them, and execute them without fiddling with the command line or firing up an IDE, Owen explained. Doing so on-cluster is enjoyable “because I don't have to copy the data out of the secured perimeter” of Cloudera Enterprise.

This will also open up hybrid possibilities for data analytics, according to Owen, who acknowledged that, for instance, combining Hadoop and deep learning was difficult. “That's easier with Data Science Workbench,” said Owen, with the notebook running on customers' clusters and supporting Hadoop-native tooling and libraries as well as tools developed elsewhere, such as Google's TensorFlow.

Projects tackling distributed deep learning already exist, including Deeplearning4j, which is commercially supported by Skymind, while Yahoo!'s interest in Hadoop has seen it open-source its TensorFlowOnSpark offering.

Owen told us “Cloudera doesn't have special secret plans” to develop its own machine learning tools. Just like it doesn't have any special secret plans to IPO this year… ®
