Cloudera tells bright Sparks: Go teach yourselves Hadoop

Online DIY courses gotta be less tedious than teaching GCHQ's finest, right?

Cloudera, presumably sick of paying its staff to train spies and their ilk, has decided to launch online courses for those wanting to familiarise themselves with Hadoop and Spark.

The Palo Alto-based business has long offered training courses, including to Blighty's surveillance agency GCHQ, whose recently open sourced graph database Gaffer is made of Java code sat atop the Hadoop Distributed File System.

According to the Heilbronn Institute for Mathematical Data Mining Research Problem Book (another document ridiculously classified as top secret), which was pilfered by former NSA sysadmin Edward Snowden, the spooks' internal GCWiki has a page on Hadoop which includes Cloudera resources for learning how to use the big data tool.

GCHQ's wiki currently holds six lectures and two exercises from Cloudera and the organisation has also sent staff to attend multi-day training courses by the company.

The technology is also seeing use elsewhere in the public sector, with the Home Office currently working to centralise its many databases on the good folk of Britain using Hadoop, and of course without presenting the capability increases to the public or subjecting them to Parliamentary scrutiny.

Cloudera's instructor-led training remains a solid source of revenue, although the company views training as a supplementary income stream. At fellow Hadoop heavy Hortonworks there is some strong conflict about the operating costs of wringing revenue out of support versus services. Cloudera, by contrast, simply doesn't believe that “paying for support calls [is] a viable business model,” according to Cloudera-man and Hadoop inventor Doug Cutting.

Speaking to The Register earlier this year, Cutting said: “Hortonworks is obviously attempting that, and we'll see if they can actually achieve profitability on that basis. We've looked at it and we don't believe it can be done at the rates people are willing to pay for support.”

OnDemand, the company's eLearning platform, combines recorded lectures from Cloudera senior instructors with hands-on exercises that customers complete in a cloud-based lab environment. It can be expected to generate some revenue and provide some community support with very low operating costs.

Among the courses being made available through OnDemand [PDF] are Developer Training For Spark and Hadoop, Cloudera Administrator Training, and Data Analyst Training. The content will be sold either as a library, which gives individuals access to Cloudera's entire suite of training content for a year, or by individual course title on six-month subscriptions. ®
