Comment Hadoop is hot. That’s not in question. What is in question, however, is what it’s hot for, and whether it can move beyond Silicon Valley geeks to mainstream enterprises. With Hortonworks filing to IPO and Cloudera reportedly doing more than $100m in sales, it’s tempting to think that Hadoop has already gone mainstream.
The reality, however, is very different.
Hadoop remains an overly complex beast for most enterprises, one reason that 43 per cent of Hortonworks’ revenue derives from low, low-margin services. While many people know how to pronounce Hadoop, far fewer know how to implement it, or why they should.
For Hadoop to truly become mainstream, then, it needs to move beyond its current status as an arcane science for the Silicon Valley elite and instead become the “operating system for distributed data” that its founder, Doug Cutting, proposes.
“Now what am I supposed to do with this?”
Most every enterprise today purports to be dabbling in Hadoop, and the jobs data confirms it. Each year Gartner asks enterprises about their big data plans, with “big data” often a synonym for “Hadoop” in the minds of many. In the most recent survey, 73 per cent of enterprises declared they have already invested or plan to invest in a big data project within the next two years.
More tellingly, while 2012 to 2013 saw little movement beyond pilots to deployments, from 2013 to 2014 more enterprises got off the big data fence and started running Hadoop and other Big Data technologies in production.
While this seems like a big win for big data and Hadoop, its poster child, the survey also revealed confusion.
For example, respondents suggested a wide variety of data sources they plan to add to their projects, including difficult sources like audio and video. As Gartner analyst Nick Heudecker posits, this “overly optimistic and apparently random nature of future data sources for analysis indicates” that “organizations don't have a plan for what they're going to do next” as “picking everything isn't a strategy.” Indeed, it may simply “indicate a fear of missing out on an opportunity yet to be defined.”
That ill-defined “strategy” comes through clearly in a separate Gartner survey on Hadoop adoption.
Here Gartner asked about Hadoop blockers and found that the biggest roadblock to Hadoop adoption by far was its “undefined value proposition.”
Perhaps because of that ill-defined “Thneed-like” value proposition (“A Hadoop’s a fine something that all people need”), much of the vendor revenue derives from professional services (at “horrific[ally bad] -35 per cent margins,” as Host Analytics chief executive Dave Kellogg notes) meant not only to implement Hadoop, but also to help customers figure out why they need it in the first place.
This may be yet another reason that, hot as Hadoop may be, Hortonworks’ IPO appears to be anything but: despite claiming a $1bn valuation a few months back, its IPO values the company at a mere $659m.
No billion-dollar unicorn, this.
But, again, much of this stems from customer confusion. Former Wall Street analyst turned MongoDB senior director of corporate strategy Peter Goldmacher recently related to me a conversation he had with a salesperson at one of the Hadoop startups, who had just sold a $1m deal to an enterprise buyer. After the deal closed, the buyer asked: "Now what am I supposed to do with this?"
Making Hadoop mainstream
It needn’t always be like this, of course. While 72 per cent of CIOs surveyed by Barclays believe it is “still too early to say whether Hadoop would become an important technology in their organization,” there’s so much money and energy going into Hadoop that over time its community should be able to overcome its hurdles.
Most Hadoop jobs congregate in Silicon Valley where well-paid propellerheads employ Hadoop and other Big Data technologies to convince consumers to click on more ads. But these aren’t going to be the real winners in Hadoop. Not according to Goldmacher, who argues:
The biggest category of winners is the big-data practitioners. These are the business people that have identified opportunities to use data to create new opportunities or disrupt legacy business models.
But this won’t happen if Hadoop remains overly complex to use. Or maybe we’ll simply turn to other technologies.
It may be that Hadoop expertise will then trickle down to the rest of the planet, but Cloudera and other Hadoop vendors could do a lot for their Hadoop treasure chests by mass-producing online training similar to what MongoDB – where I was vice president of community until recently – has done for the NoSQL database (more than 200,000 registrants).
They could also invest far more in making Hadoop easy to consume. Some of this work is being done by the Hadoop vendor community. Apache Spark is an excellent example: its displacement of Hadoop’s original MapReduce has “largely been driven by developers who are tired of the complexity of MapReduce and who want an easier and faster way to build big data applications, primarily for Hadoop.”
But much more is needed. In fact, I’d argue that what Hadoop needs is more Microsoft-esque influence: a company that can provide easy-to-use tooling for Hadoop, making it approachable to the average data analyst rather than requiring an overpaid data scientist.
Now, please. ®
Matt Asay is vice president of mobile at Adobe. Previously he worked in corporate strategy at 10gen and was SVP of business development at Nodeable, acquired in October 2012. He was formerly SVP of biz dev at HTML5 start-up Strobe (now part of Facebook) and chief operating officer of Ubuntu commercial operation Canonical. With more than a decade spent in open source, Asay served as Alfresco's general manager for the Americas and vice president of business development, and he helped put Novell on its open source track. Asay is an emeritus board member of the Open Source Initiative (OSI). You can follow him on Twitter @mjasay.