Microsoft pact holds gun to Yahoo!'s stuffed elephant

Just when Yahoo! was relevant again...


Updated You didn't shed a tear over the death of Yahoo!'s independent search engine? That may change.

As the two companies finally ended the epic gestation period for their inevitable web search pact, Yahoo! and Microsoft announced that Bing - Redmond's fledgling search engine - will be "the exclusive algorithmic search and paid search platform for Yahoo! sites." And though the two Google chasers made it clear that Yahoo! will continue to use its own technologies to drive other areas of its business, you have to wonder what the pact means for the future of Hadoop, the open-source grid platform that had finally restored Yahoo!'s mojo.

Yahoo! is the largest contributor to the increasingly popular Apache project, contributing more than 70 per cent of all patches, and it employs the project's founder, Nutch-crawler-creator Doug Cutting. But in signing its pact with Microsoft, it would appear that the company has agreed to bury its largest Hadoop application: the Yahoo! Search Webmap.

The Webmap - which provides the Yahoo! search engine with a database of all known web pages, complete with all the necessary metadata - has also been described (by Yahoo!) as the world's largest Hadoop application. And though Hadoop powers other portions of Yahoo!, it's unclear whether the company will put as much time and money into moving the platform forward. Yahoo! has not responded to our requests for comment. Nor has Microsoft.

Redmond told Cnet that it's "open" to merging Bing with Yahoo!'s Searchmonkey platform, a misguided effort to expose the company's search results to third-party developers. But although Bing's "reference vertical" uses Hadoop - thanks to the acquisition of semantic search startup Powerset - it seems unlikely that Redmond would embrace Hadoop on Bing proper. Indeed, Powerset's general manager has told us that nearly a year after the startup's acquisition, Microsoft has made no plans to do so.

Even if it did, that's beside the point. The point here is that Yahoo! - Hadoop's godfather - is giving up the crown jewel in its Hadoop empire.

Inspired by Google-published research papers describing Mountain View’s proprietary software infrastructure, Hadoop is a means of crunching epic amounts of data across a network of distributed machines. Doug Cutting originally developed the platform for use with Nutch, naming it after his son's stuffed elephant. But in 2006, he was hired by Yahoo!, and by the beginning of last year Hadoop had made its way onto Yahoo! production systems.

Webmap is the big example. But Yahoo! does use Hadoop for various other tasks. The platform now powers the real-time automated algorithms that select news stories for the Yahoo! home page. And in some cases it's used to optimize ads - i.e. to match content with relevant advertising.

Presumably, Hadoop will continue to drive these non-search tools. But does that mean Yahoo! will continue to put its considerable weight behind the project's continued development?

Christophe Bisciglia is confident that Yahoo!'s commitment will remain. "Hadoop isn't just about search," says Bisciglia, one of the minds behind Cloudera, a Silicon Valley startup offering a commercialized version of Hadoop. "Over the coming months, we will likely see Yahoo! shift resources towards the advertising and content businesses, but Hadoop plays a critical role there as well, so even if the clients for Hadoop change a bit, I don't see the overall investment from Y! decreasing.

"The expensive part of operating a search business is the hardware itself - not the development team working on Hadoop. If anything, this will better position their Hadoop team to attack challenges that have more impact on Yahoo!'s bottom line."

Granted, Bisciglia has a certain interest in Yahoo! maintaining its Hadoop efforts. But let's hope he's right. The destruction of Yahoo!'s search engine comes just as Hadoop is taking off. It underpins Facebook's backend infrastructure. It's offered up from Amazon's Web Services cloud. And last month's Hadoop Summit - driven by, yes, Yahoo! - attracted more than 700 developers from around the globe.

What's more, Hadoop had finally made Yahoo! relevant again. Yes, the project was inspired by work done at Google. But whereas Google has kept GFS and MapReduce largely hidden behind the walls of the Mountain View Chocolate Factory, Yahoo! has embraced this new-age distributed computing paradigm as an open source project, inspiring countless other developers and web outfits along the way. And at least until Google says otherwise, the open-source incarnation of MapReduce is outperforming the original.

After years as a frivolous headline that few actually bothered to click on, Yahoo! has finally found its mojo. What a shame it would be if Microsoft took it away. ®

Update

With a blog post Thursday morning, after this story was published, Hadoop development VP Eric Baldeschwieler has reaffirmed Yahoo!'s commitment to the project. "Don't Panic!," he wrote. "We are as committed as ever to building a world class open source Cloud Computing infrastructure and Apache Hadoop remains our solution for batch computing. Hadoop is used to solve many, many internet scale problems beyond search at Yahoo. Today's deal only improves Yahoo's ability to invest in Hadoop.

"Yahoo is buzzing with more energy and bigger plans than ever before. The Hadoop team is running to keep up with our internal customers demands for ever larger, faster and better clusters. We are all looking forward to working with you, the wider Hadoop community, to build the better Hadoop that we all want."

