Google buys search engine – PageRank™ RIP?

Got bots?


Google has bought Kaltix, a three-month-old, three-man Stanford startup that's working on personalized and context-sensitive search. Despite its battalion of PhDs, Google isn't too proud to acquire external search technologies, and earlier this year bought Applied Semantics for its CIRCA ontology, which "understands, organizes, and extracts knowledge from websites and information repositories in a way that mimics human thought".

Google has made no secret of its goal to "understand" the web, an acknowledgement that its current brute-force text index produces search results with little or no context. The popularity of Teoma demonstrates that even a small index can produce superior results for certain kinds of searches. Teoma leans on existing classification systems.

While Google relied on PageRank™ to provide context, all was well. But PageRank is now widely acknowledged to be broken, so new, smarter tricks are required.

What was regarded as heresy when we raised the issue last spring is now acknowledged by some of Google's warmest admirers, among them MetaFilter's Matt Haughey and web designer Jason Kottke.

As Gary Stock noted here last May, Google "didn't foresee a tightly-bound body of writers. They presumed that technicians at USC would link to the best papers from MIT, to the best local sites from a land trust or a river study - rather than a clique, a small group of people writing about each other constantly. They obviously bump the rankings system in a way for which it wasn't prepared."

Although it's tempting to suggest that bloggers broke PageRank™, it might equally be the case that the Blog Noise issue is emblematic rather than causal. Blog Noise - in the form of 'trackbacks', content-free pages and other chaff - is the most visible manifestation, but mindless list-generators are also to blame for Google's poor performance. And the truth is every successful search engine will find itself engaged in an arms race with gamers. (Deliciously, in the case of email spammer Elwyn Jenkins, a former e-currency salesman who proselytizes weblogs by day, and by night offers advice on how to improve your PageRank, the bloggers and the Google gamers are one and the same [includes screenshots]).

Daniel Brandt, who runs the 100,000-document NameBase archive, has been PageRank™'s most severe critic, and acknowledges that it lives on in name only. Google no longer performs a monthly recalculation of PageRank values and anchor text is the most highly valued criterion for a search, he says. Which makes Google hardly distinguishable from AltaVista five years ago.

"Quantity not quality is the word on the street," he told us. "The old method of doing PageRank™, which had more integrity or consistency to it, is no longer being done. PageRank™ was a bad idea philosophically to begin with, but now some spammer can set up hundreds or thousands of sites automatically with anchor texts pointing to one page. Before, each would have such a tiny PageRank™ that it wouldn't amount to a hill of beans. Now you can do that and if the anchor text is carefully chosen it will make a difference," he reckons. "The cure is worse than the disease".

As an example he cites the results for the search "discount brokers" - of all the discount brokers, Google's top result is a page which has been empty for a year.

So perhaps Google needs to give a formal burial to PageRank™ rather than maintaining a ghoulish afterlife as a marketing gimmick. The future promises to be much more interesting. ®

Related Links

"Make Money Online" - How bloggers game Google
PageRank is Dead [Zawodny]
"Google is busted" [Kottke]
Maybe I should write for the Register UK too? [Haughey] - follow-up

Related Stories

Google to fix blog noise problem
Blog noise is 'life or death' for Google
Google - the only archive we'll ever need?
Google heals the sick



Biting the hand that feeds IT © 1998–2022