Google Research head dubs holy PageRank 'over-hyped'

Norvig mum on 'Caffeine' search shot


Google research head Peter Norvig says that the search giant's hallowed PageRank link-analysis algorithm is overrated. And always has been.

"One thing that I think is still over-hyped is PageRank," Norvig said this morning during a question and answer keynote at the search-obsessed SMX West conference in Santa Clara, California. "People think we just do this computation on the web graph and order all the pages and that's it. That computation is important, but it's just one thing that we do.

"People [webmasters and SEOs] always said, 'We're stuck if we don't have [a high PageRank].' But we never felt that way. We never felt that it was such a big factor."

PageRank attempts to measure the relative importance of a web page based on which other pages link to it, and on how important those linking pages are themselves. Named for Google co-founder Larry Page - who developed the idea while at Stanford University - the patented technology is a central pillar in the Google Mythology, receiving much of the credit for the Mountain View search engine's rise to world dominance. The PageRank patent actually belongs to Stanford, with Google owning exclusive license rights.
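
For the curious, the link-analysis idea can be sketched in a few lines of Python. This is a toy illustration of the textbook power-iteration form of PageRank, not Google's production code: the function name, the tiny three-page graph, the iteration count, and the 0.85 damping factor (a commonly cited value) are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Toy PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's settings.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with rank spread evenly across all pages.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over every page.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: a and b both link to c, and c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"]}
for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

In this toy graph, page c comes out on top because two pages link to it, while page b, which nothing links to, ends up with only the baseline share. As Norvig's remarks suggest, a score like this would be one ranking signal among many.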

Late last year, Google added so-called "real-time search" to its engine - serving up links to fresh Tweets, news, and other recent web posts - and this morning, Norvig was asked if this was a far more difficult undertaking, considering that PageRank doesn't work well with Web2.0rhea. Norvig - the former director of search quality at Google - was quick to say "no," explaining that even with core search, PageRank isn't as important as people think it is. And never was.

"[PageRank] has a catchy name and the name recognition. But we've always looked at all the things that are available [when ranking search results]. We look at where do things come from, what are the words used, how do they interact with each other, how do people interact with them," he said.

"[Real-time search] is more similar [to core search] than dissimilar, in that you're grabbing every available signal and trying to figure out the best way to combine them. The fact that there aren't legacy links from a long time ago - we don't think of that as that much different."

The key to real-time search, Norvig said, is Google's famously distributed back-end infrastructure, which is able to re-build its web index with relatively little delay. When Norvig first joined the company, the Google web index was built once a month. Then the company moved to once a day and then to once an hour. Now, its distributed infrastructure - using proprietary technologies like the Google File System and MapReduce - can update its index in "10 seconds," according to Norvig.

When the hourly index was rolled out, Norvig remembered, Larry Page insisted on calling it the "3600 second" index. "If it was hourly, it was just going to stay like that," Norvig said. "But if you talk about it in seconds, people are going to push it down to 1000 seconds and eventually you get it down to 10. And that's where we are now. His vision has come true."

Google is currently testing a new search indexing system known as "Caffeine," which uses, among other things, a complete rewrite of the Google File System known at least informally as GFS2. In the fall, uber Googler Matt Cutts indicated that Caffeine would begin rolling out across the company's global infrastructure after the Christmas holidays, but that hasn't happened yet.

Norvig would not be drawn on the state of Caffeine, merely confirming that it is still being tested in a single data center. The Google Research team, he said, has very little to do with the company's infrastructure work. "For historical reasons, that kind of systems programming stuff has not been done in Research," he said. ®


Biting the hand that feeds IT © 1998–2022