Google Books Settlement Con Is Google Book Search the last library?
Geoff Nunberg, one of America's leading linguistics researchers, laid this rather ominous tag on Google's controversial book-scanning project amidst an amusingly heated debate this afternoon on the campus of the University of California, Berkeley.
"This is likely to be The Last Library," Nunberg said during a University conference dedicated to Google Book Search and the company's accompanying $125m settlement with US authors and publishers. "Nobody is very likely to scan these books again. The cost of scanning isn't going to come down. There's no Moore's Law for scanning.
"We don't know who's going to be running these files 100 years from now. It may be Google. It may be News Corp. It may be WalMart. But we can say with some certainty that 100 years from now, these are the very files scholars will be using."
Even in the short term, many are worried that Google's settlement with authors and publishers gives it an undue amount of control of the world's digital books. The pact - which still requires court approval - provides a unique license to scan, use and make money from so-called orphan works, titles whose rights holders have yet to come forward.
With its Book Rights Registry, the settlement also gives third parties the opportunity to negotiate access to the books whose rights holders have indeed come forward - and given their approval. But Google will still control the scanning and the cataloging of the books, and Nunberg questions whether the company is prepared to get things right. He spent a good 20 minutes questioning the quality of Google's scans and metatagging, pointing out error after error in its catalog - from books on Jimi Hendrix dated before his birth to a book on copyright categorized under drama.
"I wonder if this is Google's idea of a joke," he said, referring to that last snafu.
Predictably, Google Book Search engineering lead Dan Clancy takes issue with The Last Library characterization. He acknowledges that some of the works Google has scanned will never be scanned again. But he's adamant that although Google has a 10-million-book head start - and a monopoly-building boondoggle of a settlement with authors and publishers - others will compete.
"I don't view Google Book Search as the one and only library," he said. "I don't think it should be and I don't think it will be - in part because, remember, a library is about accessing information, not just accessing books. Libraries were created because books were where information was in the past.
"Libraries are about information, and...Google is not the only book-scanning activity in existence today. There will continue to be other activities. And the internet provides all sorts of information that are linked together in all sorts of ways."
No, he did not acknowledge that Google is well on its way to controlling the internet as well.
Instead, he argued that if Book Search is The Last Library, then Google was our only hope for a universal digital library in the first place. "To the extent that Google Book Search is our last shot, if that were really true, then it's probably also true that if not for Google Book Search, we never would have had a shot - which I don't think is true either."
But several minutes later, he said it was true. "If Google hadn't have done it, would someone else have done it? I think probably not."
Google has a head start in part because it had the money needed to scan those 10 million books - and it had the audacity to spend. But it also had the money - and the blatant disregard for copyright law - needed to endure a lawsuit from authors and publishers and negotiate a $125m settlement.
Though he wouldn't say how much Google has spent scanning books, Clancy admitted it wasn't cheap. "It's a lot," he said. "If this was just tens of millions of dollars, we wouldn't all be sitting here debating this. Microsoft would have kept scanning. And there would be much more incentive to do this."
But as usual, he did not acknowledge that not everyone can afford to get themselves sued for multi-millions of dollars. Yes, there are other book scanning operations, including the Internet Archive. But as the IA's Peter Brantley told The Reg, the not-for-profit can't afford to wade into that legal territory.
And even if it did, it would likely end up spending far more than Google has spent. Now that Google has already settled with authors and publishers, could another operation swing the same terms? And as Brantley points out, now that Google has already inked deals with the world's leading research libraries to scan books, are those libraries going to let other operations into their stacks?
So, to summarize: Google says that if it hadn't scanned all those books, no one else would have. And now there's less incentive to scan all those books. But Google insists it's not The Last Library. ®