In what could prove a landmark in the history of the internet, Google has announced arrangements with Harvard University and a handful of public libraries to digitize parts of their valuable collections and make them available over the public web. Yahoo!, Grokker and Microsoft are working on similar ventures.
Google Print Library, as it's called, will take many years to complete its first phase and, like the others, faces tremendous hurdles. Copyright and licensing issues remain a huge obstacle; the ontological expertise remains the domain of information professionals; and as a monopoly gateway to the world's information, no private corporation can expect to evade regulatory concerns. And lazy governments, both central and local, could use it as an excuse to axe what commitments they have to making high-quality information available. Any of these issues could hobble the venture, leaving a service that's as useful as the fake cardboard book-props one can buy by the yard to fill an empty study bookcase. But as a statement of intent, such ventures deserve to be taken seriously.
Google will co-operate with major academic libraries in scanning and digitizing works and making them searchable. The results will be displayed using Google Print - which uses DRM to restrict the viewing and printing of copyright material - alongside links to commercial booksellers such as Amazon.com or, using Open WorldCat metadata, information on where to find the book at your local library. Initial partners include Harvard, with 15 million books, Oxford's Bodleian Library, Stanford and the University of Michigan, where the scanning of seven million books is expected to take six years. Google won't at first offer advertisements on Print Library, although there's plenty of scope for this to change. For example: Do you want fries with your burgher?
At ResourceShelf, Gary Price has a roundup of other digitization projects, and librarian Steve Cohen offers a few notes of caution. Google will need to improve on the brute-force text search algorithms it uses today, he notes, and "libraries should be pushing their own materials through their websites rather than having to 'rely' on Google to do so".
The promise of universal access to data, repeated over a decade of internet hype, has not been fulfilled, and the role of librarians as information professionals has been consistently undervalued - something, we suspect, to do with the adolescent hostility to expertise that characterizes so much internet evangelism. That, in turn, probably has a lot to do with the internet's libertarian roots. Whether the private sector succeeds, after a decade of failure, in overcoming copyright interests remains to be seen, and whether it can be trusted to do so is another question. We'll certainly need the librarians to keep the Microsofts and Googles both honest and effective. ®