Six Goals of Mass Digitization
1. Improve discovery
   – Full-text indexing via Internet search engines makes relevant book content easier to find
2. Improve access
   – Books available in full view can be delivered wherever users are working: home, office, laptops, mobile devices, ebook readers…
3. Enable new modes of scholarship
   – Text mining and other forms of sophisticated textual or computational analysis can unlock the knowledge contained in books in new ways
4. Preserve and protect our collections
   – Whether from normal or catastrophic loss, preserving our books digitally will safeguard the intellectual record for future generations
5. Manage our print collections more effectively
   – By making our collections available in digital form, libraries can adopt more efficient strategies for managing and providing access to the corresponding print volumes when needed
6. Fulfill our public service mission
   – Many books of enduring general interest in the public domain can now be read by anyone, anywhere, anytime.
Could we digitize our collections ourselves on this scale?
Book digitization without Google
15 million UC books
   Cost: $495 million
   Time: 144 years*
* Based on mass digitization throughput of 2,000 volumes per week for a 20-scanner facility working two shifts, at $0.10 per page
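The footnote's figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers on the slide; the 52-week operating year is an assumption, and the pages-per-volume figure is not stated on the slide but is backed out from the quoted cost.

```python
# Figures quoted on the slide.
VOLUMES = 15_000_000        # UC books to digitize
VOLUMES_PER_WEEK = 2_000    # 20-scanner facility working two shifts
COST_PER_PAGE = 0.10        # dollars per page
QUOTED_COST = 495_000_000   # dollars

# Assumption (not on the slide): the facility runs 52 weeks a year.
WEEKS_PER_YEAR = 52


def years_to_digitize(volumes, volumes_per_week, weeks_per_year=WEEKS_PER_YEAR):
    """Elapsed years needed to scan every volume at a constant weekly rate."""
    return volumes / volumes_per_week / weeks_per_year


def implied_pages_per_volume(total_cost, volumes, cost_per_page):
    """Average page count per volume implied by the quoted total cost."""
    return total_cost / (volumes * cost_per_page)


years = years_to_digitize(VOLUMES, VOLUMES_PER_WEEK)
pages = implied_pages_per_volume(QUOTED_COST, VOLUMES, COST_PER_PAGE)
print(f"Time: about {years:.0f} years")
print(f"Implied average length: about {pages:.0f} pages per volume")
```

Under these assumptions the time estimate matches the slide's 144 years, and the cost figure corresponds to an average of roughly 330 pages per volume.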
Library partnerships with commercial firms are not new
GBS Pre-Settlement
• Full view for pre-1923 public domain only
• Full-text search + snippets for in-copyright material
Timeline: Full View (Public Domain) | 1923 | Discovery Only + Snippets (In-Copyright)
GBS Post-Settlement
• Full access to ‘millions’ more books
   – Including most of the 20th century
• Previews for in-print material
• Research corpus for advanced research
• Access for print-disabled users
• Will support more efficient print collection management
Timeline: Full View (Public Domain + Out-of-Print) | 199-? | Preview (In-Print)
Criticism has improved the Settlement
• Most Favored Nation clause removed
• Better treatment of orphan works
   – Eliminates conflict of interest with known rightsholders
• Explicit provisions enabling rightsholders to make their books available without restriction or cost
Some responses to the Settlement controversy
• Pricing
• Monopoly
• Orphan Works
Charging high prices for subscriptions
– The Settlement’s broad distribution requirement will work against this
– Pricing is subject to a formal challenge process including binding arbitration
– Libraries are experienced in negotiating access to ebook content and will assess the Institutional Subscription critically
– Under the ASA, Google can offer discounted pricing indefinitely
– There will be a more level playing field if orphan works legislation passes
Giving Google a monopoly
– Amazon and Microsoft protesting Google’s monopoly???
– The book market is large and diverse
   • $24 billion in the US alone
   • Most book sales are for very current materials; out-of-print books are a minor factor
   • Many out-of-print works are in the public domain with no barriers to competition (1.8M free ebooks are available for the Kindle today)
   • Microsoft and others had the same opportunity, but withdrew
– Under the ASA, third-party resellers can sell access to Google books (including orphans) under the Consumer Purchase Model and receive most of Google’s 37% revenue share
– The Book Rights Registry and individual rights holders can strike deals with other providers, and are expected to do so
   • The Registry will have an incentive to work with other distribution channels to demonstrate its relevance to rights holders
   • Most Favored Nation clause no longer a factor
Locking up orphan works
– Google’s competitors will be motivated to join the push for orphan works legislation
   • If successful, orphan works legislation will override the Settlement terms, making Google’s privileged position short-lived
– Meanwhile, the ASA removes the conflict of interest between registered rights holders and rights holders of unclaimed works
– As rights holders surface via the Settlement claims process, the true scope of orphan works will be better known
   • Once that happens, providing access to remaining unclaimed works will entail less risk than it does currently, making the legal protections offered to Google under the Settlement less of an advantage
The Bottom Line
• As one commentator has written:
   – “The settlement is not what you would come up with if you began with a blank piece of paper and designed the optimal system for all the interested parties.”
• Nonetheless, the Settlement will:
   – Make millions of books in research library collections more accessible to users and the general public than ever before, and more accessible than they are now via Google Book Search
   – Provide a significant corpus of material for advanced computational research
   – Allow individual rights holders to convey broader use rights if they wish
   – Allow participating libraries to retain their copies of Google in-copyright scans for replacement purposes, for the creation of additional services, and for long-term preservation of the intellectual record
   – Create additional opportunities for management of library print collections
   – Potentially, spur a more rational legislative solution for orphan works
Finally… Libraries, not Google, are in charge of their own future
http://catalog.hathitrust.org
Currently digitized: 6.6 million volumes
   1.3 million public domain
Projected: 12 million by 2014