Anatomy of Aggregate Collections: The Example of Google Print for Libraries Brian Lavoie Senior Research Scientist OCLC Research OCLC Members Council Meeting.


1 Anatomy of Aggregate Collections: The Example of Google Print for Libraries Brian Lavoie Senior Research Scientist OCLC Research OCLC Members Council Meeting October 2005

2 Aggregate collections
- Boundaries between local and external collections increasingly blurred …
- Resource sharing (digital/network technologies)
- Cooperative collection management (resource allocation)
- Shift in focus to resources of the system (or subsets of the system), rather than individual collections
- Need data to support/illuminate system-wide perspective
- Characterize/analyze aggregate collections
- WorldCat: largest aggregate collection
- Aggregate holdings of >20,000 libraries
- Bridge from local to system-wide perspective

3 The system-wide print book collection as represented in WorldCat (January 2005)
[Figure labels: ~55 million; ~41 million; ~35 million; ~32 million print books]
More information: http://www.oclc.org/research/presentations/lavoie/cni2005.ppt

4 Google Print for Libraries
- Aggregate collection of print books
- Aggregate print book holdings of five major research libraries (Harvard, Michigan, Oxford, NYPL, and Stanford)
- Focus on copyright issues; very little discussion of Google Print for Libraries as an aggregate collection
- What are characteristics of this aggregate collection?
- How does it relate to the system-wide collection?
- WorldCat: useful data source for analysis
- Lavoie, Connaway, Dempsey: Anatomy of Aggregate Collections: The Example of Google Print for Libraries, D-Lib (September 2005) http://www.dlib.org/dlib/september05/lavoie/09lavoie.html

5 G5 coverage of system-wide print book collection
10.5 million unique books

6 Holdings overlap
Potential redundancy rate of 40 percent
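The slide does not show how overlap was computed. The following is a minimal sketch of one way to do it, assuming each library's holdings can be reduced to a set of book identifiers. The function name `overlap_summary`, the library names, and the toy identifiers are all hypothetical, and the definition of the redundancy rate (the share of holdings beyond the first copy of each book) is an assumption, not a formula taken from the presentation.

```python
from collections import Counter


def overlap_summary(holdings):
    """Summarize overlap for a dict mapping library name -> set of book ids.

    Assumed definition: redundancy rate = 1 - unique books / total holdings.
    """
    total_holdings = sum(len(books) for books in holdings.values())
    # Count how many libraries hold each distinct book.
    holders = Counter(book for books in holdings.values() for book in books)
    unique_books = len(holders)
    held_by_one = sum(1 for n in holders.values() if n == 1)
    return {
        "total_holdings": total_holdings,
        "unique_books": unique_books,
        "redundancy_rate": 1 - unique_books / total_holdings,
        "share_held_by_one_library": held_by_one / unique_books,
    }


# Toy data, not real holdings.
toy = {
    "lib_a": {"b1", "b2", "b3"},
    "lib_b": {"b2", "b3", "b4"},
    "lib_c": {"b3", "b5"},
}
summary = overlap_summary(toy)
print(summary)
```

In the toy data, 8 holdings cover 5 distinct books, so the redundancy rate is 0.375, and 3 of the 5 books are held by a single library.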

7 Language distribution
Language      Google 5   System-wide
English       0.49       0.52
German        0.10       0.08
French        0.08       0.08
Spanish       0.05       0.06
Chinese       0.04       0.04
Russian       0.04       0.03
Italian       0.03       0.03
Japanese      0.02       0.04
Hebrew        0.02       0.01
Arabic        0.01       0.01
Portuguese    0.01       0.01
Polish        0.01       0.01
Dutch         0.01       0.01
Latin         0.01       0.01
Korean        0.01       0.01
Swedish       0.01       < 0.01
All others    0.07       0.08
More than 430 languages in Google 5 collection

8 Cumulative age distribution of G5 holdings
> 80 percent of Google 5 collection still in copyright

9 Works
- Coverage slightly higher (35%)
- Holdings overlap slightly greater (56% held uniquely)

10 Some speculation …
- What results would have been obtained if a different group of libraries had been selected?
- What incremental extensions to coverage can be obtained by adding additional library collections to original Google 5?
- Chose 5 new libraries:
  - Small US liberal arts college
  - Large US public university
  - Large US private university
  - Large US metropolitan library
  - Large Canadian university

11 Beyond the Google 5 …
                       New Google 5    Original Google 5
Total holdings:        ~8 million      ~18 million
Total unique books:    5.9 million     10.5 million
% of system-wide:      18 percent      33 percent
Redundant holdings:    26 percent      42 percent

Impact by library type: % of holdings unique relative to original G5 collection:
Large US metropolitan library:    39 percent (most unlike G5)
Large US private university:      25 percent
Large Canadian university:        23 percent
Large US public university:       21 percent
Small US liberal arts college:    13 percent (most like G5)
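The percentages on this slide can be reproduced from the reported totals. This is a minimal sketch under two assumptions that the slides do not state explicitly: coverage is taken as unique books divided by the ~32 million system-wide print books from slide 3, and redundant holdings are taken as the share of holdings beyond the first copy of each book. The function name `rates` is hypothetical, and the holdings totals are the approximate values given on the slide.

```python
# All figures in millions, as reported on the slides.
SYSTEM_WIDE_BOOKS = 32.0  # ~32 million print books (slide 3)

collections = {
    "Original Google 5": {"holdings": 18.0, "unique_books": 10.5},
    "New Google 5": {"holdings": 8.0, "unique_books": 5.9},
}


def rates(holdings, unique_books, system_wide=SYSTEM_WIDE_BOOKS):
    """Return (coverage %, redundancy %) rounded to whole percents.

    Both formulas are assumptions chosen to match the slide's figures.
    """
    coverage = unique_books / system_wide
    redundancy = 1 - unique_books / holdings
    return round(100 * coverage), round(100 * redundancy)


results = {name: rates(**figures) for name, figures in collections.items()}
for name, (coverage_pct, redundancy_pct) in results.items():
    print(f"{name}: {coverage_pct}% of system-wide, {redundancy_pct}% redundant")
```

Under these assumptions the computed values agree with the slide: 33 and 42 percent for the original Google 5, and 18 and 26 percent for the new Google 5.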

12 The Google 10
Original Google 5 (10.5 million books)
+ 1.8 million (17%)
Google 10 collection: 12.3 million books
Diminishing returns?
Original G5: ~18 million holdings, 58% unique
New G5: ~8 million holdings, 22% unique
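The diminishing-returns question can be explored by adding collections one at a time and recording how many previously uncovered books each one contributes. The sketch below assumes holdings are available as sets of book identifiers; the function name `incremental_coverage`, the library names, and the toy data are hypothetical and are not drawn from the WorldCat analysis.

```python
def incremental_coverage(base, additions):
    """Add collections to a base set in order and record each one's gain.

    base: set of book ids already covered.
    additions: list of (library name, set of book ids) pairs.
    Returns the final covered set and a list of
    (name, new books added, share of that library's holdings that were new).
    """
    covered = set(base)
    gains = []
    for name, books in additions:
        new_books = books - covered  # books not yet in the aggregate
        gains.append((name, len(new_books), len(new_books) / len(books)))
        covered |= new_books
    return covered, gains


# Toy data, not real holdings.
base = {1, 2, 3, 4, 5, 6}
additions = [
    ("lib_x", {5, 6, 7, 8}),
    ("lib_y", {6, 7, 8, 9}),
]
covered, gains = incremental_coverage(base, additions)
for name, added, share in gains:
    print(f"{name}: +{added} books ({share:.0%} of its holdings were new)")
```

In the toy data the first added library contributes 2 new books and the second only 1, which mirrors the pattern the slide describes: each further collection tends to add less because more of its holdings are already covered. The result depends on the order in which collections are added.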

13 Anatomy of aggregate collections
- Mass digitization programs and other aggregate collections increasingly common features of library landscape
- Effective decision-making/planning aided by convergence on set of standard questions that help map out anatomy of aggregate collections
- Example: mass digitization programs
  - What are characteristics of overarching population of materials that is target of digitization effort?
  - How much of population will digitization effort cover?
  - What is potential degree of redundancy?
  - What bibliographic unit is focus of digitization (e.g., manifestations, expressions, works)?
  - What number of participants and combination of institution types is optimal for obtaining maximum benefit with minimum cost?

14 Aggregate collections and WorldCat
- WorldCat more than tool for cataloging and reference; also strategic resource for managing aggregate collections
- OCLC Group Services: http://www.oclc.org/groupservices/
- OCLC WorldCat Collection Analysis Service: http://www.oclc.org/collectionanalysis/
- OCLC Research data-mining activities, Web site: http://www.oclc.org/research/projects/mining/

