OCLC Online Computer Library Center Data Mining Library Collection Silos: Print Books and E-books in Library Collections Lynn Silipigni Connaway Ed O’Neill.


1 OCLC Online Computer Library Center Data Mining Library Collection Silos: Print Books and E-books in Library Collections Lynn Silipigni Connaway Ed O’Neill Chandra Prabha Brian Lavoie

2 Collection Assessment Why assess collections? –Provide data for member libraries for decision-making Description of the collection –Identify specific subject areas »Determine collection age »Rate of growth »Strengths and weaknesses Overlap/gap analysis Identify last copy Useful information –Outside funding –Library collection comparisons –Remote storage decisions –Collection development and management –Identify role of non-ARL libraries

3 WorldCat as a Collection World’s largest bibliographic database –July 1, 2003 = 50 million+ records –1 billion holdings Ideal source for data-mining Characteristics of WorldCat –Age –Subject, using NATC –Holdings by type of library: ARL; academic, non-ARL; public; school; special

4 WorldCat as a Collection Use of MARC data elements in WorldCat –Types of materials –Library holdings to determine audience levels Collection assessment and collection use –Unique titles –Analyze and compare aggregate holdings for libraries –Identify print books (p-books) and electronic books (e-books)

5 WorldCat Holdings by Library Types

6 WorldCat Number of Holdings

7 WorldCat Number of Records

8 WorldCat Holdings


10 Study Objective Digital materials constitute increasing proportion of library collections Effective strategies for integrating print and digital materials within a library collection –Eliminate redundancies –Meet user expectations Data-mining increasingly important to support collection management decisions –WorldCat World’s largest bibliographic database Ideal as source for data-mining Data-mine WorldCat in order to examine characteristics of p-books and e-books

11 Rationale Collection management –Development –Cooperation –Deselection –Preservation Space allocation and management Meet user expectations Services for off-site users Migration from print to digital Convenient access –24/7 access –Desktop delivery

12 Scope WorldCat –July 1, 2003 = 50 million+ records –1 billion holdings Digital Items Books –Print (p-book) –Digital (e-book)

13 Strategy Identify digital items Identify digital items with at least one other manifestation in WorldCat –FRBRize database Work –Distinct intellectual or artistic expression –Cluster works in WorldCat Manifestation –Physical embodiment of a work Identify digital items with p-book equivalents –Assumption If digital items have p-book equivalents, then digital items are e-books –Identify publishers and publication dates
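The strategy on this slide can be sketched roughly as follows. This is a minimal illustration, not the clustering actually used on WorldCat: the `work_key` function (normalized author plus title) is a simple stand-in for FRBRization, the record keys `author`, `title`, and `kind` are hypothetical, and the sample records are invented for the example.

```python
import re
from collections import defaultdict


def work_key(author: str, title: str) -> tuple:
    """Crude stand-in for a FRBR work identifier: normalized author + title."""
    def norm(text: str) -> str:
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (norm(author), norm(title))


def ebooks_with_print_equivalents(records: list) -> list:
    """Return digital items whose work cluster also contains a p-book.

    Mirrors the slide's assumption: a digital item that shares a work
    with a p-book is treated as an e-book.
    """
    clusters = defaultdict(list)
    for rec in records:
        clusters[work_key(rec["author"], rec["title"])].append(rec)

    ebooks = []
    for members in clusters.values():
        if any(r["kind"] == "p-book" for r in members):
            ebooks.extend(r for r in members if r["kind"] == "digital")
    return ebooks


# Invented sample records, for illustration only.
sample = [
    {"id": 1, "author": "Melville, Herman", "title": "Moby-Dick", "kind": "p-book"},
    {"id": 2, "author": "Melville, Herman", "title": "Moby Dick", "kind": "digital"},
    {"id": 3, "author": "Example Agency", "title": "Agency home page", "kind": "digital"},
]
matched_ids = [r["id"] for r in ebooks_with_print_equivalents(sample)]
```

Record 2 shares a work cluster with a p-book and is kept; record 3 has no print manifestation and is dropped.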

14 Need to Determine Comparison of p-books and e-books –What is a book? –What is a p-book? –What is an e-book? –What is a digital item? –How do we extend p-book criteria to digital world?

15 What is a Digital Item? Working definition of digital item –Computer file –OR Electronic resource –OR Appropriate 856 field Indicates electronic location or access
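The three-way OR in this working definition can be written as a small predicate. The sketch below assumes a simplified record represented as a dictionary with hypothetical keys (`record_type`, `form_of_item`, `fields`); the code values `"m"` and `"s"` are assumed to correspond to the MARC codings for computer file and electronic form. The slide does not say what makes an 856 field "appropriate", so this sketch treats any non-empty 856 as qualifying.

```python
def is_digital_item(record: dict) -> bool:
    """Apply the slide's working definition: any one criterion is enough."""
    # Assumption: record_type "m" marks a computer file.
    is_computer_file = record.get("record_type") == "m"
    # Assumption: form_of_item "s" marks an electronic resource.
    is_electronic_resource = record.get("form_of_item") == "s"
    # Assumption: any non-empty 856 (electronic location/access) counts.
    has_856 = bool(record.get("fields", {}).get("856"))
    return is_computer_file or is_electronic_resource or has_856


# Invented examples, for illustration only.
print_book = {"record_type": "a", "form_of_item": " ", "fields": {}}
linked_text = {"record_type": "a", "form_of_item": " ",
               "fields": {"856": ["http://example.org/item"]}}
software = {"record_type": "m", "form_of_item": " ", "fields": {}}
```

Because the criteria are joined by OR, a text record that merely carries an 856 link counts as a digital item here, which is one reason the later slides report difficulty in categorizing such records.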

16 What is a P-book? No consensus for definition of a book –Text (type = a) and monograph (bib level = m) Broadsides? Pamphlets? Government documents? Children’s books? Microforms? –Authoritative Definitions UNESCO –Nonperiodical literary publication consisting of > 49 pages, covers excluded ANSI –Publications consisting of > 49 pages –Hard covers US Postal Service (publication) –Publications > 24 pages

17 A P-book IS: Based on UNESCO definition Working definition of a p-book –Printed on paper (excludes microform) –Language material –Monograph –Physical description –Form of item = regular or large print –Title does not include a GMD –Substantial length (> 49 pages; > 25 to include juvenile titles) –Excludes manuscripts (dissertations and theses)
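The working definition above is a conjunction of tests, which can be sketched as a single predicate. The `BookRecord` class and its field names are hypothetical simplifications of a catalog record, and the form-of-item codes are assumed values. The slide's "> 25 to include juvenile titles" is ambiguous; this sketch applies the lower threshold only to juvenile titles, which is one possible reading.

```python
from dataclasses import dataclass

# Assumed codes: " " for regular print, "d" for large print.
# Microform codes are deliberately absent, so microforms are excluded.
PRINT_FORMS = {" ", "d"}


@dataclass
class BookRecord:
    record_type: str = "a"      # "a" = text (language material), per the slides
    bib_level: str = "m"        # "m" = monograph, per the slides
    form_of_item: str = " "
    has_physical_description: bool = True
    title_has_gmd: bool = False  # GMD = general material designation
    pages: int = 0
    is_manuscript: bool = False  # dissertations and theses
    is_juvenile: bool = False


def is_p_book(rec: BookRecord) -> bool:
    """Return True only if every criterion in the working definition holds."""
    min_pages = 25 if rec.is_juvenile else 49
    return (
        rec.record_type == "a"
        and rec.bib_level == "m"
        and rec.form_of_item in PRINT_FORMS
        and rec.has_physical_description
        and not rec.title_has_gmd
        and not rec.is_manuscript
        and rec.pages > min_pages
    )
```

For example, `is_p_book(BookRecord(pages=320))` holds, while a 30-page pamphlet fails the length test unless it is flagged as juvenile.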

18 What is an E-book? Difficult to define e-book –Digital version of p-book (straightforward) –New conceptual views of a book in digital environment Assumption –P-book is well-defined –If digital item has manifestation as a p-book, then digital item must also be a book –If p-book has digital equivalent or vice versa, ignore e-books that have no print equivalents

19 An E-book IS: E-Book = Electronic (Digital) + Book Definition of e-Book: –Digital equivalents of p-books –New conceptual definitions of books in digital environment

20 WorldCat Record Analysis P-book records = 24,048,235 (48% of WC) Digital item records = 795,630 (about 1.6% of WC) –Web sites Collections of interlinked, Web-accessible materials residing at a single location on the Internet –Documents Various forms of electronic documents E-books with no p-book equivalents and no minimum page requirements –Book chapters –Broadsides –Brochures –Pamphlets –Reprints E-books with p-book equivalents = 76,375 (about 0.15% of WC)

21 WorldCat Record Analysis Digital item records (continued) –Interactive learning objects Computer programs offering self-contained, interactive tutorial or educational experience –Software Computer programs for creating and manipulating information –Serials Journals Proceedings –Images –Theses –Other (2 records) Computer game Raw data file
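As a rough cross-check of the record counts reported for slide 20, each count can be expressed as a share of WorldCat. The sketch assumes a base of exactly 50,000,000 records; the slides say only "50 million+", so the computed shares are approximate and slightly overstated.

```python
# Assumption: the slides give "50 million+" records; 50,000,000 is used as the base.
WORLDCAT_RECORDS = 50_000_000

# Record counts as reported on slide 20.
COUNTS = {
    "p-book records": 24_048_235,
    "digital item records": 795_630,
    "e-books with p-book equivalents": 76_375,
}


def share_of(count: int, total: int = WORLDCAT_RECORDS) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


for label, count in COUNTS.items():
    print(f"{label}: {count:,} records, {share_of(count):.2f}% of assumed base")

# Share of digital items that have a p-book equivalent.
ebook_share_of_digital = share_of(
    COUNTS["e-books with p-book equivalents"], COUNTS["digital item records"]
)
```

On this base, e-books with p-book equivalents make up a little under a tenth of all digital item records.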

22 Digital Items in WorldCat

23 Publication Dates of Digital Items With P-Book Equivalents in WorldCat

24 Publishers of Digital Items With P-Book Equivalents in WorldCat Approximately 15,000 unique publishers Approximately 150 publishers with > 25 records Top 10 publishers –Institute of Electrical and Electronics Engineers (IEEE) –National Bureau of Economic Research –US Government Printing Office –Springer –Inter-University Consortium for Political and Social Research –PowerKids Press –University of Virginia Library –MIT Press –Microsoft –Broderbund Software and Books
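Figures of this kind (unique publishers, publishers above a record threshold, top 10) can be derived from a list of publisher names with a frequency count. The sketch below is illustrative only: the function name and parameters are hypothetical, the sample data is invented, and it assumes publisher names are already normalized, which in real catalog data would need separate work.

```python
from collections import Counter


def publisher_summary(publishers, min_records=25, top_n=10):
    """Summarize publisher frequencies for a list of publisher names.

    Returns the number of unique publishers, the number with more than
    min_records records, and the top_n publishers by record count.
    """
    counts = Counter(name.strip() for name in publishers)
    over_threshold = [name for name, n in counts.items() if n > min_records]
    return {
        "unique": len(counts),
        "over_threshold": len(over_threshold),
        "top": counts.most_common(top_n),
    }


# Invented sample: one publisher name per record.
sample = ["Press A"] * 30 + ["Press B"] * 26 + ["Press C"] * 25 + ["Press D"]
summary = publisher_summary(sample)
```

The threshold test uses a strict "more than", matching the slide's "> 25 records", so a publisher with exactly 25 records is not counted.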

25 Discussion of Analysis Small number of –E-books with p-book equivalents –Publishers with > 25 records for e-books with p-book equivalents Recent publication dates for e-books with p-book equivalents More Web sites than documents or reprints Difficult to identify and categorize digital items –Inconsistent cataloging policies and practices for digital items –Inconsistent definitions for types of digital items

26 Future Research Establish accepted criteria for defining an e-book independent of p-books Identify and compare type of library holdings and NATC subjects for p-books and e-books –Identify electronic collection silos Continue to collect these data to compare for trends Identify types of content/materials that are better suited for either print or digital environment

27 OCLC Online Computer Library Center Questions and Discussion connawal@oclc.org oneill@oclc.org

