OCoLR 20041025 #53928015 OCLCR Making data work harder Lorcan Dempsey OCLC OVGTSL 2005 Conference Newark, May 11-13.

Presentation transcript:

Making data work harder
Lorcan Dempsey, OCLC
OVGTSL 2005 Conference, Newark, May 11-13

Overview
- Some context
- Looking at data in action
  - OpenWorldCat
  - FRBR
  - Data mining

Context: value
- Amazoogle: what should we be doing that fits into a world they occupy? Where do we provide unique value?
- ROI: libraries invest in data but do not extract as much value from it as they might. Unless we release more value, the argument for this investment becomes weaker.
- User: how do we co-create value with users? What opportunities are there for mixing catalog data and user-contributed data?
- Management intelligence: how do we use data better to inform management decisions?

Context: consequences
- The role of the catalog?
- The role of structured data?
- The role of the library?

Data
- Open WorldCat
- FRBR
- WorldCat Wiki
- Management intelligence

FRBR
- 'Interim FRBR' in OWC
- FRBR in research projects
  - FictionFinder
  - Curiouser
  - xISBN
  - Algorithm
  - Top 1000
- FRBR in FirstSearch (late this year)

Top Sets for Fiction (Records)

Records  Key
1,296    defoe, daniel\ /robinson crusoe
1,267    carroll, lewis\ /alices adventures in wonderland
  971    cervantes saavedra, miguel de\ /don quixote
  828    stevenson, robert louis\ /treasure island
  689    twain, mark\ /adventures of huckleberry finn
  624    twain, mark\ /adventures of tom sawyer
  618    swift, jonathan\ /gullivers travels

Top Sets for Fiction (Holdings)

Holdings  Key
29,043    twain, mark\ /adventures of huckleberry finn
26,088    carroll, lewis\ /alices adventures in wonderland
20,843    twain, mark\ /adventures of tom sawyer
19,410    defoe, daniel\ /robinson crusoe
18,566    cervantes saavedra, miguel de\ /don quixote
18,492    stevenson, robert louis\ /treasure island
18,123    dickens, charles\ /christmas carol
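The two rankings above differ because one counts bibliographic records per work set and the other sums library holdings per work set. A minimal sketch of that distinction follows, assuming a simplified author/title key. The `work_key` normalization, the function names, and the sample numbers are illustrative assumptions, not OCLC's actual FRBR work-set algorithm or WorldCat data.

```python
import re
from collections import defaultdict


def work_key(author: str, title: str) -> str:
    """Build a crude author/title key (hypothetical normalization)."""
    def norm(text: str) -> str:
        text = text.lower()
        text = re.sub(r"[^a-z0-9, ]", "", text)  # drop punctuation such as apostrophes
        return re.sub(r"\s+", " ", text).strip()
    return f"{norm(author)}/{norm(title)}"


def rank_work_sets(records):
    """records: iterable of (author, title, holdings_count) tuples.

    Returns two rankings of work sets: by number of records and by
    total holdings, each sorted from largest to smallest.
    """
    record_counts = defaultdict(int)
    holding_counts = defaultdict(int)
    for author, title, holdings in records:
        key = work_key(author, title)
        record_counts[key] += 1
        holding_counts[key] += holdings
    by_records = sorted(record_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    by_holdings = sorted(holding_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return by_records, by_holdings


# Made-up sample records: many editions with few holdings each can outrank
# a work with fewer editions by record count, but not by holdings.
sample = [
    ("Twain, Mark", "Adventures of Huckleberry Finn", 120),
    ("Twain, Mark", "Adventures of Huckleberry Finn.", 80),
    ("Carroll, Lewis", "Alice's Adventures in Wonderland", 150),
    ("Defoe, Daniel", "Robinson Crusoe", 40),
    ("Defoe, Daniel", "Robinson Crusoe", 30),
    ("Defoe, Daniel", "ROBINSON CRUSOE", 20),
]
by_records, by_holdings = rank_work_sets(sample)
```

In this toy data the Defoe work set leads by record count while the Twain work set leads by holdings, mirroring the pattern in the two slides.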

Taking FRBR onto the open web
- Curio(u)ser

MetaWiki
- WIKI: web pages
- metaWIKI: data
- Capture user input in structured ways

Extending Wiki's utility

Wiki:
- supported markup: wikitext
- page editing: a single text block
- searches: full text searching
- collections managed: one per wiki

MetaWiki:
- supported markup: wikitext, structured data (e.g., MARC, METS, DC…)
- page editing: a single text block, or field level
- searches: full text searching, fielded searching
- collections managed: one/multiple per OaiWiki
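The difference between full text searching and fielded searching can be illustrated with a small sketch. The data structures, function names, and Dublin Core style field names below are assumptions made for illustration. They do not describe how MetaWiki or OaiWiki is actually implemented.

```python
def full_text_search(pages, term):
    """Return titles of wiki pages whose text contains the term anywhere."""
    term = term.lower()
    return [title for title, text in pages.items() if term in text.lower()]


def fielded_search(records, field, term):
    """Return ids of structured records whose named field contains the term."""
    term = term.lower()
    return [rid for rid, fields in records.items()
            if term in fields.get(field, "").lower()]


# Hypothetical wiki pages: each page is a single block of text.
pages = {
    "Page1": "Mark Twain wrote about Tom Sawyer",
    "Page2": "A review mentioning Twain",
}

# Hypothetical structured records with named fields.
records = {
    "r1": {"creator": "Twain, Mark", "title": "Adventures of Tom Sawyer"},
    "r2": {"creator": "Smith, Jane", "title": "Reading Twain"},
}

text_hits = full_text_search(pages, "twain")            # matches both pages
creator_hits = fielded_search(records, "creator", "twain")  # matches only r1
```

A full text search cannot tell a work by Twain from a work about Twain. A fielded search can, because user input was captured in a structured way.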

Lorcan: note that this is a work in progress

Management intelligence
- So we have all this data: what can it tell us?
- Several projects underway: only some discussed here

Making Data Work Harder
- Activities "shed" data:
  - Cataloging: bibliographic information
  - Web site traffic: transaction logs
  - Reference queries: search term lists
- Need to mine this data for intelligence that creates value for libraries and users
- OCLC Research is undertaking a number of data-mining projects aimed at:
  - Knowing more about the characteristics of library collections
  - Creating interesting and useful data displays
  - Generating intelligence to support library decision-making

Data mining
- OCLC has a new collection analysis service
- Some research projects looking at systemic questions are described here

Looking at Library Print Book Collections … Systematically
- OCLC/Ithaka collaboration: use WorldCat to characterize the "system-wide" print book collection, i.e., aggregate print book holdings in WorldCat
- 32 million print books, representing 26 million distinct works
- Half of print books published after 1977; more than 80% still "in copyright"
- Rareness is common! Only a third of print books have more than five holdings; half have two or fewer
- Only about 120,000 works had both print book and e-book manifestations
- Intelligence of this kind can help establish digitization priorities and inform preservation planning
- More information:

The Implications of GooglePrint …
- Potentially covers about one third of print books in WorldCat
- ~60 percent of "GooglePrint" books held by only one of the Google 5
- Less than 5 percent held by all of the Google 5
- ~20 percent of "GooglePrint" books out of copyright
- Paper forthcoming …

Know Your Audience!
- Holdings represent selection decisions by librarians …
  - implies there are about 1 billion individual selection decisions in the WorldCat holdings file
- Selections are made to serve the interests of a library's target community …
  - Associate target community (audience level) to particular library profiles, e.g., ARL, non-ARL academic, public, K-12 school …
- Implies: we can infer materials' audience level from holdings patterns, which in turn can support:
  - Collection management
  - Readers' advisory services
  - Reference services
  - Information retrieval
- Paper forthcoming!

"Last Copy": Identifying At-Risk Materials
- ~23 million WorldCat records have only a single holding attached
- Libraries need to know what portions of their collections are:
  - Rare …
  - Rare and valuable …
  - "Last copy" (artifact and/or content)
- Identification of rare materials is essential intelligence in support of storage, digitization, and preservation decision-making
- Data-mining study of Vanderbilt holdings in WorldCat:
  - Identified 23,000 items held uniquely by Vanderbilt
  - ~60% are print books
  - ~60% produced prior to 1950; ~25% produced after 1970
- Paper forthcoming!
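Finding single-holding records and the items held uniquely by one institution amounts to grouping holdings by record and counting distinct holders. The sketch below assumes holdings arrive as (record id, library symbol) pairs. The function name, the symbols, and the sample data are hypothetical and do not reflect the method used in the Vanderbilt study.

```python
from collections import defaultdict


def uniquely_held(holdings, library):
    """holdings: iterable of (record_id, library_symbol) pairs.

    Returns the set of records with exactly one distinct holding library,
    and the sorted list of those held only by the given library.
    """
    holders = defaultdict(set)
    for record_id, symbol in holdings:
        holders[record_id].add(symbol)  # a set ignores duplicate holding lines
    single = {rid for rid, libs in holders.items() if len(libs) == 1}
    unique_to_library = sorted(rid for rid in single if holders[rid] == {library})
    return single, unique_to_library


# Made-up holdings data with hypothetical library symbols.
sample_holdings = [
    ("rec1", "LIB_A"), ("rec1", "LIB_B"),   # held by two libraries
    ("rec2", "LIB_A"),                      # held only by LIB_A
    ("rec3", "LIB_C"),                      # held only by LIB_C
    ("rec4", "LIB_A"), ("rec4", "LIB_A"),   # duplicate line, still one holder
]
single, unique_to_a = uniquely_held(sample_holdings, "LIB_A")
```

A single holding in the data does not by itself establish a true "last copy". As the slide notes, the list is a starting point for judging which items are rare, valuable, or at risk.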

Thank you!
OCLC Research:
Lorcan: