The Open Archives Initiative Protocol for Metadata Harvesting: Overview Jewel Ward Visiting Scholar, Keio University Lib-Sys Seminar, Keio University,


The Open Archives Initiative Protocol for Metadata Harvesting: Overview Jewel Ward Visiting Scholar, Keio University Lib-Sys Seminar, Keio University, Mita Campus 17 June 2003

2 Acknowledgements JCDL 2001/2002: OAI-PMH Introduction –Hussein Suleman (then at Virginia Tech) JCDL 2003: Introduction to the OAI-PMH –Timothy W. Cole (UIUC) –William H. Mischo (UIUC) –Thomas Habing (UIUC)

3 Acknowledgements JCDL 2003: Advanced Overview of Version 2.0 of the OAI-PMH –Michael L. Nelson (Old Dominion University) –Herbert Van de Sompel (LANL) –Simeon Warner (Cornell University) Digital Library Federation (DLF) Spring Forum 2003 "The OAI Static Repository: a file-based approach to exposing metadata via the OAI-PMH." Herbert Van de Sompel (LANL) This research was conducted by Patrick Hochstenbach (LANL), Henry Jerez (LANL) and Herbert Van de Sompel.

4 Outline
Briefly: Institutional Repositories
Background & Development
OAI-PMH Basics
New Developments
Further Information
Questions?

5 Institutional Repositories
Institutional repository: a digital collection that captures and preserves the intellectual output of a single university or a multi-university community.
It is a way to aggregate the research output of an organization in one location, as opposed to the current “scatter” method.

6 Institutional Repositories
arXiv is not an institutional repository (even though it is hosted at a university)
Current LANL institutional repository projects
–AISTI (the Alliance for Innovation in Scientific and Technical Information)
–Within LANL

7 Movement and Protocol
The Open Archives Movement
–Enhance public access to research output and scholarly materials
–Reaction to commercial publishers’ pricing of scholarly journals
The Open Archives Protocol for Metadata Harvesting
–The number of ePrint repositories and DLs was growing
–The ePrint/library community wanted interoperability among scholarly archives

8 OAI-PMH Technical Development
–Gopher, FTP
–Union Catalogs
–Z39.50
–Kahn-Wilensky Framework
–Dienst Protocol
–Harvest
–UPS
–OAI-PMH

9 Overview of the OAI-PMH
What is the OAI-PMH?
–The protocol defines an application-independent specification for the interoperability [of digital libraries] through metadata harvesting.
–The protocol is a building block that can facilitate or enable a variety of services and functions.
OAI versus OAI-PMH: the OAI is the initiative; the OAI-PMH is the protocol it publishes.

10 Overview of the OAI-PMH What the OAI-PMH is not –The protocol is not a search service –The protocol is not a database –The protocol is not OAIS –The protocol does not define a metadata specification –The protocol does not equal Dublin Core

11 Data & Service Providers
Data Providers (DPs), or repositories, are entities that hold resources and metadata and are willing to share the metadata with others via the well-defined OAI protocol.
Service Providers (SPs), or harvesters, are entities that harvest metadata from DPs in order to provide higher-level services to users (such as search and discovery).
The Data Provider acts as the server; the Service Provider acts as the client.

12 OAI-PMH Verb Set
Verb                 Function                                    Group
Identify             description of repository                   metadata about the repository
ListMetadataFormats  metadata formats supported by repository    metadata about the repository
ListSets             sets defined by repository                  metadata about the repository
ListIdentifiers      OAI unique ids contained in repository      harvesting verb
ListRecords          listing of N records                        harvesting verb
GetRecord            listing of a single record                  harvesting verb
Most verbs take arguments: dates, sets, ids, metadata formats, and a resumption token (for flow control).

13 OAI-PMH Metadata
Repositories are required to expose their metadata as the Dublin Core Metadata Element Set (DCMES).
Repositories are strongly encouraged to expose their metadata in more expressive formats as well.
Examples of other formats in use:
–MARC
–RFC-1807
–Open Language Archives Community Metadata Set
–Electronic Theses and Dissertations Metadata Set

14 Resource – Item – Record
[Diagram: a resource ("David"), its item, and the item's records]
–Resource: the object itself ("David")
–Item: all available metadata about "David"
–Records: Dublin Core metadata, MARC metadata, SPECTRUM metadata
item = identifier
record = identifier + metadata format + datestamp
Set membership is an item-level property

15 Unique Identifiers
Each item must have a unique identifier
Identifiers must follow the URI syntax
–OAI has its own format: oai:<namespace-identifier>:<local-identifier>, e.g. oai:etd.vt.edu:edt
–Other formats can also be used, e.g. http URLs or handles
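As an illustration of the oai: identifier layout, the following Python sketch splits an identifier into its namespace and local parts. The names `OaiIdentifier` and `parse_oai_identifier` and the identifier `oai:example.org:123` are hypothetical, and the sketch assumes the local part is everything after the second colon.

```python
from typing import NamedTuple


class OaiIdentifier(NamedTuple):
    namespace: str  # repository part, e.g. "etd.vt.edu"
    local_id: str   # repository-specific part; may itself contain ":" or "/"


def parse_oai_identifier(identifier: str) -> OaiIdentifier:
    """Split 'oai:<namespace>:<local id>' into its two variable parts."""
    scheme, sep, rest = identifier.partition(":")
    if scheme != "oai" or not sep:
        raise ValueError(f"not an oai: identifier: {identifier!r}")
    namespace, sep, local_id = rest.partition(":")
    if not sep or not namespace or not local_id:
        raise ValueError(f"expected oai:<namespace>:<local id>, got {identifier!r}")
    return OaiIdentifier(namespace, local_id)


# Hypothetical identifier, used only to show the call.
print(parse_oai_identifier("oai:example.org:123"))
```

Identifiers in other URI schemes (http, handle) would be rejected by this sketch; a harvester that accepts them should treat the identifier as an opaque string instead.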

16 Datestamps
Required to support incremental harvesting
Can be either YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ (must be expressed in UTC/GMT)
Different from dates within the metadata; this datestamp is used only for harvesting
The datestamp is the date the metadata record itself was created or last modified
–It is not the publication date
–It is not the creation date of the item
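The two datestamp granularities can be produced and read with Python's standard library. This is a minimal sketch; the function names `format_datestamp` and `parse_datestamp` are illustrative and are not part of the protocol.

```python
from datetime import datetime, timezone

DAY = "%Y-%m-%d"                # YYYY-MM-DD
SECONDS = "%Y-%m-%dT%H:%M:%SZ"  # YYYY-MM-DDThh:mm:ssZ


def format_datestamp(moment: datetime, granularity: str = DAY) -> str:
    """Render a timezone-aware datetime as a UTC datestamp."""
    return moment.astimezone(timezone.utc).strftime(granularity)


def parse_datestamp(text: str) -> datetime:
    """Accept either granularity and return a UTC datetime."""
    for fmt in (SECONDS, DAY):
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"not a valid datestamp: {text!r}")


now = datetime.now(timezone.utc)
print(format_datestamp(now), format_datestamp(now, SECONDS))
```

A harvester would typically store the datestamp of its last successful harvest and send it as the `from` argument of the next request.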

17 Sets
Optional; whether sets are defined depends on the local DP
Each set in a DP must provide a setSpec and a setName, and may provide a setDescription
May be hierarchical (using “:” as the separator in the setSpec) to allow harvesting of subcollections
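The hierarchical setSpec convention can be sketched in Python as follows. The helper names `set_ancestors` and `in_set` are hypothetical; the sketch assumes that a record placed in a child set also counts as a member of every parent set.

```python
from typing import Iterable, List


def set_ancestors(set_spec: str) -> List[str]:
    """Return the setSpec itself plus every parent, outermost first."""
    parts = set_spec.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]


def in_set(record_set_specs: Iterable[str], requested: str) -> bool:
    """True if any of the record's setSpecs equals or lies below `requested`."""
    return any(requested in set_ancestors(spec) for spec in record_set_specs)


print(set_ancestors("physics:acc-phys"))
print(in_set(["physics:acc-phys"], "physics"))
```

Comparing whole path segments, rather than using a plain string prefix test, avoids matching a set such as `phys` against `physics`.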

18 How the OAI-PMH Works
[Diagram: the harvester (OAI Service Provider) sends an HTTP request containing an OAI verb to the repository (Metadata Provider); the repository returns an HTTP response containing valid XML.]
OAI “verbs”: Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, GetRecord

19 baseURL+verb Examples – – – – efix=oai_dc – &metadataPrefix=oai_dc – fix=oai_dc
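The baseURL+verb pattern can be sketched in Python with the standard library. The base URL `http://repository.example.org/oai` and the identifier `oai:example.org:123` are placeholders, not the repositories used in the original slide, and `build_request` is an illustrative helper name.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the baseURL of a real repository.
BASE_URL = "http://repository.example.org/oai"


def build_request(base_url: str, verb: str, **arguments: str) -> str:
    """Append the verb and its arguments to the base URL as a query string."""
    params = {"verb": verb}
    for name, value in arguments.items():
        # "from" is a Python keyword, so callers pass from_ and it is renamed here.
        params[name.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


print(build_request(BASE_URL, "Identify"))
print(build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc", from_="2003-01-01"))
print(build_request(BASE_URL, "GetRecord", identifier="oai:example.org:123", metadataPrefix="oai_dc"))
```

`urlencode` percent-encodes reserved characters such as ":" in identifiers, which keeps the query string well formed.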

20 Example Response T20:13:50Z -..

21 Example Record
Header
–identifier: oai:arXiv.org:acc-phys/
–setSpec: physics:acc-phys
–setSpec: physics:physics
Metadata (Dublin Core)
–title: Symplectic Computation of Lyapunov Exponents
–creator: Habib, Salman
–creator: Ryne, Robert D.
–subject: Accelerator Physics
–description: .. Comment: 12 pages, uuencoded PostScript (figures included)
–type: text

22 Optional Container Elements
Repository level (set)
–Additional information about the repository: oai-identifier, eprints, friends, branding, other…
Metadata level
–Meta-metadata, i.e. record-level rights

23 Resumption Tokens, etc.
Resumption Tokens/Flow Control/Load Balancing
–A “resumptionToken” is used when a response is incomplete
–The client receives a partial response with a token, which it may present to the server later to receive more results

24 Resumption Tokens, etc.
Resumption Tokens/Flow Control/Load Balancing
–Options include: completeListSize, cursor, and expiration date attributes
–Combine from/until/metadataPrefix/set and a record number indicator with delimiters into a sequential token
  from!until!metadataPrefix!set!recordnumber
  ! !oai_dc!All!100
–Use a session manager with automatic expiry
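The delimiter-based token layout above can be sketched in Python from the repository's point of view. `HarvestState`, `encode_token` and `decode_token` are hypothetical names and the dates are made up; the sketch assumes that none of the fields contains the "!" delimiter.

```python
from typing import NamedTuple

DELIMITER = "!"


class HarvestState(NamedTuple):
    from_: str            # lower datestamp bound, "" if not given
    until: str            # upper datestamp bound, "" if not given
    metadata_prefix: str
    set_spec: str
    record_number: int    # position at which the next response should start


def encode_token(state: HarvestState) -> str:
    """Pack the request arguments and cursor into one sequential token."""
    return DELIMITER.join(
        [state.from_, state.until, state.metadata_prefix, state.set_spec, str(state.record_number)]
    )


def decode_token(token: str) -> HarvestState:
    """Unpack a token; a ValueError here would map to badResumptionToken."""
    fields = token.split(DELIMITER)
    if len(fields) != 5:
        raise ValueError(f"malformed resumption token: {token!r}")
    from_, until, prefix, set_spec, number = fields
    return HarvestState(from_, until, prefix, set_spec, int(number))


token = encode_token(HarvestState("2003-01-01", "2003-06-17", "oai_dc", "All", 100))
print(token, decode_token(token))
```

Because the token carries the whole request state, the repository needs no session storage; the session-manager option trades that simplicity for tokens that can expire automatically.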

25 Resumption Tokens, etc.
Resumption Tokens/Flow Control/Load Balancing
Idempotency
–The purpose is to allow harvesters to recover from lost responses or crashes without restarting a large harvest from scratch
–Recover by re-issuing the request using the resumptionToken from the previous request
–IMPLICATION: the repository must accept both the most recent resumptionToken issued and the previous one
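A harvester-side loop that relies on this idempotency might look like the Python sketch below. The `fetch` callable is an assumption: it stands for whatever function sends the request parameters over HTTP and returns the XML response as text. The retry count and the use of `OSError` to signal a lost response are also illustrative choices.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
Fetch = Callable[[Dict[str, str]], str]  # hypothetical: request parameters -> XML text


def fetch_with_retry(fetch: Fetch, params: Dict[str, str], retries: int) -> str:
    """Re-issue the same request after a lost response; safe because it is idempotent."""
    for attempt in range(retries + 1):
        try:
            return fetch(params)
        except OSError:
            if attempt == retries:
                raise
    raise ValueError("retries must not be negative")


def harvest_identifiers(fetch: Fetch, metadata_prefix: str, retries: int = 2) -> Iterator[str]:
    """Yield every identifier, following resumptionTokens until the list is complete."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch_with_retry(fetch, params, retries))
        for header in root.iter(OAI_NS + "header"):
            yield header.findtext(OAI_NS + "identifier")
        token = root.findtext(f".//{OAI_NS}resumptionToken")
        if not token:  # missing or empty element: the list is complete
            return
        # A follow-up request carries only the verb and the token.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}
```

A long-running harvester would also persist the last token it received, so that it can resume from that token after a crash.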

26 Error Handling All protocol errors are in XML format –badVerb: illegal verb requested –badArgument: illegal parameter values or combinations –badResumptionToken, cannotDisseminateFormat, idDoesNotExist: parameters are in right format but are not legal under current conditions –noRecordsMatch, noMetadataFormats, noSetHierarchy: empty response exception

27 Example Error Message T20:32:53Z Verb 'ListRecords', argument 'metadataPrefix' required but not supplied.
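Since the XML markup of the error message above did not survive, the Python sketch below uses a reconstructed sample for illustration only. The response date and request URL in `SAMPLE` are placeholders, the `badArgument` code is assumed from the message text, and the helper names `read_errors` and `is_empty_result` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
# Codes that signal an empty result rather than a faulty request.
EMPTY_RESULT_CODES = {"noRecordsMatch", "noMetadataFormats", "noSetHierarchy"}

# Illustrative response; date and URL are placeholders, not the slide's values.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-01-01T20:32:53Z</responseDate>
  <request>http://repository.example.org/oai</request>
  <error code="badArgument">Verb 'ListRecords', argument 'metadataPrefix' required but not supplied.</error>
</OAI-PMH>"""


def read_errors(response_xml: str) -> List[Tuple[str, str]]:
    """Return (code, message) pairs for every error element in a response."""
    root = ET.fromstring(response_xml)
    return [
        (error.get("code", ""), (error.text or "").strip())
        for error in root.findall(OAI_NS + "error")
    ]


def is_empty_result(errors: List[Tuple[str, str]]) -> bool:
    """True when the only errors reported are 'nothing to return' conditions."""
    return bool(errors) and all(code in EMPTY_RESULT_CODES for code, _ in errors)


for code, message in read_errors(SAMPLE):
    print(code, "-", message)
```

Separating empty-result codes from the others lets a harvester treat noRecordsMatch as a normal outcome of an incremental harvest instead of a failure.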

28 OAI-PMH Static Repository
Motivation
–The OAI-PMH is a low-barrier protocol
–The OAI-PMH is biased toward making participation easy for Data Providers
Bias has its origins in the Santa Fe Convention

29 OAI-PMH Static Repository
Motivation
Implementation is sometimes not trivial
–Lack of technical expertise
–Size of collection does not justify the investment
–Security considerations regarding the database
–ISP does not allow third-party software
–Cf. OLAC, union catalogue, LOCKSS

30 OAI-PMH Static Repository Motivation Therefore: research to devise approaches to further lower the barrier to sharing metadata collections through the OAI-PMH.

31 OAI-PMH Static Repository

32 Rights Effort
Exploring rights about:
–Resource
–Metadata
Framework based on the Creative Commons (CC)
Collaborative effort: JISC/OAI/CC (JISC is the “Joint Information Systems Committee”, which is involved with RoMEO.)

33 Further Information
Institutional Repositories: Partnering with Faculty to Enhance Scholarly Communication
SPARC Institutional Repository Checklist & Resource Guide

34 Further Information
Open Archives Initiative
OAI Metadata Harvesting Protocol
OAI-PMH Tools Index
Virginia Tech DLRL OAI Projects
Repository Explorer
ARC Cross-Archive Search Service

35 Further Information
OAI-PMH Static Repository
–Registration
–Example Repository etadataPrefix=oai_dc
–Specification repository.htm

36 Further Information
Creative Commons
JISC
DSpace
E-Prints
DL-in-a-box
Greenstone Digital Library

37 Further Information
NDLTD
XML Schema Validator
Dublin Core Metadata Initiative
XML Tools at W3C

38 Questions?