Open Archives Initiative OAI openarchives.org “Opening Remarks & Historical Overview” - ACM SIGIR’2001 Ed Fox (w. Lagoze & Suleman)


Open Archives Initiative OAI openarchives.org “Opening Remarks & Historical Overview” - ACM SIGIR’2001 Ed Fox (w. Lagoze & Suleman)

Acknowledgements
People: Dan Greenstein, Carl Lagoze, Clifford Lynch, Hussein Suleman, Herbert Van de Sompel, members of the OAI community
Funding organizations: Coalition for Networked Information, Digital Library Federation, National Science Foundation, CONACyT, DFG, Mellon, …

Open Archives: Communities, Interoperability and Services (Workshop - Sep. 13, New Orleans)
Session 1: Intro to OAI
Session 2: Technical Details
Session 3: Concurrent Group Discussions
–Applicability of OAI to distributed community building; community support needed to leverage OAI standards
–Evaluation of technical standards; current and future directions of standards and services (related to the OAI protocols)
–See details on next slide
Session 4: Presentations of Group Findings
Session 5: Moving Forward

Open Archives: Communities, Interoperability and Services (Workshop - Sep. 13, New Orleans)
Building Communities
–Support for different types of communities
–Developments aiding community building
–Selective harvesting (sets)
–Community building examples
–Social aspects of OAI-based community projects
Technical Services
–Protocol evaluation: experiences, efficiency, …
–Support for internationalization
–Services enabled by OAI
–Support for full-text retrieval
–Support for protocol adoption

Open Archives: Communities, Interoperability and Services (Workshop - Sep. 13, New Orleans)
Attendees from various institutions: Caltech; CMIS, Carlton, Australia; Dartmouth College; Emory University; Los Alamos Nat’l Lab; Louisiana State Univ.; Michigan State Univ.; NASA Center for Aerospace Information; U. of Illinois, U-C; U. of Oldenburg, GE; U. of Southampton; U. of Tennessee; US Dept. of Energy; Virginia Tech

Ex.: NDLTD Access Possibilities
Web search engines, library catalog clients, www.theses.org, www.openarchives.org, 3rd-party services (e.g., UMI)
Virginia Tech, National Library of Portugal, CBUC (Spain), OhioLINK, MIT, national projects: AU, GE, …

Open Archives Initiative (OAI)
High-energy physics (Ginsparg, 1991)
CSTR + WATERS = NCSTRL (Lagoze, 1994)
xxx + NCSTRL = CoRR collaboration (1998)
Universal Preprint Service protoproto, Oct , 1999, Santa Fe – led by LANL, CNI, DLF, Mellon --> OAi Santa Fe Convention (see Feb. D-Lib Magazine article)
Follow-on mtgs: Antonio, (ECDL)
Archives -> Open Archives
–Support unique archive identifiers
–Implement Open Archives metadata set (DC, using XML)
–Implement OA harvesting protocol (derived from Dienst protocol)
–Register the archive
Build tools, layer other services: linking, searching, …

OAi Philosophy Self-archiving = submission mechanism Long-term storage system = archive Open interface = harvesting mechanism Data provider + service provider Start with “gray literature” –e-prints/pre-prints, reports, dissertations, …

Repository of Digital Objects Repository Access Protocol handle Digital object terms and conditions

OAI – Repository Perspective Required: Protocol DO MDO

OAI – Black Box Perspective [Figure: open archives OA 1, OA 2, OA 4, OA 3, OA 5, OA 6, OA 7]

ETD Union Collection (OAI)

Open Archives (protoproto)
–ArXiv & Los Alamos National Lab
–CogPrints & U. Southampton
–NACA & NASA (reports)
–NCSTRL & Cornell U.
–NDLTD & Virginia Tech
–RePEc & U. Surrey
Total of around 200K records

Original Open Archives Members: American Physical Society; California Digital Library; Caltech; Coalition for Networked Info.; Cornell University; Harvard University; Library of Congress; Los Alamos Nat’l Lab; Mellon Foundation; NASA Langley Research Cntr; Old Dominion University; Stanford University; U. of Ghent; U. of Surrey; U. of Southampton; Vanderbilt University; Virginia Tech; Washington University

Open Archives Future
–EconWPA (U. Washington)
–e-biomed -> PubMed Central (NIH)
–PubScience (DOE)
–Clinical Medicine Netprints (+ other HighWire Press holdings)
–University ePub (California Digital Library)
–All public e-prints (MIT)
–Scholar’s Forum (Caltech)
–Int’l: CERN, Germany, India, Mexico, …
Goal: millions of books/articles/reports per year

Approaches to Open Archives Build By Discipline Build By Institution

Approaches to Open Archives Build By Discipline Build By Institution Author Category Interdisciplinary Year Language Query …

Mechanisms Sharing –Join federation, run software –Make metadata and archive available Aggregating –By discipline –By institution –By genre Automating –Workflow –Harvesting and providing services –Federated searching –Dynamic linking (e.g., with SFX (OpenURLs))

VT View of the Open Archives Initiative (OAI)
–Enable sharing of publication metadata and full-text by digital libraries
–Standardize low-level mechanisms to share contents of libraries
–Build higher-level user-centric and administrative services in meta-libraries
–Install organizational mechanisms to support the technical processes

Virginia Tech Projects MARC XML-DTD Computer Science Teaching Centre (CSTC) W3C Web Characterization Repository OAI Repository Explorer Networked Digital Library of Theses and Dissertations (NDLTD)

MARC XML-DTD XML Transport format for US-MARC records Standardized metadata exchange format for traditional library services joining OAI

OAI Repository Explorer
–Serves as a compliance test
–Allows browsing of open archives using only the OAI protocol
–Sends requests on behalf of the user, parses and checks the responses, and displays a browsable interface
–Will detect most discrepancies in protocol

Request, Response – OAI, VT ETDs

Motivation Existence of some established but independent archives Need for cross-archive services (like search engines) Lack of low-cost interoperability technology Experience from past projects such as Dienst

Agenda
Goal: to produce communities of OAI implementers and supporters
Process:
–History and context of the OAI
–Definitions and concepts of the technology
–Protocol details
–Working with the OAI community: tools, mailing lists, projects
–Future plans

Digital Library Interoperability Paepcke, A., C.-C. Chang, et al. (1998). "Interoperability for Digital Libraries Worldwide." Communications of the ACM 41(4):

A Short History of Interoperability
Naming: URNs, Handles, DOIs
Metadata: Dublin Core, IMS, MARC
Search and Discovery: Z39.50, Harvest, Dienst, STARTS, SDLIP
Object Models: Kahn/Wilensky, FEDORA, Buckets
Encoding: SGML, HTML, XML, RDF

Interoperability Trade-offs [Figure: functionality versus cost; plotted items: HTTP, Google, Z39.50, SGML, Dublin Core, OAI]

OAI's Location in a Broader Interoperability Fabric Data Structuring (XML, XML Schema) Data Semantics (Dublin Core, other metadata) Object Access Exchange of Structured Information

Yes, it’s about resource discovery over distributed collections
metadata: Author, Title, Abstract, Identifier

Beyond resource discovery to distributed custodianship
Traditional portal (e.g., Yahoo!)
–linkage with limited responsibility
Hybrid portal
–Goal: assertion of (some semblance of) a curatorial role over linked objects
–Mechanism: sharing structured information (metadata) amongst distributed content providers

Broadening the Goals of Interoperability
“The Library should selectively adopt the portal model for targeted program areas. By creating links from the Library’s Web site, this approach would make available the ever-increasing body of research materials distributed across the Internet. The Library would be responsible for carefully selecting and arranging for access to licensed commercial resources for its users, but it would not house local copies of materials or assume responsibility for long-term preservation.”
LC21: Digital Strategy for the Library of Congress, page 5

Facilitating/Monitoring Longevity of Distributed Content Preservation Service

Personalization of Content
Digital object: RealAudio video, PowerPoint presentation, SMIL synchronization metadata, structural metadata
Portal A (View A): view slides; view video; view synchronized presentation using applet
Portal B (View B): get transcript of audio; search for keyword; get slides translated to French
Tool Repository

Cross-Repository Reference Linking [Figure: citation metadata from several repositories feeding a Linkage Service]

Origins of the OAI
Increasing interest in alternative scholarly publishing solutions – e.g., LANL arXiv
Increasing impact through federation
UPS Mtg., Santa Fe, October 1999
–Representatives of various E-Print, library, and publishing communities
–Goal: definition of an interoperability framework among E-Print providers
–Result: Santa Fe Convention, interoperability through metadata harvesting

“Open” Archives Political Agenda? –Author self-archiving of E-Prints –“Mission” to reformulate scholarly publishing framework Technical? –Infrastructure to facilitate interoperability across multiple domains

Other Communities of Interest “Cambridge” Digital Library Federation meetings –research library community has many materials for which they’d like to ‘expose’ metadata OAI workshops –librarians, publishers (some), researchers, others Museum Community –Museums on the Web and CIMI

Technical Umbrella for Practical Interoperability… Reference Libraries Publishers E-Print Archives …that can be exploited by different communities Museums

OAI Organizational Structure: Key Features
Clear focus and scope
–Developing and refining technical specification
–Community building and evangelism limited to serving that goal and to encouraging widespread adoption
Encouraging specialization and community-specific activities
Division of responsibility
–Executive (Van de Sompel and Lagoze)
–Steering Committee
–Technical Committee
–Mailing Lists (community)

OAI Technical Infrastructure Key Technical Features Deploy now technology – 80/20 rule Two-party model – providers (data providers) and consumers (service providers) Simple HTTP encoding XML schema for some degree of protocol conformance Extensibility –Multiple item-level metadata –Collection level metadata

The World According to OAI [Figure: Service Providers (Discovery, Current Awareness, Preservation) connected to Data Providers by metadata harvesting]

What is the OAI-MHP?
What is the Metadata Harvesting Protocol?
–Protocol to transfer metadata from a source archive to a destination archive
Any metadata
In a continuous stream
As simply as possible

Key Features of the OAI Metadata Harvesting Protocol
definitions & concepts
–repository
–record
–identifier
–datestamp
–set
protocol features
–HTTP encoding
–metadata prefix & schema
–flow control
protocol requests
–supporting requests
–harvesting requests

repository [Figure: harvester and repository connected by the OAI protocol; labels: support data, harvesting data, items]

record oai:eg: My Example No restrictions protocol support format-specific metadata community-specific record data

identifiers
oai-identifier = oai:archive-identifier:record-identifier
–oai: registered URI scheme
–archive-identifier: registered within OAI
–record-identifier: unique ID within archive (syntax is archive-specific)
example = oai:ncstrl:ncstrl.cornellcs/TR
locally unique key for extracting a record from a repository
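The three-part identifier layout described on this slide can be illustrated with a small parser. This is a minimal sketch, not part of the original slides: the function name `parse_oai_identifier` is hypothetical, and treating everything after the second colon as the record identifier is an assumption that follows from the slide's remark that the record-identifier syntax is archive-specific.

```python
def parse_oai_identifier(identifier: str) -> tuple[str, str, str]:
    """Split 'oai:archive-identifier:record-identifier' into its three parts.

    Hypothetical helper: the record identifier keeps any further colons,
    because its syntax is left to the archive.
    """
    parts = identifier.split(":", 2)
    if len(parts) != 3 or parts[0] != "oai" or not parts[1] or not parts[2]:
        raise ValueError(f"not an oai-identifier: {identifier!r}")
    scheme, archive_id, record_id = parts
    return scheme, archive_id, record_id


# Identifier taken from the GetRecord slide later in the talk.
print(parse_oai_identifier("oai:mlib:123a"))
```

Splitting at most twice keeps archive-specific record identifiers intact even when they themselves contain colons or slashes.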

selective harvesting - datestamps [Figure: harvester retrieves from a repository only the records within a date range]

selective harvesting - sets [Figure: harvester retrieves from a repository only the records within a set; sets S1 and S2 shown]

set specifics
–repositories define hierarchical organization
–each item in a repository may be organized in one set, several sets, or no sets at all
–meaning of sets or of set hierarchy is not defined in protocol
–individual communities may formulate common set configurations
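Selective harvesting by datestamp and by set, as outlined on the preceding slides, can be sketched as a filter over an in-memory record list. Everything here is illustrative: the `Record` class, the `oai:eg:N` identifiers, and the dates are made up, the date bounds are assumed to be inclusive, and the colon-separated set hierarchy (`S1:physics` under `S1`) is an assumption about how a repository might spell its set specifications.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    identifier: str
    datestamp: str  # "YYYY-MM-DD", so plain string comparison orders dates
    sets: list[str] = field(default_factory=list)  # zero, one, or several sets


def in_set(record: Record, set_spec: str) -> bool:
    """True if the record is in set_spec or in a descendant of it
    (assumes ':' separates levels of the set hierarchy)."""
    return any(s == set_spec or s.startswith(set_spec + ":") for s in record.sets)


def select(records, from_=None, until=None, set_spec=None):
    """Return identifiers of records matching the optional date range and set."""
    selected = []
    for r in records:
        if from_ is not None and r.datestamp < from_:
            continue
        if until is not None and r.datestamp > until:
            continue
        if set_spec is not None and not in_set(r, set_spec):
            continue
        selected.append(r.identifier)
    return selected


records = [
    Record("oai:eg:1", "2001-01-15", ["S1"]),
    Record("oai:eg:2", "2001-03-02", ["S1:physics", "S2"]),
    Record("oai:eg:3", "2001-06-20"),  # belongs to no set at all
]
print(select(records, from_="2001-02-01", until="2001-12-31"))
print(select(records, set_spec="S1"))
```

The third record shows the "no sets at all" case: it is still reachable by date range but is never returned by a set-restricted harvest.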

HTTP encoding - requests
BASE-URL: an.oa.org/OAI-script
keyword arguments: verb=ListIdentifiers&set=S1
Requests may be sent with GET or POST. POST example:
POST HTTP/1.0
Content-Length: 78
Content-Type: application/x-www-form-urlencoded
verb=ListIdentifiers&set=S1
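The request encoding on this slide (a base URL plus keyword arguments, sent by GET or by POST as a form-encoded body) can be sketched with Python's standard library. This is an illustration under stated assumptions: the `http://` scheme in `BASE_URL` is assumed because the slide transcript shows only the host and path, the helper names are hypothetical, and nothing is actually sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host and path are from the slide; the "http://" scheme is an assumption.
BASE_URL = "http://an.oa.org/OAI-script"


def build_get_url(base_url: str, **args: str) -> str:
    """GET form: keyword arguments appended to the base URL as a query string."""
    return f"{base_url}?{urlencode(args)}"


def build_post_request(base_url: str, **args: str) -> Request:
    """POST form: the same keyword arguments carried in a form-encoded body."""
    body = urlencode(args).encode("ascii")
    return Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


url = build_get_url(BASE_URL, verb="ListIdentifiers", set="S1")
post = build_post_request(BASE_URL, verb="ListIdentifiers", set="S1")
print(url)
print(post.get_method(), post.data)
```

Both forms carry the identical argument string; only its position (query string versus message body) differs.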

HTTP encoding - responses [Figure: annotated XML response with the parts labelled response header, xml namespaces, response data, and record contents; surviving fragments: T19:30:30-04:00, &identifier=oai%3AarXiv%3A0001, &metadataPrefix=oai_dc]
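The fragments that survive from this slide show request arguments in percent-encoded form (`%3A` for a colon). As a small sketch of how a harvester might decode such an argument string, the following uses `urllib.parse.parse_qs`. The `verb=GetRecord` part is an assumption added for completeness (the slide fragment shows only the `identifier` and `metadataPrefix` arguments), and the identifier is reproduced exactly as it appears in the transcript.

```python
from urllib.parse import parse_qs

# 'verb=GetRecord' is assumed; the other two arguments are from the slide.
query = "verb=GetRecord&identifier=oai%3AarXiv%3A0001&metadataPrefix=oai_dc"

# parse_qs returns a list per key; each argument appears once here.
args = {key: values[0] for key, values in parse_qs(query).items()}
print(args["identifier"])
print(args["metadataPrefix"])
```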

metadata prefix and schema
support for harvesting multiple metadata formats
–metadata schema: each format must have a validating XML schema at a publicly accessible URL (communities may define shared formats and schema).
–metadata prefix: each repository maps a prefix to the schema it supports, which is used in protocol requests.
support for unqualified Dublin Core mandatory
–reserved schema URL at
–reserved prefix oai_dc.
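The prefix-to-schema mapping described here can be pictured as a simple lookup table kept by each repository. The sketch below is hypothetical throughout: the `example.org` URLs are placeholders (in particular, they are not the reserved Dublin Core schema URL, which is missing from the slide transcript), and `marc_xml` is an invented prefix standing in for a community-defined format.

```python
# Placeholder schema locations; NOT the reserved URLs used by the protocol.
SUPPORTED_FORMATS = {
    "oai_dc": "http://example.org/schemas/oai_dc.xsd",   # mandatory format
    "marc_xml": "http://example.org/schemas/marc.xsd",   # hypothetical extra format
}


def schema_for(prefix: str, formats: dict[str, str] = SUPPORTED_FORMATS) -> str:
    """Resolve a metadata prefix from a request to the schema URL it maps to."""
    if "oai_dc" not in formats:
        raise RuntimeError("repository must support unqualified Dublin Core (oai_dc)")
    try:
        return formats[prefix]
    except KeyError:
        raise ValueError(f"unsupported metadata prefix: {prefix!r}") from None


print(schema_for("oai_dc"))
```

A harvester only ever sends the short prefix; the schema URL lets it validate what comes back.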

flow control [Figure: protocol requests exchanged between harvester and repository]

flow control specifics
–applies to all protocol requests that return lists: ListRecords, ListIdentifiers, ListSets
–resumptionToken is opaque
–semantics of partitioning of responses within resumption requests is undefined
–time-to-live of resumptionToken is not defined by the protocol
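The flow-control idea above amounts to a loop: issue a list request, and keep re-issuing it with the returned resumptionToken until none comes back. The following is a minimal sketch with a stand-in repository instead of real HTTP calls. The `fetch_page` callback, the token strings, and the `oai:eg:N` identifiers are all hypothetical; the harvester never inspects the token, which matches the slide's point that it is opaque.

```python
def harvest_all(fetch_page):
    """Collect every item of a list request by following resumption tokens.

    fetch_page(token) must return (items, next_token); the first call gets
    token=None, and a falsy next_token marks the last partition.
    """
    items, token = [], None
    while True:
        page, token = fetch_page(token)
        items.extend(page)
        if not token:
            return items


# Stand-in repository: three partitions chained by made-up tokens.
PAGES = {
    None: (["oai:eg:1", "oai:eg:2"], "t1"),
    "t1": (["oai:eg:3", "oai:eg:4"], "t2"),
    "t2": (["oai:eg:5"], None),
}


def fake_fetch(token):
    return PAGES[token]


print(harvest_all(fake_fetch))
```

Because partition sizes and token lifetimes are left undefined, a real harvester would also need a policy for restarting when a token has expired; that is outside this sketch.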

OAI Protocol (harvester = service provider; repository = data provider)
Supporting protocol requests: Identify, ListMetadataFormats, ListSets
Harvesting protocol requests: ListRecords, ListIdentifiers, GetRecord

Supporting Protocol Requests: Identify
Response: Repository name, Base-URL, Admin, OAI protocol version, Description Container

Supporting Protocol Requests: ListMetadataFormats
Response, repeated per format: Format prefix, Format XML schema

Supporting Protocol Requests: ListSets
Response, repeated per set: Set Specification, Set Name

Harvesting Protocol Requests: ListRecords
Arguments: from=a, until=b, set=klm, metadataPrefix=oai_dc
Response, repeated per record: Identifier, Datestamp, Metadata, About Container

Harvesting Protocol Requests: ListIdentifiers
Arguments: from=a, until=b, set=klm
Response, repeated per record: Identifier, Datestamp

Harvesting Protocol Requests: GetRecord
Arguments: identifier=oai:mlib:123a, metadataPrefix=oai_dc
Response: Identifier, Datestamp, Metadata, About
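The six requests and their arguments, as they appear on these slides, can be summarised in a small table that a harvester might use to assemble query strings. This is a sketch under assumptions: the slides list the arguments but do not say which are mandatory, so the required/optional split below (metadataPrefix for ListRecords; identifier and metadataPrefix for GetRecord) is an assumption, and the helper name `build_query` is hypothetical.

```python
from urllib.parse import urlencode

# verb -> (required arguments, optional arguments); the split is assumed.
VERBS = {
    "Identify": (set(), set()),
    "ListMetadataFormats": (set(), set()),
    "ListSets": (set(), set()),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"}),
    "ListIdentifiers": (set(), {"from", "until", "set"}),
    "GetRecord": ({"identifier", "metadataPrefix"}, set()),
}


def build_query(verb: str, **args: str) -> str:
    """Validate arguments for a verb and return the encoded query string."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb!r}")
    required, optional = VERBS[verb]
    # 'from' is a Python keyword, so callers pass it as 'from_'.
    args = {name.rstrip("_"): value for name, value in args.items()}
    missing = required - set(args)
    unknown = set(args) - required - optional
    if missing:
        raise ValueError(f"{verb} is missing {sorted(missing)}")
    if unknown:
        raise ValueError(f"{verb} does not accept {sorted(unknown)}")
    return urlencode({"verb": verb, **args})


print(build_query("GetRecord", identifier="oai:mlib:123a", metadataPrefix="oai_dc"))
print(build_query("ListRecords", from_="a", until="b", set="klm", metadataPrefix="oai_dc"))
```

Keeping the verb table as data makes it easy to adjust if a given protocol version accepts a different argument set.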


Other OAI Functions Registry of data and service providers Tool registry Community communication