The Open Archives Initiative Story

The Open Archives Initiative Story Thomas Krichel http://openlib.org/home/krichel Uni. of Surrey, Hitotsubashi Uni., Long Island Uni.

About this talk
- Follows essentially a historical approach
- mixes in a few digital library concepts; interrupt me if you do not get some of them
- does not represent an official statement
- botches together various ideas from different people
- benefited from funding by DLF, LANL, CLIR, JISC, DINI

UPS call 1999-07
- Ginsparg, Luce and Van de Sompel
- “The purpose of this call is the mobilisation of a core group to work towards achieving a universal service for author-archived literature”
- emphasis on a pragmatic level of interoperability

UPS protoproto
- By Krichel, Nelson and Van de Sompel
- found that the main problems of interoperability between eprint initiatives are
  - poor metadata
  - no uniform identifier structure
  - unclear legal terms and conditions
  - lack of selective harvesting

Santa Fe meeting 1999-10
- Representatives of arXiv, cogprints, Highwire, NCSTRL, NDLTD, RePEc, SLAC/SPIRES and others
- chaired by Lynch and Waters
- sponsored by CLIR, LANL and SPARC

basic concepts
- “Managed” or formal e-print archive; not papers on the web
- Open e-print archive means that there is a machine interface
- “record” can be metadata or metadata & full text
- archive may be partitioned

business model
- Inspired by RePEc initiative
- Separation between data providers and service providers
- Many archives
- Many metadata collections
- Many services

requirements & realisations
Requirement                                      Realisation
Metadata harvesting (not distributed database)   OA Dienst subset
Namespace                                        full id=archive|record
mandatory metadata & parallel sets               OAMS and XML transport
acceptable use                                   gentleperson’s agreement in a provider statement
registration                                     primitive templates

technical model
- Subset of Dienst protocol used by NCSTRL
- Compatible archives respond to 4 requests
  - List-Partitions
  - List-Meta-Formats
  - List-Contents (partitionspec, file-after, meta-format)
  - Disseminate (fullID, meta-format, content-type)

Dublin Core-ish
- Minimal Metadata for selective harvesting
- optional: Display ID [R]; Abstract; Subject [R]; Comment [R]; Date for Discovery [R]
- mandatory: Title; Date of Accession; Full ID; Author [R]
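
The split into mandatory and optional fields can be sketched as a small check on a harvested record. This is only an illustration: the snake_case keys and both helper functions are assumptions of the sketch, not names taken from the OAMS itself.

```python
# Field groups as listed on the slide; the snake_case spellings are hypothetical.
MANDATORY = ("title", "date_of_accession", "full_id", "author")
OPTIONAL = ("display_id", "abstract", "subject", "comment", "date_for_discovery")


def missing_mandatory(record):
    """Return the mandatory fields that are absent or empty in the record."""
    return [name for name in MANDATORY if not record.get(name)]


def unknown_fields(record):
    """Return field names that belong to neither group, sorted alphabetically."""
    return sorted(set(record) - set(MANDATORY) - set(OPTIONAL))


sample = {"title": "Someone's paper", "full_id": "dini:01", "author": ["A. Author"]}
print(missing_mandatory(sample))
print(unknown_fields({"title": "x", "colour": "red"}))
```

A service provider could use a check of this kind to decide whether a record is complete enough for selective harvesting.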

Implementation efforts
- Implementation of Dienst subset
  - arXiv.org done
  - Cornell NCSTRL server done
  - WCR done
  - RePEc fails
- Harvesting arXiv NCSTRL for a test library

Critique
- Why OAMS, not Dublin Core?
- Dienst subset carries a lot of legacy to the full Dienst protocol that.

development in DL community
- Interest in interoperability for a long time, stated interest of the Digital Library Federation
- trouble: two approaches
  - union catalogue: causes friction
  - distributed search: high entry requirement, problematic to implement

Harvard meeting 2000-05
- Vision statement: SFc a new way forward for interoperability
- could the OAi develop in a more general fashion such that it can be used by different communities?
- political agenda of OAi (free access) perceived as problem

San Antonio meeting 2000-06
- 45 people show broad range of interest
- leads to problem of not getting lost
- View that SFc is a technical support infrastructure
- Communities with different business and content models can adopt the framework for interoperability

San Antonio meeting 2000-06
- Carl’s reverse bubble
  - First there was the OAi that made the SFc.
  - Now there is the SFc that is implemented by more than the original OAi
- discussion of what changes are required to the OAi steering committee
- attract funding to develop other application domains

Ithaca meeting 2000-09
- Experience gained with implementing & discussing the current SFc specs
- aim: new spec by the end of 2000
  - stable for experimentation but not definite
  - hope to minimise risks for implementors
  - maximise chances for interoperability
- SFc+ to translate from eprint domain interoperability towards general domain interoperability

Abstract concepts to keep
- open eprint archive --> open archive
- data provider / service provider
- archive management issue of records needed to be discussed
- OAMS confuses metadata and full text

Implementation features to keep
- Metadata harvesting
- OAi namespace
- shared metadata and parallel metadata
- acceptable use
- registration of data and service providers

All change please, all change...
- OAi DIENST replaced by OA protocol
- OAi ID revised
- OAMS replaced by wrapped DC
- introduction of the concept of native metadata
- generalised and marginalised partitions
- revisited registrations

New OAi metadata
- Accession date to be renamed datestamp and stripped of semantic link to the records
- Full ID kept, colon used as canonical separator
- unqualified DC is mandatory, but empty DC may be returned
- introduction of the idea of native metadata
- OAMS scrapped, Krichel and Warner to lead an EPMS discussion

Solution: encapsulate metadata
  <oai>
    <oai.fullid>dini:01</oai.fullid>
    <oai:datestamp>"2000-09-21"</oai:datestamp>
    <dc xmlns:dc="…">
      <dc.title> Someone’s paper </dc.title>
    </dc>
  </oai>
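
How a harvester might take such a wrapped record apart can be sketched as follows. The sample below is a simplified stand-in for the slide's example: it uses dotted element names throughout and no namespace declarations, which is an assumption made so that the snippet parses on its own. The function name split_record is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical variant of the wrapped record (dotted names only).
SAMPLE = """
<oai>
  <oai.fullid>dini:01</oai.fullid>
  <oai.datestamp>2000-09-21</oai.datestamp>
  <dc>
    <dc.title>Someone's paper</dc.title>
  </dc>
</oai>
"""


def split_record(xml_text):
    """Separate the wrapper fields from the Dublin Core payload."""
    root = ET.fromstring(xml_text.strip())
    envelope, payload = {}, {}
    for child in root:
        if child.tag == "dc":
            # The payload: one entry per DC element inside the wrapper.
            payload = {field.tag: (field.text or "").strip() for field in child}
        else:
            # Everything else belongs to the wrapper itself.
            envelope[child.tag] = (child.text or "").strip()
    return envelope, payload


envelope, payload = split_record(SAMPLE)
print(envelope)
print(payload)
```

The point of the wrapper is visible in the return value: identifier and datestamp travel outside the metadata, so the payload format can change without touching them.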

Identifier
- Identifiers point to metadata records
- Concatenate
  - case-sensitive archive name
  - delimiter is a colon
  - anything internal to the archive appearing after that
- prefixed by OAI as a pointer to a resolution mechanism
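
The concatenation rule can be sketched in a few lines. This is a minimal illustration, assuming that the first colon ends the archive name and that everything after it is opaque to the harvester. The names FullId, make_full_id and parse_full_id are hypothetical, and the sketch leaves out the OAI prefix because the slide does not spell out its syntax.

```python
from typing import NamedTuple


class FullId(NamedTuple):
    archive: str  # case-sensitive archive name
    local: str    # anything internal to the archive


def make_full_id(archive, local):
    """Concatenate archive name, colon delimiter and archive-internal part."""
    if not archive or ":" in archive:
        raise ValueError("archive name must be non-empty and contain no colon")
    if not local:
        raise ValueError("archive-internal part must be non-empty")
    return f"{archive}:{local}"


def parse_full_id(full_id):
    """Split at the first colon; later colons stay in the internal part."""
    archive, sep, local = full_id.partition(":")
    if not sep or not archive or not local:
        raise ValueError(f"not a full identifier: {full_id!r}")
    return FullId(archive, local)


print(parse_full_id("dini:01"))
print(parse_full_id(make_full_id("RePEc", "wop:surrec:9801")))
```

Because the archive name is case sensitive, the sketch compares names exactly and never lowercases them.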

Sets replace partitions
- ONLY for a local community to implement selective harvesting
- there can be zero or more sets in an archive
- records can exist at interior nodes in the set hierarchy
- asking for records in a set returns records in the set and in all its subsets.
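
The subset rule can be illustrated with a small sketch. It assumes that a set is named by its path in the hierarchy with a colon between levels, and that each record carries a list of the sets it belongs to. The record ids, set names and function names are made up for the example.

```python
def in_set(record_set, requested, sep=":"):
    """True if record_set is the requested set or one of its subsets."""
    return record_set == requested or record_set.startswith(requested + sep)


def records_in_set(records, requested):
    """Return ids of records in the requested set or anywhere below it."""
    return [
        record_id
        for record_id, sets in records.items()
        if any(in_set(s, requested) for s in sets)
    ]


# Hypothetical archive: one record at an interior node, one in a subset,
# one in an unrelated set, and one in no set at all.
records = {
    "dini:01": ["physics"],
    "dini:02": ["physics:hep"],
    "dini:03": ["math"],
    "dini:04": [],
}

print(records_in_set(records, "physics"))
print(records_in_set(records, "physics:hep"))
```

Matching on the set name plus separator, rather than on a bare prefix, keeps a set such as "physicsx" from being mistaken for a subset of "physics".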

OA protocol
- Identify (no arguments, no exceptions)
- ListMetadataFormats ([fullId]), response is the same as for the SFc
- ListSets (no arguments, empty response ok)
- ListRecord ([Sets] colon as separator)

OA protocol
- ListContents ([sets][recordbefore][recordafter][metaformat])
  - response as before but may contain resumption token (set,recordbefore,recordafter)
  - errors 206, 503, 302
- GetRecord (fullId)
  - response as before
  - error 404
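
A harvester that follows resumption tokens might look roughly like the sketch below. It does no networking: fetch is a caller-supplied function returning one page of record ids plus a token or None, and the argument name resumptiontoken is an assumption, since the slide does not name it. The two-page PAGES table merely stands in for an archive.

```python
def harvest(fetch, **criteria):
    """Yield record ids page by page until no resumption token is returned."""
    args = dict(criteria)
    while True:
        records, token = fetch(args)
        yield from records
        if token is None:
            return
        # Hypothetical convention: the token alone identifies the next page.
        args = {"resumptiontoken": token}


# Stand-in for an archive that answers ListContents in two pages.
PAGES = {
    None: (["dini:01", "dini:02"], "t1"),
    "t1": (["dini:03"], None),
}


def fake_fetch(args):
    return PAGES[args.get("resumptiontoken")]


harvested = list(harvest(fake_fetch, sets="physics"))
print(harvested)
```

Keeping the transport behind fetch means the loop can be tested without an archive, and the listed error codes can be handled in one place by whatever fetch a real harvester plugs in.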

Encoding via cgi
- General syntax: baseurl?verb=verbname&argname=argval...
- baseurl is the location of the OA v1 protocol as registered at openarchives.org
- verbname is the name of the verb
- argname is the name of the attribute
- argval is the value of the attribute
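
The general syntax translates directly into a small request builder. This is a sketch only: the base URL is a made-up example, build_request is a hypothetical helper, and percent-encoding of the argument values is an assumption about how a client would keep the query string unambiguous.

```python
from urllib.parse import urlencode


def build_request(baseurl, verb, **args):
    """Build baseurl?verb=verbname&argname=argval... with encoded values."""
    # The verb comes first; remaining arguments are sorted for stable output.
    query = urlencode([("verb", verb)] + sorted(args.items()))
    return f"{baseurl}?{query}"


BASE = "http://archive.example.org/oa"  # hypothetical base URL

print(build_request(BASE, "Identify"))
print(build_request(BASE, "GetRecord", fullId="dini:01"))
```

Note that the colon in the full identifier is encoded as %3A in the second request, so it cannot be confused with any other use of the colon in the URL.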

Registration of archives
- Metadata format registration as now, names alphanumeric and underscore
- Self-description introduced in the OA protocol through the identify verb
- Fields of data provider templates
  - Natural language name
  - description
  - url
  - archive id
  - maintainer (of OA interface) email
  - version of OA protocol used
  - OA base url

Conclusion
After the Ithaca work, the OAi is set for another round of testing, with a broader set of tests than the first time. Many idiosyncrasies of the old SFc have been removed, and that will increase the overall acceptability. The new version one of the OAi protocol may be a bit more complicated than the SFc, but it is a lot more sound. It is still not definite.