OAI Protocol for Metadata Harvesting Tim Brody Intelligence, Agents, Multimedia Group University of Southampton OpCit –


1 OAI Protocol for Metadata Harvesting
Tim Brody, Intelligence, Agents, Multimedia Group, University of Southampton
OpCit – http://opcit.eprints.org/
www.ecs.soton.ac.uk
BCS Metadata Meeting, London, 29th May 2002
(Many slides borrowed from Michael L. Nelson)

2 OAI 2.0
Public, stable: not released yet … (but very close)
–Beta released mid-May
–Public release scheduled: 1st June
2.0 implementations in the pipeline
–British Library, Cornell Univ, Ex Libris, my.OAI, Humboldt Univ, InQuirion Pty Ltd, Library of Congress, NASA, OCLC, Old Dominion Univ, U. of Illinois, U. of Southampton, UCLA, Johns Hopkins U., Indiana U., NYU, UKOLN, Virginia Tech

3 Open Archives Initiative
The protocol is openly documented, and metadata is exposed to at least some peer group (note: rights management can still apply!)
"Archive" is defined as a collection of stuff, not the archivist's definition of archive. "Repository" is the term used in most OAI documents.
OAI is happening at break-neck speed...

4 Metadata Harvesting
Move away from distributed searching
Extract metadata from various sources
Build services on local copies of metadata
–Resources remain at remote repositories
[Diagram: a user query ("search for cfd applications") is answered from a local copy of metadata that was harvested offline from four nodes. Each node is independently maintained; all searching, browsing, etc. is performed on the local metadata; individual nodes can still support direct user interaction.]

5 Metadata Harvesting
Repositories (archives etc.) = low implementation cost
Services = higher implementation cost
Similar to web search model
–DP9 gateway makes it exactly the same

6
            Santa Fe convention    OAI-PMH v.1.0/1.1        OAI-PMH v.2.0
nature      experimental           experimental             stable
verbs       Dienst                 OAI-PMH                  OAI-PMH
requests    HTTP GET/POST          HTTP GET/POST            HTTP GET/POST
responses   XML                    XML                      XML
transport   HTTP                   HTTP                     HTTP
metadata    OAMS                   unqualified Dublin Core  unqualified Dublin Core
about       eprints                document like objects    resources
model       metadata harvesting    metadata harvesting      metadata harvesting

7 OAI-PMH v.2.0 [06/2002]
Goal: recurrent exchange of metadata about resources between systems
Input:
–OAI-PMH v.1.0 [01/01 – 09/02]
–feedback on OAI-implementers
–deliberations by OAI-tech [09/01 -]
–alpha test group of OAI-PMH v.2.0 [03/02 -]

8 OAI-PMH v.2.0 [06/2002]
low-barrier interoperability specification
metadata harvesting model: data provider / service provider
metadata about resources
autonomous protocol
distinction between protocol and periphery
community-specific extensions
HTTP based
XML responses
unqualified Dublin Core
stable (1.0 characterized as experimental)

9 OAI Data Model: Resources / Items / Records
resource: the thing being described (in the example, David)
item: all available metadata about David
records: Dublin Core metadata, MARC metadata, SPECTRUM metadata
item = identifier
record = identifier + metadata format + datestamp
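
The item/record relationship on this slide can be sketched as a small data structure. This is only an illustration, not something the protocol prescribes: the class names `Item` and `Record`, the identifier `oai:example.org:david`, the `oai_dc` and `marc` prefixes, and the datestamps are all assumptions chosen for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Record:
    """One record: identifier + metadata format + datestamp."""
    identifier: str
    metadata_prefix: str  # e.g. "oai_dc" for unqualified Dublin Core
    datestamp: str
    metadata: str = ""    # serialized metadata (XML in practice)


@dataclass
class Item:
    """An item is addressed by one identifier and can yield several records."""
    identifier: str
    records: Dict[str, Record] = field(default_factory=dict)

    def add_record(self, metadata_prefix: str, datestamp: str,
                   metadata: str = "") -> Record:
        # Every record of an item shares the item's identifier.
        record = Record(self.identifier, metadata_prefix, datestamp, metadata)
        self.records[metadata_prefix] = record
        return record


# Hypothetical item exposing the same resource in two metadata formats.
item = Item("oai:example.org:david")
item.add_record("oai_dc", "2002-05-29")
item.add_record("marc", "2002-05-29")
```

Keying the records by metadata format mirrors the slide's definition: within one item, the format is what distinguishes one record from another.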

10 Overview of OAI Verbs
Verb                  Function
Identify              description of archive
ListMetadataFormats   metadata formats supported by archive
ListSets              sets defined by archive
ListIdentifiers       OAI unique ids contained in archive
ListRecords           listing of N records
GetRecord             listing of a single record
Identify, ListMetadataFormats and ListSets return archival metadata; ListIdentifiers, ListRecords and GetRecord are the harvesting verbs.
most verbs take arguments: dates, sets, ids, metadata formats and resumption token (for flow control)
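
Since requests are plain HTTP GET/POST, a verb and its arguments can be turned into a request URL with a few lines of code. The sketch below is a minimal illustration: the helper name `build_request` is hypothetical, the base URL and identifier are the ones shown later in the response slide, and the `oai_dc` prefix and the `cs` set are assumed values.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# The six verbs listed in the overview table.
VERBS = (
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
)


def build_request(base_url: str, verb: str,
                  arguments: Optional[Dict[str, str]] = None) -> str:
    """Return the GET URL for one OAI-PMH request (hypothetical helper)."""
    if verb not in VERBS:
        raise ValueError(f"{verb} is not a valid OAI-PMH verb")
    query = {"verb": verb}
    # Arguments are passed as a dict because "from" is a Python keyword.
    query.update(arguments or {})
    return f"{base_url}?{urlencode(query)}"


get_record_url = build_request(
    "http://arXiv.org/oai2", "GetRecord",
    {"identifier": "oai:arXiv:cs/0112017", "metadataPrefix": "oai_dc"},
)
selective_url = build_request(
    "http://arXiv.org/oai2", "ListIdentifiers",
    {"metadataPrefix": "oai_dc", "from": "2002-01-01", "set": "cs"},
)
```

The second URL shows the date and set arguments being used to narrow a harvest to part of a repository.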

11 Identify
1.1
Arguments –none
Errors –none
2.0
Arguments –none
Errors –badArgument

12 ListMetadataFormats
1.1
Arguments –identifier (OPTIONAL)
Errors –id does not exist
2.0
Arguments –identifier (OPTIONAL)
Errors –badArgument –noMetadataFormats –idDoesNotExist

13 ListSets
1.1
Arguments –resumptionToken (EXCLUSIVE)
Errors –no set hierarchy
2.0
Arguments –resumptionToken (EXCLUSIVE)
Errors –badArgument –badResumptionToken –noSetHierarchy

14 ListIdentifiers
1.1
Arguments –from (OPTIONAL) –until (OPTIONAL) –set (OPTIONAL) –resumptionToken (EXCLUSIVE)
Errors –no records match
2.0
Arguments –from (OPTIONAL) –until (OPTIONAL) –set (OPTIONAL) –resumptionToken (EXCLUSIVE) –metadataPrefix (REQUIRED)
Errors –badArgument –cannotDisseminateFormat –badResumptionToken –noSetHierarchy –noRecordsMatch

15 ListRecords
1.1
Arguments –from (OPTIONAL) –until (OPTIONAL) –set (OPTIONAL) –resumptionToken (EXCLUSIVE) –metadataPrefix (REQUIRED)
Errors –no records match –metadata format cannot be disseminated
2.0
Arguments –from (OPTIONAL) –until (OPTIONAL) –set (OPTIONAL) –resumptionToken (EXCLUSIVE) –metadataPrefix (REQUIRED)
Errors –noRecordsMatch –cannotDisseminateFormat –badResumptionToken –noSetHierarchy –badArgument

16 GetRecord
1.1
Arguments –identifier (REQUIRED) –metadataPrefix (REQUIRED)
Errors –id does not exist –metadata format cannot be disseminated
2.0
Arguments –identifier (REQUIRED) –metadataPrefix (REQUIRED)
Errors –badArgument –cannotDisseminateFormat –idDoesNotExist
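
The REQUIRED / OPTIONAL / EXCLUSIVE labels on the preceding verb slides amount to a small rule table. The sketch below encodes the 2.0 columns and checks a set of arguments against them; a `False` result corresponds to the case where a 2.0 repository would answer `badArgument`. The names `RULES` and `check_arguments` are hypothetical, and EXCLUSIVE is read here as "must be the only argument besides the verb".

```python
from typing import Dict, Iterable, Set, Tuple

# verb -> (required, optional, exclusive), taken from the 2.0 columns.
RULES: Dict[str, Tuple[Set[str], Set[str], Set[str]]] = {
    "Identify": (set(), set(), set()),
    "ListMetadataFormats": (set(), {"identifier"}, set()),
    "ListSets": (set(), set(), {"resumptionToken"}),
    "ListIdentifiers": ({"metadataPrefix"}, {"from", "until", "set"},
                        {"resumptionToken"}),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"},
                    {"resumptionToken"}),
    "GetRecord": ({"identifier", "metadataPrefix"}, set(), set()),
}


def check_arguments(verb: str, argument_names: Iterable[str]) -> bool:
    """Return True if the argument names are acceptable for the verb."""
    required, optional, exclusive = RULES[verb]
    given = set(argument_names)
    if given & exclusive:
        # An exclusive argument may not be combined with anything else.
        return given <= exclusive
    if not required <= given:
        return False
    return given <= (required | optional)
```

For example, `check_arguments("ListRecords", ["resumptionToken", "metadataPrefix"])` is rejected, while either argument on its own is accepted.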

17 response no errors
responseDate: 2002-02-08T08:55:46Z
request: http://arXiv.org/oai2
identifier: oai:arXiv:cs/0112017
datestamp: 2001-12-14
setSpec: cs
setSpec: math
…..
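
A harvester reads such a response with an ordinary XML parser. The sketch below rebuilds a plausible 2.0 response around the values on this slide and extracts the record header. The element names and namespace follow the OAI-PMH 2.0 schema, but the choice of `GetRecord` as the verb, the request attributes, the omission of the metadata part, and the names `SAMPLE_RESPONSE` and `parse_header` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Reconstructed sample; the metadata section of the record is left out.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request verb="GetRecord" identifier="oai:arXiv:cs/0112017"
           metadataPrefix="oai_dc">http://arXiv.org/oai2</request>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv:cs/0112017</identifier>
        <datestamp>2001-12-14</datestamp>
        <setSpec>cs</setSpec>
        <setSpec>math</setSpec>
      </header>
    </record>
  </GetRecord>
</OAI-PMH>"""


def parse_header(xml_text: str) -> Dict[str, Union[str, List[str], None]]:
    """Extract identifier, datestamp and set membership of the first record."""
    root = ET.fromstring(xml_text)
    header = root.find(".//oai:header", NS)
    if header is None:
        raise ValueError("response contains no record header")
    return {
        "identifier": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        "sets": [s.text for s in header.findall("oai:setSpec", NS)],
    }


header_info = parse_header(SAMPLE_RESPONSE)
```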

18 response with error
responseDate: 2002-02-08T08:55:46Z
request: http://arXiv.org/oai2
error: ShowMe is not a valid OAI-PMH verb
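
In 2.0, errors are reported inside the XML response, so a harvester should look for `error` elements before processing anything else. The sketch below rebuilds a plausible error response from the values on this slide and lists its errors. The `badVerb` code is the one the 2.0 specification defines for an illegal verb; the surrounding XML and the names `ERROR_RESPONSE` and `find_errors` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Reconstructed sample response for a request with an unknown verb.
ERROR_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-02-08T08:55:46Z</responseDate>
  <request>http://arXiv.org/oai2</request>
  <error code="badVerb">ShowMe is not a valid OAI-PMH verb</error>
</OAI-PMH>"""


def find_errors(xml_text: str) -> List[Tuple[Optional[str], str]]:
    """Return (code, message) pairs for every error element in a response."""
    root = ET.fromstring(xml_text)
    return [
        (error.get("code"), (error.text or "").strip())
        for error in root.findall("oai:error", NS)
    ]


errors = find_errors(ERROR_RESPONSE)
```

An empty list means the response carried no protocol-level error and can be parsed as usual.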

19 resumptionToken Flow-Control
Idempotency of resumptionToken: return same incomplete list when rT is re-issued
–while no changes occur in the repo: strict
–while changes occur in the repo: all items with unchanged datestamp
new attributes for the resumptionToken:
–expirationDate
–completeListSize
–cursor
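
Flow control with a resumptionToken comes down to a loop: request a list, process the incomplete list, and re-issue the request with the token until the response carries no token or an empty one. The sketch below is one way to write that loop, assuming 2.0 element names. The function `harvest_identifiers` and its `fetch` callback are hypothetical, and `fake_fetch` stands in for a real HTTP call so the example runs offline.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator, Optional

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest_identifiers(fetch: Callable[[Dict[str, str]], str],
                        metadata_prefix: str) -> Iterator[Optional[str]]:
    """Yield identifiers from ListIdentifiers, following resumption tokens."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iterfind(".//oai:header", NS):
            yield header.findtext("oai:identifier", namespaces=NS)
        token = root.find(".//oai:resumptionToken", NS)
        if token is None or not (token.text or "").strip():
            return  # missing or empty token: the list is complete
        # The token is exclusive, so it replaces all other arguments.
        params = {"verb": "ListIdentifiers",
                  "resumptionToken": token.text.strip()}


def fake_fetch(params: Dict[str, str]) -> str:
    """Stand-in repository that serves a two-page list (illustration only)."""
    second_page = params.get("resumptionToken") == "page2"
    identifier = "oai:example:2" if second_page else "oai:example:1"
    token = ("<resumptionToken completeListSize='2' cursor='1'/>"
             if second_page else
             "<resumptionToken completeListSize='2' cursor='0'>page2"
             "</resumptionToken>")
    return ("<OAI-PMH xmlns='http://www.openarchives.org/OAI/2.0/'>"
            "<ListIdentifiers><header><identifier>" + identifier +
            "</identifier></header>" + token +
            "</ListIdentifiers></OAI-PMH>")


harvested = list(harvest_identifiers(fake_fetch, "oai_dc"))
```

In a real harvester, `fetch` would issue the HTTP request and return the response body. Because the token is idempotent, a failed request can simply be retried with the same token.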

20 Adoption
evolution
–from talking about OAI-PMH
–to talking about projects that use OAI-PMH
–to talking about projects and failing to mention they use OAI-PMH
=> OAI-PMH becomes part of the infrastructure

21 Data Providers (a.k.a. repositories)
49 registered repositories [11/2001]
65 registered repositories [03/2002]
77 registered repositories [05/2002]
5+ million records
many unregistered repositories
private implementations (e.g. RDN)

22 Service Providers
Arc: cross-searching of registered repositories [ http://arc.cs.odu.edu ]
CiteBase: research literature search + citation ranking [ http://citebase.eprints.org ]
OLAC: cross-searching of Language Archive Community repositories [ http://www.language-archives.org/index.html ]

23 Service Providers
Scirus: scientific search engine [Elsevier] [ http://www.scirus.com ]
my.OAI: user-tailorable cross-searching of registered repositories [FS Consulting, Inc.] [ http://www.myoai.com ]
Growing interest from web search engines

24 OAI-PMH tools
Repository Explorer: interactive exploration of repositories [Virginia Tech] [ http://www.purl.org/NET/oai_explorer ]
eprints.org: generic OAI-PMH compliant repository software [U of Southampton] [ http://www.eprints.org ]
ALCME: repository and harvester software [OCLC] [ http://alcme.oclc.org/index.html ]
APIs, other tools @ www.openarchives.org

25 http://www.openarchives.org/ openarchives@openarchives.org

