Georges Arnaout Chaitanya Krishna


Presentation on theme: "Georges Arnaout Chaitanya Krishna"— Presentation transcript:

1 Georges Arnaout Chaitanya Krishna
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
Website:
Editors: Carl Lagoze (Cornell University), Herbert Van de Sompel (Los Alamos Laboratory), Michael Nelson (NASA Langley Research Ctr), Simeon Warner (Cornell University)
Presented by: Georges Arnaout, Chaitanya Krishna
CS 791/891-WEB SYNDICATION FORMATS

2 OAI: Open Archives Initiative
Open: the protocol is openly documented, and metadata is “exposed” to at least some peer group.
Archives: an archive is defined as a “collection of stuff”, or “Repository”.
Initiative: OAI is happening at break-neck speed...
figure reference:

3 But what is interoperability ???
Definition OAI-PMH: a protocol that provides an application-independent interoperability framework based on metadata harvesting.

4 What is Interoperability?
Interoperability is the ability of two or more applications or systems to exchange information and to use the information exchanged.

5 What is a harvester?
A harvester is a client application that issues OAI-PMH requests. It is operated in order to collect metadata from repositories.

6 What is a repository?
A repository is a network-accessible server: a BIG database, a place where data is stored and maintained. The data it contains are the metadata that are exposed to harvesters.

7 Verbs Summary
Verb                 Function
Identify             description of repository
ListMetadataFormats  metadata formats supported by repository
ListSets             sets defined by repository
ListIdentifiers      OAI unique ids contained in repository
ListRecords          listing of N records
GetRecord            listing of a single record
figure reference:

8 OAI-PMH Data Model
OAI-PMH distinguishes between 3 distinct entities related to the exposed metadata:
1- Resource: the object that the metadata is about.
2- Item: the constituent of a repository from which metadata about a resource can be disseminated. That metadata may be disseminated on the fly, cross-walked from some canonical form, or actually stored in the repository.
3- Record: metadata in a specific metadata format.

9 Example: resource, item and records
[Figure: a resource (David); an item (item = identifier) holding all available metadata about David; records disseminated from the item as Dublin Core metadata, MARC and SPECTRUM (record = identifier + metadata format + datestamp).] figure reference:
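The item/record relationship can be sketched in a few lines of Python. This is only an illustration of the data model, not code from the protocol: the class names, the `oai:example.org:david` identifier and the datestamps are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Record:
    """record = identifier + metadata format + datestamp (plus the metadata itself)."""
    identifier: str       # identifier of the item the record was disseminated from
    metadata_prefix: str  # metadata format, e.g. "oai_dc"
    datestamp: str        # date of creation, modification or deletion
    metadata: str         # XML payload, kept as a plain string in this sketch


@dataclass
class Item:
    """item = identifier; it can disseminate one record per metadata format."""
    identifier: str
    records: Dict[str, Record] = field(default_factory=dict)

    def add(self, metadata_prefix: str, datestamp: str, metadata: str) -> None:
        self.records[metadata_prefix] = Record(
            self.identifier, metadata_prefix, datestamp, metadata
        )

    def get_record(self, metadata_prefix: str) -> Optional[Record]:
        # None means the item cannot be disseminated in that format.
        return self.records.get(metadata_prefix)


# Hypothetical item with two metadata formats.
david = Item("oai:example.org:david")
david.add("oai_dc", "2002-05-01", "<dc/>")
david.add("oai_marc", "2002-05-02", "<marc/>")
```

Asking the item for a format it does not hold (for example `david.get_record("spectrum")`) returns `None`, which corresponds to the case where a repository cannot disseminate the requested format.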

10 The XML-encoding of records
A record is encoded in XML in three parts: Header, Metadata, About. The above link shows the encoding of a record in XML.

11 What happens if a record was deleted from the repository???
deletedRecord

12 What happens if a record was deleted from the repository???
Repositories must declare one of 3 levels of support:
1- no: the repository does not maintain information about deletions. It MUST NOT reveal a deleted status in any response.
2- persistent: (the opposite) the repository maintains information about deletions with no time limit. It MUST persistently keep track of deletions and reveal the status of a deleted record.
3- transient: like persistent, but only for a limited time. Such a repository MAY reveal a deleted status; not revealing the status is acceptable.
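When a repository does reveal a deletion, the record's header carries a status="deleted" attribute and no metadata part. A minimal sketch of how a harvester might check for this with Python's standard library follows; the sample record, its identifier and its datestamp are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 responses, in ElementTree's "{uri}tag" notation.
OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Hypothetical deleted record: header only, no <metadata>.
DELETED_SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header status="deleted">
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2002-05-01</datestamp>
  </header>
</record>"""


def is_deleted(record_xml: str) -> bool:
    """Return True if the record's header is flagged with status="deleted"."""
    record = ET.fromstring(record_xml)
    header = record.find(OAI + "header")
    return header is not None and header.get("status") == "deleted"
```

A harvester talking to a repository with "no" support never sees this flag, so it cannot rely on it to detect removals.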

13 Selective Harvesting (datestamp and SET)
Selective harvesting allows harvesters to limit harvest requests to portions of the metadata available from a repository.

14 Selective Harvesting via datestamps
Request:
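What a repository does with the from and until arguments of such a request can be sketched as a simple filter. The function name and the sample data are assumptions for illustration; the sketch also assumes that all datestamps and both bounds use the same granularity (YYYY-MM-DD here), so that plain string comparison orders them correctly. Both bounds are inclusive.

```python
from typing import Iterable, List, Optional, Tuple

# A record is reduced to (identifier, datestamp) for this sketch.
DatedRecord = Tuple[str, str]


def select_by_datestamp(
    records: Iterable[DatedRecord],
    from_: Optional[str] = None,
    until: Optional[str] = None,
) -> List[DatedRecord]:
    """Keep the records whose datestamp lies in the inclusive range [from_, until].

    A bound of None means "no limit on that side".
    """
    selected = []
    for identifier, datestamp in records:
        if from_ is not None and datestamp < from_:
            continue  # changed before the lower bound
        if until is not None and datestamp > until:
            continue  # changed after the upper bound
        selected.append((identifier, datestamp))
    return selected
```

A harvester that remembers the date of its previous harvest can pass it as `from_` and receive only the records created, changed or deleted since then.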

15 SET membership
A set is an optional construct for grouping items for the purpose of selective harvesting. Think of it as a fraternity: a student (item) may belong to a fraternity, but not all students belong to one.

16 Selective Harvesting Via Set
<record>
  <header>
    <identifier>oai:arXiv:cs/ </identifier>
    <datestamp> </datestamp>
    <setSpec>cs</setSpec>
    <setSpec>math</setSpec>
  </header>
  <metadata> ….. </metadata>
</record>
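Set membership is read from the setSpec elements of the record header. The sketch below shows one way to do that; the sample record mirrors the one above but uses a made-up identifier and datestamp, and it omits the XML namespace for brevity (a real response is namespaced). It also assumes hierarchical setSpec values, where colons separate the levels.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical record with the same shape as the example above.
SAMPLE_RECORD = """<record>
  <header>
    <identifier>oai:example.org:item-1</identifier>
    <datestamp>2002-05-01</datestamp>
    <setSpec>cs</setSpec>
    <setSpec>math</setSpec>
  </header>
  <metadata/>
</record>"""


def set_specs(record_xml: str) -> List[str]:
    """Return every setSpec listed in the record header."""
    record = ET.fromstring(record_xml)
    return [el.text for el in record.findall("./header/setSpec")]


def in_set(record_xml: str, set_spec: str) -> bool:
    """True if the record is in set_spec or in one of its sub-sets.

    Example: a record with setSpec "cs:AI" is also a member of "cs".
    """
    return any(
        spec == set_spec or spec.startswith(set_spec + ":")
        for spec in set_specs(record_xml)
    )
```

A record with no setSpec elements simply returns an empty list, like the student who belongs to no fraternity.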

17 Date/time
Date/time: 1957-03-20T20:30:00Z is UTC 8:30:00 PM on March 20th 1957.
Encoded in: ISO 8601, Z-notation.
Request: YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ.
Response: YYYY-MM-DDThh:mm:ssZ.
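The two datestamp forms can be produced and parsed with Python's standard `datetime` module. This is a small sketch; the function names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_utc_datestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ssZ (UTC, Z-notation)."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def parse_datestamp(value: str) -> datetime:
    """Parse either granularity: YYYY-MM-DD or YYYY-MM-DDThh:mm:ssZ."""
    fmt = "%Y-%m-%dT%H:%M:%SZ" if "T" in value else "%Y-%m-%d"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


# 8:30:00 PM UTC on March 20th 1957, given here as 3:30 PM in a UTC-5 zone.
local = datetime(1957, 3, 20, 15, 30, 0, tzinfo=timezone(timedelta(hours=-5)))
stamp = to_utc_datestamp(local)
```

Converting to UTC before formatting matters: the trailing Z asserts UTC, so a local time must never be written with it unchanged.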

18 The BIG PICTURE
Figure reference:

19 Request/Response
Requests are encoded in HTTP; responses are encoded in XML.
figure reference:

20 GET Example
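A GET request carries the verb and its arguments as URL query parameters. The sketch below builds such URLs with the standard library; the helper name is an assumption, and the record identifier in the usage lines is hypothetical. The base URL is the Library of Congress repository used elsewhere in these slides.

```python
from urllib.parse import urlencode


def build_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH GET URL: base URL, then verb and arguments as key=value pairs."""
    params = {"verb": verb}
    # Skip arguments passed as None so optional ones can be left out.
    params.update({k: v for k, v in arguments.items() if v is not None})
    # urlencode percent-escapes reserved characters such as ":" and "/".
    return base_url + "?" + urlencode(params)


BASE = "http://memory.loc.gov/cgi-bin/oai2_0"
identify_url = build_request_url(BASE, "Identify")
get_record_url = build_request_url(
    BASE, "GetRecord", identifier="oai:example.org:item-1", metadataPrefix="oai_dc"
)
```

Escaping matters because identifiers usually contain colons and slashes, which are reserved characters in a URL.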

21 Flow Control
A number of OAI-PMH requests are list requests. The lists they return could be very large, so a repository may partition a list among a series of requests and responses.

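22 Flow Control Example
Exchange between a harvester and a repository (backed by an RDBMS):
harvester -> repository: ListRecords
repository -> harvester: Records 1-100, resumptionToken=AXad31
harvester -> repository: ListRecords, resumptionToken=AXad31
repository -> harvester: Records , resumptionToken=pQ22-x
harvester -> repository: ListRecords, resumptionToken=pQ22-x
repository -> harvester: Records
figure reference: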
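On the harvester side, this exchange amounts to a loop that keeps re-issuing the request with the latest resumptionToken until the repository stops returning one. The sketch below replaces the network with a fake in-memory repository so that it runs on its own; the tokens come from the example above, while the page sizes after the first page (and the total of 250 records) are assumptions.

```python
from typing import Callable, List, Optional, Tuple

# fetch_page(token) returns (records on this page, next token or None/"").
FetchPage = Callable[[Optional[str]], Tuple[List[int], Optional[str]]]


def harvest(fetch_page: FetchPage) -> List[int]:
    """Collect a complete list by following resumptionTokens."""
    records: List[int] = []
    token: Optional[str] = None  # first request carries no token
    while True:
        page, token = fetch_page(token)
        records.extend(page)
        if not token:  # missing or empty token: the list is complete
            break
    return records


# Fake repository standing in for real HTTP requests (illustrative data only).
_PAGES = {
    None: (list(range(1, 101)), "AXad31"),
    "AXad31": (list(range(101, 201)), "pQ22-x"),
    "pQ22-x": (list(range(201, 251)), None),
}


def fake_fetch_page(token: Optional[str]) -> Tuple[List[int], Optional[str]]:
    return _PAGES[token]


all_records = harvest(fake_fetch_page)
```

The harvester treats the token as opaque: it never inspects or builds one, it only hands back what the repository gave it.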

23 Response with no errors
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH>
  <responseDate> T08:55:46Z</responseDate>
  <request verb="GetRecord"… …>
  <GetRecord>
    <record>
      <header>
        <identifier>oai:arXiv:cs/ </identifier>
        <datestamp> </datestamp>
        <setSpec>cs</setSpec>
        <setSpec>math</setSpec>
      </header>
      <metadata> ….. </metadata>
    </record>
  </GetRecord>
</OAI-PMH>

24 Response with errors
In the event of an error or exception condition, repositories must indicate OAI-PMH errors by including one or more error elements in the response.
Request: verb=nastyVerb
Response:
<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns=" xmlns:xsi=" xsi:schemaLocation="
  <responseDate> T19:20:30Z</responseDate>
  <request verb="ListRecords" from=" T02:00:00Z" until=" T03:020:00Z" metadataPrefix="oai_marc">
  <error code="badArgument"/>
</OAI-PMH>
Figure reference:
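A harvester can detect such a response by looking for error elements directly under the root element. The sketch below does this with the standard library; the sample response is a made-up, well-formed one (hypothetical base URL and date) showing the badVerb error that an unknown verb produces.

```python
import xml.etree.ElementTree as ET
from typing import List

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical response to a request with an unknown verb.
SAMPLE_ERROR_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-05-01T19:20:30Z</responseDate>
  <request>http://example.org/oai</request>
  <error code="badVerb">Illegal OAI verb</error>
</OAI-PMH>"""


def error_codes(response_xml: bytes) -> List[str]:
    """Return the code of every <error> element; an empty list means success."""
    root = ET.fromstring(response_xml)
    return [error.get("code") for error in root.findall("oai:error", OAI_NS)]
```

Note that OAI-PMH errors are reported inside a normal XML response, so checking the HTTP status alone is not enough to know whether a request succeeded.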

25 Request Verbs
There are six different request types:
1) GetRecord
2) Identify
3) ListIdentifiers
4) ListMetadataFormats
5) ListRecords
6) ListSets

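26 Argument Summary
Verb                 metadataPrefix  from      until     set       resumptionToken  identifier
Identify             -               -         -         -         -                -
ListMetadataFormats  -               -         -         -         -                optional
ListSets             -               -         -         -         exclusive        -
ListIdentifiers      required        optional  optional  optional  exclusive        -
ListRecords          required        optional  optional  optional  exclusive        -
GetRecord            required        -         -         -         -                required
(- = argument not allowed)
Figure reference: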
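The argument rules lend themselves to a table-driven check. The sketch below encodes the rules as given in the OAI-PMH 2.0 specification (required, optional, and resumptionToken as an exclusive argument) and returns the error code a repository would report; the function and table names are made up for illustration.

```python
from typing import Dict, Optional, Set, Tuple

# verb: (required arguments, optional arguments, accepts an exclusive resumptionToken)
ARGUMENTS: Dict[str, Tuple[Set[str], Set[str], bool]] = {
    "Identify": (set(), set(), False),
    "ListMetadataFormats": (set(), {"identifier"}, False),
    "ListSets": (set(), set(), True),
    "ListIdentifiers": ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "ListRecords": ({"metadataPrefix"}, {"from", "until", "set"}, True),
    "GetRecord": ({"identifier", "metadataPrefix"}, set(), False),
}


def check_arguments(verb: str, arguments: Dict[str, str]) -> Optional[str]:
    """Return "badVerb", "badArgument", or None if the request is well formed."""
    if verb not in ARGUMENTS:
        return "badVerb"
    required, optional, exclusive_token = ARGUMENTS[verb]
    names = set(arguments)
    if "resumptionToken" in names:
        # Exclusive: allowed only for list verbs, and only on its own.
        ok = exclusive_token and names == {"resumptionToken"}
        return None if ok else "badArgument"
    missing = not required <= names
    unknown = not names <= (required | optional)
    return "badArgument" if missing or unknown else None
```

"Exclusive" is the rule most easily overlooked: a follow-up request carries the resumptionToken and nothing else, not even the metadataPrefix of the original request.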

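27 Error Summary
Abbreviations: BA = badArgument, NMF = noMetadataFormats, IDDNE = idDoesNotExist, BRT = badResumptionToken, NSH = noSetHierarchy, CDF = cannotDisseminateFormat, NRM = noRecordsMatch
Verb                 Errors
Identify             BA
ListMetadataFormats  BA, NMF, IDDNE
ListSets             BA, BRT, NSH
ListIdentifiers      BA, BRT, NSH, CDF, NRM
ListRecords          BA, BRT, NSH, CDF, NRM
GetRecord            BA, IDDNE, CDF
Figure reference: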

28 Dublin Core
The Dublin Core metadata element set is a standard for cross-domain information resource description. It has been the mandated metadata format since the initial release of the protocol. The purpose of this requirement is to promote interoperability among data providers.
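In OAI-PMH, unqualified Dublin Core is delivered under the metadataPrefix oai_dc, as an oai_dc:dc container holding repeatable dc elements. The sketch below collects those elements into a dictionary using the standard library; the sample metadata (title and creators) is invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace of the Dublin Core element set, in ElementTree's "{uri}" notation.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical content of a record's <metadata> part in oai_dc format.
SAMPLE_METADATA = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
</oai_dc:dc>"""


def dublin_core_fields(metadata_xml: str) -> Dict[str, List[str]]:
    """Map each Dublin Core element name to the list of its values."""
    root = ET.fromstring(metadata_xml)
    fields: Dict[str, List[str]] = {}
    for element in root:
        if element.tag.startswith(DC):
            name = element.tag[len(DC):]  # strip the namespace part
            fields.setdefault(name, []).append(element.text)
    return fields
```

Values are kept in lists because every Dublin Core element is optional and repeatable, as the two creators in the sample show.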

29 Example http://memory.loc.gov/cgi-bin/oai2_0?verb=Identify

30 Repository explorer and example
We shall discuss the following HU-Berlin example in the repository explorer above.

31 OAI-PMH service provider
This is a service provider using OAI-PMH.

32 Conclusion
OAI-PMH allows for any metadata format, so long as it is encoded in XML with an XML schema.
All repositories must support oai_dc for a minimum level of interoperability.
OAI-PMH now defines a single XML Schema to validate responses to all OAI-PMH requests.
In a successful and trend-setting collaboration with the Dublin Core Metadata Initiative, an XML Schema for unqualified Dublin Core has been created. It is hosted by the DCMI and used in the delivery of metadata in the mandatory DC format in the OAI-PMH.

33 Questions?
What are the benefits of OAI-PMH?
Is the Open Archives Initiative only concerned with metadata?
Why was Dublin Core chosen as the standard metadata format for OAI-PMH?

34 References
[CENDI Meeting, MD (4/3/02)]
[OA Forum Workshop, Pisa, Italy (5/13/02)]

