Interoperability Fundamentals: OAI-PMH and OAI-ORE. SUETr Interoperability Event, 9th December 2008, London School of Economics Library. Dr Robert Sanderson.


1 Interoperability Fundamentals: OAI-PMH and OAI-ORE
SUETr Interoperability Event, 9th December 2008, London School of Economics Library
Dr Robert Sanderson, Dept. of Computer Science, University of Liverpool
azaroth@liverpool.ac.uk
http://www.openarchives.org/ore/
http://foresite.cheshire3.org/

2 Overview
OAI: Protocol for Metadata Harvesting
- Introduction
- Technical Details
- Example
OAI: Object Reuse and Exchange
- Introduction
- ORE for Repositories (Motivation)
- RDF and Atom
- Support

3 OAI: Protocol for Metadata Harvesting
Does pretty much what it says on the tin:
- An XML-over-HTTP protocol...
- that allows a client to harvest...
- all of the metadata records in a repository.
[Diagram: Service Provider and Data Provider. OAI-PMH request (ListIdentifiers, GetRecord) from the Service Provider to the Data Provider; OAI-PMH response (records) back. Labels: "Local Fetch Record", "Local Store Record".]

4 Architecture
Distinction between Data Provider (repository) and Service Provider (someone who does something with the data):
- Most service providers are aggregators of more than one repository, e.g. search, analysis, summarization, caching, proxies, ...
- Or it could be used for inter-repository transfer/update, where the Service Provider is also a Data Provider.
Distinction between centralized and distributed architecture:
- Centralized: harvest everything into one place and then search (PMH)
- Distributed: leave data where it is and search remotely (Z39.50/SRU)
- But they can be combined: distributed search over a centralized database providing an SRU interface and single distributed databases

5 Technical Details
Single URL end point that handles the protocol, e.g. http://www.cheshire3.org/services/oai?
Operation (verb) as a parameter:
- Identify: Tell me about yourself
- ListMetadataFormats: Tell me which formats you support
- ListSets: Tell me which sets of records you support
- ListIdentifiers: Retrieve headers for records
- ListRecords: Retrieve full records
- GetRecord: Retrieve a single known record
List operations can be restricted by the timestamp of the last update to the record:
...?verb=ListIdentifiers&metadataPrefix=oai_dc&from=2008-12-01
Hence you can ask for only the records that have changed since you last harvested. Compare to RSS/Atom (even order isn't guaranteed!)
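
The request pattern on this slide (one base URL, a verb, and a few arguments) can be sketched in a few lines of Python. This is only an illustration: the function name build_request and the base URL http://repo.example.org/oai are assumptions made for the example, not part of any particular library.

```python
from urllib.parse import urlencode

# The six OAI-PMH verbs listed on the slide.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL (hypothetical helper for illustration).

    'from' is a reserved word in Python, so callers pass it as 'from_';
    the trailing underscore is stripped before the URL is built.
    """
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: %r" % verb)
    params = {"verb": verb}
    for name, value in arguments.items():
        params[name.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


# Ask only for headers of records changed since the last harvest.
url = build_request(
    "http://repo.example.org/oai",  # assumed example end point
    "ListIdentifiers",
    metadataPrefix="oai_dc",
    from_="2008-12-01",
)
print(url)
```

A harvester would typically remember the date of its last successful run and pass it as the from argument next time.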

6 Support?
LOTS of libraries, as the protocol is easy to implement. Simple Google stats for +oai-pmh +download +(language):
- c#: 2,450
- perl: 4,520
- c++: 5,440
- ruby: 19,800
- python: 21,700
- java: 28,000
- php: 47,300
Okay, not all are implementations, but you get the picture!
Active mailing list (still!)
Repository Explorer / Conformance Tester
Lots of service providers looking to suck up data (e.g. OAIster)

7 Example Interaction
A harvester wants to fetch all of the metadata records in a repository since it last harvested, in the simple Dublin Core format.
Verb to use: ListRecords
http://repo.example.org/oai?verb=ListRecords&from=2008-11-01&metadataPrefix=oai_dc
Response (element values only): 2002-06-01T19:20:30Z http://repo.example.org/oai? oai:arXiv.org:hep-th/9901001 2008-12-02...
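
To make the response side concrete, here is a minimal sketch of reading a ListRecords response with Python's standard library. The XML below is an assumed, heavily trimmed sample built around the values that appear on the slide; the title is a made-up placeholder, and a real response carries more elements (and may be split across several pages).

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Assumed sample response; only the identifier, datestamp, responseDate
# and base URL come from the slide. The title is a placeholder.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-06-01T19:20:30Z</responseDate>
  <request verb="ListRecords" from="2008-11-01"
           metadataPrefix="oai_dc">http://repo.example.org/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:arXiv.org:hep-th/9901001</identifier>
        <datestamp>2008-12-02</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records(xml_text):
    """Return (identifier, datestamp, title) for each record in the response."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        datestamp = record.findtext("oai:header/oai:datestamp", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        results.append((identifier, datestamp, title))
    return results


records = list_records(SAMPLE_RESPONSE)
for identifier, datestamp, title in records:
    print(identifier, datestamp, title)
```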

8 I Don't Want All This @#*&)#%*&!
Problem: In order to download the records you want, you have to download everything and then filter it. This just wastes everyone's time.
Solution (?): There are server-defined sets of records (not nested). Each record knows which sets it is a member of. You can fetch only those records which are part of a named set.
How are the sets defined? By the server/repository admin...
Many people have tried to add search functionality to OAI-PMH... This is Wrong Wrong Wrong and shows a fundamental misunderstanding of the role of OAI-PMH in the overall information landscape!
For search, there's OpenSearch and SRU. (Another talk!)
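
Set membership travels in each record header as setSpec elements, and a harvester restricts a list request by adding a set argument (for example ...?verb=ListRecords&metadataPrefix=oai_dc&set=physics). The sketch below groups identifiers by set from an assumed sample ListIdentifiers response; the identifiers and the set names "physics" and "theses" are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Assumed sample response; identifiers and set names are made up.
SAMPLE_HEADERS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:repo.example.org:1</identifier>
      <datestamp>2008-12-01</datestamp>
      <setSpec>physics</setSpec>
    </header>
    <header>
      <identifier>oai:repo.example.org:2</identifier>
      <datestamp>2008-12-02</datestamp>
      <setSpec>physics</setSpec>
      <setSpec>theses</setSpec>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def identifiers_by_set(xml_text):
    """Map each setSpec to the identifiers of the records that belong to it."""
    root = ET.fromstring(xml_text)
    members = defaultdict(list)
    for header in root.findall("oai:ListIdentifiers/oai:header", NS):
        identifier = header.findtext("oai:identifier", namespaces=NS)
        # A record can be a member of several sets.
        for set_spec in header.findall("oai:setSpec", NS):
            members[set_spec.text].append(identifier)
    return dict(members)


sets = identifiers_by_set(SAMPLE_HEADERS)
print(sets)
```

The grouping is still defined by the repository, not the harvester, which is exactly the limitation the slide points out.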

9 What is ORE?
A method for making complex digital objects available over the web...
- in order for the object and its component parts to be easily and seamlessly reused as parts of other objects and in other contexts...
- and exchanged between organizations, infrastructures and services.
A set of projects funded by the Andrew W. Mellon Foundation, the Coalition for Networked Information, Microsoft, the National Science Foundation, and the Joint Information Systems Committee, under the Open Archives Initiative.

10 Who is Responsible?
Principal Investigators:
- Carl Lagoze (Cornell University)
- Herbert Van de Sompel (Los Alamos National Labs)
Editors:
- Pete Johnston (Eduserv Foundation)
- Michael Nelson (Old Dominion University)
- Rob Sanderson (University of Liverpool)
- Simeon Warner (Cornell University)
Technical and Advisory Boards, including: Liz Lyon, Peter Murray Rust, Les Carr, Richard Jones, Julie Allinson, Andy Powell, Lorcan Dempsey, John Erickson, MacKenzie Smith, Tony Hammond, Savas Parastatidis, Robert Tansley, Jane Hunter, Tim Cole, Leigh Dodds, Tim DiLauro, Jeff Young, ...

11 Main Idea of ORE
Create a way to describe an Aggregation of Resources...
- and the relationships between them...
- without changing the way we do things...
- without changing the resources themselves...
- in a manner consistent with the web architecture
Add boundary information over the top of the connected resources on the web
Publish this information using existing technologies... which we call a Resource Map
This concept is nothing new...

12 The Sky
An Aggregation of Stars

13 The Web
... with Boundary Information
... and Additional Relationships
[Diagram labels: Aggr, ReM]

14 ORE for Repositories
Key:
1 URI
2 Formats
3 Title
4 Authors
5 Creation Dates
6 Similar Objects
7 Versions
8 Links out
9 Citations in/out
a Abstract
b Journal

15 [Figure: example annotated with key items 1, 3, 4, 5, a, 9, 8, 7, b and 2]

16 [Figure: example annotated with key items 2, 3, 4, 5, 6, 9, 8, a and b]

17 RDF
The ORE Data Model is defined as a graph, and expressed in RDF. We express these relationships as triples:
URI-ReM    ore:describes        URI-Aggr
URI-Aggr   ore:aggregates       URI-1 [...] URI-6
URI-1      dcterms:references   URI-2
URI-5      rdfs:seeAlso         URI-X
[Diagram: a Resource Map (ReM) describing an Aggregation (Aggr) of resources 1 to 6, plus an external resource X]
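
As a rough illustration of the triple model, the graph on this slide can be held as plain (subject, predicate, object) tuples and queried with a list comprehension. The URI-* names are the slide's placeholders, not real URIs; the helper name objects is an assumption for this sketch, and the last predicate is written as rdfs:seeAlso, the RDF Schema property.

```python
# Each triple is (subject, predicate, object); URI-* are placeholder names.
TRIPLES = [("URI-ReM", "ore:describes", "URI-Aggr")]

# The Aggregation aggregates resources 1 to 6.
for n in range(1, 7):
    TRIPLES.append(("URI-Aggr", "ore:aggregates", "URI-%d" % n))

# Additional relationships between resources.
TRIPLES.append(("URI-1", "dcterms:references", "URI-2"))
TRIPLES.append(("URI-5", "rdfs:seeAlso", "URI-X"))


def objects(triples, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


# Follow the Resource Map to its Aggregation, then list what it aggregates.
aggregation = objects(TRIPLES, "URI-ReM", "ore:describes")[0]
aggregated = objects(TRIPLES, aggregation, "ore:aggregates")
print(aggregation, aggregated)
```

A real application would normally use an RDF library rather than bare tuples; the tuples just make the shape of the data visible.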

18 Where's the Data?
Triples can also have literal strings, numbers, dates etc:
URI-Aggr   dcterms:modified   "2008-12-09T10:30:00Z"
URI-Aggr   dc:title           "Rob's New Aggregation"
In our examples, the green aggregated resources are the different formats for the same work. That makes the Aggregation a resource that somehow represents the work in the abstract, and the Resource Map a description of that.
URI-ReM    ore:describes      URI-Aggr
URI-ReM    dcterms:modified   "2008-12-09T10:30:00Z"
URI-Aggr   ore:aggregates     URI-ps, URI-pdf, URI-html
URI-Aggr   dc:title           "Parametrization of..."
URI-Aggr   dcterms:modified   "2006-01-18T06:30:00Z"
URI-Aggr   ore:similarTo      info:doi/10.1142/S02177...
URI-Aggr   dcterms:creator    URI-Hui
URI-Hui    foaf:name          "Hui Li"
URI-ps     dc:format          "application/postscript"
...
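
The difference between a URI object and a literal object shows up clearly when triples are written out in a simple line-based format such as N-Triples. The sketch below is a hand-rolled illustration, not a full serializer: the Literal marker class and the helper names are assumptions, the example.org URIs stand in for the slide's URI-ReM and URI-Aggr placeholders, and only plain string literals are handled.

```python
# Namespace URIs for the prefixes used on the slide.
PREFIXES = {
    "ore": "http://www.openarchives.org/ore/terms/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


class Literal(str):
    """Marks a value as a literal string rather than a URI (illustrative)."""


def expand(term):
    """Expand prefix:local using PREFIXES; leave full URIs untouched."""
    prefix, _, local = term.partition(":")
    if prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term


def to_ntriples(triples):
    """Write triples one per line: <s> <p> <o> . or <s> <p> "literal" ."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = '"%s"' % escaped
        else:
            obj = "<%s>" % expand(o)
        lines.append("<%s> <%s> %s ." % (expand(s), expand(p), obj))
    return "\n".join(lines)


REM = "http://example.org/rem"    # stands in for URI-ReM
AGGR = "http://example.org/aggr"  # stands in for URI-Aggr

output = to_ntriples([
    (REM, "ore:describes", AGGR),
    (AGGR, "dc:title", Literal("Rob's New Aggregation")),
    (AGGR, "dcterms:modified", Literal("2008-12-09T10:30:00Z")),
])
print(output)
```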

19 Serializations
RDF has MANY serializations, including simple triple formats, XML formats, and RDFa, a way to embed RDF in XHTML. Recommended are RDF/XML and RDFa.
Also recommended is an Atom serialization:
- Each Aggregation is an Atom entry, and the Atom elements are mapped to the predicate (middle) part of the triple, e.g. author → dcterms:creator
- Aggregated Resources are referenced in link elements.
- Anything that can't be expressed natively in Atom goes into an extension block.
- This allows aggregations to sit in regular Atom feeds for discovery
- And plays nicely with other Atom-based protocols like OpenSearch or other GData-like systems
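
A rough sketch of the Atom mapping, using only Python's standard library. It assumes, as I understand the ORE 1.0 Atom profile, that the Aggregation and its Aggregated Resources are pointed to by link elements whose rel values are the ore:describes and ore:aggregates term URIs. The function name and example.org URIs are invented, and the result is deliberately incomplete: a valid Atom entry also needs elements such as id and updated.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ORE_TERMS = "http://www.openarchives.org/ore/terms/"
ET.register_namespace("atom", ATOM)


def aggregation_to_atom(aggregation_uri, title, author_name, aggregated_uris):
    """Build a simplified Atom entry for an Aggregation (illustrative only)."""
    entry = ET.Element("{%s}entry" % ATOM)
    # Native Atom elements stand in for triples, e.g. author -> dcterms:creator.
    ET.SubElement(entry, "{%s}title" % ATOM).text = title
    author = ET.SubElement(entry, "{%s}author" % ATOM)
    ET.SubElement(author, "{%s}name" % ATOM).text = author_name
    # Assumed mapping: the entry points at the Aggregation it describes...
    ET.SubElement(entry, "{%s}link" % ATOM,
                  rel=ORE_TERMS + "describes", href=aggregation_uri)
    # ...and each Aggregated Resource is referenced by its own link element.
    for uri in aggregated_uris:
        ET.SubElement(entry, "{%s}link" % ATOM,
                      rel=ORE_TERMS + "aggregates", href=uri)
    return entry


entry = aggregation_to_atom(
    "http://example.org/aggr",
    "New Aggregation",
    "Hui Li",
    ["http://example.org/paper.ps",
     "http://example.org/paper.pdf",
     "http://example.org/paper.html"],
)
print(ET.tostring(entry, encoding="unicode"))
```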

20 Support
Not as much as OAI-PMH... yet! Version 1.0 was only released in October.
Libraries: Foresite Toolkit http://foresite-toolkit.googlecode.com/
- Java (ORE 0.9, Richard Jones) and Python (ORE 1.0, me)
- Idea: Build an object model on top of the RDF graph:
    a = Aggregation()
    a.title = "New Aggregation"
Validator: Atom/ORE Validator http://www.openarchives.org/ore/1.0/atom-validator
- From Los Alamos National Labs, plus other transforms
Generic RDF Libraries, Converters:
- Available in most languages... talk to me about writing a Foresite library!
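
The "object model on top of the RDF graph" idea can be illustrated with a toy class in which setting an attribute records a triple. To be clear, this is NOT the Foresite Toolkit API: the class below, its constructor argument, its attribute-to-predicate table and the add_resource method are all assumptions made up for this sketch.

```python
class Aggregation:
    """Toy object wrapper over a list of triples (not the Foresite API)."""

    # Assumed mapping from attribute names to predicates.
    _PREDICATES = {
        "title": "dc:title",
        "modified": "dcterms:modified",
        "creator": "dcterms:creator",
    }

    def __init__(self, uri):
        # Bypass our own __setattr__ for the two real instance attributes.
        object.__setattr__(self, "uri", uri)
        object.__setattr__(self, "triples", [])

    def __setattr__(self, name, value):
        # Setting a mapped attribute appends a triple to the graph.
        predicate = self._PREDICATES.get(name)
        if predicate is None:
            raise AttributeError("no predicate mapped to %r" % name)
        self.triples.append((self.uri, predicate, value))

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for mapped properties.
        predicate = self._PREDICATES.get(name)
        if predicate is None:
            raise AttributeError(name)
        values = [o for s, p, o in self.triples if p == predicate]
        if not values:
            raise AttributeError(name)
        return values[-1]

    def add_resource(self, resource_uri):
        """Record that this Aggregation aggregates the given resource."""
        self.triples.append((self.uri, "ore:aggregates", resource_uri))


a = Aggregation("http://example.org/aggr")
a.title = "New Aggregation"
a.add_resource("http://example.org/paper.pdf")
print(a.title)
print(a.triples)
```

The appeal of this style is that repository code can work with ordinary objects while the triples remain available for serialization.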

21 Repository Operations
- Create: Send ORE in Atom via SWORD from client
- Update: Send ORE in Atom via SWORD from client to existing URI
- Search: Return ORE via OpenSearch/SRU
- Harvest: Return ORE via OAI-PMH
- Archive: Archive ORE Resource Map plus Aggregated Resources
- Export: Export ORE (to be created by other)
- Import: Create from ORE
Real Life Example:
- Wrapper around Flickr API to export Photos/Photosets (Rob)
- ORE Importer into Omeka Digital Library Platform (Sean Hannan)
- Ran importer against the Flickr wrapper to import photos out of Flickr, along with metadata, different sizes, etc. Seamless Interoperability!
Other examples: DSpace, Fedora, MyExperiment, JSTOR, WordPress, ...

22 Thank You :)
Questions?
URLs:
- Me: azaroth@liverpool.ac.uk
- PMH: http://www.openarchives.org/pmh/
- ORE: http://www.openarchives.org/ore/
- Foresite: http://foresite-toolkit.googlecode.com/
- This: http://www.csc.liv.ac.uk/~azaroth/papers/suetr-ore.pdf
(Bonus points for expressing the above as an ORE Aggregation!)

