1 Herbert Van de Sompel Research Library, Los Alamos National Laboratory OAI4, October 20-22 2005, CERN, Geneva, Switzerland RESEARCH LIBRARY Lessons in Cross-Repository Interoperability learned from the aDORe effort Herbert Van de Sompel Research Library Los Alamos National Laboratory, USA

2 The repository model
"Pattern Recognition: The 2003 OCLC Environmental Scan" http://www.oclc.org/membership/escan/toc.htm

3 Credits
The reported material is based on the following work:
  o The LANL aDORe repository effort
  o The upcoming PhD thesis by Jeroen Bekaert (advisor: Herbert Van de Sompel) on protocol-based interfaces for Open Archival Information Systems (OAIS)
  o The NSF-funded Pathways project, in collaboration with the Information Science group at Cornell University (Carl Lagoze, Sandy Payette, Simeon Warner)

4 Outline
aDORe
  o A few words about the aDORe architecture
A Federation of Repositories
  o A new level of cross-repository interoperability
Pathways InterDisseminator
  o A context-sensitive service overlay for a federation of repositories

5 aDORe

6 aDORe effort
aDORe is two things:
  o A standards-based, modular repository architecture
    - Distributed architecture
    - Protocol-based interactions between modules
    - Applicable to creating interoperable federations of heterogeneous repositories
  o An actual implementation of the architecture at LANL for local storage of digital assets (currently in its 2nd version)
aDORe is not a product
  o Components of the aDORe software, usable in other environments, will be released

7 aDORe effort
Standards used in aDORe include:
  o XML
  o XML Schema
  o MPEG-21 Digital Item Declaration
  o MPEG-21 Digital Item Identification
  o W3C XML Signatures
  o OAI-PMH
  o NISO OpenURL Framework for Context-Sensitive Services
  o Internet Archive ARC file format
  o OAIS concepts

8 [Figure labels: Compound objects; Repository Registry; Identifier Locator]

9 [Figure labels: OpenURL Resolver; OAI-PMH Federator; Dynamic Dissemination Engine]

10 OAI-PMH Federator & OpenURL Resolver
aDORe front-end | Interface standard | Identifier | OAIS Access Type | # items in response
OAI-PMH Federator | OAI-PMH | Package Identifier | OAIS DIP | 1 or more
OpenURL Resolver | NISO OpenURL | Content Identifier, Package Identifier (with XML ID fragment) | OAIS DIP & Result Set | 1

11 aDORe effort
Standards
Distributed architecture
Protocol-based communication
Insights in Cross-Repository Interoperability

12 Outline
aDORe
  o A few words about the aDORe architecture
A Federation of Repositories
  o A new level of cross-repository interoperability
Pathways InterDisseminator
  o A context-sensitive service overlay for a federation of repositories

13 The repository model
Different repository types: scholarly communication (preprint, postprint), dataset repositories, cultural heritage collections, cultural event collections, learning object repositories, teaching object repositories, digitized book repositories, …
Can be institution-based, discipline-based, …

14 The repository model
In an updated worldview:
  o These repositories are about facilitating the (re)use of materials in many contexts
  o These repositories are the starting point of value chains

15 Value chains emerging from RSS feeds
http://www.technorati.com

16 Value chains starting in repositories
[Figure labels: recombine; add value]

17 The interoperable repository model
I will try to show that:
  o a significantly higher level of cross-repository interoperability can be achieved with relatively modest means
  o those means are largely available and agreed upon in our community

18 Part 1: Requirements for a repository in a federation

19 Repositories & Units of Communication
Data-oriented research => not only textual materials, but also datasets, software, simulations, dynamic knowledge presentations, …
Research results are represented by a variety of digital media => these media must receive a status similar to that of text in the current system
Materials in various stages of certification => units of communication are not only ‘papers’ but also preprints, raw datasets, prototype simulations, …
Facilitate collaboration => re-use of units of communication

20 Repositories & Units of Communication
Handling this requires:
  o a compound object view of a unit of communication
  o no longer thinking in terms of metadata versus content
Compound object:
  o Has a persistent identifier
  o Contains materials and metadata about those materials
  o Can contain other compound objects
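The compound object view can be illustrated with a small data structure. This is only a sketch, not the aDORe data model: the class names, the `role` field, and the helper method are assumptions introduced for illustration, and the identifiers are the placeholder URIs used in the slides.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Datastream:
    """A piece of material or metadata carried by a compound object."""
    mime_type: str
    role: str  # "metadata" or "content"; this vocabulary is an assumption


@dataclass
class CompoundObject:
    """A unit of communication with a persistent identifier."""
    identifier: str  # persistent URI, not necessarily a locator
    parts: List[Union["CompoundObject", Datastream]] = field(default_factory=list)

    def contained_identifiers(self) -> List[str]:
        """Return identifiers of all nested compound objects, depth-first."""
        found: List[str] = []
        for part in self.parts:
            if isinstance(part, CompoundObject):
                found.append(part.identifier)
                found.extend(part.contained_identifiers())
        return found


# A compound object that carries one datastream and two nested objects.
paper = CompoundObject(
    "URI_7",
    [
        Datastream("application/pdf", "content"),
        CompoundObject("URI_3"),
        CompoundObject("URI_9"),
    ],
)
nested = paper.contained_identifiers()
```

Materials and metadata sit side by side as parts of the same object, which is the point of dropping the metadata-versus-content distinction.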

21 Compound objects
[Figure: compound object with URI_7, URI_3, URI_9]
URIs:
  o minted by different repositories
  o from different namespaces
  o not (necessarily) locators

22 XML-based representation of compound objects
[Figure: compound object (URI_7, URI_3, URI_9) and its XML-based representation (URI_7, URI_3, URI_9)]
XML-based representation: MPEG-21 DIDL, METS, IMS/CP, RDF

23 Repository Interop Interface 1: OAI-PMH & CO
[Figure labels: repository_a; OAI-PMH baseURL_m; OAI-PMH harvester; compound object (URI_7, URI_3, URI_9)]
machine consumption
batches of compound objects
OAI-PMH datestamp ~ new version of object
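A harvester request against Interface 1 might be built as sketched below. The `verb`, `metadataPrefix`, and `from` arguments are standard OAI-PMH; the base URL and the `didl` metadata prefix are assumptions chosen for illustration, since the slides do not name the prefix a repository would advertise.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str,
                     from_date: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request for batches of compound objects."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Selective harvesting: only objects whose datestamp changed since
        # from_date, i.e. new versions of compound objects.
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL and metadata prefix.
url = list_records_url("http://repository-a.example.org/oai", "didl", "2005-10-01")
```

Because the datestamp stands for a new version of the object, a repeated harvest with an updated `from` value picks up only changed compound objects.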

24 OAI-PMH interface to OAIS (Jeroen Bekaert)

25 Repository Interop Interface 1: OAI-PMH & CO
[Figure labels: OAI-PMH harvester; URI_7, URI_3, URI_9; URI_7, URI_3, URI_9, URI_12; add value; recombine; repository_b; OAI-PMH baseURL_n]
include provenance ~ version of compound object

26 Repository Interop Interface 2: OpenURL & CO
[Figure labels: repository_n; OpenURL baseURL_o; compound object (URI_7, URI_3, URI_9)]
OpenURL: baseURL_x?url_ver=Z39.88-2004&rft_id=URI_7&svc_id=info:pathways/svc/dip.*
machine (& human) consumption
single object dissemination
~ identifier of compound object

27 Repository Interop Interface 2: OpenURL & CO
ServiceType = request a representation of the DO expressed using a compound object format
  o Example:
    - svc_id = info:pathways/svc/dip.didl (request MPEG-21 DIDL representation)
    - svc_id = info:pathways/svc/dip.mets (request METS representation)
    - svc_id = info:pathways/svc/dip.rdf (request RDF representation – see later)
Other Entities could be added to Interface #2 (think Requester)
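The request pattern for Interface 2 can be sketched as a small URL builder. The keys `url_ver`, `rft_id`, and `svc_id` and the `info:pathways/svc/dip.*` service identifiers come from the slides; the function name, the base URL, and the example object URI are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def dissemination_url(base_url: str, object_uri: str, fmt: str) -> str:
    """Build a KEV-style OpenURL asking for one compound object in a given format."""
    params = {
        "url_ver": "Z39.88-2004",                 # OpenURL version from the slides
        "rft_id": object_uri,                     # identifier of the compound object
        "svc_id": "info:pathways/svc/dip." + fmt, # didl, mets, or rdf
    }
    # urlencode percent-encodes reserved characters such as ":" and "/".
    return base_url + "?" + urlencode(params)


# Hypothetical OpenURL base URL and object identifier.
request = dissemination_url("http://repository-n.example.org/openurl",
                            "info:example/URI_7", "didl")
decoded = parse_qs(urlsplit(request).query)
```

Only the `svc_id` value changes between a DIDL, METS, or RDF request, so the same interface serves every compound object format a repository supports.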

28 Repository Interop Interface 2: OpenURL & CO
[Figure labels: repository_n; OpenURL baseURL_o]
OpenURL:
  o independent of nature of identifiers
  o ‘resolution’ independent of scheme-specific mechanisms
  o conceptual interface is persistent over time
KEV & HTTP, XML & SOAP, …

29 OpenURL interface to OAIS (Jeroen Bekaert)

30 Part 2: Requirements for an infrastructure supporting a federation of repositories

31 Repository Registry: Who is part of the Federation?
[Figure labels: Repository Registry; register]
Per repository:
  o Repository identifier
  o baseURL of OAI-PMH interface
  o baseURL of OpenURL interface
  o any other information that helps downstream applications understand the nature of the repository

32 Object Registry: What is part of the Federation?
[Figure labels: Object Registry; harvest (identifiers); SRU; SRW; handle]
Per compound object:
  o Object identifier
  o Object datetime ~ OAI-PMH datestamp
  o OAI-PMH identifier
  o Repository identifier
of the object itself, and of its contained objects

33 OAI-PMH & OpenURL access to objects in federation
[Figure labels: Object Registry; Repository Registry; URI_7; compound object (URI_7, URI_3, URI_9); SRU; SRW; handle]
List of existing copies
Per copy:
  o OAI-PMH access info
  o OpenURL access info
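The lookup across the two registries can be sketched with in-memory structures. This is an illustration only: the slides mention SRU/SRW and the Handle system as possible access mechanisms, whereas the class names, field names, and URLs below are assumptions, and plain Python collections stand in for the real registries.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RepositoryEntry:
    repository_id: str
    oai_pmh_base_url: str
    openurl_base_url: str


@dataclass(frozen=True)
class ObjectEntry:
    object_id: str
    datestamp: str            # corresponds to the OAI-PMH datestamp
    oai_pmh_identifier: str
    repository_id: str


def locate_copies(object_id: str,
                  object_registry: List[ObjectEntry],
                  repository_registry: Dict[str, RepositoryEntry]) -> List[dict]:
    """Return OAI-PMH and OpenURL access info for every known copy of an object."""
    copies = []
    for entry in object_registry:
        if entry.object_id == object_id:
            repo = repository_registry[entry.repository_id]
            copies.append({
                "repository": repo.repository_id,
                "oai_pmh": (repo.oai_pmh_base_url, entry.oai_pmh_identifier),
                "openurl": repo.openurl_base_url,
            })
    return copies


# Hypothetical registry contents: URI_7 exists in two repositories.
repositories = {
    "repository_a": RepositoryEntry("repository_a", "http://a.example.org/oai",
                                    "http://a.example.org/openurl"),
    "repository_b": RepositoryEntry("repository_b", "http://b.example.org/oai",
                                    "http://b.example.org/openurl"),
}
objects = [
    ObjectEntry("URI_7", "2005-10-01T00:00:00Z", "oai:a:7", "repository_a"),
    ObjectEntry("URI_7", "2005-10-05T00:00:00Z", "oai:b:7", "repository_b"),
    ObjectEntry("URI_3", "2005-09-12T00:00:00Z", "oai:a:3", "repository_a"),
]
copies_of_uri_7 = locate_copies("URI_7", objects, repositories)
```

The Object Registry answers where an object lives, and the Repository Registry turns that answer into concrete interface addresses.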

34 Part 3: Summary of requirements

35 Summary of requirements
Requirement | Repository | Infrastructure
Compound Object model support | X | -
XML-based representations support | X | ?
OAI-PMH CO support | X | -
OpenURL CO support | X | -
Repository Registry | - | X
Object Registry | - | X

36 Summary of requirements
Many variations on the design are possible, yet most of this can be achieved with:
Off-the-shelf tools
  o OAI-PMH tools
  o Handle system, SRU/W tools
  o OpenURL tools
  o Tools to generate XML-based representations of objects
Surprisingly little effort
A feasible amount of coordination/specification
Some shared infrastructure

37 Big unaddressed issues
Rights:
  o When facilitating the (re)use of materials (not just metadata), IP concerns increase significantly:
    - Data authenticity
    - Data integrity
    - Usage rights
  o Need machine-readable rights expressions:
    - Robots are the next generation of readers
    - Even when materials are “free”
    - Object-level expressions
    - The world of CC, MPEG-21 REL, ODRL, XrML
Object relationships:
  o Complex, yet secondary information in the architecture

38 Outline
aDORe
  o A few words about the aDORe architecture
A Federation of Repositories
  o A new level of cross-repository interoperability
Pathways InterDisseminator
  o A context-sensitive service overlay for a federation of repositories

39 Pathways InterDisseminator Service Overlay
Pathways InterDisseminator: dynamic, service-oriented overlay upon the federated architecture
Assumes the existence of:
  o OpenURL interface to all repositories in the federation
  o Object Registry (given an identifier, at which OpenURL interface is the object available?)
  o Availability of an RDF-based representation of the DO compliant with a Pathways OWL core ontology
Is itself exposed as a different OpenURL Resolver

40 Pathways InterDisseminator: core ontology

41 [Figure labels: DSpace; Fedora; aDORe; compound object (URI_7, URI_3, URI_9); RDF; magic engine; OpenURL ContextObject Container; Interop Interface 2; OpenURL Service Overlay; OpenURL Application]
OpenURL requests shown:
  o baseURL_y?url_ver=Z39.88-2004&rft_id=URI_7&svc_id=info:pathways/boostrap
  o baseURL_y?url_ver=Z39.88-2004&rft_id=URI_7&svc_id=info:pathways/dip.rdf

42 Pathways InterDisseminator Service Overlay
Part of the dissemination OpenURL Application is an engine that dynamically decides upon services for a given object from a repository (in a federation).
  o It grabs the (RDF) representation of the DO from its origin repository
  o It introspects on the properties expressed in that (RDF) representation
  o It compares these properties with its knowledge database
  o It returns a list of possible services/disseminations
There can be many of these engines in a federation.
The result is the ability to provide context-sensitive disseminations of DOs in (a federation of) repositories.
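The compare-and-return steps of such an engine can be sketched as a subset match. This is a toy illustration under stated assumptions, not the Pathways implementation: the property strings, the service names, and the knowledge base are hypothetical, and the object's properties are passed in directly instead of being parsed from an RDF representation.

```python
from typing import Dict, List, Set

# Hypothetical knowledge database: service name -> properties the object must have.
KNOWLEDGE_BASE: Dict[str, Set[str]] = {
    "render-pdf-as-images": {"mime:application/pdf"},
    "marcxml-to-dublin-core": {"mime:application/xml", "format:MARCXML"},
    "play-video": {"mime:video/mpeg"},
}


def possible_services(object_properties: Set[str],
                      knowledge_base: Dict[str, Set[str]]) -> List[str]:
    """Return services whose required properties are all present on the object."""
    return sorted(
        name
        for name, required in knowledge_base.items()
        if required <= object_properties  # subset test
    )


# Properties as they might be read from a scholarly paper with a MARCXML
# record and a PDF full text (the property encoding is an assumption).
paper_properties = {
    "mime:application/pdf",
    "mime:application/xml",
    "format:MARCXML",
}
services = possible_services(paper_properties, KNOWLEDGE_BASE)
```

Since each engine carries its own knowledge database, two engines in the same federation can offer different services for the same object.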

44 [Figure labels: DSpace; Fedora; aDORe; compound object (URI_7, URI_3, URI_9); RDF; service execution engine; web service]
OpenURL requests shown:
  o baseURL_y?url_ver=Z39.88-2004&rft_id=URI_7&svc_id=info:magic/justdoit
  o baseURL_y?url_ver=Z39.88-2004&rft_id=URI_7&svc_id=info:pathways/dip.rdf

45 Pathways InterDisseminator Demo
aDORe Digital Object in Demo
Entity | Type | MIME | Identifier
Digital Object | scholarly paper | N/A | DOI
Constituent Datastream 1 | metadata record | application/xml (MARCXML) | aDORe datastream id (info URI)
Constituent Datastream 2 | metadata record | application/xml (original metadata) | aDORe datastream id (info URI)
Constituent Datastream 3 | fulltext file | application/pdf | aDORe datastream id (info URI)

46 Demo
Install the TSCC codec (http://www.techsmith.com)
Launch the movie Pathways_InterDisseminator.avi, located in the same path as this presentation

47 Comments, Flames, Questions

