Herbert Van de Sompel Research Library, Los Alamos National Laboratory OAI4, October 20-22 2005, CERN, Geneva, Switzerland RESEARCH LIBRARY Lessons in.

Presentation transcript:

Herbert Van de Sompel, Research Library, Los Alamos National Laboratory. OAI4, October 20-22 2005, CERN, Geneva, Switzerland.
Lessons in Cross-Repository Interoperability learned from the aDORe effort
Herbert Van de Sompel, Research Library, Los Alamos National Laboratory, USA

The repository model
"Pattern Recognition: The 2003 OCLC Environmental Scan"

Credits
The reported material is based on the following work:
o The LANL aDORe repository effort
o The upcoming PhD thesis by Jeroen Bekaert (Advisor Herbert Van de Sompel) regarding protocol-based interfaces for Open Archival Information Systems (OAIS)
o The NSF-funded Pathways project in collaboration with the Information Science group at Cornell University (Carl Lagoze, Sandy Payette, Simeon Warner)

Outline
o aDORe: A few words about the aDORe architecture
o A Federation of Repositories: A new level of cross-repository interoperability
o Pathways InterDisseminator: A context-sensitive service overlay for a federation of repositories

aDORe

aDORe effort
aDORe is 2 things:
o Standards-based, modular repository architecture
  - Distributed architecture
  - Protocol-based interactions between modules
  - Applicable to create interoperable federations of heterogeneous repositories
o Actual implementation of the architecture at LANL for local storage of digital assets (currently in its 2nd version)
aDORe is not a product
o Components of aDORe software, usable in other environments, will be released

aDORe effort
Standards used in aDORe include:
o XML
o XML Schema
o MPEG-21 Digital Item Declaration
o MPEG-21 Digital Item Identification
o W3C XML Signatures
o OAI-PMH
o NISO OpenURL Framework for Context-Sensitive Services
o Internet Archive ARC file format
o OAIS concepts

Diagram labels: Compound objects; Repository Registry; Identifier Locator

Diagram labels: OpenURL Resolver; OAI-PMH Federator; Dynamic Dissemination Engine

OAI-PMH Federator & OpenURL Resolver
aDORe front-end | Interface standard | Identifier | OAIS Access Type | # items in response
OAI-PMH Federator | OAI-PMH | Package Identifier | OAIS DIP | 1 or more
OpenURL Resolver | NISO OpenURL | Content Identifier, Package Identifier (with XML ID fragment) | OAIS DIP & Result Set | 1

aDORe effort
o Standards
o Distributed architecture
o Protocol-based communication
Insights in Cross-Repository Interoperability

The repository model
Different repository types: scholarly communication (preprint, postprint), dataset repositories, cultural heritage collections, cultural event collections, learning object repositories, teaching object repositories, digitized book repositories, ...
Can be institution-based, discipline-based, ...

The repository model
In an updated worldview:
o These repositories are about facilitating the (re)use of materials in many contexts
o These repositories are the starting point of value chains

Value chains emerging from RSS feeds

Value chains starting in repositories
Diagram labels: recombine; add value

The interoperable repository model
I will try to show that:
o a significantly higher level of cross-repository interoperability can be achieved with relatively modest means
o those means are largely available and agreed upon in our community

Part 1: Requirements for a repository in a federation

Repositories & Units of Communication
o Data-oriented research => not only textual materials, but also datasets, software, simulations, dynamic knowledge presentations, ...
o Research results represented by a variety of digital media => these media must receive status similar to that of text in the current system
o Materials in various stages of certification => units of communication are not only 'papers' but also preprints, raw datasets, prototype simulations, ...
o Facilitate collaboration => re-use of units of communication

Repositories & Units of Communication
Handling this requires:
o a compound object view of a unit of communication
o to stop thinking in terms of metadata versus content
Compound object:
o Has a persistent identifier
o Contains materials and metadata about those materials
o Can contain other compound objects
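The compound object view can be illustrated with a small data structure. This is only a sketch under stated assumptions: the class names `Datastream` and `CompoundObject`, the field names, and the sample identifiers and MIME types are hypothetical and do not come from aDORe or Pathways. It shows the three properties listed on the slide: a persistent identifier, materials with metadata about them, and nesting of compound objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Datastream:
    """A piece of material carried inside a compound object (hypothetical model)."""
    identifier: str  # a URI; not necessarily a locator
    mime_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class CompoundObject:
    """A unit of communication with a persistent identifier (hypothetical model)."""
    identifier: str
    metadata: Dict[str, str] = field(default_factory=dict)
    # Members are datastreams or other compound objects (nesting is allowed).
    members: List[Union[Datastream, "CompoundObject"]] = field(default_factory=list)

    def identifiers(self) -> List[str]:
        """Return this object's identifier, then those of everything it contains."""
        found = [self.identifier]
        for member in self.members:
            if isinstance(member, CompoundObject):
                found.extend(member.identifiers())
            else:
                found.append(member.identifier)
        return found


# Illustrative object: identifiers and MIME types are made up for the example.
paper = CompoundObject(
    identifier="URI_7",
    metadata={"title": "Example paper"},
    members=[
        Datastream("URI_3", "application/pdf"),
        CompoundObject("URI_9", members=[Datastream("URI_12", "text/csv")]),
    ],
)
print(paper.identifiers())
```

Treating metadata as just another field on each member, rather than as a separate record, is one way to reflect the "stop thinking in terms of metadata versus content" point.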

Compound objects
Diagram labels: compound object; URI_7, URI_3, URI_9
URIs:
o minted by different repositories from different namespaces
o not (necessarily) locators

XML-based representation of compound objects
Diagram labels: compound object (URI_7, URI_3, URI_9); XML-based representation (URI_7, URI_3, URI_9)
Formats: MPEG-21 DIDL, METS, IMS/CP, RDF

Repository Interop Interface 1: OAI-PMH & CO
Diagram labels: repository_a; OAI-PMH baseURL_m; URI_7, URI_3, URI_9; OAI-PMH harvester
o machine consumption
o batches of compound objects
o OAI-PMH datestamp ~ new version of object
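As a rough illustration of Interface 1, the sketch below builds an OAI-PMH `ListRecords` request that a harvester could send to collect batches of compound objects changed since a given datestamp. The verb and the `metadataPrefix`, `from` and `resumptionToken` arguments are standard OAI-PMH; the base URL and the `didl` metadata prefix are assumptions made for the example, and the function name `list_records_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str,
                     from_date: Optional[str] = None,
                     resumption_token: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL (sketch, no network access)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument: it replaces
        # metadataPrefix/from when continuing an incomplete list.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Selective harvesting: only records whose datestamp is on or
            # after this date, i.e. new versions of objects.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


# Hypothetical base URL and metadata prefix, for illustration only.
url = list_records_url("http://repository-a.example.org/oai", "didl",
                       from_date="2005-10-01")
print(url)
```

Because the datestamp stands for a new version of the object, repeating the request with the date of the previous harvest would, under these assumptions, return only objects that changed in between.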

OAI-PMH interface to OAIS (Jeroen Bekaert)

Repository Interop Interface 1: OAI-PMH & CO
Diagram labels: OAI-PMH harvester; URI_7, URI_3, URI_9; add value, recombine; URI_7, URI_3, URI_9, URI_12; repository_b; OAI-PMH baseURL_n
o include provenance ~ version of compound object

Repository Interop Interface 2: OpenURL & CO
Diagram labels: repository_n; OpenURL baseURL_o; URI_7, URI_3, URI_9
Request: baseURL_x?url_ver=Z&rft_id=URI_7&svc_id=info:pathways/svc/dip.*
o machine (& human) consumption
o single object dissemination ~ identifier of compound object

Repository Interop Interface 2: OpenURL & CO
ServiceType = Request a representation of the DO expressed using a compound object format
o Example:
  - svc_id = info:pathways/svc/dip.didl (request MPEG-21 DIDL representation)
  - svc_id = info:pathways/svc/dip.mets (request METS representation)
  - svc_id = info:pathways/svc/dip.rdf (request RDF representation, see later)
Other Entities could be added to Interface #2 (think Requester)
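A request against Interface 2 could be assembled as in the following sketch. It uses the KEV (key/encoded-value) form over HTTP GET with the `url_ver`, `rft_id` and `svc_id` keys shown on the slides. Assumptions: the slides abbreviate the version as `Z`, and the sketch fills in `Z39.88-2004`, the version string defined by the NISO OpenURL standard; the base URL and the function name `dip_request_url` are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def dip_request_url(base_url: str, object_id: str, fmt: str) -> str:
    """Build a KEV OpenURL asking for one compound-object representation (sketch)."""
    params = {
        "url_ver": "Z39.88-2004",                  # assumed expansion of the slides' "Z"
        "rft_id": object_id,                       # identifier of the compound object
        "svc_id": f"info:pathways/svc/dip.{fmt}",  # requested representation format
    }
    # urlencode percent-encodes reserved characters such as ":" and "/".
    return base_url + "?" + urlencode(params)


# Hypothetical resolver location; URI_7 is the placeholder used on the slides.
url = dip_request_url("http://repository-n.example.org/openurl", "URI_7", "didl")
print(url)

# Decoding the query again recovers the original key/value pairs.
print(parse_qs(urlsplit(url).query))
```

Swapping `"didl"` for `"mets"` or `"rdf"` changes only the `svc_id` value, which is what keeps the conceptual interface the same across representation formats.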

Repository Interop Interface 2: OpenURL & CO
Diagram labels: repository_n; OpenURL baseURL_o
OpenURL:
o independent of nature of identifiers
o 'resolution' independent of scheme-specific mechanisms
o conceptual interface is persistent over time
o KEV & HTTP, XML & SOAP, ...

OpenURL interface to OAIS (Jeroen Bekaert)

Part 2: Requirements for an infrastructure supporting a federation of repositories

Repository Registry: Who is part of the Federation?
Diagram labels: Repository Registry; register
Per Repository:
o Repository identifier
o baseURL of OAI-PMH interface
o baseURL of OpenURL interface
o any other information that helps downstream applications understand the nature of the repository
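The per-repository record could look like the sketch below. Everything here is an assumption made for illustration: the class names `RepositoryEntry` and `RepositoryRegistry`, the field names, and the example URLs are hypothetical, and an in-memory dictionary stands in for whatever shared service a federation would actually run.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RepositoryEntry:
    repository_id: str
    oai_pmh_base_url: str  # Interop Interface 1
    openurl_base_url: str  # Interop Interface 2
    description: str = ""  # free-form hints for downstream applications


class RepositoryRegistry:
    """In-memory stand-in for a shared Repository Registry (hypothetical)."""

    def __init__(self) -> None:
        self._entries: Dict[str, RepositoryEntry] = {}

    def register(self, entry: RepositoryEntry) -> None:
        """Add a repository to the federation; identifiers must be unique."""
        if entry.repository_id in self._entries:
            raise ValueError(f"already registered: {entry.repository_id}")
        self._entries[entry.repository_id] = entry

    def lookup(self, repository_id: str) -> RepositoryEntry:
        return self._entries[repository_id]

    def members(self) -> List[str]:
        """Answer 'who is part of the federation?' as a sorted list of ids."""
        return sorted(self._entries)


registry = RepositoryRegistry()
registry.register(RepositoryEntry(
    "repository_a", "http://a.example.org/oai", "http://a.example.org/openurl",
    "preprint repository"))
registry.register(RepositoryEntry(
    "repository_b", "http://b.example.org/oai", "http://b.example.org/openurl"))
print(registry.members())
```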

Object Registry: What is part of the Federation?
Diagram labels: Object Registry; harvest (identifiers); SRU, SRW, handle
Per compound object (for the object itself, and for its contained objects):
o Object identifier
o Object datetime ~ OAI-PMH datestamp
o OAI-PMH identifier
o Repository identifier

OAI-PMH & OpenURL access to objects in federation
Diagram labels: Object Registry; Repository Registry; URI_7, URI_3, URI_9; SRU, SRW, handle
URI_7 => List of existing copies
Per copy:
o OAI-PMH access info
o OpenURL access info
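The lookup from an object identifier to its copies can be sketched as a join between the two registries. This is a minimal illustration, not the federation's actual design: the slides mention SRU/SRW and the Handle system as possible lookup interfaces, whereas the sketch uses plain in-memory tables, and all names, identifiers, datestamps and URLs in it are hypothetical.

```python
from typing import Dict, List, NamedTuple


class ObjectRecord(NamedTuple):
    object_id: str       # identifier of a compound object or a contained object
    datetime: str        # ~ OAI-PMH datestamp
    oai_identifier: str  # OAI-PMH identifier of the record that carries it
    repository_id: str


# Stand-in for the Repository Registry (hypothetical entries).
REPOSITORIES: Dict[str, Dict[str, str]] = {
    "repository_a": {"oai_pmh": "http://a.example.org/oai",
                     "openurl": "http://a.example.org/openurl"},
    "repository_b": {"oai_pmh": "http://b.example.org/oai",
                     "openurl": "http://b.example.org/openurl"},
}

# Stand-in for the Object Registry, as filled by harvesting identifiers.
OBJECT_RECORDS: List[ObjectRecord] = [
    ObjectRecord("URI_7", "2005-09-01T00:00:00Z", "oai:a.example.org:7", "repository_a"),
    ObjectRecord("URI_3", "2005-09-01T00:00:00Z", "oai:a.example.org:7", "repository_a"),
    ObjectRecord("URI_7", "2005-10-01T00:00:00Z", "oai:b.example.org:42", "repository_b"),
]


def locate_copies(object_id: str) -> List[Dict[str, str]]:
    """Return OAI-PMH and OpenURL access info for every known copy of an object."""
    copies = []
    for record in OBJECT_RECORDS:
        if record.object_id != object_id:
            continue
        repo = REPOSITORIES[record.repository_id]
        copies.append({
            "repository": record.repository_id,
            "datetime": record.datetime,
            "oai_pmh_base_url": repo["oai_pmh"],
            "oai_identifier": record.oai_identifier,
            "openurl_base_url": repo["openurl"],
        })
    return copies


for copy in locate_copies("URI_7"):
    print(copy["repository"], copy["datetime"], copy["openurl_base_url"])
```

Registering contained objects (here `URI_3`) alongside the compound object is what would let a lookup on a constituent lead back to the record that carries it.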

Part 3: Summary of requirements

Summary of requirements
Requirement | Repository | Infrastructure
Compound Object model support | X |
XML-based representations support | X? |
OAI-PMH CO support | X |
OpenURL CO support | X |
Repository Registry | | X
Object Registry | | X

Summary of requirements
Many variations on the design possible, yet most of this can be achieved with:
Off-the-shelf tools
o OAI-PMH tools
o Handle system, SRU/W tools
o OpenURL tools
o Tools to generate XML-based representations of objects
Surprisingly little effort
A feasible amount of coordination/specification
Some shared infrastructure

Big unaddressed issues
Rights:
o When facilitating the (re)use of materials (not just metadata), IP concerns increase significantly:
  - Data authenticity
  - Data integrity
  - Usage rights
o Need machine readable rights expressions:
  - Robots are the next generation readers
  - Even when materials are “free”
  - Object-level expressions
  - The world of CC, MPEG-21 REL, ODRL, XrML
Object relationships
o Complex, yet secondary information in the architecture

Pathways InterDisseminator Service Overlay
Pathways InterDisseminator: Dynamic Service-Oriented Overlay upon the federated architecture
Assumes the existence of:
o OpenURL Interface to all repositories in the federation
o Object Registry (given an identifier, at which OpenURL interface is the object available?)
o Availability of an RDF-based representation of DO compliant with a Pathways OWL core ontology
Is itself exposed as a different OpenURL Resolver

Pathways InterDisseminator: core ontology

Diagram labels: DSpace, Fedora, aDORe; URI_7, URI_3, URI_9; RDF; magic engine; OpenURL ContextObject Container; Interop Interface 2; OpenURL Service Overlay; OpenURL Application
Requests shown:
o baseURL_y?url_ver=Z&rft_id=URI_7&svc_id=info:pathways/boostrap
o baseURL_y?url_ver=Z&rft_id=URI_7&svc_id=info:pathways/dip.rdf

Pathways InterDisseminator Service Overlay
The OpenURL Application (part of the dissemination) is an engine that dynamically decides upon services for a given object from a repository (in a federation).
o It grabs the (RDF) representation of the DO from its origin repository
o It introspects on the properties expressed in that (RDF) representation
o It compares these properties with its knowledge database
o It returns a list of possible services/disseminations
There can be many of these engines in a federation.
The result is the ability to provide context-sensitive disseminations of DOs in (a federation of) repositories.
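The four steps of the engine can be sketched as follows. This is a toy illustration under loud assumptions: RDF is reduced to plain (subject, predicate, object) tuples instead of being parsed from a real `dip.rdf` response, the property names and service identifiers are invented, and `fetch_rdf`, `KNOWLEDGE_BASE` and `services_for` are hypothetical names that do not come from the Pathways software or its ontology.

```python
from typing import Callable, FrozenSet, List, Tuple

Triple = Tuple[str, str, str]    # (subject, predicate, object)
Property = Tuple[str, str]       # (predicate, object)


def fetch_rdf(object_id: str) -> List[Triple]:
    """Stand-in for step 1: request the RDF representation from the origin
    repository. A real engine would issue an OpenURL request; this stub
    returns canned triples with invented property names."""
    return [
        (object_id, "ex:semanticType", "ex:ScholarlyPaper"),
        (object_id, "ex:hasDatastreamFormat", "application/pdf"),
        (object_id, "ex:hasDatastreamFormat", "application/xml"),
    ]


# Knowledge database: properties a service needs, and the service it enables.
KNOWLEDGE_BASE: List[Tuple[FrozenSet[Property], str]] = [
    (frozenset({("ex:hasDatastreamFormat", "application/pdf")}),
     "info:example/svc/pdf-to-text"),
    (frozenset({("ex:semanticType", "ex:ScholarlyPaper"),
                ("ex:hasDatastreamFormat", "application/xml")}),
     "info:example/svc/citation-export"),
    (frozenset({("ex:hasDatastreamFormat", "image/tiff")}),
     "info:example/svc/thumbnail"),
]


def services_for(object_id: str,
                 fetch: Callable[[str], List[Triple]] = fetch_rdf) -> List[str]:
    """Return the services whose required properties the object satisfies."""
    triples = fetch(object_id)                       # step 1: grab representation
    properties = {(p, o) for (_s, p, o) in triples}  # step 2: introspect properties
    return [service                                  # step 4: list of services
            for required, service in KNOWLEDGE_BASE
            if required <= properties]               # step 3: compare with knowledge


print(services_for("URI_7"))
```

Since each engine carries its own knowledge database, two engines in the same federation could return different service lists for the same object, which is one reading of "context-sensitive" here.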

Diagram labels: DSpace, Fedora, aDORe; URI_7, URI_3, URI_9; RDF; service execution engine; web service
Requests shown:
o baseURL_y?url_ver=Z&rft_id=URI_7&svc_id=info:magic/justdoit
o baseURL_y?url_ver=Z&rft_id=URI_7&svc_id=info:pathways/dip.rdf

Pathways InterDisseminator Demo
aDORe Digital Object in Demo | Type | MIME | Identifier
Digital Object | scholarly paper | N/A | DOI
Constituent Datastream 1 | metadata record | application/xml (MARCXML) | aDORe datastream id (info URI)
Constituent Datastream 2 | metadata record | application/xml (original metadata) | aDORe datastream id (info URI)
Constituent Datastream 3 | fulltext file | application/pdf | aDORe datastream id (info URI)

Demo
o Install TSCC codec
o Launch movie Pathways_InterDisseminator.avi in same path as this presentation

Comments, Flames, Questions