The OAI-ORE Project DLF Spring Forum 2007, Pasadena CA, April 25, 2007 Nelson, Lagoze & Van de Sompel The Open Archives Initiative Object Re-Use & Exchange.


1 The OAI-ORE Project DLF Spring Forum 2007, Pasadena CA, April 25, 2007 Nelson, Lagoze & Van de Sompel
The Open Archives Initiative Object Re-Use & Exchange (ORE) Project
Michael L. Nelson (1), Herbert Van de Sompel (2), Carl Lagoze (3)
(1) Computer Science, Old Dominion University
(2) Research Library, Los Alamos National Laboratory
(3) Information Science, Cornell University
ORE is supported by the Andrew W. Mellon Foundation, with additional support from the National Science Foundation

2 General information about OAI-ORE

3 OAI Object Re-Use and Exchange
OAI-ORE is a new effort conducted under the umbrella of the OAI
Supported by the Andrew W. Mellon Foundation; additional support from the National Science Foundation
International effort; October 2006 - September 2008
http://www.openarchives.org/ore/

4 Meeting in NYC, April 20-21 2006
Supported by Microsoft, Mellon Foundation, Coalition for Networked Information, Digital Library Federation, JISC
Representatives from institutional Repository projects, scholarly content Repositories, Registry projects, various projects that touch on interoperability
See http://msc.mellon.org/Meetings/Interop/ for Agenda, Participants, Topics & Goals, Terminology, Presentations, Prototype demonstration, Meeting Report.

5 OAI Object Re-Use and Exchange
OAI-ORE project organization:
o Coordinators: Carl Lagoze & Herbert Van de Sompel
o ORE Advisory Committee
o ORE Technical Committee
o ORE Liaison Group

6 ORE Technical Committee
Les Carr - University of Southampton (UK)
Leigh Dodds - Ingenta (UK)
Tim DiLauro - Johns Hopkins University
Dave Fulker - University Corporation for Atmospheric Research
Tony Hammond - Nature Publishing Group (UK)
Richard Jones - Imperial College (UK)
Peter Murray - OhioLINK
Michael Nelson - Old Dominion University
Ray Plante - National Center for Supercomputing Applications
Pete Johnston - Eduserv Foundation (UK)
Rob Sanderson - University of Liverpool (UK)
Simeon Warner - Cornell University
Jeff Young - OCLC

7 ORE Liaison Group
Leonardo Candela - EC DRIVER
Tim Cole - UIUC; for DLF Aquifer
Julie Allinson - UKOLN; for the JISC Digital Repository support effort (substituting for Rachel Heery)
Jane Hunter - University of Queensland; for Australian Department of Education, Science and Technology
Savas Parastatidis - Microsoft
Thomas Place - University of Tilburg; for DARE (soon to be renamed SurfShare)
Andy Powell - Eduserv; for the DC community
Rob Tansley - Google; for Google and DSpace

8 ORE Advisory Committee
Sayeed Choudhury - Johns Hopkins University
Gregory Crane - Tufts University
Lorcan Dempsey - OCLC
Mark Doyle - The American Physical Society
John Erickson - Hewlett-Packard Laboratories
Steve Griffin - National Science Foundation
Robert Hanisch - Space Telescope Science Institute
Jane Hunter - The University of Queensland
Clifford Lynch (chair) - Coalition for Networked Information
Liz Lyon - UKOLN
Peter Murray-Rust - University of Cambridge
Jim Ostell - National Center for Biotechnology Information
Sandy Payette - Cornell University
Robby Robson - Eduworks
MacKenzie Smith - MIT Libraries
Leo Waaijers - SURF Platform ICT and Research

9 Context of OAI-ORE Standards & Protocols

10 OAI: It's Not Just for Metadata Harvesting Anymore…
OAI-PMH              | OAI-ORE
Repository structure | Object structure
Metadata centric     | Resource centric
Metadata harvesting  | Object re-use (obtain, harvest, register)
OAI-PMH and OAI-ORE are complementary:
o you can do one without the other
o you can do them together

11 An Early Formulation of the Problem
First noticed in how people would populate their Dublin Core records
o people need the HTML splash page
o crawlers need the PDF file
Ad-hoc conventions and methods used to expose the repository’s knowledge about the structure of the object
Next three slides taken from “Resource Harvesting Within the OAI-PMH Framework”
o http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html

12 Dublin Core Encoding Type 1
A Simple Parallel-Plate Resonator Technique for Microwave. Characterization of Thin Resistive Films
Vorobiev, A.
ING-INF/01 Elettronica
A parallel-plate resonator method is proposed for non-destructive characterisation of resistive films used in microwave integrated circuits. A slot made in one...
Microwave engineering Europe
2002
Documento relativo ad una Conferenza o altro Evento
PeerReviewed
http://amsacta.cib.unibo.it/archive/00000014/
pdf
http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf
(callouts: locator of resource; splash page)

13 Dublin Core Encoding Type 2
…
http://amsacta.cib.unibo.it/archive/00000014/
http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf
…
(callouts: locator of resource; splash page)

14 Dublin Core Encoding Type 3
…
http://amsacta.cib.unibo.it/archive/00000014/
http://resolver.unibo.it/00000014/
http://amsacta.cib.unibo.it/archive/00000014/01/GaAs_1_Vorobiev.pdf
…
(callouts: locator of resource; splash page)

15 And more recently …
“Are repositories successfully exposing the full-text of articles (the PDF file or whatever) to Google rather than (or as well as) the abstract page?”
“Are we consistent in the way we create hypertext links between research papers in repositories?”
(from Andy Powell’s eFoundations blog)

16 As the objects get more complex, things get worse
Rather than continue down that path, let’s back up and restart…

17 Compound Information Objects
Units of scholarly communication are compound information objects: identified, bounded aggregations of related information units that form a logical whole.
Components of a compound object may vary according to:
o Semantic type: book, article, moving image, dataset, …
o Media type: PDF, HTML, JPEG, MP3, …
o Internal relationships: parts, views, …
o External relationships
[Figure labels: compound information objects; id]

18 Access Repositories
Compound objects are made accessible by a variety of scholarly repositories:
o Institutional repositories
o Discipline-oriented repositories
o Publisher repositories
o Dataset repositories
o Cultural heritage repositories
o Learning object repositories
o Digitized book and manuscript collections
o Research-group and managed personal (ePortfolio) repositories
o …

19 Access Repositories
Repositories expose compound objects in manners specific to the repository architecture:
o Interfaces (API & user-oriented)
o Identification schemes
o Representation of compound objects
o Mapping of compound objects and components to the Web

20 Their Structure is Obfuscated When Mapped to the Web

21 Structure Can Be Even Harder to Infer When Server/Domain Boundaries are Crossed
http://foo.edu/repo1/object12/index.html
http://foo.edu/repo1/object12/object12.pdf
http://foo.edu/repo1/object12/metadata.dc
http://foo.edu/repo1/object12/errata.html
http://foo.edu/repo1/object12/index.html
http://blurple.org/service?citing-author=Nelson
http://blurple.org/service?citing-paper=object12
http://bar.edu/~mln/jcdl-2007.pdf
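
The point of this slide can be illustrated with a small sketch. It assumes a naive client that guesses object boundaries purely from URL layout, grouping the slide's example URLs by host and directory; the function name `group_by_directory` is hypothetical and not part of any OAI-ORE material.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Example URLs taken from the slide (the duplicate index.html is listed once).
URLS = [
    "http://foo.edu/repo1/object12/index.html",
    "http://foo.edu/repo1/object12/object12.pdf",
    "http://foo.edu/repo1/object12/metadata.dc",
    "http://foo.edu/repo1/object12/errata.html",
    "http://blurple.org/service?citing-author=Nelson",
    "http://blurple.org/service?citing-paper=object12",
    "http://bar.edu/~mln/jcdl-2007.pdf",
]


def group_by_directory(urls):
    """Naive heuristic (an assumption for illustration): resources that
    share a host and a directory are guessed to belong to one object."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        directory = parts.path.rsplit("/", 1)[0]
        groups[(parts.netloc, directory)].append(url)
    return dict(groups)


groups = group_by_directory(URLS)
for key, members in groups.items():
    print(key, len(members))
```

The heuristic recovers the four co-located foo.edu resources as one group, but it has no way to tell that the blurple.org services and the bar.edu PDF relate to the same object. That gap is what an explicit, published description of the object would have to fill.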

22 Fun CDO Example: Flickr
we’d “href” to: http://www.flickr.com/photos/73977402@N00/162521629/
but “img src” to: http://farm1.static.flickr.com/62/162521629_f988d1e5fa.jpg
[Figure labels: public + private tags (service links); Peers]

23 Scholarly CDO Example: CiteSeer
http://citeseer.ist.psu.edu/lagoze01open.html (with semantics); http://citeseer.ist.psu.edu/500650.html (without)
[Figure labels: Peers; Representations; Original, remote version; Representation]

24 Scholarly CDO Examples: arXiv
http://arxiv.org/abs/astro-ph/0611775
[Figure labels: Representations; Service Links; Locally held versions; Remotely held version]

25 More Scholarly Compound Digital Object Possibilities
o An issue of an overlay journal built from distributed ePrints
o eScience resource combining text, data, simulations
o eHumanities resource combining primary and derived content

26 Systems that manage digital objects
o Institutional repositories
o Discipline-oriented repositories
o Publisher repositories
o Dataset repositories
o Cultural heritage repositories
o Learning object repositories
o Digitized book and manuscript collections
o Image repositories
o …
Systems that leverage managed digital objects
o All repositories from left column
o Search engines
o Authoring tools
o Citation management tools
o Collaborative environments
o Social network applications
o Graph analysis tools
o Preservation services
o Workflow tools
o …
[Figure label: OAI-ORE Standards, Protocols]

27 OAI Object Re-Use and Exchange
Develop, identify, and profile extensible standards and protocols to allow repositories, agents, and services to interoperate in the context of use and reuse of compound digital objects beyond the boundaries of the holding repositories.
Aim for more effective and consistent ways:
o to facilitate discovery of these objects,
o to reference (link to) these objects (and parts thereof),
o to obtain a variety of disseminations of these objects,
o to aggregate and disaggregate these objects,
o to enable processing by automated agents.

28 Taking the Web perspective

29 Working with the web architecture
Whatever we do must be congruent with the web architecture
o Use existing capabilities where they are appropriate
o Cleanly layer capabilities meeting the needs of our problem space
Provide the infrastructure for web-based information systems that exploit/enhance and therefore overlay on the existing web.

30 ORE: An Interoperability Layer
A projection of private object structure into the public web, using the web architecture:
o URIs that identify
o resources, which are “items of interest”, that,
o when accessed through standard protocols such as HTTP, return
o representations of current resource state
o and which are linked via URI references
o thus forming the graph that is the Web.

31 W3C Web Architecture
[Diagram: a URI identifies a Resource; Representation 1 and Representation 2 each represent the Resource; Content Negotiation]

32 W3C Web Architecture: more details
Resource:
o First-class object
o Linkable
Representation:
o Second-class object (identified only in context of resource)
o Not linkable
o Many representations/resource
Relationship:
o Usually untyped
o Link type ontologies not standardized
Aggregation:
o No standard way to describe finite set of resources and relationships

33 Compound Object
id: astro-ph/0611775
Article in PDF; Article in PS; Splash page in HTML; Metadata in DC
Multiple Views, diverging in media-type, format, and content-type

34 More complexity …
id: astro-ph/0611775
Article in PDF; Article in PS; Splash page in HTML; Metadata in DC
[Figure labels: id hasPart; id hasRelationshipTo; boundary, logical unit; local, remote; lineage, version, citation, etc.]

35 Compound Object
id: astro-ph/0611775
Article in PDF; Article in PS; Splash page in HTML; Metadata in DC
Let’s publish it to the Web

36 Resource 1: http://arxiv.org/astro-ph/0611775/article/ (Article in PDF; Article in PS)
Resource 2: http://arxiv.org/astro-ph/0611775/splash/ (Splash page in HTML)
Resource 3: http://arxiv.org/astro-ph/0611775/meta/DC/ (DC meta XML; DC meta HTML)
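
Resource 1 on this slide has two representations (PDF and PostScript) behind a single URI, which is the content negotiation case from the web architecture slides. A minimal sketch of how a client might ask for one or the other follows; it only builds the requests and sends nothing. The article URI is the illustrative one from the slide, and the helper name `negotiated_request` is an assumption.

```python
from urllib.request import Request

# Illustrative URI from the slide; it is not claimed to be a live endpoint.
ARTICLE_URI = "http://arxiv.org/astro-ph/0611775/article/"


def negotiated_request(uri, media_type):
    """Build (but do not send) a GET request that states the preferred
    representation through the standard HTTP Accept header."""
    return Request(uri, headers={"Accept": media_type})


pdf_request = negotiated_request(ARTICLE_URI, "application/pdf")
ps_request = negotiated_request(ARTICLE_URI, "application/postscript")

# Same resource URI, different requested representation.
print(pdf_request.full_url, pdf_request.get_header("Accept"))
print(ps_request.full_url, ps_request.get_header("Accept"))
```

Both requests target the same URI, so a link to that URI cannot by itself say "the PDF version". This is the citation problem raised on the following slides.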

37 Compound Digital Object mapped to the Web
“Are repositories successfully exposing the full-text of articles (the PDF file or whatever) to Google rather than (or as well as) the abstract page?”
o Discovery: How does Google find all these resources that originate from the same digital object?
o Boundary: How does Google know these resources originate in the same digital object?

38 Compound Digital Object mapped to the Web
“Are we consistent in the way we create hypertext links between research papers in repositories?”
o Citation: Which Resource to link to?
o Citation: How to reference the PDF version (and not the PS version)?

39 Thoughts about a possible approach

40 Observation 1
Components of a compound object must be published as resources in order to be referenceable

41 Observation 2
The object “as such” (boundary, structure, relationships) is invisible to Web applications

42 Observation 2 bis
How about publishing a resource that makes a Resource Map available that formally expresses the boundaries of the object?
[Figure label: Machine readable Resource Map]

43 Observation 3
And now facilitate discovery of the Resource Map (and hence of the compound object) by Web applications
[Figure label: HTTP LINK HEADER]
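
As a rough illustration of discovery through an HTTP Link header, the sketch below parses a header value and picks out the target that points at a Resource Map. Everything specific here is an assumption: the slides do not define a relation name, so `rel="resource-map"` is hypothetical, as is the `/map/` URI, and the parser is deliberately simple (it does not handle commas or semicolons inside quoted parameter values).

```python
import re

# Matches "<target>" followed by zero or more ";param=value" pairs.
LINK_PATTERN = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def parse_link_header(value):
    """Return a list of (target, params) tuples from a Link header value."""
    links = []
    for target, params in LINK_PATTERN.findall(value):
        attrs = {}
        for param in params.split(";"):
            if "=" in param:
                key, _, val = param.partition("=")
                attrs[key.strip().lower()] = val.strip().strip('"')
        links.append((target, attrs))
    return links


def find_resource_map(value, rel="resource-map"):
    """Return the first link target with the (hypothetical) rel, else None."""
    for target, attrs in parse_link_header(value):
        if rel in attrs.get("rel", "").split():
            return target
    return None


# Hypothetical header a splash page might send.
HEADER = (
    '<http://arxiv.org/astro-ph/0611775/map/>; rel="resource-map"; '
    'type="application/xml", '
    '<http://arxiv.org/astro-ph/0611775/splash/>; rel="alternate"'
)
print(find_resource_map(HEADER))
```

A crawler that does not know the relation simply ignores the header, which fits the requirement that existing harvesters and crawlers keep working.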

44 Observation 4 bis
Through the Resource Map, the Web application sees the compound object

45 Observation 5
This approach reveals compound objects in the Web graph

46 Resource Map available from ORE resource
Expresses an aggregation of resources and relationships in a machine-readable manner.
Describes a graph:
o finite set of resources and relationships among the resources
o relationships among resources that are members of the aggregation and resources that are external to the aggregation
Can be used to express:
o Our scholarly compound objects
o Whichever aggregation of resources and relationships
Having a standardized format for Resource Maps opens the door to “graph publishing” (cf. Semantic Web notion).
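
To make the graph description concrete, here is a minimal in-memory sketch of a Resource Map as a finite set of member resources plus typed relationships, some of which reach outside the aggregation. It is not an ORE format (none had been specified at the time of this talk); the class name, method names, and the `/map/` URI are hypothetical, while `hasRelationshipTo` and the component URIs are borrowed from earlier slides.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceMap:
    """Hypothetical model: an aggregation of resources and relationships."""
    uri: str
    members: set = field(default_factory=set)
    relationships: set = field(default_factory=set)  # (subject, predicate, object)

    def add_member(self, resource_uri):
        self.members.add(resource_uri)

    def relate(self, subject, predicate, obj):
        # At least one end of a relationship must lie inside the aggregation.
        if subject not in self.members and obj not in self.members:
            raise ValueError("relationship does not touch the aggregation")
        self.relationships.add((subject, predicate, obj))

    def external_resources(self):
        """Resources that are referenced but are not members."""
        ends = set()
        for subject, _, obj in self.relationships:
            ends.update((subject, obj))
        return ends - self.members


BASE = "http://arxiv.org/astro-ph/0611775"
rmap = ResourceMap(uri=BASE + "/map/")  # hypothetical Resource Map URI
for part in ("/article/", "/splash/", "/meta/DC/"):
    rmap.add_member(BASE + part)

# One relationship inside the aggregation, one to an external resource.
rmap.relate(BASE + "/splash/", "hasRelationshipTo", BASE + "/article/")
rmap.relate(BASE + "/article/", "hasRelationshipTo",
            "http://bar.edu/~mln/jcdl-2007.pdf")
print(sorted(rmap.external_resources()))
```

Keeping membership separate from relationships is one way to express the boundary: a resource can be linked from the object without being part of it.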

47 Use and Re-Use enabled by the ORE resource
ORE resource has a URI: HTTP ORE
o let’s call that ORE resource a “Resource Map”
HTTP ORE identifies a graph (cf. Semantic Web notion Named Graph)
The Resource Map is available via HTTP GET on HTTP ORE
HTTP ORE can become the key for object re-use: Obtain, Harvest, Register (cf. Web 2.0 mash-up)
“The {Resource} Map is not the Resource” (apologies to Alfred Korzybski)
o Crawlers, agents will initially transact with the Resource Map, not the components of the resource

48 More About Resource Map Discovery
Two general approaches:
o create new resources that describe the boundary & relationships that make up the CDO
  - web crawling (cf. sitemaps)
  - new metadataPrefix in OAI-PMH repositories
  - Atom feeds
o instrument existing resources to “point” to the resources
  - HTTP content negotiation
  - HTTP headers
  - HTML “microformats”
Selective discovery
o you should never get a Resource Map unless you really asked for it; existing harvesters, crawlers will not break
o Resource Maps are for machines, not humans
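
The "selective discovery" rule can be sketched from the server side: a Resource Map is returned only when the client names its media type explicitly, and every other request falls back to the usual representation. This is a simplified illustration under stated assumptions: the media type `application/x-resource-map+xml` is made up for the example, and the function ignores q-values and wildcards rather than implementing full HTTP content negotiation.

```python
# Hypothetical media type; no Resource Map format is defined in these slides.
RESOURCE_MAP_TYPE = "application/x-resource-map+xml"

AVAILABLE = {"text/html", "application/pdf", RESOURCE_MAP_TYPE}


def select_representation(accept_header, available=AVAILABLE, default="text/html"):
    """Pick the first explicitly named media type that is available.

    Wildcards such as */* never match, so a client that did not ask for a
    Resource Map by name never receives one.
    """
    for part in accept_header.split(","):
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type in available:
            return media_type
    return default


# A typical browser or crawler request still gets the HTML page.
print(select_representation("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8"))
# Only an explicit request yields the Resource Map.
print(select_representation(RESOURCE_MAP_TYPE))
```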

49 So, where does ORE stand?

50 OAI-ORE: Current Status
Ongoing definition of the ORE framework
o Reach joint problem statement
o Issues regarding identification
o Model for ORE resource
o Publishing ORE resources to the Web
o Discovering ORE resources
Review of appropriate technologies for ORE Model and Resource Map
o ATOM
o DID/DIDL, IMS/CP, METS, Ramlet
o RDF, RDF/XML
o Dublin Core Abstract Model
o …

51 OAI-ORE: Current Status
Explore demonstrators using these concepts in preparation for the May 2007 ORE Technical Committee meeting
Post May 2007 meeting:
o Hopefully work towards alpha specs for ORE resource, Resource Map, discovery of ORE resource
o Experimentation with alpha specs

52 OAI-ORE: Afterwards
Look into core services Obtain, Harvest, Register, in terms of ORE resource and Resource Map.

53 Questions
Further information: http://www.openarchives.org/ore/

