The RMap Project: Linking the Products of Research and Scholarly Communication 2015-04-22 Tim DiLauro.

Presentation transcript:

The RMap Project: Linking the Products of Research and Scholarly Communication Tim DiLauro

Motivation Compound objects are fast becoming the norm for outputs of scholarly communication. In many circumstances, the traditional article is not the object of long-term interest for at least some segment of the community. Components may reside in different repositories, maintained by different institutions, employing different technologies. – Some of these components and their repositories are not part of the traditional scholarly communication ecosystem. There is acknowledgement that these objects do not stand alone, and of a broad need to understand their context.

Research Partnership Data Conservancy: Expertise in management of large data archives from multiple disciplines IEEE: Expertise in management of data-intensive scholarly journal publications Portico: Expertise in digital preservation, publisher workflow requirements, and existing relationships with 275 publishers Funding from the Alfred P. Sloan Foundation

Some High-Level Goals RMap tool working prototype Collaborative partnerships with the community System that supports emerging forms of digital scholarship and publishing Plan for sustainability of the project

Work Plan Year One—Planning Phase: Gather requirements, create use cases, hold workshop with stakeholders, refine use scenarios based on community feedback [You are here] Year Two—Prototype Development: Create system to identify, store, update, and retrieve relationships among publications and other forms of scholarly output, including data and software

TECHNOLOGY The RMap Project

Key Objectives Support assertions from broad set of contributors Integrate with Linked Data Leverage data from existing scholarly publishing stakeholders (publishers, identifier providers, data and software repositories) Provide support for agents and other resources without identifiers (authors, textual citations)

Data Model (simplified)

Data Model - Resource Things (abstract or concrete) that can have an identifier Basic building block of the WWW Key entity for description and retrieval within RMap Other core entities in the data model are also Resources

Data Model - Agent A person or thing (or group of these) responsible for some action Distinction between scholarly (e.g., author, funder, publisher, data processing program) and system (RMap component, user, etc.)

Data Model - Event Capture provenance within RMap system An action or activity involving System Agents and other resources Provenance of Scholarly Resources can be captured separately by registering it in RMap via DiSCOs.

Data Model – RDF Statement (triple) Building blocks of the semantic web Conceptually of the form: subject, predicate, object Like subject-verb-object in English
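As a rough illustration of the subject-predicate-object shape, the sketch below models a single statement as a Python named tuple and prints it as one N-Triples line. The `Triple` class, the `to_ntriples` helper, and the `example.org` identifiers are made up for this example and are not part of RMap; only the Dublin Core `creator` predicate is a real vocabulary term.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str  # named "obj" only to avoid shadowing Python's builtin "object"


def to_ntriples(t: Triple) -> str:
    """Render a triple whose three terms are all URIs as one N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


# Hypothetical identifiers, for illustration only.
stmt = Triple(
    subject="http://example.org/article/A-1",
    predicate="http://purl.org/dc/terms/creator",
    obj="http://example.org/agent/C-1",
)
print(to_ntriples(stmt))
```

Read aloud, the statement says "article A-1 has creator C-1", which is the subject-verb-object reading the slide describes.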

Data Model - DiSCO Distributed Scholarly Compound Object Primary unit of registration within RMap Basically a set of resources and related RDF description. Similar to OAI-ORE
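One way to picture "a set of resources and related RDF description" is a small container that holds aggregated resource URIs alongside the statements that describe them. This is a minimal sketch under that reading; the `DiSCO` class below, its field names, and the `example.org` URIs are assumptions for illustration, not RMap's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DiSCO:
    """Hypothetical in-memory stand-in for a DiSCO, not RMap's real model."""
    disco_id: str
    aggregated: set[str] = field(default_factory=set)  # resource URIs
    statements: set[tuple[str, str, str]] = field(default_factory=set)

    def add_statement(self, s: str, p: str, o: str) -> None:
        """Attach one (subject, predicate, object) description."""
        self.statements.add((s, p, o))

    def resources(self) -> set[str]:
        """Aggregated resources plus every statement subject."""
        return self.aggregated | {s for s, _, _ in self.statements}


# Hypothetical identifiers: an article that references a dataset.
disco = DiSCO(
    disco_id="http://example.org/disco/1",
    aggregated={"http://example.org/dataset/D-2"},
)
disco.add_statement(
    "http://example.org/article/A-1",
    "http://purl.org/dc/terms/references",
    "http://example.org/dataset/D-2",
)
print(sorted(disco.resources()))
```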

Data Model - DiSCO

Create dataset D-1

Create software S-1

Generate dataset D-2

Article related to D-2

Creation of software S-2

Generation of dataset D-3

Article A-2 related to D-3

Correct article identifier

Dataset D-1 connections

Creator C-1 connections

Associate resources with C-1 identity

Associate resources with more identities

RESTful APIs Programming language independent Easy to test with web tools (curl, wget) Abstraction away from underlying implementations and models, which we expect to change more often

REST APIs (subset)

Function                     HTTP verb  API rel path (base=/api/{version})
Retrieve related triples     GET        /{resourceURI}/stmts
Retrieve related events      GET        /{resourceURI}/events
Retrieve related DiSCOs      GET        /{resourceURI}/discos
Create DiSCO                 POST       /disco
Retrieve DiSCO               GET        /disco/{discoId}
Update DiSCO                 POST       /disco/{discoId}/update
Delete a DiSCO               DELETE     /disco/{discoId}/delete
Retrieve an Event            GET        /event/{eventId}
Get DiSCOs related to event  GET        /event/{eventId}/discos
Perform SPARQL query         POST       /sparql
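The verb and path pairs listed for the REST APIs can be turned into concrete requests roughly as follows. This sketch only builds the verb and path; it sends nothing. The route keys, the `build_request` helper, the `v1` version string, and the choice to percent-encode URIs placed in the path are all assumptions, since the slides do not say how `{version}` is written or how `{resourceURI}` is encoded.

```python
from urllib.parse import quote

API_BASE = "/api/{version}"

# Verb and relative path per function, copied from the slide's subset.
ROUTES = {
    "related_triples": ("GET", "/{resourceURI}/stmts"),
    "related_events": ("GET", "/{resourceURI}/events"),
    "related_discos": ("GET", "/{resourceURI}/discos"),
    "create_disco": ("POST", "/disco"),
    "retrieve_disco": ("GET", "/disco/{discoId}"),
    "update_disco": ("POST", "/disco/{discoId}/update"),
    "delete_disco": ("DELETE", "/disco/{discoId}/delete"),
    "retrieve_event": ("GET", "/event/{eventId}"),
    "event_discos": ("GET", "/event/{eventId}/discos"),
    "sparql": ("POST", "/sparql"),
}


def build_request(function: str, version: str = "v1", **params: str) -> tuple:
    """Return (HTTP verb, path) for one API function.

    Path parameters are percent-encoded so that a URI can sit inside a
    path segment; that encoding rule is an assumption of this sketch.
    """
    verb, template = ROUTES[function]
    encoded = {name: quote(value, safe="") for name, value in params.items()}
    return verb, API_BASE.format(version=version) + template.format(**encoded)


print(build_request("retrieve_disco", discoId="abc"))
print(build_request("related_triples", resourceURI="http://example.org/D-1"))
```

The resulting pair can be handed to curl, wget, or any HTTP client, which fits the point that the API is programming language independent.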

API Specification and Documentation
– Behaviors
– API paths
– Data Models
– Serializations (media types, content negotiation)
– Implementations

API Description (simplified)
Function: Update DiSCO
Behavior within RMap
– Failed requests will be rolled back, so as not to require manual cleanup (transaction)
– Insufficient authorization will result in failed transaction and offer to authenticate with other credentials.
– A new DiSCO will be instantiated; the previous (old) DiSCO will be marked “inactive”
– Add triple
– Resources will be instantiated for objects without identifiers (e.g., citation as string)
– Scholarly Agents will be instantiated for agents lacking URIs (e.g., creator as string)
– Event(s) created capture activity
Request
– Verb/relative path: POST /disco/{id}/update
– Path parameters: {id} - URI of existing (old) DiSCO
– Model: Resources + relationships (like OAI-ORE)
– Serializations: RDF/XML, Turtle, or JSON-LD
Response
– Model: (custom)
– Serializations: JSON, HTML
– New DiSCO URI in header: Location:
– Old DiSCO URI in header: Link ;rel=“predecessor-version”
– Event URI(s) in header: Link ;rel=“
– [Enumerate response codes, labels, and their meanings]
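The update behavior described above (roll back on failure, instantiate a new DiSCO, mark the old one inactive, record an Event) can be sketched with a toy in-memory store. Everything in this block is hypothetical: the `InMemoryRMap` class, its method names, the `disco-N` identifiers, and the `ex:` terms are invented for illustration. Authorization and the minting of Resources or Agents for string values are left out.

```python
import copy
import itertools


class InMemoryRMap:
    """Toy stand-in for the update behavior; every name here is hypothetical."""

    def __init__(self):
        self.discos = {}  # id -> {"triples": set of tuples, "active": bool}
        self.events = []  # dicts recording what each request did
        self._ids = itertools.count(1)

    def create(self, triples):
        disco_id = f"disco-{next(self._ids)}"
        self.discos[disco_id] = {"triples": set(triples), "active": True}
        self.events.append({"type": "create", "created": disco_id})
        return disco_id

    def update(self, old_id, triples):
        # Snapshot state first so a failed request needs no manual cleanup.
        snapshot = (copy.deepcopy(self.discos), list(self.events))
        try:
            old = self.discos[old_id]
            if not old["active"]:
                raise ValueError(f"{old_id} is not active")
            # An update never edits in place: it creates a successor.
            new_id = f"disco-{next(self._ids)}"
            self.discos[new_id] = {"triples": set(), "active": True}
            old["active"] = False
            for triple in triples:
                if len(triple) != 3:
                    raise ValueError(f"not a triple: {triple!r}")
                self.discos[new_id]["triples"].add(tuple(triple))
            self.events.append(
                {"type": "update", "inactivated": old_id, "created": new_id}
            )
            return new_id
        except Exception:
            # Roll back to the snapshot (the id counter is not rewound).
            self.discos, self.events = snapshot
            raise


rmap = InMemoryRMap()
v1 = rmap.create([("ex:A-1", "ex:relatedTo", "ex:D-2")])
v2 = rmap.update(v1, [("ex:A-1", "ex:relatedTo", "ex:D-3")])
print(v1, rmap.discos[v1]["active"])
print(v2, rmap.discos[v2]["active"])
```

Because the old DiSCO is kept and only flagged inactive, the event log alone is enough to trace which version replaced which.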

API Coverage
Current focus on APIs to populate and access the graph
Future focus
– Authentication
– Administration
– Composition & normalization
– Inferencing
– System operability

Technical Team Activity
Developed and captured initial set of use cases
Developed and documented initial data model
Specified API behaviors
Developed and documented API methods, including REST paths, request and response formats, models, and serializations (media types)
– Still a couple issues to sort out
Prototype platform implementation
Participation in RDA Data Publishing groups
Actively working on harvesting relationship data to push into RMap

Harvesting links and proxy registration

Community Engagement The RMap Project

Workshop: Key Feedback RMap Project should be a clearinghouse or meta-service that captures information about various data-linking services Important to add value to the publication & data linking work already underway in the community Having an established publisher as a research partner is a comparative advantage for the RMap Project

Workshop Feedback (continued) One approach would be to focus on the “input” side of the process (with special attention for software and research workflows) in order to create a generalizable approach to gathering content The challenge of “secondary data”, such as the inferred connections between publications and data or software, remains unaddressed and important

Some of the things you can do for RMap Feedback – Do the articulated use cases, approach, goals, and proposed offerings align with your interests? Where they don’t, how could we better align? Share Your Data – As we populate our prototype, we need to gather a broad swath of test data, covering a variety of resource types (e.g., journals, repositories, funders, creators, articles, data, software, instruments, samples) and the relationships that connect them. Use – Consider using RMap capabilities to register, discover connections to, and augment your own content, once those capabilities become available.

Some of the things RMap will do for you Aggregate and offer an inclusive and normalized view of distributed scholarly compound objects and associated resource relationships, including those from sources without membership in existing identity services (e.g., source code management platforms, institutional repositories). – Reduce cost and complexity of transforming information from multiple systems. Provide a single mechanism to discover context (e.g., relationships and related resources) for scholarly objects in which you are interested. – Reduce cost and complexity of developing and managing multiple interfaces for multiple systems. Expose records of a particular statement (e.g., who has asserted that Resource X was created by Agent Y?) or the history of assertions associated with a particular resource (i.e., what has been said about Resource X?). Capture sufficient provenance information to allow evaluation of assertions by their source and content. – Streamline logic for automatic integration of citation and reference to objects of interest.

Team Members and Acknowledgements Sayeed Choudhury, Tim DiLauro: Data Conservancy, Johns Hopkins Mark Donoghue, Gerry Grenier, Renny Guida, Ken Rawson: IEEE Vinay Cheruku, Karen Hanson, Amy Kirchhoff, John Meyer, Sheila Morrissey, Stephanie Orphan, Jabin White, Kate Wittenberg: Portico This research project is made possible through generous support from the Alfred P. Sloan Foundation We thank our workshop participants for their valuable feedback

Q&A For more information, please visit: –