Using the RDA Collections API to Shape Humanities Data


1 Using the RDA Collections API to Shape Humanities Data
Bridget Almas, Frederik Baumgardt (Perseids Project, Tufts University); Tobias Weigel (DKRZ); Thomas Zastrow (MPCDF)
DH 2017, Montreal

2 Session Objectives
@resdatall @PerseidsProject
- Introduce the RDA Collections API: what and why?
- Demonstrate how we have applied it to manage Perseids humanities data
- Offer examples of how it is being applied in other disciplines
- Get your feedback on applicability of this work for other collections of cultural heritage data

3 RDA Collections API - What?
- An abstract data model for machine-actionable data collections
- Model-agnostic: can work with various collections models, both RDF- and non-RDF-based
- An HTTP-based API
- Create/Read/Update/Delete/List (CRUD/L) operations
- Advanced set-based operations
- Extensible via Service Features and Collection Capabilities
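To make the CRUD/L idea concrete, here is a minimal in-memory sketch. It is not the RDA specification or any implementation of it: the `Collection` and `CollectionStore` classes, their field names, and the method names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Collection:
    """Hypothetical stand-in for a collection: an identifier, free-form
    properties, and a list of member identifiers (e.g. PIDs or URLs)."""
    id: str
    properties: Dict[str, str] = field(default_factory=dict)
    members: List[str] = field(default_factory=list)


class CollectionStore:
    """Illustrative in-memory store offering Create/Read/Update/Delete/List."""

    def __init__(self) -> None:
        self._collections: Dict[str, Collection] = {}

    def create(self, collection: Collection) -> Collection:
        if collection.id in self._collections:
            raise KeyError(f"collection {collection.id!r} already exists")
        self._collections[collection.id] = collection
        return collection

    def read(self, collection_id: str) -> Collection:
        return self._collections[collection_id]

    def update(self, collection: Collection) -> Collection:
        if collection.id not in self._collections:
            raise KeyError(f"collection {collection.id!r} does not exist")
        self._collections[collection.id] = collection
        return collection

    def delete(self, collection_id: str) -> None:
        del self._collections[collection_id]

    def list_ids(self) -> List[str]:
        # Sorted so that listings are stable across calls.
        return sorted(self._collections)
```

In an HTTP-based service, each of these methods would sit behind a request handler; the sketch leaves the transport out so that only the operations themselves are visible.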

4 RDA Collections API - Why?
- Make solutions for managing collections more sustainable and widely available
- Encourage better data management practices
- Facilitate cross-collection and cross-discipline interoperability
- Enable development of common tools and services for sharing and expanding data collections across repositories and disciplines

5 RDA Collections API - Model

6 RDA Collections API - Operations
Collections:
- POST (Create)
- GET (Read and List)
- PUT (Update)
- DELETE
Operations: Intersection, Union, Flatten...
Collection Members:
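The set-based operations named on this slide can be sketched over plain Python data. This is an illustration only: the `COLLECTIONS` mapping, the identifiers in it, and the exact semantics (order-preserving results, nested collections detected by identifier) are assumptions, not behavior taken from the API specification.

```python
from typing import Dict, List, Optional, Set

# Hypothetical data: collection id -> ordered member ids. A member id that is
# itself a key in this mapping is treated as a nested collection.
COLLECTIONS: Dict[str, List[str]] = {
    "iliad": ["treebank-1", "alignment-1", "homer-annotations"],
    "homer-annotations": ["annotation-1", "annotation-2"],
    "approved": ["treebank-1", "annotation-2"],
}


def union(a: str, b: str) -> List[str]:
    """Members of either collection, without duplicates, in first-seen order."""
    return list(dict.fromkeys(COLLECTIONS[a] + COLLECTIONS[b]))


def intersection(a: str, b: str) -> List[str]:
    """Members present in both collections, in the order they appear in `a`."""
    other = set(COLLECTIONS[b])
    return [member for member in COLLECTIONS[a] if member in other]


def flatten(collection_id: str, _seen: Optional[Set[str]] = None) -> List[str]:
    """Replace nested collections by their members, recursively.

    `_seen` guards against cycles between collections.
    """
    seen = set() if _seen is None else _seen
    seen.add(collection_id)
    result: List[str] = []
    for member in COLLECTIONS[collection_id]:
        if member in COLLECTIONS:
            if member not in seen:
                result.extend(flatten(member, seen))
        else:
            result.append(member)
    return result
```

For example, `flatten("iliad")` expands the nested `homer-annotations` collection in place, so a client sees only leaf members.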

7 Implementations: Perseids Manifold (Python + LDP)
- Python/Flask implementation
- Main payload is linked data annotations
- Multiple data backends: Filesystem, RDF/LDP, MongoDB
- Deployed at
- Source at
- Demo endpoint:
- Docker image:

8 Humanities Use Case: Perseids

9 Humanities Use Case: Perseids
Eventually want to enable requests for (e.g.):
- All data created by User X
- All data approved by Community X
- All treebank data for Homer’s Iliad Book 1, Lines 1-10
- All translation alignments of Homer’s Iliad Book 1, Lines 1-10
- All semantic annotations on Vergil’s Aeneid Book 1
- And so forth...
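Requests of this kind amount to filtering collection members by their properties. The following is a rough sketch of that idea; the member records, the property names (`type`, `creator`, `work`, `book`, `lines`), and the `find_members` helper are all hypothetical and do not reflect how Perseids stores or queries its data.

```python
from typing import Any, Dict, List

# Hypothetical member records; every field name and value is illustrative.
MEMBERS: List[Dict[str, Any]] = [
    {"id": "tb-1", "type": "treebank", "creator": "user-x",
     "work": "Iliad", "book": 1, "lines": (1, 10)},
    {"id": "al-1", "type": "alignment", "creator": "user-y",
     "work": "Iliad", "book": 1, "lines": (1, 10)},
    {"id": "sem-1", "type": "semantic-annotation", "creator": "user-x",
     "work": "Aeneid", "book": 1, "lines": None},
]


def find_members(members: List[Dict[str, Any]], **criteria: Any) -> List[str]:
    """Return ids of members whose properties equal every given criterion."""
    return [
        member["id"]
        for member in members
        if all(member.get(key) == value for key, value in criteria.items())
    ]


# "All data created by User X"
by_user = find_members(MEMBERS, creator="user-x")

# "All treebank data for Homer's Iliad Book 1, Lines 1-10"
iliad_treebanks = find_members(
    MEMBERS, type="treebank", work="Iliad", book=1, lines=(1, 10)
)
```

Exact-match filtering is the simplest possible design; overlapping line ranges or community approval would need richer matching than this sketch provides.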

10 Humanities Use Case: Perseids

11 Humanities Use Case: Perseids
Demo

12 Proposed Implementation: Fedora
- Perseids implementation confirms ability to work with LDP-modelled collections
- Currently looking for stakeholders among the Fedora Community to pursue adding support for the Collections API via the Fedora API-X service

13 Implementations: Reptor
- Reptor ("Repository") is a data repository which was developed following some RDA recommendations
- Implements RDA DTR, DFT and Collection WG recommendations
- Example: Collection API calls:
  curl -X GET
  curl -X GET
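The URLs of the curl calls are missing from this transcript, so they cannot be reproduced here. As a stand-in, the sketch below shows how an equivalent GET request could be prepared with Python's standard library. `BASE_URL` is a placeholder and not Reptor's address, and the `/collections` path is an assumption about how a Collections API service might expose its listing. The request is only constructed, never sent.

```python
from urllib.request import Request

# Placeholder base URL; NOT the address of Reptor or any real service.
BASE_URL = "https://example.org/api"


def build_get(path: str) -> Request:
    """Prepare (but do not send) a GET request asking for a JSON response."""
    return Request(
        f"{BASE_URL}{path}",
        method="GET",
        headers={"Accept": "application/json"},
    )


# Assumed path for listing collections; adjust to the actual service.
list_request = build_get("/collections")
```

Passing `list_request` to `urllib.request.urlopen` would perform the call against a real endpoint.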

14 Implementations: Ruby Collections Client
- Client-side gem
- Supports all API operations
- Auto-generated from the Swagger API specification

15 Testbed Deployment: RPID
Testbed to stimulate and enable evaluation of complementary RDA outputs in PID-oriented data management
Provides:
- Handle Service
- Data Type Registry
- PID Kernel Information WG profiles
- Collections API
- Persistent Identifier Types (PIT) API

16 Disciplinary Use Cases: GEOFON/Seismology Data
- One of the biggest seismological data archives in Europe
- Responds dynamically to requests for big datasets, providing all the data related to a project or different (overlapping) subsets of it (e.g. seismic waveforms related to an earthquake)
- Data sets point to data files in the archive via PIDs or URLs
- Keeps copies of the request definitions made by users so that the data sets can be recreated if needed (e.g. if shared with another user) for reproducibility
- Exposes around 6,000 collections with more than 1.5 million members
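The idea of keeping the request definition, rather than a copy of the data, can be sketched as follows. Everything here is hypothetical: the `RequestDefinition` fields, the `ARCHIVE` records, and the `resolve` function are invented for illustration and say nothing about how GEOFON actually stores or evaluates requests.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RequestDefinition:
    """Hypothetical stored request: a network code and a time window.

    Timestamps are ISO 8601 strings in one fixed format, so plain string
    comparison orders them chronologically.
    """
    network: str
    start: str
    end: str


# Hypothetical archive index: (PID, network code, timestamp).
ARCHIVE: List[Tuple[str, str, str]] = [
    ("pid:wave-1", "GE", "2017-08-08T10:00:00"),
    ("pid:wave-2", "GE", "2017-08-08T11:30:00"),
    ("pid:wave-3", "XX", "2017-08-08T10:15:00"),
    ("pid:wave-4", "GE", "2017-08-09T09:00:00"),
]


def resolve(definition: RequestDefinition) -> List[str]:
    """Recreate the data set for a request as a list of member PIDs."""
    return [
        pid
        for pid, network, timestamp in ARCHIVE
        if network == definition.network
        and definition.start <= timestamp < definition.end
    ]


# The saved definition can be shared and resolved again later.
saved = RequestDefinition("GE", "2017-08-08T00:00:00", "2017-08-09T00:00:00")
members = resolve(saved)
```

Because only the definition is stored, overlapping subsets cost nothing extra to keep, and resolving the same definition against an unchanged archive yields the same members.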

17 Disciplinary Use Cases: DKRZ/Climate Data
- The Coupled Model Intercomparison Project (CMIP) compares the outputs of running multiple climate models with the same input and boundary conditions
- Data are individual files, bearing PIDs, each containing a single data variable of a single simulation over the simulation's time range and covering the whole globe. These files are then combined into datasets, which consequently represent all data from a single simulation.
- Extending an existing solution to become conformant with the Collections API to enable standardized read access from third parties
- Use of persistently-identified collections will enable citation and reproducibility of results, including provenance relationships across collections
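The step from individual files to per-simulation datasets is essentially a grouping operation. A minimal sketch, assuming hypothetical file records with `pid`, `simulation`, and `variable` fields (none of these names or values come from DKRZ's systems):

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical file records: one data variable of one simulation per file.
FILES: List[Dict[str, str]] = [
    {"pid": "hdl:file-1", "simulation": "sim-a", "variable": "tas"},
    {"pid": "hdl:file-2", "simulation": "sim-a", "variable": "pr"},
    {"pid": "hdl:file-3", "simulation": "sim-b", "variable": "tas"},
]


def build_datasets(files: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group file PIDs into one dataset (collection) per simulation."""
    datasets: Dict[str, List[str]] = defaultdict(list)
    for record in files:
        datasets[record["simulation"]].append(record["pid"])
    return dict(datasets)


datasets = build_datasets(FILES)
```

Each resulting dataset could itself be given a PID and exposed as a collection whose members are the file PIDs.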

18 Disciplinary Use Cases: CAU Kiel/IGSN
[Figure: nested hierarchy of IGSN sample collections, with labels including Lab, South Pacific, Seamount, Cruise, Type, Event, Storage, Section Half/liner, Core, and Geological age collections, each grouping Samples]

19 Feedback and Questions

20 Where to find out more
@resdatall @PerseidsProject

