Research Objects: Preserving scientific data and methods. Stian Soiland-Reyes, Khalid Belhajjame, School of Computer Science, Univ of Manchester. myGrid NIHBI.


1 Research Objects: Preserving scientific data and methods. Stian Soiland-Reyes, Khalid Belhajjame, School of Computer Science, Univ of Manchester. myGrid NIHBI meet-up, Manchester, 2013-01-17

2 Agenda
» Preserving digital science
» The Research Object
» Anatomy
» Lifecycle
» Wf4Ever Tools
» Future developments

3 Computational Processes in Today’s Research
» Research is being conducted in an increasingly digital and online environment
» This has led to the emergence of new digital artifacts
» In some respects, these objects can be regarded as data
» However, some objects include a description of the research method, captured as a computational process
» Such processes encapsulate the knowledge related to the generation, (re)use and general transformation of data in experimental sciences
Diagram labels: Raw data, Computational process, Results

4 Scientific Workflow
In this work, we focus on a particular kind of computational process called a scientific workflow.
» A scientific workflow is a precise, executable description of a scientific procedure: a series of analysis operations connected using data links
» Each operation represents the execution of a computational process
» Can be supplied by independently developed web services
» Can also use existing data sources that are accessible on the Web
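The definition above (operations connected by data links) can be illustrated with a minimal sketch. It is not how Taverna or any Wf4Ever tool executes workflows; `run_workflow`, the operation names and the sample data are all hypothetical, and the operations are plain local functions standing in for web services.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(operations, links, inputs):
    """Run operations in dependency order.

    operations: name -> callable (hypothetical stand-ins for services)
    links: list of (source, target) data links; a source is either a
           workflow input name or the name of an upstream operation
    inputs: name -> value for the workflow inputs
    Assumes input names and operation names do not collide.
    """
    # Ordered list of sources feeding each operation.
    sources = {name: [] for name in operations}
    for src, dst in links:
        sources[dst].append(src)
    # Only upstream operations count as graph dependencies.
    graph = {name: [s for s in srcs if s in operations]
             for name, srcs in sources.items()}
    values = dict(inputs)
    for name in TopologicalSorter(graph).static_order():
        args = [values[s] for s in sources[name]]
        values[name] = operations[name](*args)
    return values


operations = {
    "normalise": lambda xs: [x / max(xs) for x in xs],
    "threshold": lambda xs: [x for x in xs if x >= 0.5],
    "count": len,
}
links = [("raw", "normalise"), ("normalise", "threshold"), ("threshold", "count")]
result = run_workflow(operations, links, {"raw": [2, 4, 8]})
```

Keeping the links as data separate from the operations is what makes the description both precise and executable: the same structure can be inspected, shared, or re-run.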

5 Preservation Challenges
The challenges concern the executable aspects of workflows and their vulnerability to the volatility of the resources required for their execution.
» Changes by third parties
» Workflow may produce different lists at different times
» Workflow may become inoperable
» Workflow decay: the execution of the workflow may fail or yield different results, due to dependencies on resources and services subject to independent changes, e.g., EMBL-EBI. Even workflows that depend on local resources are vulnerable.
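One way to picture the two failure modes above (different results, or no results at all) is to compare a fresh run against a fingerprint recorded earlier. This is only a sketch under assumptions: `fingerprint`, `check_decay` and the gene names are hypothetical, results are assumed to be JSON-serialisable, and this is not the Wf4Ever decay-detection approach.

```python
import hashlib
import json


def fingerprint(result):
    """Stable SHA-256 digest of a JSON-serialisable workflow result."""
    data = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def check_decay(run, baseline):
    """Re-run a workflow and classify it against a stored fingerprint.

    run: zero-argument callable that executes the workflow
    baseline: fingerprint recorded when the workflow was known to work
    """
    try:
        result = run()
    except Exception:
        # The workflow can no longer be executed at all.
        return "inoperable"
    return "stable" if fingerprint(result) == baseline else "changed"


def failing_run():
    # Simulates a third-party service that has been withdrawn.
    raise ConnectionError("service unavailable")


baseline = fingerprint(["geneA", "geneB"])
status_same = check_decay(lambda: ["geneA", "geneB"], baseline)
status_diff = check_decay(lambda: ["geneA", "geneC"], baseline)
status_down = check_decay(failing_run, baseline)
```

A changed fingerprint does not by itself mean the workflow is wrong; an upstream database may simply have been updated, which is why provenance of each run matters.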

6 Diagram labels:
» Laboratory, Instruments, Methods, Materials; Publication; Models, Techniques, Algorithms; Data
» Provenance, Attribution, Credit
» Context: Investigation, Study, Experiment
» Replicate / Repeat (within lab): Exactly replicate the original experiment and experimental conditions. Eliminate change. Observe.
» Reproduce (between labs): Run the experiment with differences in experimental conditions. Compare to test for the same result. Observe.
» Capture, Curate, Discover, Use, Reuse, Preserve

7 RO Architecture is Hourglass
» ROs structured packages
» Provenance, Versioning, Mim services
» Viewing, collaboration services/protocols
» Astronomy, Biology, services/protocols
» Exchange services (media specific)
» Storage services (media specific)

8 From Electronic papers to Research objects
Diagram labels: Research Object, Datasets, Results, Scientists, Hypothesis, Experiments, Annotations, Provenance, Electronic paper, Workflows

10 Research Object: A user scenario

11 Why research objects?
» A research object aggregates all elements deemed necessary to understand research investigations
» Promote reuse, sharing
» Enable the verification of reproducibility of the results
» Trackable, versionable, referenceable

12 Anatomy of a research object
Diagram labels:
» Classes: ro:ResearchObject, ro:Manifest, ro:Resource, ro:Folder, ro:FolderEntry, ro:SemanticAnnotation
» Properties: ore:aggregates, ore:describes, ore:proxyFor, ore:proxyIn, ro:annotatesAggregatedResource, ao:body (pointing to an RDF file)
» "Subclass of" links between classes
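The terms on this slide can be read as a small set of RDF statements. The sketch below builds such statements as plain tuples and prints them as N-Triples, using only the Python standard library. The namespace URIs, the `.ro/manifest.rdf` and `.ro/annotations/` paths, the `example.org` URIs and the helper names are assumptions made for illustration; the published model at http://purl.org/wf4ever/model is the authority on the actual terms.

```python
# Namespace URIs are assumptions; check them against the published model.
RO = "http://purl.org/wf4ever/ro#"
ORE = "http://www.openarchives.org/ore/terms/"
AO = "http://purl.org/ao/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_manifest(ro_uri, resources, annotations):
    """Return (subject, predicate, object) URI triples for a research object.

    resources: URIs of aggregated resources
    annotations: (annotated resource URI, annotation body URI) pairs
    """
    manifest = ro_uri + ".ro/manifest.rdf"  # hypothetical layout
    triples = [
        (ro_uri, RDF_TYPE, RO + "ResearchObject"),
        (manifest, RDF_TYPE, RO + "Manifest"),
        (manifest, ORE + "describes", ro_uri),
    ]
    for res in resources:
        triples.append((ro_uri, ORE + "aggregates", res))
        triples.append((res, RDF_TYPE, RO + "Resource"))
    for i, (target, body) in enumerate(annotations):
        ann = f"{ro_uri}.ro/annotations/{i}"  # hypothetical URI scheme
        triples += [
            (ro_uri, ORE + "aggregates", ann),
            (ann, RDF_TYPE, RO + "SemanticAnnotation"),
            (ann, RO + "annotatesAggregatedResource", target),
            (ann, AO + "body", body),
        ]
    return triples


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


ro_uri = "http://example.org/ro1/"
workflow = ro_uri + "workflow.t2flow"
triples = build_manifest(ro_uri, [workflow], [(workflow, ro_uri + ".ro/ann1.rdf")])
print(to_ntriples(triples))
```

In practice an RDF library would handle serialisation and literals; tuples are used here only to keep the structure of the aggregation visible.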

13 Grounding Workflow-centric Research Objects Using Semantic Technologies
» Workflow-centric research objects are encoded using RDF, according to a set of ontologies that are publicly available
» Research objects extend the Object Reuse and Exchange (ORE) model to represent aggregation

14 Grounding Workflow-centric Research Objects Using Semantic Technologies
» We use the Annotation Ontology (AO) to annotate research object resources and their relationships

15 Relating resources in a research object
The provenance of the RO elements is key to understanding, comparing and debugging scientific workflows, and to verifying the validity of a claim made within the context of an RO.
Diagram labels: Workflow_16, Workflow_13, Common pathways, QTL, Results, Logs, Metadata, Paper, Slides; relations: feeds into, produces, included in, published in

16 Evolution of a research object
Roles: Scientist, Librarian/Curator
Stages:
» Live RO
» RO snapshot: identified by a URI; some metadata; some curation; mostly private (for my group)
» RO snapshot: identified by a URI; some metadata; some curation; mostly private (for my group and for paper reviewers)
» Archived RO: identified by a URI; good metadata and curation; mostly public
Events in the scenario:
» My supervisor calls me to report my work
» My supervisor calls me again and we decide to publish our RO+paper
» Reviews received and final version published
» A new PhD student continues my work
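The stages on this slide (live, snapshot, archived) can be sketched as a small state model. Everything here is hypothetical: the `ResearchObject` class, the `snapshot` and `archive` helpers, the URI scheme and the `derived_from` field, which is only a stand-in for the kind of derivation link a provenance model would record. It is not the Wf4Ever evolution model.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class ResearchObject:
    uri: str
    stage: str  # "live", "snapshot" or "archived"
    resources: Tuple[str, ...] = ()
    audience: str = "private"
    derived_from: Optional[str] = None


def snapshot(live, version, audience):
    """Freeze the current state of a live RO under its own URI."""
    if live.stage != "live":
        raise ValueError("only a live RO can be snapshotted")
    return replace(live, uri=f"{live.uri}snapshots/{version}/",
                   stage="snapshot", audience=audience,
                   derived_from=live.uri)


def archive(snap):
    """Promote a snapshot to a public, archived RO."""
    if snap.stage != "snapshot":
        raise ValueError("only a snapshot can be archived")
    return replace(snap, uri=snap.uri.replace("snapshots", "archive"),
                   stage="archived", audience="public",
                   derived_from=snap.uri)


live = ResearchObject("http://example.org/ro/qtl/", "live", ("workflow.t2flow",))
snap1 = snapshot(live, "1", "my group")
snap2 = snapshot(live, "2", "my group and paper reviewers")
archived = archive(snap2)
```

The live object is never modified by these calls, so a new contributor can keep working on it after the archived version has been published.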

17 PROV standard - Basis for evolution model http://www.w3.org/TR/prov-primer/ Candidate Recommendation

18 Wf4Ever Tools: Customizable preservability checklists

19 Wf4Ever Tools: Portal for browsing and annotating

20 Wf4Ever Tools: Command line tools, client libraries https://github.com/wf4ever/

21 Wf4Ever Tools: Specifications and APIs

22 Current Status and Ongoing Work
» Models/spec v0.1 public: http://purl.org/wf4ever/model
- Upcoming revision v0.2 (Q1 2013): minor additions to workflow model terms; “RO Terms”, an upper, user-level view of the RO (hypothesis, results), many of which are “shortcuts” for the structured model
- TODO: Update annotation model to Open Annotation Data Model (OAC)
- TODO: PAV for detailed authorship provenance
» Showing, managing and sharing of Research Objects through myExperiment web site
[3] http://www.myexperiment.org/

23 Open Annotation Data Model http://www.openannotation.org/spec/core/ Community Draft
» “Almost final” spec: 2013-01-28
» Roll out meeting in Manchester: March 2013

24 myExperiment RO support

25 Thank you! http://www.wf4ever-project.org/ http://www.mygrid.org.uk/

