Open Provenance Model Tutorial Session 6: Interoperability.

Presentation transcript:

Session 6: Aims
In this session, you will learn about:
- Steps towards interoperability
- Interoperability challenges
- Next steps towards achieving interoperability

Session 6: Contents
- The Open Provenance Vision (revisited)
- PC3
- PC4
- Beyond Representation
- Discussion

THE OPEN PROVENANCE VISION

Context: heterogeneous environments
- Applications consist of compositions of loosely coupled, multi-institutional, heterogeneous components
- How can the origin of data be traced in such environments?

Provenance Across Applications
[Diagram label: Application]
How can we understand the provenance of data products derived by all these applications?

Provenance Across Applications
[Diagram labels: Application; Provenance Inter-Operability Layer; The Open Provenance Model (OPM)]

Provenance Inter-Operability Layer

Open Provenance Vision
The Open Provenance Vision is a set of architectural guidelines to support provenance inter-operability, consisting of:
- a controlled vocabulary,
- serialization formats, and
- APIs.
The Open Provenance Vision allows provenance from individual systems to be expressed, connected in a coherent fashion, and queried seamlessly.

Export/Import Approach (PC3)
- N+1 conversions
- Centralisation (scalability, security concerns)
- Running queries is easy
[Diagram labels: PS1, PS2, PS3, PS4; Provenance Inter-Operability Layer; PS]
1. Convert PS i content to OPM
2. Import OPM into PS
3. Run queries over PS
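The export/import steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the two native record layouts, the converter names and the `CentralStore` class are all invented here, and only the edge names `used` and `wasGeneratedBy` come from OPM. Each provenance system gets one converter to OPM-style edges, and a single central store imports the results, which is presumably where the N+1 count comes from.

```python
# Hypothetical sketch of the export/import approach. Record layouts,
# function names and CentralStore are assumptions for illustration.
# Edges are (effect, relation, cause) triples, following OPM direction.

def convert_ps1(records):
    # Assumed PS1 layout: {"step": ..., "reads": [...], "writes": [...]}
    edges = []
    for record in records:
        for artifact in record["reads"]:
            edges.append((record["step"], "used", artifact))
        for artifact in record["writes"]:
            edges.append((artifact, "wasGeneratedBy", record["step"]))
    return edges


def convert_ps2(rows):
    # Assumed PS2 layout: (output, process, input) tuples
    edges = []
    for output, process, source in rows:
        edges.append((process, "used", source))
        edges.append((output, "wasGeneratedBy", process))
    return edges


class CentralStore:
    """Single store that imports OPM-style edges from every system."""

    def __init__(self):
        self.edges = set()

    def import_opm(self, edges):
        self.edges.update(edges)

    def generated_by(self, artifact):
        # Processes that generated the given artifact
        return sorted(
            cause
            for (effect, relation, cause) in self.edges
            if relation == "wasGeneratedBy" and effect == artifact
        )


store = CentralStore()
store.import_opm(convert_ps1([
    {"step": "load", "reads": ["a.csv"], "writes": ["table1"]},
]))
store.import_opm(convert_ps2([("report", "summarise", "table1")]))
```

Once everything sits in one store, a query such as `store.generated_by("report")` needs no knowledge of which system produced which edge, which matches the slide's point that running queries is easy in this approach.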

Distributed Query Approach
- Query API not specified
- N query APIs to implement
- Running queries is challenging
- Better scalability
- Offer OPM based Query API
[Diagram labels: PS1, PS2, PS3, PS4; Query API; federated query component; Federated Queries]
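A rough sketch of the distributed alternative follows. Since the slide notes that the query API is not specified, the method name `causes_of` and the `ProvenanceStore` class are purely hypothetical. The point is only that each store answers locally and a federated component stitches the answers together.

```python
# Hypothetical sketch: every provenance store offers the same
# OPM-based query call, and a federated component merges the answers.
# The class and method names are invented for illustration.

class ProvenanceStore:
    def __init__(self, name, edges):
        self.name = name
        self.edges = edges  # (effect, relation, cause) triples

    def causes_of(self, node):
        # Direct causes of a node known to this store only
        return {cause for (effect, _rel, cause) in self.edges if effect == node}


def federated_causes(stores, node):
    """Follow causes transitively, asking every store at each step."""
    seen = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for store in stores:
            for cause in store.causes_of(current):
                if cause not in seen:
                    seen.add(cause)
                    frontier.append(cause)
    return seen


ps1 = ProvenanceStore("PS1", [
    ("table1", "wasGeneratedBy", "load"),
    ("load", "used", "a.csv"),
])
ps2 = ProvenanceStore("PS2", [
    ("report", "wasGeneratedBy", "summarise"),
    ("summarise", "used", "table1"),
])
```

Here `federated_causes([ps1, ps2], "report")` crosses from PS2 into PS1 through the shared name `table1`. Every step costs one call per store, which hints at why the slide calls running queries challenging in this setting.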

Provenance Inter-Operability Layer: Common Tools
- Visualisation
- Reasoning
- Conversion

MOVING TOWARDS INTEROPERABILITY (PC3)

Provenance Challenge 3
- Identify weaknesses and strengths of the OPM specification
- Encourage the development of concrete bindings for OPM in a variety of languages
- Determine how well OPM can represent provenance for a variety of technologies (scientific workflow, databases, etc.)
- Demonstrate that a complex data product's provenance can be constructed from process assertions produced by multiple combinations of heterogeneous applications
- Bring together the community to further discuss the interoperability of provenance systems

PC3 Workflow
- The Pan-STARRS project is building and operating the next generation sky survey
- The load workflow used in PC3, which sits at the handoff between the image pipeline and the object data management, ingests incoming CSV files into a SQL database

PC3 Objectives
- Implement the Load workflow
- Implement queries:
  - For a given detection, which CSV files contributed to it?
  - The user considers a table to contain values they do not expect. Was the range check (IsMatchTableColumnRanges) performed for this table?
- Export provenance to OPM
- Import other teams' OPM outputs
- Run queries over other teams’ provenance
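The two PC3 queries can be read as graph questions over OPM edges. The following is a minimal sketch over a toy graph: apart from `IsMatchTableColumnRanges`, which is named in the objectives, every node name is invented, and identifying CSV files by their file extension is an assumption made for brevity.

```python
# Toy OPM-style graph for illustration. Edges are
# (effect, relation, cause) triples; node names other than
# IsMatchTableColumnRanges are assumptions.
EDGES = [
    ("detection42", "wasGeneratedBy", "LoadCSVFileIntoTable"),
    ("LoadCSVFileIntoTable", "used", "P2Detection.csv"),
    ("LoadCSVFileIntoTable", "used", "batch_manifest"),
    ("P2DetectionTable", "wasGeneratedBy", "LoadCSVFileIntoTable"),
    ("IsMatchTableColumnRanges", "used", "P2DetectionTable"),
]


def ancestors(edges, start):
    """All nodes reachable from start by following effect-to-cause edges."""
    seen = set()
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for effect, _relation, cause in edges:
            if effect == current and cause not in seen:
                seen.add(cause)
                frontier.append(cause)
    return seen


def contributing_csv_files(edges, detection):
    # Query 1: which CSV files contributed to this detection?
    return sorted(n for n in ancestors(edges, detection) if n.endswith(".csv"))


def range_check_performed(edges, table):
    # Query 2: did an IsMatchTableColumnRanges process use this table?
    return any(
        effect == "IsMatchTableColumnRanges" and relation == "used" and cause == table
        for effect, relation, cause in edges
    )
```

Note that the first query walks backwards from the detection, while the second looks forwards from the table to a process that used it, so the natural starting point differs between the two.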

Good First Steps
- Teams were able to read and write each other's OPM graphs
- Most teams were able to perform queries on other teams' OPM graphs
- Common tools for provenance:
  - OPM Toolbox
  - Tupelo API
  - Graph visualizations

Challenges
- Different structures for the same process
- Difficult to determine where to start a provenance query
- Lack of values or ability to look up values made querying hard
- Lack of types for filtering
- Lack of consistency across time
  - This is the same artifact but in a different state

Updates to OPM 1.1
- Profiles to:
  - Enable guidance about structures used
  - Allow particular values to be looked up through vocabulary
- Types
- Persistent names

VERIFYING INTEROPERABILITY (PC4)

Are we closer?
- Propose a final step (PC4)
- Comprehensive test of interoperability using OPM
- Like prior challenges but expanding the application:
  - Include users
  - Include interactive applications
  - Include decision points

Abstract Scenario
[Diagram labels: Publish Data to Third Party; User Decision Point; Workflow; Collections Processing; Publish Data at URL; User Performs Action; Exchange between Services; Running a service by others; Collaborative Editing; Running Services with data others; Citing Data in Paper; Social Collaboration; Discovery by Query; Credentials]

Crystallography Workflow

Provenance Questions
- How many times has this data been cited in other reports?
- For a given crystal, how often did a crystallographer reject and reproduce coordinates (the later stages of the experiment)?
  - This is important because difficulty in obtaining an adequate crystal image can indicate that the original diffraction data was of poor quality
- The report has been published, but how many times was it edited before publication?
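As a small illustration of the last question, counting edits can be treated as walking a chain of versions. This sketch assumes, hypothetically, that each report version is linked to the previous one by a `wasDerivedFrom` edge and that each such link stands for one edit. The version names are invented.

```python
# Invented version chain: each key wasDerivedFrom its value.
DERIVED_FROM = {
    "report_v4_published": "report_v3",
    "report_v3": "report_v2",
    "report_v2": "report_v1",
}


def count_edits(derived_from, published):
    """Count derivation steps leading up to the published version."""
    edits = 0
    current = published
    visited = {current}
    while current in derived_from:
        current = derived_from[current]
        if current in visited:  # guard against a malformed, cyclic chain
            break
        visited.add(current)
        edits += 1
    return edits
```

With the chain above, `count_edits(DERIVED_FROM, "report_v4_published")` walks back three steps. A real system would also need the persistent names mentioned among the OPM 1.1 updates in order to recognise the versions as the same report.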

Additions
- A common vocabulary
- Integration points
  - Allow different kinds of systems to “drop test” integration
- Key: distinguish between provenance interoperability and other forms of interoperability
- End-to-end provenance, not everything within the same system

Schedule
1. Abstract Scenario
2. Identify all the data flowing in the system with respect to the crystallography scenario (this can be mocked up); where possible we have example data (August 30)
3. For each pattern of the process, produce a mock-up of the OPM graph with respect to the data in step 2 and make sure they stitch together (Nov 30)
4. Finalize queries with respect to the scenario (Dec 15)
5. Import and implement queries over the mock-up (Feb 28)
6. Generate and publish provenance for each pattern (Feb 28)
7. Import and implement queries over the generated provenance (Mar 30)
8. Decide whether to do API compatibility
9. Prepare slides for challenge (Jun 1 - Jun 8)
10. PC4 Workshop (June 10)

BEYOND REPRESENTATION

Vision
- OPM provides a representation of provenance
- But interoperability requires some more:
  - Access provenance
  - Given a document, what is its provenance?
  - Record provenance

Answering these questions
Simple solutions:
- Access: HTTP GET
- Document: embedding information using RDFa [Groth2010-provenancejs]
- Record: basic web service [prep2009]
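To make the "Document" option more concrete, here is a rough sketch of pulling provenance statements out of a page that embeds them in RDFa-style attributes. It is not a conformant RDFa processor and it is not the approach of the cited work: the `ex:` property names and the example URLs are invented, and the code only looks at a handful of attributes using Python's standard `html.parser`.

```python
from html.parser import HTMLParser


class ProvenanceLinkFinder(HTMLParser):
    """Collects (property, target) pairs from RDFa-style attributes.

    Simplified on purpose: no prefix resolution, no subject tracking.
    """

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property") or attributes.get("rel")
        target = (
            attributes.get("resource")
            or attributes.get("href")
            or attributes.get("content")
        )
        if prop and target:
            self.statements.append((prop, target))


# Hypothetical page; the ex: vocabulary and URLs are assumptions.
PAGE = """
<html><body about="http://example.org/report">
  <span property="ex:wasDerivedFrom" resource="http://example.org/data.csv"></span>
  <a rel="ex:provenance" href="http://example.org/report/provenance">provenance</a>
</body></html>
"""

finder = ProvenanceLinkFinder()
finder.feed(PAGE)
```

After parsing, `finder.statements` holds one embedded statement and one link to a provenance resource. The latter could then be fetched with a plain HTTP GET, which corresponds to the "Access" option.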

Conclusion
- We are close to interoperability in provenance systems
- Community! Community! Community!
- Please participate
- Feedback: where do you need interop?