The Semantic Web, Service Oriented Architectures, the myGrid Experience
Carole Goble


Roadmap
The problem
myGrid
Semantic Service / Workflow Discovery
Provenance and metadata modelling
Semantic Web is Semantic Glue

EPSRC funded UK e-Science Programme Pilot Project. Thanks to the other members of the Taverna project,

1. Identify new, overlapping sequence of interest
2. Characterise the new sequence at nucleotide and amino acid level
Cutting and pasting between numerous web-based services, e.g. BLAST, InterProScan etc.
acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa

Middleware for Life Science solutions
Interoperation of services and data sources
Repeat
Reuse and Share
Provenance
Manage results
My tools, my resources

Middleware for Life Science

[Architecture diagram labels] Taverna Workflow Workbench; OGSA-Distributed Query Processing; Results management; LSID; mIR; e-Science coordination; e-Science mediator; e-Science process patterns; e-Science events; Notification service; Architectural framework; myGrid information model; Metadata & provenance management using semantics; KAVE; Legacy integration; Publication and Discovery using semantics; Feta; Pedro; Ontology; Portal & Application tools

How to select among services?
Mostly inputs & outputs are “string”
Domain specific descriptions of capabilities
Selection is part of workflow assembly by bioinformaticians
Selection of alternates for failure also generally user defined, and usually replicas, but need not be.
First, find your service
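The point about user-defined alternates for failure can be illustrated with a minimal Python sketch. It is not Taverna's actual mechanism; the function names (`invoke_with_alternates`, `blast_primary`, `blast_mirror`) and the string-in, string-out service signature are assumptions made purely for illustration.

```python
from typing import Callable, List, Sequence


class AllAlternatesFailed(Exception):
    """Raised when the primary service and every alternate have failed."""


def invoke_with_alternates(services: Sequence[Callable[[str], str]], payload: str) -> str:
    """Try each service in the user-defined order and return the first result."""
    errors: List[Exception] = []
    for service in services:
        try:
            return service(payload)
        except Exception as exc:  # any failure moves us on to the next alternate
            errors.append(exc)
    raise AllAlternatesFailed(errors)


# Hypothetical stand-ins for a primary service and its replica.
def blast_primary(sequence: str) -> str:
    raise ConnectionError("primary service unavailable")


def blast_mirror(sequence: str) -> str:
    return "hits for " + sequence


result = invoke_with_alternates([blast_primary, blast_mirror], "acatttctac")
```

The alternates are usually replicas, but because the list is just an ordered sequence of callables, a functionally equivalent service from another provider would fit equally well.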

Which means describe your service…
Publish and find services (and workflows) with description using an ontology
Define domain types for objects passed around workflow
Define a set of dimensions with which service capabilities can be defined
GRIMOIRES / WebDAV directory
Tied to BioMOBY Central

Semantic discovery
Publish and find services (and workflows) with description using an ontology (in OWL/RDF)
Define domain types for objects passed around and a set of dimensions with which service capabilities can be defined using processor abstraction
Bootstrapping descriptions
Mining and maintaining descriptions
The Expert Annotator
GRIMOIRE / WebDAV directory
Tie into BioMOBY central
a-beta/mygrid/descriptions/ a-beta/mygrid/descriptions/
Phillip Lord, Pinar Alper, Chris Wroe, and Carole Goble. Feta: A light-weight architecture for user oriented semantic service discovery. In Proc. of 2nd European Semantic Web Conference, Crete, June 2005

OWL-S OWL-WS WSMO WSDL-S

Web Interface; Processor API; Processor API; Generic Schema for Service (part of Information model); Specific Application Ontology, e.g. caCORE
Semantic Web Services
Layered model
Wroe C, Goble CA, Greenwood M, Lord P, Miles S, Papay J, Payne T, Moreau L. Automating Experiments Using Semantic Data on a Bioinformatics Grid. IEEE Intelligent Systems, Jan/Feb 2004
We don’t describe WSDL, we describe operations and processors
We are classifying for people not machines, so don’t be too clever!

Operation: name, description, task, method, resource, application
Service: name, description, author, organisation
Parameter: name, description, semantic type, format, transport type, collection type, collection format
Relations: hasInput, hasOutput
subclass: WSDL based Web service, WSDL based operation, Soaplab service, bioMoby service, workflow, Local Java code
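To make the schema above more concrete, here is a hedged Python sketch of services, operations, and parameters, with discovery by semantic type. It is not the Feta implementation. The toy ontology, its concept names, and the `find_operations` helper are all hypothetical. Only a few of the attributes listed on the slide are modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Toy ontology: child concept -> parent concept (hypothetical terms).
ONTOLOGY: Dict[str, Optional[str]] = {
    "sequence": None,
    "nucleotide_sequence": "sequence",
    "protein_sequence": "sequence",
    "report": None,
    "blast_report": "report",
}


def is_a(concept: str, ancestor: str) -> bool:
    """True if `concept` equals `ancestor` or is one of its descendants."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = ONTOLOGY.get(current)
    return False


@dataclass
class Parameter:
    name: str
    semantic_type: str
    format: str = ""


@dataclass
class Operation:
    name: str
    task: str
    inputs: List[Parameter] = field(default_factory=list)   # hasInput
    outputs: List[Parameter] = field(default_factory=list)  # hasOutput


@dataclass
class Service:
    name: str
    author: str
    operations: List[Operation] = field(default_factory=list)


def find_operations(registry: List[Service], have: str, want: str) -> List[Tuple[str, str]]:
    """Find operations that accept data of type `have` and produce a kind of `want`."""
    matches = []
    for service in registry:
        for op in service.operations:
            accepts = any(is_a(have, p.semantic_type) for p in op.inputs)
            produces = any(is_a(p.semantic_type, want) for p in op.outputs)
            if accepts and produces:
                matches.append((service.name, op.name))
    return matches


registry = [
    Service("blastn", "example author", [
        Operation("search", "similarity_search",
                  [Parameter("query", "nucleotide_sequence")],
                  [Parameter("result", "blast_report")]),
    ]),
    Service("protein_scan", "example author", [
        Operation("scan", "domain_search",
                  [Parameter("query", "protein_sequence")],
                  [Parameter("result", "report")]),
    ]),
]

hits = find_operations(registry, have="nucleotide_sequence", want="report")
```

The matching is deliberately simple subsumption over a small hierarchy, in the spirit of classifying for people rather than machines.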

Semantic Web Services Semantic Descriptions for Discovery Automated Discovery services or workflows Knowledge assisted brokering & match making Guided instantiation and substitution Composition Automated Composition Self organising SOA Guided workflow assembly Composition (workflow) verification and validation

Semantics-enabled Problem Solving Task configuration Workflow construction Workflow Advisor Semantic service discovery EDSO task ontology

Observations
Technical and Abstraction mismatches
–Man vs Machine. Manual vs Automation. Service vs Domain Semantics. Basic errors in modelling.
–Web services in the wild suck. Not everything is a Web Service.
Legacy
–Services, middleware, content and practice.
Practicality mismatches
–Automated or assisted discovery desirable, likely, popular
–Automated composition undesirable, unlikely, unpopular
Capturing and Curating Content
–Annotation is hard. Building the Ontology is hard. QA is hard. Keeping the annotation up to date is hard. The Expert annotator; Altruism for Reuse. Quality Control;
Hendler’s Principle
–A little semantics goes a long way! Too complicated to use. Tools!!

Sharing takes effort. Unanticipated reuse by people you don’t know in automated workflows. The metadata needed pays off but it’s challenging and costly to obtain. Automated, service providers, network effects. Quality control. Misuse. Inappropriate use. Competitive advantage, Intellectual property. Workflow design - local or licensed services

The devil is in the detail Experiment provenance Simple workflow Descriptions in biological language Workflows for automagical execution – implicit iteration, generous typing … Debugging and rerunning provenance logs Simple classifications of services Expressive ontologies to match up services automatically Descriptions for automatic service execution and fault management

Courtesy Jim Myers, NCSA e-Scientific method in vivo in vitro in silico Discovery Electronic Notebook Scientific Provenance Engineering Provenance Authorization Project Organization Logging Curation Scientific Content

Taverna workflow workbench in myGrid

Provenance in myGrid
The process
The data derivation path
The ownership
The evidence of knowledge
[Diagram labels: a1, E1:S1, X1, E1:S2, Y1, Z1, Manchester university, “how the Y1 was produced using a1”]

Provenance graph representation
Identity for the node: URI
–Uniform Resource Identifier
–A generalisation of URL
An RDF (Resource Description Framework) graph: derivedFrom, inputOf
Ontologies
–Telling what they are: isA, hasFeature
Each URI is associated with:
–A set of provenance statements
–An RDF provenance graph
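The graph representation above can be sketched with plain subject-predicate-object triples. This is a minimal Python illustration, not myGrid's provenance store. The `urn:example:` identifiers and the helper names are assumptions, and a real deployment would more likely use an RDF library and store.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject URI, property, object URI or concept)

# Hypothetical provenance statements; every node is identified by a URI.
PROVENANCE: List[Triple] = [
    ("urn:example:a1", "inputOf", "urn:example:invocation1"),
    ("urn:example:x1", "derivedFrom", "urn:example:a1"),
    ("urn:example:y1", "derivedFrom", "urn:example:x1"),
    ("urn:example:a1", "isA", "Sequence"),
]


def statements_about(graph: List[Triple], uri: str) -> List[Triple]:
    """Return every provenance statement that mentions the given URI."""
    return [t for t in graph if t[0] == uri or t[2] == uri]


def ancestors(graph: List[Triple], uri: str) -> List[str]:
    """Follow derivedFrom links transitively to recover the derivation path."""
    found: List[str] = []
    frontier = [uri]
    while frontier:
        node = frontier.pop()
        for subject, prop, obj in graph:
            if subject == node and prop == "derivedFrom" and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found


path = ancestors(PROVENANCE, "urn:example:y1")
about_a1 = statements_about(PROVENANCE, "urn:example:a1")
```

Because every node is a URI, statements recorded by different workflow runs can be merged simply by concatenating their triple lists. Asking how y1 was produced then reduces to walking the derivedFrom links.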

[Figure: example provenance graph in the Resource Description Framework. Legend: data generated by services/workflows; concepts; services; literals; properties. Data and invocation nodes: urn:data:f1, urn:data:f2, urn:data1, urn:data2, urn:data:3, urn:data12, urn:compareinvocation3, urn:BlastNInvocation3, urn:invocation5, urn:hit1…, urn:hit2…, urn:hit5…, urn:hit8…, urn:hit10…, urn:hit50…. Other labels: Blast_report, SwissProt_seq, Sequence_hit, Find similar sequence, DatumCollection, LSDatum, New sequence, Missed sequence. Properties: [input], [output], [directlyDerivedFrom], [distantlyDerivedFrom], [instanceOf], [type], [hasHits], [hasName], [contains], [performsTask], [similar_sequence_to].]

Provenance Flexible and extensible schema Data fusion and aggregation across provenance metadata Reasoning and querying over descriptions Transparent description

myGrid Provenance example

Annotate Anything
People, meetings, discussions, conference talks
Scientific publications, recommendations, quality comments
Events, notifications, logs
Services and resources
Schemas and catalogue entries
Models, codes, builds, workflows,
Data files and data streams
Sensors and sensor data
…
DFDL, JSDL, SAML, WSDL, WSRF, DL*, ML* as RDF?
If you are using a controlled vocabulary, then let’s use a standard controlled vocabulary language.

Seamark Demonstration: Identification of new drug candidates for BRKCB-1 Courtesy Joanne Luciano

Observations
Flexible metadata description for data
Multi tiered model for different perspectives
–Machine vs Person; The ontologies for people discovery are not good enough for knowledge aggregation
Make the semantics invisible
Provenance aggregation
Identity crisis
Exposing knowledge means knowledge exposure.
–Reluctance to give up knowledge assets. Vulnerability. Knowledge is power. Incentive models. IPR. Privacy.
Capturing the Semantic Content explicitly.
–Acquiring ontology annotations; Hard to describe policies. Vagueness and trivia. Trying to capture people-focused provenance.
Hendler principle
A little semantics goes a long way.

Data mining Knowledge Discovery Smart search Social networking Smart portals Agents Information Integration and aggregation Use of Semantic Web Technologies A Semantic Web of Life Science