Scientific Workflow systems: Summary and Opportunities for SEEK and e-Science.

Scientific workflow systems
- Workflows are a way of documenting what has been done (provenance)
- Can be seen as a conceptual model of what needs to be done; the process needs more descriptive information
- Combine the conceptual view with the executable workflow
  - Go from napkin diagram to formal conceptual workflow to executable workflow
- It is as important to design the workflow as to execute it
- Documentation contributes to reproducibility of results because of the exact record a workflow creates
- Annotation of usage history for workflows gives new users an idea of the quality, appropriateness, and reliability of the workflow for their own usage
- Need to be able to get more information about the workflow than the WSDL provides
- Strong ties to semantic mediation, in terms of:
  - Integration, composition, discovery
  - User interface
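
The point about provenance and reproducibility can be illustrated with a minimal Python sketch. It is not how Kepler or any other system mentioned here records provenance; the `Workflow` class, the `fingerprint` helper, and the step names are all hypothetical, chosen only to show how executing a workflow can leave behind an exact record of each step.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


def fingerprint(value: Any) -> str:
    """Short, stable hash of a JSON-serializable value (hypothetical helper)."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


@dataclass
class Workflow:
    """A linear chain of named steps that logs what it did (illustrative only)."""
    name: str
    steps: List[Tuple[str, Callable[[Any], Any]]] = field(default_factory=list)
    provenance: List[dict] = field(default_factory=list)

    def add_step(self, step_name: str, func: Callable[[Any], Any]) -> "Workflow":
        self.steps.append((step_name, func))
        return self

    def run(self, data: Any) -> Any:
        """Execute every step in order and record one provenance entry per step."""
        self.provenance.clear()
        for step_name, func in self.steps:
            result = func(data)
            self.provenance.append({
                "workflow": self.name,
                "step": step_name,
                "input": fingerprint(data),
                "output": fingerprint(result),
            })
            data = result
        return data


# Assumed example: average a list of abundance counts after dropping gaps.
wf = Workflow("mean_abundance")
wf.add_step("drop_missing", lambda xs: [x for x in xs if x is not None])
wf.add_step("mean", lambda xs: sum(xs) / len(xs))
result = wf.run([4, None, 6, 8])
```

After the run, `wf.provenance` holds one entry per step, and the output fingerprint of each step matches the input fingerprint of the next, which is the kind of exact record that supports reproducing a result.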

Distributed computing
- Distributed computing with workflows
  - Good idea, but the human cost of coordinating the system is still too high to be practical when ad-hoc analytical services are considered
  - Gains may be made by leveraging existing systems like Condor and Pegasus
- Process flows could also demonstrate the benefits of infrastructure development to the domain scientists

Models of computation
- There’s an important point in them, but it has as much to do with how you separate different scientific problems, i.e., does ecology have different needs than bioinformatics that are implicit in the discipline?
- Need much clearer ways of communicating about these models, and the need for different models may not ever arise
- Partly driven by how you scope the domain of usefulness for a tool; for example, if you’re handling just web services you’ll never need a continuous time model
- User probably shouldn’t have to select the model of computation, especially for workflows that can only use one model
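
One way to read the last point is that a system could infer the model of computation from the components themselves. The sketch below assumes each actor declares which models it can run under; the `Actor` class, the model labels, and `infer_model` are hypothetical and do not correspond to the Kepler or Ptolemy II API.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical labels for models of computation.
DATAFLOW = "dataflow"
CONTINUOUS_TIME = "continuous_time"
DISCRETE_EVENT = "discrete_event"


@dataclass(frozen=True)
class Actor:
    """A workflow component that declares the models it can execute under."""
    name: str
    supported_models: FrozenSet[str]


def infer_model(actors: List[Actor]) -> Optional[str]:
    """Return the one model every actor supports, or None if the choice
    is ambiguous (several candidates) or impossible (no common model)."""
    if not actors:
        return None
    common = set(actors[0].supported_models)
    for actor in actors[1:]:
        common &= actor.supported_models
    if len(common) == 1:
        return next(iter(common))
    return None


# Assumed example: web-service actors only make sense under dataflow.
fetch = Actor("fetch_records", frozenset({DATAFLOW}))
clean = Actor("clean_records", frozenset({DATAFLOW}))
plot = Actor("plot", frozenset({DATAFLOW, DISCRETE_EVENT}))
solver = Actor("ode_solver", frozenset({CONTINUOUS_TIME, DISCRETE_EVENT}))

web_service_model = infer_model([fetch, clean, plot])
ambiguous_model = infer_model([plot, solver, solver])
impossible_model = infer_model([fetch, solver])
```

Under these assumptions a workflow built only from web services resolves to a single model without user input, and the user would be asked to choose only when `infer_model` returns `None`.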

Workflow languages
- Two separate languages: one for designing the actors and one for the workflow
  - You can describe the workflow without understanding what each component does
- Need another language to describe semantics of individual components (e.g. OWL-S, Web Service Modeling Ontology (WSMO))
  - Our current efforts focus on describing semantics of data flow, not processing
- Simplest description of a component is its name; can extend it over time with better and better approximations of a formal specification
  - Inputs and outputs alone don’t cut it
  - Mathematical description alone doesn’t cut it
  - Really need a concept that constrains how the statistical approach is used
- Mathematically simple models are rare in ecology; complex arbitrary designs are common and extremely difficult to describe
- Until we learn how to represent models declaratively, we’ll never fully understand these complex models
- Shared language: good idea, but all current languages incorporate references that can only be interpreted within one specific environment
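
The idea of starting from a name and refining a component description over time can be sketched as follows. This is an illustration under stated assumptions, not OWL-S or WSMO: the `Port` and `Component` classes, the concept strings, and `can_connect` are hypothetical, and concept matching is reduced to string equality where a real semantic mediation layer would reason over an ontology.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Port:
    data_type: str                 # structural type, e.g. "float[]"
    concept: Optional[str] = None  # semantic annotation (hypothetical term)


@dataclass
class Component:
    """Starts as just a name; ports and annotations can be added later."""
    name: str
    inputs: Dict[str, Port] = field(default_factory=dict)
    outputs: Dict[str, Port] = field(default_factory=dict)


def can_connect(source: Port, target: Port) -> bool:
    """Check a data-flow link structurally first, then semantically."""
    if source.data_type != target.data_type:
        return False
    if source.concept is None or target.concept is None:
        return True  # no annotation on one side: nothing to object to
    return source.concept == target.concept


# Simplest description: a name only.
regression = Component("linear_regression")

# Later refinement: typed and annotated ports (assumed concept names).
biomass_out = Port("float[]", "Biomass")
count_in = Port("float[]", "SpeciesCount")
biomass_in = Port("float[]", "Biomass")
unannotated_in = Port("float[]")
regression.inputs["response"] = biomass_in

structural_only_ok = can_connect(biomass_out, unannotated_in)
semantic_mismatch = can_connect(biomass_out, count_in)
semantic_match = can_connect(biomass_out, regression.inputs["response"])
```

The two `float[]` ports with different concepts show why inputs and outputs alone are not enough: the link is structurally valid but is rejected once both sides carry a semantic annotation.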

Collaboration opportunities
- Shared workflow languages
  - Scufl/MoML/DPML/…
- Shared work on semantic annotation of workflow components
- Shared ontologies that cross domains
  - SEEK ontologies focus on ecology & environment
  - myGrid ontologies focus on molecular biology
- Shared case study: conservation genetics
  - Incorporates data from multiple disciplines
  - Incorporates workflows, mediation, grid issues all in one issue
- Ecoinformatics.org

Acknowledgements
This material is based upon work supported by the National Science Foundation under awards for SEEK and (AWSFL008-DS3) for GEON, by the Department of Energy under Contract No. DE-FC02-01ER25486 for SciDAC/SDM, and by DARPA under Contract No. F C-1703 for Ptolemy. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
- The National Center for Ecological Analysis and Synthesis, a Center funded by NSF (Grant Number ), the University of California, and the UC Santa Barbara campus.
- The Andrew W. Mellon Foundation.
- PBI Collaborators: NCEAS, University of New Mexico (Long Term Ecological Research Network Office), San Diego Supercomputer Center, University of Kansas (Center for Biodiversity Research)
- Kepler contributors: SEEK, Ptolemy II, SDM/SciDAC, GEON