GRADD: Scientific Workflows
Scientific Workflow E. Science laboris
– Workflows are the new rock and roll of eScience
– Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources
– Era of service-oriented apps (SOA)
– Repetitive, mundane tasks made easier (data cleaning...)
– Facilitates sharing of science
Trident Scientific Workflow Workbench
– Visually program workflows through a web browser
– Libraries of activities, workflows and services
  – Social annotations and search
– Abstract parallelism for HPC and many-core (CCR)
– Adaptive workflows, to detect and respond to events
– Automatic provenance capture, Open Provenance Model
– Costing model; resources include time, power, data transfer
– Integrated data storage and access
– Integrated visualization tools
– Fault tolerance; facilitates smart reruns and what-if analysis
– Factory scheduling of workflows
Trident Implementation
– Built on top of an industrial workflow engine: Windows Workflow Foundation
  – Workflow in a general-purpose framework
  – Part of Microsoft's .NET Framework 3.5
Trident Logical Architecture
[Diagram] Components shown: Domain-specific custom activities, Visual Workflow Designer, Runtime Services, Provenance, Fault Tolerance, HPC Scheduling Service, Monitoring Service, Registry, Runtime, Admin Tools, Community Site
Activities: An Extensible Approach
– Base Activity Library: out-of-box (OOB) activities, workflow types, general-purpose basic workflow constructs
– Custom Activity Libraries: create/extend/compose activities (read from sensors, data pipelines, etc.); first-class citizens
– Domain-Specific Workflow Packages: domain-specific activities and workflow packages, e.g. RosettaNet, CRM, Biology, Oceanography
– Diagram labels: Out-of-Box Activities, Extend activity (Read from Sensor), Compose activities
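The layering on this slide (base activities, extended custom activities, composed domain packages) can be sketched in a few lines. This is an illustrative Python sketch only, not Trident or Windows Workflow Foundation code: the names Activity, Sequence, ReadFromSensor and Calibrate are hypothetical, and the sensor readings are made-up values.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class Activity(ABC):
    """Hypothetical base activity: one unit of work in a workflow."""

    @abstractmethod
    def execute(self, data: Any) -> Any:
        ...


class Sequence(Activity):
    """General-purpose construct: runs child activities in order."""

    def __init__(self, *children: Activity) -> None:
        self.children: List[Activity] = list(children)

    def execute(self, data: Any) -> Any:
        # The output of each child becomes the input of the next one.
        for child in self.children:
            data = child.execute(data)
        return data


class ReadFromSensor(Activity):
    """Custom activity; a real one would talk to an instrument."""

    def __init__(self, readings: List[float]) -> None:
        self.readings = readings  # stand-in for a live sensor feed

    def execute(self, data: Any) -> List[float]:
        return list(self.readings)


class Calibrate(Activity):
    """Domain-specific activity: applies a constant offset."""

    def __init__(self, offset: float) -> None:
        self.offset = offset

    def execute(self, data: List[float]) -> List[float]:
        return [value + self.offset for value in data]


# Compose activities into a small (hypothetical) oceanography pipeline.
pipeline = Sequence(ReadFromSensor([10.0, 12.5]), Calibrate(offset=0.5))
result = pipeline.execute(None)
print(result)
```

Because Sequence is itself an Activity, a composed pipeline can be nested inside another one, which is one way to read "first-class citizens" here.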
Trident Workflow Designer
– Visually compose, search and archive (share) workflows
Workflow Execution Provenance
– For a workflow management system, provenance identifies what activities were executed, parameters supplied at runtime, data passed between activities, intermediate results generated, etc.
– Explains how a workflow result was created, sufficient to establish trust
– Provides a replication recipe
– Guides development of future experiments
– Scientists routinely record the provenance of bench experiments in lab notebooks; this is essential for computational experiments as well
Provenance in Trident
– Enactment engine documents all steps linking original inputs with the final result, so execution can be verified, reproduced or rerun; provenance is a first-class data product in Trident
– Provenance capture is automatic and transparent
– Will persist provenance data for a fixed period of time
– Supports multiple levels of representation
– Storage provided by the underlying system
– Interface to query and reason over provenance data
– Efficient storage representation and query performance
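To make "automatic and transparent" capture concrete, the sketch below shows one way an engine could record each step as it runs. It is a minimal Python illustration under stated assumptions, not Trident's implementation: run_with_provenance, the record fields, and the two toy activities (scale, total) are all hypothetical, and the log lives in memory rather than in a real provenance store.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# A step is (activity name, callable, runtime parameters). Hypothetical shape.
Step = Tuple[str, Callable[..., Any], Dict[str, Any]]


def run_with_provenance(
    steps: List[Step], initial: Any
) -> Tuple[Any, List[Dict[str, Any]]]:
    """Run steps in order, appending one provenance record per step."""
    log: List[Dict[str, Any]] = []
    data = initial
    for name, func, params in steps:
        started = time.time()
        output = func(data, **params)
        log.append({
            "activity": name,       # what was executed
            "parameters": params,   # parameters supplied at runtime
            "input": data,          # data passed in from the previous step
            "output": output,       # intermediate result generated
            "started": started,
        })
        data = output
    return data, log


def scale(values: List[int], factor: int) -> List[int]:
    return [v * factor for v in values]


def total(values: List[int]) -> int:
    return sum(values)


steps = [("scale", scale, {"factor": 2}), ("total", total, {})]
result, provenance = run_with_provenance(steps, [1, 2, 3])

# A simple query over the log: which activities ran, with what parameters?
executed = [(rec["activity"], rec["parameters"]) for rec in provenance]
print(result)
print(executed)
```

The activities themselves contain no logging code; the runner records everything, which is the sense in which capture is transparent to the activity author. Each record's output matches the next record's input, so the chain from original inputs to final result can be followed step by step.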
Applications and Scientists need a Curated Registry of Services
– Just having a workflow system isn't enough, and it's not just about workflows...
– Note: Registry, not repository; services are hosted elsewhere
– Trident Registry
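The "registry, not repository" distinction can be illustrated with a small sketch: the registry stores only metadata and a pointer to where each service is hosted, never the service itself. This Python example is hypothetical throughout (ServiceEntry, Registry, the approve step standing in for curation, and the example.org endpoints are all assumptions made for illustration) and does not reflect the actual Trident Registry interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceEntry:
    """Metadata about a service; the service itself is hosted elsewhere."""
    name: str
    endpoint: str                      # pointer to the remote host
    tags: List[str] = field(default_factory=list)
    curated: bool = False              # set once a curator approves it


class Registry:
    """Hypothetical curated registry: stores entries, not services."""

    def __init__(self) -> None:
        self._entries: Dict[str, ServiceEntry] = {}

    def register(self, entry: ServiceEntry) -> None:
        self._entries[entry.name] = entry

    def approve(self, name: str) -> None:
        self._entries[name].curated = True

    def search(self, tag: str) -> List[ServiceEntry]:
        # Only curated entries are visible to searches.
        return [
            e for e in self._entries.values()
            if e.curated and tag in e.tags
        ]


registry = Registry()
registry.register(
    ServiceEntry("tide-model", "https://example.org/tide", ["oceanography"])
)
registry.register(
    ServiceEntry("draft-model", "https://example.org/draft", ["oceanography"])
)
registry.approve("tide-model")

matches = registry.search("oceanography")
print([(e.name, e.endpoint) for e in matches])
```

A workflow would look a service up by tag or name and then call the returned endpoint directly; the registry never proxies or stores the service code.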