WINGS/Pegasus Provenance Challenge
Ewa Deelman, Yolanda Gil, Jihie Kim, Gaurang Mehta, Varun Ratnakar
USC Information Sciences Institute

2 WINGS/Pegasus: Workflow Instance Generation and Selection
[Diagram; recoverable labels follow.]
Stages: Workflow Creation, Workflow Selection, Workflow Template, Data Selection, Workflow Instance, Component Specification, Executable Workflow
Resources: Workflow Libraries, Data Repositories, Application Components, Ontologies (domain terms, component types, workflow products)
Data: preexisting data collections, workflow execution results
Systems: WINGS, Pegasus, DAGMan/Globus, (OWL)
Users: scientist, expert scientist, scientist researching new models
Example requests:
- “Show me workflows that generate hazard maps”
- “Run that with the USGS data set”
- “Validate this workflow based on the component specs”
- “Here is a new wave propagation model, takes in a series of fault ruptures, is compiled for MPI”
Notes:
- Workflow templates specify complex analysis sequences
- Workflow instances specify data
- Specifies data requirements
- Specifies execution requirements

3 Workflow Template
[Figure labels: Collections, Computational nodes]

4 Workflow Instance

5 Executable Workflow

6 Metadata Constraints (in OWL ontology)
- Constraints on files
  - Metadata attributes: data types and default values
- Constraints on collections and collections of collections
  - Type of each element
  - Relations between metadata of a collection and metadata of individual items
- Component-level constraints on metadata attributes of input/output files or collections
  - Deriving metadata of output files from metadata of input files
- Template-level constraints on metadata attributes of files or collections
  - Input/output files of different components can have the same metadata
  - Checking number of items in collections
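
The constraint kinds listed on this slide can be illustrated with a minimal Python sketch. It is not the WINGS implementation, which the slide says expresses these constraints in an OWL ontology; the classes `FileItem` and `Collection`, the helper functions, and the sample data are all hypothetical and stand in for the ontology-based machinery.

```python
from dataclasses import dataclass, field


@dataclass
class FileItem:
    name: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Collection:
    name: str
    items: list
    metadata: dict = field(default_factory=dict)


def check_collection(coll, shared_keys, expected_count=None):
    """Return a list of constraint violations for one collection (hypothetical helper)."""
    problems = []
    # Relation between collection metadata and item metadata:
    # every item must carry the collection's value for each shared key.
    for item in coll.items:
        for key in shared_keys:
            if item.metadata.get(key) != coll.metadata.get(key):
                problems.append(f"{item.name}: {key} differs from collection {coll.name}")
    # Check on the number of items in the collection.
    if expected_count is not None and len(coll.items) != expected_count:
        problems.append(
            f"{coll.name}: expected {expected_count} items, found {len(coll.items)}"
        )
    return problems


def derive_output_metadata(input_file, propagated_keys):
    """Component-level rule: copy selected input metadata onto the output file."""
    return {k: input_file.metadata[k] for k in propagated_keys if k in input_file.metadata}


# Hypothetical sample data; the second file is deliberately inconsistent.
images = Collection(
    "AnaImages_P112",
    [
        FileItem("img1", {"hasPatientID": "112"}),
        FileItem("img2", {"hasPatientID": "113"}),
    ],
    {"hasPatientID": "112"},
)
problems = check_collection(images, ["hasPatientID"], expected_count=2)
warped_metadata = derive_output_metadata(images.items[0], ["hasPatientID"])
```

Here `problems` reports the one file whose patient ID disagrees with its collection, and `warped_metadata` shows an output inheriting the patient ID of its input.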

7 Provenance records
[Diagram; recoverable labels follow.]
Stages: Workflow Creation, Workflow Selection, Workflow Template, Data Selection, Workflow Instance, Component Specification, Executable Workflow
Resources: Workflow Libraries, Data Repositories, Application Components, Ontologies (domain terms, component types, workflow products)
Data: preexisting data collections, workflow execution results
Systems: WINGS, Pegasus, DAGMan/Globus, (OWL), VDS PTC
Users: scientist, expert scientist
Example requests:
- “Show me workflows that generate hazard maps”
- “Run that with the USGS data set”
Notes:
- Workflow templates specify complex analysis sequences
- Workflow instances specify data
- Specifies data requirements
- Specifies execution requirements

8 Queries answered
Keys to provenance:
- Capturing the correct metadata and propagating it through the template and instance
- Capturing runtime information
Used SPARQL (with scripting) and SQL to pose queries:
- Queries 1, 2, 5, 6, 8: query to File and Workflow Instance Ontologies
- Query 4: query to the VDS PTC
- Queries 3, 7, 9: lack of time
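
To make the SQL side of this concrete, here is a small sketch of a runtime-provenance query using Python's built-in `sqlite3` module. The slides do not show the VDS PTC schema, so the `invocation` table, its columns, and the sample rows are assumptions made purely for illustration; only the general idea (posing SQL queries over captured runtime information) comes from the slide.

```python
import sqlite3

# Hypothetical stand-in for a runtime provenance catalog.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invocation ("
    " id INTEGER PRIMARY KEY,"
    " transformation TEXT,"
    " start_time TEXT)"
)
conn.executemany(
    "INSERT INTO invocation (transformation, start_time) VALUES (?, ?)",
    [
        ("align_warp", "2006-09-11 10:15:00"),  # a Monday
        ("align_warp", "2006-09-12 09:00:00"),  # a Tuesday
        ("reslice", "2006-09-11 10:20:00"),     # a Monday
    ],
)


def invocations_on_weekday(conn, transformation, weekday):
    """Find invocations of one transformation that started on a given weekday.

    weekday follows SQLite's strftime('%w'): '0' is Sunday, '6' is Saturday.
    """
    return conn.execute(
        "SELECT id, start_time FROM invocation "
        "WHERE transformation = ? AND strftime('%w', start_time) = ? "
        "ORDER BY id",
        (transformation, weekday),
    ).fetchall()


monday_runs = invocations_on_weekday(conn, "align_warp", "1")
```

With the sample rows above, `monday_runs` contains only the first `align_warp` invocation.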

9 Constraints on Nested Collections
[Ontology diagram; recoverable labels follow.]
Layers: domain independent definitions, domain dependent definitions, Skolem instance definitions, example files and collections
Domain independent: Collection, FileCollection, File, CollectionList, FileList
Domain dependent: AnatomyImagesOfPatient, AnatomyImagesOfPatientInPeriod, AnatomyImageFile
Properties: hasType, hasItems, CollOf, hasPatientID, hasPeriodID, hasTimePeriodID, hasIndexID
Metadata types and values: Metadata:String, Metadata:Int, PatientID1, PeriodID1, IndexID1
Skolem instances: CC-AnatomyImages-Skolem, C-AnatomyImages-Skolem, AnatomyImage-Skolem
Example collections: CC-AnaImages-for-Patient112, C-AnaImages_P112_p1, C-AnaImages_P112_p2, C-AnaImages_P112_p12
Example files: part3, 127_6.part2, img112_1.part1, 112_2.par3, 112-2.part2, img112-2.part1, 112_12.part5, 112_12.part2, img112_12.part1, …
Annotations:
- Constraints on collection element types
- Metadata constraints on collections and their elements

10 Component level constraints on metadata attributes of input/output files or collections
[Ontology diagram; recoverable labels follow.]
Component: Align_Warp, Component Type, Align_Warp_Skolem
Inputs and outputs: hasInputs, hasOutputs, FileOrCollection List, Align_Warp_Inputs, Align_Warp_Outputs
Files: AnatomyImage1, AnatomyHeader1, WarpParamFile1
Metadata: hasIndexID, Anatomy_IndexID1, hasPatiendID, PatientID1
Annotations:
- Constraints on the types of input and output files and collections
- Metadata constraints on input and output files

11 Template level (global) constraints on metadata attributes of files or collections
[Ontology diagram; recoverable labels follow.]
Template: fMRI Template1
Links (hasLink): InputLink_XYZInputFile_to_Convert, InputLink_ReslicedImage_to_Softmean, InputLink_AnatomyImages_to_Align_Warp
Files and collections (hasFile): XYZInputFile1, Collection_AnatomyImage1, Collection_ReslicedImages1
Metadata: hasPatientID, PatientID1, hasN_Items, N_Images
Annotations:
- Constraints on number of elements in different collections
- Metadata constraints on files/collections of different components

12 Refinement provenance (in design)
- We consider the provenance not only of the executing application but also of the refinement process that maps an abstract workflow (workflow instance) onto a set of resources
- The refinement process can be multi-staged
- Stages of the refinement can execute on a variety of resources
- We capture provenance of the entire workflow as well as of its constituents
- The representations of the refinement provenance and of the workflow provenance are uniform

13 Original Workflow
[Figure label: Workflow 1]

14 1st executable partition mapped onto resources

15 Chain of Refinement and Execution Steps

16 Definition of refinement and execution provenance
[[I/O] data input/output [function performed] [performance info] [optional annotations]]
Could include a justification of the reasons for the tasks performed
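
One way to read this record definition is as a single structure shared by refinement steps and execution steps, so that a chain of them can be walked backwards from a result. The sketch below assumes that reading; the `ProvenanceRecord` class, the `lineage` helper, and the step and data names are hypothetical and are not taken from Pegasus.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceRecord:
    inputs: list                                      # [I]   data consumed by the step
    outputs: list                                     # [O]   data produced by the step
    function: str                                     # function performed
    performance: dict = field(default_factory=dict)   # performance info
    annotations: list = field(default_factory=list)   # optional annotations


def lineage(records, data_item):
    """Collect the records behind data_item, starting from the step that produced it."""
    produced_by = {out: rec for rec in records for out in rec.outputs}
    chain, frontier, seen = [], [data_item], set()
    while frontier:
        rec = produced_by.get(frontier.pop())
        if rec is None or id(rec) in seen:
            continue  # an original input, or a step already visited
        seen.add(id(rec))
        chain.append(rec)
        frontier.extend(rec.inputs)
    return chain


# Hypothetical chain: two refinement steps followed by one execution step.
records = [
    ProvenanceRecord(["workflow_instance"], ["partition_1"], "partition",
                     annotations=["justification: first executable partition"]),
    ProvenanceRecord(["partition_1"], ["executable_1"], "map to resources"),
    ProvenanceRecord(["executable_1"], ["result_1"], "execute",
                     performance={"runtime_s": 42}),
]
steps = [rec.function for rec in lineage(records, "result_1")]
```

Because refinement and execution share one record shape, the same `lineage` walk covers both, which is the uniformity the deck describes.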

17 Provenance records relating to the refinement process
[I:[ O: ; ] [ ][ ] [ ]
[I:[ O: ] [ ][ ][ ]
[I: O: ] [ ][ ][ ]
[I:[ O: ] [ ][][]
[[I: ] [O: ] [ (could be in the form of a DAX (XML-DAG used by Pegasus)), ] [ …..][]]
[I:, O: ] [ …][]]
Thanks to Luc Moreau for his input!

