Pegasus and Condor Gaurang Mehta, Ewa Deelman, Carl Kesselman, Karan Vahi Center For Grid Technologies USC/ISI.


1 Pegasus and Condor Gaurang Mehta, Ewa Deelman, Carl Kesselman, Karan Vahi Center For Grid Technologies USC/ISI

2 PEGASUS Pegasus – Planning for Execution in Grid
Pegasus is a configurable system that can plan, schedule and execute complex workflows on the Grid, using algorithmic and AI-based techniques.
Pegasus takes an abstract workflow as input. The abstract workflow describes the transformations and data in terms of their logical names.
Pegasus then queries the Replica Location Service (RLS) for any data that has already been materialized.
If derived data already exists, it is reused and the workflow is reduced accordingly.
11/20/2018 Condor-Week

3 Workflow Reduction
[Figure: the original abstract workflow (jobs E1, E2, E3; files f.a1, f.a2, f.b, f.c, f.d) beside the reduced workflow (job E3; files f.b, f.c, f.d). f.b and f.c exist in RLS. Legend: execution nodes, transfer nodes, registration nodes.]
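The reduction step can be sketched in a few lines of Python. This is a minimal illustration, not Pegasus code: the `Job` tuple, the `reduce_workflow` function, and the use of a plain set as a stand-in for RLS are all hypothetical, and the file dependencies (E1 makes f.b from f.a1, E2 makes f.c from f.a2, E3 makes f.d from f.b and f.c) are an assumed reading of the figure.

```python
from collections import namedtuple

# Hypothetical job record: logical name plus logical input and output files.
Job = namedtuple("Job", ["name", "inputs", "outputs"])


def reduce_workflow(jobs, requested, replica_catalog):
    """Return the names of the jobs that still have to run.

    jobs            -- iterable of Job (the abstract workflow)
    requested       -- logical files the user wants produced
    replica_catalog -- set of logical files already materialized
                       (a stand-in for an RLS lookup)
    """
    producer = {lfn: job for job in jobs for lfn in job.outputs}
    needed, seen = set(), set()
    stack = list(requested)
    while stack:
        lfn = stack.pop()
        if lfn in seen:
            continue
        seen.add(lfn)
        # Files that already exist, or raw inputs with no producer,
        # do not require any job to run.
        if lfn in replica_catalog or lfn not in producer:
            continue
        job = producer[lfn]
        if job.name not in needed:
            needed.add(job.name)
            stack.extend(job.inputs)  # walk backwards through the DAG
    return needed


# Assumed structure of the example workflow.
jobs = [
    Job("E1", ["f.a1"], ["f.b"]),
    Job("E2", ["f.a2"], ["f.c"]),
    Job("E3", ["f.b", "f.c"], ["f.d"]),
]
reduced = reduce_workflow(jobs, ["f.d"], {"f.a1", "f.a2", "f.b", "f.c"})
print(sorted(reduced))  # ['E3']
```

Walking backwards from the requested output means a job is kept only if some file it produces is actually missing, which is why E1 and E2 drop out once f.b and f.c are found in the catalog.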

4 Pegasus (Cont)
Pegasus then:
Locates physical locations for both components (transformations and data), using the Globus Replica Location Service (RLS) and the Transformation Catalog (TC).
Finds appropriate resources to execute on, via the Globus Monitoring and Discovery Service (MDS).
Adds stage-in jobs to transfer raw and materialized input files to the computation sites.
Adds stage-out jobs to transfer derived data to the user-selected storage location. Both input and output staging are done with Globus GridFTP.
Publishes newly derived data products for reuse, in RLS and the Chimera virtual data catalog (VDC).

5 Workflow Modification
[Figure: the reduced workflow (job E3; files f.b, f.c, f.d) beside the final DAG (nodes T1, T2, E3, T3, R1; files f.b, f.c, f.d). Legend: execution nodes, transfer nodes, registration nodes.]
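As a rough sketch of how transfer and registration nodes might be wrapped around a reduced workflow, the Python below adds one stage-in node per external input, and one stage-out plus one registration node per final output. The function name, the node-naming scheme, the site names and the gsiftp URLs are all hypothetical, and the assumption that T1 and T2 stage in f.b and f.c while T3 and R1 handle f.d is one reading of the figure.

```python
def add_staging_and_registration(jobs, replicas, exec_site, output_site):
    """Return (nodes, edges) of a concrete DAG.

    jobs        -- list of (name, inputs, outputs) from the reduced workflow
    replicas    -- logical file name -> physical URL (stand-in for RLS results)
    exec_site   -- site where the jobs run (hypothetical single-site plan)
    output_site -- user-selected storage location
    """
    producer = {lfn: name for name, _, outs in jobs for lfn in outs}
    consumed = {lfn for _, ins, _ in jobs for lfn in ins}
    nodes, edges, stage_in = {}, [], {}
    counters = {"T": 0, "R": 0}

    def new_node(prefix, description):
        counters[prefix] += 1
        node_id = f"{prefix}{counters[prefix]}"
        nodes[node_id] = description
        return node_id

    for name, _, _ in jobs:
        nodes[name] = f"execute {name} at {exec_site}"

    for name, inputs, outputs in jobs:
        for lfn in inputs:
            if lfn in producer:
                # Produced inside the workflow: plain job-to-job dependency.
                edges.append((producer[lfn], name))
            else:
                # External input: stage it in once, reuse the node afterwards.
                if lfn not in stage_in:
                    stage_in[lfn] = new_node(
                        "T", f"stage in {lfn}: {replicas[lfn]} -> {exec_site}")
                edges.append((stage_in[lfn], name))
        for lfn in outputs:
            if lfn in consumed:
                continue  # intermediate file; this sketch does not stage it out
            t = new_node("T", f"stage out {lfn}: {exec_site} -> {output_site}")
            r = new_node("R", f"register {lfn} at {output_site}")
            edges.append((name, t))
            edges.append((t, r))
    return nodes, edges


nodes, edges = add_staging_and_registration(
    jobs=[("E3", ["f.b", "f.c"], ["f.d"])],
    replicas={"f.b": "gsiftp://siteA.example.org/data/f.b",
              "f.c": "gsiftp://siteB.example.org/data/f.c"},
    exec_site="siteX",
    output_site="siteY",
)
for node_id, description in nodes.items():
    print(node_id, "-", description)
print(edges)
```

For the single-job workflow above, this yields the five nodes T1, T2, E3, T3 and R1, with the two stage-in nodes feeding E3 and the stage-out node preceding registration.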

6 Pegasus (Cont)
Pegasus generates the concrete workflow in Condor DAGMan format and submits it to DAGMan/Condor-G for execution on the Grid.
These concrete DAGs contain the concrete location of the data and the site where each computation is to be performed.
Condor-G submits the jobs via Globus GRAM to remote schedulers running Condor, PBS, LSF and Sun Grid Engine.
Pegasus is part of the Virtual Data System (VDS), a software package distributed by GriPhyN.
VDS-1.2.3 (Pegasus + Chimera) is currently included in the Virtual Data Toolkit (VDT).
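To make the DAGMan format concrete, here is a minimal Python sketch that writes a DAG description using DAGMan's `JOB` and `PARENT ... CHILD` statements. The `write_dag` function, the node names and the `<node>.sub` submit-file naming are assumptions for illustration; the files Pegasus actually generates are not shown in the slides.

```python
def write_dag(nodes, edges):
    """Render a DAGMan input file as a string.

    nodes -- node names in the order they should be listed
    edges -- (parent, child) pairs
    Each node is assumed to have a submit file named <node>.sub (hypothetical).
    """
    lines = [f"JOB {node} {node}.sub" for node in nodes]
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    for parent, kids in children.items():
        lines.append(f"PARENT {parent} CHILD {' '.join(kids)}")
    return "\n".join(lines) + "\n"


dag_text = write_dag(
    ["T1", "T2", "E3", "T3", "R1"],
    [("T1", "E3"), ("T2", "E3"), ("E3", "T3"), ("T3", "R1")],
)
print(dag_text)
```

A file like this would typically be handed to DAGMan with `condor_submit_dag`, with each submit file describing how Condor-G should run that node on the remote resource.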

7 Workflow Construction
[Figure: workflow construction pipeline. Component labels: VDL, CHIMERA, VDC, VDLX, DAX, USER SUPPLIED DAX, PEGASUS, TC, MDS, RLS, Dag/Submit Files, DAGMAN, CONDOR-G, GRID.]

8 Current System

9 Deferred Planning in Pegasus
The current Pegasus implementation plans the entire workflow before submitting it for execution (full-ahead planning).
Grids are very dynamic, and resources come and go frequently.
Support is being added for deferred planning, in which only part of the workflow is planned and executed at a time:
Chop the abstract workflow into partitions.
Plan one partition and submit it to DAGMan/Condor-G.
The last job in the partition calls Pegasus again to plan the next partition, and so on.
Initial partitioning will be level-based, using breadth-first search.
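One plausible reading of level-based partitioning is sketched below in Python: a breadth-first traversal from the root jobs assigns each job a level (the length of its longest path from a root), and each level becomes one partition. The `partition_by_level` function and the example job names are hypothetical, and the slides do not say exactly how Pegasus computes its levels.

```python
from collections import defaultdict, deque


def partition_by_level(nodes, edges):
    """Group the jobs of an acyclic workflow into level-based partitions.

    nodes -- job names
    edges -- (parent, child) dependencies
    Returns a list of partitions; every job's parents sit in earlier partitions.
    """
    children = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Root jobs (no parents) form level 0.
    level = {n: 0 for n in nodes if indegree[n] == 0}
    queue = deque(level)
    while queue:
        node = queue.popleft()
        for child in children[node]:
            # A job sits one level below its deepest parent.
            level[child] = max(level.get(child, 0), level[node] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:  # all parents visited
                queue.append(child)

    partitions = defaultdict(list)
    for n in nodes:
        partitions[level[n]].append(n)
    return [partitions[lvl] for lvl in sorted(partitions)]


parts = partition_by_level(
    ["E1", "E2", "E3", "E4"],
    [("E1", "E3"), ("E2", "E3"), ("E3", "E4"), ("E1", "E4")],
)
print(parts)  # [['E1', 'E2'], ['E3'], ['E4']]
```

In a deferred-planning loop, each partition would be planned and submitted only after the previous one has finished, so that resource choices reflect the state of the Grid at that moment.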

10 Incremental Refinement
Partition the abstract workflow into partial workflows.

11 Meta-DAGMan

12 Current Condor Technologies Used
DAGMan, to manage the dependencies in the acyclic workflow. It also supports resuming a failed workflow using the rescue DAG that DAGMan generates.
Condor-G, to submit jobs to the Grid (globus-jobmanager). Jobs are submitted using Globus GRAM, and stdin/stdout/stderr are streamed using Globus GASS.
Condor, as a scheduler to harness idle CPU cycles on existing desktops. ISI has a small 36-node Condor pool consisting primarily of Linux and Solaris machines.

13 Future Condor Technologies to be Integrated
NeST
We are looking at integrating support for NeST, which allows disk space reservation on remote sites.
Stork (Data Placement Scheduler)
Supports multiple transfer protocols (ftp, http, nest/chirp, gsiftp, srb, file).
Reliably transfers files across the Grid.

14 Applications Using Pegasus and Condor DAGMan
GriPhyN Experiments
  Laser Interferometer Gravitational Wave Observatory (Caltech/UWM)
  ATLAS (U of Chicago)
  SDSS (Fermilab)
  Also IVDGL/GRID3
National Virtual Observatory and NASA
  Montage
Biology
  BLAST (ANL, PDQ-funded)
Neuroscience
  Tomography for Telescience (SDSC, NIH-funded)

15 A small Montage workflow
1202 nodes

16 Pegasus Acknowledgements
Ewa Deelman, Carl Kesselman, Gaurang Mehta, Karan Vahi, Mei-Hui Su, Saurabh Khurana, Sonal Patil, Gurmeet Singh (Center for Grid Computing, ISI)
James Blythe, Yolanda Gil (Intelligent Systems Division, ISI)
Collaboration with Miron Livny and the Condor Team (UW Madison)
Collaboration with Mike Wilde, Jens Voeckler (UofC) - Chimera
Research funded as part of the NSF GriPhyN, NVO and SCEC projects and EU-funded GridLab
For more information
Contacts: deelman , gmehta ,

