Pegasus and Condor Gaurang Mehta, Ewa Deelman, Carl Kesselman, Karan Vahi Center For Grid Technologies USC/ISI.

PEGASUS
Pegasus – Planning for Execution in Grids. Pegasus is a configurable system that can plan, schedule, and execute complex workflows on the Grid, using algorithmic and AI-based techniques. It takes an abstract workflow as input; the abstract workflow describes transformations and data in terms of their logical names. Pegasus then queries the Replica Location Service (RLS) for any already-materialized data. If derived data already exists, it is reused and the workflow is reduced accordingly.

Workflow Reduction
[Diagram: the original abstract workflow (execution nodes E1, E2, E3 operating on files f.a1, f.a2, f.b, f.c, f.d) is reduced to E3 alone because f.b and f.c already exist in the RLS; the legend distinguishes execution, transfer, and registration nodes.]
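To make the reduction step concrete, here is a minimal Python sketch of the idea; it is not Pegasus's actual implementation, and the job dictionary, file names, and the `reduce_workflow` helper are invented for this example:

```python
def reduce_workflow(jobs, existing, requested):
    """Toy version of Pegasus-style workflow reduction.

    jobs      -- dict: job name -> {"inputs": set of logical files,
                                     "outputs": set of logical files}
    existing  -- logical files already registered (e.g. found in the RLS)
    requested -- logical files the user ultimately wants

    Returns the names of jobs that still need to run.
    """
    # Work backwards from the requested outputs: a file must be produced
    # only if it is wanted (or needed by a kept job) and not materialized.
    producers = {f: name for name, job in jobs.items() for f in job["outputs"]}
    needed = {f for f in requested if f not in existing}
    kept = set()
    while needed:
        f = needed.pop()
        job_name = producers.get(f)
        if job_name is None or job_name in kept:
            continue
        kept.add(job_name)
        # The kept job's inputs must in turn be present or produced upstream.
        needed |= {g for g in jobs[job_name]["inputs"] if g not in existing}
    return kept

# The example from the "Workflow Reduction" slide: f.b and f.c already exist,
# so only E3 remains in the reduced workflow.
jobs = {
    "E1": {"inputs": {"f.a1"}, "outputs": {"f.b"}},
    "E2": {"inputs": {"f.a2"}, "outputs": {"f.c"}},
    "E3": {"inputs": {"f.b", "f.c"}, "outputs": {"f.d"}},
}
print(reduce_workflow(jobs,
                      existing={"f.a1", "f.a2", "f.b", "f.c"},
                      requested={"f.d"}))
# -> {'E3'}
```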

Pegasus (cont.)
Pegasus then locates physical locations for both components (transformations and data), using the Globus Replica Location Service (RLS) and the Transformation Catalog (TC). It finds appropriate execution resources via the Globus Monitoring and Discovery Service (MDS). It adds stage-in jobs to transfer raw and materialized input files to the computation sites, and stage-out jobs to transfer derived data to the user-selected storage location; both input and output staging are done with Globus GridFTP. Finally, it publishes newly derived data products for reuse in the RLS and the Chimera Virtual Data Catalog (VDC).

Workflow Modification
[Diagram: the reduced workflow (E3 with f.b, f.c, f.d) becomes the final DAG once transfer nodes T1, T2, T3 and registration node R1 are added; the legend distinguishes execution, transfer, and registration nodes.]
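A similarly minimal sketch of the modification step (again illustrative, not the Pegasus planner; `add_staging` and the node names are hypothetical): each compute job in the reduced workflow is preceded by transfer nodes for inputs that already exist elsewhere, followed by a stage-out node for derived data the user wants kept, and a registration node that records the new replica.

```python
def add_staging(kept_jobs, jobs, existing, requested, storage_site):
    """Wrap compute jobs with transfer (T) and registration (R) nodes.

    Returns a list of (node, depends_on) pairs describing the final DAG.
    Illustrative only -- the real planner emits Condor/DAGMan submit files.
    """
    dag = []
    produced_here = {f for k in kept_jobs for f in jobs[k]["outputs"]}
    for name in kept_jobs:
        job = jobs[name]
        stage_in = sorted(f for f in job["inputs"] - produced_here if f in existing)
        for f in stage_in:
            dag.append((f"T_in_{f}", []))                     # transfer-in node
        dag.append((name, [f"T_in_{f}" for f in stage_in]))   # execution node
        for f in sorted(job["outputs"] & requested):
            t_out = f"T_out_{f}_to_{storage_site}"
            dag.append((t_out, [name]))                       # stage-out node
            dag.append((f"R_{f}", [t_out]))                   # registration node (RLS)
    return dag

# Continuing the slide's example: E3 gets stage-in nodes for f.b and f.c,
# a stage-out node for f.d, and a registration node for the new replica.
jobs = {"E3": {"inputs": {"f.b", "f.c"}, "outputs": {"f.d"}}}
for node, deps in add_staging({"E3"}, jobs, existing={"f.b", "f.c"},
                              requested={"f.d"}, storage_site="user_storage"):
    print(node, "<-", deps)
```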

Pegasus (cont.)
Pegasus generates the concrete workflow in Condor DAGMan format and submits it to DAGMan/Condor-G for execution on the Grid. These concrete DAGs contain the concrete locations of the data and the sites where the computations are to be performed. Condor-G submits the jobs via Globus GRAM to remote schedulers running Condor, PBS, LSF, and Sun Grid Engine. Pegasus is part of a software package distributed by GriPhyN called the Virtual Data System (VDS); VDS 1.2.3 (Pegasus + Chimera) is currently included in the Virtual Data Toolkit (VDT) 1.1.13.
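For readers unfamiliar with the DAGMan/Condor-G hand-off, the sketch below writes the two kinds of files involved: one Condor submit description per job using the globus universe, and a .dag file naming the jobs and their PARENT/CHILD dependencies. The paths, site gatekeeper, and helper function are assumptions made for illustration; the files Pegasus actually emits carry many more attributes.

```python
import os

def write_concrete_workflow(dag, out_dir, site_gatekeeper):
    """Write a DAGMan .dag file plus one Condor-G submit file per node.

    dag is a list of (node_name, executable, dependencies) tuples; the
    submit-file attributes shown are only the minimal globus-universe ones.
    """
    os.makedirs(out_dir, exist_ok=True)
    dag_lines = []
    for name, executable, deps in dag:
        with open(os.path.join(out_dir, f"{name}.sub"), "w") as f:
            f.write(
                f"universe = globus\n"
                f"globusscheduler = {site_gatekeeper}\n"   # e.g. host/jobmanager-pbs
                f"executable = {executable}\n"
                f"output = {name}.out\n"
                f"error = {name}.err\n"
                f"log = {name}.log\n"
                "queue\n")
        dag_lines.append(f"JOB {name} {name}.sub")
        for parent in deps:
            dag_lines.append(f"PARENT {parent} CHILD {name}")
    with open(os.path.join(out_dir, "workflow.dag"), "w") as f:
        f.write("\n".join(dag_lines) + "\n")

# Hypothetical usage; the resulting workflow.dag would be handed to
# condor_submit_dag for execution through Condor-G and Globus GRAM.
write_concrete_workflow(
    [("T_in_fb", "/usr/bin/globus-url-copy", []),
     ("E3", "/opt/montage/bin/mProject", ["T_in_fb"])],
    out_dir="concrete_wf",
    site_gatekeeper="cluster.example.edu/jobmanager-pbs")
```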

Workflow Construction
[Architecture diagram: a user-supplied VDL/VDLX description is processed by Chimera and the Virtual Data Catalog (VDC) into a DAX, or the user supplies a DAX directly; Pegasus consults the Transformation Catalog (TC), MDS, and RLS to turn the DAX into DAG/submit files, which DAGMan and Condor-G execute on the Grid.]

Current System

Deferred Planning in Pegasus
The current Pegasus implementation plans the entire workflow before submitting it for execution (full-ahead planning). Grids, however, are highly dynamic, and resources come and go frequently. We are therefore adding support for deferred planning, in which only part of the workflow is planned and executed at a time: the abstract workflow is chopped into partitions, one partition is planned and submitted to DAGMan/Condor-G, and the last job in that partition calls Pegasus again to plan the next partition, and so on. Initial partitioning will be level-based, derived from a breadth-first traversal of the workflow.
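The level-based partitioning can be pictured with a short sketch (illustrative only, not Pegasus's partitioner; the dependency dictionary and `level_partition` are invented here): jobs with no unplanned parents form level 0, their children level 1, and so on.

```python
def level_partition(deps):
    """Partition a DAG into levels for deferred planning.

    deps maps each job to the set of jobs it depends on. The result is a
    breadth-first / topological levelling of the abstract workflow.
    """
    remaining = dict(deps)
    levels = []
    while remaining:
        ready = [j for j, parents in remaining.items()
                 if not (parents & remaining.keys())]
        if not ready:
            raise ValueError("cycle detected; workflow must be acyclic")
        levels.append(ready)
        for j in ready:
            del remaining[j]
    return levels

# A and B have no parents (level 0); C depends on both (level 1); D depends
# on C (level 2). Each level would be planned and submitted in turn, with the
# last job of one partition invoking Pegasus on the next.
print(level_partition({"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}))
# -> [['A', 'B'], ['C'], ['D']]
```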

Incremental Refinement
Partition the abstract workflow into partial workflows.

Meta-DAGMan

Current Condor Technologies Used
DAGMan manages the dependencies in the acyclic workflow and supports resuming a failed workflow via the rescue DAG it generates. Condor-G submits jobs to the Grid (through the Globus jobmanager); jobs are submitted using Globus GRAM, and stdout/stdin/stderr are streamed back using Globus GASS. Condor itself is used as a scheduler to harness idle CPU cycles on existing desktops; ISI runs a small 36-node Condor pool consisting primarily of Linux and Solaris machines.
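A minimal sketch of how a workflow might be submitted and later resumed from a rescue DAG; the rescue-file naming and resubmission behaviour vary between Condor releases, so the file name below is an assumption, and only the `condor_submit_dag` command itself is taken from the Condor tool set.

```python
import os
import subprocess

def submit_with_resume(dag_file):
    """Submit a DAG, preferring a rescue DAG if an earlier run failed.

    Sketch only: older DAGMan versions wrote a rescue file such as
    '<dag>.rescue' that could be resubmitted directly; newer releases
    use different names and resume automatically.
    """
    rescue = dag_file + ".rescue"          # assumed naming convention
    target = rescue if os.path.exists(rescue) else dag_file
    # condor_submit_dag hands the workflow to DAGMan, which submits each
    # ready node to the local Condor-G schedd as its dependencies complete.
    subprocess.run(["condor_submit_dag", target], check=True)

submit_with_resume("concrete_wf/workflow.dag")
```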

Future Condor Technologies to Be Integrated
NeST: we are looking at integrating support for NeST, which allows disk-space reservation on remote sites.
Stork (data placement scheduler): supports multiple transfer protocols (FTP, HTTP, NeST/Chirp, GSIFTP, SRB, file) and transfers files reliably across the Grid.
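As a rough illustration of what a data placement scheduler provides (this is not Stork's interface; Stork uses its own job descriptions, and the `MOVERS` table and `reliable_transfer` helper here are hypothetical), consider a transfer step that dispatches on the URL scheme and retries on failure:

```python
import subprocess
import time

# Hypothetical mapping from URL scheme to a command-line mover; a real data
# placement scheduler understands these protocols through its own job types.
MOVERS = {
    "gsiftp": ["globus-url-copy"],
    "ftp":    ["globus-url-copy"],
    "http":   ["wget", "-O"],
}

def reliable_transfer(src, dest, retries=3, backoff=30):
    """Retry a transfer a few times before giving up."""
    scheme = src.split("://", 1)[0]
    if scheme not in MOVERS:
        raise ValueError(f"no mover configured for {scheme}")
    for attempt in range(1, retries + 1):
        # wget takes (local dest, URL); globus-url-copy takes (src, dest).
        cmd = MOVERS[scheme] + ([dest, src] if scheme == "http" else [src, dest])
        if subprocess.run(cmd).returncode == 0:
            return
        time.sleep(backoff * attempt)   # crude backoff between attempts
    raise RuntimeError(f"transfer {src} -> {dest} failed after {retries} attempts")
```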

Applications Using Pegasus and Condor DAGMan
GriPhyN experiments: Laser Interferometer Gravitational Wave Observatory (LIGO; Caltech/UWM), ATLAS (U. of Chicago), SDSS (Fermilab); also on iVDGL/Grid3.
National Virtual Observatory and NASA Montage.
Biology: BLAST (ANL, PDQ-funded).
Neuroscience: Tomography for Telescience (SDSC, NIH-funded).

A small Montage workflow (1202 nodes)

Pegasus Acknowledgements
Ewa Deelman, Carl Kesselman, Gaurang Mehta, Karan Vahi, Mei-Hui Su, Saurabh Khurana, Sonal Patil, Gurmeet Singh (Center for Grid Technologies, ISI); James Blythe, Yolanda Gil (Intelligent Systems Division, ISI).
Collaboration with Miron Livny and the Condor team (UW-Madison).
Collaboration with Mike Wilde and Jens Voeckler (U. of Chicago) on Chimera.
Research funded as part of the NSF GriPhyN, NVO, and SCEC projects and the EU-funded GridLab.
For more information: http://pegasus.isi.edu and http://www.griphyn.edu/workspace/vds
Contacts: deelman, gmehta, vahi @isi.edu