Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Karan Vahi, Jin-Soo Kim Oracle.


1 Pegasus on the Virtual Grid: A Case Study of Workflow Planning over Captive Resources. Yang-Suk Kee, Eun-Kyu Byun, Ewa Deelman, Karan Vahi, Jin-Soo Kim. Oracle US Inc; Korea Advanced Institute of Science and Technology; Information Sciences Institute/University of Southern California; Sungkyunkwan University

2 Overview  Motivation  Background – Pegasus – Virtual Grid  Pegasus-VG Proxy  Conclusion  Discussion

3 Motivation  Challenges in scientific application development – Data/control flow, task scheduling, data replication, fault tolerance, etc.  Challenges in resource management – Availability, performance, cost, reliability, fault tolerance, etc.  How can existing cyberinfrastructure be leveraged for easy and efficient scientific computing?

4 Separation of Concerns  Application domain – Workflow management: application management can be conducted independently of the target execution environment. – e.g., Pegasus, Askalon, Triana  Resource domain – Resource provisioning: resource management can be encapsulated beneath abstractions or virtualization layers. – e.g., Virtual Grid, virtual cluster, cloud

5 Workflow planning and execution over provisioned resources

6 Pegasus  A framework for workflow planning and execution  Workflow lifecycle – Design: describe the data/control flows of an application via an abstract workflow – Planning: map the workflow tasks onto physical resources – Execution: schedule and run the workflow tasks on the mapped resources
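The design step above can be illustrated with a minimal sketch. It is not Pegasus's own workflow format or API; it only assumes a hypothetical four-task workflow (A, B, C, D) stored as a plain dependency map, and uses Python's standard `graphlib` module to derive one order in which the tasks could run.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: each task maps to the set of tasks it depends on.
# The task names and the diamond shape are assumptions made for illustration only.
abstract_workflow = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}

def execution_order(workflow):
    """Return one valid order in which the tasks could be executed."""
    return list(TopologicalSorter(workflow).static_order())

order = execution_order(abstract_workflow)
print(order)
```

The abstract workflow says nothing about where tasks run; binding tasks to physical resources is left to the planning step.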

7 Pegasus Workflow Management  [Figure: architecture diagram. Labels: abstract workflow, Pegasus mapper, executable workflow, tasks, Condor DAGMan, Condor, computing environment (Condor pool), monitoring information, provenance.]

8 Virtual Grid  A programmable virtualized resource provisioning framework  Components – vgDL (Virtual Grid Description Language)  Specifies resource requirements – vgES (Virtual Grid Execution System)  Compiles and coordinates resources – PC (Personal Cluster)  Provides uniform job management

9 [Figure: Virtual Grid resource abstraction. Labels: application (tasks A, B, C, D), VG, timeshare, lease, batch (PBS), P4, program run. Stages: Classification, Selection, Binding, Environment. vgDL shown: vgdl = clusterof (node) [2] { node = [Processor==“P4”] }]

10 Pegasus on Virtual Grid  Scope – A basic integration for workflow planning and execution over provisioned resources  Issues – Resource capacity estimation  Resource specification (vgDL) synthesis for Virtual Grid – Resource information publication  Site catalog generation for Pegasus

11 Resource Capacity Estimation  What Virtual Grid expects from Pegasus – vgDL description  Available information – Task execution time, data transfer time, performance metrics, minimum memory capacity, cost, deadline, etc.  Unknown information – Number of virtual processors  Resource capacity estimate – Find the minimum number of processors that can execute the workflow within the deadline

12 BTS (Balanced Time Scheduling)  Ref: E.-K. Byun, Y.-S. Kee, et al., e-Science’08  [Figure: task table (ID, ET) and a schedule over time on processors p1 and p2]  How many processors do we need to run this workflow within 7 units?
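The question on this slide can be explored with a small sketch. It is not the BTS algorithm from the cited paper. It is a plain greedy list scheduler used as a stand-in, and it tries processor counts from 1 upward until the schedule fits the deadline. The task names, execution times, and dependencies are hypothetical (the slide's own table did not survive extraction), and data transfer time is ignored.

```python
import heapq

def makespan(exec_time, parents, procs):
    """Finish time of a greedy list schedule on `procs` identical processors."""
    pending = {t: len(parents.get(t, ())) for t in exec_time}
    children = {t: [] for t in exec_time}
    for task, deps in parents.items():
        for dep in deps:
            children[dep].append(task)
    release = {t: 0 for t in exec_time}   # earliest start allowed by parents
    ready = [(0, t) for t in exec_time if pending[t] == 0]
    heapq.heapify(ready)
    free_at = [0] * procs                 # time at which each processor is idle
    finish = 0
    while ready:
        rel, task = heapq.heappop(ready)
        start = max(rel, heapq.heappop(free_at))
        end = start + exec_time[task]
        heapq.heappush(free_at, end)
        finish = max(finish, end)
        for child in children[task]:
            release[child] = max(release[child], end)
            pending[child] -= 1
            if pending[child] == 0:
                heapq.heappush(ready, (release[child], child))
    return finish

def min_processors(exec_time, parents, deadline):
    """Smallest processor count whose greedy schedule meets the deadline, or None."""
    for procs in range(1, len(exec_time) + 1):
        if makespan(exec_time, parents, procs) <= deadline:
            return procs
    return None

# Hypothetical diamond workflow; times are in the slide's abstract "units".
exec_time = {"A": 1, "B": 3, "C": 3, "D": 1}
parents = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}

print(min_processors(exec_time, parents, deadline=7))  # prints 2
```

With one processor the four tasks run back to back and finish at time 8, which misses the deadline of 7. With two processors B and C overlap and the workflow finishes at time 5. A deadline shorter than the critical path (5 units here) cannot be met with any number of processors, so the function returns `None`.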

13 Example  Execution time of each task: on a Xeon processor  Data transfer time: over a network with 1 Gbps bandwidth  Deadline: 1 hour  Diamond = ClusterOf [2] (nd) [, 0:30:00] { nd = [Processor == “Xeon”] }  [Figure: diamond workflow with tasks preprocess, findrange, analyze and files f.input, f.output]
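Once a processor count and a duration have been estimated, producing the resource specification is mostly string assembly. The sketch below reproduces the textual form of the example on this slide (with plain ASCII quotes). It is an assumption that this form is acceptable to vgES; the function name `synthesize_vgdl` and its parameters are hypothetical and are not part of the Virtual Grid software.

```python
def format_duration(seconds):
    """Format a duration in seconds as H:MM:SS, matching the slide's 0:30:00."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

def synthesize_vgdl(name, n_procs, duration_seconds, processor):
    """Build a vgDL-like string in the shape shown on the slide (not validated)."""
    return (
        f"{name} = ClusterOf [{n_procs}] (nd) "
        f"[, {format_duration(duration_seconds)}] "
        f'{{ nd = [Processor == "{processor}"] }}'
    )

# Two Xeon processors for 30 minutes, as in the slide's example.
spec = synthesize_vgdl("Diamond", 2, 30 * 60, "Xeon")
print(spec)
```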

14 Resource Information Publication  What Pegasus expects from Virtual Grid – Site catalog  Virtual Grid – VG instance  Resource information publication – Devirtualize a VG instance and generate a site catalog for Pegasus

15 [Figure: Virtual Grid resource abstraction. Labels: application (tasks A, B, C, D), VG, timeshare, lease, batch (PBS), P4, program run. Stages: Classification, Selection, Binding, Environment. vgDL shown: vgdl = clusterof (node) [2] { node = [Processor==“P4”] }]

16 Personal Cluster  A partition of resources dedicated to a user, under the control of a user-level resource manager, for a limited time period  [Figure label: GT4/PBS]  Ref: Y.-S. Kee and C. Kesselman, HCW’08

17 Site Catalog Publication … /home/globus/pegasus gt4 PBS $HOME/workdir …

18 Workflow Planning over Provisioned Resources  [Figure: end-to-end flow. Phases: Creation, Planning, Scheduling/Execution. Labels: abstract workflow, BTS, vgDL, Virtual Grid (VG), devirtualization, site catalog, executable workflow, GT4+PBS, Pegasus, VG-Pegasus Proxy. vgDL shown: vgdl = ClusterOf (nd) [2] { nd = [Proc==“Xeon”] }]

19 Conclusion  Pegasus on Virtual Grid – Implements workflow planning and execution over on-demand captive resources – Enables easy and efficient application development and execution  Issues – Resource capacity estimation – Site catalog publication

20 Discussion  Effective performance – What is the cost that a user has to pay to have a successful execution?  Ongoing studies – Fine-grained planning for resource provisioning  Performance, cost, reliability – Workflow execution for virtualization  Recovery of failed tasks

21 Need More Information?  Pegasus –  VGrADS – Tuesday, 11:30am, RENCI booth (2633) – Wednesday, noon, GCAS booth (285) – Wednesday, 2:00pm, SDSC booth (568) – Wednesday, 4:00pm, RENCI booth (2633)

22 Q & A: Questions and Answers

