28 October 2010 Challenge the future Delft University of Technology Cost-driven Scheduling of Grid Workflows Using Partial Critical Paths Dick Epema Delft.


Cost-driven Scheduling of Grid Workflows Using Partial Critical Paths
Dick Epema, Delft University of Technology, Delft, the Netherlands
Saeid Abrishami and Mahmoud Naghibzadeh, Ferdowsi University of Mashhad, Mashhad, Iran
Grid 2010, Brussels, Belgium, 28 October 2010

Slide 2: Introduction
Utility Grids versus Community Grids
  - main difference: QoS and SLAs
Workflows: a common application type in distributed systems
The Workflow Scheduling Problem
  - community grids: many heuristics try to minimize the makespan of the workflow
  - utility grids: QoS attributes other than execution time, e.g., economic cost, play a role, so it is a multi-objective problem
We propose a new QoS-based workflow scheduling algorithm called Partial Critical Paths (PCP)

Slide 3: The PCP Algorithm: main idea
The PCP Algorithm tries to create a schedule that
  1. minimizes the total execution cost of a workflow
  2. while satisfying a user-defined deadline
The PCP Algorithm
  1. first schedules the (overall) critical path of the workflow such that
     a. its execution cost is minimized
     b. it completes before the user's deadline
  2. then finds the partial critical path to each scheduled task on the critical path and applies the same procedure recursively
[Figure: a workflow with the overall critical path and a partial critical path marked]

Slide 4: Scheduling System Model (1)
Workflow Model
  - an application is modeled by a directed acyclic graph G(T,E)
  - T is a set of n tasks {t_1, t_2, …, t_n}
  - E is a set of arcs, each connecting two tasks
  - each arc e_{i,j} = (t_i, t_j) represents a precedence constraint
  - dummy tasks: t_entry and t_exit
[Figure: example workflow with tasks t_1 to t_6 between t_entry and t_exit; the arc e_{1,4} is labeled]
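The workflow model above can be sketched in a few lines of Python. This is only an illustration: the task names and arcs are hypothetical (the arcs of the slide's figure are not available), and connecting t_entry to every task without a parent and every task without a child to t_exit is an assumed convention for the "corresponding edges" of the dummy tasks.

```python
from collections import defaultdict

def add_dummy_tasks(tasks, arcs):
    """Return (tasks, arcs) extended with the dummy tasks t_entry and t_exit.

    Assumption: t_entry precedes every task that has no parent, and
    every task that has no child precedes t_exit.
    """
    has_parent = {j for _, j in arcs}
    has_child = {i for i, _ in arcs}
    new_arcs = list(arcs)
    new_arcs += [("t_entry", t) for t in tasks if t not in has_parent]
    new_arcs += [(t, "t_exit") for t in tasks if t not in has_child]
    return ["t_entry", *tasks, "t_exit"], new_arcs

def parents_of(arcs):
    """Map each task to the list of its parents (precedence constraints)."""
    parents = defaultdict(list)
    for i, j in arcs:
        parents[j].append(i)
    return parents

# Hypothetical four-task workflow.
tasks = ["t1", "t2", "t3", "t4"]
arcs = [("t1", "t3"), ("t2", "t3"), ("t2", "t4")]
all_tasks, all_arcs = add_dummy_tasks(tasks, arcs)
parents = parents_of(all_arcs)
```

Storing the parents of each task, rather than its children, matches the direction in which PCP walks the graph: backwards from t_exit.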

Slide 5: Scheduling System Model (2)
Utility Grid Model
  - Grid Service Providers (GSPs)
  - each task can be processed by a number of services on different GSPs
  - ET(t_i, s) and EC(t_i, s): estimated execution time and execution cost for processing task t_i on service s
  - TT(e_{i,j}, r, s) and TC(e_{i,j}, r, s): estimated transfer time and transfer cost of sending the required data along e_{i,j} from service s (processing task t_i) to service r (processing task t_j)
  - Grid Market Directory (GMD)

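Slide 6: Basic Definitions
Minimum Execution Time: $MET(t_i) = \min_{s \in S_i} ET(t_i, s)$, where $S_i$ is the set of services that can process $t_i$
Minimum Transfer Time: $MTT(e_{i,j}) = \min_{s \in S_i,\, r \in S_j} TT(e_{i,j}, r, s)$
Earliest Start Time: $EST(t_{entry}) = 0$ and $EST(t_i) = \max_{t_p \in \mathrm{parents}(t_i)} \left( EST(t_p) + MET(t_p) + MTT(e_{p,i}) \right)$
SS(t_i): the selected service for processing the scheduled task t_i
AST(t_i): the actual start time of t_i on its selected service
MET, MTT, and EST are used for finding the partial critical paths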
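A small worked example may help with these definitions. The sketch below assumes that MET and MTT are minima over the available services and that EST is the maximum of EST + MET + MTT over a task's parents, which is the same expression used for the critical parent on slide 8. All task names, service names, and numbers are made up.

```python
from graphlib import TopologicalSorter  # Python 3.9+

def compute_met(ET):
    """ET[t][s] is the estimated execution time of task t on service s."""
    return {t: min(times.values()) for t, times in ET.items()}

def compute_mtt(TT):
    """TT[(i, j)][(r, s)] is the estimated transfer time along arc (i, j)."""
    return {arc: min(times.values()) for arc, times in TT.items()}

def compute_est(parents, met, mtt):
    """Earliest start time of every task, visiting parents before children."""
    est = {}
    for t in TopologicalSorter(parents).static_order():
        est[t] = max(
            (est[p] + met[p] + mtt[(p, t)] for p in parents.get(t, ())),
            default=0,  # a task without parents (t_entry) starts at time 0
        )
    return est

# Hypothetical workflow: t_entry -> {a, b} -> c -> t_exit.
parents = {"a": ["t_entry"], "b": ["t_entry"], "c": ["a", "b"], "t_exit": ["c"]}
ET = {
    "t_entry": {"-": 0}, "t_exit": {"-": 0},  # dummy tasks take no time
    "a": {"s1": 4, "s2": 6},
    "b": {"s1": 3, "s2": 9},
    "c": {"s1": 5, "s2": 2},
}
TT = {
    ("t_entry", "a"): {("-", "-"): 0},
    ("t_entry", "b"): {("-", "-"): 0},
    ("a", "c"): {("s2", "s1"): 1, ("s1", "s2"): 2},
    ("b", "c"): {("s2", "s1"): 3, ("s1", "s2"): 4},
    ("c", "t_exit"): {("-", "-"): 0},
}
met = compute_met(ET)
mtt = compute_mtt(TT)
est = compute_est(parents, met, mtt)
```

Here EST(c) is max(0 + 4 + 1, 0 + 3 + 3) = 6, so b rather than a determines when c can start at the earliest.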

Slide 7: The PCP Scheduling Algorithm
PROCEDURE ScheduleWorkflow(G(T,E), deadline)
  1. Request available services for each task in T from GMD
  2. Query available time slots for each service from related GSPs
  3. Add t_entry and t_exit and their corresponding edges to G
  4. Compute MET(t_i) for each task in G
  5. Compute MTT(e_{i,j}) for each edge in G
  6. Compute EST(t_i) for each task in G
  7. Mark t_entry and t_exit as scheduled
  8. Set AST(t_entry) = 0 and AST(t_exit) = deadline
  9. Call ScheduleParents(t_exit)
  10. If this procedure was successful, make advance reservations for all tasks in G according to the schedule; otherwise return failure

Slide 8: ScheduleParents (1)
The Critical Parent of a node t is the unscheduled parent p of t for which EST(p) + MET(p) + MTT(e_{p,t}) is maximal
The Partial Critical Path of node t
  - is empty if t has no unscheduled parents
  - otherwise consists of the Critical Parent p of t and the Partial Critical Path of p
The critical parent and the partial critical path of a node change over time as more tasks are scheduled
[Figure: node t with its critical parent p]
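The two definitions translate almost directly into code. The following sketch assumes that EST, MET, and MTT have already been computed and are passed in as dictionaries; the workflow and its values are hypothetical.

```python
def critical_parent(t, parents, scheduled, est, met, mtt):
    """Unscheduled parent p of t maximizing EST(p) + MET(p) + MTT(e_{p,t})."""
    candidates = [p for p in parents.get(t, ()) if p not in scheduled]
    if not candidates:
        return None
    return max(candidates, key=lambda p: est[p] + met[p] + mtt[(p, t)])

def partial_critical_path(t, parents, scheduled, est, met, mtt):
    """Follow critical parents backwards from t; return the path start to end."""
    path = []
    p = critical_parent(t, parents, scheduled, est, met, mtt)
    while p is not None:
        path.append(p)
        p = critical_parent(p, parents, scheduled, est, met, mtt)
    path.reverse()
    return path

# Hypothetical workflow: t_entry -> {a, b} -> c -> t_exit.
parents = {"a": ["t_entry"], "b": ["t_entry"], "c": ["a", "b"], "t_exit": ["c"]}
est = {"t_entry": 0, "a": 0, "b": 0, "c": 6, "t_exit": 8}
met = {"t_entry": 0, "a": 4, "b": 3, "c": 2, "t_exit": 0}
mtt = {("t_entry", "a"): 0, ("t_entry", "b"): 0,
       ("a", "c"): 1, ("b", "c"): 3, ("c", "t_exit"): 0}

scheduled = {"t_entry", "t_exit"}
first_path = partial_critical_path("t_exit", parents, scheduled, est, met, mtt)

# Once b and c are scheduled, the partial critical path of c changes.
scheduled |= set(first_path)
second_path = partial_critical_path("c", parents, scheduled, est, met, mtt)
```

The second call illustrates why the partial critical path changes over time: after b and c are scheduled, the only unscheduled parent left for c is a.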

Slide 9: ScheduleParents (2)
PROCEDURE ScheduleParents(t)
  1. If t has no unscheduled parents then return success
  2. Let CriticalPath be the partial critical path of t
  3. Call SchedulePath(CriticalPath)
  4. If this procedure is unsuccessful, return failure and a suggested start time for the failed node (try to repair)
  5. For all t_i on CriticalPath /* from start to end */ call ScheduleParents(t_i)
  6. Repeat steps 1 to 5 while t still has unscheduled parents
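The recursion in ScheduleParents can be sketched as follows. This is a simplified illustration, not the authors' implementation: the repair step with a suggested start time is left out, the ranking of parents is given as a precomputed `score` dictionary standing in for EST + MET + MTT, and `schedule_path` is a hypothetical callback that must mark the tasks it schedules.

```python
def partial_critical_path(t, parents, scheduled, score):
    """Follow the highest-scoring unscheduled parent backwards from t."""
    path, node = [], t
    while True:
        unscheduled = [p for p in parents.get(node, ()) if p not in scheduled]
        if not unscheduled:
            return path[::-1]  # from start to end
        child = node
        node = max(unscheduled, key=lambda p: score[(p, child)])
        path.append(node)

def schedule_parents(t, parents, scheduled, score, schedule_path):
    """Schedule all ancestors of t, one partial critical path at a time."""
    while any(p not in scheduled for p in parents.get(t, ())):
        path = partial_critical_path(t, parents, scheduled, score)
        if not schedule_path(path):  # must add the path's tasks to `scheduled`
            return False             # simplified: no repair attempt
        for ti in path:
            if not schedule_parents(ti, parents, scheduled, score, schedule_path):
                return False
    return True

# Hypothetical workflow: t_entry -> {a, b} -> c -> t_exit.
parents = {"a": ["t_entry"], "b": ["t_entry"], "c": ["a", "b"], "t_exit": ["c"]}
score = {("c", "t_exit"): 8, ("a", "c"): 5, ("b", "c"): 6}
scheduled = {"t_entry", "t_exit"}
order = []

def always_succeeds(path):
    """Stand-in for SchedulePath: record the path and mark it scheduled."""
    order.append(path)
    scheduled.update(path)
    return True

ok = schedule_parents("t_exit", parents, scheduled, score, always_succeeds)
```

In this run the path b, c is scheduled first and the path consisting of a alone second, mirroring how the example slides proceed from the overall critical path to the remaining partial critical paths.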

Slide 10: SchedulePath (1)
SchedulePath tries to find the cheapest schedule for a Path without violating the actual start times of the scheduled children of the tasks on Path
SchedulePath is based on a backtracking strategy
A selected service for a task is admissible if the actual start times of the scheduled children of that task can be met

Slide 11: SchedulePath (2)
SchedulePath moves from the first task in Path to the last task
  - for each task, it selects an untried available service
  - if the selected service creates an admissible (partial) schedule, then it moves forward to the next task; otherwise it selects another untried service for that task
  - if there is no untried available service left for that task, then it backtracks to the previous task on the path and selects another service for it
  - this may lead to failure
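A minimal backtracking sketch of SchedulePath is given below. It makes strong simplifying assumptions that are not in the slides: the tasks on the path run back to back, transfer times and service time slots are ignored, and admissibility is reduced to a per-task latest finish time (`latest_finish`, a hypothetical stand-in for the actual start times of scheduled children). The search also prunes partial schedules that cannot beat the cheapest one found so far.

```python
import math

def schedule_path(path, options, start, latest_finish):
    """Return (assignment, cost) of the cheapest admissible schedule.

    options[task] is a list of (service, exec_time, price) tuples.
    Returns (None, math.inf) if no admissible schedule exists.
    """
    best = {"cost": math.inf, "assignment": None}

    def extend(k, time, cost, chosen):
        if cost >= best["cost"]:
            return  # cannot improve on the cheapest schedule found so far
        if k == len(path):
            best["cost"], best["assignment"] = cost, list(chosen)
            return
        task = path[k]
        for service, exec_time, price in options[task]:
            finish = time + exec_time
            if finish > latest_finish.get(task, math.inf):
                continue  # not admissible: try the next untried service
            chosen.append((task, service))
            extend(k + 1, finish, cost + price, chosen)
            chosen.pop()  # backtrack and try another service for this task

    extend(0, start, 0, [])
    return best["assignment"], best["cost"]

# Hypothetical path b -> c; c must finish by time 10.
options = {
    "b": [("fast", 3, 9), ("slow", 9, 2)],
    "c": [("fast", 2, 6), ("slow", 5, 1)],
}
assignment, cost = schedule_path(["b", "c"], options, 0, {"c": 10})
failed, failed_cost = schedule_path(["b", "c"], options, 0, {"c": 4})
```

With a latest finish time of 10 for c, running b on the fast service leaves enough slack to run c on the slow, cheap one; with a latest finish time of 4 no combination is admissible and the search fails.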

Slide 12: An Example (1)
Start: call ScheduleParents(E)
[Figure: example workflow with entry node S and exit node E]

Slide 13: An Example (2)
Find the partial critical path for node E (this is the overall critical path of the workflow)

Slide 14: An Example (3)
Call SchedulePath for this path
Call ScheduleParents for nodes 2, 6, and 9, respectively

Slide 15: An Example (4)
Node 2 has no unscheduled parents, so its partial critical path is empty

Slide 16: An Example (5)
Node 6: find its partial critical path and then call SchedulePath
ScheduleParents is called for node 3, but it has no unscheduled parents
If SchedulePath cannot schedule this path, then it returns failure, which causes the path to be rescheduled

Slide 17: An Example (6)
Node 9: find its partial critical path and then call SchedulePath
ScheduleParents is called for nodes 5 and 8, but they have no unscheduled parents
If SchedulePath cannot schedule this path, then it returns failure, which causes the path to be rescheduled

Slide 18: An Example (7)
Now the scheduling of the path has finished, and ScheduleParents is called again for node E to find its next partial critical path and to schedule that path

Slide 19: Performance Evaluation, Experimental Setup (1): the system
Simulation software: GridSim
Grid environment: DAS-3, a multicluster grid in the Netherlands
  - 5 clusters (32-85 nodes)
  - average inter-cluster bandwidth: between 10 and 512 MB/s
  - processor speed: adjusted so that the fastest cluster is 10 times faster than the slowest
  - processor price: fictitious prices have been assigned to each cluster (a faster cluster has a higher price)

Slide 20: Performance Evaluation, Experimental Setup (2): the workflows
Five synthetic workflow applications that are based on real scientific workflows (see next slide)
  - Montage
  - CyberShake
  - Epigenomics
  - LIGO
  - SIPHT
Three sizes for each workflow
  - small (about 30 tasks)
  - medium (about 100 tasks)
  - large (about 1000 tasks)
Each task can be executed on every cluster

Slide 21: Performance Evaluation, Experimental Setup (3): the workflows
[Figure: structure of the Montage, LIGO, SIPHT, CyberShake, and Epigenomics workflows]

Slide 22: Performance Evaluation, Experimental Setup (4): metrics
Three scheduling algorithms to schedule each workflow
  - HEFT: a well-known makespan minimization algorithm
  - Fastest: submits all tasks to the fastest cluster
  - Cheapest: submits all tasks to the cheapest (and slowest) cluster
The Normalized Cost (NC) and the Normalized Makespan (NM) of a workflow
  $NC = \frac{\text{total schedule cost}}{C_C}$, $NM = \frac{\text{schedule makespan}}{M_H}$
  - C_C: the cost of executing that workflow with Cheapest
  - M_H: the makespan of executing that workflow with HEFT
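As a small numeric illustration of these metrics, the sketch below assumes that normalization means dividing a schedule's cost by C_C and its makespan by M_H, and that the deadline used in the experiments is the deadline factor times M_H, as stated with the results. The numbers are made up.

```python
def normalized_cost(schedule_cost, cheapest_cost):
    """Cost of a schedule relative to C_C, the cost under Cheapest."""
    return schedule_cost / cheapest_cost

def normalized_makespan(schedule_makespan, heft_makespan):
    """Makespan of a schedule relative to M_H, the makespan under HEFT."""
    return schedule_makespan / heft_makespan

def deadline_for(deadline_factor, heft_makespan):
    """Deadline given to the scheduler: deadline factor times M_H."""
    return deadline_factor * heft_makespan

# Hypothetical values for one workflow.
C_C, M_H = 100.0, 40.0
deadline = deadline_for(2.0, M_H)     # the schedule must finish by time 80
nc = normalized_cost(150.0, C_C)      # schedule costs 1.5 times the cheapest
nm = normalized_makespan(70.0, M_H)   # schedule takes 1.75 times as long as HEFT
meets_deadline = 70.0 <= deadline
```

Under these assumptions a schedule meets its deadline exactly when its normalized makespan does not exceed the deadline factor.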

Slide 23: Performance Evaluation, Experimental Results (1)
[Figure: Normalized Makespan (left) and Normalized Cost (right) of scheduling workflows with HEFT, Fastest, and Cheapest]

Slide 24: Performance Evaluation, Experimental Results (2)
[Figure: Normalized Makespan (left) and Normalized Cost (right) of scheduling small workflows with the Partial Critical Paths algorithm]
deadline = deadline-factor × M_H

Slide 25: Performance Evaluation, Experimental Results (3)
[Figure: Normalized Makespan (left) and Normalized Cost (right) of scheduling medium workflows with the Partial Critical Paths algorithm]

Slide 26: Performance Evaluation, Experimental Results (4)
[Figure: Normalized Makespan (left) and Normalized Cost (right) of scheduling large workflows with the Partial Critical Paths algorithm]

Slide 27: Performance Evaluation, Comparison to Other Algorithms (1)
One of the most cited algorithms in this area has been proposed by Yu et al.:
  - divide the workflow into partitions
  - assign each partition a sub-deadline according to the minimum execution time of each task and the overall deadline of the workflow
  - try to minimize the cost of execution of each partition under the sub-deadline constraints

Slide 28: Performance Evaluation, Comparison to Other Algorithms (2): Cost
[Figure: cost comparison for the CyberShake, Epigenomics, and LIGO workflows; legend: them (Yu et al.), us (PCP)]

Slide 29: Performance Evaluation, Comparison to Other Algorithms (3): Cost
[Figure: cost comparison for the Montage and SIPHT workflows]

Slide 30: Related Work
Sakellariou et al. proposed two scheduling algorithms for minimizing the execution time under budget constraints:
  1. Initially schedule a workflow with minimum execution time, and then refine the schedule until its budget constraint is satisfied
  2. Initially assign each task to the cheapest resource, and then refine the schedule to shorten the execution time under budget constraints

Slide 31: Conclusions
PCP: a new algorithm for workflow scheduling in utility grids that minimizes the total execution cost while meeting a user-defined deadline
Simulation results
  - PCP has a promising performance in small and medium workflows
  - PCP's performance in large workflows is variable and depends on the structure of the workflow
Future work
  - extend our algorithm to support other economic grid models
  - try to enhance it for the cloud computing model

Slide 32: Information
PDS group home page and publications database:
KOALA web site:
Grid Workloads Archive (GWA): gwa.ewi.tudelft.nl
Failure Trace Archive (FTA): fta.inria.fr