
Robust Task Scheduling in Non-deterministic Heterogeneous Computing Systems. Zhiao Shi, Asim YarKhan, Jack Dongarra. Followed by GridSolve, FT-MPI, Open MPI.


1 Robust Task Scheduling in Non-deterministic Heterogeneous Computing Systems
Zhiao Shi, Asim YarKhan, Jack Dongarra
Followed by GridSolve, FT-MPI, Open MPI updates
VGrADS Workshop, Sept 2006

2 General Task Scheduling Problem
Task scheduling: allocation of the tasks of a parallel program to processors to optimize certain goals, e.g. overall execution time (makespan)
–Application model: task graph (DAG)
–Node: computational task, has an execution cost
–Edge: dependency between tasks, data transfer
–Computing system model
–Network of processing elements (processor, memory unit; communication via message passing)
Optimal task scheduling problem is NP-complete
–Heuristics: polynomial time
–Some assume every task has the same computation cost (homogeneous tasks), some assume arbitrary costs
–Some ignore communication cost

3 Robust Static Task Scheduling
Heterogeneous and non-deterministic resources
Application DAG with task execution time distributions
Traditional goal: minimize the overall makespan based on expected system performance
Our goal: find static schedules that are more robust to varying task execution times
–Scheduled performance should be relatively stable with respect to the expected makespan
Approach:
–Use “slack” to absorb task execution time increases caused by uncertainties
–Employ a genetic algorithm for optimization

4 Robustness – definition (I)
Relative schedule tardiness [equation not preserved in transcript], defined from:
–[symbol not preserved]: makespan of the schedule obtained with the expected task execution times
–[symbol not preserved]: real makespan under a realization of the task execution times
–Each realization of the task execution times gives a different schedule tardiness
Robustness 1 [equation not preserved in transcript]
–Reflects the amount by which the realized makespan exceeds the expected makespan

5 Robustness – definition (II)
Schedule miss rate [equation not preserved in transcript]
Robustness 2 [equation not preserved in transcript]
–A simpler definition: counts the number of times the realized makespan exceeds the expected makespan
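The two robustness definitions lost their formulas in this transcript, so the following is only a sketch of the underlying measurements. It assumes that relative tardiness is the excess of the realized makespan over the expected makespan, floored at zero and divided by the expected makespan, and that the miss rate is the fraction of realizations whose makespan exceeds the expected one. The function names are hypothetical, and how Robustness 1 and Robustness 2 are derived from these quantities is not shown because the slides' formulas are not available.

```python
from statistics import mean


def relative_tardiness(expected_makespan, realized_makespan):
    # Assumed form: relative excess over the expected makespan, never negative.
    return max(0.0, realized_makespan - expected_makespan) / expected_makespan


def mean_tardiness(expected_makespan, realized_makespans):
    # Average relative tardiness over a set of realizations.
    return mean(relative_tardiness(expected_makespan, m) for m in realized_makespans)


def miss_rate(expected_makespan, realized_makespans):
    # Fraction of realizations whose makespan exceeds the expected makespan.
    misses = sum(1 for m in realized_makespans if m > expected_makespan)
    return misses / len(realized_makespans)


# Four hypothetical realizations of a schedule whose expected makespan is 10.
realized = [9.0, 10.0, 11.0, 12.0]
print(mean_tardiness(10.0, realized))
print(miss_rate(10.0, realized))
```

A schedule that is more robust in this sense would show a smaller mean tardiness and a smaller miss rate over the same set of realizations.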

6 Calculating Makespan
Given a workflow, processors, and a schedule (assignment of tasks to processors)
Adjust communication costs
Makespan = longest path from source to sink
[Figure, panels (a)–(d): example with tasks 1–8 and processors P1–P4]
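The makespan calculation can be sketched as a longest-path pass over the scheduled DAG. This is an illustration under stated assumptions, not the authors' code: communication cost is assumed to drop to zero when both tasks run on the same processor, tasks sharing a processor are assumed to be serialized by zero-cost edges in their scheduled order, and the names `makespan`, `cost`, `comm`, and `proc_order` are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def makespan(cost, comm, proc_order):
    """cost: task -> execution time on its assigned processor.
    comm: (u, v) -> data transfer time if u and v run on different processors.
    proc_order: processor -> list of its tasks in execution order."""
    proc_of = {t: p for p, tasks in proc_order.items() for t in tasks}
    # Adjust communication costs: zero when both ends share a processor.
    edges = {(u, v): (0.0 if proc_of[u] == proc_of[v] else c)
             for (u, v), c in comm.items()}
    # Assumed step: serialize tasks on the same processor with zero-cost edges.
    for tasks in proc_order.values():
        for u, v in zip(tasks, tasks[1:]):
            edges.setdefault((u, v), 0.0)
    preds = {t: [] for t in cost}
    for (u, v) in edges:
        preds[v].append(u)
    # Longest path: a task starts once every predecessor has finished and sent its data.
    finish = {}
    for t in TopologicalSorter(preds).static_order():
        start = max((finish[u] + edges[(u, t)] for u in preds[t]), default=0.0)
        finish[t] = start + cost[t]
    return max(finish.values())


# Hypothetical four-task diamond: a feeds b and c, which both feed d.
cost = {"a": 2, "b": 3, "c": 4, "d": 1}
comm = {("a", "b"): 1, ("a", "c"): 2, ("b", "d"): 1, ("c", "d"): 3}
print(makespan(cost, comm, {"P1": ["a", "b", "d"], "P2": ["c"]}))
```

In this example the path through task c dominates because c sits on a different processor and pays both transfer costs.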

7 Defining Slack
Slack of a task node i is defined as follows: slack(i) = M - bl(i) - tl(i), where M is the makespan
–bl(i): bottom level (longest path from node i to the exit node)
–tl(i): top level (longest path from the entry node to node i)
Average slack of a schedule: the mean of slack(i) over all task nodes
Usefulness of slack in improving robustness
–The slack of each task in a schedule reflects the “wiggle room” that task has
–Large slack means a task node can tolerate a large increase in execution time without increasing the makespan
[Figure: example task graph with tasks 1–8]
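A small sketch of the slack computation follows. It assumes the common convention that the top level of a node excludes the node's own cost, the bottom level includes it, and slack is the makespan minus both; the slide's own formula is not legible in this transcript, so treat that convention and the names `slacks`, `cost`, and `edges` as assumptions. Edge weights are taken to be the already adjusted communication costs.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def slacks(cost, edges):
    """cost: task -> execution time; edges: (u, v) -> adjusted communication cost.
    Returns (per-task slack, average slack)."""
    preds = {t: [] for t in cost}
    succs = {t: [] for t in cost}
    for (u, v) in edges:
        preds[v].append(u)
        succs[u].append(v)
    order = list(TopologicalSorter(preds).static_order())
    # Top level: longest path from an entry node up to (not including) the task.
    tl = {}
    for t in order:
        tl[t] = max((tl[u] + cost[u] + edges[(u, t)] for u in preds[t]), default=0.0)
    # Bottom level: longest path from the task (inclusive) down to an exit node.
    bl = {}
    for t in reversed(order):
        bl[t] = cost[t] + max((edges[(t, v)] + bl[v] for v in succs[t]), default=0.0)
    m = max(tl[t] + bl[t] for t in cost)  # makespan = length of the critical path
    slack = {t: m - bl[t] - tl[t] for t in cost}
    return slack, sum(slack.values()) / len(slack)


# Hypothetical diamond graph: only task b lies off the critical path.
cost = {"a": 2, "b": 3, "c": 4, "d": 1}
edges = {("a", "b"): 0.0, ("a", "c"): 2.0, ("b", "d"): 0.0, ("c", "d"): 3.0}
per_task, average = slacks(cost, edges)
print(per_task, average)
```

Tasks on the critical path get zero slack, which is why a schedule with a short makespan tends to leave little room to absorb delays.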

8 Bi-objective optimization
Want to optimize both makespan and robustness (as represented by slack)
–These turn out to be conflicting goals
ε-constraint method
–Optimize one objective, subject to constraints imposed on the other objectives
–Maximize the average slack of the schedule
–subject to: makespan ≤ ε · M_HEFT, where M_HEFT is the makespan of the HEFT schedule
–HEFT is a well-known, efficient algorithm for serializing a DAG and assigning a schedule
Use a genetic algorithm to do the optimization
–Fitness function is the average slack
–For a solution violating the constraint, fitness is penalized by the degree of violation of the constraint
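The penalized fitness can be illustrated with a short sketch. The slide only says that fitness is the average slack and that infeasible solutions are penalized by the degree of violation; the bound ε times the HEFT makespan, the linear subtractive penalty, and the `penalty_weight` parameter below are assumptions made for illustration.

```python
def fitness(avg_slack, makespan, heft_makespan, epsilon, penalty_weight=1.0):
    # Assumed constraint: the makespan may not exceed epsilon times the HEFT makespan.
    bound = epsilon * heft_makespan
    violation = max(0.0, makespan - bound)
    # Feasible schedules keep their average slack; others lose fitness
    # in proportion to how far they overshoot the bound (hypothetical penalty form).
    return avg_slack - penalty_weight * violation


# A schedule with more slack but a longer makespan than HEFT's 12.0:
print(fitness(3.0, 15.0, 12.0, epsilon=1.0))   # infeasible at epsilon = 1.0, penalized
print(fitness(3.0, 15.0, 12.0, epsilon=1.25))  # feasible once epsilon is relaxed
```

Relaxing ε enlarges the feasible region, so the genetic algorithm can trade a longer expected makespan for more slack.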

9 Experiment settings
Task graphs: task number (N), shape parameter (alpha), average computation cost (cc), communication to computation ratio (CCR)
Best case execution time matrix beta is generated taking into account task heterogeneity and machine heterogeneity
–b_ij: the best case execution time of task v_i on processor p_j
Uncertainty level: degree of uncertainty of the actual execution time
–UL_ij: the uncertainty level of the execution time of task i on processor j
–The real execution time is a uniformly distributed random variable [interval not preserved in transcript]
–The graph has an average uncertainty level UL
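To make the uncertainty model concrete, here is a sketch of sampling real execution times. The interval of the uniform distribution did not survive in this transcript, so the sketch assumes, purely for illustration, that the time is uniform on [b_ij, (2·UL_ij - 1)·b_ij], which gives an expected time of UL_ij·b_ij. The function names are hypothetical.

```python
import random


def sample_execution_time(b, ul, rng):
    # Assumed interval: from the best case b up to (2*ul - 1)*b, with ul >= 1.
    return rng.uniform(b, (2.0 * ul - 1.0) * b)


def expected_execution_time(b, ul):
    # Midpoint of the assumed interval.
    return ul * b


rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
samples = [sample_execution_time(10.0, 2.0, rng) for _ in range(10000)]
print(min(samples), max(samples), sum(samples) / len(samples))
```

Under this assumption, an uncertainty level of 1 means the task always runs in its best case time, and larger values widen the spread.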

10 Bi-objective optimization improves both makespan and robustness
[Chart: performance improvement over HEFT (ε = 1.0)]

11 Relaxing ε improves robustness
[Charts: R1 improvement over ε = 1.0; R2 improvement over ε = 1.0]

12 Review
Want to schedule a DAG on a set of resources
–Resources may be shared or have variable performance
–Thus, for each resource, a task has a distribution for its execution time on that resource
Want a bounded makespan for the DAG
–Scheduled performance should be relatively stable with respect to the expected makespan
–Schedule is statically generated (not modified dynamically during runtime)
–The schedule should, as much as possible, be able to withstand variations in the execution times of the individual tasks
–Optimize the slack subject to constraints on makespan

13 Conclusions
We developed an algorithm for scheduling DAG-structured applications with the goals of both minimizing the makespan and maximizing the robustness.
Because the two goals conflict, the epsilon-constraint method is used to solve the bi-objective optimization problem.
We proposed two definitions of robustness.
Slack is an effective metric for adjusting the robustness.
The algorithm is flexible in finding the epsilon value within a user-provided range so that the best overall performance is achieved.

14 GridSolve Update
GridSolve 0.15 released 05/2006
–Support for NATs and firewalls
–Based on GridRPC
–Support for batch queues
–Improved problem descriptions (gsIDL)
–Matlab, C, Fortran client interfaces
–History based execution models
–Scheduling using perturbation model
–Win32 native client
Lots of work left
–Client bindings (Mathematica, IDL,...)
–Service libraries (ScaLAPACK, ARPACK,...)
–Backend resource managers (Condor, VGrADS,...)

15 FT-MPI Update
FT-MPI: Version 1.01
–Fast, scalable fault tolerant MPI implementation
–System reports failures, recovers processes and messages
–User must recover program (via checkpoint/restart,...)
–Status update
–In maintenance mode
–Improvements in stability, performance

16 Open MPI Update
Stable MPI-2 library – version 1.1
–A mixture of ideas coming from previous MPI libraries (FT-MPI, LA-MPI, LAM, PACX-MPI)
–Team: Indiana U, U Tennessee, LANL, HLRS, U Houston, Cisco, Voltaire, Mellanox, Sun, Sandia, Myricom, IBM
–Multiple simultaneous transports (ethernet, myrinet,...)
–Resource managers (ssh, BProc, PBS, XGrid, Slurm,...)
–Production quality, thread safe, high performance
–Tuned collective operations
Fault tolerance
–Involuntary coordinated checkpointing
–Application unaware that it was checkpointed (by SC06)
–FT-MPI technologies
–New FT framework in progress

17 The End

