Namyoon Woo and Heon Y. Yeom

1 K-Depth Look-ahead Task Scheduling in Network of Heterogeneous Processors
Namyoon Woo and Heon Y. Yeom
School of Computer Science and Engineering
Seoul National University, Korea
{nywoo,

2 Introduction (1)
Problem definition
- Input
  - Task precedence graph (directed, weighted, acyclic graph)
  - Processor-network graph
- Objective
  - Minimize the overall task execution time.
  - Satisfy the precedence order of tasks.
  - Scheduling is done before run time.
- The problem is NP-complete.
List scheduling heuristic
- It is known as a cost-effective heuristic.
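The two objectives above can be checked mechanically for any candidate schedule. The following is a minimal Python sketch and is not taken from the slides: the function names, the dictionary layout, and the toy numbers are all assumptions, and communication delays between processors are ignored for brevity.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]  # (predecessor, successor); hypothetical layout


def schedule_length(finish: Dict[str, float]) -> float:
    """Overall execution time: the latest finish time of any task."""
    return max(finish.values())


def respects_precedence(edges: Iterable[Edge],
                        start: Dict[str, float],
                        finish: Dict[str, float]) -> bool:
    """True if every successor starts no earlier than its predecessor ends.

    Assumption: communication delay along an edge is ignored here.
    """
    return all(start[v] >= finish[u] for u, v in edges)


# Toy schedule for a small fork graph T0 -> {T1, T2} (made-up numbers).
edges = [("T0", "T1"), ("T0", "T2")]
start = {"T0": 0.0, "T1": 2.0, "T2": 3.0}
finish = {"T0": 2.0, "T1": 5.0, "T2": 4.0}

print(schedule_length(finish))
print(respects_precedence(edges, start, finish))
```

A scheduler for this problem searches for the start times and processor assignments that minimize `schedule_length` while keeping `respects_precedence` true.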

3 Introduction (2) : List Scheduling
[Figure: list scheduling example with panels (1)-(3); tasks T0-T4, processors P0-P3, and a time axis.]

4 Related Works (1)
- “Earliest Start Time” (EST): homogeneous processing
- “Heterogeneous Earliest Finish Time” (HEFT) [topcuoglu99HCS]: heterogeneous processing
[Figure: two panels showing tasks Tx, Ti, Ty, Tz on processors P0-P2.]
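The difference between the two selection rules can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the published algorithms: task prioritization and HEFT's insertion of tasks into idle slots are omitted, and the function names and numbers are hypothetical.

```python
from typing import Dict


def pick_by_est(est: Dict[str, float]) -> str:
    """Choose the processor on which the task can start earliest."""
    return min(est, key=lambda p: est[p])


def pick_by_eft(est: Dict[str, float], exec_time: Dict[str, float]) -> str:
    """Choose the processor on which the task finishes earliest.

    Finish time is modeled as start time plus the processor-specific
    execution time, which matters when processors are heterogeneous.
    """
    return min(est, key=lambda p: est[p] + exec_time[p])


# Made-up numbers: P0 is free immediately but slow, P1 is busy but fast.
est = {"P0": 0.0, "P1": 2.0}
exec_time = {"P0": 10.0, "P1": 3.0}

print(pick_by_est(est))             # start-time rule
print(pick_by_eft(est, exec_time))  # finish-time rule
```

With identical execution times on every processor the two rules agree, which is why a start-time rule is adequate for homogeneous processing.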

5 Related Works (2)
- “Bubble Scheduling and Allocation” (BSA) [kwok2000CC]
[Figure: three panels showing tasks Tx, Ti, Ty, Tz on processors P0-P3, each panel labeled with a pivot.]

6 Motivation (1)
- Heterogeneous network links
- “Successor’s Expected Start Time” (SEST)
[Figure: task T0 with edges e1-e5 and tasks Tx, Ti, Ty, Tz on processors P0-P2; one placement is marked "?".]

7 Motivation (2)
- Clustering k successive tasks.
[Figure: task graph with T0 and T2-T6 on processors P0-P2, a region labeled K-depth, and a clustered task T’.]

8 k-Depth Look-ahead Heuristic
Symbol      Meaning
i           ID of task
x           ID of processor
k           Predefined depth
w’(i,x)     Heterogeneous execution time of task i on processor x
h’x         Average network cost of processor x
c(i)        Average weight of out-edges from task i
SUCC(Ti)    The set of successor tasks of task i
NB(Px)      The set of neighbor processors of processor x

9 k-DLA Scheduling Heuristic
List the tasks in the pre-defined order
while the list is not empty do
    Select the first task Ti and remove it from the list.
    For all Px, calculate est(i,x) + ebl(i,x,k).
    Select Px which gives the minimum value of the sum.
    Schedule Ti on Px.
end while
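The loop above can be sketched in Python as follows. This is an illustrative skeleton only: the slides do not define how est and ebl are computed, so both are passed in as caller-supplied functions, and the extra `assignment` argument to `est`, the function name `k_dla_schedule`, and the toy cost functions are assumptions made for this example.

```python
from collections import deque
from typing import Callable, Dict, Iterable, Sequence


def k_dla_schedule(
    ordered_tasks: Iterable[str],
    processors: Sequence[str],
    k: int,
    est: Callable[[str, str, Dict[str, str]], float],
    ebl: Callable[[str, str, int], float],
) -> Dict[str, str]:
    """Map each task to the processor minimizing est + ebl.

    ordered_tasks must already be in the pre-defined priority order.
    est and ebl are hypothetical callables standing in for the
    slide's est(i,x) and ebl(i,x,k).
    """
    assignment: Dict[str, str] = {}
    pending = deque(ordered_tasks)
    while pending:
        ti = pending.popleft()  # select and remove the first task
        best_px = min(
            processors,
            key=lambda px: est(ti, px, assignment) + ebl(ti, px, k),
        )
        assignment[ti] = best_px  # schedule Ti on Px
    return assignment


# Toy stand-ins (not the real cost model): est grows with the number of
# tasks already placed on a processor; ebl penalizes P1 by a constant.
def toy_est(task: str, px: str, assignment: Dict[str, str]) -> float:
    return float(sum(1 for p in assignment.values() if p == px))


def toy_ebl(task: str, px: str, k: int) -> float:
    return 0.5 if px == "P1" else 0.0


result = k_dla_schedule(["T0", "T1", "T2"], ["P0", "P1"], 1, toy_est, toy_ebl)
print(result)
```

Passing the cost functions in keeps the selection loop separate from the cost model, so the same skeleton can be run with different values of k.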

10 Experimental Environment
Directed acyclic graphs
- Random graphs
  - Number of tasks (t): 50 to 900
  - Number of edges: from 2t to 5t
- Real applications
  - Stencil / LU-Decomposition / Laplace Transform
  - Number of tasks: over 2000
Processor network architecture
- 16 nodes: ring / mesh / fully connected network
Variables
- Heterogeneous Factor (HF): 5, 10, 20, 40
- Communication to Computation Ratio (CCR): 0.1, 1, 10.0
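One way to produce random graphs with the task and edge counts listed above is sketched below in Python. The slides do not describe the generator that was actually used, so this construction (sampling node pairs and orienting each edge from the lower to the higher index to guarantee acyclicity) is an assumption, as is the name `random_dag`. Task and edge weights, CCR, and HF are left out because the slides do not say how they were applied.

```python
import random
from typing import List, Tuple


def random_dag(t: int, rng: random.Random) -> List[Tuple[int, int]]:
    """Return edges (u, v) with u < v over tasks 0..t-1.

    The edge count is drawn uniformly from [2t, 5t], clamped to the
    number of distinct pairs so that very small graphs still work.
    """
    max_edges = t * (t - 1) // 2
    low = min(2 * t, max_edges)
    high = min(5 * t, max_edges)
    target = rng.randint(low, high)
    edges = set()
    while len(edges) < target:
        u, v = rng.sample(range(t), 2)
        # Orienting low -> high index means no cycle can form.
        edges.add((min(u, v), max(u, v)))
    return sorted(edges)


dag = random_dag(50, random.Random(0))
print(len(dag))
```

Seeding the `random.Random` instance makes a generated graph reproducible across runs.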

11 Metrics for the Performance Comparison
Metrics
- Normalized Schedule Length (NSL)
  - NSL = Schedule Length / Σ (weights of the tasks on the critical path)
  - NSL shows how close the scheduling result is to the optimum.
- Running time
  - The cost of running the scheduling heuristic itself
- Used processors
  - The tendency or locality of the task-processor mapping
Heuristics
- BSA, HEFT, k-DLA (k = 1, 5, ∞)
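The NSL ratio can be computed as in the short Python sketch below. It assumes the denominator is the sum of the weights of the tasks on the critical path; the function name and the example numbers are hypothetical, and the slides do not say which per-processor execution time is used as a task's weight.

```python
from typing import Iterable


def normalized_schedule_length(schedule_length: float,
                               critical_path_weights: Iterable[float]) -> float:
    """Schedule length divided by the total weight of the critical path."""
    total = sum(critical_path_weights)
    if total <= 0:
        raise ValueError("critical path weight must be positive")
    return schedule_length / total


# Made-up example: a schedule of length 30 against a critical path
# whose three tasks weigh 5, 10, and 5.
nsl = normalized_schedule_length(30.0, [5.0, 10.0, 5.0])
print(nsl)
```

Dividing by the critical path weight makes results comparable across graphs of different sizes, since a raw schedule length grows with the graph.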

12 Results (1) : Number of Tasks (CCR=1.0, HF=20)
[Figure: panels for Ring, Mesh, and Clique; series for BSA, HEFT, 1-DLA, 5-DLA, and ∞-DLA.]

13 Results (2) : CCR (n=500, HF=20)
[Figure: panels for Ring, Mesh, and Clique; series for BSA, HEFT, 1-DLA, 5-DLA, and ∞-DLA.]

14 Results (3) : Scheduling Time (CCR=1.0, HF=20)
[Figure: panels for Ring, Mesh, and Clique; series for BSA, HEFT, 1-DLA, 5-DLA, and ∞-DLA.]

15 Results (4) : # of scheduled processors
[Figure: panels for Ring, Mesh, and Clique; series for BSA, HEFT, 1-DLA, 5-DLA, and ∞-DLA.]

16 Results (5) : Conventional graph (CCR=1.0, HF=20)
[Figure: panels for LU, Stencil, and Laplace.]

17 Analysis and Conclusions
[Table: preferred heuristic for low and high values of each factor; the cells of the HF row did not survive extraction.]
Factor                  Low       High
CCR                     1-DLA     ∞-DLA (except in clique)
HF
Network connectivity    ∞-DLA     1-DLA or HEFT
- The DLA heuristic with a large k is suitable for heterogeneous computing systems where the network resource is expensive.
- The value of k can be adjusted according to the characteristics of a given computing system.

