Scheduling. CEG 4131 Computer Architecture III, Miodrag Bolic. Slides developed by Dr. Hesham El-Rewini. Copyright Hesham El-Rewini.


1 Scheduling. CEG 4131 Computer Architecture III, Miodrag Bolic. Slides developed by Dr. Hesham El-Rewini. Copyright Hesham El-Rewini.

2 Outline
- Scheduling models
- Scheduling without considering communication
- Including communication in scheduling
- Heuristic algorithms

3 Scheduling Parallel Tasks
[Figure: flow diagram. Labels: Explicit Approach, Parallel Program, Implicit Approach, Sequential Program, Dependence Analyzer, Ideal Parallelism, Partitioner, Grains of Sequential Code, Tasks, Program Tasks, Scheduler, Parallel/Distributed System, Processors, Time, Schedule.]

4 Program Tasks
Task notation: (T, <, D, A)
- T: set of tasks
- <: partial order on T
- D: communication data
- A: amount of computation

5 Task Graph
[Figure: task graph with nodes A to I. Node labels give the amount of computation of each task; edge labels give the communication data along each dependency.]

6 Machine
- m heterogeneous processors
- Connected via an arbitrary interconnection network (network graph)
- Associated with each processor P_i is its speed S_i
- Associated with each edge (i, j) is the transfer rate R_ij

7 Task Schedule
- Gantt chart
- Mapping (f) of tasks to a processing element and a starting time
- Formally: f(v) = (i, t) means that task v is scheduled to be processed by processor i starting at time t
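The mapping f(v) = (i, t) can be held in a plain dictionary. The following is a minimal sketch, not part of the original slides: the function name, the three-task example, and the decision to ignore communication delay are all assumptions made for illustration. It checks that a schedule respects precedence and never runs two tasks at once on the same processor.

```python
def is_valid_schedule(schedule, preds, duration):
    """Check a schedule given as task -> (processor, start_time).

    Communication delay is ignored in this sketch.
    """
    # Precedence: every task starts after all its predecessors finish.
    for task, (_, start) in schedule.items():
        for p in preds.get(task, []):
            if schedule[p][1] + duration[p] > start:
                return False
    # Exclusivity: tasks on the same processor must not overlap in time.
    busy = {}
    for task, (proc, start) in schedule.items():
        busy.setdefault(proc, []).append((start, start + duration[task]))
    for intervals in busy.values():
        intervals.sort()
        for (_, end1), (start2, _) in zip(intervals, intervals[1:]):
            if start2 < end1:
                return False
    return True


# Hypothetical three-task example: A precedes B and C.
duration = {"A": 1, "B": 1, "C": 1}
preds = {"B": ["A"], "C": ["A"]}
good = {"A": (1, 0), "B": (1, 1), "C": (2, 1)}  # f(A) = (1, 0), and so on
bad = {"A": (1, 0), "B": (1, 0), "C": (2, 1)}   # B starts before A finishes
print(is_valid_schedule(good, preds, duration))
print(is_valid_schedule(bad, preds, duration))
```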

8 Gantt Chart

9 Gantt Chart with Communication

10 Execution and Communication Times
- If task t_i is executed on processor p_j, its execution time is A_i / S_j
- The communication delay between t_i and t_j, when executed on adjacent processing elements p_k and p_l, is D_ij / R_kl
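As a small worked illustration of these two ratios, here is a sketch with made-up numbers. The function names and all values are assumptions, not taken from the slides.

```python
def execution_time(amount, speed):
    """Time to run a task with computation amount A_i on a processor of speed S_j."""
    return amount / speed


def communication_delay(data, rate):
    """Time to send D_ij units of data over a link with transfer rate R_kl."""
    return data / rate


# Hypothetical values: A_i = 20, S_j = 4, D_ij = 10, R_kl = 5.
exec_t = execution_time(20, 4)
comm_t = communication_delay(10, 5)
print(exec_t, comm_t)  # 5.0 2.0
```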

11 Complexity
- Computationally intractable in general
- A small number of polynomial-time optimal algorithms in restricted cases
- A large number of heuristics in more general cases
- Quality of the schedule vs. quality of the scheduler

12 Scheduling Task Graphs without Considering Communication
Polynomial-time optimal algorithms exist in the following cases:
1. The task graph is an in-forest (each node has at most one immediate successor) or an out-forest (each node has at most one immediate predecessor)
2. The task graph is an interval order

13 In-Forest vs. Out-Forest Structure
[Figure: an in-forest and an out-forest, side by side]

14 Assumptions
- A task graph consisting of n tasks
- A distributed system made up of m processors
- The execution time of each task is one unit of time
- Communication between any pair of tasks is zero
- The goal is to find an optimal schedule, one that minimizes the completion time

15 List Scheduling
- All the algorithms considered here belong to the list scheduling class.
- Each task is assigned a priority, and a list of tasks is constructed in decreasing priority order.
- A task becomes ready for execution when all of its immediate predecessors in the task graph have been executed, or if it has no predecessors.

16 Scheduling In-Forest/Out-Forest Task Graphs
1. The level of each node in the task graph is calculated and used as the node's priority. (The level of a node is the number of nodes on the longest path from that node to a terminal node, so a terminal node has level 1.)
2. Whenever a processor becomes available, assign it the unexecuted ready task with the highest priority
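The two steps above can be sketched as follows, under the stated assumptions of unit execution times and zero communication. The six-task in-forest, the function names, and the alphabetical tie-breaking are assumptions for illustration. The level is taken as the number of nodes on the longest path to a terminal node.

```python
def levels(succ):
    """Level of a node: number of nodes on the longest path to a terminal node."""
    memo = {}

    def level(v):
        if v not in memo:
            memo[v] = 1 + max((level(s) for s in succ[v]), default=0)
        return memo[v]

    return {v: level(v) for v in succ}


def list_schedule(succ, priority, m):
    """Unit-time list scheduling on m processors; returns one task list per time slot."""
    preds_left = {v: 0 for v in succ}
    for v in succ:
        for s in succ[v]:
            preds_left[s] += 1
    ready = [v for v in succ if preds_left[v] == 0]
    slots = []
    while ready:
        # Highest priority first; ties broken alphabetically (an arbitrary choice).
        ready.sort(key=lambda v: (-priority[v], v))
        running, ready = ready[:m], ready[m:]
        slots.append(running)
        # Successors released now become ready for the next slot.
        for v in running:
            for s in succ[v]:
                preds_left[s] -= 1
                if preds_left[s] == 0:
                    ready.append(s)
    return slots


# Hypothetical in-forest: every node has at most one immediate successor.
succ = {"A": ["D"], "B": ["D"], "C": ["E"], "D": ["F"], "E": ["F"], "F": []}
prio = levels(succ)
slots = list_schedule(succ, prio, m=2)
print(prio)
print(slots)
```

On this graph with two processors, the sketch yields four time slots, which is the best possible because A, B, and C cannot all run in the first slot.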

17 Example 1: Simple List Scheduling
[Figure panel: Scheduling]

18 Example 2: Simple List Scheduling
[Figure: task graph with nodes A to M]

Priority assignment:
Task      A  B  C  D  E  F  G  H  I  J  K  L  M
Priority  5  5  5  4  4  4  4  3  3  3  2  2  1

Scheduling (processors P1 to P4):
Time  Tasks
1     A, B, C, E
2     D, F, G, H
3     I, J, L
4     K
5     M

19 Example 3: Simple List Scheduling
[Figure: task graph with nodes A to M; panels: Priority Assignment, Scheduling]

20 Interval Orders
- A task graph is an interval order when its nodes can be mapped into intervals on the real line such that two nodes are related iff the corresponding intervals do not overlap.
- For any pair of nodes u and v in an interval order, either the successors of u are also successors of v or the successors of v are also successors of u.
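The nested-successor property in the second statement can be checked directly. The sketch below assumes that "successors" means all successors (every node reachable from u), which is one reading of the slide. The function names and both example graphs are hypothetical.

```python
def all_successors(succ):
    """Map each node to the set of all nodes reachable from it."""
    memo = {}

    def reach(v):
        if v not in memo:
            out = set()
            for s in succ[v]:
                out.add(s)
                out |= reach(s)
            memo[v] = out
        return memo[v]

    return {v: reach(v) for v in succ}


def has_nested_successors(succ):
    """True if, for every pair u, v, one successor set contains the other."""
    desc = all_successors(succ)
    nodes = list(desc)
    return all(desc[u] <= desc[v] or desc[v] <= desc[u]
               for u in nodes for v in nodes)


# Hypothetical graphs: the first satisfies the property, the second does not
# (A and B have disjoint, non-empty successor sets).
nested = {"A": ["C", "D"], "B": ["D"], "C": [], "D": []}
not_nested = {"A": ["C"], "B": ["D"], "C": [], "D": []}
print(has_nested_successors(nested))
print(has_nested_successors(not_nested))
```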

21 Scheduling Interval Ordered Tasks
1. The number of successors of each node is used as the node's priority
2. Whenever a processor becomes available, assign it the unexecuted ready task with the highest priority
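A minimal sketch of this rule, again assuming unit execution times and zero communication. The priority is taken as the number of all successors of a node (everything reachable from it). The five-task graph, the function names, and the alphabetical tie-breaking are assumptions made for illustration.

```python
def successor_counts(succ):
    """Priority of a node: how many nodes are reachable from it."""
    memo = {}

    def reach(v):
        if v not in memo:
            out = set()
            for s in succ[v]:
                out.add(s)
                out |= reach(s)
            memo[v] = out
        return memo[v]

    return {v: len(reach(v)) for v in succ}


def schedule_by_successors(succ, m):
    """Unit-time list scheduling on m processors, most successors first."""
    priority = successor_counts(succ)
    preds_left = {v: 0 for v in succ}
    for v in succ:
        for s in succ[v]:
            preds_left[s] += 1
    ready = [v for v in succ if preds_left[v] == 0]
    slots = []
    while ready:
        ready.sort(key=lambda v: (-priority[v], v))  # ties broken alphabetically
        running, ready = ready[:m], ready[m:]
        slots.append(running)
        for v in running:
            for s in succ[v]:
                preds_left[s] -= 1
                if preds_left[s] == 0:
                    ready.append(s)
    return slots


# Hypothetical interval-ordered graph (successor sets are nested).
succ = {"A": ["C", "D"], "B": ["D"], "C": ["E"], "D": ["E"], "E": []}
counts = successor_counts(succ)
slots = schedule_by_successors(succ, m=2)
print(counts)
print(slots)
```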

22 Example 1: Scheduling Interval Ordered Tasks

23 Example 2: Scheduling Interval Ordered Tasks
[Figure: task graph with nodes A to J]

Priority assignment:
Task      A  B  C  D  E  F  G  H  I  J
Priority  8  6  5  5  4  1  3  0  0  0

Scheduling (processors P1 to P3):
Time  Tasks
1     A, B
2     C, D, E
3     G, F
4     H, I, J

24 Example 3: Scheduling Interval Ordered Tasks
[Figure: task graph with nodes A to L; panels: Priority Assignment, Scheduling]

25 Communication Models
- Completion time
  - Execution time
  - Communication time
- Completion time as 2 components
- Completion time from the Gantt chart

26 Completion Time as 2 Components
- Completion time = execution time + total communication delay
- Total communication delay = number of communication messages × delay per message
- Execution time: maximum finishing time of any task
- Number of communication messages: counted according to Model A or Model B
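The slide does not spell out how Models A and B count messages. The sketch below assumes one common reading: Model A counts every task-graph edge whose endpoints sit on different processors, while Model B counts a task only once per destination processor, however many successors it has there. The edge list is hypothetical; it was chosen so that assigning A, B, D to P1 and C, E to P2 gives 2 messages under Model A and 1 under Model B. The execution time of 3 and the delay of 1 per message are likewise assumed inputs.

```python
def messages_model_a(edges, proc):
    """Assumed Model A: one message per edge that crosses processors."""
    return sum(1 for u, v in edges if proc[u] != proc[v])


def messages_model_b(edges, proc):
    """Assumed Model B: one message per (sending task, destination processor) pair."""
    return len({(u, proc[v]) for u, v in edges if proc[u] != proc[v]})


def completion_time(execution_time, n_messages, delay_per_message):
    """Completion time = execution time + number of messages * delay per message."""
    return execution_time + n_messages * delay_per_message


# Hypothetical edges; A sends to both C and E, which share processor P2.
edges = [("A", "B"), ("A", "C"), ("A", "E"), ("B", "D")]
proc = {"A": "P1", "B": "P1", "D": "P1", "C": "P2", "E": "P2"}

n_a = messages_model_a(edges, proc)
n_b = messages_model_b(edges, proc)
time_a = completion_time(3, n_a, 1)
time_b = completion_time(3, n_b, 1)
print(n_a, time_a)  # 2 5
print(n_b, time_b)  # 1 4
```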

27 Completion Time from the Gantt Chart (Model C)
- Completion time = schedule length
- This model assumes the existence of an I/O processor with every processor in the system
- Communication delay between two tasks allocated to the same processor is negligible
- Communication delay is counted only between two tasks assigned to different processors
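Under Model C the completion time is read off the schedule itself. The sketch below is one simple way to compute it for a fixed assignment, not the slides' own procedure: each task starts as soon as its processor is free and all its inputs have arrived, and an input from another processor arrives one communication delay after its producer finishes. The task graph, assignment, task order, and unit delays are hypothetical.

```python
def schedule_length(order, preds, proc, exec_time, delay):
    """Schedule length for a fixed assignment.

    `order` must respect precedence; it is also the order in which
    tasks run on each processor.
    """
    finish = {}
    free_at = {}
    for v in order:
        p = proc[v]
        # Data from the same processor is available immediately; data from
        # another processor arrives after the edge's communication delay.
        ready = max(
            (finish[u] + (0 if proc[u] == p else delay[(u, v)]) for u in preds[v]),
            default=0,
        )
        start = max(ready, free_at.get(p, 0))
        finish[v] = start + exec_time[v]
        free_at[p] = finish[v]
    return max(finish.values())


# Hypothetical five-task graph with unit execution times and unit delays.
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B"], "E": ["A"]}
proc = {"A": "P1", "B": "P1", "D": "P1", "C": "P2", "E": "P2"}
exec_time = {t: 1 for t in preds}
delay = {(u, v): 1 for v in preds for u in preds[v]}

length = schedule_length(["A", "B", "C", "D", "E"], preds, proc, exec_time, delay)
print(length)  # 4
```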

28 Example
[Figure: task graph with five tasks A to E, each with execution time 1]
Assume a system with 2 processors.

29 Models A and B
Assume tasks A, B, and D are assigned to P1 and tasks C and E are assigned to P2.
[Figure: the task graph of the previous slide, and the assignment A, B, D on P1 and C, E on P2]
- Model A: number of messages = 2; completion time = 3 + 2 = 5
- Model B: number of messages = 1; completion time = 3 + 1 = 4

30 Model C
[Figure: Gantt chart for tasks A to E on P1 and P2 over times 0 to 4, with communication delay marked]

31 Models A, B, C Example
[Figure: task graph with tasks A to M. Labels in the graph: A 4, B 9, C 7, D 5, E 3, and 1 for each of F to M.]
Communication delay is displayed in the graph for A and B. Assume the execution time of a task is 1. (Assume all communication delay is 1 for simplicity.)

Model A, B task assignment (processors P1, P2, P3), rows in time order:
A
B C D
E H J
F L K
G M
H I

Model C task assignment (processors P1, P2, P3), rows in time order:
A
B
B C D
E H J
F K
G L
H M
I

- Model A: number of messages = 2 + 2 = 4; completion time = 3 + (2×4 + 2×3) = 17
- Model B: number of messages = 2 + 1 = 3; completion time = 3 + (2×4 + 1×3) = 14
- Model C: completion time = 8

32 Heuristics
A heuristic produces an answer in less than exponential time but does not guarantee an optimal solution.
- Communication delay versus parallelism
- Clustering
- Duplication

33 Communication Delay versus Parallelism

34 Clustering

35 Clustering Example 1 Part 1
[Figure: task graph with nodes A to G and weighted edges. Legend: Communication Delay, NOP]

Task assignment (processors P1 and P2):
Time  Tasks
1     A
2     B
3     C
4     D
5
6     E
7
8     F
9
10    G

36 Clustering Example 1 Part 2
[Figure: task graph with nodes A to G and weighted edges. Legend: Communication Delay, NOP]

Task assignment (processors P1 and P2):
Time  Tasks
1     A
2     B
3     C
4     D
5
6     E
7
8     F
9     G

37 Clustering Example 2
[Figure: task graph with nodes A to H and weighted edges. Legend: Communication Delay, NOP]

Task assignment (processors P1 and P2):
Time  Tasks
1     A
2     B
3
4
5     D
6     D
7     C, E
8     E
9     F
10    F
11    G
12
13    H

38 Duplications

39 Duplication Example (Using Clustering Example 1 Part 2)
[Figure: task graph with nodes A to G and weighted edges. Legend: Communication Delay, NOP]

Task assignment (processors P1 and P2; A is duplicated on both):
Time  Tasks
1     A, A
2     B, C
3     D
4
5     E
6
7     F
8     G

40 Scheduling and Grain Packing
Four major steps are involved in grain determination and the process of scheduling optimization:
- Step 1. Construct a fine-grain program graph.
- Step 2. Schedule the fine-grain computation.
- Step 3. Grain packing to produce the coarse grains.
- Step 4. Generate a parallel schedule based on the packed graph.

41 Program Decomposition for Static Multiprocessor Scheduling
Two 2 × 2 matrices A and B are multiplied, and the sum of the four elements of the resulting product matrix C = A × B is computed. The program performs eight multiplications and seven additions, as written below:

42 Example 2.5 (continued)
- C_11 = A_11 × B_11 + A_12 × B_21
- C_12 = A_11 × B_12 + A_12 × B_22
- C_21 = A_21 × B_11 + A_22 × B_21
- C_22 = A_21 × B_12 + A_22 × B_22
- Sum = C_11 + C_12 + C_21 + C_22
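The operation count can be made concrete with a short sketch. The matrix values are arbitrary, chosen only for illustration. Each product is computed as a separate fine-grain operation, followed by four additions to form the elements of C and three more to sum them.

```python
# Arbitrary 2 x 2 inputs, chosen only for illustration.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# Eight multiplications: one product A[i][k] * B[k][j] per (i, j, k).
products = {
    (i, j, k): A[i][k] * B[k][j]
    for i in range(2) for j in range(2) for k in range(2)
}

# Four additions: each element of C adds its two products.
C = [[products[(i, j, 0)] + products[(i, j, 1)] for j in range(2)]
     for i in range(2)]

# Three additions: sum the four elements of C.
total = C[0][0] + C[0][1] + C[1][0] + C[1][1]

print(len(products), C, total)  # 8 [[19, 22], [43, 50]] 134
```

The eight products are mutually independent, which is what makes them natural fine grains for a first scheduling pass.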


