COE 561 Digital System Design & Synthesis Scheduling Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals.

Slides:



Advertisements
Similar presentations
Algorithm Design Methods (I) Fall 2003 CSE, POSTECH.
Advertisements

Algorithm Design Methods Spring 2007 CSE, POSTECH.
CALTECH CS137 Fall DeHon 1 CS137: Electronic Design Automation Day 19: November 21, 2005 Scheduling Introduction.
ECE 667 Synthesis and Verification of Digital Circuits
ECE Longest Path dual 1 ECE 665 Spring 2005 ECE 665 Spring 2005 Computer Algorithms with Applications to VLSI CAD Linear Programming Duality – Longest.
Lesson 08 Linear Programming
Architecture-dependent optimizations Functional units, delay slots and dependency analysis.
Approximation Algorithms Chapter 14: Rounding Applied to Set Cover.
Introduction to Algorithms
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 14: March 3, 2004 Scheduling Heuristics and Approximation.
1 EE5900 Advanced Embedded System For Smart Infrastructure Static Scheduling.
Terminology Project: Combination of activities that have to be carried out in a certain order Activity: Anything that uses up time and resources CPM: „Critical.
Winter 2005ICS 252-Intro to Computer Design ICS 252 Introduction to Computer Design Lecture 5-Scheudling Algorithms Winter 2005 Eli Bozorgzadeh Computer.
Reconfigurable Computing S. Reda, Brown University Reconfigurable Computing (EN2911X, Fall07) Lecture 10: RC Principles: Software (3/4) Prof. Sherief Reda.
Clock Skewing EECS 290A Sequential Logic Synthesis and Verification.
Sequential Timing Optimization. Long path timing constraints Data must not reach destination FF too late s i + d(i,j) + T setup  s j + P s i s j d(i,j)
Circuit Retiming with Interconnect Delay CUHK CSE CAD Group Meeting One Evangeline Young Aug 19, 2003.
ECE Synthesis & Verification - Lecture 2 1 ECE 697B (667) Spring 2006 ECE 697B (667) Spring 2006 Synthesis and Verification of Digital Circuits Scheduling.
COE 561 Digital System Design & Synthesis Architectural Synthesis Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum.
ECE Synthesis & Verification - Lecture 3 1 ECE 697B (667) Spring 2006 ECE 697B (667) Spring 2006 Synthesis and Verification of Digital Circuits Scheduling.
Approximation Algorithms
Continuous Retiming EECS 290A Sequential Logic Synthesis and Verification.
EDA (CS286.5b) Day 11 Scheduling (List, Force, Approximation) N.B. no class Thursday (FPGA) …
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: February 2, 2009 Architecture Synthesis (Provisioning, Allocation)
Tirgul 13. Unweighted Graphs Wishful Thinking – you decide to go to work on your sun-tan in ‘ Hatzuk ’ beach in Tel-Aviv. Therefore, you take your swimming.
COE 561 Digital System Design & Synthesis Resource Sharing and Binding Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum.
SCHEDULING SOURCES- Mark Manwaring Kia Bazargan Giovanni De Micheli Gupta Youn-Long Lin M. Balakrishnan Camposano, J. Hofstede, Knapp, MacMillen Lin.
ECE Synthesis & Verification - Lecture 4 1 ECE 697B (667) Spring 2006 ECE 697B (667) Spring 2006 Synthesis and Verification of Digital Circuits Allocation:
ICS 252 Introduction to Computer Design
ECE Synthesis & Verification - LP Scheduling 1 ECE 667 ECE 667 Synthesis and Verification of Digital Circuits Scheduling Algorithms Analytical approach.
Fall 2006EE VLSI Design Automation I VII-1 EE 5301 – VLSI Design Automation I Kia Bazargan University of Minnesota Part VII: High Level Synthesis.
Scheduling for Synthesis of Embedded Hardware
1 IOE/MFG 543 Chapter 7: Job shops Sections 7.1 and 7.2 (skip section 7.3)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 5: February 2, 2009 Architecture Synthesis (Provisioning, Allocation)
COE 561 Digital System Design & Synthesis Architectural Synthesis Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum.
Design Techniques for Approximation Algorithms and Approximation Classes.
Approximating Minimum Bounded Degree Spanning Tree (MBDST) Mohit Singh and Lap Chi Lau “Approximating Minimum Bounded DegreeApproximating Minimum Bounded.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 12: February 13, 2002 Scheduling Heuristics and Approximation.
Integer Programming Key characteristic of an Integer Program (IP) or Mixed Integer Linear Program (MILP): One or more of the decision variable must be.
Approximation Algorithms Department of Mathematics and Computer Science Drexel University.
Minimizing Stall Time in Single Disk Susanne Albers, Naveen Garg, Stefano Leonardi, Carsten Witt Presented by Ruibin Xu.
CSE 589 Part VI. Reading Skiena, Sections 5.5 and 6.8 CLR, chapter 37.
1 SYNTHESIS of PIPELINED SYSTEMS for the CONTEMPORANEOUS EXECUTION of PERIODIC and APERIODIC TASKS with HARD REAL-TIME CONSTRAINTS Paolo Palazzari Luca.
High-Level Synthesis-II Virendra Singh Indian Institute of Science Bangalore IEP on Digital System IIT Kanpur.
1 Iterative Integer Programming Formulation for Robust Resource Allocation in Dynamic Real-Time Systems Sethavidh Gertphol and Viktor K. Prasanna University.
COE 561 Digital System Design & Synthesis Architectural Synthesis Dr. Muhammad Elrabaa Computer Engineering Department King Fahd University of Petroleum.
ELEC692 VLSI Signal Processing Architecture Lecture 3
Approximation Algorithms Department of Mathematics and Computer Science Drexel University.
L12 : Lower Power High Level Synthesis(3) 성균관대학교 조 준 동 교수
Implicit Hitting Set Problems Richard M. Karp Erick Moreno Centeno DIMACS 20 th Anniversary.
1  Problem: Consider a two class task with ω 1, ω 2   LINEAR CLASSIFIERS.
Algorithm Design Methods 황승원 Fall 2011 CSE, POSTECH.
1 EE5900 Advanced Embedded System For Smart Infrastructure Static Scheduling.
1 Optimization Techniques Constrained Optimization by Linear Programming updated NTU SY-521-N SMU EMIS 5300/7300 Systems Analysis Methods Dr.
Retiming EECS 290A Sequential Logic Synthesis and Verification.
Approximation Algorithms based on linear programming.
Scheduling Determines the precise start time of each task.
The minimum cost flow problem
Basic Project Scheduling
Basic Project Scheduling
Analysis of Algorithms
King Fahd University of Petroleum and Minerals
Architecture Synthesis
Resource Sharing and Binding
Integrated Systems Centre © Giovanni De Micheli – All rights reserved
EE5900 Advanced Embedded System For Smart Infrastructure
ICS 252 Introduction to Computer Design
Timing Analysis and Optimization of Sequential Circuits
Chapter 1. Formulations.
Reconfigurable Computing (EN2911X, Fall07)
Presentation transcript:

COE 561 Digital System Design & Synthesis Scheduling Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals Dr. Aiman H. El-Maleh Computer Engineering Department King Fahd University of Petroleum & Minerals

2 OutlineOutline n The scheduling problem. n Scheduling without constraints. n Scheduling under timing constraints. Relative scheduling. n Scheduling under resource constraints. The ILP model. Heuristic methods List scheduling Force-directed scheduling n The scheduling problem. n Scheduling without constraints. n Scheduling under timing constraints. Relative scheduling. n Scheduling under resource constraints. The ILP model. Heuristic methods List scheduling Force-directed scheduling

3 SchedulingScheduling n Circuit model Sequencing graph. Cycle-time is given. Operation delays expressed in cycles. n Scheduling Determine the start times for the operations. Satisfying all the sequencing (timing and resource) constraint. n Goal Determine area/latency trade-off. n Scheduling affects Area: maximum number of concurrent operations of same type is a lower bound on required hardware resources. Performance: concurrency of resulting implementation. n Circuit model Sequencing graph. Cycle-time is given. Operation delays expressed in cycles. n Scheduling Determine the start times for the operations. Satisfying all the sequencing (timing and resource) constraint. n Goal Determine area/latency trade-off. n Scheduling affects Area: maximum number of concurrent operations of same type is a lower bound on required hardware resources. Performance: concurrency of resulting implementation.

4 Scheduling Example

5 Scheduling Models n Unconstrained scheduling. n Scheduling with timing constraints Latency. Detailed timing constraints. n Scheduling with resource constraints. n Simplest scheduling model All operations have bounded delays. All delays are in cycles. Cycle-time is given. No constraints - no bounds on area. Goal Minimize latency. n Unconstrained scheduling. n Scheduling with timing constraints Latency. Detailed timing constraints. n Scheduling with resource constraints. n Simplest scheduling model All operations have bounded delays. All delays are in cycles. Cycle-time is given. No constraints - no bounds on area. Goal Minimize latency.

6 Minimum-Latency Unconstrained Scheduling Problem n Given a set of operations V with integer delays D and a partial order on the operations E n Find an integer labeling of the operations  : V  Z +, such that t i =  (v i ), t i  t j + d j  i, j s.t. (v j, v i )  E and t n is minimum. n Unconstrained scheduling used when Dedicated resources are used. Operations differ in type. Operations cost is marginal when compared to that of steering logic, registers, wiring, and control logic. Binding is done before scheduling: resource conflicts solved by serializing operations sharing same resource. Deriving bounds on latency for constrained problems. n Given a set of operations V with integer delays D and a partial order on the operations E n Find an integer labeling of the operations  : V  Z +, such that t i =  (v i ), t i  t j + d j  i, j s.t. (v j, v i )  E and t n is minimum. n Unconstrained scheduling used when Dedicated resources are used. Operations differ in type. Operations cost is marginal when compared to that of steering logic, registers, wiring, and control logic. Binding is done before scheduling: resource conflicts solved by serializing operations sharing same resource. Deriving bounds on latency for constrained problems.

7 ASAP Scheduling Algorithm n Denote by t s the start times computed by the as soon as possible (ASAP) algorithm. n Yields minimum values of start times. n Denote by t s the start times computed by the as soon as possible (ASAP) algorithm. n Yields minimum values of start times.

8 ALAP Scheduling Algorithm n Denote by t L the start times computed by the as late as possible (ALAP) algorithm. n Yields maximum values of start times. n Latency upper bound n Denote by t L the start times computed by the as late as possible (ALAP) algorithm. n Yields maximum values of start times. n Latency upper bound

9 Latency-Constrained Scheduling n ALAP solves a latency-constrained problem. n Latency bound can be set to latency computed by ASAP algorithm. n Mobility Defined for each operation. Difference between ALAP and ASAP schedule. Zero mobility implies that an operation can be started only at one given time step. Mobility greater than 0 measures span of time interval in which an operation may start. n Slack on the start time. n ALAP solves a latency-constrained problem. n Latency bound can be set to latency computed by ASAP algorithm. n Mobility Defined for each operation. Difference between ALAP and ASAP schedule. Zero mobility implies that an operation can be started only at one given time step. Mobility greater than 0 measures span of time interval in which an operation may start. n Slack on the start time.

10 ExampleExample n Operations with zero mobility {v1, v2, v3, v4, v5}. Critical path. n Operations with mobility one {v6, v7}. n Operations with mobility two {v8, v9, v10, v11} n Operations with zero mobility {v1, v2, v3, v4, v5}. Critical path. n Operations with mobility one {v6, v7}. n Operations with mobility two {v8, v9, v10, v11}

11 Scheduling under Detailed Timing Constraints … n Motivation Interface design. Control over operation start time. n Constraints Upper/lower bounds on start-time difference of any operation pair. n Minimum timing constraints between two operations An operation follows another by at least a number of prescribed time steps l ij  0 requires t j  t i + l ij n Maximum timing constraints between two operations An operation follows another by at most a number of prescribed time steps u ij  0 requires t j  t i + u ij n Motivation Interface design. Control over operation start time. n Constraints Upper/lower bounds on start-time difference of any operation pair. n Minimum timing constraints between two operations An operation follows another by at least a number of prescribed time steps l ij  0 requires t j  t i + l ij n Maximum timing constraints between two operations An operation follows another by at most a number of prescribed time steps u ij  0 requires t j  t i + u ij

12 … Scheduling under Detailed Timing Constraints n Example Circuit reads data from a bus, performs computation, writes result back on the bus. Bus interface constraint: data written three cycles after read. Minimum and maximum constraint of 3 cycles between read and write operations. n Example Two circuits required to communicate simultaneously to external circuits. Cycle in which data available is irrelevant. Minimum and maximum timing constraint of zero cycles between two write operations. n Example Circuit reads data from a bus, performs computation, writes result back on the bus. Bus interface constraint: data written three cycles after read. Minimum and maximum constraint of 3 cycles between read and write operations. n Example Two circuits required to communicate simultaneously to external circuits. Cycle in which data available is irrelevant. Minimum and maximum timing constraint of zero cycles between two write operations.

13 Constraint Graph Model n Start from sequencing graph. n Model delays as weights on edges. n Add forward edges for minimum constraints. Edge (v i, v j ) with weight l ij  t j  t i + l ij n Add backward edges for maximum constraints. Edge (v j, v i ) with weight -u ij  t j  t i + u ij because t j  t i + u ij  t i  t j - u ij n Start from sequencing graph. n Model delays as weights on edges. n Add forward edges for minimum constraints. Edge (v i, v j ) with weight l ij  t j  t i + l ij n Add backward edges for maximum constraints. Edge (v j, v i ) with weight -u ij  t j  t i + u ij because t j  t i + u ij  t i  t j - u ij

14 … Constraint Graph Model Mul delay = 2 ADD delay =1

15 Methods for Scheduling under Detailed Timing Constraints … n Presence of maximum timing constraints may prevent existence of a consistent schedule. n Required upper bound on time distance between operations may be inconsistent with first operation execution time. n Minimum timing constraints may conflict with maximum timing constraints. n A criterion to determine existence of a schedule: For each maximum timing constraint u ij Longest weighted path between v i and v j must be  u ij n Presence of maximum timing constraints may prevent existence of a consistent schedule. n Required upper bound on time distance between operations may be inconsistent with first operation execution time. n Minimum timing constraints may conflict with maximum timing constraints. n A criterion to determine existence of a schedule: For each maximum timing constraint u ij Longest weighted path between v i and v j must be  u ij

16 … Methods for Scheduling under Detailed Timing Constraints n Weight of longest path from source to a vertex is the minimum start time of a vertex. n Bellman-Ford or Lia-Wong algorithm provides the schedule. n A necessary condition for existence of a schedule is constraint graph has no positive cycles. n Weight of longest path from source to a vertex is the minimum start time of a vertex. n Bellman-Ford or Lia-Wong algorithm provides the schedule. n A necessary condition for existence of a schedule is constraint graph has no positive cycles.

17 Method for Scheduling with Unbounded-Delay Operations n Unbounded delays Synchronization. Unbounded-delay operations (e.g. loops). n Anchors. Unbounded-delay operations. n Relative scheduling Schedule ops w.r. to the anchors. Combine schedules. n Unbounded delays Synchronization. Unbounded-delay operations (e.g. loops). n Anchors. Unbounded-delay operations. n Relative scheduling Schedule ops w.r. to the anchors. Combine schedules.

18 Relative Scheduling Method n For each vertex Determine relevant anchor set R(.). Anchors affecting start time. Determine time offset from anchors. n Start-time Expressed by: Computed only at run-time because delays of anchors are unknown. n For each vertex Determine relevant anchor set R(.). Anchors affecting start time. Determine time offset from anchors. n Start-time Expressed by: Computed only at run-time because delays of anchors are unknown.

19 Relative Scheduling under Timing Constraints n Problem definition Detailed timing constraints. Unbounded delay operations. n Solution May or may not exist. Problem may be ill-specified. n Feasible problem A solution exists when unknown delays are zero. n Well-posed problem A solution exists for any value of the unknown delays. n Theorem A constraint graph can be made well-posed if there are no cycles with unbounded weights. n Problem definition Detailed timing constraints. Unbounded delay operations. n Solution May or may not exist. Problem may be ill-specified. n Feasible problem A solution exists when unknown delays are zero. n Well-posed problem A solution exists for any value of the unknown delays. n Theorem A constraint graph can be made well-posed if there are no cycles with unbounded weights.

20 ExampleExample (a) & (b) Ill-posed constraint (c) well-posed constraint

21 Relative Scheduling Approach n Analyze graph Detect anchors. Well-posedness test. Determine dependencies from anchors. n Schedule ops with respect to relevant anchors Bellman-Ford, Liao-Wong, Ku algorithms. n Combine schedules to determine start times: n Analyze graph Detect anchors. Well-posedness test. Determine dependencies from anchors. n Schedule ops with respect to relevant anchors Bellman-Ford, Liao-Wong, Ku algorithms. n Combine schedules to determine start times:

22 ExampleExample

23 Scheduling under Resource Constraints n Classical scheduling problem. Fix area bound - minimize latency. n The amount of available resources affects the achievable latency. n Dual problem Fix latency bound - minimize resources. n Assumption All delays bounded and known. n Classical scheduling problem. Fix area bound - minimize latency. n The amount of available resources affects the achievable latency. n Dual problem Fix latency bound - minimize resources. n Assumption All delays bounded and known.

24 Minimum Latency Resource-Constrained Scheduling Problem n Given a set of ops V with integer delays D, a partial order on the operations E, and upper bounds {a k ; k = 1, 2, …, n res } n Find an integer labeling of the operations  : V  Z +, such that t i =  (v i ), t i  t j + d j  i, j s.t. (v j, v i )  E and t n is minimum. n Number of operations of any given type in any schedule step does not exceed bound. n Given a set of ops V with integer delays D, a partial order on the operations E, and upper bounds {a k ; k = 1, 2, …, n res } n Find an integer labeling of the operations  : V  Z +, such that t i =  (v i ), t i  t j + d j  i, j s.t. (v j, v i )  E and t n is minimum. n Number of operations of any given type in any schedule step does not exceed bound. :V  {1,2, …n res }

25 Scheduling under Resource Constraints n Intractable problem. n Algorithms Exact Integer linear program. Hu (restrictive assumptions). Approximate List scheduling. Force-directed scheduling. n Intractable problem. n Algorithms Exact Integer linear program. Hu (restrictive assumptions). Approximate List scheduling. Force-directed scheduling.

26 ILP Formulation … n Binary decision variables X = { x il ; i = 1, 2, …, n; l = 1, 2, …, +1}. x il, is TRUE only when operation v i starts in step l of the schedule (i.e. l = t i ). is an upper bound on latency. n Start time of operation v i n Operations start only once n Binary decision variables X = { x il ; i = 1, 2, …, n; l = 1, 2, …, +1}. x il, is TRUE only when operation v i starts in step l of the schedule (i.e. l = t i ). is an upper bound on latency. n Start time of operation v i n Operations start only once t i =

27 … ILP Formulation … n Sequencing relations must be satisfied n Resource bounds must be satisfied n Sequencing relations must be satisfied n Resource bounds must be satisfied

28 … ILP Formulation n Minimize c T t such that n c T =[0,0,…,0,1] T corresponds to minimizing the latency of the schedule. n c T =[1,1,…,1,1] T corresponds to finding the earliest start times of all operations under the given constraints. n Minimize c T t such that n c T =[0,0,…,0,1] T corresponds to minimizing the latency of the schedule. n c T =[1,1,…,1,1] T corresponds to finding the earliest start times of all operations under the given constraints.

29 Example … n Resource constraints 2 ALUs; 2 Multipliers. a 1 = 2; a 2 = 2. n Single-cycle operation. d i = 1  i. n Operations start only once x 0,1 =1; x 1,1 =1; x 2,1 =1; x 3,2 =1 x 4,3 =1; x 5,4 =1 x 6,1 + x 6,2 =1 x 7,2 + x 7,3 =1 x 8,1 + x 8,2 +x 8,3 =1 x 9,2 + x 9,3 +x 9,4 =1 x 10,1 + x 10,2 +x 10,3 =1 x 11,2 + x 11,3 +x 11,4 =1 x n,5 =1 n Resource constraints 2 ALUs; 2 Multipliers. a 1 = 2; a 2 = 2. n Single-cycle operation. d i = 1  i. n Operations start only once x 0,1 =1; x 1,1 =1; x 2,1 =1; x 3,2 =1 x 4,3 =1; x 5,4 =1 x 6,1 + x 6,2 =1 x 7,2 + x 7,3 =1 x 8,1 + x 8,2 +x 8,3 =1 x 9,2 + x 9,3 +x 9,4 =1 x 10,1 + x 10,2 +x 10,3 =1 x 11,2 + x 11,3 +x 11,4 =1 x n,5 =1

30 … Example … n Sequencing relations must be satisfied 2x 3,2 -x 1,1  1 2x 3,2 -x 2,1  1 2x 7,2 +3x 7,3 -x 6,1 -2x 6,2  1 2x 9,2 +3x 9,3 +4x 9,4 -x 8,1 -2x 8,2 -3x 8,3  1 2x 11,2 +3x 11,3 +4x 11,4 -x 10,1 -2x 10,2 -3x 10,3  1 4x 5,4 -2x 7,2 -3x 7,3  1 4x 5,4 -3x 4,3  1 5x n,5 -2x 9,2 -3x 9,3 -4x 9,4  1 5x n,5 -2x 11,2 -3x 11,3 -4x 11,4  1 5x n,5 -4x 5,4  1 n Sequencing relations must be satisfied 2x 3,2 -x 1,1  1 2x 3,2 -x 2,1  1 2x 7,2 +3x 7,3 -x 6,1 -2x 6,2  1 2x 9,2 +3x 9,3 +4x 9,4 -x 8,1 -2x 8,2 -3x 8,3  1 2x 11,2 +3x 11,3 +4x 11,4 -x 10,1 -2x 10,2 -3x 10,3  1 4x 5,4 -2x 7,2 -3x 7,3  1 4x 5,4 -3x 4,3  1 5x n,5 -2x 9,2 -3x 9,3 -4x 9,4  1 5x n,5 -2x 11,2 -3x 11,3 -4x 11,4  1 5x n,5 -4x 5,4  1

31 … Example n Resource bounds must be satisfied: n Any set of start times satisfying constraints provides a feasible solution. n Any feasible solution is optimum since sink (x n,5 =1) mobility is 0. n Resource bounds must be satisfied: n Any set of start times satisfying constraints provides a feasible solution. n Any feasible solution is optimum since sink (x n,5 =1) mobility is 0.

32 Dual ILP Formulation n Minimize resource usage under latency constraint. n Same constraints as previous formulation. n Additional constraint Latency bound must be satisfied. n Resource usage is unknown in the constraints. n Resource usage is the objective to minimize. Minimize c T a a vector represents resource usage c T vector represents resource costs n Minimize resource usage under latency constraint. n Same constraints as previous formulation. n Additional constraint Latency bound must be satisfied. n Resource usage is unknown in the constraints. n Resource usage is the objective to minimize. Minimize c T a a vector represents resource usage c T vector represents resource costs

33 ExampleExample n Multiplier area = 5; ALU area = 1. n Objective function: 5a 1 +a 2. = 4 n Start time constraints same. n Sequencing dependency constraints same. n Resource constraints x 1,1 +x 2,1 +x 6,1 +x 8,1 – a 1  0 x 3,2 +x 6,2 +x 7,2 +x 8,2 – a 1  0 x 7,3 +x 8,3 – a 1  0 x 10,1 – a 2  0 x 9,2 +x 10,2 +x 11,2 – a 2  0 x 4,3 +x 9,3 +x 10,3 +x 11,3 – a 2  0 x 5,4 +x 9,4 +x 11,4 – a 2  0 n Multiplier area = 5; ALU area = 1. n Objective function: 5a 1 +a 2. = 4 n Start time constraints same. n Sequencing dependency constraints same. n Resource constraints x 1,1 +x 2,1 +x 6,1 +x 8,1 – a 1  0 x 3,2 +x 6,2 +x 7,2 +x 8,2 – a 1  0 x 7,3 +x 8,3 – a 1  0 x 10,1 – a 2  0 x 9,2 +x 10,2 +x 11,2 – a 2  0 x 4,3 +x 9,3 +x 10,3 +x 11,3 – a 2  0 x 5,4 +x 9,4 +x 11,4 – a 2  0

34 ILP Solution n Use standard ILP packages. n Transform into LP problem [Gebotys]. n Advantages Exact method. Other constraints can be incorporated easily Maximum and minimum timing constraints n Disadvantages Works well up to few thousand variables. n Use standard ILP packages. n Transform into LP problem [Gebotys]. n Advantages Exact method. Other constraints can be incorporated easily Maximum and minimum timing constraints n Disadvantages Works well up to few thousand variables.

35 List Scheduling Algorithms n Heuristic method for Minimum latency subject to resource bound. Minimum resource subject to latency bound. n Greedy strategy. n Priority list heuristics. Assign a weight to each vertex indicating its scheduling priority Longest path to sink. Longest path to timing constraint. n Heuristic method for Minimum latency subject to resource bound. Minimum resource subject to latency bound. n Greedy strategy. n Priority list heuristics. Assign a weight to each vertex indicating its scheduling priority Longest path to sink. Longest path to timing constraint.

36 List Scheduling Algorithm for Minimum Latency …

37 … List Scheduling Algorithm for Minimum Latency n Candidate Operations U l,k Operations of type k whose predecessors are scheduled and completed at time step before l n Unfinished operations T l,k are operations of type k that started at earlier cycles and whose execution is not finished at time l Note that when execution delays are 1, T l,k is empty. n Candidate Operations U l,k Operations of type k whose predecessors are scheduled and completed at time step before l n Unfinished operations T l,k are operations of type k that started at earlier cycles and whose execution is not finished at time l Note that when execution delays are 1, T l,k is empty.

38 ExampleExample n Assumptions a 1 = 2 multipliers with delay 1. a 2 = 2 ALUs with delay 1. n First Step U 1,1 = {v 1, v 2, v 6, v 8 } Select {v 1, v 2 } U 1,2 = {v 10 }; selected n Second step U 2,1 = {v 3, v 6, v 8 } select {v 3, v 6 } U 2,2 = {v 11 }; selected n Third step U 3,1 = {v 7, v 8 } Select {v 7, v 8 } U 3,2 = {v 4 }; selected n Fourth step U 4,2 = {v 5, v 9 }; selected n Assumptions a 1 = 2 multipliers with delay 1. a 2 = 2 ALUs with delay 1. n First Step U 1,1 = {v 1, v 2, v 6, v 8 } Select {v 1, v 2 } U 1,2 = {v 10 }; selected n Second step U 2,1 = {v 3, v 6, v 8 } select {v 3, v 6 } U 2,2 = {v 11 }; selected n Third step U 3,1 = {v 7, v 8 } Select {v 7, v 8 } U 3,2 = {v 4 }; selected n Fourth step U 4,2 = {v 5, v 9 }; selected

39 ExampleExample n Assumptions a 1 = 3 multipliers with delay 2. a 2 = 1 ALU with delay 1. n Assumptions a 1 = 3 multipliers with delay 2. a 2 = 1 ALU with delay 1.

40 List Scheduling Algorithm for Minimum Resource Usage

41 ExampleExample n Assume =4 n Let a = [1, 1] T n First Step U 1,1 = {v 1, v 2, v 6, v 8 } Operations with zero slack {v 1, v 2 } a = [2, 1] T U 1,2 = {v 10 } n Second step U 2,1 = {v 3, v 6, v 8 } Operations with zero slack {v 3, v 6 } U 2,2 = {v 11 } n Third step U 3,1 = {v 7, v 8 } Operations with zero slack {v 7, v 8 } U 3,2 = {v 4 } n Fourth step U 4,2 = {v 5, v 9 } Both have zero slack; a = [2, 2] T n Assume =4 n Let a = [1, 1] T n First Step U 1,1 = {v 1, v 2, v 6, v 8 } Operations with zero slack {v 1, v 2 } a = [2, 1] T U 1,2 = {v 10 } n Second step U 2,1 = {v 3, v 6, v 8 } Operations with zero slack {v 3, v 6 } U 2,2 = {v 11 } n Third step U 3,1 = {v 7, v 8 } Operations with zero slack {v 7, v 8 } U 3,2 = {v 4 } n Fourth step U 4,2 = {v 5, v 9 } Both have zero slack; a = [2, 2] T

42 Force-Directed Scheduling … n Heuristic scheduling methods [Paulin] Min latency subject to resource bound. Variation of list scheduling: FDLS. Min resource subject to latency bound. Schedule one operation at a time. n Rationale Reward uniform distribution of operations across schedule steps. n Operation interval: mobility plus one (  i +1). Computed by ASAP and ALAP scheduling n Operation probability p i (l) Probability of executing in a given step. 1/(  i +1) inside interval; 0 elsewhere. n Heuristic scheduling methods [Paulin] Min latency subject to resource bound. Variation of list scheduling: FDLS. Min resource subject to latency bound. Schedule one operation at a time. n Rationale Reward uniform distribution of operations across schedule steps. n Operation interval: mobility plus one (  i +1). Computed by ASAP and ALAP scheduling n Operation probability p i (l) Probability of executing in a given step. 1/(  i +1) inside interval; 0 elsewhere.

43 … Force-Directed Scheduling n Operation-type distribution q k (l) Sum of the op. prob. for each type. Shows likelihood that a resource is used at each schedule step. n Distribution graph for multiplier n Distribution graph for adder n Operation-type distribution q k (l) Sum of the op. prob. for each type. Shows likelihood that a resource is used at each schedule step. n Distribution graph for multiplier n Distribution graph for adder p 1 (1)=1, p 1 (2)=p 1 (3)=p 1 (4)=0 p 2 (1)=1, p 2 (2)=p 2 (3)=p 2 (4)=0  6 =1; time frame [1,2] p 6 (1)=0.5, p 6 (2)=0.5, p 6 (3)=p 6 (4)=0  8 =2; time frame [1,3] p 8 (1)=p 8 (2)=p 8 (3)=0.3, p 8 (4)=0 q mul (1)= =2.8

44 ForceForce n Used as priority function. n Selection of operation to be scheduled in a time step is based on force. n Forces attract (repel) operations into (from) specific schedule steps. n Force is related to concurrency. The larger the force the larger the concurrency n Mechanical analogy Force exerted by elastic spring is proportional to displacement between its end points. Force = constant  displacement. constant = operation-type distribution. displacement = change in probability. n Used as priority function. n Selection of operation to be scheduled in a time step is based on force. n Forces attract (repel) operations into (from) specific schedule steps. n Force is related to concurrency. The larger the force the larger the concurrency n Mechanical analogy Force exerted by elastic spring is proportional to displacement between its end points. Force = constant  displacement. constant = operation-type distribution. displacement = change in probability.

45 Forces Related to the Assignment of an Operation to a Control Step n Self-force Sum of forces relating operation to all schedule steps in its time frame. Self-force for scheduling operation v i in step l  lm denotes a Kronecker delta function; equal 1 when m=l. n Successor-force Related to the successors. Delaying an operation implies delaying its successors. n Self-force Sum of forces relating operation to all schedule steps in its time frame. Self-force for scheduling operation v i in step l  lm denotes a Kronecker delta function; equal 1 when m=l. n Successor-force Related to the successors. Delaying an operation implies delaying its successors.

46 Example: Operation v 6 … n It can be scheduled in the first two steps. p(1) = 0.5; p(2) = 0.5; p(3) = 0; p(4) =0. n Distribution: q(1) = 2.8; q(2) = 2.3; q(3)=0.8. n Assign v 6 to step 1 variation in probability 1 – 0.5 = 0.5 for step1 variation in probability 0 – 0.5 = -0.5 for step2 Self-force: 2.8 * * -0.5 = n Assign v 6 to step 2 variation in probability 0 – 0.5 = -0.5 for step1 variation in probability 1 – 0.5 = 0.5 for step2 Self-force: 2.8 * * 0.5 = n It can be scheduled in the first two steps. p(1) = 0.5; p(2) = 0.5; p(3) = 0; p(4) =0. n Distribution: q(1) = 2.8; q(2) = 2.3; q(3)=0.8. n Assign v 6 to step 1 variation in probability 1 – 0.5 = 0.5 for step1 variation in probability 0 – 0.5 = -0.5 for step2 Self-force: 2.8 * * -0.5 = n Assign v 6 to step 2 variation in probability 0 – 0.5 = -0.5 for step1 variation in probability 1 – 0.5 = 0.5 for step2 Self-force: 2.8 * * 0.5 = -0.25

47 … Example: Operation v6 … n Successor-force Assigning v 6 to step 2 implies operation v 7 assigned to step (0-0.5) (1 -0.5) = -.75 Total-force on v 6 = (-0.25)+(-0.75)=-1. n Conclusion Least force is for step 2. Assigning v 6 to step 2 reduces concurrency (i.e. resources). n Total force on an operation related to a schedule step = self force + predecessor/successor forces with affected time frame n Successor-force Assigning v 6 to step 2 implies operation v 7 assigned to step (0-0.5) (1 -0.5) = -.75 Total-force on v 6 = (-0.25)+(-0.75)=-1. n Conclusion Least force is for step 2. Assigning v 6 to step 2 reduces concurrency (i.e. resources). n Total force on an operation related to a schedule step = self force + predecessor/successor forces with affected time frame

48 … Example: Operation v6 n Assignment of v 6 to step 2 makes v 7 assigned at step 3 Time frame change from [2, 3] to [3, 3] Variation on force of v 7 = 1*q(3) – ½ * (q(2)+q(3)) = ( )= n Assignment of v 8 to step 2 makes v 9 assigned to step 3 or 4 Time frame change from [2, 3, 4] to [3, 4] Variation on force of v 9 = 1/2*(q(3)+q(4)) – 1/3 * (q(2)+q(3)+q(4)) = 0.5*(2+1.6)-0.3*( )=0.3 n Assignment of v 6 to step 2 makes v 7 assigned at step 3 Time frame change from [2, 3] to [3, 3] Variation on force of v 7 = 1*q(3) – ½ * (q(2)+q(3)) = ( )= n Assignment of v 8 to step 2 makes v 9 assigned to step 3 or 4 Time frame change from [2, 3, 4] to [3, 4] Variation on force of v 9 = 1/2*(q(3)+q(4)) – 1/3 * (q(2)+q(3)+q(4)) = 0.5*(2+1.6)-0.3*( )=0.3

49 Force-Directed List Scheduling: Minimum Latency under Resource Constraints n Outer structure of algorithm same as LIST-L. n Selected candidates determined by Reducing iteratively candidate set U l,k. Operations with least force are deferred. Maximize local concurrency by selecting operations with large force. At each outer iteration of loop, time frames updated. n Outer structure of algorithm same as LIST-L. n Selected candidates determined by Reducing iteratively candidate set U l,k. Operations with least force are deferred. Maximize local concurrency by selecting operations with large force. At each outer iteration of loop, time frames updated.

50 Force-Directed Scheduling Algorithm for Minimum Resources n Operations considered one a time for scheduling n For each iteration Time frames, probabilities and forces computed Operation with least force scheduled n Operations considered one a time for scheduling n For each iteration Time frames, probabilities and forces computed Operation with least force scheduled

51 Scheduling Algorithms for Extended Sequencing Models n For hierarchical sequencing graphs, scheduling performed bottom up. n Computed start times are relative to source vertices in corresponding graph entities. n Timing and resource-constrained scheduling is not straightforward. n Simplifying assumptions No resource can be shared across different graph entities in hierarchy. Timing and resource constraints apply within each graph entity. Schedule each graph entity independently. n For hierarchical sequencing graphs, scheduling performed bottom up. n Computed start times are relative to source vertices in corresponding graph entities. n Timing and resource-constrained scheduling is not straightforward. n Simplifying assumptions No resource can be shared across different graph entities in hierarchy. Timing and resource constraints apply within each graph entity. Schedule each graph entity independently.

52 Scheduling Graphs with Alternative Paths n Assume sequencing graph has alternative paths related to branching constructs. Obtained by expanding branch entities n ILP formulation Resource constraints need to express that operations in alternative paths can be scheduled in same time step without affecting resource usage. n Example Assume that path (v 0, v 8, v 9, v n ) is mutually exclusive with other operations. n Assume sequencing graph has alternative paths related to branching constructs. Obtained by expanding branch entities n ILP formulation Resource constraints need to express that operations in alternative paths can be scheduled in same time step without affecting resource usage. n Example Assume that path (v 0, v 8, v 9, v n ) is mutually exclusive with other operations.

53 Scheduling Graphs with Alternative Paths n Resource constraints x 1,1 +x 2,1 +x 6,1 – a 1  0 x 3,2 +x 6,2 +x 7,2 – a 1  0 x 10,2 +x 11,2 – a 2  0 x 4,3 +x 10,3 +x 11,3 – a 2  0 x 5,4 +x 11,4 – a 2  0 n List scheduling and force-directed scheduling algorithms can support mutually exclusive operations By modifying way resource usage computed. n Resource constraints x 1,1 +x 2,1 +x 6,1 – a 1  0 x 3,2 +x 6,2 +x 7,2 – a 1  0 x 10,2 +x 11,2 – a 2  0 x 4,3 +x 10,3 +x 11,3 – a 2  0 x 5,4 +x 11,4 – a 2  0 n List scheduling and force-directed scheduling algorithms can support mutually exclusive operations By modifying way resource usage computed.