ECE 667 - Synthesis & Verification - Lecture 5: ECE 697B (667) Spring 2006, Synthesis and Verification of Digital Circuits, Scheduling.

1 ECE 697B (667) Spring 2006, Synthesis and Verification of Digital Circuits: Scheduling, Constructive Algorithms

2 Scheduling – a Combinatorial Optimization Problem
NP-complete problem
Optimal solutions for special cases and ILP
Heuristics: iterative improvement
Heuristics: constructive
Various versions of the problem
–Unconstrained minimum latency
–Resource-constrained minimum latency
–Timing-constrained minimum latency
–Latency-constrained minimum
If all resources are identical, the problem reduces to multiprocessor scheduling (Hu’s algorithm)
Minimum-latency multiprocessor problem is intractable

3 Scheduling - Iterative Improvement
Kernighan - Lin (deterministic)
Simulated Annealing
Lottery Iterative Improvement
Neural Networks
Genetic Algorithms
Taboo Search

4 Scheduling - Constructive Techniques
Most Constrained
Least Constraining

5 Force Directed Scheduling
Goal is to reduce hardware by balancing concurrency
Iterative algorithm, one operation scheduled per iteration
Information (i.e. speed & area) fed back into scheduler

6 The Force Directed Scheduling Algorithm

7 Step 1
Determine ASAP and ALAP schedules
[Figure: ASAP and ALAP schedules of the example graph]
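As a minimal sketch of Step 1, the Python below computes ASAP and ALAP C-steps for a precedence graph in which every operation takes one C-step. The node names (m1 to m6, s1, s2, a1, a2, c1), the edge list, and the latency of 4 are assumptions: an illustrative reconstruction of the diff-eq example, not copied from the slides, and the names need not match the slides' node numbers.

```python
from collections import defaultdict

# Hypothetical reconstruction of the diff-eq precedence graph (unit-delay ops).
OPS = {"m1": "mul", "m2": "mul", "m3": "mul", "m4": "mul", "m5": "mul",
       "m6": "mul", "s1": "sub", "s2": "sub", "a1": "add", "a2": "add",
       "c1": "comp"}
EDGES = [("m1", "m3"), ("m2", "m3"), ("m3", "s1"), ("s1", "s2"),
         ("m4", "m5"), ("m5", "s2"), ("m6", "a1"), ("a2", "c1")]


def asap_alap(ops, edges, latency):
    """Return (asap, alap) dicts mapping each op to a 1-based C-step."""
    preds, succs = defaultdict(list), defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)

    asap, alap = {}, {}

    def earliest(v):
        # One step after the latest predecessor; sources start in step 1.
        if v not in asap:
            asap[v] = 1 + max((earliest(u) for u in preds[v]), default=0)
        return asap[v]

    def latest(v):
        # One step before the earliest successor; sinks end in the last step.
        if v not in alap:
            alap[v] = min((latest(w) for w in succs[v]), default=latency + 1) - 1
        return alap[v]

    for v in ops:
        earliest(v)
        latest(v)
    return asap, alap


asap, alap = asap_alap(OPS, EDGES, latency=4)
for op in OPS:
    print(op, asap[op], alap[op])
```

Operations whose ASAP and ALAP steps coincide lie on the critical path; the others have slack that later steps of the algorithm can exploit.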

8 Step 2
Determine Time Frame of each op
–Length of box ~ Possible execution cycles
–Width of box ~ Probability of assignment
–Uniform distribution, Area assigned = 1
[Figure: Time Frames across C-step 1 to C-step 4, with box widths 1/2 and 1/3]

9 Step 3
Create Distribution Graphs
–Sum of probabilities of each Op type
–Indicates concurrency of similar Ops
DG(i) = Σ Prob(Op, i)
[Figure: DG for Multiply; DG for Add, Sub, Comp]
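Steps 2 and 3 can be sketched together: each operation spreads a probability of 1 uniformly over its time frame, and the DG for a type is the per-C-step sum of those probabilities. The time frames and op names below are assumptions (an illustrative reconstruction of the diff-eq example with latency 4), and "alu" is a hypothetical label for the combined Add, Sub, Comp class.

```python
from collections import defaultdict

# op name -> (type, first C-step, last C-step); hypothetical example data.
FRAMES = {
    "m1": ("mul", 1, 1), "m2": ("mul", 1, 1), "m3": ("mul", 2, 2),
    "m4": ("mul", 1, 2), "m5": ("mul", 2, 3), "m6": ("mul", 1, 3),
    "s1": ("alu", 3, 3), "s2": ("alu", 4, 4),
    "a1": ("alu", 2, 4), "a2": ("alu", 1, 3), "c1": ("alu", 2, 4),
}


def distribution_graphs(frames, n_steps):
    """Sum uniform assignment probabilities per op type and C-step."""
    dg = defaultdict(lambda: [0.0] * n_steps)
    for kind, first, last in frames.values():
        prob = 1.0 / (last - first + 1)   # uniform over the time frame
        for step in range(first, last + 1):
            dg[kind][step - 1] += prob
    return dict(dg)


dg = distribution_graphs(FRAMES, n_steps=4)
for kind, values in dg.items():
    print(kind, [round(v, 3) for v in values])
```

With these assumed frames the multiply DG starts at 2.833 and 2.333 in the first two C-steps, which agrees with the values used in the self-force example.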

10 Diff Eq Example: Precedence Graph Recalled

11 Diff Eq Example: Time Frame & Probability Calculation

12 Diff Eq Example: DG Calculation

13 Conditional Statements
Operations in different branches are mutually exclusive
Operations of same type can be overlapped onto DG
Probability of most likely operation is added to DG
[Figure: DG for Add, with Fork and Join branches]

14 Self Forces
Scheduling an operation will affect overall concurrency
Every operation has a 'self force' for every C-step of its time frame
Analogous to the effect of a spring: f = K x
Desirable scheduling will have negative self force
–Will achieve better concurrency (lower potential energy)
Force(i) = DG(i) * x(i)
–DG(i) ~ Current Distribution Graph value
–x(i) ~ Change in operation’s probability
Self Force(j) = Σ [Force(i)], summed over the C-steps i of the operation's time frame

15 Example
Attempt to schedule multiply in C-step 1
Self Force(1) = Force(1) + Force(2)
= ( DG(1) * x(1) ) + ( DG(2) * x(2) )
= [2.833 * (0.5) + 2.333 * (-0.5)] = +0.25
This is positive, so scheduling the multiply in the first C-step would be bad
[Figure: DG for Multiply and time frames across C-step 1 to C-step 4, with box widths 1/2 and 1/3]
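The arithmetic of this example can be checked with a short sketch. The function name and the DG values for C-steps 3 and 4 are assumptions; the first two DG values and the two-step time frame come from the example. Assigning the operation to step j changes its probability there from p to 1 (x = 1 - p) and to 0 everywhere else in the frame (x = -p).

```python
def self_force(dg, frame, step):
    """Self force of fixing an op with time frame (first, last) at `step`."""
    first, last = frame
    p = 1.0 / (last - first + 1)          # current uniform probability
    total = 0.0
    for i in range(first, last + 1):
        x = (1.0 if i == step else 0.0) - p   # change in probability at step i
        total += dg[i - 1] * x                # Force(i) = DG(i) * x(i)
    return total


# Multiply DG; entries for C-steps 3 and 4 are assumed for illustration.
MUL_DG = [2.833, 2.333, 0.833, 0.0]

force_step1 = self_force(MUL_DG, (1, 2), 1)
force_step2 = self_force(MUL_DG, (1, 2), 2)
print(round(force_step1, 3), round(force_step2, 3))
```

The two self forces of a two-step frame are equal and opposite, so the negative one (C-step 2 here) marks the assignment that better balances the multiply DG.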

16 Diff Eq Example: Self Force for Node 4

17 Predecessor & Successor Forces
Scheduling an operation may affect the time frames of other linked operations
This may negate the benefits of the desired assignment
Predecessor/Successor Forces = Sum of Self Forces of any implicitly scheduled operations
[Figure: precedence graph]

18 Diff Eq Example: Successor Force on Node 4
If node 4 scheduled in step 1
–no effect on time frame for successor node 8
Total force = Force4(1) = +0.25
If node 4 scheduled in step 2
–causes node 8 to be scheduled into step 3
–must calculate successor force
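A hedged sketch of the successor-force calculation follows. It treats any narrowing of a time frame as a change in probabilities and sums DG(i) times that change. The frames (1, 2) for the scheduled operation and (2, 3) for its successor, the assumption that both are multiplies sharing one DG, and the DG values for C-steps 3 and 4 are all illustrative assumptions rather than data read from the slides.

```python
def frame_change_force(dg, old, new):
    """Force caused by shrinking a time frame from `old` to `new`."""
    def prob(frame, i):
        first, last = frame
        return 1.0 / (last - first + 1) if first <= i <= last else 0.0

    lo, hi = min(old[0], new[0]), max(old[1], new[1])
    return sum(dg[i - 1] * (prob(new, i) - prob(old, i))
               for i in range(lo, hi + 1))


MUL_DG = [2.833, 2.333, 0.833, 0.0]   # assumed multiply DG

# Scheduling the operation in step 2 fixes its frame at (2, 2) ...
self_f = frame_change_force(MUL_DG, (1, 2), (2, 2))
# ... and implicitly pushes its successor from frame (2, 3) into step 3.
succ_f = frame_change_force(MUL_DG, (2, 3), (3, 3))
total_force = self_f + succ_f
print(round(self_f, 3), round(succ_f, 3), round(total_force, 3))
```

Under these assumptions the total force for step 2 is negative, so step 2 compares favorably with the +0.25 total for step 1.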

19 Diff Eq Example: Final Time Frame and Schedule

20 Diff Eq Example: Final DG

21 Lookahead
Temporarily modify the constant DG(i) to include the effect of the iteration being considered
Force(i) = temp_DG(i) * x(i)
temp_DG(i) = DG(i) + x(i)/3
Consider previous example:
Self Force(1) = (DG(1) + x(1)/3) x(1) + (DG(2) + x(2)/3) x(2)
= 0.5 (2.833 + 0.5/3) - 0.5 (2.333 - 0.5/3)
= +0.41667
This is even worse than before
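The lookahead variant can be sketched by swapping DG(i) for temp_DG(i) = DG(i) + x(i)/3 inside the self-force sum. The function name is hypothetical, and the DG entries for C-steps 3 and 4 are assumed; the remaining numbers are those of the example.

```python
def lookahead_self_force(dg, frame, step):
    """Self force using temp_DG(i) = DG(i) + x(i)/3 in place of DG(i)."""
    first, last = frame
    p = 1.0 / (last - first + 1)
    total = 0.0
    for i in range(first, last + 1):
        x = (1.0 if i == step else 0.0) - p   # change in probability
        temp_dg = dg[i - 1] + x / 3.0         # DG adjusted by the tentative move
        total += temp_dg * x
    return total


MUL_DG = [2.833, 2.333, 0.833, 0.0]   # entries 3 and 4 are assumed
lookahead_force = lookahead_self_force(MUL_DG, (1, 2), 1)
print(round(lookahead_force, 5))
```

Because x(i)/3 has the same sign as x(i), each term gains x(i)²/3, so the lookahead force is never smaller than the plain self force for the same assignment.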

22 Minimization of Bus Costs
Basic algorithm suitable for narrow class of problems
Algorithm can be refined to consider “cost” factors
Number of buses ~ number of concurrent data transfers
Number of buses = maximum # transfers in any C-step
Create modified DG to include transfers: Transfer DG
Trans DG(i) = Σ [Prob(op, i) * Opn_No_InOuts]
Opn_No_InOuts ~ combined distinct in/outputs for Op
Calculate Force with this DG and add to Self Force
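A minimal sketch of the Transfer DG: it is the ordinary DG with each operation's probability weighted by its number of distinct inputs and outputs. The three operations below, their time frames, and their in/out counts are made-up illustration data, not taken from the lecture.

```python
def transfer_dg(ops, n_steps):
    """ops: list of (first C-step, last C-step, Opn_No_InOuts) tuples."""
    dg = [0.0] * n_steps
    for first, last, n_inouts in ops:
        prob = 1.0 / (last - first + 1)          # uniform over the time frame
        for step in range(first, last + 1):
            dg[step - 1] += prob * n_inouts      # Prob(op, i) * Opn_No_InOuts
    return dg


# Hypothetical ops: (first, last, combined distinct inputs/outputs).
EXAMPLE_OPS = [(1, 1, 3), (1, 2, 3), (2, 3, 2)]
trans_dg = transfer_dg(EXAMPLE_OPS, n_steps=3)
print(trans_dg)
```

A force computed against this DG penalizes assignments that pile data transfers into the same C-step, which is what drives the bus count.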

23 Minimization of Register Costs
Minimum no. of registers required is given by the largest number of data arcs crossing a C-step boundary
Create Storage Operations at the output of any operation that transfers a value to a destination in a later C-step
Generate Storage DG for these “operations”
Length of storage operation depends on final schedule
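For a fixed schedule, the register bound can be sketched by counting, at each C-step boundary, the values that are produced before it and consumed after it. This sketch counts one storage operation per producing operation (so several consumers of the same value share a register) and ignores primary inputs and outputs; both are simplifying assumptions, and the four-operation schedule is hypothetical.

```python
def min_registers(step, edges):
    """Return (max values live across any boundary, per-boundary counts)."""
    last_use = {}
    for producer, consumer in edges:
        last_use[producer] = max(last_use.get(producer, 0), step[consumer])

    n_steps = max(step.values())
    crossing = [0] * (n_steps - 1)        # boundary b lies between steps b and b+1
    for producer, end in last_use.items():
        for b in range(step[producer], end):
            crossing[b - 1] += 1
    return max(crossing, default=0), crossing


# Hypothetical schedule: op -> C-step, and data edges producer -> consumer.
STEP = {"a": 1, "b": 1, "c": 2, "d": 3}
DATA_EDGES = [("a", "c"), ("b", "d"), ("c", "d")]
registers, per_boundary = min_registers(STEP, DATA_EDGES)
print(registers, per_boundary)
```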

24 Minimization of Register Costs (contd.)
[avg life] =
storage DG(i) = (no overlap between ASAP & ALAP)
storage DG(i) = (if overlap)
Calculate and add “Storage” Force to Self Force
[Figure: ASAP schedule, 7 registers minimum; Force Directed schedule, 5 registers minimum]

25 Pipelining
Functional Pipelining
–Pipelining across multiple operations
–Must balance distribution across groups of concurrent C-steps
–Cut DG horizontally and superimpose
–Finally perform regular Force Directed Scheduling
Structural Pipelining
–Pipelining within an operation
–For non-data-dependent operations, only the first C-step need be considered
[Figure: Functional Pipelining, DG for Multiply with Instance (steps 1 to 4) overlapped with Instance’ (steps 1’ to 4’); Structural Pipelining, multiplies over steps 1 to 4]

26 Other Optimizations
Local timing constraints
–Insert dummy timing operations -> Restricted time frames
Multiclass FU’s
–Create multiclass DG by summing probabilities of relevant ops
Multistep/Chained operations
–Carry propagation delay information with operation
–Extend time frames into other C-steps as required
Hardware constraints
–Use Force as priority function in list scheduling algorithms

27 Scheduling using Simulated Annealing
Reference: Devadas, S.; Newton, A.R. Algorithms for hardware allocation in data path synthesis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, July 1989, Vol.8, (no.7):768-81.

28 Simulated Annealing
Local Search
[Figure: Cost function over the Solution space]

29 Statistical Mechanics Combinatorial Optimization
State {r_i} (configuration: a set of atomic positions)
Weight e^(-E({r_i}) / (k_B T)): Boltzmann distribution
–E({r_i}): energy of configuration
–k_B: Boltzmann constant
–T: temperature
Low temperature limit ??

30 Analogy
Physical System          Optimization Problem
State (configuration)    Solution
Energy                   Cost Function
Ground State             Optimal Solution
Rapid Quenching          Iteration Improvement
Careful Annealing        Simulated Annealing

31 Generic Simulated Annealing Algorithm
1. Get an initial solution S
2. Get an initial temperature T > 0
3. While not yet 'frozen' do the following:
3.1 For 1 ≤ i ≤ L, do the following:
3.1.1 Pick a random neighbor S' of S
3.1.2 Let Δ = cost(S') - cost(S)
3.1.3 If Δ ≤ 0 (downhill move) set S = S'
3.1.4 If Δ > 0 (uphill move) set S = S' with probability e^(-Δ/T)
3.2 Set T = rT (reduce temperature)
4. Return S
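The generic algorithm can be sketched in a few lines of Python. The parameter names, their default values, the "frozen" test (temperature below a threshold), and the toy integer problem are all assumptions chosen for illustration; a scheduling application would supply its own solution encoding, neighbor move, and cost.

```python
import math
import random


def simulated_annealing(initial, neighbor, cost, t0=10.0, r=0.9,
                        moves_per_temp=50, t_frozen=0.01, rng=None):
    """Generic SA loop; 'frozen' is assumed to mean T <= t_frozen."""
    rng = rng or random.Random(0)
    s, t = initial, t0
    while t > t_frozen:                           # step 3
        for _ in range(moves_per_temp):           # step 3.1, L = moves_per_temp
            candidate = neighbor(s, rng)          # step 3.1.1
            delta = cost(candidate) - cost(s)     # step 3.1.2
            # Downhill moves are always taken; uphill with prob. e^(-delta/T).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                s = candidate
        t *= r                                    # step 3.2
    return s                                      # step 4


# Toy problem (hypothetical): minimize (x - 3)^2 over the integers.
def toy_cost(x):
    return (x - 3) ** 2


def toy_neighbor(x, rng):
    return x + rng.choice((-1, 1))


result = simulated_annealing(20, toy_neighbor, toy_cost)
print(result, toy_cost(result))
```

As in the slide's pseudocode, the function returns the current solution rather than the best one seen; keeping a separate best-so-far copy is a common variation.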

32 Basic Ingredients for S.A.
Solution Space
Neighborhood Structure
Cost Function
Annealing Schedule

33 Observation
All scheduling algorithms we have discussed so far are critical path schedulers
They can only generate schedules for iteration period larger than or equal to the critical path
They only exploit concurrency within a single iteration, and only utilize the intra-iteration precedence constraints

34 Example
Can one do better than iteration period of 4?
–Pipelining + retiming can reduce critical path to 3, and also the # of functional units
Approaches
–Transformations followed by scheduling
–Transformations integrated with scheduling

