ESE535: Electronic Design Automation


ESE535: Electronic Design Automation Day 16: March 18, 2013 Architecture Synthesis (Provisioning, Allocation) Penn ESE535 Spring 2013 -- DeHon

Today: Problem. Brute-Force/Exhaustive. Greedy. Estimators. Analytical Provisioning. ILP Schedule and Provision. [Figure: design flow from Behavioral (C, MATLAB, …) to RTL, Gate Netlist, Layout, Masks, with steps Arch. Select, Schedule, FSM assign, Two-level/Multilevel opt., Covering, Retiming, Placement, Routing]

Previously: General formulation for scheduled operator sharing (VLIW). Fast algorithms for scheduling onto a fixed resource set (List Scheduling).

VLIW [Figure: operators (+, X), Instruction Memory, Address]

Today: How do we determine the set of resources? [Figure: operators (+, X)]

Today: Provisioning. Given: an area budget, a graph to schedule, a library of operators. Determine: the delay-minimizing set of operators, or a delay-achieving set of operators; i.e., select the operator set.

Exhaustive: Identify all area-feasible operator sets (e.g., preclass exercise). Schedule for each. Select best → optimal. Drawbacks?

Exhaustive: How large is the space of feasible operator sets? As a function of the number of operator types O (types: add, multiply, divide, …) and the maximum number m of operators of any one type.
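To get a feel for the size of that space, here is a minimal sketch that enumerates area-feasible operator sets. With O types and up to m operators of each, there are at most m^O candidate sets before the area filter. The areas and budget in the example are hypothetical, and requiring at least one operator of each type is an assumption made here so that every set can schedule the graph.

```python
from itertools import product

def feasible_operator_sets(areas, budget):
    """Yield every count tuple M (at least one of each type, an assumption)
    whose total area fits within the budget."""
    total_min = sum(areas)  # area consumed by one operator of every type
    # Upper limit per type: what fits once one of every other type is placed.
    ranges = [range(1, (budget - (total_min - a)) // a + 1) for a in areas]
    for counts in product(*ranges):
        if sum(a * m for a, m in zip(areas, counts)) <= budget:
            yield counts

# Hypothetical library: three operator types with areas 1, 2, 3; budget 10.
sets_found = list(feasible_operator_sets([1, 2, 3], 10))
print(len(sets_found), "feasible sets, each needing its own schedule run")
```

Every tuple produced here would still have to be scheduled, which is why the count matters.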

Implication: Feasible operator space can be too large to explore exhaustively.

Greedy Incremental: Start with one of each operator. While (there is area to hold an operator): which single operator can be added without exceeding the area limit and provides the largest benefit/operator-area? Add one operator of that type. How long does this run? Tschedule(E) * O(operator-types * A). Weakness?
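The loop above can be sketched as follows. This is an illustration under stated assumptions, not the course's implementation: `list_schedule` is a simple unit-latency list scheduler without priorities, the graph, areas, and budget are hypothetical, and this variant stops as soon as no single addition shortens the schedule (the slide's loop runs while area remains).

```python
def list_schedule(graph, op_type, counts):
    """Schedule length in cycles for a DAG of unit-latency nodes.
    graph: node -> list of predecessors; op_type: node -> type name;
    counts: type name -> operators available (each must be >= 1)."""
    start, cycle, remaining = {}, 0, set(graph)
    while remaining:
        free = dict(counts)  # operators available in this cycle
        ready = sorted(n for n in remaining
                       if all(p in start and start[p] < cycle for p in graph[n]))
        for n in ready:
            if free[op_type[n]] > 0:
                free[op_type[n]] -= 1
                start[n] = cycle
                remaining.discard(n)
        cycle += 1
    return cycle

def greedy_provision(graph, op_type, areas, budget):
    counts = {t: 1 for t in areas}  # start with one of each operator
    used = sum(areas.values())
    while True:
        base = list_schedule(graph, op_type, counts)
        best, best_gain = None, 0.0
        for t, a in areas.items():
            if used + a > budget:
                continue  # this addition would exceed the area limit
            trial = dict(counts)
            trial[t] += 1
            gain = (base - list_schedule(graph, op_type, trial)) / a
            if gain > best_gain:  # largest benefit per unit of operator area
                best, best_gain = t, gain
        if best is None:
            return counts, base
        counts[best] += 1
        used += areas[best]

# Hypothetical graph: four adds feeding two multiplies.
graph = {"a": [], "b": [], "c": [], "d": [], "e": ["a", "b"], "f": ["c", "d"]}
op_type = {"a": "add", "b": "add", "c": "add", "d": "add", "e": "mul", "f": "mul"}
result = greedy_provision(graph, op_type, {"add": 1, "mul": 2}, budget=5)
print(result)
```

Each pass reschedules once per operator type, which is where the Tschedule(E) * O(operator-types * A) runtime comes from.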

Example: Find the best 5-operator solution. [Figure: graph with nodes A, B, C, D, E, F, G, H, I, J, K]

Example: Find the best 5-operator solution. One of each. [Figure: graph and schedule table with columns Sq, Dia, Circ]

Example: Find the best 5-operator solution. Two squares. [Figure: graph and schedule table with columns Sq, Dia, Circ]

Example: Find the best 5-operator solution. Two diamonds. [Figure: graph and schedule table with columns Sq, Dia, Circ]

Example: Find the best 5-operator solution. Two circles. [Figure: graph and schedule table with columns Sq, Dia, Circ]

Example: Find the best 5-operator solution. Which should greedy add? Incremental addition does not accelerate.

Example: Find the best 5-operator solution. Two squares + two diamonds. [Figure: graph and schedule table with columns Sq, Dia, Circ] Max effect: incremental may not suggest the next single addition (maybe better with a cost function that accounts for total slack?).

Analytic Formulation

Challenge: Scheduling is expensive: O(|E|) or O(|E|*log(|V|)) using list-schedule. Results are not analytic: we cannot write an equation around them. Bounds are sometimes useful: with no precedence → the schedule is the resource bound; the latency bound is unaffected by operator count. Often one bound dominates.

Estimations: Step 1: estimate with the resource bound (O(|E|) vs. O(|V|) evaluation). Step 2: use the estimate in equations: T = max(N1/M1, N2/M2, …). Most useful when RB >> CP (resource bound much larger than critical path).
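A minimal sketch of the resource-bound estimate, taking Ni as the number of operations of type i in the graph and Mi as the number of operators of that type. The ceiling is an addition here so that the estimate is a whole number of cycles; the slide's formula omits it. The operation counts (8 adds, 4 multiplies) are the ones used in the preclass slide.

```python
from math import ceil

def resource_bound(op_counts, operators):
    """T = max_i ceil(N_i / M_i): cycles needed if precedence is ignored."""
    return max(ceil(op_counts[t] / operators[t]) for t in op_counts)

ops = {"add": 8, "mpy": 4}
print(resource_bound(ops, {"add": 2, "mpy": 1}))
print(resource_bound(ops, {"add": 3, "mpy": 2}))
```

Evaluating this takes one pass over the operation counts, compared with a full list-schedule run for every candidate operator set.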

Constraints: Let Ai be the area of operator type i. Let Mi be the number of operators of type i. (Start summary of variables on board.)

Achieve Time Target: Want to achieve a schedule in T cycles. Each resource bound must be at most T cycles: Ni/Mi ≤ T.

Algebraic Solve: Set of equations: Ni/Mi ≤ T; Σ Ai Mi ≤ Area. Assume equality for the time bound: Ni/Mi = T → Mi = Ni/T.

Rearranging
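The equations for this step did not survive in the transcript. A reconstruction from the surrounding definitions, offered as a sketch: substitute the equality Mi = Ni/T into the area constraint and solve for T.

```latex
\[
\sum_i A_i M_i \le \mathrm{Area},
\qquad M_i = \frac{N_i}{T}
\;\Longrightarrow\;
\frac{1}{T}\sum_i A_i N_i \le \mathrm{Area}
\;\Longrightarrow\;
T \ge \frac{\sum_i A_i N_i}{\mathrm{Area}}
\]
```

The right-hand side is the area needed to hold one operator per operation, divided by the area actually available.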

Bounding T: Gives a lower bound on T. Intuition: N of each is the right balance given unbounded area; scale to the area available.

Preclass: What is Tlower for preclass?

Back Substitute from T to x: Mi = Ni/T. Mi won’t necessarily be an integer. Rounding down gives a definitely feasible solution. May have room to move a few up by 1. Reduces the range we may need to search (just over the residual area once rounded down).

Preclass: Mi = Ni/T, T ≥ 3. Madd, Mmpy? Madd = 8/3 → 2 or 3. Mmpy = 4/3 → 1 or 2.
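The back-substitution and rounding step can be sketched as below. The operation counts (8 adds, 4 multiplies) come from the slide; the operator areas and the area budget are hypothetical values chosen for illustration, since the preclass figures are not in the transcript. Exact fractions avoid floating-point surprises when rounding down.

```python
from fractions import Fraction
from math import floor

def back_substitute(op_counts, areas, budget):
    """Return (T_lower, rounded-down M_i, residual area)."""
    t_lower = Fraction(sum(areas[t] * op_counts[t] for t in op_counts), budget)
    m = {t: floor(op_counts[t] / t_lower) for t in op_counts}  # round down
    residual = budget - sum(areas[t] * m[t] for t in m)
    return t_lower, m, residual

# Hypothetical areas (add = 1, mpy = 2) and budget (5 units).
t_lower, m, residual = back_substitute({"add": 8, "mpy": 4},
                                       {"add": 1, "mpy": 2}, budget=5)
print(t_lower, m, residual)
```

Here the residual area leaves room to move one count up by 1, so only that small neighborhood needs to be searched. If a count rounds down to zero, it has to be raised to 1 before the graph can be scheduled at all.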

Counter Example: 1 unit each; area = 4 units. What would analytic predict? What is best? How does CP compare to RB? The analytic resource estimate is most useful when RB >> CP.

Analytic Counter Example: How would greedy incremental work on this one?

ILP: Maybe we can do exhaustive, if we formulate properly.

ILP: Integer Linear Programming. Formulate a set of linear constraints (equations and inequalities), e.g. Ax0 + Bx1 + Cx2 ≤ D and x0 + x1 = 1, where A, B, C, D are constants and the xi are variables to satisfy. No products of variables, just linear weighted sums. Can constrain variables to integers. No polynomial time guarantee, but often practical. Solvers exist (significant piece next lecture).

ILP Provision and Schedule: Now to make it look like an ILP nail… Formulate operator selection and scheduling as an ILP problem.

Formulation: Integer variables Mi: number of operators of type i. 0-1 (binary) variables xi,j: 1 if node i is scheduled into timestep j, 0 otherwise. The variable assignment completely specifies operator selection and schedule. This formulation is for achieving a target time T; j ranges from 0 to T-1.

Target T → Min T: Formulation targets T. What if we don’t know T? Want to minimize T? Do binary search for the minimum T. How does that impact solution time?
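A minimal sketch of the binary search over T. The `feasible` callback stands in for one ILP solve at a fixed target T and is hypothetical; the search also assumes feasibility is monotone (if T cycles suffice, so do T+1).

```python
def min_feasible_T(feasible, lo, hi):
    """Smallest T in [lo, hi] with feasible(T) true, or None if hi fails."""
    if not feasible(hi):
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid        # mid works, so the minimum is at or below mid
        else:
            lo = mid + 1    # mid fails, so the minimum is above mid
    return lo

calls = []
def toy_feasible(T):
    """Stand-in for an ILP solve; pretends 7 cycles is the true minimum."""
    calls.append(T)
    return T >= 7

best_T = min_feasible_T(toy_feasible, 1, 64)
print(best_T, "found with", len(calls), "solver calls")
```

The search costs roughly log2 of the range width in solver calls, so solution time grows by that factor compared with a single fixed-T solve.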

Constraints: What properties must hold true for a solution to be valid? Total area constraints. Not assign too many things to a timestep. Assign every node to some timestep. Maintain precedence.

(1) Total Area: Same as before: Σ Ai Mi ≤ Area.

(2) Not overload timestep: For each timestep j, for each operator type k.

(3) Node is scheduled: For each node in the graph. Can narrow to sum over slack window.

(4) Precedence Holds: For each edge from node src to node snk. Can narrow to sum over slack windows.
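The equations on slides (1) through (4) are missing from the transcript. One standard way to write them, given as a reconstruction rather than the slides' exact form: k indexes operator types, i indexes nodes, type(i) is the operator type of node i, and unit operator latency is assumed for the precedence constraint.

```latex
\begin{align*}
\text{(1)}\quad & \sum_k A_k M_k \le \mathrm{Area} \\
\text{(2)}\quad & \sum_{i \,:\, \mathrm{type}(i)=k} x_{i,j} \le M_k
      && \text{for each timestep } j \text{ and operator type } k \\
\text{(3)}\quad & \sum_{j=0}^{T-1} x_{i,j} = 1
      && \text{for each node } i \\
\text{(4)}\quad & \sum_{j=0}^{T-1} j\, x_{\mathit{snk},j}
      - \sum_{j=0}^{T-1} j\, x_{\mathit{src},j} \ge 1
      && \text{for each edge } \mathit{src} \to \mathit{snk}
\end{align*}
```

All four are linear in the variables M_k and x_{i,j}, which is what lets an ILP solver handle them directly.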

Constraints: Total area constraints. Not assign too many things to a timestep. Assign every node to some timestep. Maintain precedence.
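As a sanity check on these four properties, here is a minimal sketch of a validator for a candidate assignment. It is not a solver: it only tests whether given values of Mk and xi,j satisfy the constraints. The graph, areas, and budget are hypothetical, and unit operator latency is assumed for precedence.

```python
def check_solution(x, M, T, node_type, edges, areas, budget):
    """x maps (node, timestep) -> 1 for scheduled slots; M maps type -> count."""
    nodes, steps = list(node_type), range(T)
    # (1) total area
    if sum(areas[k] * M[k] for k in M) > budget:
        return False
    # (2) no timestep uses more operators of a type than provisioned
    for j in steps:
        for k in M:
            if sum(x.get((i, j), 0) for i in nodes if node_type[i] == k) > M[k]:
                return False
    # (3) every node is scheduled into exactly one timestep
    if any(sum(x.get((i, j), 0) for j in steps) != 1 for i in nodes):
        return False
    # (4) precedence: sink at least one step after source (unit latency assumed)
    def when(i):
        return sum(j * x.get((i, j), 0) for j in steps)
    return all(when(snk) - when(src) >= 1 for src, snk in edges)

# Hypothetical instance: two adds feeding one multiply, one operator of each.
node_type = {"a": "add", "b": "add", "c": "mpy"}
edges = [("a", "c"), ("b", "c")]
M, areas = {"add": 1, "mpy": 1}, {"add": 1, "mpy": 2}
good = {("a", 0): 1, ("b", 1): 1, ("c", 2): 1}
bad = {("a", 0): 1, ("b", 0): 1, ("c", 1): 1}  # two adds in timestep 0
print(check_solution(good, M, 3, node_type, edges, areas, 3))
print(check_solution(bad, M, 3, node_type, edges, areas, 3))
```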

ILP Solver: An ILP solver can take these constraints and find a solution (satisfying assignment). On Wednesday, we will see how to start to make this practical.

SAT/ILP Scheduling Variant (Demonstration) <if time permits>

Two Constraint Challenge: Processing elements have limited memory: instruction memory (data memory). Tasks have different requirements for compute and instruction memory, i.e. run length is not correlated to code length. No provisioning, scheduling.

Plishker Task Example

Task: Schedule tasks onto PEs obeying both memory and compute capacity limits. Example and ILP solution from Plishker et al. NSCD2004.

Task: Schedule tasks onto PEs obeying both memory and compute capacities → two-capacity assignment problem → two-capacity bin packing problem. Task i: <Ci, Ii>.
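For comparison with an exact SAT/ILP solution, a greedy first-fit packing under two capacities can be sketched as follows. Each task is a pair (Ci, Ii) of compute and instruction-memory demand; the task values and PE capacities are hypothetical.

```python
def first_fit(tasks, num_pes, compute_cap, imem_cap):
    """tasks: list of (C_i, I_i). Returns the PE index chosen for each task,
    or None if some task fits on no PE."""
    load = [[0, 0] for _ in range(num_pes)]  # [compute, imem] used per PE
    assignment = []
    for c, i in tasks:
        for pe, (used_c, used_i) in enumerate(load):
            if used_c + c <= compute_cap and used_i + i <= imem_cap:
                load[pe][0] += c
                load[pe][1] += i
                assignment.append(pe)
                break
        else:
            return None  # no PE has room in both dimensions
    return assignment

packed = first_fit([(50, 10), (50, 40), (30, 30), (60, 20)], 2, 100, 50)
stuck = first_fit([(40, 40), (40, 10), (60, 10), (60, 40)], 2, 100, 50)
print(packed, stuck)
```

The second task list does have a valid packing (tasks 0 and 2 on one PE, tasks 1 and 3 on the other), but first-fit commits early and misses it. An exact formulation does not have that failure mode.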

SAT Packing: Variables: Ai,j: task i assigned to resource j. Constraints: coverage constraints; uniqueness constraints; cardinality constraints (PE compute, PE memory).

Allow Code Sharing: Two tasks of the same type can share code. Instead of memory capacity: vector of memory usage. Compute the PE Imem vector as the OR of the task vectors assigned to it. Compute mem space as the sum of non-zero vector entry weights (dot product).

Allow Code Sharing: Two tasks of the same type can share code. Each task has a vector of memory usage: task i needs set of instructions k: Ti,k. Compute PE Imem vector, OR (all i): PE.Imemj,k += Ai,j * Ti,k. PE mem space: PE.Total_Imemj = Σk (PE.Imemj,k * Instrs(k)).
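A small sketch of the shared-code memory computation for a single PE. It takes the OR of the instruction vectors of the tasks assigned to the PE, then sums the sizes of the blocks that are present. The names `task_instrs` and `instr_size` and the example values are hypothetical stand-ins for Ti,k and Instrs(k).

```python
def pe_imem_usage(assigned_tasks, task_instrs, instr_size):
    """assigned_tasks: task ids placed on this PE.
    task_instrs[i][k] is 1 if task i needs instruction block k.
    instr_size[k] is the memory taken by block k."""
    imem = [0] * len(instr_size)
    for i in assigned_tasks:
        # OR in this task's blocks; shared blocks are counted once.
        imem = [have | need for have, need in zip(imem, task_instrs[i])]
    # Dot product of the presence vector with the block sizes.
    return sum(bit * size for bit, size in zip(imem, instr_size))

task_instrs = {0: [1, 1, 0], 1: [1, 1, 0], 2: [0, 1, 1]}
instr_size = [10, 20, 30]
print(pe_imem_usage([0, 1], task_instrs, instr_size))
print(pe_imem_usage([0, 2], task_instrs, instr_size))
```

Tasks 0 and 1 are of the same type, so placing both on one PE costs 30 units rather than 60.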

Symmetries: Many symmetries: tasks in the same class are equivalent; PEs are indistinguishable. Speedup with symmetry breaking: impose a total ordering on tasks and PEs, and add constraints to force tasks to be assigned to PEs by that ordering. Plishker claims “significant runtime speedup” using the GALENA [DAC 2003] pseudo-Boolean SAT solver.

Plishker Task Example

Results: Greedy (first-fit) binpack. SAT/ILP solve: solutions in < 1 second.

Why can they do this? Ignore precedence? Ignore interconnect?

Why can they do this? Ignore precedence? Feed forward, buffered. Ignore interconnect? Through shared memory, not dominant?

Interconnect Buffers: Allow “Software Pipelining”. Each data item. Spatial: we would pipeline, running all three at once. Think of each schedule instance as one timestep in a spatial pipeline.

Interconnect Buffer [Figure: tasks A, B, C repeated across schedule instances; labels 50, 100, PE0, PE1]

Round up: Algorithms and Runtimes: Exhaustive Schedule. Greedy Schedule. Analytic Estimates. ILP formulation.

Big Ideas: Estimators. Dominating effects. Reformulate as a problem we already have a solution for (ILP). Technique: Greedy. Technique: ILP.

Admin: Reading for Wednesday on web. My grading priority now will be 5a.