# ECE260B – CSE241A Winter 2005: Placement


ECE260B – CSE241A, Winter 2005: Placement. Website: http://vlsicad.ucsd.edu/courses/ece260b-w05 (lab site: http://vlsicad.ucsd.edu). Slides courtesy of Prof. Andrew B. Kahng.

VLSI Design Flow and Physical Design Stage

Definitions:

- Cell: a circuit component to be placed on the chip area. In placement, the functionality of the component is ignored.
- Net: a subset of terminals that connects several cells.
- Netlist: a set of nets that contains the connectivity information of the circuit.

Stages shown in the flow diagram: Global Placement; Detail Placement; Clock Tree Synthesis and Routing; Global Routing; Detail Routing; Power/Ground Stripes, Rings Routing; Extraction and Delay Calc.; Timing Verification; IO Pad Placement.

Placement Problem:

- Input: a set of cells and their complete information (a cell library), plus connectivity information between cells (netlist information).
- Output: a set of locations on the chip, one location for each cell.
- Goal: the cells are placed to produce a routable chip that meets timing and other constraints (e.g., low power, noise, etc.).
- Challenge: the number of cells in a design is very large (> 1 million), and the timing constraints are very tight.

Optimal Relative Order: [Figure: cells A, B, C]

Place to the left [Figure: cells A, B, C]

… or to the right [Figure: cells A, B, C]

Optimal Relative Order: without "free" space, the placement problem is dominated by order. [Figure: cells A, B, C]

Placement Problem [Figure: a bad placement versus a good placement]

Global and Detailed Placement:

- In global placement, we decide the approximate locations for cells by placing cells in global bins.
- In detailed placement, we make some local adjustments to obtain the final non-overlapping placement.

Placement Footprints: [Figure panels: Standard Cell; Data Path; IP - Floorplanning]

Placement Footprints: [Figure: mixed data path and sea of gates, with regions labeled Core, Control, IO, and Reserved areas]

Placement Footprints: [Figure panels: Perimeter IO; Area IO]

Placement objectives are subject to user constraints / design style:

- Hierarchical Design Constraints: pin location, power rail, reserved layers
- Flat Design with Floorplan Constraints: fixed circuits, I/O connections

Standard Cells

Standard Cells:

- Power connected by abutment, placed in sea-of-rows
- Rarely rotated
- DRC clean in any combination
- Circuit clean (i.e., no naked T-gates, no huge input capacitances)
- 8, 9, 10+ tracks in height
- Metal 1 only used (hopefully)
- Multi-height stdcells possible
- Buffers: sizes, intrinsic delay steps, optimal repeater selection
- Special clock buffers + gates (balanced P:N)
- Special metastability-hardened flops
- Cap cells (metal1 used?)
- Gap fillers (metal1 used?)
- Tie-high, tie-low

Unconstrained Placement

Floorplanned Placement

Placement Cube (4D):

- Cost function(s) to be used: cut, wirelength, congestion, crossing, ...
- Algorithm(s) to be used: FM, quadratic, annealing, ...
- Granularity of the netlist
- Coarseness of the layout domain: 2x2, 4x4, ...
- An effective methodology picks the right mix from the above and knows when to switch from one to the next.
- Most methods today are ad hoc.

[Figure: cube with axes Algorithm, Cost Function, Netlist Granularity, Layout Coarseness]

Advantages of Hierarchy:

- Design is carved into smaller pieces that can be worked on in parallel (improved throughput).
- A known floor plan provides the logic design team with a large degree of placement control.
- A known floor plan provides early knowledge of long wires.
- Timing closure problems can be addressed by tools, logic design, and hierarchy manipulation.
- Late design changes can be made with minimal turmoil to the entire design.

Disadvantages of Hierarchy:

- Results depend on the quality of the hierarchy. The logic hierarchy must be designed with physical design taken into account.
- Additional methodology requirements must be met to enable hierarchy, e.g., pin assignment, macro abstract management, area budgeting, floor planning, timing budgets, etc.
- Late design changes may affect multiple components.
- Hierarchy allows divergent methodologies.
- Hierarchy hinders design automation algorithms: they can no longer perform global optimizations.

Traditional Placement Algorithms:

- Quadratic Placement
- Simulated Annealing
- Bi-Partitioning / Quadrisection
- Force Directed Placement
- Hybrid

[Figure: placement cube with axes Algorithm, Cost Function, Netlist Granularity, Layout Coarseness]

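Quadratic Placement: an analytical technique. [Figure: cells at positions $x_1$, $x_2$, $x_3$, $x_4$]

$$\min F = (x_1 - x_3)^2 + (x_1 - x_2)^2 + (x_2 - x_4)^2$$

$$\frac{\partial F}{\partial x_1} = 0, \quad \frac{\partial F}{\partial x_2} = 0 \;\Rightarrow\; Ax = B$$

$$A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \quad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad B = \begin{bmatrix} x_3 \\ x_4 \end{bmatrix}$$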
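The two-variable system on this slide can be checked numerically. The sketch below is only an illustration: it assumes $x_3$ and $x_4$ are given constants (as the right-hand side $B$ suggests) and solves $Ax = B$ with Cramer's rule. The function name `solve_two_cell_qp` is hypothetical.

```python
def solve_two_cell_qp(x3, x4):
    """Minimize (x1-x3)^2 + (x1-x2)^2 + (x2-x4)^2 over x1 and x2.

    Setting both partial derivatives to zero gives
        2*x1 -   x2 = x3
         -x1 + 2*x2 = x4
    which is solved here with Cramer's rule.
    """
    a11, a12, a21, a22 = 2.0, -1.0, -1.0, 2.0
    det = a11 * a22 - a12 * a21           # determinant of A, equals 3
    x1 = (x3 * a22 - a12 * x4) / det      # (2*x3 + x4) / 3
    x2 = (a11 * x4 - a21 * x3) / det      # (x3 + 2*x4) / 3
    return x1, x2


if __name__ == "__main__":
    # With x3 = 0 and x4 = 3 the two free cells spread evenly in between.
    print(solve_two_cell_qp(0.0, 3.0))
```

For a real netlist the same idea leads to a large sparse linear system, which is why the approach scales to big designs.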

Analytical Placement:

- Get a solution with lots of overlap.
- What do we do with the overlap?

Pros and Cons of QP

Pros:

- Very fast analytical solution
- Can handle large design sizes
- Can be used as an initial seed placement engine

Cons:

- Can generate overlapped solutions: postprocessing needed
- Not suitable for timing-driven placement
- Not suitable for simultaneous optimization of other aspects of physical design (clocks, crosstalk, …)
- Gives trivial solutions without pads (and close to trivial with pads)

Simulated Annealing Placement:

- Initial placement improved through swaps and moves
- Accept a swap/move if it improves cost
- Accept a swap/move that degrades cost under some probability conditions

[Figure: cost versus time]
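A minimal sketch of the accept/reject loop described on this slide follows. The slide only says "some probability conditions"; the Metropolis rule `exp(-delta / t)`, the geometric cooling schedule, the parameter defaults, and the names `anneal` and `chain_cost` are assumptions made for illustration, not the course's implementation.

```python
import math
import random


def anneal(order, cost, t_start=10.0, t_end=0.01, alpha=0.95,
           moves_per_t=100, seed=0):
    """Improve a cell ordering by random swaps (assumed Metropolis acceptance)."""
    rng = random.Random(seed)
    current = list(order)
    cur_cost = cost(current)
    best, best_cost = list(current), cur_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            i, j = rng.sample(range(len(current)), 2)
            current[i], current[j] = current[j], current[i]      # try a swap
            delta = cost(current) - cur_cost
            # Always accept improvements; accept degradations with
            # probability exp(-delta / t), which shrinks as t cools.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cur_cost += delta
                if cur_cost < best_cost:
                    best, best_cost = list(current), cur_cost
            else:
                current[i], current[j] = current[j], current[i]  # undo the swap
        t *= alpha                                               # cool down
    return best, best_cost


def chain_cost(order, nets=(("a", "b"), ("b", "c"), ("c", "d"))):
    """1-D wirelength: each cell sits at its index in the ordering."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in nets)


if __name__ == "__main__":
    print(anneal(["a", "c", "b", "d"], chain_cost))
```

Because any callable can be passed as `cost`, the same loop works for wirelength, congestion, or a mix, which is the "open cost function" property usually credited to annealing.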

Pros and Cons of SA

Pros:

- Can reach a globally optimal solution (given "enough" time)
- Open cost function
- Can optimize all aspects of physical design simultaneously
- Can be used for end-case placement

Cons:

- Extremely slow process of reaching a good solution

Pros and Cons of Partitioning-Based Placement

Pros:

- More suitable for timing-driven placement, since it is move based
- Recent innovations (hMetis) in partitioning algorithms have made this extremely fast
- Open cost function
- Move based means simultaneous optimization of all design aspects is possible

Cons:

- Not well understood
- Lots of "indifferent" moves
- May not work well with some cost functions

Hypergraphs in VLSI CAD: the circuit netlist is represented by a hypergraph.

Hypergraph Partitioning in VLSI:

- Variants
  - directed/undirected hypergraphs
  - weighted/unweighted vertices, edges
  - constraints, objectives, …
- Human-designed instances
- Benchmarks
  - up to 4,000,000 vertices
  - sparse (vertex degree ≈ 4, hyperedge size ≈ 4)
  - small number of very large hyperedges
- Efficiency, flexibility: KL-FM style preferred

Context: Top-Down VLSI Placement [Figure]

Context: Top-Down Placement

- Speed
  - 6,000 cells/minute to final detailed placement
  - partitioning used only in top-down global placement
  - implied partitioning runtime: 1 second for 25,000 cells, < 30 seconds for 750,000 cells
- Structure
  - tight balance constraint on total cell areas in partitions
  - widely varying cell areas
  - fixed terminals (pads, terminal propagation, etc.)

Cut During One Pass (Bipartitioning) [Figure: cut versus moves]

Multilevel Partitioning [Figure: clustering phase followed by refinement phase]

Force Directed Placement:

- Cells are dragged by forces.
- Forces are generated by nets connecting cells. Longer nets generate bigger forces.
- Placement is obtained by either a constructive or an iterative method.

[Figure: force $F_{ij}$ exerted on cell $i$ by cell $j$]
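One way to picture the iterative variant is the sketch below. It assumes a spring-like model (force proportional to net length, two-pin nets only), under which the zero-force position of a cell is the centroid of its neighbors. The function name `force_directed` and the data layout are hypothetical, and overlap removal is deliberately ignored.

```python
def force_directed(positions, fixed, edges, iterations=50):
    """Repeatedly move each movable cell to the centroid of its neighbors.

    positions: dict cell -> (x, y) initial location
    fixed:     set of cells (e.g. pads) that never move
    edges:     iterable of two-pin nets (cell_a, cell_b)
    """
    pos = dict(positions)
    neighbors = {cell: [] for cell in pos}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    for _ in range(iterations):
        for cell in list(pos):
            if cell in fixed or not neighbors[cell]:
                continue
            xs = [pos[n][0] for n in neighbors[cell]]
            ys = [pos[n][1] for n in neighbors[cell]]
            # Zero-force point under the assumed spring model.
            pos[cell] = (sum(xs) / len(xs), sum(ys) / len(ys))
    return pos


if __name__ == "__main__":
    start = {"p1": (0.0, 0.0), "p2": (3.0, 0.0), "a": (0.0, 5.0), "b": (0.0, 5.0)}
    nets = [("p1", "a"), ("a", "b"), ("b", "p2")]
    print(force_directed(start, {"p1", "p2"}, nets))
```

Without the fixed pads every cell would collapse to a single point, which is the trivial solution mentioned among the drawbacks of this family of methods.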

Pros and Cons of Force Directed Placement

Pros:

- Very fast analytical solution
- Can handle large design sizes
- Can be used as an initial seed placement engine
- The Force

Cons:

- Not sensitive to the non-overlapping constraints
- Gives trivial solutions without pads
- Not suitable for timing-driven placement

Hybrid Placement:

- Mix-matching different placement algorithms
- Effective algorithms are always hybrid

GORDIAN (quadratic + partitioning) [Figure panels: Initial Placement; Partition and Replace]

Congestion Minimization:

- The traditional placement problem is to minimize interconnection length (wirelength).
- A valid placement has to be routable.
- Congestion is important because it represents routability (lower congestion implies better routability).
- There is not yet enough research work on the congestion minimization problem.

Definition of Congestion

$$\text{Overflow on each edge} = \begin{cases} \text{Routing Demand} - \text{Routing Supply} & \text{if Routing Demand} > \text{Routing Supply} \\ 0 & \text{otherwise} \end{cases}$$

$$\text{Overflow} = \sum_{\text{all edges}} \text{overflow}$$

Example (figure): routing demand = 3. Assuming routing supply is 1, overflow = 3 - 1 = 2 on this edge.
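The definition translates directly into a few lines of code. In this sketch the dictionaries keyed by edge name and the function name `total_overflow` are illustrative assumptions; the arithmetic follows the slide.

```python
def total_overflow(demand, supply):
    """Sum over all edges of max(0, routing demand - routing supply)."""
    return sum(max(0, d - supply.get(edge, 0)) for edge, d in demand.items())


if __name__ == "__main__":
    demand = {"e1": 3, "e2": 1}   # e1 is the edge from the slide's example
    supply = {"e1": 1, "e2": 1}
    print(total_overflow(demand, supply))
```

Edges whose demand does not exceed supply contribute nothing, so spare capacity on one edge never offsets overflow on another.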

Correlation between Wirelength and Congestion: Total Wirelength = Total Routing Demand

Wirelength ≠ Congestion [Figure panels: a congestion-minimized placement; a wirelength-minimized placement]

Congestion Map of a Wirelength-Minimized Placement [Figure: map with congested spots marked]

Congestion Map [Figure]

Congestion Reduction Postprocessing:

- Reduce congestion globally by minimizing the traditional wirelength.
- Post-process the wirelength-optimized placement using the congestion objective.

Congestion Reduction Postprocessing:

- Among a variety of cost functions and methods for congestion minimization, wirelength alone followed by a post-processing congestion minimization works the best and is one of the fastest.
- Cost functions such as a hybrid length plus congestion do not work very well.

Cost Functions for Placement:

- The final goal of placement is to achieve routability and meet timing constraints.
- Constraints are very hard to use in optimization, thus we use cost functions (e.g., wirelength) to predict our goals.
- We will show what happens when you try constraints directly.
- The main challenge is a technical understanding of various cost functions and their interaction.

Prediction

What is prediction?

- Every system has some critical cost functions: area, wirelength, congestion, timing, etc.
- Prediction aims at estimating values of these cost functions without having to go through the time-consuming process of full construction.
- Allows quick space exploration, localizes the search.
- For example: statistical wire-load models; wirelength in placement.

Paradigms of Prediction:

- Two fundamental paradigms
  - statistical prediction: # of two-terminal nets in all designs; # of two-terminal nets with length greater than 10 in all designs
  - constructive prediction: # of two-terminal nets with length greater than 10 in this design
- … and everything in between, e.g., # of critical two-terminal nets in a design based on statistical data and a quick inspection of the design in hand.
- "Absolute truth" or "I need it to make progress"
- SLIP (System Level Interconnect Prediction) community.

Cost Functions for Placement:

- Net-cut
- Linear wirelength
- Quadratic wirelength
- Congestion
- Timing
- Coupling
- Other performance-related cost functions
- Undiscovered: crossing

[Figure: placement cube with axes Algorithm, Cost Function, Netlist Granularity, Layout Coarseness]

Net-cut Cost for Global Placement:

- The net-cut cost is defined as the number of external nets between different global bins.
- Minimizing net-cut in global placement tends to put highly connected cells close to each other.
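As a small illustration of the definition, the sketch below counts a net as external when its cells fall into more than one global bin. The data layout (nets as tuples of cell names, `bin_of` as a cell-to-bin mapping) and the name `net_cut` are assumptions for the example.

```python
def net_cut(nets, bin_of):
    """Number of nets whose cells are spread over more than one global bin."""
    return sum(1 for net in nets if len({bin_of[cell] for cell in net}) > 1)


if __name__ == "__main__":
    nets = [("a", "b"), ("b", "c", "d"), ("c", "d")]
    bin_of = {"a": 0, "b": 0, "c": 1, "d": 1}
    print(net_cut(nets, bin_of))   # only the net (b, c, d) crosses bins
```

Moving a cell between bins changes the count only for the nets attached to that cell, which is what makes this cost cheap to update in move-based partitioners.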

Linear Wirelength Cost: the linear length of a net between cell 1 at $(x_1, y_1)$ and cell 2 at $(x_2, y_2)$ is

$$l_{12} = |x_1 - x_2| + |y_1 - y_2|$$

The linear wirelength cost is the summation of the linear length of all nets.
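A minimal sketch of this cost for two-pin nets is shown below. The function name `linear_wirelength` and the data layout are illustrative assumptions; multi-pin nets would need a net model such as a bounding box, which the slide does not specify.

```python
def linear_wirelength(nets, pos):
    """Sum of |x1 - x2| + |y1 - y2| over all two-pin nets."""
    total = 0
    for a, b in nets:
        (x1, y1), (x2, y2) = pos[a], pos[b]
        total += abs(x1 - x2) + abs(y1 - y2)
    return total


if __name__ == "__main__":
    pos = {1: (0, 0), 2: (3, 4), 3: (3, 0)}
    print(linear_wirelength([(1, 2), (2, 3)], pos))
```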

Quadratic Wirelength Cost: the quadratic length of a net between cell 1 at $(x_1, y_1)$ and cell 2 at $(x_2, y_2)$ is

$$l_{12} = (x_1 - x_2)^2 + (y_1 - y_2)^2$$

The quadratic wirelength cost is the summation of the quadratic length of all nets.
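The quadratic cost can be sketched the same way. The name `quadratic_wirelength` and the data layout are assumptions for illustration, and only two-pin nets are handled.

```python
def quadratic_wirelength(nets, pos):
    """Sum of (x1 - x2)^2 + (y1 - y2)^2 over all two-pin nets."""
    total = 0
    for a, b in nets:
        (x1, y1), (x2, y2) = pos[a], pos[b]
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2
    return total


if __name__ == "__main__":
    pos = {1: (0, 0), 2: (3, 4), 3: (3, 0)}
    print(quadratic_wirelength([(1, 2), (2, 3)], pos))
```

Because lengths are squared, one long net costs more than several short nets of the same total length, so minimizing this objective tends to shorten the longest connections first. It is also differentiable, which is what allows the analytical solution used in quadratic placement.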

Congestion Cost

$$\text{Overflow on each edge} = \begin{cases} \text{Routing Demand} - \text{Routing Supply} & \text{if Routing Demand} > \text{Routing Supply} \\ 0 & \text{otherwise} \end{cases}$$

$$\text{Congestion Overflow} = \sum_{\text{all edges}} \text{overflow}$$

Example (figure): routing demand = 3. Assuming routing supply is 1, overflow = 3 - 1 = 2 on this edge.

Cost Functions for Placement:

- Various cost functions (and a mix of them) have been used in practice to model/estimate routability and timing.
- We have a good "feel" for what each cost function is capable of doing.
- We need to understand the interaction among cost functions.

Congestion Minimization and Congestion vs Wirelength:

- Congestion is important because it closely represents routability (especially at lower levels of granularity).
- Congestion is not well understood.
- Ad hoc techniques have been kind of working, since congestion has never been severe.
- It has been observed that length minimization tends to reduce congestion.
- Goal: reduce congestion in placement (willing to sacrifice wirelength a little bit).

Correlation between Wirelength and Congestion: Total Wirelength = Total Routing Demand

Wirelength ≠ Congestion [Figure panels: a congestion-minimized placement; a wirelength-minimized placement]

Congestion Map of a Wirelength-Minimized Placement [Figure: map with congested spots marked]

Different Routing Models for Modeling Congestion:

- Bounding box router: fast but inaccurate.
- Real router: accurate but slow.
- A bounding box router can be used in placement if it produces routing results correlated with those of the real router.
- Note: for different cost functions, the answer might be different (e.g., for coupling, only a detailed router can answer).

Different Routing Models [Figure panels: a bounding box routing model; an MST + shortest_path routing model]

Objective Functions Used in Congestion Minimization:

- WL: standard total wirelength objective.
- Ovrflw: total overflow in a placement (a direct congestion cost).
- Hybrid: $(1-\alpha)\,\text{WL} + \alpha\,\text{Ovrflw}$
- QL: a quadratic plus linear objective.
- LQ: a linear plus quadratic objective.
- LkAhd: a modified overflow cost.
- $(1-\alpha_T)\,\text{WL} + \alpha_T\,\text{Ovrflw}$: a time-changing hybrid objective that lets the cost function gradually change from wirelength to overflow as optimization proceeds.
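The hybrid and time-changing hybrid objectives can be sketched as follows. The slide does not say how the weight evolves over time, so the linear ramp in `alpha_schedule` is purely an assumption, as are both function names.

```python
def hybrid_cost(wl, ovrflw, alpha):
    """(1 - alpha) * WL + alpha * Ovrflw for a weight alpha in [0, 1]."""
    return (1.0 - alpha) * wl + alpha * ovrflw


def alpha_schedule(step, total_steps):
    """Assumed linear ramp: 0 at the first step, 1 at the last step."""
    if total_steps <= 1:
        return 1.0
    return step / (total_steps - 1)


if __name__ == "__main__":
    # The objective drifts from pure wirelength to pure overflow.
    for step in range(5):
        a = alpha_schedule(step, 5)
        print(step, a, hybrid_cost(100.0, 10.0, a))
```

Wirelength and overflow are usually on very different scales, so in practice the two terms would need normalizing before any fixed weight is meaningful.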

Post Processing to Reduce Congestion:

- Reduce congestion globally by minimizing the traditional wirelength.
- Post-process the wirelength-optimized placement using the congestion objective.

Post Processing Heuristics:

- Greedy cell-centric algorithm: greedily move cells around and greedily accept moves.
- Flow-based cell-centric algorithm: use a flow-based approach to move cells.
- Net-centric algorithm: move nets with bigger contributions to the congestion first.

Greedy Cell-centric Heuristic [Figure]

Flow-based Cell-centric Heuristic [Figure: flow network with cell nodes and bin nodes]

Net-centric Heuristic [Figure: example with numeric labels]

From Global Placement to Detailed Placement:

- Global placement: assume all the cells are placed at the centers of global bins.
- Detailed placement: cells are placed without overlapping.

Correlation Between Global and Detailed Placement:

- $WL_g$: wirelength-optimized global placement.
- $WL_d$: wirelength-optimized detailed placement.
- $CON_g$: congestion-optimized global placement.
- $CON_d$: congestion-optimized detailed placement.

Conclusion: congestion at the detailed placement level is correlated with congestion at the global placement level. Thus reducing congestion in global placement helps reduce congestion in the final detailed placement.

Congestion:

- Wirelength minimization can minimize congestion globally.
- A post-processing congestion minimization following wirelength minimization works the best to reduce congestion in placement.
- A number of congestion-related cost functions were tested, including a hybrid length plus congestion (commonly believed to be very effective). Experiments show that they do not work very well.
- Net-centric post-processing techniques are very effective at minimizing congestion.
- Congestion at the global placement level correlates well with congestion of detailed placement.

Shapes of Cost Functions [Figure: net-cut cost, wirelength, and congestion plotted over the solution space]

Relationships Between the Three Cost Functions:

- The net-cut objective function is smoother than the wirelength objective function.
- The wirelength objective function is smoother than the congestion objective function.
- Local minima of these three objectives are in the same neighborhood.

Crossing: A routability estimator?

- Replace each crossing with a "gate"
- A planar netlist
- Easy to place

Timing Cost:

- Delay of the circuit is defined as the longest delay among all possible paths from primary inputs to primary outputs.
- Interconnection delay becomes more and more important in the deep sub-micron regime.

[Figure: circuit with the critical path highlighted]
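The longest-path definition above can be computed in one pass over the circuit in topological order. The sketch below assumes a combinational (acyclic) circuit with a fixed delay per gate and per wire; the function name `circuit_delay` and the data layout are hypothetical, and real timing analysis uses much richer delay models.

```python
def circuit_delay(gate_delay, wires):
    """Longest input-to-output delay in an acyclic circuit.

    gate_delay: dict gate -> intrinsic delay
    wires:      dict (driver, sink) -> interconnect delay
    """
    succ = {g: [] for g in gate_delay}
    indeg = {g: 0 for g in gate_delay}
    for u, v in wires:
        succ[u].append(v)
        indeg[v] += 1
    # Arrival time at the output of each gate; gates with no fan-in
    # (primary inputs) start at their own delay.
    arrival = dict(gate_delay)
    ready = [g for g in gate_delay if indeg[g] == 0]
    while ready:
        u = ready.pop()
        for v in succ[u]:
            candidate = arrival[u] + wires[(u, v)] + gate_delay[v]
            arrival[v] = max(arrival[v], candidate)
            indeg[v] -= 1
            if indeg[v] == 0:      # all fan-in gates have been processed
                ready.append(v)
    return max(arrival.values())


if __name__ == "__main__":
    gates = {"a": 1, "b": 2, "c": 3, "d": 1}
    wires = {("a", "b"): 1, ("a", "c"): 0, ("b", "d"): 1, ("c", "d"): 2}
    print(circuit_delay(gates, wires))   # critical path a -> c -> d
```

A placement move changes only the wire delays of the nets attached to the moved cell, but the longest path can shift anywhere downstream, which is why evaluating timing inside a placer is expensive.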

Timing Analysis: How do we get the delay numbers on the gate/interconnect? [Figure: latch-to-latch timing graph annotated with gate and interconnect delays]

Approaches:

- Budgeting: inaccurate information; fast.
- Path analysis: most accurate information; very slow.
- Path analysis with infrequent path substitution: somewhere in between.

Timing Metrics:

- How do we assess the change in a delay due to a potential move during physical design?
- Whether it is channel routing or area routing, the problem is the same: translate geometrical change into delay change.

Other Costs: Coupling Cost

- Hard to model during placement
- Can run a global router in the middle of placement
- Even at the global routing level it is hard to model
- Avoid it

Coupling Solutions:

- Spacing: extra space
- Segregation: noisy region, quiet region
- Shielding: grounded shields

Once we have some metrics for coupling, we can calculate sensitivities and optimize the physical design.

Other Performance Costs:

- Power usage of the chip
- Weighted nets
- Dual voltages (severe constraint on placement)
- Very little is known about these cost functions and their interaction with other cost functions.
- Fundamental research is needed to shed some light on their structure.

Netlist Granularity: Problem Size and Solution Space Size

- The most challenging part of the placement problem is to solve a huge system within a given amount of time.
- We need to effectively reduce the size of the solution space and/or reduce the problem size.
- Netlist clustering: edge extraction in the netlist.

[Figure: placement cube with axes Algorithm, Cost Function, Netlist Granularity, Layout Coarseness]

Layout Coarsening:

- Reduce solution space
- Edge extraction in the solution space
- Only simple things have been tried: GP, DP (Twolf); 2x1, 2x2, ...
- Coarsen only "easy" parts

[Figure: placement cube with axes Algorithm, Cost Function, Netlist Granularity, Layout Coarseness]

Incremental Placement:

- Given an optimal placement for a given netlist, how do we construct optimal placements for netlists modified from the given netlist?
- Very little research in this area.
- Different types of incremental changes (in one region, or all over).
- Methods to use.
- How global should the method be?
- An extremely important problem.

ECE260B – CSE241A Placement.87http://vlsicad.ucsd.edu Incremental Placement
• A placement move changes the interconnect capacitance and resistance of the associated net
• A net topology approximation is required to estimate these changes
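The slide does not name a particular net topology approximation. As a rough sketch, assume the wire length of a net is approximated by its half-perimeter bounding box, wire resistance and capacitance scale linearly with that length, and delay is estimated with a single lumped-RC (Elmore-style) expression. All parameter names and values below are hypothetical and in arbitrary units:

```python
from typing import List, Tuple

Point = Tuple[float, float]

def hpwl(pins: List[Point]) -> float:
    """Half-perimeter wirelength, used here as the net topology approximation."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def net_rc(pins: List[Point], r_per_unit: float, c_per_unit: float) -> Tuple[float, float]:
    """Wire resistance and capacitance, assumed proportional to HPWL."""
    length = hpwl(pins)
    return r_per_unit * length, c_per_unit * length

def delay_estimate(pins: List[Point], r_drv: float, c_load: float,
                   r_per_unit: float, c_per_unit: float) -> float:
    """Lumped estimate: driver sees all capacitance; wire sees half its own plus the load."""
    r_w, c_w = net_rc(pins, r_per_unit, c_per_unit)
    return r_drv * (c_w + c_load) + r_w * (c_w / 2.0 + c_load)

def move_delta(pins: List[Point], index: int, new_pos: Point, **params: float) -> float:
    """Change in estimated delay if the pin at `index` moves to `new_pos`."""
    moved = list(pins)
    moved[index] = new_pos
    return delay_estimate(moved, **params) - delay_estimate(pins, **params)

# Hypothetical two-pin net; moving the sink closer to the driver.
params = dict(r_drv=1.0, c_load=0.5, r_per_unit=0.1, c_per_unit=0.2)
pins = [(0.0, 0.0), (3.0, 4.0)]
delta = move_delta(pins, 1, (1.0, 1.0), **params)
```

A negative `delta` means the move shortens the estimated delay. Because the wire term is quadratic in length, the same move matters more on long nets than on short ones, which is why a placer needs some topology estimate rather than a fixed per-net cost.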

ECE260B – CSE241A Placement.88http://vlsicad.ucsd.edu “Placynthesis” Algorithms: resizing, buffering, cloning, restructuring

ECE260B – CSE241A Placement.89http://vlsicad.ucsd.edu Many other Design Metrics: Power Supply and Total Power Source: The Incredible Shrinking Transistor, Yuan Taur, T. J. Watson Research Center, IBM, IEEE Spectrum, July 1999

ECE260B – CSE241A Placement.90http://vlsicad.ucsd.edu Dual Voltages: A harder problem
• Layout synthesis with dual voltages: major geometric constraints
[Figure: Layout Structure, showing high-voltage (H) and low-voltage (L) blocks, feedthroughs, and VH, VL, and GND rails]
[Figure: Cell Library with Dual Power Rails, showing VH, VL, GND, IN, and OUT]
