CSE241 VLSI Digital Circuits, Winter 2003. Lecture 08: Placement. Kahng & Cichy, UCSD ©2003


1 CSE241 VLSI Digital Circuits, Winter 2003. Lecture 08: Placement (Kahng & Cichy, UCSD ©2003)

2 Introduction
• Dr. Gabriel Robins
• E-mail: robins@cs.virginia.edu
• Web: www.cs.virginia.edu/robins

3 VLSI Design Flow and Physical Design Stage
Definitions:
• Cell: a circuit component to be placed on the chip area. In placement, the functionality of the component is ignored.
• Net: a subset of terminals that connects several cells.
• Netlist: a set of nets that contains the connectivity information of the circuit.
(figure labels: Global Placement; Detail Placement; Clock Tree Synthesis and Routing; Global Routing; Detail Routing; Power/Ground Stripes, Rings Routing; Extraction and Delay Calc.; Timing Verification; IO Pad Placement)

4 Placement Problem
• Input: a set of cells and their complete information (a cell library), plus connectivity information between cells (netlist information).
• Output: a set of locations on the chip, one location for each cell.
• Goal: the cells are placed to produce a routable chip that meets timing and other constraints (e.g., low power, noise, etc.).
• Challenge: the number of cells in a design is very large (> 1 million), and the timing constraints are very tight.

5 Optimal Relative Order: (figure: cells A, B, C)

6 To spread... (figure: cells A, B, C)

7 ...or not to spread (figure: cells A, B, C)

8 Place to the left (figure: cells A, B, C)

9 ...or to the right (figure: cells A, B, C)

10 Optimal Relative Order: without “free” space, the placement problem is dominated by order. (figure: cells A, B, C)

11 Placement Problem (figures: a bad placement; a good placement)

12 Global and Detailed Placement
• Global placement: we decide the approximate locations of cells by placing them in global bins.
• Detailed placement: we make local adjustments to obtain the final non-overlapping placement.

13 Placement Footprints (figures): Standard Cell; Data Path; IP - Floorplanning

14 Placement Footprints (figure): Mixed Data Path & sea of gates. Labels: Core, Control, IO, Reserved areas

15 Placement Footprints (figures): Perimeter IO; Area IO

16 Placement objectives are subject to user constraints / design style:
• Hierarchical Design Constraints
  - pin location
  - power rail
  - reserved layers
• Flat Design with Floorplan Constraints
• Fixed Circuits
• I/O Connections

17 Standard Cells (figure)

18 Standard Cells
• Power connected by abutment, placed in sea-of-rows
• Rarely rotated
• DRC clean in any combination
• Circuit clean (i.e., no naked T-gates, no huge input capacitances)
• 8, 9, 10+ tracks in height
• Metal 1 only used (hopefully)
• Multi-height stdcells possible
• Buffers: sizes, intrinsic delay steps, optimal repeater selection
• Special clock buffers + gates (balanced P:N)
• Special metastability-hardened flops
• Cap cells (metal1 used?)
• Gap fillers (metal1 used?)
• Tie-high, tie-low

19 Unconstrained Placement (figure)

20 Floorplanned Placement (figure)

21 Placement Cube (4D)
• Cost function(s) to be used
  - Cut, wirelength, congestion, crossing, ...
• Algorithm(s) to be used
  - FM, quadratic, annealing, ...
• Granularity of the netlist
• Coarseness of the layout domain
  - 2x2, 4x4, ...
• An effective methodology picks the right mix from the above and knows when to switch from one to the next.
• Most methods today are ad hoc.
(figure axes: Algorithm, Cost Function, Netlist Granularity, Layout Coarseness)

22 Advantages of Hierarchy
• The design is carved into smaller pieces that can be worked on in parallel (improved throughput).
• A known floor plan provides the logic design team with a large degree of placement control.
• A known floor plan provides early knowledge of long wires.
• Timing closure problems can be addressed by tools, logic design, and hierarchy manipulation.
• Late design changes can be made with minimal turmoil to the entire design.

23 Disadvantages of Hierarchy
• Results depend on the quality of the hierarchy. The logic hierarchy must be designed with PD (physical design) taken into account.
• Additional methodology requirements must be met to enable hierarchy, e.g., pin assignment, macro abstract management, area budgeting, floor planning, timing budgets, etc.
• Late design changes may affect multiple components.
• Hierarchy allows divergent methodologies.
• Hierarchy hinders DA algorithms: they can no longer perform global optimizations.

24 Traditional Placement Algorithms
• Quadratic Placement
• Simulated Annealing
• Bi-Partitioning / Quadrisection
• Force Directed Placement
• Hybrid
(figure axes: Algorithm, Cost Function, Netlist Granularity, Layout Coarseness)

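25 Quadratic Placement
• Analytical technique
(figure: points x1, x2, x3, x4 connected by nets x1-x3, x1-x2, x2-x4)
Minimize
$$F = (x_1 - x_3)^2 + (x_1 - x_2)^2 + (x_2 - x_4)^2$$
Setting $\partial F / \partial x_1 = 0$ and $\partial F / \partial x_2 = 0$ gives the linear system $Ax = B$ with
$$A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \quad B = \begin{pmatrix} x_3 \\ x_4 \end{pmatrix}$$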
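
The 2x2 system on this slide can be checked numerically. The sketch below is only an illustration: it assumes x3 and x4 are fixed positions (for example pads) and x1, x2 are the movable cells, and the helper names `solve_2x2`, `quadratic_place`, and `cost` are made up for this example.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a * x = b with Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    x1 = (b[0] * a22 - a12 * b[1]) / det
    x2 = (a11 * b[1] - a21 * b[0]) / det
    return x1, x2


def quadratic_place(x3, x4):
    """Return (x1, x2) minimizing (x1-x3)^2 + (x1-x2)^2 + (x2-x4)^2."""
    a = [[2.0, -1.0], [-1.0, 2.0]]  # from dF/dx1 = 0 and dF/dx2 = 0
    b = [x3, x4]                    # assumed fixed positions
    return solve_2x2(a, b)


def cost(x1, x2, x3, x4):
    """Quadratic objective F from the slide."""
    return (x1 - x3) ** 2 + (x1 - x2) ** 2 + (x2 - x4) ** 2


# Hypothetical fixed positions at 0 and 3.
x1, x2 = quadratic_place(0.0, 3.0)
print(x1, x2, cost(x1, x2, 0.0, 3.0))
```

With fixed positions at 0 and 3 the movable cells end up evenly spaced between them, which is the behavior one would expect from a chain of equal springs.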

26 Analytical Placement
• Get a solution with lots of overlap
• What do we do with the overlap?

27 Pros and Cons of QP
• Pros:
  - Very fast analytical solution
  - Can handle large design sizes
  - Can be used as an initial seed placement engine
• Cons:
  - Can generate overlapped solutions: postprocessing needed
  - Not suitable for timing-driven placement
  - Not suitable for simultaneous optimization of other aspects of physical design (clocks, crosstalk, ...)
  - Gives trivial solutions without pads (and close to trivial with pads)

28 Simulated Annealing Placement
• Initial placement improved through swaps and moves
• Accept a swap/move if it improves cost
• Accept a swap/move that degrades cost under some probability conditions
(figure: cost versus time)
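
The accept/reject rule above can be sketched on a toy problem. Everything specific here is an assumption made for illustration, not taken from the slides: cells sit in a single row of slots, the cost is the total linear wirelength of two-pin nets, degrading swaps are accepted with the common Metropolis probability exp(-delta / T), and the temperature follows a geometric cooling schedule.

```python
import math
import random


def wirelength(order, nets):
    """Total linear wirelength of two-pin nets for cells placed in a row."""
    pos = {cell: i for i, cell in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)


def anneal(cells, nets, t_start=5.0, t_end=0.01, cooling=0.95,
           moves_per_t=50, seed=0):
    """Swap-based simulated annealing; returns the best order seen and its cost."""
    rng = random.Random(seed)
    order = list(cells)
    cost = wirelength(order, nets)
    best_order, best_cost = list(order), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]      # tentative swap
            delta = wirelength(order, nets) - cost
            # Always accept improvements; accept degradations with
            # probability exp(-delta / t) (Metropolis rule, an assumption).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost += delta
                if cost < best_cost:
                    best_order, best_cost = list(order), cost
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
        t *= cooling                                     # geometric cooling
    return best_order, best_cost


# Hypothetical example: a chain of nets a-b-c-d-e, scrambled initial order.
cells = ["c", "a", "e", "b", "d"]
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
start_cost = wirelength(cells, nets)
best_order, best_cost = anneal(cells, nets)
print(start_cost, best_cost, best_order)
```

Recomputing the full wirelength after every swap keeps the sketch short; a practical placer would update only the nets attached to the two swapped cells.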

29 Pros and Cons of SA
• Pros:
  - Can reach a globally optimal solution (given “enough” time)
  - Open cost function
  - Can optimize all aspects of physical design simultaneously
  - Can be used for end-case placement
• Cons:
  - Extremely slow process of reaching a good solution

30 Bi-Partitioning/Quadrisection (figure)

31 Pros and Cons of Partitioning Based Placement
• Pros:
  - More suitable for timing-driven placement since it is move based
  - New innovations (hMetis) in partitioning algorithms have made this extremely fast
  - Open cost function
  - Move based means simultaneous optimization of all design aspects is possible
• Cons:
  - Not well understood
  - Lots of “indifferent” moves
  - May not work well with some cost functions

32 Force Directed Placement
• Cells are dragged by forces.
• Forces are generated by nets connecting cells. Longer nets generate bigger forces.
• Placement is obtained by either a constructive or an iterative method.
(figure: force F_ij between cells i and j)
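
One possible reading of the iterative method is sketched below. It assumes a spring model in which the force between two connected cells is proportional to their distance, so the zero-force position of a cell is the weighted centroid of its neighbors; the function name, the two-pin net format `(a, b, weight)`, and the fixed-pad set are all assumptions for this example.

```python
def force_directed(positions, fixed, nets, iterations=100):
    """Repeatedly move each movable cell to its zero-force position.

    positions: dict name -> (x, y) initial coordinates
    fixed:     set of names that must not move (e.g. pads)
    nets:      list of (a, b, weight) two-pin connections
    """
    pos = dict(positions)
    neighbors = {name: [] for name in pos}
    for a, b, w in nets:
        neighbors[a].append((b, w))
        neighbors[b].append((a, w))
    for _ in range(iterations):
        for name in pos:
            if name in fixed or not neighbors[name]:
                continue
            total = sum(w for _, w in neighbors[name])
            # Weighted centroid of neighbors: net spring force is zero here.
            x = sum(w * pos[n][0] for n, w in neighbors[name]) / total
            y = sum(w * pos[n][1] for n, w in neighbors[name]) / total
            pos[name] = (x, y)
    return pos


# Hypothetical example: two pads and two movable cells in a chain.
start = {"p0": (0.0, 0.0), "p1": (3.0, 0.0), "a": (5.0, 5.0), "b": (-2.0, 7.0)}
chain = [("p0", "a", 1.0), ("a", "b", 1.0), ("b", "p1", 1.0)]
placed = force_directed(start, fixed={"p0", "p1"}, nets=chain)
print(placed["a"], placed["b"])
```

Note that nothing in this sketch prevents cells from overlapping, and without fixed pads every cell would collapse to a single point.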

33 Pros and Cons of Force Directed Placement
• Pros:
  - Very fast analytical solution
  - Can handle large design sizes
  - Can be used as an initial seed placement engine
  - The Force
• Cons:
  - Not sensitive to the non-overlapping constraints
  - Gives trivial solutions without pads
  - Not suitable for timing-driven placement

34 Hybrid Placement
• Mixing and matching different placement algorithms
• Effective algorithms are always hybrid

35 GORDIAN (quadratic + partitioning) (figure: Initial Placement; Partition and Replace)

36 Congestion Minimization
• The traditional placement problem is to minimize interconnection length (wirelength).
• A valid placement has to be routable.
• Congestion is important because it represents routability (lower congestion implies better routability).
• There is not yet enough research work on the congestion minimization problem.

37 Definition of Congestion
(figure: an edge with routing demand = 3; assuming routing supply is 1, overflow = 3 - 1 = 2 on this edge)
Overflow on each edge:
$$\text{overflow}(e) = \begin{cases} \text{Routing Demand}(e) - \text{Routing Supply}(e) & \text{if Routing Demand}(e) > \text{Routing Supply}(e) \\ 0 & \text{otherwise} \end{cases}$$
Total overflow:
$$\text{Overflow} = \sum_{e \,\in\, \text{all edges}} \text{overflow}(e)$$
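
The overflow definition translates directly into code. This is a minimal sketch; the dictionary layout (edge name mapped to demand or supply) and the edge names are assumptions for the example, and the first edge reproduces the demand 3, supply 1 case from the slide.

```python
def edge_overflow(demand, supply):
    """Overflow on one edge: demand - supply if demand exceeds supply, else 0."""
    return demand - supply if demand > supply else 0


def total_overflow(demands, supplies):
    """Sum of per-edge overflow over all edges."""
    return sum(edge_overflow(demands[e], supplies[e]) for e in demands)


# Hypothetical bin-boundary edges; "e1" is the slide's example.
demands = {"e1": 3, "e2": 1, "e3": 0}
supplies = {"e1": 1, "e2": 1, "e3": 1}
print(edge_overflow(3, 1), total_overflow(demands, supplies))
```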

38 Correlation between Wirelength and Congestion
Total Wirelength = Total Routing Demand

39 Wirelength ≠ Congestion (figures: a congestion minimized placement; a wirelength minimized placement)

40 Congestion Map of a Wirelength Minimized Placement (figure: congested spots)

41 Congestion Map (figure)

42 Congestion Reduction Postprocessing
• Reduce congestion globally by minimizing the traditional wirelength.
• Post-process the wirelength-optimized placement using the congestion objective.

43 Congestion Reduction Postprocessing
• Among a variety of cost functions and methods for congestion minimization, wirelength alone followed by a post-processing congestion minimization works the best and is one of the fastest.
• Cost functions such as a hybrid length plus congestion do not work very well.

44 Cost Functions for Placement
• The final goal of placement is to achieve routability and meet timing constraints.
• Constraints are very hard to use in optimization, so we use cost functions (e.g., wirelength) to predict our goals.
• We will show what happens when you try constraints directly.
• The main challenge is a technical understanding of various cost functions and their interaction.

45 Prediction
• What is prediction?
  - Every system has some critical cost functions: area, wirelength, congestion, timing, etc.
  - Prediction aims at estimating values of these cost functions without having to go through the time-consuming process of full construction.
• Allows quick space exploration, localizes the search
• For example:
  - Statistical wire-load models
  - Wirelength in placement

46 Paradigms of Prediction
• Two fundamental paradigms
  - Statistical prediction
    - # of two-terminal nets in all designs
    - # of two-terminal nets with length greater than 10 in all designs
  - Constructive prediction
    - # of two-terminal nets with length greater than 10 in this design
  - ... and everything in between, e.g.,
    - # of critical two-terminal nets in a design, based on statistical data and a quick inspection of the design in hand
• “Absolute truth” or “I need it to make progress”
• SLIP (System Level Interconnect Prediction) community

47 Cost Functions for Placement
• Net-cut
• Linear wirelength
• Quadratic wirelength
• Congestion
• Timing
• Coupling
• Other performance related cost functions
• Undiscovered: crossing
(figure axes: Algorithm, Cost Function, Netlist Granularity, Layout Coarseness)

48 Net-cut Cost for Global Placement
• The net-cut cost is defined as the number of external nets between different global bins.
• Minimizing net-cut in global placement tends to put highly connected cells close to each other.
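
As a small illustration of the net-cut definition, the sketch below counts the nets whose cells do not all fall in the same global bin. The representation (each net as a tuple of cell names, and a `bin_of` mapping from cell to bin index) is an assumption for this example.

```python
def net_cut(nets, bin_of):
    """Number of nets that span more than one global bin."""
    return sum(1 for net in nets if len({bin_of[cell] for cell in net}) > 1)


# Hypothetical assignment of four cells to two global bins.
bin_of = {"a": 0, "b": 0, "c": 1, "d": 1}
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c", "d")]
print(net_cut(nets, bin_of))  # nets (b, c) and (a, c, d) are cut
```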

49 Linear Wirelength Cost
The linear length of a net between cell 1 at (x1, y1) and cell 2 at (x2, y2) is
$$l_{12} = |x_1 - x_2| + |y_1 - y_2|$$
The linear wirelength cost is the summation of the linear length of all nets.

50 Quadratic Wirelength Cost
The quadratic length of a net between cell 1 at (x1, y1) and cell 2 at (x2, y2) is
$$l_{12} = (x_1 - x_2)^2 + (y_1 - y_2)^2$$
The quadratic wirelength cost is the summation of the quadratic length of all nets.
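
The two wirelength costs differ only in the per-net length function, which the following sketch makes explicit. It handles two-pin nets only, as in the definitions above; the function names and the sample coordinates are assumptions for this example.

```python
def linear_length(p, q):
    """Linear (Manhattan) length between points p and q."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def quadratic_length(p, q):
    """Quadratic (squared Euclidean) length between points p and q."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def wirelength_cost(nets, pos, length):
    """Sum the chosen length function over all two-pin nets."""
    return sum(length(pos[a], pos[b]) for a, b in nets)


# Hypothetical placement of three cells and two nets.
pos = {1: (0, 0), 2: (3, 4), 3: (3, 0)}
nets = [(1, 2), (2, 3)]
linear_cost = wirelength_cost(nets, pos, linear_length)
quadratic_cost = wirelength_cost(nets, pos, quadratic_length)
print(linear_cost, quadratic_cost)
```

The quadratic cost penalizes long nets much more heavily than short ones, which is why the two objectives can rank the same pair of placements differently.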

51 Congestion Cost
(figure: an edge with routing demand = 3; assuming routing supply is 1, overflow = 3 - 1 = 2 on this edge)
Overflow on each edge:
$$\text{overflow}(e) = \begin{cases} \text{Routing Demand}(e) - \text{Routing Supply}(e) & \text{if Routing Demand}(e) > \text{Routing Supply}(e) \\ 0 & \text{otherwise} \end{cases}$$
Congestion overflow:
$$\text{Overflow} = \sum_{e \,\in\, \text{all edges}} \text{overflow}(e)$$

52 Cost Functions for Placement
• Various cost functions (and mixes of them) have been used in practice to model/estimate routability and timing.
• We have a good “feel” for what each cost function is capable of doing.
• We need to understand the interaction among cost functions.

53 Congestion Minimization and Congestion vs Wirelength
• Congestion is important because it closely represents routability (especially at lower levels of granularity).
• Congestion is not well understood.
• Ad hoc techniques have more or less worked because congestion has never been severe.
• It has been observed that length minimization tends to reduce congestion.
• Goal: reduce congestion in placement (willing to sacrifice a little wirelength).

54 Correlation between Wirelength and Congestion
Total Wirelength = Total Routing Demand

55 Wirelength ≠ Congestion (figures: a congestion minimized placement; a wirelength minimized placement)

56 Congestion Map of a Wirelength Minimized Placement (figure: congested spots)

57 Different Routing Models for Modeling Congestion
• Bounding box router: fast but inaccurate.
• Real router: accurate but slow.
• A bounding box router can be used in placement if its routing results correlate with those of the real router.
• Note: for different cost functions, the answer might be different (e.g., for coupling, only a detailed router can answer).

58 Different Routing Models (figures: a bounding box routing model; an MST + shortest_path routing model)

59 Objective Functions Used in Congestion Minimization
• WL: standard total wirelength objective.
• Ovrflw: total overflow in a placement (a direct congestion cost).
• Hybrid: $(1-\alpha)\,\mathrm{WL} + \alpha\,\mathrm{Ovrflw}$.
• QL: a quadratic plus linear objective.
• LQ: a linear plus quadratic objective.
• LkAhd: a modified overflow cost.
• $(1-\alpha_T)\,\mathrm{WL} + \alpha_T\,\mathrm{Ovrflw}$: a time-changing hybrid objective that lets the cost function gradually change from wirelength to overflow as optimization proceeds.
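
The hybrid and time-changing objectives can be sketched as a weighted sum. The weight is called `alpha` here as an assumption about the slide's symbol, and the linear ramp in `ramp_alpha` is purely hypothetical: the slide only says that the cost function changes gradually from wirelength to overflow.

```python
def hybrid_cost(wl, ovrflw, alpha):
    """Weighted mix of wirelength and overflow: (1 - alpha) * WL + alpha * Ovrflw."""
    return (1 - alpha) * wl + alpha * ovrflw


def ramp_alpha(step, total_steps):
    """Hypothetical schedule: weight grows linearly from 0 to 1 over the run."""
    return step / total_steps


# Hypothetical values: wirelength 100, overflow 10, ten optimization steps.
for step in (0, 5, 10):
    alpha = ramp_alpha(step, 10)
    print(step, alpha, hybrid_cost(100, 10, alpha))
```

At the start the objective is pure wirelength and at the end it is pure overflow; any monotone schedule between 0 and 1 would fit the description equally well.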

60 Post Processing to Reduce Congestion
• Reduce congestion globally by minimizing the traditional wirelength.
• Post-process the wirelength-optimized placement using the congestion objective.

61 Post Processing Heuristics
• Greedy cell-centric algorithm: greedily move cells around and greedily accept moves.
• Flow-based cell-centric algorithm: use a flow-based approach to move cells.
• Net-centric algorithm: move nets with bigger contributions to the congestion first.

62 Greedy Cell-centric Heuristic (figure)

63 Flow-based Cell-centric Heuristic (figure: cell nodes, bin nodes)

64 Net-centric Heuristic (figure)

65 From Global Placement to Detailed Placement
• Global placement: assume all the cells are placed at the centers of global bins.
• Detailed placement: cells are placed without overlapping.

66 Correlation Between Global and Detailed Placement
• WL_g: wirelength-optimized global placement.
• CON_g: congestion-optimized global placement.
• WL_d: wirelength-optimized detailed placement.
• CON_d: congestion-optimized detailed placement.
Conclusion: congestion at the detailed placement level is correlated with congestion at the global placement level. Thus reducing congestion in global placement helps reduce congestion in the final detailed placement.

67 Congestion
• Wirelength minimization can minimize congestion globally. A post-processing congestion minimization following wirelength minimization works the best to reduce congestion in placement.
• A number of congestion-related cost functions were tested, including a hybrid length plus congestion (commonly believed to be very effective). Experiments show that they do not work very well.
• Net-centric post-processing techniques are very effective at minimizing congestion.
• Congestion at the global placement level correlates well with congestion of detailed placement.

68 Shapes of Cost Functions (figure: net-cut cost, wirelength, and congestion plotted over the solution space)

69 Relationships Between the Three Cost Functions:
• The net-cut objective function is smoother than the wirelength objective function.
• The wirelength objective function is smoother than the congestion objective function.
• Local minima of these three objectives are in the same neighborhood.

70 Crossing: A routability estimator?
• Replace each crossing with a “gate”
• A planar netlist
• Easy to place

71 Timing Cost
• Delay of the circuit is defined as the longest delay among all possible paths from primary inputs to primary outputs.
• Interconnection delay becomes more and more important in the deep sub-micron regime.
(figure: critical path)
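
The longest-path definition of circuit delay can be computed in one pass over a topologically ordered netlist. The sketch below assumes a combinational (acyclic) circuit, non-negative delays, a fixed delay per gate, and a fixed delay per connection; the data layout and the example numbers are hypothetical.

```python
from collections import defaultdict, deque


def circuit_delay(gate_delay, edges):
    """Longest input-to-output delay of an acyclic circuit.

    gate_delay: dict node -> delay of that node (0 for inputs/outputs)
    edges:      list of (src, dst, wire_delay) connections
    """
    succ = defaultdict(list)
    indeg = {node: 0 for node in gate_delay}
    for u, v, w in edges:
        succ[u].append((v, w))
        indeg[v] += 1
    # Arrival time at each node's output; sources start at their own delay.
    arrival = dict(gate_delay)
    queue = deque(node for node, d in indeg.items() if d == 0)
    visited = 0
    while queue:  # Kahn's topological traversal
        u = queue.popleft()
        visited += 1
        for v, w in succ[u]:
            arrival[v] = max(arrival[v], arrival[u] + w + gate_delay[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if visited != len(gate_delay):
        raise ValueError("netlist contains a cycle")
    return max(arrival.values())


# Hypothetical circuit: two inputs, two gates, one output.
gates = {"in1": 0, "in2": 0, "g1": 2, "g2": 3, "out": 0}
wires = [("in1", "g1", 1), ("in2", "g1", 2), ("g1", "g2", 1),
         ("in2", "g2", 1), ("g2", "out", 2)]
delay = circuit_delay(gates, wires)
print(delay)
```

In this example the critical path is in2, g1, g2, out; moving a cell during placement would change the wire delays and possibly the critical path.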

72 Timing Analysis (figure: circuit between latches annotated with delay numbers)
How do we get the delay numbers on the gate/interconnect?

73 Approaches
• Budgeting
  - Inaccurate information
  - Fast
• Path analysis
  - Most accurate information
  - Very slow
• Path analysis with infrequent path substitution
  - Somewhere in between

74 Timing Metrics
• How do we assess the change in a delay due to a potential move during physical design?
• Whether it is channel routing or area routing, the problem is the same.
• Translate geometrical change into delay change.

75 Other Costs: Coupling Cost
• Hard to model during placement
• Can run a global router in the middle of placement
• Even at the global routing level it is hard to model
• Avoid it

76 Coupling Solutions
• Spacing: extra space
• Segregation: noisy region, quiet region
• Shielding: grounded shields
• Once we have some metrics for coupling, we can calculate sensitivities and optimize the physical design...

77 Other Performance Costs
• Power usage of the chip
• Weighted nets
• Dual voltages (severe constraint on placement)
• Very little is known about these cost functions and their interaction with other cost functions.
• Fundamental research is needed to shed some light on their structure.

78 Netlist Granularity: Problem Size and Solution Space Size
• The most challenging part of the placement problem is to solve a huge system within a given amount of time.
• We need to effectively reduce the size of the solution space and/or reduce the problem size.
• Netlist clustering: edge extraction in the netlist
(figure axes: Algorithm, Cost Function, Netlist Granularity, Layout Coarseness)

79 Layout Coarsening
• Reduce solution space
• Edge extraction in the solution space
• Only simple things have been tried
  - GP, DP (Twolf)
  - 2x1, 2x2, ...
• Coarsen only “easy” parts
(figure axes: Algorithm, Cost Function, Netlist Granularity, Layout Coarseness)

80 Incremental Placement
• Given an optimal placement for a given netlist, how do we construct optimal placements for netlists modified from the given netlist?
• Very little research in this area.
• Different types of incremental changes (in one region, or all over)
• Methods to use
• How global should the method be?
• An extremely important problem.

81 Incremental Placement
• A placement move changes the interconnect capacitance and resistance of the associated net.
• A net topology approximation is required to estimate these changes.

82 “Placynthesis” Algorithms: resizing, buffering, cloning, restructuring

83 Many other Design Metrics: Power Supply and Total Power (figure)
Source: The Incredible Shrinking Transistor, Yuan Taur, T. J. Watson Research Center, IBM, IEEE Spectrum, July 1999

84 Dual Voltages: A harder problem
• Layout synthesis with dual voltages: major geometric constraints
(figures: Layout Structure with high-voltage blocks (H), low-voltage blocks (L), feedthrough, and rails VH, VL, GND; Cell Library with Dual Power Rails showing VH, VL, GND, IN, OUT)

85 Placement References
• C. J. Alpert, T. Chan, D. J.-H. Huang, I. Markov, and K. Yan, “Quadratic Placement Revisited”, Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 752-757.
• C. J. Alpert, J.-H. Huang, and A. B. Kahng, “Multilevel Circuit Partitioning”, Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 530-533.
• U. Brenner and A. Rohe, “An Effective Congestion Driven Placement Framework”, International Symposium on Physical Design, 2002, pp. 6-11.
• A. E. Caldwell, A. B. Kahng, and I. L. Markov, “Can Recursive Bisection Alone Produce Routable Placements?”, Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 477-482.
• M. A. Breuer, “Min-Cut Placement”, J. Design Automation and Fault Tolerant Computing, 1(4), 1977, pp. 343-362.
• J. Vygen, “Algorithms for Large-Scale Flat Placement”, Proc. 34th IEEE/ACM Design Automation Conference, 1997, pp. 746-751.
• H. Eisenmann and F. M. Johannes, “Generic Global Placement and Floorplanning”, Proc. 35th IEEE/ACM Design Automation Conference, 1998, pp. 269-274.
• S.-L. Ou and M. Pedram, “Timing Driven Placement Based on Partitioning with Dynamic Cut-Net Control”, Proc. 37th IEEE/ACM Design Automation Conference, 2000, pp. 472-476.
• C. M. Fiduccia and R. M. Mattheyses, “A Linear Time Heuristic for Improving Network Partitions”, Proc. ACM/IEEE Design Automation Conference, 1982, pp. 175-181.

