Analytical Minimization of Signal Delay in VLSI Placement Andrew B. Kahng and Igor L. Markov UCSD, Univ. of Michigan


1 Analytical Minimization of Signal Delay in VLSI Placement Andrew B. Kahng and Igor L. Markov UCSD, Univ. of Michigan http://www.eecs.umich.edu/~imarkov IBM technical contact: Paul Villarrubia

2 Outline Background: Global Placement for VLSI –wirelength minimization –delay minimization Contribution –minimization objective –“generic” minimization algorithm: outer loop and inner loop –empirical results Futures

3 VLSI Global Placement Find locations for standard cells Standard cells placed in rows, without overlap Minimize wirelength, “routing congestion” Minimize clock cycle Key abstractions: –standard cells → rectangular outlines –netlist → weighted hypergraph (signal nets → hyperedges) –signal delay → function of cell locations (interconnect dominates)

4 A VLSI Global Placement Example (figure panels: bad placement; good placement)

5 Netlist Hypergraph and Timing Graph Two signal nets: 3 pins (l.blue), and 4 pins (l.green) Ovals: hyperedges Red edges: timing graph edges

6 Top-Down Global Placement Placement blocks represent cells and layout area –single block at the start, driven by recursive (min-cut) bipartitioning –each pass: number of blocks doubles, size of blocks halves –end case: several cells in a tiny region etc. Intuition: many cells can operate in parallel. Partitioning finds “independent” groups of cells

7 Analytical Global Placement Find a continuous placement (locations == reals) Efficient optimizations when nonconvex constraints are relaxed (e.g., cells are allowed to overlap) Represent multi-pin hyperedges by sets of edges –minimize total weighted “wirelength” of all edges Popular objectives: Linear (Manhattan) $WL = w_{12}(|x_1 - x_2| + |y_1 - y_2|)$; Quadratic (“squared”) $WL = w_{12}((x_1 - x_2)^2 + (y_1 - y_2)^2)$ Constraints: fixed vertices and/or “region constraints” (figure labels: P1, P2)
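
The two objectives on this slide can be compared on a toy placement. The following is a minimal sketch, assuming hyperedges have already been broken into weighted two-pin edges; the names `linear_wl`, `quadratic_wl`, `pos` and `edges` are illustrative and do not come from the slides.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Edge = Tuple[str, str, float]  # (cell_a, cell_b, weight); hypothetical layout


def linear_wl(pos: Dict[str, Point], edges: List[Edge]) -> float:
    """Weighted Manhattan wirelength: sum of w * (|dx| + |dy|) over all edges."""
    return sum(
        w * (abs(pos[a][0] - pos[b][0]) + abs(pos[a][1] - pos[b][1]))
        for a, b, w in edges
    )


def quadratic_wl(pos: Dict[str, Point], edges: List[Edge]) -> float:
    """Weighted squared wirelength: sum of w * (dx^2 + dy^2) over all edges."""
    return sum(
        w * ((pos[a][0] - pos[b][0]) ** 2 + (pos[a][1] - pos[b][1]) ** 2)
        for a, b, w in edges
    )


# Toy data: two movable cells and one fixed pad (a "fixed vertex" constraint).
pos = {"c1": (0.0, 0.0), "c2": (3.0, 4.0), "pad": (3.0, 0.0)}
edges = [("c1", "c2", 1.0), ("c2", "pad", 2.0)]

print(linear_wl(pos, edges))     # 15.0
print(quadratic_wl(pos, edges))  # 57.0
```

The quadratic form penalizes long edges much more heavily than the linear form, which is one reason the two objectives can lead to different placements.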

8 Analytical Placement Alone is Not Enough Many cells overlap Must “spread” the placement IBM CPlace and XQ –Remove overlap (comp. geometry) –CPlace combines min-cut with analytical techniques

9 Timing-Driven Placement Cycle time is set by maximum path delay, not total path delay (!) –max(x,y,...) is not differentiable –framework: pin-based timing graph Analytical approaches allow cell overlaps –Cell overlaps are resolved later Main difficulty: cannot enumerate signal paths Signal paths implicitly defined by device types –signal path sources, sinks == I/O pins and storage elements Timing constraints also implicitly defined –“actual arrival times” (AATs) at sources –“required arrival times” (RATs) at sinks –source-sink path constraint: path delay ≤ RAT@sink - AAT@source

10 Implicit Analysis of Path Constraints Static Timing Analysis (STA) methodology –forward topological traversal in timing graph → AAT@every_pin –similar backward traversal → RAT@every_pin –slack@pin is given by RAT@pin - AAT@pin –negative slacks ⇒ violated timing constraints STA-based and STA-inspired placement methods –slacks → net weights for HPWL minimization top-down placement to maximize negative slack (Marek-Sadowska/Lin 86) –note: STA requires edge delays (e.g., from placement) –delay budgets zero-slack (Hauge, Nair and Yoffa 86) iterative min-max (Shragowitz et al. 90/92) limit-bumping (Frankle 92)
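
The STA traversals described on this slide can be sketched in a few lines. This is an illustrative toy, not the analyzer used by the authors: it assumes the timing graph is a DAG given as a dict of edge delays, that every source has an AAT and every sink has a RAT, and that every pin lies on some source-to-sink path. The function name `sta` and the data layout are hypothetical.

```python
from collections import defaultdict, deque


def sta(edges, source_aat, sink_rat):
    """edges: {(u, v): delay}. Returns (aat, rat, slack) dicts keyed by pin."""
    succ, pred, indeg = defaultdict(list), defaultdict(list), defaultdict(int)
    nodes = set()
    for (u, v), d in edges.items():
        succ[u].append((v, d))
        pred[v].append((u, d))
        indeg[v] += 1
        nodes.update((u, v))

    # Topological order (Kahn's algorithm).
    order = []
    queue = deque(sorted(n for n in nodes if indeg[n] == 0))
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    # Forward traversal: AAT at a pin is the latest arrival over fan-in edges.
    aat = dict(source_aat)
    for u in order:
        for v, d in succ[u]:
            aat[v] = max(aat.get(v, float("-inf")), aat[u] + d)

    # Backward traversal: RAT at a pin is the earliest requirement over fan-out.
    rat = dict(sink_rat)
    for v in reversed(order):
        for u, d in pred[v]:
            rat[u] = min(rat.get(u, float("inf")), rat[v] - d)

    slack = {n: rat[n] - aat[n] for n in nodes}
    return aat, rat, slack


# Toy timing graph: two sources feed a gate pin g, which drives sink z.
edges = {("a", "g"): 2, ("b", "g"): 1, ("g", "z"): 3}
aat, rat, slack = sta(edges, source_aat={"a": 0, "b": 0}, sink_rat={"z": 4})
print(slack["z"], slack["b"])  # -1 0
```

The negative slack at `z` reflects the path a, g, z, whose delay of 5 exceeds the constraint of 4; the path through `b` just meets it.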

11 Motivations For Novelty Many promising techniques available –net reweighting –delay budgeting –others Existing frameworks have weaknesses –speed/scalability –loss or ignorance of input information delay budgeting algorithms tend to ignore fixed locations, obstacles –optimization of “wrong” global objectives (e.g., average wirelength)

12 The Dimensionless Path-Timing Objective For path $\pi$ consider edge $e \in \pi$ Dimensionless Path-Timing Objective (DPO) $\phi = \max_\pi \{t_\pi / c_\pi\} = \max_\pi \{(\sum_{e \in \pi} d_e) / c_\pi\}$ Where – $c_\pi$ is path constraint – $t_\pi$ is path delay – $d_e = d_{ij}(x_i, y_i, x_j, y_j)$ is edge delay
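
To make the objective concrete, here is a small sketch that evaluates the DPO on a toy timing graph. It assumes the path constraint of a source-to-sink path is RAT@sink - AAT@source (as on the Timing-Driven Placement slide) and that every such constraint is positive; because all paths between one source and one sink then share the same constraint, only the longest path per pair matters. The function name `dpo` and the data layout are hypothetical, and the per-pair loop is for illustration only, not the fast STA-based computation the slides refer to.

```python
from functools import lru_cache


def dpo(edges, source_aat, sink_rat):
    """Max over source-sink paths of (path delay) / (path constraint)."""
    succ = {}
    for (u, v), d in edges.items():
        succ.setdefault(u, []).append((v, d))

    @lru_cache(maxsize=None)
    def longest_to(node, sink):
        """Longest delay from node to sink; -inf if the sink is unreachable."""
        if node == sink:
            return 0.0
        return max(
            (d + longest_to(v, sink) for v, d in succ.get(node, [])),
            default=float("-inf"),
        )

    ratios = []
    for s, s_aat in source_aat.items():
        for t, t_rat in sink_rat.items():
            t_path = longest_to(s, t)
            if t_path > float("-inf"):
                ratios.append(t_path / (t_rat - s_aat))
    return max(ratios)


# Toy timing graph: paths a->g->z (delay 5) and b->g->z (delay 4), constraint 4.
edges = {("a", "g"): 2, ("b", "g"): 1, ("g", "z"): 3}
value = dpo(edges, source_aat={"a": 0, "b": 0}, sink_rat={"z": 4})
print(value)  # 1.25
```

A value above 1 indicates that at least one path violates its constraint, here the path through `a`.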

13 DPO: Properties $\phi = \max_\pi \{t_\pi / c_\pi\} = \max_\pi \{(\sum_{e \in \pi} d_e) / c_\pi\}$ $\phi \le 1$ ⇔ all timing constraints are satisfied Convex when edge delay models are convex Min DPO ⇔ max slack when all $c_\pi$ are equal Max slack can be reduced to min DPO –add two new vertices: the source and the sink –connect the source to former sources –connect the sink to former sinks –use constant edge delay models
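
The equal-constraint case can be spelled out in one short derivation. The symbols are assumptions made for this note: $\phi$ denotes the DPO, $\pi$ a path, $t_\pi$ its delay, $c_\pi$ its constraint, and the slack of a path is taken to be $s_\pi = c_\pi - t_\pi$.

```latex
\begin{align*}
s_\pi &= c_\pi - t_\pi && \text{(slack of path } \pi\text{)} \\
\min_\pi s_\pi &= c - \max_\pi t_\pi && \text{(all } c_\pi = c > 0\text{)} \\
&= c\,\Bigl(1 - \max_\pi \tfrac{t_\pi}{c}\Bigr) = c\,(1 - \phi)
\end{align*}
```

With $c$ fixed and positive, the worst slack is a decreasing function of $\phi$, so minimizing the DPO and maximizing the worst slack select the same placements.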

14 Criticalities: “Multiplicative Slacks” By analogy with slack, define criticalities $\kappa_i = \max_{\pi \ni v} \{t_\pi / c_\pi\}$ for vertex $v = v_i$; $\kappa_{ij} = \max_{\pi \ni e} \{t_\pi / c_\pi\}$ for edge $e = e_{ij}$ Criticalities are multiplicative versions of slack DPO and criticalities quickly computable –STA + postprocessing Vertex criticalities → cells on critical paths –can be used by the proposed top-down timing-driven placement flow
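
Edge criticalities can be illustrated on the same kind of toy graph. The sketch below assumes the criticality of an edge is the largest delay-to-constraint ratio over all source-to-sink paths through that edge, with path constraint RAT@sink - AAT@source assumed positive. It loops over source-sink pairs for clarity, which is not the fast "STA + postprocessing" computation mentioned on the slide; `edge_criticalities` is a hypothetical name.

```python
from functools import lru_cache

NEG_INF = float("-inf")


def edge_criticalities(edges, source_aat, sink_rat):
    """Return {(i, j): max over paths through (i, j) of delay / constraint}."""
    succ = {}
    for (u, v), d in edges.items():
        succ.setdefault(u, []).append((v, d))

    @lru_cache(maxsize=None)
    def longest(u, v):
        """Longest delay from pin u to pin v; -inf if v is unreachable from u."""
        if u == v:
            return 0.0
        return max(
            (d + longest(w, v) for w, d in succ.get(u, [])), default=NEG_INF
        )

    crit = {}
    for (i, j), d in edges.items():
        best = NEG_INF
        for s, s_aat in source_aat.items():
            for t, t_rat in sink_rat.items():
                # Longest source-to-sink path forced through edge (i, j).
                t_path = longest(s, i) + d + longest(j, t)
                if t_path > NEG_INF:
                    best = max(best, t_path / (t_rat - s_aat))
        crit[(i, j)] = best
    return crit


edges = {("a", "g"): 2, ("b", "g"): 1, ("g", "z"): 3}
crit = edge_criticalities(edges, source_aat={"a": 0, "b": 0}, sink_rat={"z": 4})
print(crit)  # {('a', 'g'): 1.25, ('b', 'g'): 1.0, ('g', 'z'): 1.25}
```

The largest edge criticality equals the DPO of the whole graph, and edges with criticality above 1 lie on violating paths.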

15 Generic Minimization of DPO Reduce DPO to a simpler objective: $\max_{ij} w_{ij} d_{ij}$ –maximal weighted edge delay –use “reweighting iterations” One reweighting iteration –assume a placement –compute edge criticalities –compute new edge weights $w_{ij}$ –minimize $\max_{ij} w_{ij} d_{ij}$ (New weights: $w_{ij}' = \kappa_{ij} \mu / d_{ij}$ where $\mu = \max_{ij} w_{ij} d_{ij}$)
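
The weight update in one reweighting iteration can be sketched as follows. This is a reading of the formula on the slide, whose symbols were lost in the transcript: it assumes the new weight of an edge is its criticality times the current maximal weighted edge delay, divided by the current edge delay. The function `reweight` and the example data are hypothetical, and the min-max placement step that would follow is not shown.

```python
def reweight(weights, delays, crit):
    """One weight update: w'_ij = crit_ij * mu / d_ij, mu = max_ij w_ij * d_ij.

    weights, delays, crit: dicts keyed by edge (i, j); delays must be nonzero.
    """
    mu = max(weights[e] * delays[e] for e in weights)
    return {e: crit[e] * mu / delays[e] for e in weights}


# Example values: unit weights, delays from the current placement, and edge
# criticalities assumed to have been computed by timing analysis beforehand.
weights = {("a", "g"): 1.0, ("b", "g"): 1.0, ("g", "z"): 1.0}
delays = {("a", "g"): 2.0, ("b", "g"): 1.0, ("g", "z"): 3.0}
crit = {("a", "g"): 1.25, ("b", "g"): 1.0, ("g", "z"): 1.25}

new_weights = reweight(weights, delays, crit)
print(new_weights)  # {('a', 'g'): 1.875, ('b', 'g'): 3.0, ('g', 'z'): 1.25}
```

Under this reading, the weighted delay of every edge at the current placement becomes proportional to its criticality, so the next min-max step concentrates on edges that lie on the most critical paths.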

16 Properties of Reweighting Theorem 1. If $\mu = \max_{ij} w_{ij} d_{ij}$ does not increase at a particular iteration, all timing constraints must be satisfied. Theorem 2. A re-weighting iteration either decreases DPO, or leaves it unchanged. Reweighting upper-bounds $d_{ij}$ because $w_{ij} d_{ij} \le \mu$ ⇒ can interpret reweighting as delay rebudgeting Youssef and Shragowitz used $w_{ij} = \kappa_{ij}$ in 1990/92 –[interpretation of their iterative MiniMax] –no iterations with placement: ignore fixed pad locations

17 Optimization of Maximal Edge Delay Must consider particular edge delay models –popular choices: linear and quadratic Theorem 3. 2-dim max edge delay can be reduced to 1-dim case with double #vertices [“Inlined” implementation: no new graph] $\max a_{km} |t_k - t_m|$; $\max b_{km} (t_k - t_m)^2$ Theorem 4. Let $b_{km} = a_{km}^2$ ⇒ minimizers coincide ⇒ Linear and quadratic WL are numerically equivalent!
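
One way to see why the minimizers in Theorem 4 coincide is the short argument below. It is a sketch under the added assumption that all coefficients $a_{km}$ are nonnegative, and it is not taken from the slides.

```latex
\begin{align*}
\max_{k,m} b_{km}\,(t_k - t_m)^2
  &= \max_{k,m} \bigl(a_{km}\,|t_k - t_m|\bigr)^2
  && (b_{km} = a_{km}^2,\ a_{km} \ge 0) \\
  &= \Bigl(\max_{k,m} a_{km}\,|t_k - t_m|\Bigr)^2
  && (x \mapsto x^2 \text{ is increasing for } x \ge 0)
\end{align*}
```

The quadratic max objective is therefore an increasing function of the linear max objective, so any assignment of the $t_k$ that minimizes one also minimizes the other.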

18 Top-Down Placement Framework Top-down placement done in passes In one pass –split every previously existing block Cell-to-block assignments –viewed as region constraints –gradually refine, converge to cell locs Assume we analytically minimized signal delay ⇒ have cell locations ⇒ can compute edge delays ⇒ can perform Static Timing Analysis ⇒ know which cells lie on critical paths Use delay-minimizing cell locs when splitting blocks

19 Empirical Validation We combined min-max placement with recursive min-cut bisection (Capo → CapoT) Implemented minimization of edge delay objectives: –Length as delay –Squared length as delay –Quadratic RC delay –MST-based Elmore delay (using Evaluated –Internal evaluators (after placement): sanity check –Industry timing analyzer Compared to an industry placer on 4 test-cases –Won on three test-cases (by slack computed with industry STA)

20 Results of Quadratic, Linear and Min-Max Placement

21

22 Conclusions and Ongoing Work New timing-driven placement framework –can potentially be combined with budgeting or reweighting –expected to be successful enough on its own –leverages mincut placement –relies on a novel analytical delay minimization Dimensionless Path-timing Objective (DPO) –novel global timing objective; generalizes slack optimization New minimization algorithms –reweighting iteration: reduction to simpler MAX-based objective –MAX-based objective can be minimized very quickly Ongoing work in the context of timing-driven flows

23 Future Work Observation (how the proposed method works) –a classic placement approach is split into stages –a new timing optimization is performed between those stages –most critical wires/gates are found first (traditionally: placement is found first) ⇒ Try other types of optimizations during placement –routing of timing-critical nets better delay estimation early cross-talk detection? –sizing of timing-critical drivers –buffer insertion for timing-critical nets –early detection of dangerous cross-talk ⇒ Faster and cheaper ICs

