1 VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5: Global Routing © KLMH Lienig
EECS 527 Paper Presentation
Techniques for Fast Physical Synthesis
By Charles J. Alpert, Shrirang K. Karandikar, Zhuo Li, Gi-Joon Nam, Stephen T. Quay, Haoxing Ren, C. N. Sze, Paul G. Villarrubia, and Mehmet C. Yildiz
Presented by Lingfeng Xu
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor
11/2011

2 Outline
- Introduction
- Buffering Trends
- Major Phases of Physical Synthesis
- Closer Look at Optimization
- Selected Techniques
  - Fast Timing-Driven Buffering
  - Layout Aware Buffer Trees
  - Diffusion Based Legalization
- Q&A

3 Introduction
- Purpose of physical synthesis
  - Timing closure
- Physical synthesis
  - Iterations: iterate between manual design work and automatic physical synthesis
- Philosophy
  - As fast as possible, even if a little optimality is sacrificed
- IBM’s physical synthesis tool
  - PDS (Placement-Driven Synthesis) system

4 Buffering trends
- “Buffering Explosion”
  - Thinner wires mean higher resistance
  - Wire delays increasingly dominate gate delays
  - Saxena et al. [3] predict that half of all logic will consist of buffers
  - 20% - 25% of cells are buffers or inverters in today’s 90nm designs

5 Buffering trends (figure captions)
- Percentage of block-level nets requiring repeaters [3]
- Intra-block communication repeaters as a percentage of the total cell count for the block [3]

6 Buffering trends
- Challenges
  - Buffer insertion needs to be performed fast
  - Area and power
  - Layout awareness
  - Buffering constricts or seeds global routing

7 Major Phases of Physical Synthesis
- PDS stages
  - Initial placement and optimization
  - Timing-driven placement and optimization
  - Timing-driven detailed placement
  - Optimization techniques
  - Clock insertion and optimization
  - Routing and post-routing optimization
  - Early-mode timing optimization

8 Closer look at Optimization
- Optimization phases
  - Electrical correction
  - Critical path optimization
  - Histogram compression
  - Legalization
- An example of physical synthesis breakdown (flow diagram)
  - Phase 1: Initial Placement, Electrical Correction, Legalization, Critical Slack Optimization
  - Phase 2: Timing-driven Placement, Electrical Correction, Critical Slack Optimization, Legalization, Compression, Legalization
  - Phase 3: Timing-driven Detailed Placement
  - Phase 4: Electrical Correction, Legalization, Critical Slack Optimization, Legalization, Critical Slack Optimization, Legalization, Compression, Legalization

9 How to Achieve Fast Physical Synthesis?
- Selected Techniques
  - Fast Timing-Driven Buffering
  - Layout Aware Buffer Trees
  - Diffusion Based Legalization

10 Fast Timing-Driven Buffering
- Motivation
  - Over a million buffers
  - Rebuffering rips up all buffers and reinserts them from scratch
- Considerations
  - Buffering resources vs. delay
  - Runtime
  - Slew, noise and capacitance constraints

11 Fast Timing-Driven Buffering
- Classical Buffering Algorithm
  - Goal: maximize the source RAT (required arrival time)
  - Dynamic programming
  - Candidate solutions are generated and propagated from the sinks to the source
  - Each candidate solution at an internal node is characterized by (q, c, w)
    - q: required arrival time
    - c: downstream load capacitance
    - w: cumulative cost of the buffer insertion decisions
  - Example: sink (q = RAT, c = load capacitance, w = 0)

12 Fast Timing-Driven Buffering
- Classical Buffering Algorithm
  - Two solutions α1, α2
  - α2 dominates α1 if q2 ≥ q1, c2 ≤ c1 and w2 ≤ w1
  - α1 is then redundant and can be pruned
- At the end of the algorithm
  - A set of solutions with different cost-RAT tradeoffs is obtained
  - Choose one in the middle
  - “10 ps rule”: if the marginal RAT gain is more than 10 ps, choose the solution with the bigger RAT
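
The dominance rule on slide 12 can be illustrated with a minimal Python sketch. The `Candidate` type and the function names are hypothetical (they do not come from the paper or from PDS), and the quadratic scan is chosen for readability rather than speed.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical container for one candidate solution (q, c, w)."""
    q: float  # required arrival time at this node
    c: float  # downstream load capacitance
    w: float  # accumulated buffer cost


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b in all three dimensions."""
    return a.q >= b.q and a.c <= b.c and a.w <= b.w


def prune_dominated(cands: List[Candidate]) -> List[Candidate]:
    """Drop every candidate dominated by another one (quadratic scan)."""
    kept = []
    for i, a in enumerate(cands):
        redundant = False
        for j, b in enumerate(cands):
            if i == j or not dominates(b, a):
                continue
            # Exact duplicates dominate each other: keep only the first copy.
            if b != a or j < i:
                redundant = True
                break
        if not redundant:
            kept.append(a)
    return kept
```

A production implementation would typically keep the candidate list sorted so that pruning runs in linear time per node; the sketch only shows which candidates survive.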

13 Fast Timing-Driven Buffering
- Prebuffer Slack Pruning (PSP)
  - Based on the current node being processed
  - If q2 < q1, c2 < c1 and (q2 - q1)/(c2 - c1) ≥ Rmin, then α2 is pruned early
  - An appropriate Rmin guarantees optimality; however, a larger value does not hurt solution quality

14 Fast Timing-Driven Buffering
- Squeeze Pruning
  - Three partial solutions α1, α2, α3 with the same cost
  - If (q2 - q1)/(c2 - c1) ≤ (q3 - q2)/(c3 - c2), then α2 is pruned
  - For a two-pin net, the middle solution is always dominated by either the first or the third solution; for a multi-sink net, optimality is not guaranteed, but pruning causes no degradation in solution quality most of the time
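
The squeeze test on slide 14 can be sketched in Python as follows. This is an illustration under stated assumptions: the candidates all share one cost, are mutually non-dominated (so q and c both increase strictly once sorted by c), and the test is applied repeatedly with a stack. Whether the original tool iterates this way is not stated on the slide. The function name `squeeze_prune` is hypothetical.

```python
from typing import List, Tuple

Solution = Tuple[float, float]  # (q, c) of a candidate; equal cost w assumed


def squeeze_prune(cands: List[Solution]) -> List[Solution]:
    """Remove each middle solution whose slope test from slide 14 holds.

    Assumes strictly increasing c (and q) after sorting, so the slope
    comparison can be cross-multiplied without dividing.
    """
    pts = sorted(cands, key=lambda s: s[1])  # increasing capacitance
    kept: List[Solution] = []
    for q3, c3 in pts:
        while len(kept) >= 2:
            (q1, c1), (q2, c2) = kept[-2], kept[-1]
            # (q2 - q1)/(c2 - c1) <= (q3 - q2)/(c3 - c2), cross-multiplied
            if (q2 - q1) * (c3 - c2) <= (q3 - q2) * (c2 - c1):
                kept.pop()  # the middle solution is squeezed out
            else:
                break
        kept.append((q3, c3))
    return kept
```

Applied repeatedly like this, the test keeps only solutions on the upper convex hull in the (c, q) plane, which is one way to read the slide's remark that the middle solution of a two-pin net is never needed.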

15 Fast Timing-Driven Buffering
- Library Lookup
  - Every buffer in the library is examined at each iteration
  - With m kinds of buffers and inverters and n nodes, this gives mn candidate solutions in total
  - However, many candidate solutions are not worth considering
  - Pre-compute a buffer table and an inverter table
  - 2n candidate solutions: n with inverters and n with buffers
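
One possible shape for such a precomputed table is sketched below. The slide does not say how the table is indexed or how buffer delay is modeled, so everything here is an assumption: the table is keyed by sampled load capacitance, each library entry is a hypothetical tuple (name, intrinsic delay, drive resistance), and delay is modeled as intrinsic delay plus resistance times load.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical library entry: (name, intrinsic_delay, drive_resistance)
Buffer = Tuple[str, float, float]


def build_table(library: List[Buffer], loads: List[float]) -> List[str]:
    """For each sampled load (sorted ascending), record the fastest buffer.

    Assumes a linear delay model: delay = intrinsic_delay + resistance * load.
    """
    table = []
    for c in loads:
        best = min(library, key=lambda b: b[1] + b[2] * c)
        table.append(best[0])
    return table


def lookup(table: List[str], loads: List[float], c: float) -> str:
    """Return the precomputed choice for the first sampled load >= c."""
    i = min(bisect_left(loads, c), len(loads) - 1)
    return table[i]
```

With one such table for buffers and one for inverters, each of the n candidate positions needs two lookups instead of m evaluations, which matches the 2n count on the slide.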

16 Fast Timing-Driven Buffering
- Results and Summary
  - Results derived from 5000 high-capacitance nets from an ASIC chip
  - 3% quality degradation and 20x speedup
  - Philosophy: as fast as possible, even if a little optimality is sacrificed
  - Rip-up and rebuffering with more accurate techniques can be performed later if desired

17 Layout Aware Fast and Flexible Buffer Trees
- Layout problems in buffering
  - (a) Alley
  - (b) Pile-ups
  - Holes in large blocks
- Layout constraints
  - Holes in large blocks
  - Navigating blocks and dense regions
  - Critical and non-critical routes
  - Avoiding routing congestion

18 Layout Aware Fast and Flexible Buffer Trees
- Layout aware buffer tree flow
  - Step 1: Construct a fast timing-driven Steiner tree
  - Step 2: Reroute the Steiner tree to preserve its topology while navigating environmental constraints
  - Step 3: Insert buffers (e.g. with Fast Timing-Driven Buffering)
- This work focuses on Step 2

19 Layout Aware Fast and Flexible Buffer Trees
- Algorithm
  - Break the existing Steiner tree into disjoint 2-paths, i.e., paths that start and end at a source, a sink or a Steiner point
  - Each 2-path is routed in turn to minimize cost, starting from the sinks and ending at the source
  - Maze routing is used for each 2-path, with a cost function
  - If a Steiner point is in a congested region, move it within a specified “plate region”

22 Layout Aware Fast and Flexible Buffer Trees
- General maze routing cost function
  - Tradeoff parameter 0 ≤ K ≤ 1
  - Tile cost: cost(t) = 1 + K e(t)
  - Merging branches: cost(t) = max(cost(L), cost(R)) + K min(cost(L), cost(R))
  - Sink initialization: cost(s) = (K - 1) RAT(s)/DpT
  - Use K = 1 for electrical correction; use K = 0.1 for the critical path
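
The three cost rules on slide 22 translate directly into small functions. The sketch below is only an illustration: the function names are hypothetical, the merge rule is read as max(...) + K min(...), and e(t) and DpT are taken to be an environmental cost of tile t and a delay-per-tile constant, which the slide does not define.

```python
def tile_cost(e_t: float, k: float) -> float:
    """Cost of stepping through tile t; e_t is assumed to be its environmental cost."""
    return 1.0 + k * e_t


def merge_cost(cost_left: float, cost_right: float, k: float) -> float:
    """Cost at a Steiner point that joins a left and a right branch."""
    return max(cost_left, cost_right) + k * min(cost_left, cost_right)


def sink_cost(rat: float, dpt: float, k: float) -> float:
    """Initial cost at a sink; dpt is assumed to be a delay-per-tile constant."""
    return (k - 1.0) * rat / dpt
```

The two extremes show the role of K. With K = 1 the merge rule adds both branches and the sink RAT term vanishes, so only total resource cost matters. With K = 0 the merge rule keeps only the worse branch and the tile cost ignores e(t), so the route is driven by the most critical sink.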

23 Layout Aware Fast and Flexible Buffer Trees
- Example and Summary
  - A 7-pin net of an industrial design
  - (a) K = 1.0: 4134 ps slack improvement
  - (b) K = 0.1: 4646 ps slack improvement

24 Diffusion-Based Placement Techniques for Legalization
- Classical legalization
  - After optimization, local regions can be overfull
  - Legalization is run periodically to snap overlapping cells to legal positions
  - If one waits too long between two legalizations, cells may end up quite far from their optimal positions, which may severely hurt timing
- Diffusion-Based Legalization
  - Avoids moving cells too far away
  - Fast: runs in minutes on designs with millions of gates

25 Diffusion-Based Placement Techniques for Legalization
- Diffusion as a Physical Process
  - Moves elements from a state with non-zero potential energy to a state of equilibrium
  - Can be modeled by breaking it down into finite time steps
  - Relationship of material concentration with time and space

26 Diffusion-Based Placement Techniques for Legalization
- Diffusion as a Physical Process
  - Cell velocity
  - New cell location
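
The equations for these three items were images and are not part of the transcript. As a hedged reminder of what such a model usually looks like, the block below gives the standard diffusion equation together with a velocity and position update of the kind the bullets describe. The symbols d (density), D (diffusivity), v^H and v^V (horizontal and vertical velocity) are assumptions and may differ from the notation on the original slides.

```latex
% Density d changes over time according to the diffusion equation
\frac{\partial d_{x,y}(t)}{\partial t} = D \, \nabla^2 d_{x,y}(t)

% Cell velocity: proportional to the local density gradient, pointing downhill
v^H_{x,y}(t) = -\frac{\partial d_{x,y}(t) / \partial x}{d_{x,y}(t)}, \qquad
v^V_{x,y}(t) = -\frac{\partial d_{x,y}(t) / \partial y}{d_{x,y}(t)}

% New cell location after one finite time step \Delta t
x(t + \Delta t) = x(t) + v^H_{x,y}(t) \, \Delta t, \qquad
y(t + \Delta t) = y(t) + v^V_{x,y}(t) \, \Delta t
```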

27 Diffusion-Based Placement Techniques for Legalization
- Diffusion Based Placement
  - Coordinates are scaled so that the width and height of each bin is one
  - Location (x, y) lies in bin
  - Forward Time Centered Space (FTCS) scheme
  - New bin density
  - Bin velocity
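
The bin density and bin velocity formulas on slide 27 were images and are missing from the transcript. The following Python sketch shows one common FTCS discretization on unit-sized bins. It is an assumption, not the slide's exact formula: the factor of 1/2 per axis, the mirrored (zero-flux) boundary handling, the zero velocity in empty bins and the function names are all hypothetical choices.

```python
from typing import List

Grid = List[List[float]]  # d[j][k]: j indexes x (horizontal), k indexes y


def _clamp(i: int, n: int) -> int:
    """Mirror out-of-range neighbors back onto the edge bin (zero flux)."""
    return min(max(i, 0), n - 1)


def ftcs_step(d: Grid, dt: float) -> Grid:
    """One forward-time, centered-space update of all bin densities."""
    nj, nk = len(d), len(d[0])
    new = [[0.0] * nk for _ in range(nj)]
    for j in range(nj):
        for k in range(nk):
            left, right = d[_clamp(j - 1, nj)][k], d[_clamp(j + 1, nj)][k]
            down, up = d[j][_clamp(k - 1, nk)], d[j][_clamp(k + 1, nk)]
            new[j][k] = (d[j][k]
                         + 0.5 * dt * (right + left - 2.0 * d[j][k])
                         + 0.5 * dt * (up + down - 2.0 * d[j][k]))
    return new


def horizontal_velocity(d: Grid) -> Grid:
    """Centered-difference horizontal bin velocity, zero on the left/right edges."""
    nj, nk = len(d), len(d[0])
    v = [[0.0] * nk for _ in range(nj)]
    for j in range(1, nj - 1):
        for k in range(nk):
            if d[j][k] > 0.0:  # empty bins get zero velocity (assumption)
                v[j][k] = -(d[j + 1][k] - d[j - 1][k]) / (2.0 * d[j][k])
    return v
```

With this boundary handling the total density is conserved from step to step, and the velocity always points from denser bins toward sparser ones.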

28 Diffusion-Based Placement Techniques for Legalization
- Diffusion Based Placement
  - Enforce vV = 0 at horizontal boundaries and vH = 0 at vertical boundaries
  - Two cells right next to each other can be assigned very different velocities, which could change their relative ordering. Apply velocity interpolation based on the four closest bins to remedy this behavior
  - New locations (x, y) for the next time step
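
Interpolating velocity from the four closest bins can be sketched with plain bilinear interpolation. This is a swapped-in, generic technique: the slide does not give the exact weights, so the bilinear form, the bin-center convention (bin (j, k) has its center at (j + 0.5, k + 0.5)), the edge clamping and the function names are assumptions.

```python
import math
from typing import List, Tuple

Field = List[List[float]]  # v[j][k]: velocity stored at the center of bin (j, k)


def interpolate(v: Field, x: float, y: float) -> float:
    """Bilinear interpolation of a bin-centered field at location (x, y)."""
    nj, nk = len(v), len(v[0])
    fx, fy = x - 0.5, y - 0.5          # shift so bin centers sit on integers
    j0, k0 = math.floor(fx), math.floor(fy)
    a, b = fx - j0, fy - k0            # fractional offsets in [0, 1)

    def at(j: int, k: int) -> float:
        # Clamp to the grid so cells near the border reuse the edge bins.
        return v[min(max(j, 0), nj - 1)][min(max(k, 0), nk - 1)]

    return ((1 - a) * (1 - b) * at(j0, k0) + a * (1 - b) * at(j0 + 1, k0)
            + (1 - a) * b * at(j0, k0 + 1) + a * b * at(j0 + 1, k0 + 1))


def move_cell(x: float, y: float, vh: Field, vv: Field, dt: float) -> Tuple[float, float]:
    """Advance one cell by one time step using the interpolated velocities."""
    return (x + interpolate(vh, x, y) * dt, y + interpolate(vv, x, y) * dt)
```

Because the interpolated velocity varies continuously with position, two neighboring cells receive nearly identical velocities, which is what keeps their relative order from flipping.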

29 Diffusion-Based Placement Techniques for Legalization
- Diffusion Based Placement: Getting it to work
  - The diffusion process reaches equilibrium when each bin has the same density, i.e. the average density. This can cause unnecessary spreading, even if every bin’s density is well below dmax
  - Idea: run diffusion only for regions that require it
  - Local diffusion: run diffusion on cells in a window around bins that violate the target density constraint
  - If the FTCS error exceeds a certain threshold, update the density based on the real cell placement and restart the diffusion algorithm

30 Diffusion-Based Placement Techniques for Legalization
- Example
  - Before legalization, after traditional legalization and after diffusion legalization
  - 4% total wire length saving
  - 48% worst slack improvement
  - 36% fewer negative paths
- Summary
  - Diffusion-based legalization is less likely to disrupt the state of the design

31 Summary
- Buffering trends
  - “Buffer Explosion”
- Physical synthesis phases
  - 4 phases
- Fast Timing-Driven Buffering
- Layout Aware Buffer Trees
- Diffusion-Based Legalization

32 Thanks! Q&A

