1 Provably Good Global Buffering by Multiterminal Multicommodity Flow Approximation
F.F. Dragan (Kent State), A.B. Kahng (UCSD), I. Mandoiu (UCLA), S. Muddu (Sanera Systems), A. Zelikovsky (Georgia State)

2 Outline
- Buffer-block methodology for global buffering
- Global routing via buffer-blocks problem
- Integer node-capacitated multiterminal multicommodity flow (MTMCF) formulation
- Provably good approximation of fractional MTMCF
- Provably good rounding of fractional MTMCF
- Key implementation choices
- Experimental results
- Extensions & conclusions

3 Motivation
- VDSM ⇒ buffer / inverter insertion for all global nets
  - 50nm technology ⇒ >1,000,000 buffers
- Solution: insert buffers only in Buffer-Blocks (BBs)
  - Simplified design: isolates buffer insertion from circuit block implementations
  - Efficient utilization of routing/area resources (RAR)
    RAR(cap. 2k buffer-block) = α × RAR(cap. k buffer-block); for high-end designs, α ≈ 1.6

4 Buffer-Block Methodology
1. Buffer-block planning [Cong+99] [TangW00]
   - Given: placement of circuit blocks + netlist
   - Find: shape and location of BBs within the available free space so as to maximize the number of routable nets
2. Global buffering via given BBs (this paper)
   - Given: nets + BB locations and capacities
   - Find: buffered routing for each net, subject to timing-driven and buffer-parity constraints

6 Problem Formulation
Global Routing via Buffer-Blocks (GRBB) Problem
Given:
- BB locations and capacities
- List of multi-pin nets; each net has
  - an upper bound + parity requirement on #buffers for each source-sink path
  - [a non-negative weight (criticality coefficient)]
- L/U bounds on wirelength between consecutive buffers/pins
Find:
- Buffered routing of a maximum [weighted] number of nets subject to the given constraints
[Dragan+00]: 2-pin nets
This paper: multi-pin nets

7 Our Contributions
- Integer node-capacitated MTMCF formulation
- Approximation algorithm for fractional MTMCF
  - Extends [GargK98, Fleischer99, Albrecht00, Dragan+00] to the node-capacitated + multiterminal case
- Provably good fractional MTMCF rounding algorithms ⇒ provably good algorithm for the GRBB Problem
- Practical rounding heuristics based on random walks
- Computational study comparing alternative implementations

8 Integer Program Formulation
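The formulas on this slide did not survive in the transcript. As a sketch only, an integer node-capacitated MTMCF program consistent with the other slides could look as follows. The notation is an assumption: $\mathcal{T}_k$ is the set of valid routing trees for net $k$, $g_k$ is the optional net weight, $\pi(T,v)$ is the number of buffers that tree $T$ uses in buffer block $v$, and the one-tree-per-net constraint is assumed rather than taken from the slide.

```latex
\begin{align*}
\text{maximize}\quad   & \sum_{k} g_k \sum_{T \in \mathcal{T}_k} f(T) \\
\text{subject to}\quad & \sum_{T} \pi(T,v)\, f(T) \le \mathrm{cap}(v)
                         && \text{for every buffer block } v \\
                       & \sum_{T \in \mathcal{T}_k} f(T) \le 1
                         && \text{for every net } k \\
                       & f(T) \in \{0,1\}
                         && \text{for every valid routing tree } T
\end{align*}
```

Relaxing the last line to $f(T) \ge 0$ would give the fractional MTMCF that the following slides approximate and then round.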

9 “Relax+Round” Approach
1. Solve the fractional relaxation
   - Relaxation = node-capacitated multiterminal multicommodity flow
   - Exact linear programming algorithms are impractical for large instances
   - KEY IDEA: use an approximation algorithm
     - can approximate the optimum within a factor of (1-ε) for any ε>0
     - allows a continuous tradeoff between runtime and solution quality
2. Round to an integer solution
   - Provably good rounding using [RaghavanT87]
   - Practical rounding using random walks

10 The ε-MTMCF Algorithm
w(v) = δ for every node v, f = 0
For i = 1 to N do
  For k = 1, …, #nets do
    Find min weight valid routing tree T for net k
    While w(T) < min{ 1, δ(1+2ε)^i } do
      f(T) = f(T) + 1
      For every v ∈ T do
        w(v) ← ( 1 + ε π(T,v) / cap(v) ) * w(v)
      End For
      Find min weight valid routing tree T for net k
    End While
  End For
End For
Output f/N
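The loop structure above can be sketched in Python. This is a minimal illustration, not the authors' implementation: the min-weight tree search is replaced by a hypothetical stand-in that picks the cheapest tree from an explicit candidate list per net, each tree is a dict mapping a node to its usage π(T,v), and the tree weight is assumed to be w(T) = Σ π(T,v)·w(v). The parameters δ and N are taken as inputs; choosing them so that f/N is feasible is outside this sketch.

```python
def eps_mtmcf(cap, nets, eps, delta, N):
    """cap: {node: capacity}.
    nets: one list of candidate trees per net (hypothetical stand-in for the
    min-weight tree search); each tree is a non-empty dict {node: usage pi(T, v)}.
    Returns {(net index, tree index): fractional flow}."""
    w = {v: delta for v in cap}          # node weights, w(v) = delta
    f = {}                               # accumulated integer flow per tree

    def weight(tree):
        # Assumed tree weight: usage-weighted sum of node weights.
        return sum(use * w[v] for v, use in tree.items())

    def best(k):
        j = min(range(len(nets[k])), key=lambda j: weight(nets[k][j]))
        return j, nets[k][j]

    for i in range(1, N + 1):
        bound = min(1.0, delta * (1 + 2 * eps) ** i)
        for k in range(len(nets)):
            j, tree = best(k)
            while weight(tree) < bound:
                f[(k, j)] = f.get((k, j), 0) + 1
                for v, use in tree.items():
                    w[v] *= 1 + eps * use / cap[v]   # multiplicative update
                j, tree = best(k)
    return {key: val / N for key, val in f.items()}  # output f / N


flows = eps_mtmcf({"a": 1}, [[{"a": 1}]], eps=0.5, delta=0.1, N=6)
```

Nodes that carry flow become exponentially more expensive, so later iterations steer trees toward less-loaded buffer blocks.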

11 Runtime of ε-MTMCF Algorithm
Main step of the ε-MTMCF algorithm: computing a min node-weight valid routing tree for a net, that is, a min node-weight directed rooted Steiner tree (DRST) in a directed acyclic graph

12 Implementation choices

                | 2-pin                       | 3,4-pin                                      | Multi-pin
Decomposition   | Star, minimum spanning tree | Matching, 3-restricted Steiner tree          | Not needed
Min-weight DRST | Shortest path (exact)       | Try all Steiner pts + shortest paths (exact) | Very hard! ⇒ heuristics
Rounding        | Random-walk                 | Backward random-walks                        | Backward random-walks
                | [Dragan+00]                 | This paper                                   | This paper
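The "try all Steiner points + shortest paths" entry for 3-pin nets can be illustrated with a short Python sketch. It rests on several simplifying assumptions: the routing graph is an adjacency dict, pins have weight 0, a path's cost is the total weight of the nodes it enters, and the buffer-count, parity, and wirelength constraints that make a tree "valid" are ignored. The function names are hypothetical.

```python
import heapq


def node_dijkstra(graph, weight, src):
    """Min total weight of the nodes entered on a path from src (src is free)."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v in graph.get(u, ()):
            nd = d + weight.get(v, 0.0)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def best_3pin_tree(graph, weight, source, t1, t2):
    """Try every reachable node u as the branching (Steiner) point."""
    from_source = node_dijkstra(graph, weight, source)
    best = (float("inf"), None)
    for u in from_source:
        from_u = node_dijkstra(graph, weight, u)
        if t1 in from_u and t2 in from_u:
            cost = from_source[u] + from_u[t1] + from_u[t2]
            best = min(best, (cost, u), key=lambda c: c[0])
    return best                           # (tree cost, Steiner point)


graph = {"s": ["a", "b", "c"], "a": ["t1", "t2"], "b": ["t1"], "c": ["t2"]}
cost, steiner = best_3pin_tree(graph, {"a": 3.0, "b": 1.0, "c": 1.0}, "s", "t1", "t2")
```

With one shortest-path computation per candidate Steiner point, the search stays polynomial for 3-pin nets, which is why decomposition into small nets keeps this step exact.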

13 Provably Good Rounding
1. Store fractional flows f(T) for every valid routing tree T
2. Scale down each f(T) by 1-ε for small ε
3. Route each net k with probability f(k) = Σ { f(T) | T routing for k }
   ⇒ Number of routed nets ≥ (1-ε)OPT
4. To route net k, choose tree T with probability f(T) / f(k)
   ⇒ With high probability, no BB capacity is exceeded
Problem: impractical to store all non-zero flow trees
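Steps 2 to 4 can be sketched in a few lines of Python. The data layout is an assumption made for illustration: `flows` maps each net to a dict of tree identifiers and their fractional flows f(T), with the per-net total assumed to be at most 1.

```python
import random


def round_flows(flows, eps, rng=random):
    """flows: {net: {tree_id: f(T)}}. Returns {net: chosen tree_id} for routed nets."""
    routed = {}
    for net, trees in flows.items():
        scaled = {t: (1 - eps) * f for t, f in trees.items()}   # step 2
        f_k = sum(scaled.values())
        if f_k <= 0 or rng.random() >= f_k:                     # step 3
            continue                                            # net stays unrouted
        ids = list(scaled)
        # Step 4: pick tree T with probability f(T) / f(k).
        routed[net] = rng.choices(ids, weights=[scaled[t] for t in ids])[0]
    return routed


picked = round_flows({"n1": {"A": 0.7, "B": 0.3}, "n2": {"C": 0.0}}, eps=0.0)
```

The sketch makes the storage problem visible: it needs the full `{tree_id: f(T)}` table for every net, which is what the random-walk variants avoid.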

14 Random-Walk 2-TMCF Rounding
Same four steps as Provably Good Rounding (previous slide), plus:
- Use a random walk from source to sink
- Practical: the random walk requires storing only the flows on edges
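A minimal Python sketch of the random walk for a 2-pin net follows. It assumes the net's fractional flow is stored per edge as a nested dict, that the flow is acyclic, and that it is conserved at every intermediate node; the function name is hypothetical.

```python
import random


def random_walk_path(edge_flow, source, sink, rng=random):
    """edge_flow: {u: {v: flow of this net on edge (u, v)}}, assumed acyclic."""
    path = [source]
    node = source
    while node != sink:
        succ = [(v, f) for v, f in edge_flow.get(node, {}).items() if f > 0]
        if not succ:
            raise ValueError(f"flow is not conserved at {node!r}")
        nodes, flows = zip(*succ)
        # Leave the node along an edge with probability proportional to its flow.
        node = rng.choices(nodes, weights=flows)[0]
        path.append(node)
    return path


path = random_walk_path({"s": {"a": 0.5, "b": 0.0}, "a": {"t": 0.5}}, "s", "t")
```

Only per-edge flows are read, so no explicit list of flow-carrying paths has to be kept.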

15 Random-Walk MTMCF Rounding
[Figure: random walks between source S and sinks T1, T2, T3]

16 Random-Walk MTMCF Rounding
[Figure: random walks between source S and sinks T1, T2, T3]

17 Our MTMCF Rounding Heuristic
1. Round each net k with probability f(k), using backward random walks
   - No scaling-down: approximate MTMCF < OPT
2. Resolve capacity violations by greedily deleting routed paths
   - Few violations
3. Greedily route remaining nets using unused BB capacity
   - Further routing still possible
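Step 2 can be sketched as follows. The slide does not say which paths are deleted first, so the greedy rule here is a hypothetical choice, and the sketch deletes whole nets rather than individual source-sink paths to stay short. Every buffer block used by a net is assumed to appear in `cap`.

```python
def resolve_violations(routed, cap):
    """routed: {net: {bb: buffers used}}; cap: {bb: capacity}.
    Greedily drops nets until no buffer block is over capacity."""
    load = {bb: 0 for bb in cap}
    for use in routed.values():
        for bb, n in use.items():
            load[bb] += n
    kept = dict(routed)
    while True:
        over = {bb for bb in cap if load[bb] > cap[bb]}
        if not over:
            return kept
        # Hypothetical rule: drop the net using the most buffers in overloaded BBs.
        victim = max(kept, key=lambda k: sum(n for bb, n in kept[k].items() if bb in over))
        for bb, n in kept.pop(victim).items():
            load[bb] -= n


kept = resolve_violations({"n1": {"b1": 1}, "n2": {"b1": 2}, "n3": {"b1": 1}}, {"b1": 2})
```

Nets dropped here become candidates for step 3, which retries them against whatever capacity remains.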

18 Implemented Heuristics
Greedy buffered routing
1. For each net, route sinks sequentially along a shortest path to the source or to a node already connected to the source
2. After routing a net, remove fully used BBs
MTMCF approximation + randomized rounding
- 2TMCF [Dragan+00]
- 3TMCF (3-pin decomposition + ε-MTMCF + rounding)
- 4TMCF (4-pin decomposition + ε-MTMCF + rounding)
- MTMCF (ε-MTMCF w/ approximate DRST + rounding)
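The greedy baseline can be sketched in Python under simplifying assumptions: the routing graph is an undirected adjacency dict in which an edge means the wirelength bounds are met, "shortest" means fewest hops, each interior node of a new path consumes one buffer, and the buffer-count and parity constraints are ignored. The function names are hypothetical.

```python
from collections import deque


def greedy_route(adj, cap, nets):
    """adj: {node: neighbours}; cap: {bb: free buffers}, updated in place;
    nets: list of (source, [sinks]). Returns indices of fully routed nets."""
    done = []
    for idx, (source, sinks) in enumerate(nets):
        tree, taken = {source}, []
        for sink in sinks:
            path = _shortest_path(adj, cap, sink, tree)
            if path is None:
                break
            for bb in path[1:-1]:         # interior nodes are buffer blocks
                cap[bb] -= 1
                taken.append(bb)
            tree.update(path)
        else:
            done.append(idx)
            continue
        for bb in taken:                  # net failed: give its buffers back
            cap[bb] += 1
    return done


def _shortest_path(adj, cap, start, targets):
    """Fewest-hop path from start to any node in targets via BBs with free capacity."""
    prev, queue = {start: None}, deque([start])
    while queue:
        u = queue.popleft()
        if u in targets:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path                   # runs from the reached tree node back to start
        for v in adj.get(u, ()):
            if v not in prev and (v in targets or cap.get(v, 0) > 0):
                prev[v] = u
                queue.append(v)
    return None


adj = {"s": ["b1"], "b1": ["s", "t1", "t2"], "t1": ["b1"], "t2": ["b1"]}
cap = {"b1": 1}
routed = greedy_route(adj, cap, [("s", ["t1", "t2"]), ("s", ["t1"])])
```

Skipping buffer blocks with no free capacity during the search plays the role of "remove fully used BBs" in step 2.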

19 Experimental Setup
- Test instances extracted from next-generation SGI microprocessor
- Up to 5,000 nets, ~6,000 sinks
- U = 4,000 µm, L = 500-2,000 µm
- 50 buffer blocks
- 200-400 buffers / BB

20 % Sinks Connected

#sinks/#nets | Greed | 2TMCF ε=.64 | 2TMCF ε=.04 | 3TMCF ε=.64 | 3TMCF ε=.04 | 4TMCF ε=.64 | 4TMCF ε=.04 | MTMCF ε=.64 | MTMCF ε=.04
2958/2396    | 92.2  | 93.8        | 95.5        | 96.2        | 97.8        | 96.6        | 98.3        | 96.7        | 97.4
3077/2438    | 92.3  | 93.9        | 96.5        | 96.4        | 98.5        | 96.9        | 98.8        | 97.6        | 99.3
3099/2784    | 92.1  | 93.6        | 95.5        | 96.4        | 98.0        | 96.6        | 98.1        | 97.3        | 98.7
6038/4764    | 93.5  | 94.8        | 96.8        | 95.7        | 97.6        | 96.5        | 98.4        | 96.3        | 97.7
6296/4925    | 93.6  | 96.2        | 97.6        | 97.0        | 98.6        | 97.7        | 99.1        | 97.7        | 98.4
6321/4938    | 93.3  | 96.2        | 97.5        | 96.8        | 98.4        | 97.7        | 98.9        | 97.7        | 98.2

21 Runtime (sec.)

#sinks/#nets | Greed | 2TMCF ε=.64 | 2TMCF ε=.04 | 3TMCF ε=.64 | 3TMCF ε=.04 | 4TMCF ε=.64 | 4TMCF ε=.04 | MTMCF ε=.64 | MTMCF ε=.04
2958/2396    | 0.30  | 1.63        | 357         | 9.16        | 2,090       | 98.91       | 29,190      | 2.33        | 947
3077/2438    | 0.33  | 2.35        | 350         | 11.10       | 2,356       | 128.38      | 37,970      | 2.87        | 846
3099/2784    | 0.33  | 1.80        | 392         | 12.56       | 2,364       | 132.81      | 38,341      | 2.86        | 877
6038/4764    | 0.53  | 2.84        | 600         | 16.57       | 3,166       | 182.55      | 60,450      | 4.98        | 1,866
6296/4925    | 0.55  | 4.35        | 690         | 19.5        | 3,721       | 265.78      | 77,671      | 5.38        | 1,828
6321/4938    | 0.54  | 3.37        | 730         | 18.99       | 3,813       | 255.37      | 79,123      | 5.43        | 1,833

22 % Routed Nets vs. Runtime

23 Resource Usage
#nets = 4,764, #sinks = 6,038, 400 buffers/BB

                    | Greed | 2TMCF ε=.64 | 2TMCF ε=.04 | 3TMCF ε=.64 | 3TMCF ε=.04 | 4TMCF ε=.64 | 4TMCF ε=.04 | MTMCF ε=.64 | MTMCF ε=.04
# Conn. Sinks       | 5,645 | 5,725       | 5,842       | 5,779       | 5,896       | 5,827       | 5,942       | 5,813       | 5,897
% Conn. Sinks       | 93.5  | 94.8        | 96.8        | 95.7        | 97.6        | 96.5        | 98.4        | 96.3        | 97.7
Wirelength (meters) | 42.22 | 45.18       | 47.80       | 44.48       | 47.66       | 44.18       | 47.49       | 45.33       | 47.51
WL/sink (microns)   | 7,479 | 7,891       | 8,182       | 7,697       | 8,083       | 7,582       | 7,992       | 7,798       | 8,057
#Buffers            | 9,037 | 9,860       | 10,676      | 9,591       | 10,610      | 9,497       | 10,507      | 9,860       | 10,647
#Buff/sink          | 1.60  | 1.72        | 1.83        | 1.66        | 1.80        | 1.63        | 1.77        | 1.70        | 1.81
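The derived rows of this slide follow from the raw counts. As a quick check, the snippet below recomputes them for the Greed column, using only numbers from the slide; the variable names are illustrative.

```python
# Greed column of the Resource Usage slide.
total_sinks = 6038
conn_sinks = 5645
wirelength_m = 42.22
buffers = 9037

pct_conn = 100 * conn_sinks / total_sinks            # % Conn. Sinks
wl_per_sink_um = wirelength_m * 1e6 / conn_sinks     # WL/sink in microns
buff_per_sink = buffers / conn_sinks                 # #Buff/sink

print(round(pct_conn, 1), round(wl_per_sink_um), round(buff_per_sink, 2))
```

The per-sink figures are normalized by connected sinks, not by all 6,038 sinks, so a method that connects more sinks can show higher totals without being less efficient per sink.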

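24 WL and #Buffers for 100% Completion
#nets = 4,764, #sinks = 6,038

BB Cap. | Metric   | Greed  | 2TMCF ε=.64 | 2TMCF ε=.04 | 3TMCF ε=.64 | 3TMCF ε=.04 | 4TMCF ε=.64 | 4TMCF ε=.04 | MTMCF ε=.64 | MTMCF ε=.04
500     | WL       | -      | -           | 51.28       | -           | 50.07       | -           | 49.45       | -           | 49.54
500     | #Buffers | -      | -           | 11,738      | -           | 11,312      | -           | 11,079      | -           | 11,161
600     | WL       | -      | 50.46       | 51.13       | 48.93       | 49.95       | 48.02       | 49.58       | 48.34       | 49.27
600     | #Buffers | -      | 11,330      | 11,688      | 10,802      | 11,267      | 10,512      | 11,115      | 10,631      | 11,075
1000    | WL       | 47.89  | 50.59       | 50.76       | 49.05       | 49.93       | 48.01       | 49.98       | 48.28       | 48.27
1000    | #Buffers | 10,330 | 11,334      | 11,558      | 10,802      | 11,284      | 10,512      | 11,373      | 10,619      | 10,783
8000    | WL       | 47.89  | 50.62       | 50.28       | 48.97       | 51.28       | 48.07       | 51.40       | 48.33       | 48.44
8000    | #Buffers | 10,330 | 11,334      | 11,340      | 10,794      | 11,788      | 10,503      | 11,803      | 10,619      | 10,625

Flow-rounding wastes routing resources!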

25 Conclusions and Ongoing Work
- Provably good algorithms and practical heuristics based on node-capacitated MTMCF approximation
  - Higher completion rates than previous algorithms
- Extensions:
  - Combine global buffering with BB planning
    - combine with compaction

26 Combining with compaction

27 Combining with compaction

28 Combining with compaction
Sum-capacity constraints: cap(BB1) + cap(BB2) ≤ const.

29 Conclusions and Ongoing Work
- Provably good algorithms and practical heuristics based on node-capacitated MTMCF approximation
  - Higher completion rates than previous algorithms
- Extensions:
  - Combine global buffering with BB planning
    - combine with compaction
  - Enforce channel capacity constraints
  - Improved resource usage
    - smart release of resources

