F.F. Dragan (Kent State) A.B. Kahng (UCSD) I. Mandoiu (UCLA/UCSD) S. Muddu (Sanera Systems) A. Zelikovsky (Georgia State) Practical Approximation Algorithms.

Presentation transcript:

F.F. Dragan (Kent State) A.B. Kahng (UCSD) I. Mandoiu (UCLA/UCSD) S. Muddu (Sanera Systems) A. Zelikovsky (Georgia State) Practical Approximation Algorithms for Separable Packing LPs

2 Outline
VLSI design motivation
– Global routing via buffer-blocks
– Separable packing ILP formulations
PTAS for separable packing LPs
Analysis
Experimental results

6 VLSI Global Routing

7 Buffered Buffer Blocks

8 Problem Formulation
Global Routing via Buffer-Blocks (GRBB) Problem
Given:
– BB locations and capacities
– List of multi-pin nets, with an upper bound on #buffers for each source-sink path
– L/U bounds on the wirelength between consecutive buffers/pins
Find: Buffered routing of a maximum number of nets subject to the given constraints

9 Integer Program Formulation

10 Enforcing Parity Constraints
Inverting buffers change the polarity of the signal
Each sink has a given polarity requirement
⇒ Parity constraints for the #buffers on each routed source-sink path
⇒ A path may use two buffers in the same buffer block
Integer program changes
– Split each BB vertex r of G into two copies, r’ and r’’
– Impose capacity constraint on the sets of vertices {r’,r’’}
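The vertex-splitting step can be illustrated with a small sketch. The slide only states that each BB vertex r is split into r’ and r’’ with a shared capacity; the edge rewiring below is an assumption made for illustration: a copy `(r, p)` records the number of buffers used so far modulo 2, every hop into a buffer block flips the parity, and a sink accepts only the parity it requires. The names `split_for_parity` and `sink_parity` are hypothetical.

```python
def split_for_parity(edges, source, buffer_blocks, sink_parity):
    """Build a parity-expanded graph (illustrative sketch, not the authors' code).

    edges         : iterable of directed edges (u, v) of the routing graph G
    source        : the net's source vertex (uses no buffer, parity 0)
    buffer_blocks : set of BB vertices; each r becomes (r, 1) ~ r' and (r, 0) ~ r''
    sink_parity   : dict sink -> required number of inverting buffers mod 2
    """
    new_edges = []
    for u, v in edges:
        if u == source:
            parities = [0]            # no buffers used yet at the source
        elif u in buffer_blocks:
            parities = [0, 1]         # both copies of a BB vertex exist
        else:
            continue                  # assumption: sinks have no outgoing edges
        for p in parities:
            if v in buffer_blocks:
                # entering a buffer block adds one buffer, so parity flips
                new_edges.append(((u, p), (v, 1 - p)))
            elif sink_parity.get(v) == p:
                # a sink is reachable only with its required parity
                new_edges.append(((u, p), (v, p)))
    # the capacity of r is imposed on the set of its two copies
    capacity_sets = {r: {(r, 0), (r, 1)} for r in buffer_blocks}
    return new_edges, capacity_sets
```

With a sink that needs an even number of buffers, a one-buffer path s → a → t is cut off in the expanded graph while the two-buffer path s → a → b → t survives.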

11-13 Combining with compaction
Set capacity constraints: cap(BB1) + cap(BB2) ≤ const.

14 GRBB with Buffer Library
Discrete buffer library: different buffer sizes/driving strengths
⇒ Need to allocate BB capacity between different buffer types
Integer program changes
– Replace each BB vertex r of G by a set X(r) of vertices (one for each buffer type)
– Modify edge set of G to take into account non-uniform driving strengths
– Impose capacity constraint on the sets of vertices X(r)

15 “Relax+Round” Approach to GRBB
1. Solve the fractional relaxation
– Exact linear programming algorithms are impractical for large instances
– KEY IDEA: use an approximation algorithm, which allows fine-tuning the tradeoff between runtime and solution quality
2. Round to integer solution
– Provably good rounding [RT87]
– Practical runtime (random-walk based)

17 Separable Packing LP

18 Previous Work
MCF and packing/covering LP approximation: [FGK73, SM90, PST91, G92, GK94, KPST94, LMPSTT95, R95, Y95, GK98, F00, …]
– Exponential length function to model flow congestion [SM90]
– Shortest-path augmentation + final scaling [Y95]
– Modified routing increment [GK98]
– Fewer shortest-path augmentations [F00]
We extend the speed-up idea of [F00] to separable packing LPs

19 Separable Packing LP Algorithm
w(X) ← δ, f ← 0, α = δ
For i = 1 to N do
  For k = 1, …, #nets do
    Find min weight feasible Steiner tree T for net k
    While weight(T) < min{ 1, (1+ε)α } do
      f(T) = f(T) + 1
      For every X do w(X) ← ( 1 + ε …(T,X)/cap(X) ) * w(X) End For
      Find min weight feasible Steiner tree T for net k
    End While
  End For
  α = (1+ε)α
End For
Output f/N
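A minimal runnable sketch of this multiplicative-weight scheme, restricted to 2-pin nets so that the "min weight feasible Steiner tree" is a shortest path. Several things here are assumptions made for illustration, not the authors' implementation: every capacitated vertex is its own set X, a path uses one unit of each vertex it visits, the names `eps`, `delta` and `alpha` stand for the accuracy parameter, the initial weight and the lower bound on tree weight, and the final flow is divided by log_{1+eps}((1+eps)/delta) as in Garg and Könemann's analysis (slightly more conservative than f/N) so that capacities are respected exactly.

```python
import heapq
import math


def min_weight_path(adj, w, s, t):
    """Dijkstra with vertex weights; vertices missing from w cost 0."""
    dist = {s: w.get(s, 0.0)}
    prev = {}
    heap = [(dist[s], s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue                      # stale heap entry
        if u == t:
            break
        for v in adj.get(u, ()):
            nd = d + w.get(v, 0.0)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if t not in dist:
        return math.inf, None
    path = [t]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return dist[t], tuple(reversed(path))


def approx_packing_flow(adj, cap, nets, eps=0.1, delta=0.01):
    """Approximate fractional routing; returns {(net index, path): flow}."""
    w = {x: delta for x in cap}           # w(X) <- delta
    flow = {}                             # f <- 0
    alpha = delta                         # lower bound on any path weight
    n_phases = math.ceil(math.log(1.0 / delta) / math.log(1.0 + eps))
    for _ in range(n_phases):
        for k, (s, t) in enumerate(nets):
            weight, path = min_weight_path(adj, w, s, t)
            while weight < min(1.0, (1.0 + eps) * alpha):
                if weight <= 0.0:
                    raise ValueError("path crosses no capacitated vertex")
                flow[(k, path)] = flow.get((k, path), 0) + 1
                for x in path:
                    if x in cap:          # one unit of x used by this path
                        w[x] *= 1.0 + eps / cap[x]
                weight, path = min_weight_path(adj, w, s, t)
        alpha *= 1.0 + eps
    scale = math.log((1.0 + eps) / delta) / math.log(1.0 + eps)
    return {key: val / scale for key, val in flow.items()}
```

Giving each source a capacity of 1 is one simple way to express that a net is routed at most once; smaller `eps` trades runtime for solution quality, which is the tuning knob the slides refer to.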

21 Runtime
Dual LP:
Choose #iterations N such that all feasible trees have weight ≥ 1 after N iterations (i.e., α ≥ 1)
Tree weight lower bound is δ initially, and is multiplied by (1+ε) in each iteration
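The iteration count follows from the geometric growth of the lower bound. A short derivation, assuming the usual notation of δ for the initial lower bound on tree weight and ε for the per-iteration growth rate (the symbol names are an assumption here):

```latex
\delta\,(1+\varepsilon)^{N} \ge 1
\;\Longleftrightarrow\;
N \ge \frac{\ln(1/\delta)}{\ln(1+\varepsilon)}
\quad\Longrightarrow\quad
N = \left\lceil \log_{1+\varepsilon} \frac{1}{\delta} \right\rceil .
```

For example, δ = 0.01 and ε = 0.1 give N = 49, since ln(100)/ln(1.1) ≈ 48.3.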

22 Approximation Guarantee
Theorem: For every ε < .15, the algorithm finds a factor 1/(1+4ε) approximation by choosing δ = …, where L is the maximum number of vertices in a feasible Steiner tree. For this value of δ, the running time is …

24 Implementation choices
                 2-Pin                         3,4-pin                                        Multi-pin
Decomposition    Star, Minimum Spanning tree   Matching, 3-restricted Steiner tree            Not needed
Min-weight DRST  Shortest path (exact)         Try all Steiner pts + shortest paths (exact)   Very hard! ⇒ heuristics
Rounding         Random-walk                   Backward random-walks

25 Provably Good Rounding
1. Store fractional flows f(T) for every feasible Steiner tree T
2. Scale down each f(T) by 1−ε for small ε
3. Each net k routed with prob. f(k) = Σ { f(T) | T feasible for k }
⇒ Number of routed nets ≥ (1−ε)OPT
4. To route net k, choose tree T with probability f(T) / f(k)
⇒ With high probability, no BB capacity is exceeded
Problem: Impractical to store all non-zero flow trees

26 Random-Walk 2-TMCF Rounding
1. Store fractional flows f(T) for every valid routing tree T
2. Scale down each f(T) by 1−ε for small ε
3. Each net k routed with prob. f(k) = Σ { f(T) | T routing for k }
⇒ Number of routed nets ≥ (1−ε)OPT
4. To route net k, choose tree T with probability f(T) / f(k)
⇒ With high probability, no BB capacity is exceeded
Use a random walk from source to sink
Practical: random walk requires storing only flows on edges
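A small sketch of random-walk rounding for a single 2-pin net, showing why only per-edge flows need to be stored. It assumes the net's edge flows satisfy flow conservation (so the walk always reaches the sink); the function name `random_walk_round` and its parameters are hypothetical, chosen for illustration.

```python
import random


def random_walk_round(edge_flow, source, sink, eps=0.0, rng=random):
    """Round one net: return a source-sink path, or None if the net is dropped.

    edge_flow : dict (u, v) -> fractional flow of this net on edge (u, v)
    eps       : scale-down parameter; the net is routed with prob. (1-eps)*f(k)
    """
    out = {}
    for (u, v), f in edge_flow.items():
        if f > 0:
            out.setdefault(u, []).append((v, f))
    # f(k) is the total flow leaving the source
    f_k = sum(f for _, f in out.get(source, []))
    if rng.random() >= (1.0 - eps) * f_k:
        return None
    path = [source]
    while path[-1] != sink:
        successors = out[path[-1]]
        vertices = [v for v, _ in successors]
        weights = [f for _, f in successors]
        # step along an outgoing edge with probability proportional to its flow
        path.append(rng.choices(vertices, weights=weights, k=1)[0])
    return path
```

On an acyclic flow, each path is selected with probability equal to its share f(T)/f(k) in a path decomposition, without ever materializing that decomposition.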

27-28 Random-Walk MTMCF Rounding
[Figure: random walks between source S and sinks T1, T2, T3]

29 The MTMCF Rounding Heuristic
1. Round each net k with probability f(k), using backward random walks
– No scaling-down, approximate MTMCF < OPT
2. Resolve capacity violations by greedily deleting routed paths
– Few violations
3. Greedily route remaining nets using unused BB capacity
– Further routing still possible

30 Implemented Heuristics
Greedy buffered routing:
1. For each net, route sinks sequentially along shortest paths to source or node already connected to source
2. After routing a net, remove fully used BBs
Generalized MCF approximation + randomized rounding
– G2TMCF
– G3TMCF (3-pin decomposition)
– G4TMCF (4-pin decomposition)
– GMTMCF (no decomposition, approximate DRST)
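The greedy baseline can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation that was benchmarked: path length is measured in hops (breadth-first search), wirelength and buffer-count bounds are assumed to be already encoded in the edge set, each buffer block in a net's tree consumes one buffer, and a net consumes no capacity unless all of its sinks are reached. The name `greedy_route` and the input layout are hypothetical.

```python
from collections import deque


def greedy_route(adj, cap, nets):
    """adj: u -> list of vertices u can drive; cap: BB -> #buffers;
    nets: name -> (source, [sinks]). Returns (routed trees, remaining caps)."""
    remaining = dict(cap)
    routed = {}
    for name, (source, sinks) in nets.items():
        parent = {source: None}              # routing tree built so far
        complete = True
        for t in sinks:
            # multi-source BFS from every vertex already connected to the source
            pred = {v: None for v in parent}
            queue = deque(parent)
            while queue and t not in pred:
                u = queue.popleft()
                for v in adj.get(u, ()):
                    if v in pred:
                        continue
                    if v != t and remaining.get(v, 0) <= 0:
                        continue             # not a BB with free capacity
                    pred[v] = u
                    queue.append(v)
            if t not in pred:
                complete = False
                break
            v = t
            while v not in parent:           # graft the new branch onto the tree
                parent[v] = pred[v]
                v = pred[v]
        if complete:
            for v in parent:
                if v in remaining:
                    remaining[v] -= 1        # fully used BBs drop to 0 and are skipped
            routed[name] = parent
    return routed, remaining
```

Because nets are processed in a fixed order and never revisited, an early net can exhaust a buffer block that a later net needs, which is the weakness the flow-based heuristics address.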

31 Experimental Setup
Test instances extracted from next-generation SGI microprocessor
Up to 5,000 nets, ~6,000 sinks
U=4,000 µm, L=500-2,000 µm
50 buffer blocks
buffers / BB

32 % Routed Nets vs. Runtime

33 Conclusions and Ongoing Work
Provably good algorithms and practical heuristics based on separable packing LP approximation
– Higher completion rates than previous algorithms
Extensions:
– Combine global buffering with BB planning
– Buffer “site” methodology ⇒ tile graph
– Routing congestion (channel capacity constraints)
– Simultaneous pin assignment

35 % Sinks Connected
[Table: rows #sinks/#nets; columns Greedy, G2TMCF, G3TMCF, G4TMCF, GMTMCF, each flow-based method at ε=.64 and ε=.04]

36 Runtime (sec.)
[Table: rows #sinks/#nets; columns Greedy, G2TMCF, G3TMCF, G4TMCF, GMTMCF, each flow-based method at ε=.64 and ε=.04]

37 Resource Usage
                   Greedy  G2TMCF          G3TMCF          G4TMCF          GMTMCF
                           ε=.64   ε=.04   ε=.64   ε=.04   ε=.64   ε=.04   ε=.64   ε=.04
# Conn. Sinks      5,645   5,725   5,842   5,779   5,896   5,827   5,942   5,813   5,897
% Conn. Sinks
WL (meters)
WL/sink (microns)  7,479   7,891   8,182   7,697   8,083   7,582   7,992   7,798   8,057
#Buff              9,037   9,860   10,676  9,591   10,610  9,497   10,507  9,860   10,647
#Buff/sink
#nets = 4,764 #sinks = 6, buffers/BB

38 Resource Usage for 100% Completion
Greedy ε=.04 4TMCF, ε=.04
#buffers/BB: 1,000 or INF ,000 INF
WL (meters):
WL/sink (microns): 7,931 8,191 8,212 8,278 8,513
#Buff: 10,330 11,079 11,115 11,
#Buff/sink:
#nets = 4,764 #sinks = 6,038
MTMCF wastes routing resources!