6/19/2016 1 VLSI Physical Design Automation Prof. David Pan Office: ACES 5.434 Placement (3)

Presentation transcript:

1 6/19/2016 VLSI Physical Design Automation Prof. David Pan Office: ACES 5.434 Placement (3)

2 6/19/2016 Outline Wire length driven placement Main methods –Simulated Annealing –Partition-based methods –Analytical methods Timing and congestion consideration during placement Newer trends

3 6/19/2016 Timing Cost The delay of the circuit is defined as the longest delay among all possible paths from primary inputs to primary outputs. Interconnect delay becomes more and more important in the deep sub-micron regime. [Figure: circuit with the critical path highlighted]

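4 6/19/2016 Timing Analysis [Figure: two panels of the same netlist between primary inputs (PI) and primary outputs (PO1 to PO3); the first shows the netlist with the delay of each gate, the second shows the resulting arrival times]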

5 6/19/2016 Timing Analysis slack = required time - arrival time [Figure: the same netlist annotated with arrival time/required time pairs: 0/4 0/0 0/8 1/5 3/3 1/9 7/9 9/9 7/15 7/13 13/15 15/15 14/18 18/22 22/22 18/22]
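The arrival time, required time, and slack computation on these slides can be sketched as follows. This is a minimal illustration, not the deck's own code: the function name `timing_analysis`, the tiny example netlist, and the convention that times are measured at each node's output are all assumptions. Interconnect delay is ignored here.

```python
from collections import defaultdict

def timing_analysis(gate_delay, edges, required_at_outputs):
    """gate_delay: node -> delay, edges: (driver, sink) pairs,
    required_at_outputs: primary output -> required time."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    # Topological order (Kahn's algorithm); the list grows while we scan it.
    indeg = {n: len(pred[n]) for n in gate_delay}
    order = [n for n in gate_delay if indeg[n] == 0]
    for n in order:
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                order.append(m)

    # Forward pass: arrival time = own delay + latest arriving input.
    arrival = {}
    for n in order:
        arrival[n] = gate_delay[n] + max((arrival[p] for p in pred[n]), default=0)

    # Backward pass: required time = earliest requirement imposed by any fanout.
    required = {}
    for n in reversed(order):
        if succ[n]:
            required[n] = min(required[s] - gate_delay[s] for s in succ[n])
        else:
            required[n] = required_at_outputs[n]

    slack = {n: required[n] - arrival[n] for n in order}
    return arrival, required, slack

# Hypothetical example netlist (not the one drawn on the slide).
delays = {"PI1": 0, "PI2": 0, "g1": 3, "g2": 1, "g3": 4, "PO1": 0, "PO2": 0}
wires = [("PI1", "g1"), ("PI2", "g1"), ("PI2", "g2"),
         ("g1", "g3"), ("g2", "g3"), ("g3", "PO1"), ("g2", "PO2")]
arrival, required, slack = timing_analysis(delays, wires, {"PO1": 7, "PO2": 7})
```

In this example the path PI1, g1, g3, PO1 has zero slack everywhere, so it is the critical path, while g2 has slack to spare.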

6 6/19/2016 Timing Analysis Another example, with interconnect delay: the same timing analysis applies [Figure: latch-to-latch paths]

7 6/19/2016 Timing Driven Placement Approaches Path-based –Most accurate information –Very slow Budgeting –Inaccurate information –Hard to budget –Fast Net-based approach –Net-weighting

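8 6/19/2016 Net-Weighting Basic approach –For more timing critical nets (i.e., smaller slack), assign higher net weights –Minimize the weighted wire length $\sum_i w(i)\,L(i)$, where $w(i)$ is the weight of net $i$ and $L(i)$ is its wire length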
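A minimal sketch of the net-weighting idea: nets with smaller slack get larger weights, and the placer's objective becomes a weighted sum of net lengths. The linear weighting rule, the parameters `base` and `alpha`, and the use of half-perimeter wire length are illustrative assumptions, not the formula from the slides.

```python
def slack_based_weights(slack, base=1.0, alpha=0.5):
    """Hypothetical rule: the weight grows linearly with negative slack."""
    return {net: base + alpha * max(0.0, -s) for net, s in slack.items()}

def hpwl(pins):
    """Half-perimeter wire length of a net given its (x, y) pin positions."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def weighted_wirelength(net_pins, weights):
    """Objective a net-weighting placer would minimize."""
    return sum(weights[net] * hpwl(pins) for net, pins in net_pins.items())

# Hypothetical data: n1 violates timing by 2 units, n2 has positive slack.
weights = slack_based_weights({"n1": -2.0, "n2": 1.0})
cost = weighted_wirelength(
    {"n1": [(0, 0), (3, 4)], "n2": [(1, 1), (2, 1), (1, 5)]}, weights
)
```

Here n1 gets weight 2.0 and n2 keeps weight 1.0, so shortening n1 pays off twice as much in the objective.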

9 6/19/2016 H. Ren, D. Z. Pan and D. S. Kung, ISPD-04: Sensitivity Guided Netweighting for Placement Driven Synthesis

10 6/19/2016 Figure of Merit (FOM) FOM is the total slack difference compared to a certain slack threshold, summed over all timing end points. It can be interpreted as the amount of work left for the physical synthesis engine, or for the designers to fix manually. FOM and WNS (worst negative slack) are the two most important metrics for timing closure in modern physical synthesis. However, FOM was not previously used to guide placement explicitly.

11 6/19/2016 Sensitivity Definitions Net length sensitivity to net weight; net delay sensitivity to net length; net slack sensitivity to net weight; FOM sensitivity to net delay; FOM sensitivity to net weight

12 6/19/2016 Closed-Form Sensitivity Closed-form expressions are given for the net length to weight sensitivity and for the delay to wire length sensitivity. Switch-level RC and Elmore delay are used to illustrate the concept: good enough during placement, and can be extended to more accurate models.
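As a worked illustration of a delay to wire length sensitivity under the Elmore model, assume a net is a single wire of length $L$ with unit-length resistance $r$ and capacitance $c$, driven through a resistance $R_d$ into a load capacitance $C_l$. These symbols and the single-wire model are assumptions made for this sketch; the slide's own expression is not preserved in the transcript.

```latex
T(L) = R_d \,(c L + C_l) + r L \left( \tfrac{1}{2} c L + C_l \right)

\frac{\partial T}{\partial L} = R_d\, c + r c L + r\, C_l
```

The sensitivity grows with $L$, so under this model shortening a long net reduces its delay more than shortening a short one by the same amount.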

13 6/19/2016 FOM to Net Delay Sensitivity Question: suppose the delay of net i is reduced by a small amount ΔT(i); what is the impact on FOM? Define K(i) to be the number of timing end points whose slack will change due to ΔT(i). Then we have the following theorem.

14 6/19/2016 K(i) Computation Process nets in topologically sorted order from PO to PI. Only propagate K(i) to the most timing critical input pin. [Figure: gates A, B, C, D driving end points Po1 and Po2, with nets annotated by (slack, K(i)) pairs: (-3, 1), (-3, 2), (-0.8, 0), (-1.2, 1)]
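The propagation rule on this slide can be sketched as follows: walk the gates from the outputs back to the inputs and hand each gate's output count only to its most timing critical input. The data layout, the function name `compute_k`, and the example circuit are assumptions for illustration and do not reproduce the circuit drawn on the slide.

```python
def compute_k(gates, slack, endpoint_nets, threshold=0.0):
    """gates: (input_nets, output_net) pairs listed from the PO side to the PI side.
    slack: net -> slack. endpoint_nets: one entry per timing end point,
    naming the net that drives it."""
    k = {net: 0 for net in slack}

    # Each end point that misses the threshold counts once on its driving net.
    for net in endpoint_nets:
        if slack[net] < threshold:
            k[net] += 1

    # Reverse topological sweep: pass K only to the most critical input.
    for inputs, output in gates:
        critical = min(inputs, key=lambda n: slack[n])
        k[critical] += k[output]
    return k

# Hypothetical circuit: nets a, b feed a gate driving c; c fans out to two
# gates whose outputs e and f drive the two end points; d is a side input.
slack = {"a": -3.0, "b": -1.2, "c": -3.0, "d": -0.8, "e": -3.0, "f": -3.0}
gates = [(["c", "d"], "e"), (["c"], "f"), (["a", "b"], "c")]
k = compute_k(gates, slack, endpoint_nets=["e", "f"])
```

Net c ends up with K = 2 because reducing its delay improves both end points, while the non-critical side inputs b and d get K = 0.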

15 6/19/2016 Net Weight Generation Put these sensitivities together and generate new net weight

16 6/19/2016 Experiments We compare the placement and physical synthesis results of three different algorithms on 7 industry chips (up to 444k movable objects) from IBM –WL: wire length driven placement with uniform weight –TS: timing driven placement using slack sensitivity –TSF: timing driven placement using both slack and FOM sensitivity

17 6/19/2016 Timing after Placement

18 6/19/2016 Timing after Physical Synthesis

19 6/19/2016 Outline Wire length driven placement Main methods –Simulated Annealing –Partition-based methods –Analytical methods Timing and congestion consideration Newer trends

20 6/19/2016 Congestion Minimization The traditional placement problem is to minimize interconnection length (wirelength). A valid placement has to be routable. Congestion is important because it represents routability (lower congestion implies better routability). There is not yet enough research work on the congestion minimization problem.

21 6/19/2016 Definition of Congestion Overflow on each edge = Routing Demand - Routing Supply (if Routing Demand > Routing Supply), 0 (otherwise). Total overflow = $\sum_{\text{all edges}} \text{overflow}$. Example: routing demand = 3; assuming routing supply is 1, overflow = 3 - 1 = 2 on this edge.
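The overflow definition translates directly into a few lines of code. The edge names and the demand and supply values below are hypothetical; only the first edge mirrors the slide's example of demand 3 against supply 1.

```python
def edge_overflow(demand, supply):
    """Demand in excess of supply on one routing edge, never negative."""
    return max(demand - supply, 0)

def total_overflow(demands, supplies):
    """Sum of the per-edge overflow over all routing edges."""
    return sum(edge_overflow(d, supplies[edge]) for edge, d in demands.items())

demands = {"e1": 3, "e2": 1, "e3": 0}   # routed nets crossing each edge
supplies = {"e1": 1, "e2": 1, "e3": 1}  # routing tracks available per edge
overflow = total_overflow(demands, supplies)
```

Unused capacity on e3 does not cancel the overflow on e1, which is why overflow rather than total demand is the routability measure.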

22 6/19/2016 Correlation between Wirelength and Congestion Total Wirelength = Total Routing Demand

23 6/19/2016 Wirelength ≠ Congestion [Figure: a congestion minimized placement next to a wirelength minimized placement]

24 6/19/2016 Congestion Map of a Wirelength Minimized Placement Congested Spots

25 6/19/2016 Congestion Reduction Postprocessing Reduce congestion globally by minimizing the traditional wirelength. Post process the wirelength optimized placement using the congestion objective.

26 6/19/2016 An Effective Congestion Driven Placement Framework André Rohe University of Bonn, Germany joint work with Ulrich Brenner ISPD 2002 (Best Paper)

27 6/19/2016 A Dense Placement: good wirelength, impossible to route

28 6/19/2016 Possible Solution: easy to route, bad wirelength/timing

29 6/19/2016 Congestion Driven Placement: easy to route + good wirelength, almost no extra computational effort!

30 6/19/2016 Overall Algorithm: Bonn Place Partitioning based approach Solves QP in each level, followed by partitioning Partitioning is done by quadrisection: circuits are partitioned with minimum movement (Vygen)

31 6/19/2016 Methods used for congestion driven placement Very fast congestion calculation Inflate circuits in congested regions Spreading inflated cells

32 6/19/2016 Congestion calculation Calculate a Steiner tree for each net. Probability estimation for each 2-point connection (similar to Hung & Flynn, Lou et al.)
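One common way to estimate routing usage for a 2-point connection is to assume that every shortest monotone route inside the bounding box is equally likely and to count the fraction that passes through each grid cell. The sketch below illustrates that general idea only; it is an assumption, not necessarily the estimator used in the Bonn framework, and the function name `usage_probability` is hypothetical.

```python
from math import comb

def usage_probability(dx, dy):
    """Two pins sit at grid cells (0, 0) and (dx, dy). Returns, for every cell
    in the bounding box, the probability that a uniformly chosen shortest
    monotone route passes through it."""
    total_routes = comb(dx + dy, dx)
    prob = {}
    for x in range(dx + 1):
        for y in range(dy + 1):
            routes_to_cell = comb(x + y, x)
            routes_from_cell = comb((dx - x) + (dy - y), dx - x)
            prob[(x, y)] = routes_to_cell * routes_from_cell / total_routes
    return prob

prob = usage_probability(1, 1)
```

Adding these probabilities for every 2-point connection of every net into a grid map gives an expected routing demand per cell without running a router.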

33 6/19/2016 Quality of congestion calculation congestion estimation

34 6/19/2016 Quality of congestion calculation Bonn Global HDP Global

35 6/19/2016 Inflation of circuits (used previously by Hou et al.) Initial inflation (based on pin density) Given a circuit c in Region R, c is inflated by up to 100% The inflation is based on the congestion in R and the surrounding regions & the pin density in R Deflation is possible if the circuit is no longer critical.

36 6/19/2016 Placement Step 0

37 6/19/2016 Placement Step 1

38 6/19/2016 Placement Step 2

39 6/19/2016 Placement Step 3

40 6/19/2016 Placement Step 4

41 6/19/2016 Placement Step 5

42 6/19/2016 Placement Step 6

43 6/19/2016 Placement Step 7

44 6/19/2016 Spreading inflated cells Repartitioning considers 2x2 windows in placement grid to optimize netlength Use extra repartitioning step to move cells away from overloaded regions

45 6/19/2016 Summary: Algorithm overview 1.Init: Set window_set := {chip area}, set circuit_list(chip area):={all circuits} 2.Main Loop: While (window size big enough) Solve a QP to minimize quadratic netlength For (each window w in window_set) Quadrisection(w) Repartitioning 3.Legalization

46 6/19/2016 Algorithm overview 1.Init: Set window_set := {chip area}, set circuit_list(chip area):={all circuits} For (each c in {all circuits}) Increase b(c) proportionally to |pins(c)|/size(c) # initial inflation b(c) 2.Main Loop: While (window size big enough) Solve a QP to minimize quadratic netlength For (each window w in window_set) Quadrisection(w) Repartitioning 3.Legalization

47 6/19/2016 Algorithm overview 1.Init: Set window_set := {chip area}, set circuit_list(chip area):={all circuits} For (each c in {all circuits}) Increase b(c) proportionally to |pins(c)|/size(c) # initial inflation b(c) 2.Main Loop: While (window size big enough) Solve a QP to minimize quadratic netlength For (each window w in window_set) Quadrisection(w) Compute congestion and update b(c) # update inflation b(c) Quadrisection(w) Repartitioning 3.Legalization

48 6/19/2016 Algorithm overview 1.Init: Set window_set := {chip area}, set circuit_list(chip area):={all circuits} For (each c in {all circuits}) Increase b(c) proportionally to |pins(c)|/size(c) # initial inflation b(c) 2.Main Loop: While (window size big enough) Solve a QP to minimize quadratic netlength For (each window w in window_set) Quadrisection(w) Compute congestion and update b(c) # update inflation b(c) Quadrisection(w) Reduce overloaded windows # extra repartitioning steps Repartitioning 3.Legalization

49 6/19/2016 Computational Results
Chip    Standard CPU   Standard len   Congestion Driven CPU   Congestion Driven len   Blow
IBM 1   0:23 h         7.2 m          0:26 h                  7.4 m                   10.2 %
IBM 2   0:26 h         7.9 m          0:27 h                  9.0 m                   6.6 %
IBM 3   3:50 h         134 m          4:39 h                  142 m                   20.1 %
IBM 4   7:08 h         241 m          7:24 h                  270 m                   20.2 %
IBM 5   16:10 h        375 m          16:37 h                 406 m                   57.8 %
Mean                                  +8.7 %                  +8.5 %

50 6/19/2016 Computational Results II
Columns: Chip; Standard (HDP, ov, CPU, len); Congestion Driven (HDP, ov, CPU, len)
IBM :15 h 9 m 75.500:05 h 7.5 m
IBM :19 h 11.5 m 75.400:05 h 10.1 m
IBM :36 h 162 m 77.304:51 h 164 m
IBM :18 h 324 m 75.202:48 h 326 m
IBM :57 h 512 m :48 h 527 m
Mean -9 % -73 % -5.2 %

51 6/19/2016 Summary In this module, we cover two important concepts during placement to consider besides wire length –Timing driven placement, using net-weighting A new sensitivity based net weighting in ISPD’04 paper –Congestion minimization (using ISPD’02 as an example) congestion estimation Inflate cells in congested region Spread inflated cells