A Clustering Utility Based Approach for ASIC Design. S. Areibi, M. Thompson, A. Vannelli, School of Engineering, uoguelph.ca. September 2001.

Presentation transcript:

A Clustering Utility Based Approach for ASIC Design
S. Areibi, M. Thompson, A. Vannelli
School of Engineering, uoguelph.ca
September 2001
14th ASIC/SOC 2001

Outline
This paper introduces several approaches for circuit clustering that are used for the hierarchical standard-cell VLSI placement problem.
- Introduction (VLSI Design Cycle)
- Motivation
- Circuit Layout (Placement Problem)
- Clustering-Based Standard Cell Placement
    - Weighted Hyper-edge Clustering
    - De-Clustering
    - Top Level Improvement
- Numerical Testing and Comparison
- Conclusions & Future Work

Introduction
Design automation is the task of automatically designing a circuit using software tools. The ultimate goal is to fully automate the tasks of designing, verifying, and testing a circuit. The VLSI design process is complicated, and the only feasible approach to solving the VLSI design problem is a divide-and-conquer strategy that splits it into separate tasks. One of these tasks is physical design, which is itself still very complex. Not surprisingly, this complexity is handled by dividing the physical design task into more tractable sub-tasks. One sub-task within physical design is placement, in which technology-mapped logic components are arranged on a chip.

Motivation
As we move to deep sub-micron designs below 0.18 microns, the delay of a circuit, as well as its power dissipation and area, is dominated by the interconnections between logical elements (i.e. transistors). To deal with the complexity of millions of components and to achieve a turnaround time of a couple of months, VLSI design tools must not only be computationally fast but also generate optimal layouts. Since the delay of a circuit cannot be ignored, work must still be done to reduce the area of placement and routing of very high performance designs. There is a great need for DA tools that operate in a reasonable amount of time, while still arriving at "reasonably good" solutions.

Phases of VLSI System Design

Physical Design Process
Physical design is a complex process; therefore, it is broken down into various sub-steps.

Layout Styles
The VLSI design process includes the logical and physical design of a circuit. The logical design of a circuit is independent of an implementation, while the physical design is inherently linked to the layout style of the target technology that will implement the desired behavior. The layout style dictates many design constraints for physical design, since it must be possible to fabricate the physical design in the desired technology. Different layout styles are chosen to trade off design properties and achieve some quality gain.

Layout Styles (cont.)
Different technologies that can implement a VLSI design.

Circuit Placement
Description: Given a set of modules and nets, assign the modules to legal positions within a placement area such that the interconnection cost is minimized. The set of modules in the network is denoted by $M$ and the set of nets by $N$. The modules in a net $n \in N$ are denoted by $M_n$, and the set of nets connected to a module $m \in M$ is denoted by $N_m$. In the standard-cell layout, cell modules are placed within $R$ parallel rows in the chip core area such that no cells in a row overlap and a maximum row length is not exceeded. The objective in VLSI circuit placement is to minimize the total wire-length
$\Phi(x, y) = \sum_{i,j} W_{ij} \left[ (x_i - x_j)^2 + (y_i - y_j)^2 \right]$
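The quadratic wire-length objective above can be illustrated with a minimal Python sketch. The function name, the dictionary-based data layout, and the sample coordinates and weights are assumptions made for illustration; they do not come from the presentation.

```python
def quadratic_wirelength(positions, weights):
    """Sum W_ij * [(x_i - x_j)^2 + (y_i - y_j)^2] over connected module pairs.

    positions: dict mapping module name -> (x, y)   (hypothetical layout)
    weights:   dict mapping (i, j) pair -> W_ij     (hypothetical layout)
    """
    total = 0.0
    for (i, j), w in weights.items():
        xi, yi = positions[i]
        xj, yj = positions[j]
        total += w * ((xi - xj) ** 2 + (yi - yj) ** 2)
    return total


# Illustrative data only.
positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 0.0)}
weights = {("a", "b"): 1.0, ("a", "c"): 2.0}
wl = quadratic_wirelength(positions, weights)
```

Here the pair (a, b) contributes 1 × 25 and the pair (a, c) contributes 2 × 1.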

Interconnection Cost/HPWL
The cost (HPWL) is given by
$C_l(x) = \sum_{i=1}^{N} (H_i + V_i)$
where $N$ is the number of nets, and $H_i$ and $V_i$ are the spans of net $i$ in the horizontal and vertical directions, respectively.
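As a small worked example of the half-perimeter cost, the sketch below computes the horizontal and vertical span of each net from the positions of its cells. The function name and the data layout (a list of nets, each a list of cell names, plus a position dictionary) are assumptions for illustration.

```python
def hpwl(nets, positions):
    """Half-perimeter wire-length: sum over nets of (H_i + V_i)."""
    total = 0.0
    for cells in nets:
        xs = [positions[c][0] for c in cells]
        ys = [positions[c][1] for c in cells]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))  # H_i + V_i
    return total


# Illustrative data only.
positions = {"a": (0, 0), "b": (4, 1), "c": (2, 5)}
nets = [["a", "b"], ["a", "b", "c"]]
cost = hpwl(nets, positions)
```

The first net spans 4 horizontally and 1 vertically; the second spans 4 and 5.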

Overlap Penalty
The cost $C_O(x)$ is the overlap penalty function, given by
$C_O(x) = p_O \sum_{i<j} O(i, j)$
where $p_O$ is a penalty parameter. The function $O(i, j)$ returns the total amount of overlap area between cells $i$ and $j$. By checking every other cell on the same row as cell $i$, it can certainly be determined which of these cells overlap with cell $i$. However, the complexity of this check is $O(M_i)$, where $M_i$ is the number of cells in the row, so the time spent on overlap computation can be substantial. Another way to compute $C_O(x)$ is to search only within a distance $W_{max}$ (the maximum cell width) to the left and to the right of cell $i$; all cells that overlap with cell $i$ can be found inside this window.
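The windowed search described above can be sketched as follows. This is one possible reading, not the authors' implementation: it assumes cells on a row are given as `(x_left, width)` pairs, that all cells share a common `height`, and that sorting by left edge plus binary search is an acceptable way to locate the window. All names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def overlap_penalty(rows, p_o, height=1.0):
    """C_O = p_O * sum over pairs i<j of overlap area O(i, j), row by row."""
    total = 0.0
    for cells in rows:
        cells = sorted(cells)                      # order by left edge
        lefts = [x for x, _ in cells]
        w_max = max((w for _, w in cells), default=0.0)
        for i, (xi, wi) in enumerate(cells):
            # Any cell overlapping cell i has its left edge within W_max of x_i.
            lo = bisect_left(lefts, xi - w_max)
            hi = bisect_right(lefts, xi + w_max)
            for j in range(lo, hi):
                if j <= i:
                    continue                       # count each pair once
                xj, wj = cells[j]
                width = min(xi + wi, xj + wj) - max(xi, xj)
                if width > 0:
                    total += width * height
    return p_o * total


# Illustrative data only: the first row has one overlap of width 1.
rows = [[(0, 4), (3, 2), (10, 3)], [(0, 2), (2, 2)]]
penalty = overlap_penalty(rows, p_o=2.0)
```

Cells that merely abut, as in the second row, contribute no overlap.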

Row Length Penalty
The cost $C_r(x)$ is the row length penalty function. It is given by
$C_r(x) = p_r \sum_{i=1}^{R} \lvert L_{ai} - L_{di} \rvert$
Here $p_r$ is a row penalty parameter, $R$ is the number of rows, and $L_{ai}$ and $L_{di}$ are the actual and desired row lengths for row $i$. $p_r = 5$ is approximately the smallest value that yields uniform row lengths without placing excessive emphasis on $C_r(x)$ in the objective function.
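A short sketch of the row length penalty, using the value $p_r = 5$ mentioned above as the default. The function name and the sample row lengths are assumptions for illustration.

```python
def row_length_penalty(actual_lengths, desired_lengths, p_r=5.0):
    """C_r = p_r * sum over rows of |L_a,i - L_d,i|."""
    return p_r * sum(abs(a - d) for a, d in zip(actual_lengths, desired_lengths))


# Illustrative data only: three rows with a desired length of 100 each.
penalty = row_length_penalty([100, 90, 110], [100, 100, 100])
```

Rows that are too short and rows that are too long are penalized equally.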

Approaches for Solving Placement
The most basic division among approaches is between exact solution methods and approximation methods (heuristics).

Advantages/Disadvantages of Approaches
The first approach to a placement problem is to solve it in a top-down fashion, by considering globally the best positions for cells in a placement. The more conventional approach is bottom-up iterative improvement, which attempts to find a good overall solution by considering the movement of one or a few cells at a time. A more recent approach that combines these techniques is called hierarchical improvement. It is a two-step procedure, first proceeding bottom-up and then top-down. The bottom-up technique is clustering (to reduce complexity), while the goal of the top-down method is to determine the locations of all clusters.

Multilevel Clustering Hierarchy
Early methods of clustering performed the desired circuit size reduction in a single level. Clustering in steps produces superior results, permitting more gradual de-clustering.

Weighted Hyper-edge Clustering
Our work is based on the Karypis (METIS) technique used for circuit partitioning. The major addition to simple hyper-edge clustering is the development of cluster size control. Our method is divided into passes:
1. In the first pass, cells on hyper-edges are greedily clustered together, but only if they are within width limits.
2. In the second pass, remaining cells on hyper-edges are also greedily clustered together.
3. A third pass is performed to assign any remaining cells to a new cluster.

Weighted Hyper-edge Clustering
Sort nets by increasing size
For each sorted net
    If no cell is clustered and sum of cell widths is within limits
        Cluster all cells on net
    End If
End For
For each sorted net
    If sum of un-clustered cell widths is within limits
        Cluster all un-clustered cells on net
    End If
End For
For each un-clustered cell in circuit
    Create a new cluster from cell
End For
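The three passes of the pseudocode above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the width limit is modeled as a single hypothetical `max_width` parameter, nets are lists of cell names, and any hyper-edge weighting beyond sorting nets by size is omitted.

```python
def whec_cluster(nets, widths, max_width):
    """Three-pass greedy clustering with a cluster width limit."""
    cluster_of = {}   # cell -> cluster index
    clusters = []     # list of clusters, each a list of cells

    def make_cluster(cells):
        cid = len(clusters)
        clusters.append(list(cells))
        for c in cells:
            cluster_of[c] = cid

    ordered = sorted(nets, key=len)  # sort nets by increasing size

    # Pass 1: cluster a whole net if none of its cells is clustered yet.
    for net in ordered:
        cells = list(dict.fromkeys(net))
        untouched = all(c not in cluster_of for c in cells)
        if untouched and sum(widths[c] for c in cells) <= max_width:
            make_cluster(cells)

    # Pass 2: cluster the still un-clustered cells of each net.
    for net in ordered:
        free = [c for c in dict.fromkeys(net) if c not in cluster_of]
        if free and sum(widths[c] for c in free) <= max_width:
            make_cluster(free)

    # Pass 3: every remaining cell becomes its own cluster.
    for c in widths:
        if c not in cluster_of:
            make_cluster([c])
    return clusters


# Illustrative data only.
widths = {"a": 2, "b": 2, "c": 3, "d": 1, "e": 6, "f": 1, "g": 2}
nets = [["a", "f", "g"], ["a", "b"], ["c", "d"], ["e", "f"]]
clusters = whec_cluster(nets, widths, max_width=5)
```

In this example the first pass forms {a, b} and {c, d}, the second pass groups the leftover cells f and g of the three-pin net, and the third pass leaves the wide cell e as a singleton.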

Flatten De-Clustering Heuristic
Previous methods for clustering-based placement flattened the circuit by placing the cells of a cluster randomly within the physical confines of the cluster at the previous hierarchical level. Since relative positions between cells in a cluster are not considered at any clustered level of the hierarchy, they are not implied at the flattened level, and some method must be used to determine legal relative positions for the flattened cells to occupy. To minimize the quality deterioration during circuit flattening, further improvement is performed on the circuit at each flattening stage, using a localized search heuristic. In our approach, results were obtained by using the ARP global placer at the top hierarchical level, without any iterative improvement, and then using only FLATTEN during de-clustering.

Circuit De-Clustering
The average position of all connected pins is calculated for each cell in a cluster. The cells within the cluster are then sorted by their average pin x-coordinate and given a relative order as they are flattened.
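The ordering step described above can be sketched as below. The sort by average connected-pin x-coordinate follows the slide; packing the cells left to right from the cluster's left edge, the fallback for cells with no connected pins, and all names are assumptions added for illustration.

```python
def flatten_cluster(cluster_cells, connected_pin_x, cluster_left, widths):
    """Order the cells of one cluster by average connected-pin x-coordinate."""

    def avg_x(cell):
        xs = connected_pin_x[cell]
        # Assumption: a cell with no connected pins falls back to the cluster edge.
        return sum(xs) / len(xs) if xs else cluster_left

    order = sorted(cluster_cells, key=avg_x)

    # Assumption: cells are packed left to right inside the cluster's span.
    placed, x = {}, cluster_left
    for cell in order:
        placed[cell] = x
        x += widths[cell]
    return order, placed


# Illustrative data only.
pins = {"u": [10, 20], "v": [2, 4], "w": [8]}
widths = {"u": 2, "v": 3, "w": 1}
order, placed = flatten_cluster(["u", "v", "w"], pins, 5.0, widths)
```

The averages are 15 for u, 3 for v, and 8 for w, so the relative order is v, w, u.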

Numerical Testing and Comparison
Test circuits used: MCNC '91 benchmarks, ten circuits ranging in size from 125 cells to over 25,000 cells.
Table columns: Circuit, Cells, Pads, Nets, Pins, Rows.
Circuits: Fract, Prim, Struct, Ind, Prim, Bio, Ind, Ind, Avq.s, Avq.l.

Clustering Depth
Effects of different clustering depths on solution quality using different clustering methods. Results clearly show that, for WHEC, three levels of clustering give good results for all sizes of circuits.

Circuit Size Reduction
As we increase the number of clustering levels, there is a gradual reduction in the number of cells and nets.

Circuit Size Reduction (contd.)
As we increase the number of clustering levels, there is a gradual reduction in the number of cells and nets.

Flatten De-Clustering Results: Small Benchmarks
[Table: 1.71 %, 3.08 %, 2.50 %, 0.09 %]

Flatten De-Clustering Results: Medium Benchmarks
[Table: 0.25 %, 0.51 %, 0.95 %, 0.43 %, 0.76 %]

Flatten De-Clustering Results: Large Benchmarks
[Table: 0.02 %, 0.34 %, 0.54 %, 0.14 %, 0.40 %, 7.67 %, 0.97]

Wire-Length Comparison: Small Benchmarks

Wire-Length Comparison: Medium Benchmarks

Wire-Length Comparison: Large Benchmarks

Run Time Comparison: Small Benchmarks

Run Time Comparison: Medium Benchmarks

Run Time Comparison: Large Benchmarks

Conclusions
- ARP is very useful for initial solutions but requires further tuning.
- The WHEC method of clustering was shown to be very effective when applied to the standard-cell placement problem.
- Edge clustering and modified hyper-edge clustering were shown to perform well, but the size-limiting feature of WHEC provided results superior to the other methods.
- Average improvement using the FLATTEN heuristic was between 1% and 2% for most benchmarks. The improvement is small, but the heuristic has linear time complexity.
- Comparison with flat placement indicates that hierarchical placement based on the WHEC method reduced wire-length by 9% on average over all circuits, and the execution time was lower by 70%.

Future Work
Our future work will attempt to incorporate the utility into a stochastic heuristic or a hill-climbing technique such as Tabu Search. It also involves further investigation of the modified WHEC clustering technique, in addition to enabling the advanced features of ARP (the initial placer) by adaptively tuning its parameters.
For further information, contact me: Presentation, is available: