Design Hierarchy Guided Multilevel Circuit Partitioning


Design Hierarchy Guided Multilevel Circuit Partitioning Yongseok Cheon and D.F. Wong Department of Computer Sciences The University of Texas at Austin

Outline
Motivation & Contribution
Problem
Design hierarchy
Rent’s rule & Rent exponent
Our approach
  Design hierarchy guided clustering
  Design hierarchy guided ML partitioning
Experimental results

Motivation
Natural question: how to use design hierarchy for partitioning?
Effectiveness of multilevel partitioning
Similarity between the design hierarchy (DH) and the ML clustering tree

Contribution
Rent exponent as a quality indicator
Intelligent and systematic use of hierarchical logical grouping information for better partitioning
Partitioning results with higher quality and greater stability

Partitioning problem
Netlist hypergraph → partitioned hypergraph

Multilevel partitioning (as in hMetis)
(1) Multilevel clustering (coarsening)
(2) Initial partitioning
(3) Multilevel FM refinement with unclustering (uncoarsening)

DH guided ML partitioning

Design hierarchy
A hierarchical grouping that already carries some connectivity information
Rent’s rule is used to identify which hierarchical elements are good or bad in terms of physical connectivity

Rent’s rule
Rent’s rule & Rent exponent:
E = P · B^r
E = external pin count
B = # of cells inside
P = avg # of pins per cell
r = Rent exponent

Rent exponent
For a hierarchical element H, the Rent exponent of H is
r(H) = ln(E / P) / ln(|H|)
E = external pin count
I = internal pin count
P = avg # of pins per cell = (I + E) / |H|

Rent exponent
Small r → more strongly connected cells inside
Large r → more weakly connected cells inside
Example: r = ln(4/34)/ln 10 + 1 ≈ 0.07
Example: r = ln(15/25)/ln 10 + 1 ≈ 0.778
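The Rent exponent above reduces to a few lines of Python; this is a sketch, with an illustrative function name and argument order, assuming E, I, and |H| are known for each hierarchical element:

```python
import math

def rent_exponent(ext_pins, int_pins, num_cells):
    """Rent exponent r(H) of a hierarchical element H.

    From Rent's rule E = P * B^r with B = |H| cells and
    P = (I + E) / |H| average pins per cell, so
    r = ln(E / P) / ln(|H|).
    """
    p_avg = (int_pins + ext_pins) / num_cells   # P = (I + E) / |H|
    return math.log(ext_pins / p_avg) / math.log(num_cells)
```

For the second slide example (E = 15, I = 10, |H| = 10) this gives r ≈ 0.778, matching ln(15/25)/ln 10 + 1.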

Selective preservation of DH
Global Rent exponent r̄ = weighted average of the Rent exponents of all hierarchical elements in DH
A hierarchical element H is preserved or broken according to r(H):
If r(H) ≤ r̄: H will be used as a search scope for clustering of the cells inside H (positive scope)
If r(H) > r̄: H is removed from DH and the cells inside H can be freely clustered with outside cells (negative scope)
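A minimal sketch of the positive/negative split, assuming each element’s Rent exponent and cell count are precomputed (the dict-based interface is hypothetical, not the authors’ code):

```python
def split_scopes(rent_exp, sizes):
    """Classify hierarchical elements as positive (preserved) or
    negative (broken) scopes against the size-weighted global
    Rent exponent r_bar.

    rent_exp : element name -> Rent exponent r(H)
    sizes    : element name -> # of cells in H (used as weight)
    """
    total = sum(sizes.values())
    r_bar = sum(rent_exp[h] * sizes[h] for h in rent_exp) / total
    positive = {h for h, r in rent_exp.items() if r <= r_bar}
    negative = set(rent_exp) - positive
    return positive, negative
```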

Modification of DH
Remove all negative scopes from the design hierarchy D, giving the scope tree D’
H(v) (the parent of v in D’) serves as the clustering scope for v
[Figure: design hierarchy tree D with positive and negative scopes; scope tree D’ with H1, H2, H3, H4 and vertices u, v, where H(u) = H1 and H(v) = H2]

DH guided ML clustering
Input: bottommost hypergraph G1 and design hierarchy D
Output: k-level clustering tree C
Modify D to D’
do
  Perform cluster_one_level(Gk) with D’ → upper-level hypergraph Gk+1
  Update D’
  k = k + 1
until Gk is saturated

Global saturation
Saturation condition (stopping criterion):
# of vertices ≤ α, or
problem size reduction rate ≥ β
(α = 100, β = 0.9 in our experiments)
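The stopping rule is a one-liner; here reduction_rate is assumed to mean new size over old size after a clustering pass:

```python
def saturated(num_vertices, reduction_rate, alpha=100, beta=0.9):
    """Global stopping test for coarsening: the hypergraph is
    saturated when it is already small enough (<= alpha vertices)
    or a clustering pass barely shrank it (rate >= beta)."""
    return num_vertices <= alpha or reduction_rate >= beta
```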

Clustering scope
A hierarchical node serves as the clustering scope
For each anchor v, the best neighbor w to be matched with v is searched within H(v)
u is selected as an anchor before v if H(u) ⊂ H(v)
[Figure: scope tree D’ with nodes H1, H2, H3, H4 and vertices u, v]

Scope restricted clustering
cluster_one_level():
For a randomly selected unmatched vertex v, find the w within the scope H(v) that maximizes the clustering cost
Vertices with smaller scopes are selected as anchors earlier
Create a new upper-level cluster v’ from v and w
H(v’) := H(v), since H(v) ⊆ H(w)

Scope restricted clustering (cont’d)
cluster_one_level() continued:
If there is no best target w, create v’ from v alone
If w is already matched into cluster v’, append v to v’
The “unmatched” condition is relaxed: an already-matched neighbor w is also considered → more problem size reduction
H(v’) := H(v), since H(v) ⊆ H(v’)
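The two slides above can be sketched as a single coarsening pass. This is a simplified illustration, not the authors’ implementation: scopes are plain vertex sets, `cost` is any merge-gain function, and propagating scopes to the next level is omitted.

```python
def cluster_one_level(vertices, scope_of, neighbors, cost):
    """One scope-restricted coarsening pass.

    vertices  : iterable of vertex ids
    scope_of  : vertex -> its clustering scope H(v), as a set of
                vertices (smaller scopes are visited first)
    neighbors : vertex -> candidate merge partners
    cost      : cost(v, w), clustering gain to maximize
    Returns the list of next-level clusters (sets of vertices).
    """
    matched = {}    # vertex -> index of its cluster
    clusters = []   # clusters built so far
    # vertices with smaller scopes become anchors earlier
    for v in sorted(vertices, key=lambda x: len(scope_of[x])):
        if v in matched:
            continue
        cands = [w for w in neighbors[v] if w in scope_of[v] and w != v]
        if not cands:
            clusters.append({v})              # no target: singleton cluster
            matched[v] = len(clusters) - 1
            continue
        w = max(cands, key=lambda x: cost(v, x))
        if w in matched:                      # relaxed rule: w already matched,
            clusters[matched[w]].add(v)       # so append v to w's cluster
            matched[v] = matched[w]
        else:
            clusters.append({v, w})           # new upper-level cluster v'
            matched[v] = matched[w] = len(clusters) - 1
    return clusters
```

On a toy 4-vertex graph where vertex 3’s only candidate is the already-matched vertex 2, the relaxed rule folds 3 into {1, 2} and leaves 4 as a singleton.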

One level clustering
No reduction rate control, to take full advantage of the design hierarchy → aggressively reduced # of levels in the resulting clustering tree
Cluster sizes are controlled such that they cannot exceed μ = bal_ratio × total_size
Local saturation condition for scope X: # of vertices in X ≤ α(X), or size reduction rate in X ≥ β(X)

Scope tree restructuring
The scope tree is restructured after one level of clustering by removing saturated scopes
Enlarged clustering scopes are used at higher-level clustering, with bigger and fewer clusters
[Figure: scope tree D’ with H(u) = H1 and H(v) = H2; after one level of clustering H1 and H2 are saturated, so the restructured tree has H(u’) = H3 and H(v’) = H4]

DH guided ML partitioning: dhml
Perform Rent exponent computation on D
Apply DH guided ML clustering to obtain a k-level clustering tree C
At the coarsest level, execute 20 runs of FM and pick the best one
From the partition at level k down to level 0, apply unclustering and FM_partition to improve the partition from the upper levels

DH guided ML partitioning
Multi-way partitioning: dhml RBP
Recursive bi-partitioning
Partial design hierarchy trees are used at each sub-partitioning
Performance is compared with the hMetis RBP version

Experimental results
Circuit characteristics:

Circuit   # cells    # nets     levels / # hier nodes
Ind1        15186      19152     6 / 302
Ind2       136340     183340     9 / 10427
Ind3       224908     187595     5 / 57590
Ind4       414633     414013    13 / 94796
Ind5      1213105    1317889    13 / 33277
Ind6      1841147    2788461    11 / 35449

Experimental results
Cut set size comparison (minimum cut size from 5 runs of dhml vs. 10 runs of hMetis RBP)
Up to 16% better quality in half the # of runs

            2-way           16-way          256-way
Circuit   dhml  hMetis    dhml  hMetis    dhml   hMetis
Ind1        64      69     437     483       -        -
Ind2       133     134    1203    1294   14633    16137
Ind3       292     305    1454    1551    7450     7508
Ind4       202     208    3394    3498   12013    13999
Ind5      1376    1352    7410    7950   22474    24454
Ind6        55      56    8275    8265   33472    35075

Experimental results
Quality stability [figure]

Experimental results
Observations:
20-50% better quality in the initial partition at the coarsest level
Number of levels reduced to 55-75% of hMetis, while still producing up to 16% better cut quality
More stable cut quality, implying fewer runs are needed to obtain a near-best solution
Similar or slightly longer runtime than hMetis

Summary
A systematic ML partitioning method exploiting design hierarchy was presented
ML clustering guided by design hierarchy:
  Rent exponent
  Clustering scope restriction
  Dynamic scope restructuring
Experimental results show:
  A better clustering tree
  More stable and higher quality solutions