Processing Rate Optimization by Sequential System Floorplanning Jia Wang 1, Ping-Chih Wu 2, and Hai Zhou 1 1 Electrical Engineering & Computer Science.

Presentation transcript:

Processing Rate Optimization by Sequential System Floorplanning
Jia Wang (1), Ping-Chih Wu (2), and Hai Zhou (1)
(1) Electrical Engineering & Computer Science, Northwestern University, U.S.A.
(2) Cadence Design Systems Inc., U.S.A.

ISQED 2006

Motivation
Optimize the performance of a sequential system.
– Optimize the frequency (clock period).
    Minimal period retiming. ([4] Lin et al. ICCAD’03)
    Clock skew scheduling.
– When the frequency is given but cannot be met by the above methods.
    Global interconnects need to be pipelined while the functionality of the system should not change.
    Latency insensitive design (LIS). ([6] [7] Carloni et al. ICCAD’99, DAC’00)
    Wire-pipelining correcting method. ([5] Nookala et al. DAC’04)
    Throughput is traded off for frequency.
Optimizations are applied after delays are estimated.
– More optimization possibilities in floorplanning and placement when interconnect delays dominate.
    Placement driven by sequential timing. ([9] Hurst et al. ICCAD’04)
    Floorplanning for throughput. ([8] Casu et al. ISPD’04)

Processing Rate
How to measure the performance of a sequential system?
– Frequency? Throughput varies.
– Throughput? Frequency varies.
Use the processing rate to measure the performance.
– Defined as the length of the processed input sequence per unit time.
– Equal to frequency times throughput in a synchronous system.
An upper bound of the processing rate is derived as:
– G is the graph describing the sequential system. For a wire e, w(e) is the number of flip-flops on it and d(e) is its delay.
– Independent of the optimization methodology applied afterward.
– Independent of the operating frequency.
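The formula for the bound is not preserved in this transcript. Judging from the definitions of w(e) and d(e) on this slide, and from the minimum cycle ratio problem that the later bound-evaluation slide solves, it presumably has the form below. The symbols f (frequency), θ (throughput), and C (a cycle of G) are notation assumed for this sketch and are not taken from the slides.

```latex
\[
  \text{processing rate} \;=\; f \cdot \theta
  \;\le\;
  \min_{C \,\in\, \mathrm{cycles}(G)}
  \frac{\sum_{e \in C} w(e)}{\sum_{e \in C} d(e)}
\]
```

Read this way, a cycle holding w flip-flops with total wire delay d can complete at most w data items every d time units, whatever clock period is chosen, which would explain why the bound does not depend on the operating frequency.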

Floorplanning for Processing Rate (FPR)
Find a floorplan to maximize the upper bound.
– Intuitively, designs with larger bounds are superior to those with smaller bounds.
– Good fidelity between the bound and the processing rate makes our approach effective.
Optimizing the bound means that, at the floorplanning stage, it is
– Not necessary to determine what methodology to apply later.
– Not necessary to know the operating frequency.
This saves design time, since floorplanning need not be repeated for each subsequent optimization methodology and operating frequency.

Overview of the Floorplanning Algorithm
Simulated annealing (SA) based floorplanner.
Adjacent Constraint Graph (ACG) as the floorplan representation.
– A representation for general floorplans.
Common ACG perturbations that change the geometric relationships locally.
– Local changes enable incremental evaluation of the bound.
The cost function to be optimized includes the area of the floorplan and the processing rate upper bound.
When the floorplan is required to fit into a fixed outline, an outline cost is included in the cost function as well:
– W and H are the width and height of the current floorplan, respectively.
– W* and H* are the desired width and height, respectively.

Adjacent Constraint Graph (ACG)
A constraint graph containing both horizontal and vertical constraint edges that satisfies the following:
– Exactly one constraint relation between every pair of modules.
– No transitive edges.
– No crosses. A cross is an edge configuration that, if allowed, may introduce a quadratic number of edges to the graph.
A reduced ACG simplifies the ACG by removing a group of edges that can be inferred from other edges.
An example: (1) the floorplan, (2) its ACG, (3) its reduced ACG.

Direct Bound Evaluation
The minimum cycle ratio problem.
– Needs to be solved many times in simulated annealing (SA).
    More than 700K times for our largest benchmark.
    Previous work [8] only estimates the ratio in SA rather than computing it.
– Many polynomial-time algorithms are available.
– However, we choose Howard’s algorithm.
    Not proved to be polynomial-time, but among the fastest in practice.
Howard’s algorithm finds the ratio iteratively.
– Maintain a policy graph.
    A sub-graph of G in which exactly one edge starts from every vertex.
    Its minimum cycle ratio τ is obtained by enumerating its cycles.
    τ is an upper bound of the minimum cycle ratio of G.
– Check whether there is a negative cycle in G with edge weights w(e) − τ d(e).
    No. Then τ is the minimum cycle ratio.
    Yes. Build a new policy graph containing one such cycle.
– Keep a vertex labeling to interleave the above two steps.
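The negative-cycle test at the heart of this slide can be illustrated with a short sketch. This is not Howard's algorithm: it has no policy graph or vertex labeling, and simply repeats a Bellman-Ford negative-cycle search under the weights w(e) − τ d(e), lowering τ to the ratio of each cycle it finds. The function names, the edge tuple layout (u, v, w, d), and the assumptions of integer flip-flop counts and strictly positive delays are all choices made for this example.

```python
from fractions import Fraction


def find_negative_cycle(n, edges, tau):
    """Return the edges of a cycle whose sum of w - tau*d is negative, or None.

    Vertices are 0..n-1 and edges is a list of (u, v, w, d) tuples.
    Bellman-Ford with every distance starting at 0, so any cycle can be found.
    """
    dist = [Fraction(0)] * n
    pred = [None] * n  # pred[v]: index of the edge that last relaxed v
    last = None
    for _ in range(n):
        last = None
        for i, (u, v, w, d) in enumerate(edges):
            cand = dist[u] + w - tau * d
            if cand < dist[v]:
                dist[v], pred[v], last = cand, i, v
        if last is None:
            return None  # a full pass without relaxation: no negative cycle
    # A relaxation in the n-th pass implies a negative cycle. Walking back
    # n predecessor steps is guaranteed to land on it.
    v = last
    for _ in range(n):
        v = edges[pred[v]][0]
    cycle, u = [], v
    while True:
        e = edges[pred[u]]
        cycle.append(e)
        u = e[0]
        if u == v:
            return cycle


def min_cycle_ratio(n, edges):
    """Minimum of sum(w)/sum(d) over all cycles; None if the graph is acyclic.

    Assumes (for this sketch) integer w >= 0 and d > 0 on every edge.
    """
    if not edges:
        return None
    # Start strictly above the ratio of every possible cycle.
    tau = Fraction(sum(e[2] for e in edges) + 1) / min(e[3] for e in edges)
    best = None
    while True:
        cycle = find_negative_cycle(n, edges, tau)
        if cycle is None:
            return best
        # The cycle found has a strictly smaller ratio; adopt it as the new tau.
        tau = Fraction(sum(e[2] for e in cycle)) / sum(e[3] for e in cycle)
        best = tau
```

Exact `Fraction` arithmetic keeps the "strictly negative" test reliable, so τ strictly decreases on every round and the loop ends once no cycle beats the current ratio.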

Incremental Bound Evaluation
The initial policy graph in Howard’s algorithm:
– Constructed heuristically.
– Intuitively, an initial policy graph with a smaller τ tends to converge more quickly.
The floorplans in simulated annealing:
– ACG perturbations change the geometric relationships locally.
– Most likely, the cycle ratio will not change much across perturbations.
Reuse the final policy graph as the initial one for the floorplan after perturbation.
– Reduces running time by 29% on average.
– The time columns are in seconds.
– The #iter. columns give the total number of iterations in Howard’s algorithm.
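To make the reuse idea concrete, the sketch below evaluates a stored policy graph: since every vertex has exactly one outgoing edge, following the edges from any vertex must run into a cycle, and the smallest cycle ratio found is the starting τ. Re-evaluating the previous floorplan's policy with the delays of the perturbed floorplan would give that starting value directly. The dictionary layout and the function name are assumptions for illustration, not the authors' data structures.

```python
from fractions import Fraction


def policy_cycle_ratio(policy):
    """Minimum cycle ratio of a policy graph.

    policy maps each vertex to (next_vertex, w, d): its single outgoing
    edge, the flip-flop count on it, and its delay (assumed positive).
    Every next_vertex must itself be a key of policy.
    """
    best = None
    state = {}  # vertex -> 0 while on the current walk, 1 once finished
    for start in policy:
        if start in state:
            continue
        path = []
        v = start
        while v not in state:
            state[v] = 0
            path.append(v)
            v = policy[v][0]
        if state[v] == 0:
            # The walk ran into itself: the part of path from v on is a new cycle.
            cycle = path[path.index(v):]
            w = sum(policy[u][1] for u in cycle)
            d = sum(policy[u][2] for u in cycle)
            ratio = Fraction(w) / Fraction(d)
            if best is None or ratio < best:
                best = ratio
        for u in path:
            state[u] = 1
    return best


# Policy kept from the previous floorplan (hypothetical numbers).
old_policy = {0: (1, 1, 2), 1: (2, 1, 3), 2: (0, 1, 5)}
start_tau = policy_cycle_ratio(old_policy)
```

Because each vertex is visited once, the evaluation is linear in the number of vertices, which is cheap compared with the negative-cycle checks that follow it.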

Experiments
Six GSRC floorplanning benchmarks:
– n10, n30, n50, n100, n200, n300.
– Containing 10, 30, 50, 100, 200, and 300 modules, respectively.
Each n-pin net is decomposed into n−1 2-pin nets.
– The last pin of the net is treated as the source of the nets after decomposition.
– The other n−1 pins are sinks.
– Exactly one flip-flop on each net.
Wire delays are computed as Manhattan distances between pins.
– Pins are assumed to be at the centers of the modules.
Evaluate the processing rate under different operating frequencies for comparison with [8].
– Frequencies are modeled by the critical length.
    The distance that a signal travels in one clock cycle.
    30%, 50%, 70%, and 100% of the square root of the total module area.
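The experimental setup above can be sketched in a few lines. The function names, the representation of a net as a list of module names, and the reading that each decomposed 2-pin net carries one flip-flop are assumptions made for this example.

```python
import math


def decompose_nets(nets, centers):
    """Turn multi-pin nets into 2-pin edges (source, sink, w, d).

    nets: list of pin lists, each pin given as a module name.
    centers: module name -> (x, y) of the module center, where pins are assumed to sit.
    The last pin of each net is the source; w is fixed at one flip-flop
    and d is the Manhattan distance between the two module centers.
    """
    edges = []
    for pins in nets:
        source = pins[-1]
        sx, sy = centers[source]
        for sink in pins[:-1]:
            tx, ty = centers[sink]
            edges.append((source, sink, 1, abs(sx - tx) + abs(sy - ty)))
    return edges


def critical_length(module_areas, fraction):
    """Critical length as a fraction of the square root of the total module area."""
    return fraction * math.sqrt(sum(module_areas))


# Hypothetical three-module example.
centers = {"a": (0, 0), "b": (3, 4), "c": (1, 1)}
edges = decompose_nets([["a", "b", "c"]], centers)
lengths = [critical_length([4, 5], f) for f in (0.3, 0.5, 0.7, 1.0)]
```

A 3-pin net yields two edges here, both starting at the last pin, matching the n−1 rule on the slide.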

Experiments (Cont.)
Results are reported in the format 1−throughput / white space (%).
– The smaller the number, the better.
– Dominating solutions are highlighted.
One floorplan for all the frequencies in our approach.
– One floorplan for each frequency in [8].