Gate Sizing for Cell Library Based Designs Shiyan Hu*, Mahesh Ketkar**, Jiang Hu* *Dept of ECE, Texas A&M University **Intel Corporation.

2 Outline
Introduction
Motivation
Problem Formulation
Algorithms
– Continuous solution guided dynamic programming
– Node pruning and Stage pruning
– Locality Sensitive Hashing based pruning
Experimental Results
Conclusion

3 Gate Sizing Problem
Size a gate
– Gate power
– Driving resistance
– Input capacitance
Gate sizing problem
– Minimize power subject to timing constraint
Gate sizing for timing-power tradeoff

4 Continuous Gate Sizing
Previous works
– Fishburn and Dunlop, ICCAD85
– Sapatnekar, Rao, Vaidya, and Kang, TCAD93
– Chen, Chu, and Wong, TCAD99
Continuous problem formulation
– Minimize Area (Power)
– Subject to: Delay ≤ T, X_min ≤ X ≤ X_max
[Figure: gates 1, 2, 3 with sizes X1, X2, X3]

5 Motivation
Trend: cell library based design
– Discrete gate sizes
Need to round continuous gate sizes
Sparseness of the gate library causes big rounding error
– Timing violation

6 Nearest Rounding Does Not Work
Continuous solution by mathematical programming
Rounding continuous sizes to nearest discrete sizes
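
To make the rounding error concrete, here is a minimal sketch of nearest rounding against a sparse, geometrically spaced library. The spacing ratio of 2, the library size, and the sample continuous sizes are all assumptions chosen for illustration; the slides do not give them.

```python
def geometric_library(smallest: float, ratio: float, count: int) -> list[float]:
    """Build a geometrically spaced list of gate sizes (hypothetical library)."""
    return [smallest * ratio ** i for i in range(count)]


def round_to_nearest(continuous_size: float, library: list[float]) -> float:
    """Pick the library size closest to the continuous size."""
    return min(library, key=lambda size: abs(size - continuous_size))


# Assumed library: 1x, 2x, 4x, 8x, 16x.
library = geometric_library(1.0, 2.0, 5)

# Assumed continuous sizes from a mathematical programming solver.
continuous = [1.4, 2.9, 5.9, 11.0]

rounded = [round_to_nearest(x, library) for x in continuous]

# Relative rounding error per gate; each one exceeds 25% in this example.
errors = [abs(r - x) / x for r, x in zip(rounded, continuous)]

for x, r, e in zip(continuous, rounded, errors):
    print(f"continuous {x:5.1f} -> discrete {r:5.1f} (relative error {e:.0%})")
```

In this example every gate is rounded down, so every gate gets weaker at once, which is the kind of accumulated error that can turn a feasible continuous solution into a timing violation.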

7 Discrete Gate Sizing
Very few existing approaches
GS approach [Coudert, TVLSI97]
– Trial-and-error style algorithm
  Based on slacks, pick a group of gates for sizing
  Random perturbation
  Repeat until convergence
– Significant room for improvement

8 Our Choices
Discrete gate sizing is an integer programming problem
– Hard to solve for large circuits
Rounding?
– Poor solution quality
– Very fast
Dynamic programming?
– Best solution quality
– Computationally prohibitive

9 Our Idea
Dynamic programming based rounding
– Continuous solution guided dynamic programming
  Largely reduces the search space
  Keeps solution quality
– At each cell, try discrete gate sizes around the obtained continuous size
– For critical cells, try more gate sizes

10 Overall Flow
Circuit partitioning
Process stage by stage
– For each gate, size around the continuous solution and perform node pruning
– Stage pruning
– Locality sensitive hashing based pruning
Pick best solutions at the primary outputs (PO)

11 Circuit Partitioning
A cutline: prune solutions for acceleration
A stage: solution propagation

12 Dynamic Programming Based Rounding
Try gate sizes around continuous solution
For timing critical nodes, try more sizes
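
One way to read "try gate sizes around continuous solution" is as a window of library entries centered on the nearest discrete size, with a wider window for timing critical gates. The sketch below illustrates that reading only. The function name `candidate_sizes`, the window widths, and the sample library are assumptions, not values from the slides.

```python
def candidate_sizes(
    continuous_size: float,
    library: list[float],
    critical: bool,
    base_window: int = 1,      # assumed: one neighbor on each side
    critical_window: int = 2,  # assumed: two neighbors on each side
) -> list[float]:
    """Return the library sizes to try for one gate.

    The window is centered on the library entry nearest to the continuous
    size and is wider when the gate is timing critical.
    """
    nearest = min(
        range(len(library)),
        key=lambda i: abs(library[i] - continuous_size),
    )
    window = critical_window if critical else base_window
    low = max(0, nearest - window)
    high = min(len(library), nearest + window + 1)
    return library[low:high]


library = [1.0, 2.0, 4.0, 8.0, 16.0]  # assumed sorted library

normal = candidate_sizes(3.5, library, critical=False)
critical = candidate_sizes(3.5, library, critical=True)

print("non-critical gate tries:", normal)
print("critical gate tries:    ", critical)
```

Restricting each gate to a handful of candidates is what keeps the dynamic programming search space small compared with trying every library size at every gate.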

13 Pruning For Acceleration
Three types of pruning
– Node pruning
  Inside a stage
– Stage pruning
  At cutline
– Locality Sensitive Hashing based pruning
  At cutline

14 Node Pruning (I)
Solution Characterization
– A solution s is characterized by D(s) and W(s).
  D(s): maximum delay from any primary input to any processed gate
  W(s): cumulative gate area for all processed gates
Node Pruning
– Two solutions s1, s2
  s1 is pruned if
  – D(s1) ≥ D(s2): larger delay, and
  – W(s1) ≥ W(s2): larger area.

15 Node Pruning (II)
Solution 1: (D,W)=(11,4), pruned
Solution 2: (D,W)=(10,3)
[Figure: example gates with sizes 1x, 2x, 1x and delay D]
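
The node pruning rule can be sketched as a standard dominance filter over (D, W) pairs. This is an illustrative implementation, not the authors' code: the function name `prune_dominated` is hypothetical, and exact duplicates are collapsed to one copy as an assumption, since the rule as stated would otherwise let two identical solutions prune each other.

```python
def prune_dominated(solutions: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Keep only (delay, area) pairs that no other pair dominates.

    A pair s1 is dropped when some other pair s2 has
    delay(s2) <= delay(s1) and area(s2) <= area(s1).
    """
    kept = []
    best_area = float("inf")
    # After sorting by delay (then area), a pair survives only if its area
    # is strictly smaller than every pair with smaller or equal delay.
    for delay, area in sorted(set(solutions)):
        if area < best_area:
            kept.append((delay, area))
            best_area = area
    return kept


# The two solutions from the slide, plus two assumed extra candidates.
candidates = [(11, 4), (10, 3), (9, 5), (12, 2)]
survivors = prune_dominated(candidates)

print(survivors)
```

Here (11, 4) is dropped because (10, 3) is both faster and smaller, while (9, 5) and (12, 2) survive because each is better than the others in one of the two measures.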

16 Stage Pruning
Solution Characterization
– A solution s is characterized by f(s) and W(s).
  f(s) measures the proximity to the continuous solution
  – gate_i: discrete size, gate_i^c: continuous size
  W(s): cumulative gate area for all processed gates
Stage Pruning
– Two solutions s1, s2
  s1 is pruned if
  – f(s1) ≥ f(s2): farther from the continuous solution, and
  – W(s1) ≥ W(s2): larger area.

17 Locality Sensitive Hashing Based Pruning
Maintain diversity in solutions
– Do not spend time in checking similar solutions
How?
– Cluster solutions
– For each cluster, pick a few representative solutions for propagation

18 Solution Clustering
Each gate is a dimension
Coordinate = gate implementation ID
A large circuit means many dimensions
Efficient clustering needed
– Most existing approaches do not scale well with dimensionality

19 Locality Sensitive Hashing
For m solutions in d dimensions, clustering runs in only O(dm log m) time
– Linear in dimension
Idea:
– For a solution, concatenate coordinates in all d dimensions to a single string
– Map it to a much shorter one while preserving distance properties
– Many solutions give many short strings. Cluster them.

20 Solution 1
Gate sizes: 1x, 2x, 5x
Concatenate discrete gate sizes to form a string: 1, 2, 5
Unary representation: 00001,00011,11111

21 Solution 2
Gate sizes: 1x, 2x, 3x
Concatenate discrete gate sizes to form a string: 1, 2, 3
Unary representation: 00001,00011,00111
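
The unary encoding of the two example solutions can be reproduced with a few lines of code. This is a minimal sketch: the helper names are hypothetical, and the group width of 5 bits is taken from the strings shown on the slides. It also checks a property that makes unary encoding convenient here: the Hamming distance between two encoded strings equals the sum of absolute differences between the original size values.

```python
def unary(value: int, width: int) -> str:
    """Encode value as `value` ones, left-padded with zeros to `width` bits."""
    if not 0 <= value <= width:
        raise ValueError("value must be between 0 and width")
    return "0" * (width - value) + "1" * value


def encode(solution: tuple[int, ...], width: int) -> str:
    """Concatenate the unary codes of all gate sizes in a solution."""
    return "".join(unary(value, width) for value in solution)


def hamming(a: str, b: str) -> int:
    """Count positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


solution_1 = (1, 2, 5)
solution_2 = (1, 2, 3)

code_1 = encode(solution_1, width=5)
code_2 = encode(solution_2, width=5)

# Sum of absolute size differences between the two solutions.
l1_distance = sum(abs(a - b) for a, b in zip(solution_1, solution_2))

print(code_1)
print(code_2)
print("Hamming distance:", hamming(code_1, code_2), "L1 distance:", l1_distance)
```

Because distance between size vectors carries over to Hamming distance between bit strings, a hashing scheme designed for bit strings can be applied to gate sizing solutions.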

22 Hashing (I)
Solution 1: 00001,00011,11111
Solution 2: 00001,00011,00111
Randomly pick k=5 locations
Each solution maps to a shorter string

23 Hashing (II)
Same shorter strings hash to the same bucket
Indyk et al. prove that
– With large probability, geometrically close points are hashed together and geometrically far-apart points are hashed into different buckets.
– A bucket = a cluster.
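
The hashing step, picking k random locations and grouping solutions whose sampled bits match, can be sketched as follows. This is a simplified single-table illustration under stated assumptions: the function name `lsh_buckets`, the fixed random seed, and the third example string are hypothetical, and a production scheme would typically combine several independent tables.

```python
import random
from collections import defaultdict


def lsh_buckets(strings: list[str], k: int, seed: int = 0):
    """Group equal-length bit strings by their bits at k random positions.

    Returns the sampled positions and a dict mapping each shorter string
    to the indices of the solutions that hashed to it.
    """
    rng = random.Random(seed)  # assumed fixed seed, for repeatable runs
    length = len(strings[0])
    positions = sorted(rng.sample(range(length), k))
    buckets = defaultdict(list)
    for index, bits in enumerate(strings):
        key = "".join(bits[p] for p in positions)
        buckets[key].append(index)
    return positions, dict(buckets)


encoded = [
    "000010001111111",  # Solution 1: sizes 1, 2, 5 in unary
    "000010001100111",  # Solution 2: sizes 1, 2, 3 in unary
    "000010001111111",  # assumed third solution, identical to Solution 1
]

positions, buckets = lsh_buckets(encoded, k=5)

print("sampled positions:", positions)
for key, members in buckets.items():
    print(f"bucket {key}: solutions {members}")
```

Identical strings always share a bucket. Strings that differ in only a few positions, such as Solutions 1 and 2, share a bucket unless one of the differing positions happens to be sampled, which is why nearby solutions tend to cluster together.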

24 Experimental Setup
ISCAS85 benchmark circuits
x86 computer with 3.2 GHz CPU and 1 GB memory
130nm technology
10 geometrically spaced gate sizes per gate type
Compare to nearest rounding and GS approach

25 Comparison on Slack and Area
Slack for GS: 2ps - 21ps
Slack for ours: 1ps - 45ps
Area saving ratio over GS
Slack from rounding:
[Figure axes: Slack (ps), Area Reduction]

26 CPU Comparison

27 Observations
Nearest rounding introduces significant timing violations
Our algorithm saves 9%-31% area over GS while still improving slacks in many cases
Runtime of our algorithm is on average 1.7x that of GS.

28 Delay-Cost Tradeoff
Provide more choices to designers
Help users choose a better timing constraint for the circuit
Two continuous solutions are used to guide our approach, and two curves are obtained

29 Conclusion
Propose a dynamic programming algorithm for the discrete gate sizing problem.
– Reduce search space by continuous solution guidance.
– Node pruning, Stage pruning, and Locality Sensitive Hashing based pruning for improving runtime.
9%-31% area reduction compared to GS.
Future work: handling variations in our approach.