Subject : CAD For VLSI (7CS4) 1 Unit 5 Floor-planning, Placement & Routing.



Synthesis Flow 2 High-Level Synthesis Logic Synthesis Physical Design Fabrication and Packaging Figures adopted with permission from Prof. Ciesielski, UMASS

Physical Design 3 Circuit Design Partitioning Floorplanning & Placement Routing Fabrication

What is Backend? 4 Physical Design: 1. FloorPlanning : Architect’s job 2. Placement : Builder’s job 3. Routing : Electrician’s job At sub-micron level

So, what is Partitioning? 5 System Level Partitioning Board Level Partitioning Chip Level Partitioning System PCBs Chips Subcircuits / Blocks

Partitioning of a Circuit 6

Why Partitioning? 7 Since each partition can correspond to a chip, the interesting objectives are: (1) a minimum number of partitions, subject to a maximum size (area) for each partition; (2) a minimum number of interconnections between partitions, since these correspond to off-chip wiring with more delay and less reliability, and fewer interconnections mean a lower pin count on the ICs (more I/O pins mean much higher packaging cost); (3) balanced partitioning, given a bound on the area of each partition.

Circuit Representation 8 Netlist: gates A, B, C, D; nets {A,B,C}, {B,D}, {C,D}. Hypergraph: vertices A, B, C, D; hyperedges {A,B,C}, {B,D}, {C,D}. Vertex label: gate size/area. Hyperedge label: importance of the net (weight).
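The hypergraph view can be sketched in a few lines of Python. This is only an illustration: the function name `cut_size`, the unit net weights, and the example two-way partition are assumptions that do not come from the slides. It counts the nets that cross between partitions, which is the interconnection count a partitioner tries to minimize.

```python
# Netlist from the slide, stored as a hypergraph: each net is a set of gates.
nets = [{"A", "B", "C"}, {"B", "D"}, {"C", "D"}]

def cut_size(nets, part_of, weights=None):
    """Sum the weights of the nets whose gates fall in more than one partition."""
    weights = weights or [1] * len(nets)   # assumption: unit weight per net
    return sum(w for net, w in zip(nets, weights)
               if len({part_of[g] for g in net}) > 1)

# Hypothetical two-way partition: {A, B} on one chip, {C, D} on the other.
part_of = {"A": 0, "B": 0, "C": 1, "D": 1}
print(cut_size(nets, part_of))  # prints 2: nets {A,B,C} and {B,D} are cut
```

Moving B to the second partition would leave only net {A,B,C} cut, at the price of a less balanced partition.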

9 FloorPlanning

Floorplanning 10 The floorplanning problem is to plan the positions and shapes of the modules at the beginning of the design cycle to optimize the circuit performance: chip area, total wirelength, delay of the critical path, routability, and others such as noise and heat dissipation. Floorplanning also decides the I/O structure and the aspect ratio of the design. A bad floorplan leads to wasted die area and routing congestion.

Contd… 11 Floorplanning is a mapping between the logical description (the netlist) and the physical description (the floorplan). Floorplanning is the process of identifying structures that should be placed close together, and allocating space for them in such a manner as to meet the sometimes conflicting goals of available space (cost of the chip), required performance, and the desire to have everything close to everything else.

Goals and Objectives 12 Goals of floorplanning: arrange the blocks on a chip, decide the location of the I/O pads, decide the location and number of the power pads, decide the type of power distribution, and decide the location and type of clock distribution. Objectives of floorplanning: minimize the chip area and minimize delay.

Polar Graph Representation 13 A graph representation of a floorplan. Each floorplan is modeled by a pair of directed acyclic graphs: a horizontal polar graph and a vertical polar graph. For the horizontal (vertical) polar graph: a vertex is a vertical (horizontal) channel; an edge means that two channels lie on the two sides of a block; the edge weight is the width (height) of the block. Note: there are many other graph representations.
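One common use of this representation is that the longest path through the horizontal polar graph gives the chip width (and likewise the vertical graph gives the height). The sketch below assumes that interpretation; the channel names `L`, `M`, `R` and the block widths are hypothetical, and `longest_path` is an illustrative helper, not something defined in the slides.

```python
from collections import defaultdict

def longest_path(edges, source, sink):
    """Longest source-to-sink path in a DAG given as (u, v, weight) edges."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = set()
    for u, v, w in edges:
        succ[u].append((v, w))
        indeg[v] += 1
        nodes.update((u, v))
    dist = {n: float("-inf") for n in nodes}
    dist[source] = 0
    ready = [n for n in nodes if indeg[n] == 0]
    while ready:                      # process vertices in topological order
        u = ready.pop()
        for v, w in succ[u]:
            dist[v] = max(dist[v], dist[u] + w)
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return dist[sink]

# Hypothetical horizontal polar graph: vertical channels L, M, R.
# Each edge is a block; its weight is the block width.
edges = [("L", "M", 3), ("M", "R", 2), ("L", "R", 4)]
print(longest_path(edges, "L", "R"))  # prints 5: the chip must be 5 units wide
```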

Polar Graph: Example 14 Horizontal Polar Graph Vertical Polar Graph

Bounds on Aspect Ratios 15 If there is no bound on the aspect ratios, can we pack everything tightly? Sure! But we don't want to lay out blocks as long strips, so we require $r_i \le h_i / w_i \le s_i$ for each block $i$.

Slicing and Non-Slicing Floorplan 16 Slicing Floorplan: One that can be obtained by repetitively subdividing (slicing) rectangles horizontally or vertically. Non-Slicing Floorplan: One that may not be obtained by repetitively subdividing alone.

Representation of Slicing Floorplan [Figure: a slicing floorplan with blocks 1 to 7 and its slicing tree, whose internal nodes are H and V cuts.] Polish expression (postorder traversal of the slicing tree): 21H67V45VH3HV
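A Polish expression can be evaluated with a stack to get the bounding box of the floorplan it encodes. The sketch below makes several assumptions that the slide does not state: H is taken to mean a horizontal cut (the two parts are stacked vertically), V a vertical cut (the parts sit side by side), blocks have fixed sizes with no rotation, and every block is a hypothetical unit square.

```python
def bounding_box(expr, dims):
    """Evaluate a Polish expression bottom-up with a stack.

    expr: sequence of block names and 'H'/'V' operators.
    dims: block name -> (width, height).
    """
    stack = []
    for tok in expr:
        if tok in ("H", "V"):
            w2, h2 = stack.pop()
            w1, h1 = stack.pop()
            if tok == "H":    # horizontal cut: one part on top of the other
                stack.append((max(w1, w2), h1 + h2))
            else:             # vertical cut: parts side by side
                stack.append((w1 + w2, max(h1, h2)))
        else:
            stack.append(dims[tok])
    if len(stack) != 1:
        raise ValueError("not a valid Polish expression")
    return stack[0]

dims = {str(i): (1, 1) for i in range(1, 8)}   # hypothetical unit-square blocks
print(bounding_box("21H67V45VH3HV", dims))     # prints (3, 3)
```

With seven unit blocks in a 3 by 3 box, two unit cells are dead space; a floorplanner would compare such areas across candidate expressions.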

Skewed ST and Normalized PE 18 Skewed slicing tree: no node has the same operator as its right child. Normalized Polish expression: no consecutive H's or V's. [Figure: the same slicing floorplan drawn with two different slicing trees.] Skewed slicing tree, Polish expression: 21H67V45VH3HV. Non-skewed slicing tree, Polish expression: H67V45V3HHV
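The normalization condition is easy to test mechanically. A minimal sketch, assuming single-character block names and the operators H and V (the function name `is_normalized` is illustrative):

```python
def is_normalized(expr):
    """True if the expression never has the same operator twice in a row."""
    return all(not (a == b and a in "HV") for a, b in zip(expr, expr[1:]))

print(is_normalized("21H67V45VH3HV"))  # prints True
print(is_normalized("H67V45V3HHV"))    # prints False: contains "HH"
```

This only checks the "no consecutive H's or V's" rule; it does not check that the string is a well-formed postfix expression.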

Example of Moves V2H5V1H V4H5V1H V45HV1H V45VH1H M1 M3 M2

Channel Definition 20

Contd. … 21

I/O and Power Planning 22 The next step is to plan and create power and ground structures for both the I/O pads and the core logic. For the core logic, a core ring encloses the core with one or more sets of power and ground rings. The horizontal metal layer is used for the top and bottom sides of the ring, while the vertical metal layer is used for the left and right sides.

Contd. Internal core power and ground busses consist of one or two sets of wires or strips that repeat at regular intervals across the core logic, or a specified region, within the design. Each of these power and ground strips runs vertically, horizontally, or in both directions. If the strips run both vertically and horizontally at regular intervals, the style is known as a power mesh. As the ASIC core power consumption increases, the interval between power and ground strips typically decreases, since more strips are needed to carry the current.

Clock Planning 24 The idea behind clock distribution networks is to provide the clock to all clocked elements in the design in a symmetrically structured manner. The basic idea of manual implementation is to build a low-resistance, low-capacitance grid, similar to the power and ground mesh, that covers the entire logic core area.

Contd. It is essential to realize that clock grid networks consume a great deal of power because they are active all the time, and it may not be possible to make such networks uniform owing to floorplanning constraints (e.g. to spread the power dissipation evenly across the chip). Another aspect of clock planning is that it is well suited to hierarchical physical design. This type of clock distribution is manually crafted at the chip level, providing the clock to each sub-block, which is placed and routed individually.

26 Placement

Floorplanning vs. Placement 27 Both determine block positions to optimize the circuit performance. Floorplanning: details like the shapes of blocks, I/O pin positions, etc. are not yet fixed (blocks with flexible shape are called soft blocks). Placement: details like module shapes and I/O pin positions are fixed (blocks with no flexibility in shape are called hard blocks).

Importance of Placement 28 Placement is a key step in physical design. Poor placement consumes a large area and leads to a difficult or impossible routing task. An ill-placed layout cannot be improved by high-quality routing. Quality of placement is measured by: layout area, routability, and performance (usually timing, measured by the delay of the critical/longest net).

Placement Goals and Objectives 29 Goals: (1) Guarantee the router can complete the routing step (2) Minimize all the critical net delays (3) Make the chip as dense as possible Objectives: (1) Minimize power dissipation (2) Minimize crosstalk between signals

Problem formulation 30 Input: blocks (standard cells and macros) $B_1, \dots, B_n$; shapes and pin positions for each block $B_i$; nets $N_1, \dots, N_m$. Output: coordinates $(x_i, y_i)$ for each block $B_i$, such that there are no overlaps between blocks, the total wire length is minimized, and the area of the resulting layout is minimized (or a fixed die is given). Other considerations: timing, routability, clock, buffering, and interaction with physical synthesis.
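The slide does not say how total wire length is measured before routing. A common estimate is the half-perimeter wirelength (HPWL): for each net, the half perimeter of the bounding box of its blocks. The sketch below assumes that estimate, treats each block as a single point, and uses hypothetical block names and coordinates.

```python
def hpwl(nets, pos):
    """Total half-perimeter wirelength of a placement.

    nets: iterable of nets, each a collection of block names.
    pos:  block name -> (x, y) coordinate.
    """
    total = 0
    for net in nets:
        xs = [pos[b][0] for b in net]
        ys = [pos[b][1] for b in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Hypothetical placement of four blocks and two nets.
pos = {"B1": (0, 0), "B2": (2, 1), "B3": (1, 3), "B4": (2, 4)}
nets = [("B1", "B2", "B3"), ("B2", "B4")]
print(hpwl(nets, pos))  # prints 8: (2 + 3) for the first net, (0 + 3) for the second
```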

Placement affects chip area 31

…And also Wire Length 32

Placement Algorithms 33 There are two classes of placement algorithms: 1) A constructive placement method uses a set of rules to arrive at a constructed placement. The most commonly used methods are variations on the min-cut algorithm. The other commonly used constructive placement algorithm is the eigenvalue method. 2) An iterative placement improvement. As in system partitioning, placement usually starts with a constructed solution and then improves it using an iterative algorithm. In most tools we can specify the locations and relative placements of certain critical logic cells as seed placements

Min-cut placement 34 The min-cut placement method uses successive application of partitioning 1. Cut the placement area into two pieces. 2. Swap the logic cells to minimize the cut cost. 3. Repeat the process from step 1, cutting smaller pieces until all the logic cells are placed. Min-cut placement. (a) Divide the chip into bins using a grid. (b) Merge all connections to the center of each bin. (c) Make a cut and swap logic cells between bins to minimize the cost of the cut. (d) Take the cut pieces and throw out all the edges that are not inside the piece. (e) Repeat the process with a new cut and continue until we reach the individual bins.


Iterative Placement Improvement 36 An iterative placement improvement algorithm takes an existing placement and tries to improve it by moving the logic cells. There are two parts to the algorithm: 1. The selection criterion that decides which logic cells to try moving. 2. The measurement criterion that decides whether to move the selected cells. There are several interchange or iterative exchange methods that differ in their selection and measurement criteria: 1. pairwise interchange, 2. force-directed interchange, 3. force-directed relaxation, and 4. force-directed pairwise relaxation.

Pairwise-interchange algorithm 37 All of these methods usually consider only pairs of logic cells to be exchanged. A source logic cell is picked for trial exchange with a destination logic cell. The pairwise-interchange algorithm is similar to the interchange algorithm used for iterative improvement in the system partitioning step: 1. Select the source logic cell at random. 2. Try all the other logic cells in turn as the destination logic cell. 3. Use any of the measurement methods we have discussed to decide whether to accept the interchange. 4. Repeat from step 1, selecting each logic cell in turn as a source logic cell.
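The four steps above can be sketched as follows. Several choices here are assumptions for illustration only: cells occupy fixed slots, the measurement is half-perimeter wirelength, a swap is accepted only when it strictly reduces that cost, and the names `wirelength` and `pairwise_interchange` are hypothetical.

```python
import random

def wirelength(nets, slot_of, coords):
    """Half-perimeter wirelength with cells sitting at their slot coordinates."""
    total = 0
    for net in nets:
        xs = [coords[slot_of[c]][0] for c in net]
        ys = [coords[slot_of[c]][1] for c in net]
        total += max(xs) - min(xs) + max(ys) - min(ys)
    return total

def pairwise_interchange(nets, slot_of, coords, passes=3, seed=0):
    rng = random.Random(seed)
    slot_of = dict(slot_of)                 # work on a copy
    cells = list(slot_of)
    best = wirelength(nets, slot_of, coords)
    for _ in range(passes):
        order = cells[:]
        rng.shuffle(order)                  # step 1: sources in random order
        for src in order:
            for dst in cells:               # step 2: try every other cell
                if dst == src:
                    continue
                slot_of[src], slot_of[dst] = slot_of[dst], slot_of[src]
                cost = wirelength(nets, slot_of, coords)
                if cost < best:             # step 3: keep improving swaps only
                    best = cost
                else:                       # otherwise undo the trial swap
                    slot_of[src], slot_of[dst] = slot_of[dst], slot_of[src]
    return slot_of, best

# Hypothetical example: four cells in a row of four slots, chained by three nets.
coords = {i: (i, 0) for i in range(4)}
nets = [("a", "b"), ("b", "c"), ("c", "d")]
start = {"a": 0, "b": 2, "c": 1, "d": 3}    # wirelength 5
placement, cost = pairwise_interchange(nets, start, coords)
print(cost)  # prints 3
```

Because only improving swaps are accepted, this variant can stop in a local minimum; simulated annealing relaxes exactly that acceptance rule.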

38 (a) and (b) show how we can extend pairwise interchange to swap more than two logic cells at a time. If we swap l logic cells at a time and find a locally optimum solution, we say that solution is l-optimum. The neighborhood exchange algorithm is a modification to pairwise interchange that considers only destination logic cells in a neighborhood, that is, cells within a certain distance, e, of the source logic cell. Limiting the search area for the destination logic cell to the e-neighborhood reduces the search time.

Force Directed Approach 39 Transform the placement problem into the classical mechanics problem of a system of objects attached to springs. Analogies: module (block/cell/gate) = object; net = spring; net weight = spring constant; optimal placement = equilibrium configuration.
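Under the spring analogy, and assuming Hooke's law (force proportional to extension, with the net weight as the spring constant), the position where the net force on one cell vanishes is the weighted centroid of the cells it connects to. A minimal sketch with hypothetical weights and coordinates:

```python
def zero_force_position(neighbors):
    """Equilibrium point for one movable cell.

    neighbors: list of (weight, x, y), one entry per connected cell.
    Solves sum(w * (x_j - x)) = 0 and sum(w * (y_j - y)) = 0.
    """
    total = sum(w for w, _, _ in neighbors)
    x = sum(w * px for w, px, _ in neighbors) / total
    y = sum(w * py for w, _, py in neighbors) / total
    return x, y

# Hypothetical cell tied to three others; the third net has weight 2.
print(zero_force_position([(1, 0, 0), (1, 4, 0), (2, 2, 4)]))  # prints (2.0, 2.0)
```

The result is generally not a legal slot and may overlap other cells, so a force-directed placer still needs a step that snaps cells to free positions.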

An Example 40 Resultant Force

Force-directed placement. (a) A network with nine logic cells. (b) We make a grid (one logic cell per bin). (c) Forces are calculated as if springs were attached to the centers of each logic cell for each connection. The two nets connecting logic cells A and I correspond to two springs. (d) The forces are proportional to the spring extensions. 41.

Comments on Force-Directed Placement 42 Advantages: it uses the directions of forces to guide the search, and it is usually much faster than simulated annealing. Drawbacks: it focuses on connections, not on the shapes of blocks, and it is only a heuristic, since an equilibrium configuration does not necessarily give a good placement. Open issue: whether it succeeds depends on how overlaps are eliminated.

Simulated Annealing 43 A very general search technique. It tries to avoid being trapped in a local minimum by making probabilistic moves. It has been popularized as a heuristic for optimization.

Basic Idea of Simulated Annealing 44 Inspired by the Annealing Process: The process of carefully cooling molten metals in order to obtain a good crystal structure. First, metal is heated to a very high temperature. Then slowly cooled. By cooling at a proper rate, atoms will have an increased chance to regain proper crystal structure. Attaining a min cost state in simulated annealing is analogous to attaining a good crystal structure in annealing.

Simulated Annealing 45 [Figure: cost plotted against state, with the labels "Temperature dropping" and "Drop back".]

The Simulated Annealing Procedure 46 Let t be the initial temperature. Repeat (outer loop): repeat (inner loop): pick a neighbour of the current state at random; let c = cost of the current state and c' = cost of the neighbour picked; if c' < c, move to the neighbour (downhill move); otherwise, move to the neighbour with probability $e^{-(c'-c)/t}$ (uphill move); until equilibrium is reached. Then reduce t according to the cooling schedule. Until the freezing point is reached.
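The procedure maps directly onto two nested loops. In the sketch below, everything the slide leaves open is an assumption: "equilibrium" is approximated by a fixed number of moves per temperature, the cooling schedule is geometric (`t *= alpha`), the freezing point is a small threshold, and the toy problem (minimizing a quadratic over the integers) is purely illustrative.

```python
import math
import random

def simulated_annealing(state, cost, neighbor, t=10.0, alpha=0.9,
                        moves_per_temp=100, t_freeze=1e-3, seed=0):
    rng = random.Random(seed)
    c = cost(state)
    best, best_c = state, c
    while t > t_freeze:                      # until the freezing point
        for _ in range(moves_per_temp):      # stand-in for "until equilibrium"
            cand = neighbor(state, rng)
            c_new = cost(cand)
            # Downhill moves are always taken; uphill moves are taken
            # with probability e^{-(c' - c) / t}.
            if c_new <= c or rng.random() < math.exp(-(c_new - c) / t):
                state, c = cand, c_new
                if c < best_c:
                    best, best_c = state, c
        t *= alpha                           # assumed geometric cooling
    return best, best_c

# Toy problem: find the integer x that minimizes (x - 7)^2, starting from 0.
best, best_c = simulated_annealing(
    0,
    cost=lambda x: (x - 7) ** 2,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
)
print(best, best_c)
```

For placement, the state would be a placement, the neighbour a cell move or swap, and the cost a weighted mix of wirelength, area, and overlap.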

Things to decide when using SA 47 When solving a combinatorial problem, we have to decide: The state space The neighborhood structure The cost function The initial state The initial temperature The cooling schedule (how to change t) The freezing point

48 Routing

Routing in design flow 49 [Figure: a post-placement netlist of AND, OR and INV gates, and blocks A, B and C passing from floorplan/placement to routing.] Routing is the process of finding the geometric layouts of the nets.

The Routing Problem 50 Applied after placement. Input: the netlist; a timing budget for (typically) the critical nets; the locations of blocks and of pins. Output: geometric layouts of all nets. Objective: minimize the total wire length or the number of vias, or simply complete all connections without increasing the chip area, while each net meets its timing budget.

The Routing Constraints 51 Examples: placement constraints; the number of routing layers; delay constraints; meeting all geometrical constraints (design rules); physical/electrical/manufacturing constraints such as crosstalk.

Steiner Tree 52 For a multi-terminal net, we can construct a spanning tree to connect all the terminals together, but the wire length will be large. It is better to use a Steiner tree: a tree connecting all terminals and some additional nodes (Steiner nodes). Rectilinear Steiner tree: a Steiner tree in which all the edges run horizontally or vertically.

Routing Problem is Very Hard 53 Minimum Steiner tree problem: given a net, find the Steiner tree with the minimum length. Input: an edge-weighted graph G = (V, E) and a subset D of V (the demand points). Output: a subset of vertices V' that covers D and induces a tree of minimum cost over all such trees. This problem is NP-hard (its decision version is NP-complete)!

Heuristic Algorithms 54 Use MST (minimum spanning tree) algorithms to start with: Cost(MST) / Cost(RMST) ≤ 3/2, so heuristics can guarantee that the weight of the resulting rectilinear Steiner tree is at most 3/2 of the weight of the optimal tree. Then apply local modifications to approach an RMST (rectilinear minimum Steiner tree).
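The starting point of these heuristics, a minimum spanning tree under Manhattan distance, can be sketched with a simple version of Prim's algorithm. The terminal coordinates are hypothetical, and the function name `rectilinear_mst` is illustrative; the local modifications that turn the MST into a Steiner tree are not shown.

```python
def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def rectilinear_mst(terminals):
    """Prim's algorithm on the complete graph over the terminals,
    with Manhattan distance as the edge weight.
    Returns (total length, list of tree edges)."""
    in_tree = [terminals[0]]
    rest = list(terminals[1:])
    edges, total = [], 0
    while rest:
        # Cheapest edge from the tree built so far to a terminal outside it.
        d, u, v = min((manhattan(u, v), u, v) for u in in_tree for v in rest)
        edges.append((u, v))
        total += d
        in_tree.append(v)
        rest.remove(v)
    return total, edges

# Hypothetical three-terminal net.
total, edges = rectilinear_mst([(0, 0), (2, 0), (1, 2)])
print(total)  # prints 5
```

For these three terminals a Steiner node at (1, 0) gives a tree of length 4, so the MST is 5/4 of the optimum here, within the 3/2 bound.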

Kinds of Routing 55 Global routing. Detailed routing: channel routing and switchbox routing. Others: maze routing, over-the-cell routing, clock routing.

General Routing Paradigm 56 Two phases: global routing, followed by detailed routing.

Extraction and Timing Analysis 57 After global routing and detailed routing, information of the nets can be extracted and delays can be analyzed. If some nets fail to meet their timing budget, detailed routing and/or global routing needs to be repeated.

Routing Regions 58

Global Routing 59 Global routing is divided into 3 phases: 1. Region definition 2. Region assignment 3. Pin assignment to routing regions

Maze Routing Problem 60 Given: A planar rectangular grid graph. Two points S and T on the graph. Obstacles modeled as blocked vertices. Objective: Find the shortest path connecting S and T. This technique can be used in global or detailed routing (switchbox) problems.

Grid Graph 61 [Figure: a routing area with terminals S and T, the corresponding grid graph (maze), and a simplified representation in which X marks blocked cells.]

Maze Routing 62 S T

Lee's Algorithm 63 "An Algorithm for Path Connections and Its Applications", C. Y. Lee, IRE Transactions on Electronic Computers, 1961.

Basic Idea 64 A breadth-first search (BFS) of the grid graph. It always finds the shortest path if one exists. It consists of two phases: wave propagation and retrace.

An Illustration 65 S T

Wave Propagation 66 At step k, all vertices whose shortest unblocked path from S has length k are labeled with k (this equals the Manhattan distance when no obstacle is in the way). A propagation list (FIFO) is used to keep track of the vertices to be considered next. [Figure: the labeling after step 0, after step 3, and after step 6.]

Retrace 67 Trace back the actual route, starting from T: at a vertex labeled k, go to any adjacent vertex labeled k-1. [Figure: the final labeling.]
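Both phases of Lee's algorithm fit in a short function. This is a sketch under assumed conventions: the grid is a list of strings, `X` marks a blocked cell, cells are addressed as (row, column), and the example maze is hypothetical.

```python
from collections import deque

def lee_route(grid, s, t):
    """Return the cells of a shortest path from s to t, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def neighbors(r, c):
        return ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))

    label = {s: 0}
    queue = deque([s])                      # propagation list (FIFO)
    while queue and t not in label:         # phase 1: wave propagation
        r, c = queue.popleft()
        for nr, nc in neighbors(r, c):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "X" and (nr, nc) not in label):
                label[(nr, nc)] = label[(r, c)] + 1
                queue.append((nr, nc))
    if t not in label:
        return None
    path = [t]                              # phase 2: retrace from T to S
    while path[-1] != s:
        k = label[path[-1]]
        path.append(next(n for n in neighbors(*path[-1])
                         if label.get(n) == k - 1))
    return path[::-1]

grid = ["....",
        ".XX.",
        "...."]
path = lee_route(grid, (1, 0), (1, 3))
print(len(path) - 1)  # prints 5: the route detours around the two blocked cells
```

The dictionary of labels plays the role of the per-cell label storage whose size the complexity slide discusses.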

How many grid cells are visited using Lee's algorithm? 68

Time and Space Complexity 69 For a grid structure of size $w \times h$: time per net = $O(wh)$; space = $O(wh \log wh)$ ($O(\log wh)$ bits per cell are needed during the exploration phase, plus one additional bit to indicate blocked or not). For a 2000 × 2000 grid with 12 bits per label: 4M cells × 12 bits = 6 Mbytes of memory. For a 4000 × 4000 grid with 24 bits per label: 16M cells × 24 bits = 48 Mbytes!

Akers's coding: Improvement to Lee's Algorithm 70 The vertices in wavefront L are always adjacent to vertices in wavefronts L-1 and L+1. Solution: label successive wavefronts with the repeating sequence 0, 0, 1, 1, 0, 0, ..., so that the predecessor of any wavefront is labeled differently from its successor. Each cell must also indicate whether it is blocked or not, hence 2 bits per cell are enough. The time complexity is not improved.

Akers's Technique 71

Detailed routing 72 Global routing does not define wires; it defines routing regions. The detailed router places the actual wires within the regions indicated by the global router. We consider the channel routing problem here.

Channel Routing 73 A channel is the routing region bounded by two parallel rows of terminals; assume a top and a bottom boundary. Each terminal is assigned a number to indicate which net it belongs to; 0 indicates that the terminal does not require an electrical connection.

Channel Routing 74 channel

Channel Routing 75 [Figure: a channel with its upper boundary, lower boundary, tracks, terminals, vias, trunks, branches, and a dogleg labeled.]

Channel Routing How to connect all the points with the same label with the smallest no. of tracks (to minimize the channel height)?

Horizontal Constraint Graph (HCG) [Figure: an example HCG containing a clique of size 4.]
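Two nets are adjacent in the horizontal constraint graph when their horizontal spans overlap, so the largest clique corresponds to the largest number of spans crossing a single column (the channel density), which is a lower bound on the number of tracks. The sketch below computes that density from the terminal rows; the terminal lists are hypothetical, and 0 means "no connection" as in the channel definition.

```python
def net_intervals(top, bottom):
    """Map each net number to its (leftmost column, rightmost column)."""
    span = {}
    for col, (a, b) in enumerate(zip(top, bottom)):
        for net in (a, b):
            if net:                              # 0 = no connection
                lo, hi = span.get(net, (col, col))
                span[net] = (min(lo, col), max(hi, col))
    return span

def channel_density(top, bottom):
    """Largest number of net spans that cross any single column."""
    span = net_intervals(top, bottom)
    return max(sum(lo <= col <= hi for lo, hi in span.values())
               for col in range(len(top)))

# Hypothetical channel with three nets.
top = [1, 2, 0, 3, 2]
bottom = [0, 1, 3, 0, 3]
print(net_intervals(top, bottom))    # prints {1: (0, 1), 2: (1, 4), 3: (2, 4)}
print(channel_density(top, bottom))  # prints 2
```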

Left-Edge Algorithm 1. Sort the horizontal segments of the nets in increasing order of their left end points. 2. Place them one by one greedily on the bottommost available track.

Left-Edge Algorithm [Figure: nets sorted by left end points, then placed greedily on tracks.]
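The two steps can be sketched as a first-fit assignment of net spans to tracks. The sketch ignores vertical constraints, assumes one horizontal segment per net, and treats two segments that share a column as conflicting; the spans in the example are hypothetical.

```python
def left_edge(intervals):
    """Assign each net's horizontal segment to a track.

    intervals: net -> (left, right) column span.
    Returns net -> track index, where 0 is the first track filled.
    """
    track_end = []                       # rightmost used column on each track
    assignment = {}
    # Step 1: sort segments by their left end points.
    for net, (left, right) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Step 2: put the segment on the first track where it fits.
        for track, end in enumerate(track_end):
            if left > end:
                track_end[track] = right
                assignment[net] = track
                break
        else:
            track_end.append(right)      # no track fits: open a new one
            assignment[net] = len(track_end) - 1
    return assignment

# Hypothetical spans for four nets.
print(left_edge({1: (0, 3), 2: (1, 5), 3: (4, 7), 4: (6, 9)}))
# prints {1: 0, 2: 1, 3: 0, 4: 1}
```

Two tracks are used, which matches the density of this example, so the result cannot be improved without violating a horizontal constraint.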

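Vertical Constraint Graph and Doglegs Net 1 imposes a vertical constraint on net 2 in a column where the top terminal belongs to net 1 and the bottom terminal belongs to net 2. In another column, net 2 imposes a vertical constraint on net 1. The VCG then contains a cycle, which can be broken with a dogleg.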
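A vertical constraint graph can be built directly from the two terminal rows, and a depth-first search detects the cycles that call for doglegs. This is an illustrative sketch: the function names are hypothetical, and an edge (a, b) is taken to mean that net a's trunk must lie above net b's trunk.

```python
def vertical_constraints(top, bottom):
    """One edge (a, b) per column whose top terminal is net a and bottom is net b."""
    return {(a, b) for a, b in zip(top, bottom) if a and b and a != b}

def has_cycle(edges):
    succ = {}
    for a, b in edges:
        succ.setdefault(a, []).append(b)
    state = {}                      # 1 = on the DFS stack, 2 = finished

    def visit(n):
        state[n] = 1
        for m in succ.get(n, []):
            if state.get(m) == 1 or (m not in state and visit(m)):
                return True         # reached a net that is still on the stack
        state[n] = 2
        return False

    return any(n not in state and visit(n) for n in list(succ))

# The situation on the slide: net 1 over net 2 in one column, net 2 over net 1 in another.
edges = vertical_constraints(top=[1, 2], bottom=[2, 1])
print(has_cycle(edges))  # prints True: no dogleg-free assignment exists
```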

Conclusion: 81 We have discussed the problem of partitioning and the role of partitioning in floorplanning. We have understood the concept and physical significance of FloorPlanning, Placement and Routing with various algorithms used in physical design automation.

82 Thanks Queries???