ECE 506 Reconfigurable Computing Lecture 7 FPGA Placement.

Placement
° VLSI design flow. Objective:
  - Minimize total chip area.
  - Keep the circuit routable within the timing budget.
° FPGA flow. Area is fixed. Objective:
  - Assign the LUTs in the netlist to available logic blocks in the array, within utilization and performance (interconnect) constraints.
  - Locate functional blocks so that the interconnect required to route the signals between them is minimized.
° The target architecture determines the cost function.

Placement algorithm
° Two basic inputs:
  - a netlist of functional blocks and the connections between them
  - a device map (the architecture)
° The algorithm selects a legal location for each block such that the circuit wiring is optimized.

Significance of Placement
° Good placement is extremely important.
  - It sets the constraints for routability.
  - Even if the circuit does route, a poor placement still leads to a lower maximum operating speed and increased power consumption.
° Finding a good placement is challenging.
  - A large commercial FPGA contains over 500,000 functional blocks, so there are on the order of 500,000! (500,000 factorial) possible placements. Exhaustive evaluation is therefore impossible.
  - Placement is a computationally hard problem: no known algorithm produces optimal results in practical CPU time.
  - Developing fast and effective heuristic placement algorithms is a critical research area.
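To give a feel for the size of 500,000!, the sketch below estimates its number of decimal digits with the log-gamma function rather than computing the factorial itself. The helper name `log10_factorial` is made up for this illustration.

```python
import math


def log10_factorial(n: int) -> float:
    """Return log10(n!) via the log-gamma function (lgamma(n + 1) = ln(n!))."""
    return math.lgamma(n + 1) / math.log(10)


# Number of decimal digits in 500,000! (the count of orderings of 500,000 blocks).
digits = math.floor(log10_factorial(500_000)) + 1
print(f"500,000! has about {digits:,} decimal digits")
```

The result is a number with millions of digits, which is why placers rely on heuristics instead of enumeration.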

Device Legality Constraints
° All resources in an FPGA are prefabricated, which leads to a variety of placement legality constraints.
° A legal placement must place each functional block only in a location on the chip that can accommodate it.
  - A RAM block must be placed in a RAM location, and a lookup table (LUT) must be placed in a LUT location.
° Some groups of functional blocks must be placed in a specific relative orientation to use special, dedicated routing resources.
  - Example: arithmetic logic cells. To use the dedicated carry-chain hardware, the logic cells forming a carry chain must be placed adjacent to each other in the sequence required by the carry structure.
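The two kinds of legality rule above can be sketched as small predicates. This is only an illustration: the `Site` type, the type labels, and the assumption that a carry chain runs up a single column are all hypothetical and not taken from any particular device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    x: int
    y: int
    kind: str  # hypothetical type label such as "LUT", "RAM", or "IO"


def type_legal(block_kind: str, site: Site) -> bool:
    """A block may only occupy a site of its own kind (RAM in RAM, LUT in LUT)."""
    return block_kind == site.kind


def carry_chain_legal(chain_sites: list[Site]) -> bool:
    """Assumed rule: consecutive chain cells sit in the same column, one row apart, in order."""
    return all(
        nxt.x == cur.x and nxt.y == cur.y + 1
        for cur, nxt in zip(chain_sites, chain_sites[1:])
    )


print(type_legal("RAM", Site(4, 0, "LUT")))
print(carry_chain_legal([Site(2, 0, "LUT"), Site(2, 1, "LUT"), Site(2, 2, "LUT")]))
```

A move generator can call checks like these before proposing a move, so illegal placements are never created.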

FPGA Placement Constraints
° FPGA interconnect is prefabricated.
  - The amount of interconnect in each region of a device is fixed.
° Routing congestion
  - Occurs when the interconnect demand approaches or exceeds the fabricated wiring capacity in some part of the FPGA.
  - A placement that requires more interconnect in a device region than that region contains cannot be routed.

FPGA Placement Constraints
° Stratix-II is an island-style FPGA that contains routing segments that span 4, 16, and 24 logic blocks.
  - Programmable switches allow routing segments in the same direction (horizontal or vertical) to be connected at their endpoints to create longer routes.
  - Other programmable switches allow some horizontal routing segments to connect to vertical routing segments where they cross, and vice versa.
[Figure: routing segments of length 1, 2, and 4 in the X and Y directions]

Placement Objective – Routability Driven
° Create a placement that minimizes the total interconnect required.
° This increases the probability of successful routing.
° Some routability-driven placement algorithms therefore minimize not only the total wiring required by the design but also the amount of routing congestion.

Placement Objective – Timing Driven
° In addition to optimizing for routability, timing-driven algorithms use timing analysis to identify critical paths and/or connections and optimize the delay of those connections.
° Most of the delay in an FPGA is due to the programmable interconnect, so timing-driven placement can achieve a large improvement in circuit speed over routability-driven approaches.

Level of Control on Placement
° Commercial FPGA placement tools allow designers to control the placement.
° Common types of placement directives:
° 1) Exact location of a block
  - The most restrictive.
  - Typical uses: to lock down the design I/Os at the locations required by the circuit board, or to lock down the elements of a performance-critical intellectual property (IP) core.
° 2) Area specific
  - Less restrictive.
  - Forces blocks to go into a specific 2D area.
  - Allows a designer to guide the placement tool.

Level of Control on Placement
° 3) Relative location
  - Specifies the relative location of several blocks; the placement tool chooses exactly where to locate the block group.
  - Typical use: library components where a designer knows a good placement of the component blocks relative to each other.
° 4) Floating region
  - Specifies that some logic should be placed within a tight region; the placement tool can choose where that region should be on the device.

Placement Algorithms
Constructive methods:
  - Begin from the netlist and generate an initial placement.
  - Partitioning method: min-cut
    – First address the placement of partitions individually (a significant reduction in search space).
    – Then address the placement of partitions relative to each other.
  - Not suitable for FPGAs
    – Especially island-style FPGAs with limited routing resources.
    – The method postpones the impact of inter-partition connections.
    – This leads to increased demand on routing tracks.

Placement
Placement has a set of competing goals.
Can't optimize locally and globally simultaneously.
Use heuristic approaches to evaluate quality.
[Figure: example with signals A through F and blocks LUT1 and LUT2]

Getting Stuck with Local Minima
Pick a random starting point.
Repeatedly swap: if the new state has a lower cost it is accepted, otherwise the current state is retained.
That is, greedily accept good moves.
Problem: there is a large number of local minima.
  - The circuit placed as shown in the slide's figure is in a local minimum: no swap of logic or I/O functions will reduce the total wirelength.
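A minimal sketch of the greedy pairwise-swap loop described above, assuming a toy model in which every block sits at an (x, y) coordinate and cost is the summed half-perimeter of each net's bounding box. The names `hpwl` and `greedy_swap` and the four-block example are made up for illustration.

```python
import itertools


def hpwl(placement, nets):
    """Sum over nets of the half-perimeter of the net's bounding box."""
    total = 0
    for net in nets:
        xs = [placement[b][0] for b in net]
        ys = [placement[b][1] for b in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def greedy_swap(placement, nets):
    """Swap block pairs, keeping a swap only if it lowers the cost."""
    placement = dict(placement)  # work on a copy
    best = hpwl(placement, nets)
    improved = True
    while improved:  # stop when a full pass finds no improving swap
        improved = False
        for a, b in itertools.combinations(sorted(placement), 2):
            placement[a], placement[b] = placement[b], placement[a]
            cost = hpwl(placement, nets)
            if cost < best:
                best, improved = cost, True
            else:  # reject: undo the swap
                placement[a], placement[b] = placement[b], placement[a]
    return placement, best


start = {"A": (0, 0), "B": (3, 0), "C": (1, 0), "D": (2, 0)}
nets = [["A", "B"], ["C", "D"]]
final, cost = greedy_swap(start, nets)
print(hpwl(start, nets), "->", cost)
```

The loop always terminates because the cost strictly decreases with each accepted swap, but the state it stops in is only guaranteed to be a local minimum with respect to single swaps.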

Technology Mapping to Placement Mapping onto 5-LUT

Technology Mapping to Placement

Iterative Placement Algorithms
° Iterative improvement
  - Begin with a random or constructive placement, then iterate to improve it.
  - Pairwise interchange
  - Hill climbing
    – To avoid getting trapped in local minima, consider a "hill-climbing" approach.
    – Need to accept worse solutions, i.e. make "bad" moves, to reach the global minimum.
    – Acceptance is probabilistic: only accept cost-increasing moves some of the time.

Iterative Placement Algorithms
° Methods
  - Force-directed methods (classical mechanics)
    – Force vector computed on each module corresponding to all nets.
    – Solve a set of non-linear differential equations.
    – FD relaxation
    – FD pairwise exchange
  - Simulated annealing (statistical mechanics)
    – Models a physical annealing process which optimizes energy.
    – Similar to "quenching" metal.
    – Generates best results.
    – Can be time consuming.
  - Macro-based approaches
    – Genetic algorithms

Physical Annealing
Take a metal and heat it to a high temperature.
Allow it to cool slowly; the metal is annealed to a low temperature.
Atoms in the metal are at lower energy states after annealing.
The higher the initial temperature and the slower the cooling, the tougher the metal becomes.
Atoms transition to high energy states and then move to low energy.

Simulated Annealing
Optimization strategy based on the physical annealing process.
Generate random moves.
  - Initially, accept moves that decrease cost as well as moves that increase it.
As the temperature decreases, the probability of accepting bad moves decreases.
Eventually, default to a greedy algorithm: only accept moves that reduce cost.
Determine when to terminate.
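The slide does not spell out the acceptance rule. The sketch below assumes the standard Metropolis criterion that simulated annealing is normally built on: a cost-increasing move of size delta is accepted with probability exp(-delta / T). The function name `accept_move` is made up for this example.

```python
import math
import random


def accept_move(delta_cost: float, temperature: float, rng: random.Random) -> bool:
    """Metropolis criterion (assumed): always take improvements, sometimes take bad moves."""
    if delta_cost <= 0:
        return True  # cost does not increase: accept
    if temperature <= 0:
        return False  # fully cooled: behave greedily
    return rng.random() < math.exp(-delta_cost / temperature)


rng = random.Random(0)
hot = sum(accept_move(1.0, 100.0, rng) for _ in range(1000))
cold = sum(accept_move(1.0, 0.01, rng) for _ in range(1000))
print(f"bad moves accepted: hot={hot}, cold={cold}")
```

At high temperature nearly every bad move is accepted; near zero temperature the rule degenerates into the greedy algorithm.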

Simulated Annealing

Bounding Box and Cost Function
° The bounding box underestimates wirelength.
° q(n) is a compensation factor.
  - q is 1 for 2- and 3-terminal nets.
  - It increases to 2.79 for 50-terminal nets.
° Cav is the channel capacity (tracks) in the x and y directions over the bounding box of net n.
  - Penalizes placements which require more routing in areas of the FPGA that have narrower channels.
  - However, Cav is constant since the channel width is fixed for an island-style FPGA.
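The formula itself did not survive in this transcript. The quantities listed match the linear congestion cost used by VPR-style placers, which is usually written as the sum over nets of q(n) * (bb_x(n) / Cav_x(n)^beta + bb_y(n) / Cav_y(n)^beta). The sketch below assumes that form; the function names, the constant Cav arguments, and the choice of max minus min as the bounding-box span are assumptions for illustration.

```python
def q_small(terminals: int) -> float:
    """Compensation factor for small nets only; larger nets need the full table."""
    if terminals <= 3:
        return 1.0  # stated on the slide for 2- and 3-terminal nets
    raise NotImplementedError("compensation table for larger nets not reproduced here")


def bounding_box_cost(nets, placement, q, c_av_x=1.0, c_av_y=1.0, beta=1.0):
    """Assumed form: sum of q(n) * (bb_x / Cav_x**beta + bb_y / Cav_y**beta) over nets."""
    total = 0.0
    for net in nets:
        xs = [placement[b][0] for b in net]
        ys = [placement[b][1] for b in net]
        bb_x = max(xs) - min(xs)  # horizontal span of the bounding box
        bb_y = max(ys) - min(ys)  # vertical span of the bounding box
        total += q(len(net)) * (bb_x / c_av_x ** beta + bb_y / c_av_y ** beta)
    return total


placement = {"a": (0, 0), "b": (2, 1), "c": (1, 3)}
nets = [["a", "b"], ["a", "b", "c"]]
print(bounding_box_cost(nets, placement, q_small))
```

With a constant Cav, as on an island-style device, the division only rescales the total and the cost reduces to a q-weighted bounding-box wirelength.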

Placement Flow

Wire length measures
° Estimate wire length by the distance between components.
° Possible distance measures:
  - Euclidean distance: sqrt(x^2 + y^2)
  - Manhattan distance: x + y
° Multi-point nets must be broken up into trees for good estimates.
[Figure: Euclidean versus Manhattan distance]
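As a quick illustration of the two measures, with x and y read as the horizontal and vertical offsets between two components (the function names are made up):

```python
import math


def euclidean(p, q):
    """Straight-line distance sqrt(dx^2 + dy^2)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """Rectilinear distance |dx| + |dy|, matching horizontal/vertical routing."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)))
```

The Manhattan distance is never smaller than the Euclidean one, and it is the more natural estimate when wires run only horizontally and vertically.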

Weighted Graph -> Distance Table
° Geometric distance is NOT accurate!
° Need a weighted graph: the cost of routing resources.
° Finding the shortest path at each step of annealing is costly.
  - Hence the need for a lookup table.

Simulated Annealing – Moves per iteration
Moves_per_iteration = B * N^(4/3)
  - N = number of logic blocks and I/O pads
  - B = scaling factor
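A small worked example of the inner-loop size, B times N to the power 4/3. The value B = 10 used in the print statement is only an illustrative choice, not a figure from the lecture.

```python
def moves_per_iteration(num_blocks: int, scale: float) -> int:
    """Moves attempted at each temperature: B * N^(4/3), rounded to an integer."""
    return round(scale * num_blocks ** (4 / 3))


# Illustrative numbers: N = 1000 blocks and pads, B = 10 (assumed).
print(moves_per_iteration(1000, 10))
```

Because the exponent is above 1, the work per temperature grows faster than the number of blocks: going from 8 to 1000 blocks raises N^(4/3) from 16 to 10,000.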

Simulated Annealing – Swapping Range
Swap distance is adjusted based on the acceptance rate as well.
  - Initially set to the entire FPGA.
  - As T drops, the distance drops.
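The slide gives no formula for the range update. One commonly cited rule, from the VPR placer, scales the limit by (1 - 0.44 + acceptance rate) and clamps it between 1 and the device dimension, which steers the acceptance rate toward 0.44. The sketch below assumes that rule; the function name is made up.

```python
def update_range_limit(old_limit: float, acceptance_rate: float, max_dim: int) -> float:
    """Assumed VPR-style update: shrink the swap range when few moves are accepted."""
    new_limit = old_limit * (1.0 - 0.44 + acceptance_rate)
    return min(max(new_limit, 1.0), float(max_dim))  # clamp to [1, device size]


limit = 20.0  # start with the whole (assumed 20 x 20) device
for rate in (0.9, 0.6, 0.3, 0.1):  # acceptance rate falls as T drops
    limit = update_range_limit(limit, rate, max_dim=20)
    print(f"acceptance={rate:.1f} -> range limit={limit:.2f}")
```

While most moves are accepted the limit stays pinned at the full device; once the acceptance rate falls below 0.44 the range contracts toward neighboring blocks.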

Simulated Annealing
The new T depends on the fraction of attempted moves that were accepted.
  - T is reduced rapidly when the acceptance rate is high.
When the temperature is less than a small fraction of the average cost of a net, it is unlikely that any move that results in a cost increase will be accepted, so we terminate the anneal.
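The slide does not list the cooling factors. As an illustration, the sketch below assumes the adaptive schedule usually quoted for VPR, where the multiplier depends on the fraction of accepted moves. Treat the thresholds and factors as assumptions rather than this lecture's numbers.

```python
def next_temperature(temperature: float, acceptance_rate: float) -> float:
    """Assumed VPR-style schedule: T_new = alpha * T_old, alpha chosen by acceptance rate."""
    if acceptance_rate > 0.96:
        alpha = 0.5   # almost everything accepted: cool quickly
    elif acceptance_rate > 0.8:
        alpha = 0.9
    elif acceptance_rate > 0.15:
        alpha = 0.95  # the productive region: cool slowly
    else:
        alpha = 0.8   # few moves accepted: speed up again
    return alpha * temperature


for rate in (0.97, 0.9, 0.5, 0.1):
    print(rate, next_temperature(100.0, rate))
```

The design intent is to spend most of the run in the range where a moderate share of moves is accepted, since that is where the placement improves most.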

Annealing Criteria
Contemporary FPGA packages use the following parameters:
1. Starting temperature: 20 * stand_dev(cost of N swaps)
2. Cost function: weighted sum of wire length and delay
3. Inner loop: B * N^(4/3) (Beta cost function)
4. Stopping criterion: T < 0.005 * Cost / N_nets
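Items 1 and 4 translate directly into code. In this sketch the function names are made up, and the population standard deviation is used because the slide does not say which estimator is meant.

```python
import statistics


def starting_temperature(costs_after_swaps) -> float:
    """Item 1: 20 times the standard deviation of the cost seen over N trial swaps."""
    return 20 * statistics.pstdev(costs_after_swaps)


def should_stop(temperature: float, cost: float, num_nets: int) -> bool:
    """Item 4: stop once T falls below 0.005 times the average cost per net."""
    return temperature < 0.005 * cost / num_nets


trial_costs = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up costs recorded after trial swaps
print(starting_temperature(trial_costs))
print(should_stop(0.04, cost=1000.0, num_nets=100))
```

Tying the start temperature to the cost spread makes almost every early move acceptable regardless of circuit size, and tying the exit test to the per-net cost ends the run once accepting a bad move has become very unlikely.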

Strengths of SA making it suitable for FPGA
° Can enforce all the legality constraints imposed by the FPGA architecture fairly directly:
  - by forbidding the creation of illegal placements in the move generator, or
  - by adding a penalty cost to illegal placements.
° Can directly model the impact of the FPGA routing architecture on circuit delay and routing congestion:
  - by creating an appropriate cost function.