Routing Wire Optimization through Generic Synthesis on FPGA Carry
Hadi P. Afshar
Joint work with: Grace Zgheib, Philip Brisk and Paolo Ienne.


FPGA and ASIC Gaps*
Performance ratio: 3-4
Area ratio:
Power ratio: 7-15

*I. Kuon and J. Rose, "Measuring the gap between FPGAs and ASICs", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 26, No. 2, February 2007, pp. 203 –

Routing resources consume ≈60-80% of the chip area and are significant contributors to circuit delay.

How to narrow the gap?
Specialized (DSP) blocks
Coarser-grained logic blocks
Hard-wired connections

Concerns:
✘ Lack of generality and flexibility
✘ Underutilization
✘ Change in routing structure

Carry Chains
[Figure: a CLB with 8 inputs; labels: 4-LUT, +, +]

Motivation Example

Problem Definition
LUT-mapped flow graph
Step 1: Logic Matching
Step 2: Chaining

Logic Matching
Step 1: Enumeration of the programmable part
Step 2: Identifying regular and independent segments
Step 3: Developing the alphabet library of the macro cell
Step 4: Mask division and library matching
[Figure: macro cell; labels: LUT, A, B, +, C_in, C_out]

Logic Matching (Example)
Step 1: Enumeration

i3 i2 i1 i0    LUT 1    LUT 2
0000           A0       B0
0001           A0       B1
0010           A1       B0
0011           A1       B1
0100           A2       B2
0101           A2       B3
0110           A3       B2
0111           A3       B3
1000           A4       B4
1001           A4       B5
1010           A5       B4
1011           A5       B5
1100           A6       B6
1101           A6       B7
1110           A7       B6
1111           A7       B7
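The enumeration table can be regenerated mechanically. The sketch below is only an illustration inferred from the table itself: it assumes LUT 1 is addressed by (i3, i2, i1) and LUT 2 by (i3, i2, i0), which is the pattern the rows follow. The function name `enumerate_cell` is hypothetical.

```python
from itertools import product

def enumerate_cell():
    """List, for each input combination, the configuration bit each LUT selects.

    Assumption read off the table: LUT 1 is addressed by (i3, i2, i1)
    and LUT 2 by (i3, i2, i0).
    """
    rows = []
    for i3, i2, i1, i0 in product((0, 1), repeat=4):
        a_index = (i3 << 2) | (i2 << 1) | i1  # configuration bit of LUT 1
        b_index = (i3 << 2) | (i2 << 1) | i0  # configuration bit of LUT 2
        rows.append((f"{i3}{i2}{i1}{i0}", f"A{a_index}", f"B{b_index}"))
    return rows

for bits, a, b in enumerate_cell():
    print(bits, a, b)
```

Each printed row corresponds to one line of the table, in the same order.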

Logic Matching (Example)
Step 2: Regular and Independent Segments

i3 i2 i1 i0    LUT 1    LUT 2
0000           A0       B0
0001           A0       B1
0010           A1       B0
0011           A1       B1
0100           A2       B2
0101           A2       B3
0110           A3       B2
0111           A3       B3
1000           A4       B4
1001           A4       B5
1010           A5       B4
1011           A5       B5
1100           A6       B6
1101           A6       B7
1110           A7       B6
1111           A7       B7
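One way to see why the table splits into independent segments is to group rows that share configuration bits. The following sketch is an assumption about how such a grouping could be computed, not the authors' procedure; `find_segments` is a hypothetical name, and the LUT addressing is inferred from the table.

```python
from itertools import product

def find_segments():
    """Group table rows into segments that share no configuration bits."""
    rows = []
    for i3, i2, i1, i0 in product((0, 1), repeat=4):
        a = f"A{(i3 << 2) | (i2 << 1) | i1}"  # bit selected in LUT 1
        b = f"B{(i3 << 2) | (i2 << 1) | i0}"  # bit selected in LUT 2
        rows.append((f"{i3}{i2}{i1}{i0}", {a, b}))

    segments = []  # each entry: (set of configuration bits, list of rows)
    for bits, used in rows:
        merged_bits, merged_rows = set(used), [bits]
        untouched = []
        for seg_bits, seg_rows in segments:
            if seg_bits & used:  # shares a configuration bit: merge
                merged_bits |= seg_bits
                merged_rows = seg_rows + merged_rows
            else:
                untouched.append((seg_bits, seg_rows))
        segments = untouched + [(merged_bits, sorted(merged_rows))]
    return segments

for seg_bits, seg_rows in find_segments():
    print(sorted(seg_bits), seg_rows)
```

Under these assumptions the grouping yields four segments of four rows each, one per value of (i3, i2), and each segment depends on only four configuration bits. The segments are regular in the sense that they all have the same shape.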

Logic Matching (Example)
Step 3: Alphabet library of the cell

LUT 1    LUT 2    C in    8-bit alphabets of configuration mask dictionary
A0       B0               …
A0       B1               …
A1       B0               …
A1       B1               …
A0       B0               …
A0       B1               …
A1       B0               …
A1       B1               …

Configuration settings shown:
A0 = 0, A1 = 0, B0 = 0, B1 = 0
A0 = 1, A1 = 0, B0 = 0, B1 = 0
A0 = 0, A1 = 1, B0 = 0, B1 = 0
A0 = 1, A1 = 1, B0 = 0, B1 = 0
A0 = 0, A1 = 0, B0 = 1, B1 = 0
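A minimal sketch of how such an alphabet library could be built for one segment follows. It rests on an assumption the transcript does not confirm: that the cell output is the adder sum bit, LUT 1 XOR LUT 2 XOR C_in. The name `build_alphabet_library` and the bit ordering (four input rows for C_in = 0, then the same rows for C_in = 1) are illustrative choices.

```python
from itertools import product

def build_alphabet_library():
    """Map each 8-bit alphabet to the (A0, A1, B0, B1) settings that produce it.

    Assumed cell function: output = LUT1 xor LUT2 xor C_in (adder sum bit).
    """
    library = {}
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        lut1, lut2 = (a0, a1), (b0, b1)
        alphabet = "".join(
            str(lut1[i1] ^ lut2[i0] ^ cin)
            for cin in (0, 1)                        # C_in = 0 rows, then C_in = 1 rows
            for i1, i0 in product((0, 1), repeat=2)  # rows (A0,B0), (A0,B1), (A1,B0), (A1,B1)
        )
        library.setdefault(alphabet, []).append((a0, a1, b0, b1))
    return library

for alphabet, configs in sorted(build_alphabet_library().items()):
    print(alphabet, configs)
```

With this assumed cell function, complementing both LUTs leaves the output unchanged, so the 16 settings collapse to 8 distinct alphabets.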

Logic Matching (Example)
Step 4: Mask segmented matching
[Figure: configuration mask divided into segments, each matched against the 8-bit library]
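Segmented matching can be sketched as follows: split the 32-bit mask of a 5-input function (C_in, i3, i2, i1, i0) into four 8-bit segments, one per value of (i3, i2), and look each segment up in the library. This is an illustrative sketch only. The cell function (sum bit), the mask indexing, and the names `alphabet`, `LIBRARY`, `evaluate` and `match_mask` are all assumptions, not taken from the slides.

```python
from itertools import product

def alphabet(lut1_pair, lut2_pair):
    """8-bit alphabet of one segment: 4 rows for C_in = 0, then 4 rows for C_in = 1."""
    return "".join(
        str(lut1_pair[i1] ^ lut2_pair[i0] ^ cin)  # assumed output: adder sum bit
        for cin in (0, 1)
        for i1, i0 in product((0, 1), repeat=2)
    )

# Library: alphabet -> one LUT setting ((A_even, A_odd), (B_even, B_odd)) producing it.
LIBRARY = {}
for a0, a1, b0, b1 in product((0, 1), repeat=4):
    LIBRARY.setdefault(alphabet((a0, a1), (b0, b1)), ((a0, a1), (b0, b1)))

def evaluate(lut1, lut2):
    """32-bit mask of the cell, indexed by (cin<<4)|(i3<<3)|(i2<<2)|(i1<<1)|i0."""
    return [
        lut1[(i3 << 2) | (i2 << 1) | i1] ^ lut2[(i3 << 2) | (i2 << 1) | i0] ^ cin
        for cin, i3, i2, i1, i0 in product((0, 1), repeat=5)
    ]

def match_mask(mask):
    """Return (lut1, lut2) configuration bits realizing mask, or None if unmappable."""
    lut1, lut2 = [], []
    for seg in range(4):  # seg encodes (i3, i2)
        key = "".join(
            str(mask[(cin << 4) | (seg << 2) | (i1 << 1) | i0])
            for cin in (0, 1)
            for i1, i0 in product((0, 1), repeat=2)
        )
        hit = LIBRARY.get(key)
        if hit is None:
            return None  # this segment is not in the library
        lut1.extend(hit[0])
        lut2.extend(hit[1])
    return lut1, lut2

target = evaluate([0, 1, 1, 0, 1, 0, 0, 1], [0, 0, 1, 1, 0, 1, 0, 1])
print(match_mask(target))
```

The returned configuration may differ from the one used to build the target, since several settings can produce the same alphabet, but evaluating it reproduces the same mask. Each segment costs one dictionary lookup instead of a comparison against every full configuration.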

How much do we gain?
Assume that the mask is 32 bits:
- N segments
- M patterns in each segment
- Our library size = Bits
- Number of all configurations =

Orders of magnitude less memory
Orders of magnitude fewer comparisons
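The expressions on this slide did not survive extraction, so the arithmetic below is a guess at the kind of comparison intended, not the authors' formula. It assumes regular segments that share one library of M patterns, each 32/N bits wide, against storing every one of the M^N full 32-bit configurations. The defaults N = 4 and M = 16 come from the 4-input example, and `library_gain` is a hypothetical name.

```python
def library_gain(mask_bits=32, n_segments=4, patterns_per_segment=16):
    """Compare a shared per-segment library with a table of all full configurations.

    Assumed model: segments are regular, so a single library of
    patterns_per_segment entries, each mask_bits / n_segments wide, serves them all.
    """
    segment_bits = mask_bits // n_segments
    library_bits = patterns_per_segment * segment_bits   # shared alphabet library
    full_configs = patterns_per_segment ** n_segments    # every segment combination
    full_bits = full_configs * mask_bits                 # storing each full mask
    return library_bits, full_configs, full_bits

library_bits, full_configs, full_bits = library_gain()
print(library_bits, full_configs, full_bits, full_bits // library_bits)
```

Under these assumptions the library needs 128 bits against 2,097,152 bits for the full table, which is consistent with the slide's claim of orders of magnitude less memory.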

Chaining Heuristic
[Figure: functions with Input and Output ports linked into a chain]
We need to find chains of functions that are mappable to the macrocell, so that they can be placed on the carry chains.
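The transcript does not describe the heuristic itself. As a stand-in, the sketch below uses a simple greedy strategy: repeatedly take the longest path through unused chainable nodes of the DAG and stop once the best remaining path falls below a minimum length. This is a swapped-in technique for illustration, not the authors' algorithm; `extract_chains`, the `edges` format, and the default threshold of 4 (taken from the results footnote) are assumptions.

```python
def extract_chains(edges, chainable, min_length=4):
    """Greedily extract disjoint chains of chainable nodes from a DAG.

    edges: dict mapping a node to the list of its successor nodes.
    chainable: set of nodes whose function matched the macrocell.
    """
    used = set()
    chains = []
    while True:
        memo = {}

        def longest_from(node):
            # Longest path starting at node through unused chainable successors.
            if node in memo:
                return memo[node]
            best = [node]
            for succ in edges.get(node, []):
                if succ in chainable and succ not in used:
                    candidate = [node] + longest_from(succ)
                    if len(candidate) > len(best):
                        best = candidate
            memo[node] = best
            return best

        candidates = [longest_from(n) for n in sorted(chainable - used)]
        best = max(candidates, key=len, default=[])
        if len(best) < min_length:
            return chains
        chains.append(best)
        used.update(best)

edges = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"], "x": ["y"], "y": ["c"]}
nodes = {"a", "b", "c", "d", "e", "x", "y"}
print(extract_chains(edges, nodes))
```

In this toy graph the chain a, b, c, d, e is extracted first; the leftover path x, y is shorter than the threshold and stays unchained, mirroring the gap between the "Chainable" and "Chained" percentages in the results.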

Synthesis and Chaining Results

Benchmark    Chainable    Chained    Max Chain Length    Average Chain Length
alu4         74%          39%        4                   3.5
pdc          69%          35%        6                   3.9
misex3       68%          42%        4                   3.1
ex1010       71%          41%        5                   3.4
ex5p         72%          40%        4                   3.5
des*         65%          31%        3                   3.0
apex2        73%          42%        4                   3.6
apex4        75%          39%        4                   3.7
spla         72%          43%        6                   4.2
seq          69%          38%        4                   3.4
Average      70%          39%

* The minimum threshold for the chain length is 4, except for “des”, for which it is 3.

Experimental Methodology

Goal: Extract chains of eligible functions from the synthesized netlist in order to place them on the logic chains; the non-chained functions remain unchanged.

[Figure: tool flow; blocks: Our Synthesis Engine, Logic Matching, Chaining Heuristic, Netlist Generation, VQM Parser, DAG Generation, Quartus-II LUT Mapping & Syn, Quartus-II Place & Route]

Local Routing Wires
26% saving in the number of local wires

Total Wire Length
9% saving in total wire length

Delay
3% delay penalty due to the large input-to-output delay of the adder

Conclusion
Narrow the FPGA and ASIC gaps
Lighten the stress on routing resources
Hardwired connections + dedicated logic
Improved routability with a lighter network

Thanks for your attention.