An Improved “Soft” eFPGA Design and Implementation Strategy

Presentation transcript:

An Improved “Soft” eFPGA Design and Implementation Strategy. Victor Aken’Ova, Guy Lemieux, Resve Saleh. SoC Research Lab, University of British Columbia, Vancouver, BC, Canada.

Overview: Introduction and Motivation; Embedded FPGA (eFPGA); Soft Embedded FPGAs; Configurable Architecture; Improving Soft eFPGAs; Tactical Standard Cells; Structured eFPGA Layout; Results; Summary and Conclusions.

Introduction. SoC designs are getting more complex and costly. Programmability can be built into SoCs to amortize costs by reducing chip re-spins. [Figure labels: No Flexibility, Software Flexibility, Hardware Flexibility, eFPGAs.]

Applications for eFPGA Fabrics: an eFPGA for CPU acceleration; an eFPGA for product differentiation; an eFPGA for revisions. [Figure: three numbered eFPGA placements, one next to a CPU.]

Motivation: shortcomings of existing eFPGA design approaches. Hard eFPGA: highly efficient full-custom layouts, but inflexible. Soft eFPGA: very flexible, but inefficient standard-cell layouts. Alternative approach: flexible + efficient.

“Hard” eFPGA Approach with a Library of 3 Cores. [Figure labels: user circuit RTL; cores 1, 2, 3, each marked "?".] Restrictive! Overcapacity increases area and delay overheads.

The “Soft” eFPGA Approach. [Figure labels: eFPGA RTL generator; auto-generated eFPGA; ASIC flow; generic standard cells.] Much less logic and routing overcapacity. 7x area and 2x delay versus full-custom.

Some Solutions to Problems of Existing Approaches: retain the eFPGA generator idea for flexibility, but use tactical cells to reduce area and delay, and use a structured approach for efficiency.

Our Improved Design Approach: “Soft++”. [Figure labels: eFPGA RTL generator; auto-generated eFPGA; structured ASIC flow; tactical + generic cells.] Goal: combine the best of the soft and hard approaches.

Island-style eFPGA Architecture. We used an island-style architecture because it is mainstream (existing FPGA CAD tools can be leveraged) and because its regular structure can be exploited to improve design efficiency. We created a parameterized eFPGA in VHDL.

Island-style eFPGA Architecture. [Figure: (a) island-style eFPGA; (b) eFPGA tile layout. Tile labels: L = left edge tile, C = corner tile, B = bottom edge tile.]
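The tile labels on this slide suggest how a generator could pick a tile variant from its position in the array. The following is only a minimal Python sketch, not the authors' VHDL generator: `tile_type` and `tile_map` are hypothetical names, row 0 is assumed to be the top row, and the labels T (top edge), R (right edge), and X (interior) are assumptions added for illustration; only L, C, and B appear on the slide.

```python
def tile_type(row, col, rows, cols):
    """Classify a tile by its position in a rows x cols array.

    Assumption: row 0 is the top row and column 0 is the left column.
    "L", "C", "B" follow the slide; "T", "R", "X" are hypothetical labels.
    """
    on_top, on_bottom = row == 0, row == rows - 1
    on_left, on_right = col == 0, col == cols - 1
    if (on_top or on_bottom) and (on_left or on_right):
        return "C"  # corner tile
    if on_left:
        return "L"  # left edge tile
    if on_bottom:
        return "B"  # bottom edge tile
    if on_top:
        return "T"  # top edge tile (assumed label)
    if on_right:
        return "R"  # right edge tile (assumed label)
    return "X"      # interior tile (assumed label)


def tile_map(rows, cols):
    """Return one string per row, one character per tile."""
    return ["".join(tile_type(r, c, rows, cols) for c in range(cols))
            for r in range(rows)]


if __name__ == "__main__":
    for line in tile_map(4, 4):
        print(line)
```

Because only a handful of distinct tile variants appear in the map, each variant could in principle be laid out once and replicated, which is the kind of regularity the structured approach relies on.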

Unstructured vs. Structured eFPGA Design Approach. [Figure: (a) unstructured eFPGA layout; (b) structured eFPGA layout. Labels: soft eFPGA, fixed logic, tile1, tile2, tile3, tile4.]

Measured Impact of Structure on eFPGA Quality. Significant improvements in logic capacity, the result of a more efficient CAD methodology. Wire-only critical-path delay reduced by 21%. CAD design time cut by as much as 6X.

Architecture-specific Tactical Cells – The Concept. Improve quality by creating a few tactical standard cells to replace generic cells. Detailed analysis of the design profile should reveal the areas that yield significant gains.

Standard Cell Area Breakdown for Island-Style Architecture. [Chart labels: switch 16%, other 12%, LUT 30%, input mux 13%, muxes 42%, flip-flops 46%, LUT mux 39%.] Flip-flops and multiplexers dominate eFPGA area.

Architecture-specific Tactical Cells – Flip-Flop vs. SRAM. [Figure: (a) typical D flip-flop; (b) typical SRAM cell.] ~2:1 area ratio! An SRAM circuit has fewer transistors, hence less area.

Custom Layout of Standard Cell – Flip-Flop vs. SRAM. [Figure: standard-cell flip-flop (2.5X) and tactical SRAM cell (1X), each drawn between vdd and gnd rails.]

Architecture-specific Tactical Cells – CMOS vs. Pass Gate. [Figure: two multiplexer schematics with inputs A, B, C, D, selects S0, S1, and output O. Labels: VDD; "after extra output inverter"; "decompose into NAND, INV".] ~4:1 area ratio! Pass-tree logic uses fewer transistors and is faster.
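To make the pass-tree idea concrete, here is a small behavioral sketch in Python of a multiplexer built as a binary tree of 2:1 selections, one select signal per level. It is an illustration only, not the authors' circuit: `pass_tree_mux` and `pass_transistor_count` are hypothetical names, and the transistor count assumes an NMOS-only tree with one pass transistor per branch, excluding select inverters and any output inverter or level restorer.

```python
def pass_tree_mux(inputs, selects):
    """Behavioral model of an N:1 pass-tree multiplexer.

    inputs  -- list of N = 2**len(selects) data values
    selects -- list of 0/1 select bits; selects[0] (S0) drives the level
               nearest the inputs, the last select drives the output level
    """
    if len(inputs) != 2 ** len(selects):
        raise ValueError("need exactly 2**len(selects) inputs")
    level = list(inputs)
    for s in selects:
        # Each pair of neighbours passes one value on to the next level.
        level = [level[i + s] for i in range(0, len(level), 2)]
    return level[0]


def pass_transistor_count(n_inputs):
    """Pass transistors in the tree under the stated assumption:
    N-1 two-way branch points with 2 pass transistors each."""
    return 2 * (n_inputs - 1)


if __name__ == "__main__":
    # 4:1 mux as on the slide: inputs A..D, selects S0 and S1.
    print(pass_tree_mux(["A", "B", "C", "D"], [1, 0]))  # selects input B
    print(pass_transistor_count(4), pass_transistor_count(16))
```

The same model describes a k-input LUT if the `inputs` list holds the 2**k configuration bits and the LUT inputs act as the selects.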

Layout Technique for Pass-Tree Multiplexers. [Figure labels: vdd, gnd, n-well, n-well cutout, extra NMOS (denser cell), underutilized region.] N-well cut-outs allow denser pass-transistor tree layouts.

Architecture-specific Tactical Cells – Cell Area

Cell        Equivalent standard-cell area (um2)    Custom tactical-cell area (um2)    Improvement factor
1-SRAM      61                                     24                                 2.5
16:1 MUX    899                                    146                                6.1
32:1 MUX    2228                                   293                                7.6
4-LUT       1875                                   530                                3.5
5-LUT       4180                                   1061                               3.9
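The improvement factors on this slide are the ratio of the standard-cell area to the tactical-cell area. A short Python check of that arithmetic follows; the names `CELL_AREAS` and `improvement_factor` are hypothetical. The ratios are truncated to one decimal place, which is an assumption made to match the slide (the 16:1 MUX ratio is about 6.16, so rounding would give 6.2 rather than the 6.1 shown).

```python
import math

# (standard-cell area, tactical-cell area) in um^2, copied from the slide.
CELL_AREAS = {
    "1-SRAM": (61, 24),
    "16:1 MUX": (899, 146),
    "32:1 MUX": (2228, 293),
    "4-LUT": (1875, 530),
    "5-LUT": (4180, 1061),
}


def improvement_factor(standard_area, tactical_area):
    """Area ratio, truncated (not rounded) to one decimal place."""
    ratio = standard_area / tactical_area
    return math.floor(ratio * 10) / 10


FACTORS = {name: improvement_factor(std, tac)
           for name, (std, tac) in CELL_AREAS.items()}

if __name__ == "__main__":
    for name, factor in FACTORS.items():
        print(f"{name}: {factor}x")
```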

Area Impact of Tactical Standard Cells – eFPGA Area. [Figure: eFPGA area for (a) soft, (b) soft++, (c) full-custom; labels: -58%, -85%.] Soft++ is ~2.4X smaller than soft, a 58% area savings.
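The "2.4X smaller" and "58% savings" figures are two views of the same number, and the conversion is easy to check. The Python sketch below uses hypothetical helper names; reading the -85% label as the full-custom area relative to soft is an interpretation of the figure, not something the slide states outright.

```python
def savings_from_ratio(ratio):
    """Fractional area saved when the new design is `ratio` times smaller."""
    return 1 - 1 / ratio


def ratio_from_savings(savings):
    """How many times smaller the new design is, given fractional savings."""
    return 1 / (1 - savings)


if __name__ == "__main__":
    # Soft++ vs. soft: 2.4X smaller corresponds to roughly 58% savings.
    print(f"{savings_from_ratio(2.4):.1%}")

    # Assumed reading: full-custom is 85% smaller than soft,
    # i.e. soft is about 6.7X the full-custom area.
    print(f"{ratio_from_savings(0.85):.1f}X")

    # Under the same reading, soft++ area relative to full-custom.
    soft_pp = 1 - 0.58       # soft++ area as a fraction of soft
    full_custom = 1 - 0.85   # full-custom area as a fraction of soft
    print(f"{soft_pp / full_custom:.1f}X")
```

Under that reading, the soft-to-full-custom ratio lands near the "7x area" quoted for the soft approach, and the soft++ result lands at the upper end of the "1.6 – 2.8X full-custom area" range reported on the results slide.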

Graphs of Area and Delay Savings. Area (per benchmark): 2.4X better; 1.6 – 2.8X full-custom area. Delay (per benchmark): 1.4X better; 1.1X of full-custom delay.

Fabricated Chip Designs with eFPGAs (180nm process) (a) gradual architecture (b) island-style architecture

Summary. eFPGA area improved 58% (on average); 2 to 2.8X larger than the full-custom equivalent (worst case). eFPGA delay improved 40% (average); within 10% of the delay of full-custom versions. Exploited the regularity of the island-style architecture to increase logic capacity.

End of Talk

Question and Answer Slide. [Figure: area versus logic capacity for soft, soft++, custom, and hard.] Soft++ fills some of the performance gap left by hard.