A New Hybrid FPGA with Nanoscale Clusters and CMOS Routing

Presentation transcript:

A New Hybrid FPGA with Nanoscale Clusters and CMOS Routing. Reza M.P. Rad and Mohammad Tehranipoor, University of Connecticut. Design Automation Conference, 2006.

Outline: Introduction/Background; Motivation/Contributions; CMOS Logic Cluster Architecture; Nanowire-based Cluster; CMOS Support; Experiment Setups; Results (Area/Performance); Conclusion.

Introduction: Further scaling of CMOS faces challenges. Various nanoscale and molecular-electronics-based devices are under research, and various assembly techniques have been tried. Self-assembly and nano-imprint techniques provide regular array structures (crossbars). Molecules with switching properties can serve as reconfigurable switches. CMOS support can provide inputs, outputs, and configuration circuitry for nanoscale devices.

Background: CMOS-nano interface based on modulated doping of nanowires [DeHon, JETC’05]; DMUX based on random deposition of gold particles on a nanowire array [Kuekes, 2000]; single-bit decoder-like interface [DeHon, JETC 2005]; PLA-based FPGA architecture [DeHon, JETC’05]; island-style architecture for nanoscale devices [Goldstein, 2001]; cell-based architecture (CMOL) [Likharev, 2005].

Background (continued): The use of nanowires as FPGA interconnect was analyzed in [Gayasen, DAC 2005], which considered hybrid FPGAs with CMOS clusters and nanowire routing. It reported that nanowire interconnect can reduce FPGA area by up to 70%. The effects of a nanowire-based implementation of the logic clusters on area and delay were not reported.

Motivation: Design a new hybrid FPGA. Emerging nanotechnologies require CMOS-scale support that provides I/O and configuration circuitry. Analyze the efficiency of hybrid CMOS-nano devices, specifically a hybrid FPGA with nanoscale clusters. Perform experimental evaluations to provide insight into the benefits and challenges of such hybrid technologies.

Contribution: A new hybrid FPGA with nanoscale clusters and CMOS interconnect. A logic cluster architecture based on nanowire crossbars is proposed for FPGAs. The proposed cluster has the same functionality as traditional CMOS clusters. FPGA tools are modified to model area and performance for the proposed cluster. Results show a significant area reduction for the hybrid FPGA, while performance is slightly degraded.

Logic Cluster Architecture: LUTs and MUXes are the most area-consuming parts of any cluster; MUXes can take up to 70% of the area. Reducing the size of the LUTs and MUXes will considerably reduce the overall area of the FPGA. [Figure: a basic logic element (BLE) consists of a K-input LUT followed by a DFF; a logic cluster in an FPGA contains N BLEs (BLE 1 to BLE N), with I inputs and N outputs.]
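
To make the area argument concrete, the following is a minimal sketch that counts the LUT configuration bits and MUX crosspoints in one cluster. It assumes a fully connected cluster in which every LUT input MUX can select any of the I cluster inputs or any of the N BLE outputs; that connectivity model, the function name `cluster_resources`, and the example parameters are illustrative assumptions and are not taken from the slides.

```python
def cluster_resources(k, n, i):
    """Count programmable resources in one logic cluster.

    k: LUT inputs, n: BLEs per cluster, i: cluster inputs.
    Assumption (not from the slides): each LUT input is driven by a
    MUX that can pick any cluster input or any BLE output.
    """
    lut_bits = n * (2 ** k)        # one configuration bit per truth-table row
    num_muxes = n * k              # one input MUX per LUT input
    mux_width = i + n              # candidates each MUX chooses from
    mux_crosspoints = num_muxes * mux_width
    return {
        "lut_bits": lut_bits,
        "num_muxes": num_muxes,
        "mux_width": mux_width,
        "mux_crosspoints": mux_crosspoints,
    }


if __name__ == "__main__":
    # Hypothetical cluster: K=4, N=8, I=18
    print(cluster_resources(4, 8, 18))
```

Under this assumed model the MUX crosspoints outnumber the LUT bits for typical cluster sizes, which is in line with the slide's remark that MUXes dominate cluster area.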

LUTs Implemented on Crossbars: LUTs and MUXes can be implemented on nanowire crossbars. The diodes of each column are configured to form one of the minterms. The diodes on the output line are configured to provide the sum of minterms, for example $f = \sum m(1, 2, 4, 2^k - 1)$.
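
The sum-of-minterms configuration can be sketched in software as follows. This is a behavioral model only, assuming each crossbar column acts as a wired AND over the true or complemented input lines and the output line acts as a wired OR over the columns; the names `minterm_column` and `crossbar_lut` are hypothetical, and no electrical behavior of the diodes is modeled.

```python
from itertools import product


def minterm_column(index, k):
    """Diode pattern for one column: for each of the k inputs (MSB first),
    True selects the true input line, False selects the complemented line."""
    return [bool((index >> (k - 1 - bit)) & 1) for bit in range(k)]


def crossbar_lut(minterms, k):
    """Build a k-input function from a list of minterm indices."""
    columns = [minterm_column(m, k) for m in minterms]

    def f(*inputs):
        # A column is active when every input matches its selected literal
        # (wired AND); the output is active when any column is (wired OR).
        return any(
            all(bool(x) == literal for x, literal in zip(inputs, column))
            for column in columns
        )

    return f


if __name__ == "__main__":
    k = 3
    f = crossbar_lut([1, 2, 4, 2 ** k - 1], k)
    for bits in product((0, 1), repeat=k):
        print(bits, int(f(*bits)))
```

For k = 3 the example minterm set {1, 2, 4, 7} is three-input odd parity, so the model can be checked against XOR of the inputs.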

Nanowire-based (Nanoscale) Cluster: A crossbar can be configured as several LUTs and MUXes. It has the same functionality as the CMOS clusters used in FPGAs. [Figure labels: (I) cluster I/O; (II) to DFFs; (III) configuration address. I/O and configuration MUXes as proposed in [Kuekes, 2000] or [Rad, 2006].]

CMOS Support: CMOS support circuitry for the proposed cluster provides inversion, latching, and configuration addresses. It can be implemented on the CMOS substrate under the nanoscale crossbar to minimize area.

Experiment Setups: MCNC benchmarks (netlists) are the input. SIS (FlowMap and FlowPack) maps each netlist to K-input LUTs; T-VPack packs the K-LUTs into clusters of size N; VPR performs performance-driven placement and routing. Architecture models are provided for (I) a full-CMOS FPGA (22 nm) and (II) the hybrid FPGA, and the flow produces area and delay results. Area and delay for routing components and clusters should be estimated and applied to VPR, with realistic values for the resistances and capacitances of switches and line segments. VPR calculates area based on the number of minimum-size transistors.

Results: Area. Average area for implementing 77 MCNC benchmarks (K: LUT size; N: cluster size). The area of the hybrid FPGA is significantly lower than that of the CMOS FPGA. [Figures: average area (µm²) versus K (number of LUT inputs) for the CMOS FPGA (22 nm) and for the hybrid FPGA; the N=2 and N=8 curves are labeled.]

Results: Area (Hybrid FPGA). The area of the LUTs and MUXes increases only slightly as K increases (nanowire crossbars), while the inter-cluster routing area decreases as K increases, so the area of the hybrid FPGA does not increase with K. Up to 75% area reduction compared to the CMOS FPGA.

Area reduction (%) by LUT size K (columns) and cluster size N (rows):

N \ K          4      5      6      7
2              18.3   23.6   32.5   46.8
(N not given)  29.5   43.8   53.9   64.6
(N not given)  44.3   50.6   59.3   68.5
8              45.9   55.1   69.1   75.7

Results: Delay. Average critical path delays for different values of K and N for 77 MCNC benchmarks (K: LUT size; N: cluster size). Delay parameters for 22 nm CMOS were estimated based on [Sylvester & Keutzer 1998]. Nanowire RC parameters were calculated based on [DeHon, JETC’05]. [Figures: critical path delay (s) versus K (number of LUT inputs) for the CMOS FPGA (22 nm) and for the hybrid FPGA; the N=2 and N=8 curves are labeled.]

Results: Delay (CMOS FPGA, 22 nm). In CMOS FPGAs, the delay of the cluster increases as K increases, but increasing K reduces the number of inter-cluster routing wires on the critical path. The results show that increasing K and N slightly reduces the critical path delay for CMOS FPGAs.

Results: Delay (Hybrid FPGA). In the hybrid FPGA, the delay of the cluster depends on the resistance and capacitance of the nanowires. As K and N increase, the length of the nanowires used in the cluster increases, so the delay of the cluster increases considerably. Therefore, for the hybrid FPGA, increasing K and N increases the critical path delay. [Figure: hybrid FPGA delay curves, with N=2 and N=8 labeled.]
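
The dependence of cluster delay on nanowire length can be illustrated with a first-order wire model. The sketch below assumes a uniform distributed RC line with no driver resistance or load capacitance, for which the Elmore delay is 0.5 · r · c · L². The slides do not state which delay model was used, and the per-unit-length values in the example are hypothetical placeholders, not measured nanowire parameters.

```python
import math


def elmore_wire_delay(r_per_um, c_per_um, length_um):
    """Elmore delay (seconds) of a uniform distributed RC wire.

    r_per_um: resistance per micrometer (ohm/um)
    c_per_um: capacitance per micrometer (F/um)
    length_um: wire length (um)
    Assumption: ideal driver and no load, so delay = 0.5 * R_total * C_total.
    """
    return 0.5 * (r_per_um * length_um) * (c_per_um * length_um)


if __name__ == "__main__":
    # Hypothetical per-unit-length values, for illustration only.
    r, c = 1.0e3, 1.0e-16
    for length in (5, 10, 20):
        print(length, "um ->", elmore_wire_delay(r, c, length), "s")
    # Doubling the length quadruples the delay under this model.
    ratio = elmore_wire_delay(r, c, 20) / elmore_wire_delay(r, c, 10)
    print("ratio:", ratio, math.isclose(ratio, 4.0))
```

Because delay grows with the square of length in this model, longer in-cluster nanowires at larger K and N would be expected to raise cluster delay faster than linearly.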

Conclusions: A new hybrid FPGA was proposed, with a cluster based on nanowire crossbars. The FPGA tools were modified to implement MCNC benchmarks on the proposed hybrid FPGA. Hybrid FPGAs showed area reductions of up to 75% compared to 22 nm CMOS FPGAs. The performance of CMOS and hybrid FPGAs is almost equal for average-size clusters.

Future work: Apply experimental data from nanowire-based devices to the models to obtain more accurate comparisons. Perform power analysis to evaluate the power requirements of nanowire-based circuits. Investigate more efficient nanowire-based implementations of logic clusters. Investigate the reliability and fault tolerance of nanoscale components.