Mapping into LUT Structures

Alan Mishchenko, Stephen Jang, Chao Chen
UC Berkeley, Agate Logic Inc

Overview
- Introduction
- Contributions
- Algorithms
- Experimental results
- Conclusion

Introduction
- Delay optimization is a high priority
- It can be addressed by
  - CAD algorithms
  - FPGA hardware
- We propose a two-fold solution
  - Improved LUT mapping algorithm
  - Modification to the hardware

Two LUT Structures
- 7-input LUT structure “44”
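To make the “44” structure concrete, the sketch below evaluates two cascaded 4-LUTs on 7 external inputs: the first LUT reads four inputs and its output drives one pin of the second LUT, which reads the remaining three. The helper names (`lut`, `eval_44`), the choice of which pin receives the first LUT's output, and the integer truth-table encoding are assumptions made for illustration, not details taken from the slides.

```python
def lut(config, inputs):
    """Evaluate a LUT whose truth table is the integer `config`.

    Bit i of `config` is the output for the input pattern whose
    binary index is i (inputs[0] is the least significant bit).
    """
    index = 0
    for pos, bit in enumerate(inputs):
        index |= (bit & 1) << pos
    return (config >> index) & 1


def eval_44(cfg1, cfg2, x):
    """Evaluate a cascade of two 4-LUTs on 7 inputs x[0..6].

    Assumed wiring: LUT1 reads x[0..3]; LUT2 reads LUT1's output
    on pin 0 and x[4..6] on pins 1..3.
    """
    if len(x) != 7:
        raise ValueError("the 44 structure has exactly 7 external inputs")
    h = lut(cfg1, x[:4])
    return lut(cfg2, [h] + list(x[4:]))


# Example configuration: LUT1 = 4-input AND, LUT2 = 4-input XOR.
AND4 = 0x8000
XOR4 = 0x6996
```

With these configurations, `eval_44(AND4, XOR4, x)` computes `(x0 & x1 & x2 & x3) ^ x4 ^ x5 ^ x6`, a 7-input function that fits the structure because its first four inputs can be collapsed into a single intermediate signal.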

Contributions
- Developed an efficient matching algorithm to check whether a given Boolean function can be implemented using a given LUT structure
- Modified the priority-cut-based technology mapper in ABC to perform mapping into the LUT structures
- Evaluated the new algorithm and the new LUT structures
- Collected statistics on implementable K-input Boolean functions appearing in industrial designs

“44” Matching Algorithm
- Input: an N-input Boolean function (4 ≤ N ≤ 7).
- Output: the configuration of two 4-LUTs.
- Implementation:
  - If N = 4, the function is trivially implemented using one 4-LUT.
  - If N = 5, the function can usually be implemented using two 4-LUTs; relatively few 5-input functions cannot be implemented using the “44” structure (see Theorem 1).
  - If N = 6, a naïve decomposition check tries each group of four variables.
  - If N = 7, the function can be implemented only if it is DSD-decomposable and its DSD structure can be mapped into the given LUT structure.
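The N = 6 case can be illustrated with a small sketch of a naïve check that tries each group of four variables as the inputs of the first LUT. This is not the authors' implementation: it assumes a disjoint-support decomposition only (no variable shared between the two LUTs), uses the classical column-multiplicity test (at most two distinct cofactors over the remaining variables), and the names `truth_table`, `cofactor_columns`, and `find_44_bound_set` are hypothetical.

```python
from itertools import combinations


def truth_table(func, n):
    """Pack func(bits) into an int; bit m holds the value at minterm m."""
    tt = 0
    for m in range(1 << n):
        bits = [(m >> v) & 1 for v in range(n)]
        if func(bits):
            tt |= 1 << m
    return tt


def cofactor_columns(tt, n, bound):
    """Return the set of distinct cofactors over the free variables,
    one cofactor per assignment of the bound-set variables."""
    free = [v for v in range(n) if v not in bound]
    columns = set()
    for b in range(1 << len(bound)):
        base = 0
        for i, v in enumerate(bound):
            if (b >> i) & 1:
                base |= 1 << v
        column = []
        for f in range(1 << len(free)):
            m = base
            for i, v in enumerate(free):
                if (f >> i) & 1:
                    m |= 1 << v
            column.append((tt >> m) & 1)
        columns.add(tuple(column))
    return columns


def find_44_bound_set(tt, n=6):
    """Try each group of four variables as the first LUT's inputs.

    At most two distinct columns means a single wire can carry all the
    information the second LUT needs about the bound variables.
    Returns the first group that works, or None.
    """
    for bound in combinations(range(n), 4):
        if len(cofactor_columns(tt, n, bound)) <= 2:
            return bound
    return None
```

For example, `(x0 & x1 & x2 & x3) ^ x4 ^ x5` is accepted with the group `(0, 1, 2, 3)`, while the 6-input threshold function "at least three inputs are 1" is rejected for every group, because its cofactors depend on how many bound variables are 1. A function rejected here may still fit the structure through a decomposition with a shared variable or a smaller first group, which this simplified check does not explore.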

Experimental Setup
- Improved traditional mapping
  - Baseline (if) and mapping with structural choices (MSC): (dch; if -j)4
  - 44: runs (dch; if -j)4 with a LUT library
  - 444: runs (dch; if -j)4 with a LUT library
  - Best 444: runs (dch; if -j)4 with a LUT library
- LUT libraries

    Library     # k    area   delay
    (shared)    1-4    1.00   1.00
    44          5-7    2.00   2.00
    444         5-10   3.00   2.00
    Best 444    5-6    2.00   2.00
    Best 444    7      2.50   2.00
    Best 444    8-10   3.00   2.00

- Mapping into dedicated hardware: 1.20
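The LUT library rows can be read as a cost table indexed by the number of inputs k. The sketch below encodes the “Best 444” rows and looks up area and delay for a given k. It assumes that the k = 1-4 row belongs to every library and that the two numeric columns are area and delay in that order; the names `BEST_444` and `lut_cost` are hypothetical and are not part of ABC.

```python
# (k range, area, delay) rows as read from the "Best 444" library.
# Assumption: the 1-4 row is shared by all libraries on the slide.
BEST_444 = [
    (range(1, 5), 1.00, 1.00),
    (range(5, 7), 2.00, 2.00),
    (range(7, 8), 2.50, 2.00),
    (range(8, 11), 3.00, 2.00),
]


def lut_cost(library, k):
    """Return (area, delay) for a k-input function under `library`."""
    for k_range, area, delay in library:
        if k in k_range:
            return area, delay
    raise ValueError(f"no library entry for k = {k}")
```

Under this reading, a 7-input function costs 2.50 units of area in “Best 444”, compared with 3.00 in the plain “444” library, while the delay stays at 2.00.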

Experimental Results Summary
- Table 4.1. Improvements to the traditional FPGA mapping.
- Table 4.2. Delay optimization using dedicated FPGA architecture with direct connections between adjacent LUTs.

Table 4.1

Table 4.2

Ratios of Implementable Functions
- “44” structure
  - 4-input: 100%
  - 5-input: 99.99%
  - 6-input: 99%
  - 7-input: 84%
- “444” structure
  - 5-input: 100%
  - 6-input: 99.99%
  - 7-input: 97.6%
  - 8-input: 94.5%
  - 9-input: 75.5%
  - 10-input: 39.7%

Conclusions
- Motivated delay improvement
- Introduced several LUT structures
- Proposed fast truth-table-based Boolean matching
- Evaluated improvements and got promising results
  - In traditional mapping: -10% in delay and -6% in area
  - With dedicated hardware: -41% in delay and +24% in area
- Future work: measure improvements after P&R

Abstract
Mapping into K-input lookup tables (K-LUTs) is an important step in synthesis for Field-Programmable Gate Arrays (FPGAs). The traditional FPGA architecture assumes routable interconnect between individual LUTs. We propose a modified FPGA architecture that allows direct (non-routable) connections between adjacent LUTs. The delay between such LUTs can be shorter, but the improvement may come with a restriction on the fanout of LUTs connected this way. As a result, delay can be reduced while area can increase, compared to traditional mapping. This paper investigates two types of LUT structures and the associated tradeoffs. Experimental results indicate that when the LUT structures are used, the results of traditional mapping can be improved by roughly 10% in delay and 6% in area. When the dedicated hardware is used, the delay improvement can be up to 40% at the cost of some area increase.