Mapping into LUT Structures


1 Mapping into LUT Structures
Alan Mishchenko, Stephen Jang, Chao Chen
UC Berkeley, Agate Logic Inc.

2 Overview
Introduction
Contributions
Algorithms
Experimental results
Conclusion

3 Introduction
Delay optimization is a high priority.
It can be addressed by:
  CAD algorithms
  FPGA hardware
We propose a two-fold solution:
  Improved LUT mapping algorithm
  Modification to the hardware

4 Two LUT Structures
7-input LUT structure “44”
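
To make the “44” structure concrete, here is a minimal Python sketch of how two cascaded 4-LUTs evaluate a 7-input function. The pin assignment (the first LUT reads inputs 0 to 3; the second LUT reads the first LUT's output on its pin 0 plus inputs 4 to 6) and the names `eval_lut` and `eval_44` are assumptions made for illustration, not taken from the slides.

```python
def eval_lut(config, inputs):
    """Evaluate a LUT whose truth table is the integer `config`.

    Bit i of `config` is the output for the input pattern whose
    binary encoding is i (inputs[0] is the least significant bit).
    """
    index = sum(bit << i for i, bit in enumerate(inputs))
    return (config >> index) & 1


def eval_44(lut_a, lut_b, x):
    """Evaluate a hypothetical "44" structure on 7 input bits.

    Assumed wiring: lut_a reads x[0..3]; lut_b reads lut_a's output
    on pin 0 and x[4..6] on pins 1..3, giving 4 + 3 = 7 inputs.
    """
    assert len(x) == 7
    g = eval_lut(lut_a, x[:4])
    return eval_lut(lut_b, [g] + list(x[4:]))


# Example: a 7-input AND built from two 4-input ANDs.
AND4 = 1 << 15  # only the all-ones input pattern produces 1
print(eval_44(AND4, AND4, [1] * 7))  # → 1
```

With this wiring, the structure has seven distinct external pins, which is why it is described as a 7-input structure.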

5 Contributions
Developed an efficient matching algorithm to check whether a given Boolean function can be implemented using a given LUT structure.
Modified the priority-cut-based technology mapper in ABC to perform mapping into the LUT structures.
Evaluated the new algorithm and the new LUT structures.
Collected statistics on implementable K-input Boolean functions appearing in industrial designs.

6 “44” Matching Algorithm
The input is an N-input Boolean function (4 ≤ N ≤ 7).
The output is the configuration of two 4-LUTs.
Implementation:
  If N = 4, the function can be trivially implemented using one 4-LUT.
  If N = 5, the function can usually be implemented using two 4-LUTs; relatively few 5-input functions cannot be implemented using the “44” structure (see Theorem 1).
  If N = 6, a naïve decomposition check tries each group of four variables.
  If N = 7, the function can be implemented using the given structure only when it is DSD-decomposable and its DSD structure can be mapped into the given LUT structure.
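
The "try each group of four variables" idea can be sketched as a truth-table check. The Python below is an illustrative sketch, not the authors' implementation: it tests only disjoint-support decompositions, where four variables feed the first LUT and the remaining N − 4 feed the second, by counting distinct cofactors (column multiplicity at most 2). The names `column` and `match_44_disjoint` and the truth-table encoding (bit m of the integer `tt` is the function value on minterm m) are assumptions.

```python
from itertools import combinations


def column(tt, bound, free, b):
    """Cofactor of f under bound-set assignment b, listed over all free-set assignments."""
    col = []
    for f in range(1 << len(free)):
        m = 0
        for k, v in enumerate(bound):
            m |= ((b >> k) & 1) << v
        for k, v in enumerate(free):
            m |= ((f >> k) & 1) << v
        col.append((tt >> m) & 1)
    return tuple(col)


def match_44_disjoint(tt, n):
    """Return a 4-variable bound set with at most two distinct cofactors, or None.

    Assumption: only disjoint decompositions are tried; inputs shared
    between the two LUTs are not considered.
    """
    assert 4 <= n <= 7
    if n == 4:
        return (0, 1, 2, 3)  # fits in a single 4-LUT
    for bound in combinations(range(n), 4):
        free = [v for v in range(n) if v not in bound]
        cols = {column(tt, bound, free, b) for b in range(16)}
        if len(cols) <= 2:  # one bit from the first LUT can distinguish them
            return bound
    return None


# Example: the 7-input AND decomposes; the 7-input majority does not.
and7 = 1 << 127
maj7 = sum(1 << m for m in range(128) if bin(m).count("1") >= 4)
print(match_44_disjoint(and7, 7))  # → (0, 1, 2, 3)
print(match_44_disjoint(maj7, 7))  # → None
```

For a function that depends on all seven variables, every pin of the structure must carry a distinct variable, so this disjoint check should be decisive. For N = 5 or 6 it is only a sufficient test: the spare pins allow the two LUTs to share variables, so a `None` result does not prove the function unimplementable.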

7 Experimental Setup
Improved traditional mapping
  Baseline (if) and mapping with structural choices (MSC) (dch; if -j)4
Mapping into dedicated hardware
  44: Runs (dch; if -j)4 with LUT library
  444: Runs (dch; if -j)4 with LUT library
  Best 444: Runs (dch; if -j)4 with LUT library
[LUT library tables with columns "# k", "area", "delay"; only the value 1.20 survives in this transcript]

8 Experimental Results Summary
Table 4.1. Improvements to the traditional FPGA mapping.
Table 4.2. Delay optimization using dedicated FPGA architecture with direct connections between adjacent LUTs.

9 Table 4.1

10 Table 4.2

11 Ratios of Implementable Functions
“44” structure:
  4-input: 100%
  5-input: 99.99%
  6-input: 99%
  7-input: 84%
“444” structure:
  5-input: 100%
  6-input: 99.99%
  7-input: 97.6%
  8-input: 94.5%
  9-input: 75.5%
  10-input: 39.7%

12 Conclusions
Motivated delay improvement.
Introduced several LUT structures.
Proposed fast truth-table-based Boolean matching.
Evaluated improvements and got promising results:
  In traditional mapping, -10% in delay and -6% in area
  With dedicated hardware, -41% in delay and +24% in area
Future work: Measure improvements after P&R


14 Abstract
Mapping into K-input lookup tables (K-LUTs) is an important step in synthesis for Field-Programmable Gate Arrays (FPGAs). The traditional FPGA architecture assumes routable interconnect between individual LUTs. We propose a modified FPGA architecture that allows direct (non-routable) connections between adjacent LUTs. The delay between such LUTs can be shorter, but the improvement may come with a restriction on the fanout of LUTs connected using direct connections. As a result, delay can be reduced while area may increase, compared to traditional mapping. This paper investigates two types of LUT structures and the associated tradeoffs. Experimental results indicate that when the LUT structures are used, the results of traditional mapping can be improved by roughly 10% in delay and 6% in area. When the dedicated hardware is used, the delay improvement can be up to 40% at the cost of some area increase.

