Enumeration of Irredundant Circuit Structures
Alan Mishchenko, Department of EECS, UC Berkeley

Overview
- Logic synthesis is an important and challenging task
- Boolean decomposition is a way to do logic synthesis
  - Several algorithms - many heuristics
- Drawbacks
  - Incomplete algorithms - suboptimal results
  - Computationally expensive algorithms - high runtime
- Our goal is to overcome these drawbacks
  - Perform exhaustive enumeration offline
  - Use pre-computed results online, to get good Q&R and low runtime
- Practical discoveries
  - The number of unique functions up to 16 inputs is not too high
  - The number of unique decompositions of a function is not too high

Small Practical Functions
- Classifications of Boolean functions
  - Random functions
  - Special function classes
    - Symmetric
    - Unate
    - etc.
- Logic synthesis and technology mapping deal with
  - Functions appearing in the designs
  - Functions with small support (up to 16 variables)
- These functions are called small practical functions (SPFs)
- We will concentrate on SPFs and study their properties
- In particular, we will ask
  - How many different SPFs exist?
  - How many different irredundant logic structures do they have?

DSD Structure
- A DSD structure is a tree of nodes derived by applying DSD recursively until the remaining nodes are not decomposable
- DSD is full if the resulting tree consists of only simple gates (AND/XOR/MUX)
- DSD is partial if the resulting tree has non-decomposable nodes (called prime nodes)
- DSD does not exist if the tree is composed of one node
[Figure: three example trees over inputs a-f, labeled Full DSD, Partial DSD, and No DSD]
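The three cases on this slide can be restated as a small classifier over a tree. This is only a sketch: the `Node` class, the `kind` labels, and the `classify` function are hypothetical names introduced here, and "one node" is read as a single prime node fed directly by the inputs.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # kind is one of "VAR", "AND", "XOR", "MUX", "PRIME" (labels assumed for this sketch)
    kind: str
    children: list = field(default_factory=list)


def has_prime(node):
    """True if the tree rooted at node contains a non-decomposable (prime) node."""
    return node.kind == "PRIME" or any(has_prime(c) for c in node.children)


def classify(root):
    """Return "none", "partial", or "full" following the slide's three cases."""
    # A single prime node over the primary inputs: nothing was decomposed.
    if root.kind == "PRIME" and all(c.kind == "VAR" for c in root.children):
        return "none"
    return "partial" if has_prime(root) else "full"


v = lambda: Node("VAR")
full = Node("AND", [Node("AND", [v(), v()]), Node("XOR", [v(), v()])])
partial = Node("AND", [v(), Node("PRIME", [v(), v(), v()])])
none = Node("PRIME", [v(), v(), v(), v()])
print(classify(full), classify(partial), classify(none))  # full partial none
```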

Computing DSD
- The input is a Boolean function
- The output is a DSD structure
- The structure is unique up to several normalizations:
  - Selection of base functions (elementary gates)
  - Placement of inverters
  - Factoring of multi-input AND/XOR gates
  - Ordering of fanins of AND/XOR gates
  - Ordering of data inputs of MUXes
  - NPN representative of prime nodes
- This computation is fast and reliable
  - Originally implemented with BDDs (Bertacco et al)
  - In a limited form, re-implemented with truth tables
  - Detects about 95% of DSDs of cut functions
- To put DSD computation in perspective
  - For 8-LUT mapping, it takes roughly the same time
    - to compute structural cuts
    - to derive their truth tables
    - to compute DSDs of the truth tables
[Figure: DSD tree of F(a,b,c,d) = ab + cd]
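As a brute-force illustration of what "decomposable" means for one candidate variable group, the classical column-multiplicity test can be sketched as follows. This is not the BDD-based or truth-table-based algorithm the slide refers to; the function name `column_multiplicity` and the index-based variable encoding are assumptions made for this sketch. A function F has a simple disjoint decomposition F = H(G(bound), free) exactly when fixing the bound variables yields at most two distinct cofactors.

```python
from itertools import product


def column_multiplicity(f, n, bound):
    """Count the distinct cofactors of f obtained by fixing the variables in bound.

    f takes n 0/1 arguments; bound is a list of argument indices.
    """
    free = [i for i in range(n) if i not in bound]
    columns = set()
    for b_vals in product((0, 1), repeat=len(bound)):
        x = [0] * n
        for i, v in zip(bound, b_vals):
            x[i] = v
        col = []
        for f_vals in product((0, 1), repeat=len(free)):
            for i, v in zip(free, f_vals):
                x[i] = v
            col.append(f(*x))
        columns.add(tuple(col))
    return len(columns)


def F(a, b, c, d):
    # The slide's example: F(a,b,c,d) = ab + cd
    return (a & b) | (c & d)


print(column_multiplicity(F, 4, [0, 1]))  # bound set {a,b}: 2, so ab can be extracted
print(column_multiplicity(F, 4, [0, 2]))  # bound set {a,c}: 4, so no such decomposition
```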

Pre-computing Non-Disjoint-Support Decompositions
- Enumerate bound sets while increasing size
- Enumerate shared sets while increasing size
- If the bound+shared set is irredundant
  - Add it to the computed set
- Bound+shared set is redundant
  - If a variable can be removed and the resulting set is decomposable
  - Ex: (abCD) is redundant if (abcD) or (abD) is a valid set
[Figure: decompositions of a function into G and H for the sets abCDe, abcD, and abD; capital letters mark shared variables]
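The enumeration above can be sketched by brute force for a 6-input 4:1 MUX. Several things here are assumptions rather than the author's implementation: the MUX pin order, the helper names, the restriction to at least two bound-only variables (so that the composition function has fewer inputs than F), and the reading of the redundancy rule as "some shared variable can be dropped from the set, or kept in the bound set but no longer shared, and the result is still decomposable". Shared variables are printed in capitals, as on the slide.

```python
from itertools import combinations, product

N = 6
NAMES = "abcdef"


def mux41(a, b, c, d, e, f):
    # Assumed pin order: (a,b) = 00 -> c, 10 -> d, 01 -> e, 11 -> f
    return (c, d, e, f)[a + 2 * b]


def decomposable(func, bound_only, shared):
    """For every assignment of the shared variables, the bound-only variables
    must induce at most two distinct cofactors over the remaining variables."""
    free = [i for i in range(N) if i not in bound_only and i not in shared]
    x = [0] * N
    for s_vals in product((0, 1), repeat=len(shared)):
        for i, v in zip(shared, s_vals):
            x[i] = v
        columns = set()
        for b_vals in product((0, 1), repeat=len(bound_only)):
            for i, v in zip(bound_only, b_vals):
                x[i] = v
            col = []
            for f_vals in product((0, 1), repeat=len(free)):
                for i, v in zip(free, f_vals):
                    x[i] = v
                col.append(func(*x))
            columns.add(tuple(col))
        if len(columns) > 2:
            return False
    return True


def redundant(func, bound_only, shared):
    for v in shared:
        rest = tuple(s for s in shared if s != v)
        if decomposable(func, bound_only, rest):  # v dropped from the set
            return True
        demoted = tuple(sorted(bound_only + (v,)))
        if decomposable(func, demoted, rest):  # v kept, but not shared
            return True
    return False


def irredundant_sets(func):
    found = []
    for size in range(2, N):  # bound sets by increasing size
        for bound in combinations(range(N), size):
            for k in range(size - 1):  # shared sets by increasing size
                for shared in combinations(bound, k):
                    bound_only = tuple(i for i in bound if i not in shared)
                    if decomposable(func, bound_only, shared) and not redundant(
                        func, bound_only, shared
                    ):
                        found.append(
                            "".join(
                                NAMES[i].upper() if i in shared else NAMES[i]
                                for i in bound
                            )
                        )
    return found


print(irredundant_sets(mux41))
```

Under these assumptions, ABcd is decomposable but is rejected as redundant because aBcd is already valid, which mirrors the slide's (abCD) example.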

Example of Non-DS Decomposition: Mapping 4:1 MUX into two 4-LUTs
The complete set of support-reducing bound sets for the Boolean function of a 4:1 MUX:
Set  0 : S = 1  D = 3  C = 5  x=Acd    y=xAbef
Set  1 : S = 1  D = 3  C = 5  x=Bce    y=xaBdf
Set  2 : S = 1  D = 3  C = 5  x=Ade    y=xAbcf
Set  3 : S = 1  D = 3  C = 5  x=Bde    y=xaBcf
Set  4 : S = 1  D = 3  C = 5  x=Acf    y=xAbde
Set  5 : S = 1  D = 3  C = 5  x=Bcf    y=xaBde
Set  6 : S = 1  D = 3  C = 5  x=Bdf    y=xaBce
Set  7 : S = 1  D = 3  C = 5  x=Aef    y=xAbcd
Set  8 : S = 1  D = 4  C = 4  x=aBcd   y=xBef
Set  9 : S = 1  D = 4  C = 4  x=Abce   y=xAdf
Set 10 : S = 1  D = 4  C = 4  x=Abdf   y=xAce
Set 11 : S = 1  D = 4  C = 4  x=aBef   y=xBcd
Set 12 : S = 2  D = 5  C = 4  x=ABcde  y=xABf
Set 13 : S = 2  D = 5  C = 4  x=ABcdf  y=xABe
Set 14 : S = 2  D = 5  C = 4  x=ABcef  y=xABd
Set 15 : S = 2  D = 5  C = 4  x=ABdef  y=xABc
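The slide does not define S, D, and C, but the listed values are consistent with S being the number of shared (capitalized) variables, D the number of inputs of x, and C the number of inputs of y. Under that reading, a short sketch can recompute the columns and pick out the sets in which both functions fit a 4-input LUT. The names `sizes` and `fits` are hypothetical.

```python
# (x, y) pairs copied from the listing; capitals mark shared variables.
SETS = [
    ("Acd", "xAbef"), ("Bce", "xaBdf"), ("Ade", "xAbcf"), ("Bde", "xaBcf"),
    ("Acf", "xAbde"), ("Bcf", "xaBde"), ("Bdf", "xaBce"), ("Aef", "xAbcd"),
    ("aBcd", "xBef"), ("Abce", "xAdf"), ("Abdf", "xAce"), ("aBef", "xBcd"),
    ("ABcde", "xABf"), ("ABcdf", "xABe"), ("ABcef", "xABd"), ("ABdef", "xABc"),
]


def sizes(x, y):
    """Return (S, D, C) under the interpretation stated above."""
    shared = sum(ch.isupper() for ch in x)
    return shared, len(x), len(y)


def fits(x, y, lut_sizes=(4, 4)):
    """True if x fits the first LUT and y (which reads x) fits the second."""
    _, d, c = sizes(x, y)
    return d <= lut_sizes[0] and c <= lut_sizes[1]


fitting = [i for i, (x, y) in enumerate(SETS) if fits(x, y)]
print(fitting)  # [8, 9, 10, 11]
```

Only Sets 8 to 11 satisfy both limits, so these are the candidates for realizing the 4:1 MUX with two 4-LUTs.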

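Application to LUT Structure Mapping: Matching 6-input function with LUT structure “44”
[Figure: three cases (Case 1, Case 2, Case 3) of matching a function of inputs a-f against two LUTs implementing G and H, using the sets abcDe and abCde; the last diagram uses H’ in place of H]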

Application to Standard Cell Mapping
- Enumerate decomposable bound sets
- For each bound set, enumerate NPN classes of G and H
- Use them as choice nodes
- Use choice nodes to improve quality of Boolean matching
Property: When non-disjoint-support decomposition is applied, there are exactly M = 2^((2^k)-1) pairs of different NPN classes of decomposition/composition functions, G and H, where k is the number of shared variables.
[Figure: decomposition of F into G and H, with a table of M versus k]
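To get a feel for how quickly M = 2^((2^k)-1) grows, the formula can be tabulated for small k. The function name `num_gh_pairs` is introduced here only for illustration.

```python
def num_gh_pairs(k):
    """M = 2^((2^k) - 1), where k is the number of shared variables."""
    return 2 ** (2 ** k - 1)


for k in range(4):
    print(k, num_gh_pairs(k))
# 0 1
# 1 2
# 2 8
# 3 128
```

With no shared variables there is a single pair, and each additional shared variable roughly squares the count, which is one reason to keep shared sets small.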

Example of a Typical SPF
abc 01> rt 000A115F
abc 02> print_dsd -d
F = F(a,b,c,d,e)
This 5-variable function has 10 decomposable variable sets:
Set 0 : S = 1 D = 3 C = 4 x=abC y=xCde
  0 : 011D{decf}
  1 : 110D{decf}
Set 1 : S = 1 D = 3 C = 4 x=bCd y=xaCe
  0 : !(!d!(cb))
  1 : 1C{bdc} 3407{aecf}
Set 2 : S = 1 D = 3 C = 4 x=abE y=xcdE
  0 : 0153{cdef}
  1 : 5103{cdef}
Set 3 : S = 1 D = 3 C = 4 x=acE y=xbdE
  0 : !(!c!(ea)) 01F3{bdef}
  1 : 1C{ace} F103{bdef}
Set 4 : S = 1 D = 3 C = 4 x=bcE y=xadE
  0 : (c!(!e!b)) (!f )
  1 : 38{bce} 5003{adef}
Set 5 : S = 1 D = 3 C = 4 x=bCe y=xaCd
  0 : !(!e!(cb))
  1 : 1C{bec} 3503{adcf}
Set 6 : S = 1 D = 3 C = 4 x=adE y=xbcE
  0 : (!f!(c!(!e!b)))
  1 : 3007{bcef}
Set 7 : S = 1 D = 4 C = 3 x=abcE y=xdE
  0 : FAC0{abce} (!f!(!ed))
  1 : 05C0{abce} C1{def}
Set 8 : S = 1 D = 4 C = 3 x=aCde y=xbC
  0 : (!f!(cb))
  1 : 03AC{adec} 43{bcf}
Set 9 : S = 1 D = 4 C = 3 x=bcdE y=xaE
  0 : CCF8{bcde} (!f!(ea))
  1 : 33F8{bcde} 43{aef}

abc 01> rt 000A115F
abc 02> pk
Truth table: 000a115f
[Karnaugh map printed by pk, with rows indexed by d e and columns by a b c]

NOTATIONS:
!a is complementation NOT(a)
(ab) is AND(a,b)
[ab] is XOR(a,b)
<abc> is MUX(a, b, c) = ab + !ac
{abc} is PRIME node

Statistics of DSD Manager
abc 01> pub12_16.dsd; dsd_ps
Total number of objects =
Externally used objects =
Non-DSD objects (max =12) =
Non-DSD structures =
Prime objects =
Memory used for objects = MB.
Memory used for functions = MB.
Memory used for hash table = MB.
Memory used for bound sets = MB.
Memory used for array = MB.
  0 : All = 1
  1 : All = 1
  2 : All = 2
  3 : All = 10
  [Values for rows 4 and above, and for the "All" total, are missing]
abc 01> time
elapse: 3.00 seconds, total: 3.00 seconds

This DSD manager was created using cut enumeration applied to *all* MCNC, ISCAS, and ITC benchmark circuits (a total of about 835K AIG nodes). This involved computing 16 priority 12-input cuts at each node. The binary file “pub12_16.dsd” has size 177 MB; the gzipped archive has size 42 MB. Reading it into ABC takes 3 sec. Harvesting the functions contained in this DSD manager took 1 hour.

Typical DSD Structures
NOTATIONS:
!a is complementation NOT(a)
(ab) is AND(a,b)
[ab] is XOR(a,b)
<abc> is MUX(a, b, c) = ab + !ac
{abc} is a PRIME node with hexadecimal
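The notation can be made concrete with a small evaluator. This sketch assumes that the MUX is written with angle brackets, `<abc>`, as in ABC's DSD output, that variables are single lowercase letters, and that the first variable is the least significant bit of the truth table. Prime nodes with hexadecimal truth tables are left out because their variable ordering is not stated on the slide. The names `eval_dsd` and `truth_table` are hypothetical.

```python
def eval_dsd(expr, env):
    """Evaluate a DSD formula built from !, (..) AND, [..] XOR, and <abc> MUX.

    env maps each variable letter to 0 or 1.
    """
    expr = expr.replace(" ", "")
    pos = 0
    closing = {"(": ")", "[": "]", "<": ">"}

    def parse():
        nonlocal pos
        ch = expr[pos]
        pos += 1
        if ch == "!":
            return 1 - parse()
        if ch in closing:
            args = []
            while expr[pos] != closing[ch]:
                args.append(parse())
            pos += 1  # skip the closing bracket
            if ch == "(":
                return int(all(args))
            if ch == "[":
                return sum(args) % 2
            sel, then_val, else_val = args  # <abc> = ab + !ac
            return then_val if sel else else_val
        return env[ch]  # a variable

    value = parse()
    if pos != len(expr):
        raise ValueError("trailing characters in DSD formula")
    return value


def truth_table(expr, names):
    """Pack the function into an integer; names[0] is the least significant input."""
    bits = 0
    for m in range(1 << len(names)):
        env = {v: (m >> i) & 1 for i, v in enumerate(names)}
        bits |= eval_dsd(expr, env) << m
    return bits


print(hex(truth_table("(ab)", "ab")))   # 0x8
print(hex(truth_table("[ab]", "ab")))   # 0x6
print(hex(truth_table("<abc>", "abc")))  # 0xd8
```

For instance, `!(!(ab)!(cd))` evaluates to ab + cd, the function used as the DSD example earlier in the deck.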

Support-Reducing Decompositions
For each support size (S) of NPN classes of non-DSD-decomposable functions
- the columns are ranges of counts of irredundant decompositions
- the entries are percentages of functions in each range
- the last two columns are the maximum and average decomposition counts

LUT Structure Mapping
LUT: LUT count
Level: LUT level count
Time, s: Runtime, in seconds
The last two columns:
- with online DSD computations
- with offline DSD computations (based on pre-computed data)

LUT Level Minimization
6-LUT mapping: Standard mapping into 6-LUTs with structural choices
LUTB: DSD-based LUT balancing proposed in this work
SOPB+LUTB: SOP balancing followed by LUT balancing (ICCAD’11)
LMS+LUTB: Lazy Man’s Logic Synthesis followed by LUT balancing (ICCAD’12)

Conclusions
- Introduced Boolean decomposition
- Proposed exhaustive enumeration of decomposable sets
- Discussed applications to Boolean matching
- Experimented with benchmarks to find a 3x speedup in LUT structure mapping
- Future work will focus on
  - Improving implementation
  - Extending to standard cells
  - Use in technology-independent synthesis

Abstract
A new approach to Boolean decomposition and matching is proposed. It uses enumeration of all support-reducing decompositions of Boolean functions up to 16 inputs. The approach is implemented in a new framework that compactly stores multiple circuit structures. The method makes use of pre-computations performed offline, before the framework is started by the calling application. As a result, the runtime of the online computations is substantially reduced. For example, matching Boolean functions against an interconnected LUT structure during technology mapping is reduced to the extent that it no longer dominates the runtime of the mapper. Experimental results indicate that this work has promising applications in CAD tools for both FPGAs and standard cells.