Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints.

Presentation transcript:

Area and Speed Oriented Implementations of Asynchronous Logic Operating Under Strong Constraints

EUROMICRO DSD 2010, Lille

Outline: asynchronous circuits model used; motivation & proposed method; experimental results; conclusions.

Asynchronous Circuits Model Used. Unbounded delay model: gate and wire delays are not limited. The circuit is able to recognize the moment when input states have changed. Dual-rail encoding: positive and negative values of each signal are provided.
f(0) = 1, f(1) = 0: logic 1
f(0) = 0, f(1) = 1: logic 0
f(0) = 0, f(1) = 0: space state (spacer)
f(0) = 1, f(1) = 1: not allowed
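The dual-rail code table above can be written out as a small Python sketch. The pair order `(f0, f1)` follows the slide's f(0), f(1) notation; the function names `encode` and `decode` are illustrative assumptions, not names from the presentation.

```python
# Dual-rail code words as (f0, f1) pairs, following the slide's table.
SPACER = (0, 0)       # space state
FORBIDDEN = (1, 1)    # not allowed

def encode(bit):
    """Return the (f0, f1) pair for a logic value (hypothetical helper)."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit == 1 else (0, 1)

def decode(pair):
    """Return 1 or 0 for a working state, None for the spacer."""
    if pair == FORBIDDEN:
        raise ValueError("state 11 is not allowed in dual-rail encoding")
    if pair == SPACER:
        return None
    return 1 if pair == (1, 0) else 0
```

Decoding the spacer to `None` keeps "no data yet" distinct from both logic values, which is the property that lets a circuit recognize when its inputs have changed.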

Four-Phase Discipline: inputs in space state (00); inputs in working state (10, 01); outputs in space state (00); outputs in working state (10, 01).
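A minimal sketch of how a trace of dual-rail words could be checked against this discipline, assuming the usual reading that two different working words must be separated by a full spacer. The helper names `classify` and `follows_four_phase` are hypothetical, and the slide itself only lists the four states.

```python
def classify(word):
    """Classify a word (list of (f0, f1) pairs) as space, working, or transition."""
    if any(pair == (1, 1) for pair in word):
        raise ValueError("state 11 is not allowed")
    if all(pair == (0, 0) for pair in word):
        return "space"
    if all(pair in ((1, 0), (0, 1)) for pair in word):
        return "working"
    return "transition"  # some signals valid, some still in spacer

def follows_four_phase(trace):
    """True if no two different working words occur without a spacer between them."""
    last_working = None  # working word seen since the most recent spacer
    for word in trace:
        state = classify(word)
        if state == "space":
            last_working = None
        elif state == "working":
            if last_working is not None and word != last_working:
                return False
            last_working = word
    return True
```

Partially valid words are tolerated as transitions, since under the unbounded delay model the individual signals may arrive in any order.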

Seitz's Constraints. Strong constraints: each output changes its state only when all inputs have changed their state. In contrast, under weak constraints some outputs are permitted to change their state when only some inputs have changed their state.

Seitz's Strong Constraints. Pros: regularity; extra completion detection logic not needed; circuit delay is based on actual gate delays; no additional synchronization chains. Cons: rather high area and delay. Implementations: DIMS (Delay-Insensitive Minterm Synthesis), NCL (Null Convention Logic), Direct Logic.

DIMS (Delay-Insensitive Minterm Synthesis): 2-level implementation; 2^n n-input C-elements + n-input OR; function implemented as sum-of-minterms.
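The structure described above can be sketched as a behavioural Python model: one C-element per minterm, with the C-element outputs ORed into the two output rails. This is an illustration under stated assumptions, not the authors' implementation. The class names `CElement` and `DimsGate` are made up, the rail order `(f0, f1)` with `(1, 0)` meaning logic 1 follows the encoding slide, and the OR stage is modelled simply as `any` over the minterms of each output rail.

```python
from itertools import product

class CElement:
    """Muller C-element: output follows the inputs when they all agree, else holds."""
    def __init__(self):
        self.out = 0

    def update(self, inputs):
        if all(v == 1 for v in inputs):
            self.out = 1
        elif all(v == 0 for v in inputs):
            self.out = 0
        return self.out

class DimsGate:
    """Behavioural model of an n-input DIMS node for a Boolean function func."""
    def __init__(self, func, n):
        self.minterms = list(product((0, 1), repeat=n))
        self.cells = [CElement() for _ in self.minterms]   # 2^n C-elements
        self.is_on = [bool(func(*m)) for m in self.minterms]

    def update(self, pairs):
        """pairs: one (f0, f1) pair per input; returns the output pair."""
        fired = []
        for minterm, cell in zip(self.minterms, self.cells):
            # Pick the rail that is 1 when the input equals the minterm bit.
            rails = [p[0] if bit == 1 else p[1] for p, bit in zip(pairs, minterm)]
            fired.append(cell.update(rails))
        rail_one = int(any(f for f, on in zip(fired, self.is_on) if on))
        rail_zero = int(any(f for f, on in zip(fired, self.is_on) if not on))
        return (rail_one, rail_zero)
```

In this model the output leaves the spacer only once every input is in a working state and returns to it only once every input is back in the spacer, which is the strong-constraint behaviour the earlier slides describe.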

NCL (Null Convention Logic): library of 27 special gates, based on threshold functions. Any function of up to 4 inputs can be implemented, but in dual-rail, 4 inputs = 2 variables only.
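As a rough illustration of a threshold function with state holding, the sketch below models an m-of-n gate that sets its output when at least m inputs are 1 and clears it only when all inputs are 0. The slide does not spell out this hysteresis behaviour, so it is an assumption based on how NCL gates are commonly described; the class name `ThresholdGate` is hypothetical.

```python
class ThresholdGate:
    """m-of-n threshold gate with hysteresis (assumed NCL-style behaviour)."""
    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("need 1 <= m <= n")
        self.m = m
        self.n = n
        self.out = 0

    def update(self, inputs):
        if len(inputs) != self.n:
            raise ValueError("wrong number of inputs")
        ones = sum(inputs)
        if ones >= self.m:
            self.out = 1      # threshold reached: set
        elif ones == 0:
            self.out = 0      # all inputs back to 0: reset
        return self.out       # otherwise hold the previous value
```

With m = n this reduces to a C-element, which is one way to see why threshold gates suit the four-phase discipline.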

Direct Logic: two-level C-OR DIMS logic implemented as a single gate; both positive and complemented outputs are provided; different delays for each input.

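Comparison (table flattened in extraction; cell values are given as extracted). DIMS vs. Direct logic, columns Inputs | DIMS Trans. | DIMS Delay | Direct logic Trans. | Direct logic Delay: N/A90N/A 6896N/A158N/A. NCL 2-input gate, columns Trans. | Delay: AND, OR 215.8; XOR 248.6.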

Multi-Level Dual-Rail Network: positive and complemented values of each signal are provided; each node is implemented as DIMS, NCL, or Direct logic.

Motivation & Proposed Method. State of the art: nodes are implemented as simple gates (NAND, XOR). 4x 2-input gate = 22 * 4 = 88 transistors in Direct logic.

Motivation & Proposed Method. Proposed: nodes are implemented as complex gates. 1x 2-input gate + 1x 3-input gate = 56 transistors.
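The transistor counts on these two slides can be checked with a few lines of arithmetic. The 22-transistor figure for a 2-input Direct logic gate and the totals 88 and 56 come from the slides. The 3-input gate size is not stated there and is inferred here as 56 - 22, and the saving is computed on the assumption that both slides describe the same example circuit.

```python
DIRECT_2IN = 22            # transistors per 2-input Direct logic gate (from the slides)
TOTAL_SIMPLE = 4 * DIRECT_2IN   # four 2-input gates
TOTAL_COMPLEX = 56         # one 2-input gate + one 3-input gate (from the slides)

# Inferred, not stated on the slides: size of the 3-input Direct logic gate.
direct_3in = TOTAL_COMPLEX - DIRECT_2IN

# Relative saving, assuming both mappings implement the same example circuit.
saving = 1 - TOTAL_COMPLEX / TOTAL_SIMPLE

print(f"simple gates:  {TOTAL_SIMPLE} transistors")
print(f"complex gates: {TOTAL_COMPLEX} transistors "
      f"(3-input gate inferred as {direct_3in})")
print(f"saving for this example: {saving:.1%}")
```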

Motivation & Proposed Method. State of the art: nodes are implemented as simple gates (NAND, XOR). Proposed: nodes are implemented as complex gates, i.e. gates with a given number of inputs and any function. Complex gates can be implemented both in DIMS and in Direct logic. They resemble FPGA LUTs, so tools for synchronous synthesis (FPGA mapping) can be used.

Where's the Problem? Facts: increasing the number of node inputs will decrease the number of nodes, decrease the number of levels, increase the node size, and increase the node delay. Question: where is the trade-off?

Experimental Setup. 228 circuits processed (MCNC, ISCAS). Optimized by the ABC choice script. Mapped into k-input NANDs (ABC map command): state of the art (k-NAND). Mapped into k-LUTs (ABC fpga command): complex gates (k-CG). Mapped into MCNC standard cells (ABC map): something in between (SC). k = 2…6. Implemented as DIMS, Direct logic, and NCL.

Results: DIMS, area (two chart slides; figures not included in the transcript).

Results: DIMS, delay (two chart slides; figures not included in the transcript).

Discussion - DIMS. Implementation using arbitrary 2-input gates is the best one, both in area and delay. No big surprise: the complexity (and delay) of DIMS grows exponentially with the number of gate inputs. Results are consistent: the more node inputs, the higher the area and delay.
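The exponential growth can be made concrete with a small count, using only the "2^n n-input C-elements" figure from the DIMS slide and the range k = 2…6 from the experimental setup. The function name `dims_cost` is illustrative, and the OR stage and actual transistor counts are deliberately left out because the transcript does not give them.

```python
def dims_cost(n):
    """Return (C-elements, total C-element inputs) for one n-input DIMS node."""
    c_elements = 2 ** n          # one C-element per minterm
    return c_elements, c_elements * n

for k in range(2, 7):
    cells, pins = dims_cost(k)
    print(f"k={k}: {cells} C-elements, {pins} C-element inputs")
```

Each extra node input doubles the number of C-elements and also widens each of them, so any reduction in node count from larger nodes has to outpace this growth to pay off.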

Results: Direct logic, area (two chart slides; figures not included in the transcript).

Results: Direct logic, delay (two chart slides; figures not included in the transcript).

Discussion - Direct Logic. Implementation using 3-input complex gates is the best one, both in area and delay. This is a good result confirming our theory. Results are consistent, no coincidence. The state-of-the-art 2-NAND implementation is extremely inefficient: 21% area improvement, 19% delay improvement. The 3-CG implementation is even better than NCL: 10% area improvement, 19% delay improvement.

Conclusions. An efficient implementation of asynchronous logic operating under strong constraints was proposed. Tools (and methods) for synchronous synthesis are used for asynchronous synthesis. 3-input complex nodes are implemented using Direct logic. Extensive experiments confirmed the theory: approx. 20% area and delay improvement vs. all state-of-the-art methods.