2004 MAPLD: 153 Brej. Early output logic and Anti-Tokens. Charlie Brej, APT Group, Manchester University.

Presentation transcript:

Early output logic and Anti-Tokens
Charlie Brej, APT Group, Manchester University
2004 MAPLD: 153 Brej

Overview
- Synchronous Problems
- Asynchronous Logic
  - Why?
  - How?
- Solutions
  - Early Output
  - Anti-Tokens

Problems: Communication
- Communication horizon
  - "For a 60 nanometer process a signal can reach only 5% of the die's length in a clock cycle" [D. Matzke, 1997]
- Clock distributed using wave pipelining

Problems: Performance
[Diagram labels: Cycle time; Unbalanced Stages; Clock Skew/Jitter; Transistor Variability; Signal Integrity; Worst – Average case performance; Real Computation; Clock overheads; Timing Assumption overheads]

Clock! What is it good for?
- No arguing with the clock
- 9am - 5pm. No excuses!

Bundled-Data
- When you finish, do the next task
- Flexitime
[Diagram labels: Request + Delay, Acknowledge]

How do you know when you are finished?
- Synchronous:
  - Estimate
  - Global timing reference
- Asynchronous (bundled-data)
  - Estimate
  - Local delay elements
- Asynchronous (delay-insensitive)
  - When the data arrives
  - Intrinsic

Becoming Delay Insensitive
- Dual-Rail
  - Two wires
  - 00 – NULL
  - 01 – Zero
  - 10 – One
  - (11 – Not used)
- Four Phase handshake
  - Return to zero
[Diagram labels: R1, Ack, R0]
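The encoding and handshake listed on this slide can be sketched in Python. This is only an illustration, not the circuit from the talk. The wire ordering (R1, R0), the names `encode`, `decode` and `four_phase_trace`, and the use of `None` for NULL are assumptions made here.

```python
from enum import Enum


class DualRail(Enum):
    """Dual-rail code words, written as (R1, R0). The ordering is an assumption."""
    NULL = (0, 0)  # no data present (spacer)
    ZERO = (0, 1)  # logical 0
    ONE = (1, 0)   # logical 1


def encode(bit):
    """Map None, 0 or 1 to a dual-rail code word."""
    if bit is None:
        return DualRail.NULL
    return DualRail.ONE if bit else DualRail.ZERO


def decode(wires):
    """Map a wire pair back to None, 0 or 1. The pair (1, 1) is not used."""
    if wires == (1, 1):
        raise ValueError("11 is not a valid dual-rail code word")
    return {(0, 0): None, (0, 1): 0, (1, 0): 1}[wires]


def four_phase_trace(bit):
    """Return the (wires, ack) sequence for one four-phase transfer."""
    data = encode(bit).value
    return [
        (data, 0),    # 1. sender raises one rail: data is valid
        (data, 1),    # 2. receiver raises acknowledge
        ((0, 0), 1),  # 3. sender returns both rails to zero (NULL)
        ((0, 0), 0),  # 4. receiver lowers acknowledge
    ]
```

Because NULL is a distinct code word, the receiver can tell from the wires alone when data has arrived, which is what makes the completion signal intrinsic rather than estimated.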

Early Output Logic
- Dual-Rail interfaces
- Output generated as early as possible
- Two Early output cases
- If either input is '0' then the output is '0'
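The rule "if either input is '0' then the output is '0'" matches an AND gate, so a minimal sketch can use a dual-rail AND. The choice of an AND gate, the function name `early_and` and the (R1, R0) tuple encoding are assumptions made for illustration.

```python
# Dual-rail code words as (R1, R0) pairs; the ordering is an assumption.
NULL, ZERO, ONE = (0, 0), (0, 1), (1, 0)


def early_and(a, b):
    """Dual-rail AND that produces its output as early as possible."""
    if a == ZERO or b == ZERO:
        # Early cases: one input alone decides the result,
        # even if the other input is still NULL.
        return ZERO
    if a == ONE and b == ONE:
        # A '1' result needs both inputs to be present.
        return ONE
    return NULL  # not enough information yet
```

For example, `early_and(ZERO, NULL)` already yields `ZERO`, while `early_and(ONE, NULL)` stays `NULL` until the second input arrives.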

Bit level pipelining
- Forward completed parts of the result
- Pace work
- Don't stall parts unless you have to

Early Output cases

Validity
- Unnecessary late inputs
  - Must be acknowledged
  - Must wait until they arrive
- Validity signal
  - Latch generated
  - Ready to be acknowledged
- Result before all inputs present
- Acknowledge after all inputs present
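The split between "result before all inputs present" and "acknowledge after all inputs present" can be sketched by returning a separate validity flag next to the data output. This is a behavioural sketch only. The AND gate, the name `early_and_with_validity` and the rule that validity means "every input is non-NULL" are assumptions, not details taken from the slides.

```python
# Dual-rail code words as (R1, R0) pairs; the ordering is an assumption.
NULL, ZERO, ONE = (0, 0), (0, 1), (1, 0)


def early_and_with_validity(a, b):
    """Return (output, valid) for a dual-rail early-output AND.

    output may become non-NULL before both inputs have arrived.
    valid becomes True only once both inputs are present, which is
    when the inputs can safely be acknowledged.
    """
    if a == ZERO or b == ZERO:
        out = ZERO
    elif a == ONE and b == ONE:
        out = ONE
    else:
        out = NULL
    valid = a != NULL and b != NULL
    return out, valid
```

With `a = ZERO` and `b` still `NULL`, the output is already `ZERO` but `valid` is `False`, so the stage cannot be acknowledged until the late input arrives.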

Synchronisation Hurts
- No need to wait before generating result
- Need to wait for input in order to acknowledge it
- Unnecessary stall

Anti-Tokens
- Unnecessary late inputs
  - Stall the entire stage
- Proactive approach
  - Send a 'cancel' signal backward to the source
  - Acknowledge before data arrives
- Anti-Token latches
  - Assert validity early
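The proactive approach can be sketched as a gate that, once its result is decided, flags every input that has not yet arrived as a candidate for a backward 'cancel'. The AND gate, the name `early_and_with_cancel` and the boolean cancel flags are hypothetical and chosen only to illustrate the idea.

```python
# Dual-rail code words as (R1, R0) pairs; the ordering is an assumption.
NULL, ZERO, ONE = (0, 0), (0, 1), (1, 0)


def early_and_with_cancel(a, b):
    """Return (output, cancel_a, cancel_b) for a dual-rail early-output AND.

    A cancel flag is True when the result is already known and the
    corresponding input is still missing, so an anti-token could be
    sent backward on that input instead of waiting for it.
    """
    if a == ZERO or b == ZERO:
        out = ZERO
    elif a == ONE and b == ONE:
        out = ONE
    else:
        out = NULL
    done = out != NULL
    cancel_a = done and a == NULL
    cancel_b = done and b == NULL
    return out, cancel_a, cancel_b
```

For instance, `early_and_with_cancel(ZERO, NULL)` gives `(ZERO, False, True)`: the result is ready and the late input `b` no longer needs to be waited for.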

Anti-token generation
[Diagram, two animation frames. Labels: 0, 1, C; then 0, A, 1, C]

Anti-token Propagation
[Diagram, two animation frames. Labels: 1, C, A; then 1, C, A, A]

Anti-token Token collisions
[Diagram, two animation frames. Labels include 1, A and ?]
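A collision between a token and an anti-token can be sketched as a small state machine for one latch: whichever arrives second cancels the one already held. This is a sequential, behavioural model under assumed names (`State`, `AntiTokenLatch`). It does not model simultaneous arrival or the circuit-level details from the talk.

```python
from enum import Enum


class State(Enum):
    EMPTY = "empty"
    TOKEN = "token"
    ANTI = "anti-token"


class AntiTokenLatch:
    """Hypothetical latch that can hold either a token or an anti-token."""

    def __init__(self):
        self.state = State.EMPTY
        self.value = None

    def token_arrives(self, value):
        """Return True if the token is stored, False if it was cancelled."""
        if self.state is State.ANTI:
            self.state = State.EMPTY  # token meets waiting anti-token
            return False
        if self.state is State.TOKEN:
            raise RuntimeError("latch already holds a token")
        self.state = State.TOKEN
        self.value = value
        return True

    def anti_token_arrives(self):
        """Return True if a stored token was cancelled, False if the
        anti-token is now held, waiting for a token or to move backward."""
        if self.state is State.TOKEN:
            self.state = State.EMPTY  # anti-token meets stored token
            self.value = None
            return True
        if self.state is State.ANTI:
            raise RuntimeError("latch already holds an anti-token")
        self.state = State.ANTI
        return False
```

In both orders of arrival the latch ends up empty, so the late data never has to be processed by the stage that no longer needs it.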

Remove Unnecessary computation
[Diagram labels: Cycle time; Unbalanced Stages; Clock Skew/Jitter; Transistor Variability; Signal Integrity; Worst – Average case performance; Real Computation; Clock overheads; Timing Assumption overheads; Unnecessary Computation/Delays]

Summary
- Asynchronous
- Delay Insensitive
- Safe
- No timing assumptions
- Average case performance
- Remove unnecessary computation
- Anti-tokens without mutual exclusion units