Advances in Designing Clockless Digital Systems Prof. Steven M. Nowick Department of Computer Science Columbia University New York,

Presentation transcript:

Advances in Designing Clockless Digital Systems Prof. Steven M. Nowick Department of Computer Science Columbia University New York, NY, USA

#2 Introduction
Synchronous vs. Asynchronous Systems?
- Synchronous Systems: use a global clock
  - entire system operates at a fixed rate
  - uses “centralized control”
[Figure label: clock]

#3 Introduction (cont.)
Synchronous vs. Asynchronous Systems? (cont.)
- Asynchronous Systems: no global clock
  - components can operate at varying rates
  - communicate locally via “handshaking”
  - uses “distributed control”
[Figure label: “handshaking interfaces” (channels)]
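The idea of local "handshaking" on a channel can be sketched in a few lines. The slide does not say which protocol is meant, so this sketch assumes a four-phase (return-to-zero) request/acknowledge handshake, one common choice; the function name `four_phase_transfer` and the trace format are hypothetical and only for illustration.

```python
def four_phase_transfer(values):
    """Model a sender and receiver exchanging values over one channel.

    Assumption: four-phase (return-to-zero) handshake. Each transfer is
    req up, ack up, req down, ack down. No clock is involved; each side
    only reacts to the other side's wire.
    """
    req = False      # driven by the sender
    ack = False      # driven by the receiver
    bus = None       # data wires, valid while req is high
    received = []
    trace = []       # (req, ack) after every wire change

    for v in values:
        bus = v                  # sender places data on the bus ...
        req = True               # ... then raises req
        trace.append((req, ack))

        received.append(bus)     # receiver sees req, captures the data ...
        ack = True               # ... then raises ack
        trace.append((req, ack))

        req = False              # sender sees ack and withdraws req
        trace.append((req, ack))

        ack = False              # receiver sees req low and withdraws ack
        trace.append((req, ack))

    return received, trace


if __name__ == "__main__":
    data, wires = four_phase_transfer(["a", "b"])
    print(data)
    print(wires[:4])
```

Each transfer costs four wire transitions here, and both wires return to zero before the next item is sent.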

#4 Trends and Challenges
Trends in Chip Design: next decade
- “Semiconductor Industry Association (SIA) Roadmap” (97-8)
Unprecedented Challenges:
- complexity and scale (= size of systems)
- clock speeds
- power management
- reusability & scalability
- “time-to-market”
Design is becoming unmanageable using a centralized single-clock (synchronous) approach.

#5 Trends and Challenges (cont.)
1. Clock Rate:
- 1980: several MegaHertz
- 2001: ~750 MegaHertz - 1+ GigaHertz
- 2005: several GigaHertz
Design Challenge:
- “clock skew”: clock must be near-simultaneous across entire chip

#6 Trends and Challenges (cont.)
2. Chip Size and Density:
Total #Transistors per Chip: 60-80% increase/year
- ~1970: 4 thousand (Intel 4004 microprocessor)
- today: million
- 2006 and beyond: towards 1 billion+
Design Challenges:
- system complexity, design time, clock distribution
- clock will require cycles to reach across chip

#7 Trends and Challenges (cont.)
3. Power Consumption
- Low power: ever-increasing demand
  - consumer electronics: battery-powered
  - high-end processors: avoid expensive fans, packaging
Design Challenge:
- clock inherently consumes power continuously
- “power-down” techniques: complex, only partly effective

#8 Trends and Challenges (cont.)
4. Time-to-Market, Design Re-Use, Scalability
Increasing pressure for faster “time-to-market”. Need:
- reusable components: “plug-and-play” design
- flexible interfacing: under varied conditions, voltage scaling
- scalable design: easy system upgrades
Design Challenge: mismatch with a central fixed-rate clock

#9 Trends and Challenges (cont.)
5. Future Trends: “Mixed Timing” Domains
Chips themselves are becoming distributed systems:
- contain many sub-regions, operating at different speeds
Design Challenge: breakdown of single centralized clock control

#10 Asynchronous Design: Potential Advantages
Several Potential Advantages:
- Lower Power
  - no clock: components use power only “on demand”
- Robustness, Scalability
  - no global timing: “mix-and-match” variable-speed components
  - composable/modular design style: “object-oriented”
- Higher Performance
  - systems not limited to “worst-case” clock rate
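The "worst-case clock rate" point can be made concrete with a small arithmetic sketch. It assumes a deliberately simplified model that is not from the slides: a synchronous unit spends one clock period, set by the slowest operation, on every operation, while an asynchronous unit finishes each operation after its actual delay plus a fixed handshake overhead. The function names and the delay values are hypothetical.

```python
def synchronous_time(delays):
    """Every operation takes one clock period, sized for the slowest one."""
    return len(delays) * max(delays)


def asynchronous_time(delays, handshake_overhead=0):
    """Every operation takes its own delay plus a per-item handshake cost."""
    return sum(d + handshake_overhead for d in delays)


if __name__ == "__main__":
    # Hypothetical per-operation delays in ns: mostly fast, one slow case.
    delays = [2, 3, 10, 2]
    print(synchronous_time(delays))        # 4 operations * 10 ns = 40
    print(asynchronous_time(delays))       # 2 + 3 + 10 + 2 = 17
    print(asynchronous_time(delays, 1))    # 17 + 4 * 1 = 21
```

The gain depends on how skewed the delay distribution is and on the handshake overhead; with uniform delays and nonzero overhead, the asynchronous total in this model is larger than the synchronous one.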

#11 Asynchronous Design: Some Recent Developments
1. Philips Semiconductors:
- commercial use: 100 million async chips for consumer electronics: pagers, cell phones, smart cards, digital passports, automotive
- 3-4x lower power, less electromagnetic interference (“EMI”)
2. Intel:
- experimental: Pentium instruction-length decoder = “RAPPID” (1990’s)
- 3-4x faster than synchronous subsystem
3. Sun Labs:
- commercial use: high-speed FIFO’s in recent “Ultra’s” (memory access)
4. IBM Research:
- experimental: high-speed pipelines, filters, mixed-timing systems
Recent Startups: Fulcrum, Theseus Logic, Handshake Solutions, Silistrix

#12 Asynchronous CAD Tools: Recent Developments
DARPA’s “CLASS” Program: Clockless Initiative ( )
Goals:
- CAD tool: produce viable commercial-grade async tool flow
- demonstration: a complex Boeing ASIC chip
Participants:
- Lead (PI): Boeing
- Industrial participants:
  - Philips (via async incubated startup, “Handshake Solutions”)
  - Theseus Logic, Codetronix
- Academic participants:
  - Columbia, UNC, UW, Yale, OSU
Targets: cover wide “design space”, from very robust to high-speed circuits
Columbia’s role: (i) high-speed pipelines, (ii) CAD optimizations

#13 Asynchronous Design: Challenges
Critical Design Issues:
- components must communicate cleanly: “hazard-free” design
- highly-concurrent designs: much harder to verify!
Lack of Automated “Computer-Aided Design” Tools:
- most commercial “CAD” tools targeted to synchronous

#14 What Are CAD Tools?
Software programs to aid digital designers = “computer-aided design” tools
- automatically synthesize and optimize digital circuits
[Figure: CAD TOOL. Input: desired circuit specification. Output: optimized circuit implementation]

#15 Asynchronous Design Challenge
Lack of Existing Asynchronous Design Tools:
- Most commercial “CAD” tools targeted to synchronous
- Synchronous CAD tools:
  - major drivers of growth in microelectronics industry
- Asynchronous “chicken-and-egg” problem:
  - few CAD tools <-> less commercial use of async design
  - especially lacking: tools for designing/optimizing large systems

#16 Overview: My Research Areas
CAD Tools for Asynchronous Controllers (FSM’s)
- “MINIMALIST” Package: for synthesis + optimization
Other Research Areas:
- CAD Tools for Designing Large-Scale Async Systems
- Mixed-Timing Interface Circuits:
  - for interfacing sync/async systems
- High-Speed Asynchronous Pipelines

#17 CAD Tools for Async Controllers
MINIMALIST: developed at Columbia University [1994-]
- extensible CAD package for synthesis of asynchronous controllers
- integrates synthesis, optimization and verification tools
- used in 80+ sites/17+ countries (being taught in IIT Bombay)
- URL: http ://
Includes several optimization tools:
- State Minimization
- CHASM: optimal state encoding
- 2-Level Hazard-Free Logic Minimization
- Verilog back-end
Key goal: facilitate design-space exploration

#18 Example: “PE-SEND-IFC” (HP Labs)
Inputs: req-send treq rd-iq adbld-out ack-pkt
Outputs: tack peack adbld
[Figure: controller state diagram. Transition labels as extracted, in reading order: req-send+ treq+ rd-iq+/ adbld+ adbld-out+/ peack+ rd-iq-/ peack- adbld- tack+ adbld-out- treq- rd-id+/ adbld+ adbld-out+/ peack+ rd-iq-/ peack- adbld- tack- adbld-out- treq+ ack-pkt+/ peack+ tack+ ack-pkt- treq-/ peack- tack- treq-/ tack- treq+/ tack+ ack-pkt+/ peack- tack- adbld-out- treq- ack-pkt+/ peack+ req-send-/ -- adbld-out- treq+ rd-iq+/ adbld+]
From HP Labs “Mayfly” Project: B.Coates, A.Davis, K.Stevens, “The Post Office Experience: Designing a Large Asynchronous Chip”, INTEGRATION: the VLSI Journal, vol. 15:3, pp (Oct. 1993)

#19 Example (cont.)
Design-Space Exploration using MINIMALIST: optimizing for area vs. speed

#20 CAD Tools for Large-Scale Asynchronous Systems
Target:
- synthesize distributed control
- 1 controller per functional unit
Input Specification: = “Control Data-flow Graph”
[Figure: graph with nodes Start, Loop, C<0, B:=2dx+dx, C:=X<a, M:=U*X1, X:=X+dx, C:=X<a, Endloop, End]
Target Architecture:
[Figure: control unit (Ctrlr 1, Ctrlr 2, Ctrlr 3) with registers and functional units]
[Theobald/Nowick, IEEE Design Automation Conf. (2001)]

#21 Mixed-Timing Interfaces
[Figure: Asynchronous Domain, Synchronous Domain 1, Synchronous Domain 2]
Goal: provide low-latency communication between “timing domains”
Challenge: avoid synchronization errors

#22 Mixed-Timing Interfaces: Solution
[Figure: Asynchronous Domain, Synchronous Domain 1, Synchronous Domain 2, connected by Async-Sync FIFO, Sync-Async FIFO, Mixed-Clock FIFO’s]
Solution: insert mixed-timing FIFO’s
- provide safe data transfer
- developed complete family of mixed-timing interface circuits
[Chelcea/Nowick, IEEE Design Automation Conf. (2001)]

#23 High-Speed Asynchronous Pipelines
NON-PIPELINED COMPUTATION:
[Figure: SYNCHRONOUS, with global clock; “datapath component” = adder, multiplier, etc.]

#24 High-Speed Asynchronous Pipelines
“PIPELINED COMPUTATION”: like an assembly line
[Figure: SYNCHRONOUS (global clock) vs. ASYNCHRONOUS (no global clock)]

#25 High-Speed Asynchronous Pipelines
Goal: extremely fast async datapath components
- speed: comparable to fastest existing synchronous designs
- additional benefits:
  - dynamically adapt to variable-speed interfaces: voltage scaling!
  - “elastic” processing of data in pipeline
  - no clock distribution
Contributions: 3 new async pipeline styles [Singh/Nowick]
- MOUSETRAP: static logic
- High-Capacity/Lookahead: dynamic logic
Obtain multi-GigaHertz speeds
Used by IBM, currently incorporated into Philips tool flow

#26 MOUSETRAP: A Basic FIFO (no computation)
Stages communicate using transition-signaling:
[Figure: Stage N-1, Stage N, Stage N+1. Stage N contains a Data Latch and a Latch Controller with enable En. Signals: req N, ack N-1, req N+1, ack N, done N, Data in, Data out]
[Singh/Nowick, IEEE Int. Conf. on Computer Design (2001)]

#27 “MOUSETRAP” Pipeline: w/computation
[Figure: Stage N-1, Stage N, Stage N+1, each followed by logic and a matching delay. Stage N contains a Data Latch and a Latch Controller. Signals: req N, ack N-1, req N+1, ack N, done N]
Function Blocks: use “synchronous” single-rail circuits (not hazard-free!)
“Bundled Data” Requirement:
- each “req” must arrive after data inputs valid and stable
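The bundled-data requirement amounts to a one-sided timing inequality: the delay on the "req" path must be at least the worst-case delay of the logic it accompanies. A minimal sketch of that check follows. The safety margin, the function names, and the numbers are assumptions for illustration and do not come from the slides.

```python
def min_req_delay(logic_delay_max, margin=0):
    """Smallest req-path delay that keeps req behind the data.

    logic_delay_max: worst-case delay of the stage's function block.
    margin: extra slack (assumed here) so data is stable before req arrives.
    """
    return logic_delay_max + margin


def bundling_constraint_met(req_delay, logic_delay_max, margin=0):
    """True if req arrives only after the data inputs are valid and stable."""
    return req_delay >= min_req_delay(logic_delay_max, margin)


if __name__ == "__main__":
    # Hypothetical delays in picoseconds.
    print(bundling_constraint_met(req_delay=120, logic_delay_max=100, margin=10))
    print(bundling_constraint_met(req_delay=105, logic_delay_max=100, margin=10))
```

Because the constraint is one-sided, a longer "req" delay is always safe in this model but costs speed, which is why the delay is matched to the logic rather than made arbitrarily large.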

#28

#29 MOUSETRAP: A Basic FIFO
Stages communicate using transition-signaling: 1 transition per data item!
[Figure: Stage N-1, Stage N, Stage N+1. Stage N contains a Data Latch and a Latch Controller with enable En. Signals: req N, ack N-1, req N+1, ack N, done N, Data in, Data out. Annotation: One Data Item]
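The behaviour of the basic FIFO can be sketched as a small discrete-step model. The transcript does not show the latch controller's gate, so the sketch assumes it is a single XNOR of done N and ack N, which makes the latch transparent while the two wires agree and opaque once a new request has passed through. Gate delays are ignored, and the names `run_mousetrap_fifo` and `recv_every` are hypothetical.

```python
def xnor(a, b):
    return a == b


def run_mousetrap_fifo(items, n_stages=3, recv_every=1):
    """Push items through a chain of latch stages using transition signaling.

    done[i] is stage i's latched request; it serves as req to stage i+1
    and as ack to stage i-1. A new item is signalled by toggling a wire,
    never by returning it to zero.
    """
    done = [False] * n_stages
    data = [None] * n_stages
    req_in = False        # sender's request wire
    ack_env = False       # receiver's acknowledge wire to the last stage
    data_in = None
    pending = list(items)
    received = []
    req_toggles = 0
    step = 0

    while len(received) < len(items):
        step += 1
        # Sender: allowed to send once stage 0 has acknowledged the last item.
        if pending and done[0] == req_in:
            data_in = pending.pop(0)
            req_in = not req_in
            req_toggles += 1

        for i in range(n_stages):
            req_i = req_in if i == 0 else done[i - 1]
            d_i = data_in if i == 0 else data[i - 1]
            ack_i = ack_env if i == n_stages - 1 else done[i + 1]
            enable = xnor(done[i], ack_i)      # assumed latch controller
            if enable and req_i != done[i]:
                data[i] = d_i                  # data passes the open latch
                done[i] = req_i                # req passes too; latch now closes

        # Receiver: a possibly slow consumer that acks by toggling ack_env.
        if step % recv_every == 0 and done[-1] != ack_env:
            received.append(data[-1])
            ack_env = done[-1]

    return received, req_toggles


if __name__ == "__main__":
    print(run_mousetrap_fifo([10, 20, 30, 40, 50]))
    print(run_mousetrap_fifo([10, 20, 30, 40, 50], recv_every=3))
```

In this model the sender toggles `req_in` exactly once per item, and a slow receiver makes items queue up one per stage without being lost or reordered.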