Slide 1: Asynchronous Circuit Compilation
Dr. Doug Edwards
Southampton, Oct 99



Slide 2: Overview
- Asynchronous circuits
- Advantages
- Asynchronous Design Paradigms
- Syntax Directed Compilation
  - Handshake Circuits
- Balsa
- Datapath Compilation
- Design Example: DMA Controller

Slide 3: Asynchronous (self-timed) Basics
- Synchronous circuits: a global clock separates system states
  - A time-domain view of system activity
- Asynchronous circuits: input changes separate system states
  - A sequence- or trace-domain view of system activity

Slide 4: Why Asynchronous?
- Low Power
  - data-driven: power is only used to do useful work
  - zero power when idle, with instant restart
- Low EMI
  - In a clocked circuit, all noise is correlated
  - Async circuits have “distributed” switching activity, leading to uncorrelated EMI

Slide 5: Why Asynchronous?
- No clock distribution problems
- Composability/Modularity
  - facilitates IP reuse
- Average Case Performance
  - exploits the fact that the worst case often occurs only infrequently

Slide 6: Timing Models
- Delay Insensitive (DI)
  - Delays in circuits & wires are arbitrary
- Quasi-Delay Insensitive (QDI)
  - Similar to DI but assuming isochronic forks
- Speed Independent (SI)
  - Wires have no delays, arbitrary gate delays
- Bounded Delay
  - Single-sided timing constraints

Slide 7: Asynchronous Design Paradigms
- AFSMs: for fast controllers etc.
  - Traditionally hard
    - hazards, races, state assignment problems
  - Research has led to new techniques
    - STG/Petri net based SI circuits
    - Burst-Mode circuits
- Macromodule-like: for larger systems
  - micropipeline approach, handshake circuits

Slide 8: Asynchronous Control
- With no clock, some other means is required to co-ordinate control flow
- Use a request/acknowledge handshake
[Figure: Sender, with Req and Ack signals]

Slide 9: Signalling Protocols
- req & ack are abstractions: layer a signalling protocol on top of them
- Two common protocols
  - 2-phase (transition signalling, NRZ)
  - 4-phase (Return-to-Zero signalling)

Slide 10: Data Validity Models
- Self Timed
  - The validity of the data is encoded within the data itself (redundant coding)
  - e.g. Dual Rail: each data bit requires two wires. 00 -> no data, 01 -> ‘0’, 10 -> ‘1’
- Bundled Data approach
  - conventional datapath
  - validity is assured by imposing timing constraints
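The dual-rail code on this slide can be illustrated with a small Python model. This is only a sketch: the function names are invented for illustration, the pair is assumed to be ordered so that the first wire signals '1' and the second signals '0' (matching 01 -> '0', 10 -> '1'), and the unused pattern 11 is treated as not valid.

```python
# Dual-rail code from the slide: 00 -> no data, 01 -> '0', 10 -> '1'.
# Assumption: each pair is (one_rail, zero_rail); 11 is never produced.

NO_DATA = (0, 0)

def encode_bit(bit):
    """Return the dual-rail pair for a single data bit."""
    return (1, 0) if bit else (0, 1)

def encode_word(value, width):
    """Encode an unsigned integer as dual-rail pairs, least significant bit first."""
    return [encode_bit((value >> i) & 1) for i in range(width)]

def is_valid(pairs):
    """Completion detection: valid once every pair carries exactly one active rail."""
    return all(pair in ((0, 1), (1, 0)) for pair in pairs)

def decode_word(pairs):
    """Decode dual-rail pairs back to an integer; refuse incomplete words."""
    if not is_valid(pairs):
        raise ValueError("word is not valid (some bit still shows no data)")
    return sum(one_rail << i for i, (one_rail, _) in enumerate(pairs))
```

A receiver in this model needs no separate timing assumption: it waits until `is_valid` holds and only then calls `decode_word`, which is the sense in which validity is encoded within the data itself.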

Slide 11: 2-phase Protocol
- Events are transitions
[Figure: timing diagram of Req and Ack; labels: valid, 1 transaction]

Slide 12: 4-phase protocol
- Signals are returned to initial state after each transaction
  - Several possible interleavings of the signal transitions
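The two protocols can be compared by listing the wire events for a few transactions. The Python sketch below is illustrative only; the function names are invented, and for 4-phase it shows just one of the possible interleavings (request up, acknowledge up, request down, acknowledge down).

```python
def two_phase(transactions):
    """2-phase: every transition is an event, so one transaction is two transitions."""
    req = ack = 0
    events = []
    for _ in range(transactions):
        req ^= 1                      # sender toggles req
        events.append(f"req={req}")
        ack ^= 1                      # receiver toggles ack
        events.append(f"ack={ack}")
    return events

def four_phase(transactions):
    """4-phase: both wires return to zero, so one transaction is four transitions."""
    events = []
    for _ in range(transactions):
        events += ["req+", "ack+", "req-", "ack-"]
    return events
```

In this model a 2-phase transaction costs two transitions and leaves the wires in alternating states, while a 4-phase transaction costs four transitions and always ends with both wires low.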

Slide 13: Comparison of Approaches
- 2-phase/4-phase
  - 2-phase conceptually simpler (once an event mind-set is adopted)
  - 2-phase circuits slower & more complex
  - think 2-phase, build 4-phase
- Bundled-Data/Dual-rail
  - Current orthodoxy: bundled data is faster, lower power, smaller area, with tolerancing task no worse than for a clocked design

Slide 14: Current Approach
- QDI control
- Bounded-Delay (bundled-data) datapath
- 4-phase signalling
Amulet3i

Slide 15: Asynchronous HDLs
- Conventional programming languages lack 3 necessary constructs:
  - communication
  - parallelism/concurrency
  - sharing (of hardware)
- Conventional HDLs lack adequate
  - fine-grain concurrency
  - channel based communication primitives

Slide 16: Asynchronous HDLs (2)
- Tangram, Balsa
  - CSP based + data types + …
  - based on underlying formal semantics
    - guarantees correct composition rules
    - easier composition than in sync circuits???
  - transparent compilation
    - each production rule in the language translates to an intermediate handshake circuit
    - allows designer to infer circuit costs & performance from the program

Slide 17: Handshake Circuits (1)
- Circuits communicate along channels
- Channels connect ports at circuit interface
- Ports have:
  - Type
  - Direction
  - Sense

Slide 18: Handshake Circuits (2)
- Port type determines the number of data wires
  - no data wires == control only port!
- Port direction is input, output or control only
- Port sense
  - Active: initiates transfers
  - Passive: responds to requests
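The three port attributes can be captured in a small data structure. The Python sketch below is a hypothetical model, not part of any Balsa tool: the class and function names are invented, and the connection rule (a channel joins one active and one passive port of equal width and complementary direction) is stated here as an assumption drawn from the usual handshake-circuit convention rather than from the slides.

```python
from dataclasses import dataclass
from enum import Enum

class Direction(Enum):
    INPUT = "input"
    OUTPUT = "output"
    CONTROL = "control only"

class Sense(Enum):
    ACTIVE = "active"     # initiates transfers
    PASSIVE = "passive"   # responds to requests

@dataclass(frozen=True)
class Port:
    name: str
    data_wires: int       # determined by the port type; 0 means control only
    direction: Direction
    sense: Sense

    def __post_init__(self):
        # A port has no data wires exactly when it is a control-only port.
        if (self.data_wires == 0) != (self.direction is Direction.CONTROL):
            raise ValueError("control-only ports have exactly zero data wires")

def can_connect(p, q):
    """Assumed rule: equal width, one active and one passive end, matching directions."""
    if p.data_wires != q.data_wires:
        return False
    if {p.sense, q.sense} != {Sense.ACTIVE, Sense.PASSIVE}:
        return False
    if p.direction is Direction.CONTROL:
        return q.direction is Direction.CONTROL
    return {p.direction, q.direction} == {Direction.INPUT, Direction.OUTPUT}
```

For example, a 16-wire active output can be joined to a 16-wire passive input, whereas two active ports cannot be joined directly in this model.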

Slides 19-26: Micropipeline-Style Circuits: Push Circuits
[Figure, repeated on each slide: circuit (cct) with a passive input (req, ack, data) and an active output (req, ack, data)]
Animation steps, one per slide:
1. Circuit waits for data
2. Data arrives
3. Data validity signalled
4. Circuit accepts data
5. Circuit signals data taken
6. Circuit outputs data
7. Circuit signals validity
8. Receiver takes data

Slide 27: Micropipeline-Style Circuits
- 4-phase protocol not detailed
- Previous circuit decoupled input and output
  - implies a latch inside the handshake circuit
- An alternative is for the input handshake to enclose the output handshake

Slides 28-34: Enclosed Handshake: Push Circuits
[Figure, repeated on each slide: circuit (cct) with req, ack and data on the input and output sides]
Animation steps, one per slide:
1. Data arrives
2. Data validity signalled
3. Circuit accepts data
4. Circuit outputs data
5. Circuit signals validity
6. Receiver takes data
7. Input handshake completes. No latch required

Slides 35-38: Tangram Style Circuits: Pull Circuits
- active ported circuits / control driven
[Figure, repeated on each slide: circuit (cct) with an active input port; req, ack and data on each side]
Animation steps, one per slide:
1. Circuit demands data
2. Data is sent on demand
3. Data is accepted and can then be released

Slide 39: Balsa
- Language for synthesising large async circuits & systems
- CSP/OCCAM background
- Tangram-like
  - based on Tangram compilation function
  - compiles to a small (but expanding) set of handshake circuits
  - origins: ESPRIT EXACT project

Slide 40: Balsa Language Features
- Data types based on sequence of bits
  - Arrays and records are bit-based
  - Element extraction is by array slicing
  - Strict data typing
- Structural iteration
- Arrayed channels
- Parameterised & recursive functions

Slide 41: Balsa Language Features
- Enclosed selection semantics
  - Allows passive ported circuits
  - Allows push (micropipeline-style) circuits
  - Allows unbuffered (latch-free) circuits
  - Can be considered a restricted form of Burns’ probe construct

Slide 42: Balsa Source

Slide 43: Example: Single Place Buffer

    import [balsa.types.basic]
    public
    type word is 16 bits
    procedure buffer (input i : word; output o : word) is
      local variable x : word
    begin
      loop
        i -> x ;  -- Input communication
        o <- x    -- Output communication
      end
    end

Annotations on the slide:
- import [...]: library mechanism
- public: visibility
- type word: type declaration
- input i, output o: channel declarations
- procedure buffer: procedure definition
- local variable x: implies latch
- loop: repeat forever
- ; : sequential operation
- i -> x: read input channel into local variable x
- o <- x: output local variable x to output channel

Slides 44-57: Buffer Handshake Circuit (single-place buffer)
[Figure, repeated on each slide: handshake circuit with activation channel, repeater (#), sequencer (;), two transferrers (T), variable (x), and ports i and o]
Animation steps, one per slide after the first:
1. Repeater is activated
2. Sequencer handshakes to left transferrer
3. Transferrer requests data from environment
4. Data transferred to variable x
5. Variable handshake completes
6. Transferrer handshake completes to environment
7. Transferrer handshake completes
8. Sequencer handshakes to right transferrer
9. Transferrer reads variable
10. Transferrer outputs to environment
11. Handshakes complete
12. Sequencer completes its input handshake
13. Repeater initiates another transfer, etc.

Slide 58: Example: Single Place Buffer

    import [balsa.types.basic]
    public
    type word is 16 bits
    procedure buffer (input i : word; output o : word) is
      local variable x : word
    begin
      loop
        i -> x ;  -- Input communication
        o <- x    -- Output communication
      end
    end

Slide 59: Example: 2-place buffer

    import [balsa.types.basic]
    import [buffer1a]
    public
    type word is 16 bits
    procedure buffer2c (input i : word; output o : word) is
      local channel c : word
    begin
      buffer (i, c) || buffer (c, o)
    end

Annotations on the slide:
- || : parallel composition
- buffer (...): reuse component
- local channel c: internal channel connects two 1-place buffers
- buffers connected by common signal name
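The behaviour of the two composed buffers can be imitated in Python with threads and an unbuffered channel. This is a behavioural sketch only, not a model of the compiled handshake circuit: `Channel`, `buffer` and `run_buffer2` are names invented here, and the Balsa `loop` is bounded to a fixed number of transfers so that the threads terminate.

```python
import threading

class Channel:
    """Unbuffered channel: send() returns only after recv() has taken the value."""

    def __init__(self):
        self._value = None
        self._request = threading.Semaphore(0)      # sender -> receiver
        self._acknowledge = threading.Semaphore(0)  # receiver -> sender
        self._senders = threading.Lock()            # one transfer at a time

    def send(self, value):
        with self._senders:
            self._value = value
            self._request.release()      # signal that data is valid
            self._acknowledge.acquire()  # wait until the receiver has taken it

    def recv(self):
        self._request.acquire()
        value = self._value
        self._acknowledge.release()
        return value

def buffer(i, o, count):
    """Mirrors 'loop i -> x ; o <- x end', bounded to count transfers."""
    for _ in range(count):
        x = i.recv()   # i -> x
        o.send(x)      # o <- x

def run_buffer2(values):
    """Mirrors 'buffer (i, c) || buffer (c, o)'; returns what arrives on o."""
    i, c, o = Channel(), Channel(), Channel()
    n = len(values)

    def producer():
        for v in values:
            i.send(v)

    threads = [
        threading.Thread(target=buffer, args=(i, c, n)),
        threading.Thread(target=buffer, args=(c, o, n)),
        threading.Thread(target=producer),
    ]
    for t in threads:
        t.start()
    received = [o.recv() for _ in range(n)]
    for t in threads:
        t.join()
    return received
```

Calling `run_buffer2([1, 2, 3])` passes the values through both stages in order. The shared `Channel` object `c` plays the role of the internal channel that connects the two 1-place buffers.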

Slide 60: 2-place Buffer Handshake Circuit
[Figure labels: B, i, o, c, par component, passivator]

Slide 61: 2-place Buffer Handshake Circuit
[Figure: two single-place buffer circuits, each with repeater (#), sequencer (;), variable (x) and two transferrers (T); other labels: i, o, c, par component, passivator]

Slide 62: Peephole Optimisation
- Composition of handshake circuits leads to inefficiencies at circuit boundaries
- Straightforward peephole optimizations

Slide 63: 2-place Buffer Handshake Circuit
[Figure: same circuit as on slide 61]

Slide 64: Optimized 2-place Buffer Circuit
[Figure: two buffer circuits with repeaters (#), sequencers (;), variables (x) and transferrers (T); labels: i, control-only]

Slide 65: The Repeater
- “Formal” Definition
  $\mathrm{REP}(a^{\circ}, b^{\bullet}) = (a^{\circ} : \#[b^{\bullet}])$
  - $\bullet$ denotes active port
  - $\circ$ denotes passive port
  - $\#$ denotes repeat
  - $:$ denotes handshake enclosure

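Slide 66: The Repeater
- “Formal” Definition
  $\mathrm{REP}(a^{\circ}, b^{\bullet}) = (a^{\circ} : \#[b^{\bullet}])$
  $= (a^{\circ}{\uparrow} : \#[b^{\bullet}{\uparrow} ; b^{\bullet}{\downarrow}])$
  $= (a_r{\uparrow} : \#[b_r{\uparrow} ; b_a{\uparrow} ; b_r{\downarrow} ; b_a{\downarrow}])$
[Figure labels: $b_r$, $b_a$, $a_r$, $a_a$]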
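The wire-level behaviour of the repeater can be sketched as an event generator in Python. The sketch assumes the 4-phase reading of the definition: after the activation request rises on a_r, the handshake on b (b_r up, b_a up, b_r down, b_a down) repeats indefinitely, and the activation is never acknowledged. The function name and event strings are illustrative only.

```python
from itertools import islice

def repeater_events():
    """Yield the assumed event order of a repeater, one wire transition at a time."""
    yield "a_r+"   # activation request; a_a never follows because the loop never ends
    while True:
        # One complete 4-phase handshake on the active port b.
        yield from ("b_r+", "b_a+", "b_r-", "b_a-")

def first_events(count):
    """Convenience helper: the first count events as a list."""
    return list(islice(repeater_events(), count))
```

Since the enclosed body repeats without end, the enclosing handshake on a never completes in this model, which is why no event on a_a ever appears in the trace.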

Slide 67: The Transferrer
- Several Implementations
  - simplest: wire-only
[Figure labels: a_r, a_a, b_r, b_a, c_r, c_a, data[n]]

Slide 68: Balsa Toolkit (1)
- balsa-c
  - The compiler for the language
- breeze2dot
  - Produces a postscript plot of the generated handshake circuits
- breezecost
  - Reports the cost of the compiled circuit in arbitrary units

Slide 69: Balsa Toolkit (2)
- breeze2lard
  - The interface to the LARD simulation environment
    - balsa source is translated to LARD
    - simple test harness is generated
- balsa-md
  - An automatic makefile generation facility
- balsa-mgr
  - A GUI project manager

Slide 70: Mod-16 Counter (all even)

Slide 71: Bundled-Data Datapaths
- Problems
  - random standard cell layout
    - mixed control + datapath
  - timing analysis required
  - robustness of design reduced
- Possible Solutions
  - DI codes
  - hybrid bundled + DI
  - simpler timing analysis

Slide 72: DI Codes
- Dual Rail (used in 1st Tangram system)
  - Can use standard cell approach without timing analysis
    - no need to distinguish between control & data
  - abandoned in favour of bundled-data
    - area cost in extra wires
    - area & time cost in completion detection
  - Tangram/Balsa generates push-pull pipelines with expensive synchronization

Slide 73: Generic Pipeline
- Passivators join compiled procedures
[Figure labels: B, i, B, o, c, passivator]

Slide 74: Passivator Implementation
- Bundled Data
- Dual Rail
[Figure labels, bundled data: a_r, a_a, b_r, b_a, data[n], C]
[Figure labels, dual rail: d_0, d_1, ..., d_n-1 (n bits wide), b_r, b_a, C, n-wide C-gate]

Slide 75: DI Code Synchronizations
- Expensive
  - need C-element synchronisation tree
- A partial solution (not always possible/desirable) is:
  - transform to push-style datapath
    - (not possible in Tangram, only in Balsa)
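The cost of such a synchronisation tree can be estimated with a short Python sketch. It assumes the standard Muller C-element behaviour (the output follows the inputs when they all agree and otherwise holds its value) and, as a further assumption not taken from the slides, a tree built from 2-input C-elements. The names `CElement` and `tree_cost` are invented for illustration.

```python
class CElement:
    """Muller C-element: output changes only when all inputs agree."""

    def __init__(self, state=0):
        self.state = state

    def update(self, *inputs):
        if all(inputs):
            self.state = 1        # all inputs high: output rises
        elif not any(inputs):
            self.state = 0        # all inputs low: output falls
        return self.state         # otherwise the previous value is held

def tree_cost(signals):
    """Return (elements, levels) for a tree of 2-input C-elements joining the signals."""
    elements = levels = 0
    width = signals
    while width > 1:
        pairs = width // 2
        elements += pairs
        width = pairs + (width % 2)   # an odd signal passes up to the next level
        levels += 1
    return elements, levels
```

Under these assumptions, `tree_cost(16)` gives 15 elements in 4 levels, which illustrates both the area cost and the time cost that the slide describes as expensive.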

Slide 76: Push Pipeline
[Figure labels: B, i, B, o, c, Passive input port, connector (wires-only)]

Slide 77: Hybrid Solutions
- Use DI coding within bundled datapath framework
  - e.g. use dual-rail carry signals within a conventional adder
    - early completion easily detected
- Average-case performance
- Only applicable to a few datapath operations
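Why a dual-rail carry gives average-case performance can be seen by measuring how far a carry actually has to ripple for given operands. The Python sketch below is an illustration under a simplifying assumption: the completion time of a ripple-carry addition is taken to be proportional to the longest run of bit positions that must wait for an incoming carry. The function name is invented for illustration.

```python
def longest_carry_chain(a, b, width):
    """Longest run of 'propagate' positions when adding a and b over width bits."""
    longest = run = 0
    for i in range(width):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        if a_bit != b_bit:
            # Propagate: the carry out of this position depends on the carry in.
            run += 1
            longest = max(longest, run)
        else:
            # Generate (1,1) or kill (0,0): the carry out is known locally.
            run = 0
    return longest
```

For example, adding 0xFFFF and 0x0000 over 16 bits propagates through all 16 positions, which is the worst case, while most operand pairs produce much shorter chains. A completion signal derived from dual-rail carries lets the adder finish as soon as the actual chain has settled.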

Slide 78: Simpler Timing Analysis
- Separate control and datapath
  - generate regular, compiled, datapath
    - area improvement over standard cell (because of regular layout)
    - generate matched delay paths (c.f. self-timed PLAs)
  - must be able to recognize datapath
    - difficult: control often contains datapath-like elements
    - e.g. start at variables and work backwards...

Slide 79: Datapath meets Control
- Example: Balsa case statement
[Figure labels: data “n” bits wide; true/complement lines: dual-rail expansion; 1-hot encoding]

Slide 80: Case Component
- input from datapath
  - dual-rail simplifies internal logic
- expansions parameterisable
- “encode” component is similar
  - opposite of case with true/false expansion
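The two stages named for the case statement, expansion of the n-bit input into true/complement lines followed by 1-hot encoding, can be sketched in Python. This is a logical model only and says nothing about the actual gate-level component; the function names are invented for illustration.

```python
def dual_rail_expand(value, width):
    """Return a (true, complement) line pair for each bit, least significant first."""
    return [((value >> i) & 1, 1 - ((value >> i) & 1)) for i in range(width)]

def one_hot(value, width):
    """Decode a width-bit value into 2**width lines, exactly one of which is 1."""
    lines = dual_rail_expand(value, width)
    outputs = []
    for k in range(2 ** width):
        # Output k is an AND over the bits: the true line where k has a 1,
        # the complement line where k has a 0. No inverters are needed.
        selected = (lines[i][0] if (k >> i) & 1 else lines[i][1] for i in range(width))
        outputs.append(int(all(selected)))
    return outputs
```

Because both polarities of every bit are already available after the expansion, each output needs only a single AND term in this model, which is one way to read the remark that dual-rail simplifies the internal logic.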

Slide 81: Simpler Timing Analysis
- Tool support required
  - use existing (non-Balsa) tools if possible
  - automatically add matched paths/delays to synthesised datapaths
- Design own cells where appropriate
  - e.g. hybrid stages

Slide 82: Future Work
- Provide support for DI, hybrid and datapath-compiled datapaths
  - even with datapath compilation, some datapath would still be standard cell
    - e.g. instruction decoder (control heavy)
    - datapath in control
  - cost of connecting separate blocks in layout
- Test Design required (datapath heavy)

Slide 83: Tool Enhancement
- balsa-c
  - support for attribution to select compilation mechanisms/optimisation schemes
- breeze2lard
  - new models
- balsa-netlist
  - new tech-mapping descriptions
  - interface to datapath compilers

Slide 84: AMULET3i
- Asynchronous macrocell
  - ARM compatible processor core
  - Full custom RAM
  - Compiled ROM
  - Balsa compiled DMA controller
  - Test I/F, synchronous and off-chip bus bridges
- Synchronous peripherals
  - Designed by commercial partner...

Slides 85-87: AMULET3 System
[Figure, repeated on each slide: block diagram with CPU / RAM, ROM, DMAC, Periph1, Sync bridge, MARBLE, SOCB]
- Slide 85: AMULET3 System
- Slide 86: DMA Local RAM Access
- Slide 87: DMA Peripheral Accesses (additional label: DMA requests)

Slide 88: Requirements / Specification
- 16 clients, 32 channels
- 3 channel types: complicated register structure
- Programmable client-to-channel (1-to-many) mapping
- Support synchronous requests
- Transfers mostly between synchronous clients

Slide 89: Controller Structure

Slide 90: Two Controller Descriptions
- Sequential (previous slides)
  - Very simple control flow
  - Requires two passes through register bank
  - Slow! Only memory decoupling helps
- Parallel (next slides)
  - Decouple TE actions from memory R/W with a new unit: Transfer Interface
  - Interrupt the register bank on end of transfer

Slide 91: “Parallel” Design

Slide 92: The Design
- 919 lines of Balsa describing register bank control, TE and TI
- Custom register banks and Synchronous Peripheral Interface
- Miscellaneous glue standard cells
  - Register bank controllers
  - MARBLE interfaces
- Compass Design Automation CAD

Slide 93: Implementation Technology
- 0.35 µm, 3LM CMOS
- Standard cells from ARM Ltd.
- Locally designed complex gates and asynchronous elements/gates
- Automated standard cell P&R
- Only “essential” and simple gate level optimisation (by hand)

Slides 94-97: Design Partitioning
Each slide labels one part of the design:
1. Marble BUS: outside of DMA controller
2. Balsa synthesised standard cells
3. Custom “regular” layout
4. Hand designed standard cells

Slide 98: DMA Controller Floor-Plan