© 2003-2009 Ran Ginosar. 048878 VLSI Architectures, Lecture 4: Speed-Independent Control Circuits (S&F Ch. 6: Speed-Independent Control Circuits).

Presentation transcript:

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 1 VLSI Architectures Lecture 4 S&F Ch. 6: Speed-Independent Control Circuits

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 2 Control Circuits
Many methods, tools, assumptions, frameworks
Some tools generate both control and datapath
We now consider one method / tool for control-only spec, verification and synthesis:
– Assumption: Speed-independent control for speed-independent bundled data
– Method: Signal Transition Graph (a special type of Petri net)
– Tool: Petrify (mostly from Jordi Cortadella at UPC)

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 3 Some definitions… Fundamental mode: –When all inputs, outputs and internal nodes are stable, environment may change ONE input. Input / output mode: –When all inputs and outputs are stable, environment may change inputs. Guess which one is more realistic…

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 4 Delay Models
Fixed delay: d = c
Min-max delay: choose d ∈ [m, M]
– Max delay: d ∈ [0, M]
– Low-bounded delay: d ∈ [m, ∞)
Unbounded delay: d ∈ [0, ∞)
Inertial delay: Glitches are filtered.
– We don’t assume inertial delays in async control design
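The delay models above differ only in the interval from which a delay d may be drawn. The following is a minimal sketch of that idea; the function names `delay_range` and `admissible` and the parameter defaults are illustrative assumptions, not part of the lecture.

```python
import math

def delay_range(model, c=0.0, m=0.0, M=math.inf):
    """Return the (low, high) bounds on a delay d for the named model."""
    ranges = {
        "fixed": (c, c),                # d = c
        "min-max": (m, M),              # d in [m, M]
        "max": (0.0, M),                # d in [0, M]
        "low-bounded": (m, math.inf),   # d in [m, inf)
        "unbounded": (0.0, math.inf),   # d in [0, inf)
    }
    return ranges[model]

def admissible(d, model, **params):
    """True if a finite delay d is allowed under the given model."""
    low, high = delay_range(model, **params)
    return low <= d <= high and d < math.inf

print(admissible(2.5, "min-max", m=1.0, M=3.0))  # True
print(admissible(0.5, "min-max", m=1.0, M=3.0))  # False
```

Speed-independent design corresponds to the "unbounded" entry for gates: any finite, non-negative delay must be tolerated.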

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 5 Petri net, Signal Transition Graph (STG) [Figure: a Petri net and the corresponding STG, both over the transitions a+, b+, c+, a-, b-, c-]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 6 Separating Inputs, Outputs and Internals [Figure: the same Petri net and STG over a+, b+, c+, a-, b-, c-, with transitions marked as INPUTS or OUTPUTS]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 7 In the STG:
PN transitions are signal transitions
PN places and arcs are causal relations
Simple places (one arc in, one arc out) are omitted
MARKING: an assignment of tokens to places, which represents the state of the circuit

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 8 Token Preservation (from Lecture #1)
Tokens do not disappear
Tokens do not appear (from nowhere)
One token does not overtake another
A transition with n inputs and m outputs:
– Waits for n tokens on inputs
– Generates m tokens on outputs
[Figure: a transition with n input arcs and m output arcs]
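The firing rule above (wait for n tokens, generate m tokens) can be sketched in a few lines. This is only an illustration: the marking is modelled as a dictionary from place names to token counts, and the place names `p1`, `p2`, `p3` are hypothetical.

```python
def enabled(marking, inputs):
    """A transition is enabled when every input place holds at least one token."""
    return all(marking.get(p, 0) >= 1 for p in inputs)

def fire(marking, inputs, outputs):
    """Consume one token from each input place, produce one on each output place."""
    if not enabled(marking, inputs):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for p in inputs:
        new[p] -= 1
    for p in outputs:
        new[p] = new.get(p, 0) + 1
    return new

# A join: n = 2 input places, m = 1 output place.
m0 = {"p1": 1, "p2": 1, "p3": 0}
m1 = fire(m0, ["p1", "p2"], ["p3"])
print(m1)  # {'p1': 0, 'p2': 0, 'p3': 1}
```

Note that the number of tokens is not conserved by a single firing when n differs from m; the "preservation" on the slide is about tokens never appearing or vanishing outside this rule.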

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 9 State Graph (SG)
State: abc, initially 000
SG more complex than STG
What complexity?
Bad for design spec
Needed for synthesis
[Figure: state graph with edges labeled a+, b+, c+, a-, b-, c-]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 10 State Graph (SG) Quiescent Region QR(c=0) Quiescent Region QR(c=1) Excitation Region ER(c=R) Excitation Region ER(c=F) a+b+ a+ c+ a-b- a- c-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 11 State Graph (SG) Quiescent Region QR(c=0) Quiescent Region QR(c=1) Excitation Region ER(c=R) Excitation Region ER(c=F) a+b+ a+ c+ a-b- a- c- SET C RESET C KEEP C=1 KEEP C=0
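To make the relation between the STG, the state graph and the regions concrete, here is a hedged sketch that enumerates the state graph of a C-element STG by playing the token game. The arc list and initial marking are assumptions (the usual C-element STG with inputs a, b and output c), since the figure itself is not reproduced in this transcript. The set-based marking also assumes the net is 1-bounded.

```python
from collections import deque

# Assumed C-element STG: each arc is an implicit place holding at most one token.
ARCS = [("a+", "c+"), ("b+", "c+"), ("c+", "a-"), ("c+", "b-"),
        ("a-", "c-"), ("b-", "c-"), ("c-", "a+"), ("c-", "b+")]
TRANSITIONS = sorted({t for arc in ARCS for t in arc})
INITIAL_MARKING = frozenset({("c-", "a+"), ("c-", "b+")})  # a+ and b+ enabled
INITIAL_STATE = (0, 0, 0)                                  # values of a, b, c
SIGNALS = "abc"

def enabled(marking, t):
    """t is enabled when all of its incoming arcs carry a token."""
    return all(arc in marking for arc in ARCS if arc[1] == t)

def fire(marking, state, t):
    """Move tokens across t and update the signal that t changes."""
    consumed = {arc for arc in ARCS if arc[1] == t}
    produced = {arc for arc in ARCS if arc[0] == t}
    values = list(state)
    values[SIGNALS.index(t[0])] = 1 if t[1] == "+" else 0
    return frozenset((marking - consumed) | produced), tuple(values)

def state_graph():
    """Breadth-first exploration of all reachable (marking, state) pairs."""
    start = (INITIAL_MARKING, INITIAL_STATE)
    seen, edges, queue = {start}, [], deque([start])
    while queue:
        marking, state = queue.popleft()
        for t in TRANSITIONS:
            if enabled(marking, t):
                nxt = fire(marking, state, t)
                edges.append((state, t, nxt[1]))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return {s for _, s in seen}, edges

states, edges = state_graph()

# Regions for output c, named as on the slide.
er_rise = {s for s, t, _ in edges if t == "c+"}       # ER(c=R): SET C
er_fall = {s for s, t, _ in edges if t == "c-"}       # ER(c=F): RESET C
qr_one = {s for s in states if s[2] == 1} - er_fall   # QR(c=1): KEEP C=1
qr_zero = {s for s in states if s[2] == 0} - er_rise  # QR(c=0): KEEP C=0

print(len(states), "states,", len(edges), "edges")  # 8 states, 10 edges
```

Under these assumptions the six-transition STG expands into eight states, which illustrates why the SG is the less convenient form for specification but the natural one for deriving logic.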

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 12 Synthesis: C Element Two methods: –Using gates –Using SR latch

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 13 Synthesis: C Element using gates [Figure: state graph with the SET C and KEEP C=1 regions marked, and the Karnaugh map of c over c\ab with entries 0, R, F] c = ab + ac + bc ! Hazardous Material !
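The gate equation c = ab + ac + bc can be tabulated directly. The sketch below only evaluates the Boolean function (the helper name `c_next` is an assumption); it says nothing about the hazards the slide warns of, which depend on gate and wire delays.

```python
from itertools import product

def c_next(a, b, c):
    """Next value of c according to c = ab + ac + bc."""
    return (a & b) | (a & c) | (b & c)

# Print the full truth table: the output follows the inputs when they agree
# and holds its previous value when they differ.
for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, "->", c_next(a, b, c))
```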

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 14 Hazardous Hazards

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 15 Hazardous Hazards Hazards are no big deal in sync circuits Hazards may be deadly in (some) async circuits Now you know why you were taught about them in school… We can build hazard-free circuits We can (sometimes) make timing assumptions and add delays

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 16 Hiding Hazards Behind Delays Do you like slowing your circuit ?

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 17 Avoiding Hazards with Complex Gates Theoretically, same hazard problem (e.g. T1 slow) Must use caution or delay output

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 18 Synthesis: C Element using SR Latch [Figure: state graph with the SET C, RESET C, KEEP C=1 and KEEP C=0 regions marked, and the Karnaugh map of c over c\ab with entries 0, R, F] set = ab, reset = a'b'
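As a quick sanity check, the SR-latch form (set when both inputs are 1, reset when both are 0, hold otherwise) should agree with the gate form c = ab + ac + bc for every input and state. The sketch below assumes an ideal latch; the function names are illustrative.

```python
from itertools import product

def c_gates(a, b, c):
    """Gate implementation: c = ab + ac + bc."""
    return (a & b) | (a & c) | (b & c)

def c_sr_latch(a, b, c):
    """Ideal SR latch driven by set = ab and reset = a'b'."""
    set_ = a & b
    reset = (1 - a) & (1 - b)
    assert not (set_ and reset)  # set and reset are mutually exclusive here
    if set_:
        return 1
    if reset:
        return 0
    return c  # neither asserted: the latch keeps its value

same = all(c_gates(a, b, c) == c_sr_latch(a, b, c)
           for a, b, c in product((0, 1), repeat=3))
print(same)  # True
```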

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 19 Synthesis using SR latches Needs (mutually exclusive) SET, RESET regions May use KEEP 0, KEEP 1 regions May use unreachable states (more later…) Area / performance / power may be more or less than gate circuits May use C elements instead of SR latches SR latches enable implementation with standard libraries

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 20 Implementation using Latches / Cel

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 21 Generalized C Element

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 22 More interesting PNs and STGs FORKs JOINs (input) CHOICE MERGE CONTROLLED CHOICE MUTUALLY EXCLUSIVE MUTUALLY EXCLUSIVE x+

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 23 STG with Input Choice x+y+ z+b+ y- x- z-b-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 24 (S&F Fig. 6.8)

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 25 (S&F Fig. 6.8) dummy 2-Way Edge Req=1 ⇒ Ctl stable and controls dummy1/2

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 26 A “Simple” Choice Net a b c d d+ a- a+b+ c- c+ b- b+ c+ d- RR00 b+ 01R0 c+ 0F10 a+ b+ 1R00 110R c- 0F10 b- d+ 11R1 a- F F d-c+ III

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 27 Unreachable States cd\ab xxx x xx x RR00 b+ 01R0 c+ 0F10 a+ b+ 1R00 110R c- 0F10 b- d+ 11R1 a- F F d-c+ 00 b c a+ b c b- d a d-c+

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 28 Regions and Maps For c [Figure: Karnaugh map of c over cd\ab with entries 0, 1, R, F and x for unreachable states, next to the state graph] c = d + a'b + bc, set(c) = d + a'b, reset(c) = b'

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 29 c=d+a’b+bc set(c)=d+a’b reset(c)=b’
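The three equations for c are related by the usual latch identity c = set + c·reset'. A minimal sketch that checks this exhaustively follows. It is an assumption of the sketch that set dominates wherever set and reset are both 1, which happens only for combinations such as d=1, b=0 that are taken here to be unreachable.

```python
from itertools import product

def gate_form(a, b, c, d):
    """c = d + a'b + bc"""
    return d | ((1 - a) & b) | (b & c)

def latch_form(a, b, c, d):
    """c = set(c) + c * reset(c)', with set(c) = d + a'b and reset(c) = b'."""
    set_c = d | ((1 - a) & b)
    reset_c = 1 - b
    return set_c | (c & (1 - reset_c))

agree = all(gate_form(*v) == latch_form(*v) for v in product((0, 1), repeat=4))
print(agree)  # True
```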

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 30 Regions and Maps For d [Figure: Karnaugh map of d over cd\ab with entries 0, 1, R, F and x for unreachable states, next to the state graph] d = abc', set(d) = abc', reset(d) = c

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 31 Walks on SG and KM cd\ab xxx x xx x I II I RR00 b+ 01R0 c+ 0F10 a+ b+ 1R00 110R c- 0F10 b- d+ 11R1 a- F F d-c+

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 32 STG Rules
Any STG:
– Input free-choice: Only mutex inputs may control the choice
– 1-Bounded: Maximum 1 token per place
– Liveness: No deadlocks
STG for Speed Independent circuits:
– Consistent state assignment: Signals strictly alternate between + and –
– Persistency: Excited signals fire, namely they cannot be disabled by another transition
Synthesizable STG:
– Complete state coding: Different markings must represent different states

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 33 We use the following circuit to explain STG rules: req ack REQ ACK

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 34 1-Bounded (Safety)
STG is safe if no place or arc can ever contain more than one token
Unsafeness is often caused by one-sided dependency
STG is not safe: If left cycle goes fast and right cycle lags, then arc ack+ → REQ+ accumulates tokens. (REQ+ depends on both ack+ and ACK-)
Possible solution: stop left cycle by right cycle
[Figure: STG over REQ+, ACK+, REQ-, ACK-, req+, ack+, req-, ack-]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 35 Liveness STG is live if from every reachable marking, every transition can eventually be fired The STG is not live: Transitions reset, reset_ok cannot be repeated. But non-liveness is useful for initialization reset_ok-reset-req+ack+ req- ack-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 36 Consistent State Assignment The following subset of STG makes no sense: a+ a-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 37 Persistency
STG is persistent if for all arcs a* → b*, other arcs ensure that b* fires before opposite transition of a* (* is either + or -)
Non-persistency may be caused by one-sided relations
STG is not persistent (in addition to being unsafe): If left cycle goes fast and right cycle lags, then ack+ → ack- before REQ+. Danger: Logic design may be REQ+ = ack+
Exception: If a* → b*, assume that the environment assures persistency.
Possible solution: stop left cycle by right cycle.
[Figure: STG over REQ+, ACK+, REQ-, ACK-, req+, ack+, req-, ack-]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 38 Complete State Coding STG has a complete state coding if no two different markings have identical values for all signals. [Figure: STG over REQ+, ACK+, REQ-, ACK-, ack-, req+, ack+, req-, with state codes in the order req,ack,REQ,ACK and a Karnaugh map over cd\ab] Disaster!
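The definition on this slide translates into a simple check: group the reachable markings by their signal code and report any code shared by more than one marking. The sketch below uses made-up marking names and codes purely for illustration; they are not taken from the STG on the slide.

```python
from collections import defaultdict

def coding_conflicts(states):
    """states: iterable of (marking_id, code) pairs.
    Returns every code that is shared by two or more different markings."""
    by_code = defaultdict(set)
    for marking, code in states:
        by_code[code].add(marking)
    return {code: ms for code, ms in by_code.items() if len(ms) > 1}

# Hypothetical reachable states, codes in the order req,ack,REQ,ACK.
example = [("m0", "0000"), ("m1", "1000"), ("m2", "1010"), ("m3", "1000")]
conflicts = coding_conflicts(example)
print(conflicts)
```

In this made-up example markings m1 and m3 carry the same code, so logic that sees only the signal values cannot tell them apart. Adding an internal state variable is the remedy suggested on the next slide.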

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 39 Complete State Coding Possible solution: Add an internal state variable x-x+ req,ack,REQ,ACK,x: REQ+ACK+ REQ- ACK- ack-req+ ack+ req

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 40 Original STG We have considered the following circuit and STG: REQ+ACK+ REQ- ACK- ack-req+ ack+ req- req ack REQ ACK

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 41 A faster STG? Does it need an extra variable? ack- req+ ack+ req- ACK- REQ+ ACK+ REQ-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 42 Drawn by draw_astg

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 43 The SG req,ack,REQ,ACK: 0000 r a+ R r R a+ A A a+r R+ a- A r a- R R a+ R r a- R- A A a+ A r- a- A r r+ R- A a+ R a+ A r- R r- A- a-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 44 The SG req,ack,REQ,ACK: 0000 r a+ R r R a+ A A a+r R+ a- A r a- R R a+ R r a- R- A A a+ A r- a- A r r+ R- A a+ R a+ A r- R r- A- a- R+

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 45 Drawn by write_sg & draw_astg

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 46 Extra states inserted by petrify

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 47 Rearranged STG ack- req+ ack+ req- c1- c1+ c0- ACK- REQ+ ACK+ REQ- c2- c2+ c0+ Initial Internal State: c0=c1=c2=1

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 48 The new State Graph…

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 49 The Synthesized Complex Gates Circuit
INORDER = r A a R csc0 csc1 csc2;
OUTORDER = [a] [R] [csc0] [csc1] [csc2];
[a] = a (csc2 + csc0) + csc1';
[R] = csc2 (csc0 (a + r) + R);
[csc0] = csc0 (csc1' + a') + R' csc2;
[csc1] = r' (csc0 + csc1);
[csc2] = A' (csc0' (csc1' + a') + csc2);

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 50 Technology Mapping
INORDER = r A a R csc0 csc1 csc2;
OUTORDER = [a] [R] [csc0] [csc1] [csc2];
[0] = R'; # gate inv:combinational
[1] = [0]' A' + csc2'; # gate oai12:combinational
[a] = a csc0' + [1]; # gate sr_nor:asynch
[3] = csc1'; # gate inv:combinational
[4] = csc0' csc2' [3]'; # gate nor3:combinational
[5] = [4]' (csc1' + R'); # gate aoi12:combinational
[R] = [5]'; # gate inv:combinational
[7] = (csc2' + a') (csc0' + A'); # gate aoi22:combinational
[8] = csc0'; # gate inv:combinational
[csc0] = [8]' csc1' + [7]'; # gate oai12:combinational
[csc1] = A' (csc0 + csc1); # gate rs_nor:asynch
[11] = R'; # gate inv:combinational
[12] = csc0' ([11]' + csc1'); # gate aoi12:combinational
[csc2] = [12] (r' + csc2) + r' csc2; # gate c_element1:asynch

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 51 The Synthesized Gen-C Circuit
INORDER = r A a R csc0 csc1 csc2;
OUTORDER = [a] [R] [csc0] [csc1] [csc2];
[0] = csc0' csc1 (R' + A);
[1] = csc0 csc2 (a + r);
[2] = csc2' A;
[R] = R [2]' + [1]; # mappable onto gC
[4] = a csc1 csc2';
[csc0] = csc0 [4]' + csc2; # mappable onto gC
[6] = r' csc0;
[csc1] = csc1 r' + [6]; # mappable onto gC
[8] = A' csc0' (csc1' + a');
[csc2] = csc2 R' + [8]; # mappable onto gC
[a] = a [0]' + csc1'; # mappable onto gC

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 52 Petrify Environment STG EQN draw_astg ps write_sg SG lib petrify

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 53 Petrify
Command line tool
petrify -h for help (flags etc.)
petrify -cg for complex gates
petrify -gc for generalized C-elements
petrify -tm for tech mapping
draw_astg to draw
write_sg to create state graphs
Documented online, incl. tutorial
– See

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 54 Technical notes on Petrify
Have or install cygwin on Windows –
Download dot.exe
– Linked from the petrify site
Include the bin dir in your path
– In .bash_profile in the cygwin home dir. For example:
export CLASSPATH=.
export PATH=/cygdrive/c/cygwin/home/ran/petrify-4.2/bin:$PATH
To draw_astg:
draw_astg -Tdot file.g -o file.dot
dot -Tps file.dot -o file.ps

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 55 A safer STG? ack- req+ ack+ req- ACK- REQ+ ACK+ REQ-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 56 A safer STG?

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 57 The safer STG is a serial circuit
INORDER = r A a R;
OUTORDER = [a] [R];
[a] = A;
[R] = r;
[Figure: STG over ack-, req+, ack+, req-, ACK-, REQ+, ACK+, REQ-, and the circuit block with ports req, ack, REQ, ACK]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 58 Yet another STG? ack- req+ ack+ req- ACK- REQ+ ACK+ REQ-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 59 Yet another STG?

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 60

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 61 Output Handshake First ack- req+ ack+ req- ACK- REQ+ ACK+ REQ-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 62 Still a serial controller

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 63 Synthesis
INORDER = r A a R csc0;
OUTORDER = [a] [R] [csc0];
[a] = csc0 A'; # gate and2_1:combinational
[R] = r csc0'; # gate and2_1:combinational
[2] = A' (csc0' + r'); # gate aoi12:combinational
[csc0] = [2]'; # gate inv:combinational

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 64 A different STG ack- req+ ack+ req- ACK- REQ+ ACK+ REQ- Redundant, will be ignored by petrify

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 65 A different STG

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 66 Synthesis
INORDER = r A a R;
OUTORDER = [a] [R];
[a] = R;
[1] = r A';
[2] = r' A;
[R] = R [2]' + [1]; # mappable onto gC
⇒ R = R(Ar')' + A'r = R(A' + r) + A'r = A'r + RA' + Rr
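The algebra on this slide can be confirmed by brute force over the eight combinations of R, A and r. The sketch below also compares the result with a majority function of r, A' and R, which is an observation added here rather than a statement from the slide: it shows the gC equation behaves as a C element with an inverted A input.

```python
from itertools import product

def gc_form(R, A, r):
    """[R] = R [2]' + [1] with [1] = r A' and [2] = r' A."""
    t1 = r & (1 - A)
    t2 = (1 - r) & A
    return (R & (1 - t2)) | t1

def expanded(R, A, r):
    """R = A'r + RA' + Rr"""
    nA = 1 - A
    return (nA & r) | (R & nA) | (R & r)

def majority(x, y, z):
    return (x & y) | (x & z) | (y & z)

same = all(gc_form(R, A, r) == expanded(R, A, r) == majority(r, 1 - A, R)
           for R, A, r in product((0, 1), repeat=3))
print(same)  # True
```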

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 67 Latch Control req ack REQ ACK req ack REQ ACK Enable Data-less fifo: Latch: Lt Enable=0: transparent

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 68 Constraints on sequence of events Must keep input data available until after it is latched Assume input data available only when req=1 Once ack+, req- (and input data invalid) can follow very fast Lt+ before ack+
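Ordering constraints such as "Lt+ before ack+" can be checked against a recorded event sequence. The sketch below is illustrative only: the helper `precedes` and the one-cycle trace are assumptions, not output from the lecture's tools.

```python
def precedes(trace, first, second):
    """True if both events occur in the trace and `first` occurs before `second`."""
    if first not in trace or second not in trace:
        return False
    return trace.index(first) < trace.index(second)

# Hypothetical single handshake cycle on the input side of the latch controller.
trace = ["req+", "REQ+", "Lt+", "ack+", "req-"]

print(precedes(trace, "Lt+", "ack+"))   # True: data is latched before it is acknowledged
print(precedes(trace, "ack+", "req-"))  # True: req- may only follow ack+
```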

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 69 Latch Control: STG Fragments ack- req+ ack+ req- ACK- REQ+ ACK+ REQ- req+ACK- REQ+ Lt+ ack+ req-ACK+ REQ- Lt- ack-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 70 Latch Control: Combined STG ack- req+ ack+ req- ACK- REQ+ ACK+ REQ- Lt+ Lt-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 71 Latch Control: Combined STG

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 72

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 73
INORDER = Rin Aout Ain Rout Lt;
OUTORDER = [Ain] [Rout] [Lt];
[Ain] = Lt;
[1] = Aout' Rin;
[2] = Aout Rin';
[Rout] = Rout [2]' + [1]; # mappable onto gC
[Lt] = Rout;
⇒ R = R(Ar')' + A'r = R(A' + r) + A'r = A'r + RA' + Rr

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 74 MUX Control So far all examples are doable “by hand” A deceptively simple example: Control for 4-phase bundled data mux PUSH channels

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 75 4-phase Bundled-data Mux We have already drawn this (dual-rail control): Easier to start with the dual-rail control

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 76 The environment Four independent environments (rCf, rCt share aC) Must not specify any dependency by mistake: a0- r0+ a0+ r0- a1- r1+ a1+ r1- aC- rCf+ aC+ rCf- A- R+ A+ R- In0In1Ctl.fOut aC- rCt+ aC+ rCt- Ctl.t

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 77 The control
A+ must precede a0+ or a1+ (make sure data passed through and were captured)
In0 and In1 handshakes must be made MUTEX
Choices are matched with Merges

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 78 a0+ a1+ r0+ A+ rCf+ rCt+ aC- r1+ R+ Input Free Choice P3P1 aC+ P2 rCf- rCt- P4P4 r0-r0- r1-r1- P5P5 Controlled Choice Merge Duplicate arrow

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 79 mux.g

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 80 mux.gc Replicated transitions: R+, R+/1 Inserted state variable to retain In0 vs. In1 info

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 81 “Compact” state graph

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 82 Mux Control Synthesis
INORDER = r0 r1 rCf rCt A a0 a1 aC R csc0 csc1;
OUTORDER = [a0] [a1] [aC] [R] [csc0] [csc1];
[0] = r0 csc0;
[a1] = csc1 r1;
[2] = csc1 (A + csc0);
[3] = rCt' rCf' csc1';
[aC] = aC [3]' + [2]; # mappable onto gC
[R] = a0' csc0' csc1 r0 + csc0 csc1';
[6] = a0' csc1 A r0 + csc1' r1 A';
[7] = aC (r0' + a0);
[csc0] = csc0 [7]' + [6]; # mappable onto gC
[9] = r0 A' + csc0 A;
[10] = r0' csc0' r1';
[csc1] = csc1 [10]' + [9]; # mappable onto gC
[a0] = a0 r0 + [0]; # mappable onto gC

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 83 Another Mux (4p ctl, bd) (Fig 6.24)

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 84 SG for the Fig 6.24 mux

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 85 All-bundled Mux

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 86 All-bundled Mux Synthesis
INORDER = In0Req OutAck In1Req Ctl CtlReq In1Ack In0Ack OutReq CtlAck csc0;
OUTORDER = [In1Ack] [In0Ack] [OutReq] [CtlAck] [csc0];
[In1Ack] = OutAck csc0';
[In0Ack] = OutAck csc0;
[2] = CtlReq (In1Req csc0' + In0Req Ctl');
[3] = CtlReq' (In1Req' csc0' + In0Req' csc0);
[OutReq] = OutReq [3]' + [2]; # mappable onto gC
[5] = OutAck' csc0;
[CtlAck] = CtlAck [5]' + OutAck; # mappable onto gC
[7] = OutAck' CtlReq';
[8] = CtlReq Ctl;
[csc0] = csc0 [8]' + [7]; # mappable onto gC

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 87 Reduced concurrency mux

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 88 A simple filter: specification
y := 0;
loop
  x := READ (IN);
  WRITE (OUT, (x+y)/2);
  y := x;
end loop
[Figure: filter block with input IN, output OUT, and handshake signals Rin, Ain, Rout, Aout]
Following slides borrowed from Jordi Cortadella, UPC
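A software reference model of the filter specification is handy for checking the data path later. The sketch below is one possible rendering, assuming real-valued division (the slide does not say whether (x+y)/2 is integer or real) and modelling READ and WRITE as iteration over a Python sequence. The name `filter_stream` is illustrative.

```python
def filter_stream(samples):
    """Yield the average of each input sample and the previous one (y starts at 0)."""
    y = 0
    for x in samples:      # x := READ(IN)
        yield (x + y) / 2  # WRITE(OUT, (x + y) / 2)
        y = x              # y := x

print(list(filter_stream([2, 4, 8])))  # [1.0, 3.0, 6.0]
```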

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 89 A simple filter: block diagram
[Figure: latches x and y, an adder (+) and a control block between IN and OUT, with handshake signals Rin/Ain, Rx/Ax, Ry/Ay, Ra/Aa, Rout/Aout]
x and y are level-sensitive latches (transparent when R=1)
+ is a bundled-data adder (matched delay between Ra and Aa)
Rin indicates the validity of IN
After Ain+ the environment is allowed to change IN
(Rout, Aout) control a level-sensitive latch at the output

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 90 A simple filter: control spec. [Figure: the block diagram and the control STG over the transitions Rin+, Ain+, Rin-, Ain-, Rx+, Ax+, Rx-, Ax-, Ry+, Ay+, Ry-, Ay-, Ra+, Aa+, Ra-, Aa-, Rout+, Aout+, Rout-, Aout-]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 91 A simple filter: control impl. [Figure: the control STG over the same transitions, and its implementation with a C element (C) connecting Rin, Ain, Rx, Ax, Ry, Ay, Ra, Aa, Rout, Aout]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 92 Control: observable behavior [Figure: STG of the observable behavior over the transitions of Rin, Ain, Rx, Ax, Ry, Ay, Ra, Aa, Rout, Aout and the internal signal z, next to the C-element circuit with z marked]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 93 Backward Annotation [Figure: the observable-behavior STG side by side with the original control STG, now annotated with the transitions z+ and z-]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 94 Homework
Design an async controller and data path for computing the Fibonacci series (f_n = f_{n-1} + f_{n-2}, f_0 = f_1 = 1). It starts with a Req and generates the sequence infinitely.
Implement the controller with Petrify, using:
– Complex gates
– Generalized C elements
– SR latches
Compare the three solutions. Decide which three comparison criteria to use.

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 95 Data Validity

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 96 Four Channel Types

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 97 2 phase protocols When are the data valid?

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 98 4 phase push protocols Four different possibilities:

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 99 Precharged CMOS Needs Extended-Early Input data valid during evaluate phase (REQ_IN=1) From Lecture 6

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 100 Hierarchical Order of Data Validity
Broad can be used when any other type needed (etc.)
Handshake-transparent circuits (function blocks):
– Strength(outputs) ≤ Strength(inputs)
Latch:
– Strength(outputs) ≥ Strength(inputs)
[Figure: hierarchy of validity schemes (BROAD, EXTENDED EARLY, EARLY, LATE) along a weak-to-strong axis]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 101 Validity Example: Join Extended early inputs, early outputs ya- yr+ ya+ yr- za- zr+ za+ zr- xa- xr+ xa+ xr- Why?

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 102 Validity Example: Latch Early inputs, extended early outputs req ack REQ ACK Enable Lt Enable=0: transparent ack- req+ ack+ req- ACK- REQ+ ACK+ REQ- Lt+ Lt-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 103 4 phase pull protocols

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 104 Latch Decoupling

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 105 Simple Muller Pipeline with Latches Only half full (spread = 2) Stage i+1 must be empty before stage i can latch new data

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 106 Decoupled Pipeline with Latches We need a new controller Want to get spread=1

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 107 Latch Control Revisited Lt+ waits for REQ+ REQ+ waits for previous ACK- Result: Next stage must be EMPTY before we can latch. Wish to decouple the two sides. req ack REQ ACK Enable Lt Enable=0: transparent ack- req+ ack+ req- ACK- REQ+ ACK+ REQ- Lt+ Lt-

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 108 Semi-Decoupled Latch Control
All stages can be filled (spread=1)
– [Furber & Day (1996), Furber and Liu (1996)]
But ack- waits for ACK+ ⇒ may slow a pipeline
Add a second state variable ⇒ fully decouple input and output
[Figure: two latch-control STGs over ack-, req+, ack+, req-, ACK-, REQ+, ACK+, REQ-, Lt+, Lt-, the second with an added variable A+, A-; labels: Early Input / Extended Early output and Early Input / Early output]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 109 Fully-Decoupled Latch Control ack- decoupled from ACK+ But output validity too short – not good for holding data for function blocks Early Input ack- req+ ack+ req- ACK- REQ+ ACK+ REQ- A+ Lt+ A- Lt- B+ B- Early output ack- req+ ack+ req- ACK- REQ+ ACK+ REQ- A+ Lt+ A- Lt- Early Input Early output

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 110 Broad Decoupled Latch Control Complex control ⇒ decoupled pipeline latch [Figure: two latch-control STGs over ack-, req+, ack+, req-, ACK-, REQ+, ACK+, REQ-, A+, A-, Lt+, Lt-, B+, B-; labels: Early Input / Broad output and Early Input / Early output]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 111 CSC Input validity: early, output: broad (maybe convert output to early?) LD=1 latch transparent Pulsed Ld. Input validity: broad, output: broad

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 112 Normally Opaque Latch Control
Latch opens only when new data arrive [Sparso et al., 1998]
Lt- → REQ+ delay must be > Lt- → data out delay
[Figure: STG over ack-, req+, ack+, req-, ACK-, REQ+, ACK+, REQ-, Lt-, Lt+, A+, A-; labels: Early Input / Broad++ output]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 113 Normally Opaque Latch Control

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 114 Doubly-Latched Pipeline
“ultimate” decoupling
– [Branover (1998), Kol & Ginosar (1997)]
Neighbor stages are fully independent
Cost: 2× latches
[Figure: pipeline stages, each with latches L_M and L_S, combinational logic (CL) and a control block with Ri/Ai and Ro/Ao handshakes]

© Ran Ginosar Lecture 4: Speed-Independent Control Circuits 115 Pipeline Schedule
     Task 1  Task 2  Task 3
A    1       1       2
B    2       1       1
[Figure: schedules of stages A and B for the synchronous, fully-decoupled and DLAP pipelines]