Bridging the gap between asynchronous design and designers Part II: Logic synthesis from concurrent specifications.

Outline
– Overview of the synthesis flow
– Specification
– State graph and next-state functions
– State encoding
– Implementability conditions
– Review of some advanced topics

Book and synthesis tool
– J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno and A. Yakovlev, Logic Synthesis for Asynchronous Controllers and Interfaces, Springer-Verlag, 2002
– petrify:

Design flow: Specification (STG) → (Reachability analysis) → State Graph → (State encoding) → SG with CSC → (Boolean minimization) → Next-state functions → (Logic decomposition) → Decomposed functions → (Technology mapping) → Gate netlist

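Specification: Signal Transition Graph (STG), a Petri net with interpreted events (often free-choice) [Figure: waveforms of signals x, y, z and the corresponding STG with transitions x+, x-, y+, y-, z+, z-]

Token flow [Figure: the same STG with tokens marking the current state]

State graph [Figure: the STG and its state graph; states are labeled with the values of xyz, starting from 000, and arcs with the transitions x+, x-, y+, y-, z+, z-]

Next-state functions [Figure: the state graph over xyz from which the next-state functions are derived]

Gate netlist [Figure: circuit with signals x, y, z]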

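VME bus [Figure: VME Bus Controller between the Bus, the Data Transceiver and the Device, with signals DSr, DSw, DTACK, LDS, LDTACK, D; waveforms of DSr, LDS, LDTACK, D, DTACK in the Read Cycle]

STG for the READ cycle [Figure: STG with transitions LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-, DSr+; VME Bus Controller with signals LDS, LDTACK, D, DSr, DTACK]

Choice: Read and Write cycles [Figure: STG with a choice between the read branch (DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D- DTACK- LDS- LDTACK-) and the write branch (DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw- DTACK- LDS- LDTACK-)]

Choice: Read and Write cycles [Figure: the same choice redrawn with a single DTACK-, LDS- and LDTACK- shared by the read branch (DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D-) and the write branch (DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw-)]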

Circuit synthesis
Goal: derive a hazard-free circuit under a given delay model and mode of operation

Speed independence
Delay model:
– Unbounded gate / environment delays
– Certain wire delays shorter than certain paths in the circuit
Conditions for implementability:
– Consistency
– Complete State Coding
– Persistency

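Binary encoding of signals [Figure: state graph of the READ cycle with arcs labeled DSr+, DTACK-, LDS-, LDTACK-, D-, DSr-, DTACK+, D+, LDTACK+, LDS+]

Binary encoding of signals [Figure: the same state graph with each state labeled by its binary code in the signal order (DSr, DTACK, LDTACK, LDS, D)]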
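
The reachability step behind this slide can be sketched in a few lines of Python. The transcript lists only the transition names, so the arcs and the initial marking below are an assumption (the usual textbook form of the READ-cycle STG), and the helper names `enabled`, `fire` and `state_graph` are hypothetical. The sketch fires enabled transitions breadth-first and labels each reachable state with its code in the order (DSr, DTACK, LDTACK, LDS, D).

```python
from collections import deque

# Signal order used for the binary codes, as on the slide.
SIGNALS = ["DSr", "DTACK", "LDTACK", "LDS", "D"]

# Assumed arcs of the READ-cycle STG (a marked graph): each arc acts as a place.
ARCS = [
    ("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
    ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
    ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
    ("DTACK-", "DSr+"), ("LDTACK-", "LDS+"),
]
# Assumed initial marking: one token on each arc that closes a cycle.
INITIAL_MARKING = frozenset({("DTACK-", "DSr+"), ("LDTACK-", "LDS+")})
INITIAL_CODE = (0, 0, 0, 0, 0)


def enabled(marking):
    """Transitions whose input arcs all carry a token."""
    transitions = {t for arc in ARCS for t in arc}
    return sorted(
        t for t in transitions
        if all(arc in marking for arc in ARCS if arc[1] == t)
    )


def fire(marking, code, t):
    """Move tokens across transition t and update the bit of its signal."""
    consumed = {arc for arc in ARCS if arc[1] == t}
    produced = {arc for arc in ARCS if arc[0] == t}
    new_marking = frozenset((marking - consumed) | produced)
    new_code = list(code)
    new_code[SIGNALS.index(t[:-1])] = 1 if t.endswith("+") else 0
    return new_marking, tuple(new_code)


def state_graph():
    """Breadth-first reachability; returns {state: {transition: next_state}}."""
    graph, queue = {}, deque([(INITIAL_MARKING, INITIAL_CODE)])
    while queue:
        state = queue.popleft()
        if state in graph:
            continue
        graph[state] = {}
        for t in enabled(state[0]):
            nxt = fire(*state, t)
            graph[state][t] = nxt
            queue.append(nxt)
    return graph


graph = state_graph()
codes = sorted("".join(map(str, code)) for _, code in graph)
print(len(graph), "states")
print(codes)
```

Under these assumed arcs the analysis finds 14 states, two of which share the code 10110; that pair is the kind of state coding conflict the later slides resolve.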

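Excitation / Quiescent Regions [Figure: state graph partitioned into ER(LDS+), QR(LDS+), ER(LDS-) and QR(LDS-)]
ER(LDS+) is the set of states in which LDS+ is enabled; QR(LDS+) is the set of states in which LDS is stable at 1 (symmetrically for LDS-).

Next-state function [Figure: the same regions annotated with the change of LDS: 0 → 1 in ER(LDS+), 1 → 1 in QR(LDS+), 1 → 0 in ER(LDS-), 0 → 0 in QR(LDS-)]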
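
As a minimal sketch of how the next-state value follows from the excitation and quiescent regions, the Python below classifies each state of a small state graph and tabulates the next-state function of one signal. The state graph is a hypothetical four-phase handshake over signals req and ack (not the VME example), and the function names are illustrative only.

```python
# Hypothetical state graph of a four-phase handshake; codes are (req, ack).
SIGNALS = ["req", "ack"]
STATE_GRAPH = {
    (0, 0): {"req+": (1, 0)},
    (1, 0): {"ack+": (1, 1)},
    (1, 1): {"req-": (0, 1)},
    (0, 1): {"ack-": (0, 0)},
}


def region(code, signal):
    """Classify a state as ER+, ER-, QR+ or QR- with respect to `signal`."""
    if signal + "+" in STATE_GRAPH[code]:
        return "ER+"   # rising transition enabled: value goes 0 -> 1
    if signal + "-" in STATE_GRAPH[code]:
        return "ER-"   # falling transition enabled: value goes 1 -> 0
    value = code[SIGNALS.index(signal)]
    return "QR+" if value == 1 else "QR-"   # stable at 1 or at 0


def next_state_value(code, signal):
    """The next-state function is 1 in ER+ and QR+, and 0 in ER- and QR-."""
    return 1 if region(code, signal) in ("ER+", "QR+") else 0


table = {code: next_state_value(code, "ack") for code in STATE_GRAPH}
for code, value in table.items():
    print(code, region(code, "ack"), value)
```

For this toy graph the table reduces to ack = req, which is the kind of expression Boolean minimization would then extract from the full table.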

Next-state function. Exercise [Figure: STG with transitions A+, C-, A-, C+, B+, B-]

Karnaugh map for LDS [Figure: two Karnaugh maps over DTACK, DSr, D, LDTACK, one for LDS = 0 and one for LDS = 1, with one cell marked 0/1?]

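Concurrency reduction [Figure: fragment of the state graph around the arcs LDS-, LDS+ and DSr+]

Concurrency reduction [Figure: STG of the READ cycle after concurrency reduction; transitions LDS+, LDTACK+, D+, DTACK+, DSr-, D-, DTACK-, LDS-, LDTACK-, DSr+]

State encoding conflicts [Figure: fragment of the state graph with arcs LDS-, LDTACK-, LDTACK+, LDS+ showing the conflicting states]

Signal insertion [Figure: state graph with arcs LDS-, LDTACK-, D-, DSr-, LDTACK+, LDS+ and the inserted transitions CSC+ and CSC-]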

Regions and excitation regions
Region = set of states r such that each event a either enters r, exits r, or does not cross it (Nielsen et al. 92)
– Pre-region: a exits r
– Post-region: a enters r
– Excitation region: all states exited by a
Any region is a union of minimal regions
[Figure: an event a entering a region, exiting it and not crossing it; excitation region]
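
The region condition can be checked mechanically. The sketch below is illustrative only: the transition system `TS` is a hypothetical example (events a and b concurrent, followed by c), and `crossing`, `is_region` and `excitation_region` are names chosen here, not functions of any tool.

```python
def crossing(transition, region):
    """How a single transition relates to a set of states."""
    source, _, target = transition
    if source not in region and target in region:
        return "enter"
    if source in region and target not in region:
        return "exit"
    return "no-cross"


def is_region(transitions, region):
    """True if every event crosses `region` in one uniform way."""
    kinds = {}
    for t in transitions:
        kinds.setdefault(t[1], set()).add(crossing(t, region))
    return all(len(k) == 1 for k in kinds.values())


def excitation_region(transitions, event):
    """All states from which `event` can fire."""
    return {source for source, label, _ in transitions if label == event}


# Hypothetical transition system: (source, event, target).
TS = [
    ("s0", "a", "s1"), ("s0", "b", "s2"),
    ("s1", "b", "s3"), ("s2", "a", "s3"),
    ("s3", "c", "s0"),
]

print(is_region(TS, {"s1", "s3"}))   # a always enters, c exits, b never crosses
print(is_region(TS, {"s1"}))         # a enters once and does not cross once
print(excitation_region(TS, "a"))
```

In this example the excitation region of a, {s0, s2}, happens to be a region itself; in general that need not hold.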

Event insertion (Vanbekbergen ‘92) [Figure: transition system over events a, b, c split into S - ER(x), ER(x) and SR(x); the exit transitions of ER(x) are delayed by x]

Event insertion (Vanbekbergen ‘92)
Properties to preserve during insertion:
– trace equivalence
– speed-independence
Necessary and sufficient conditions (Speed-Independence Preserving set):
– persistency
– commutativity
[Figure: diamonds of events a and b before and after insertion]

Regions and SIP sets
Legal SIP set:
– region
– persistent excitation region
– exit border of persistent region
– intersection of pre-regions (if forward connected)

State signal insertion (Vanbekbergen ‘92)
– S+ (ER(x+)) and S- (ER(x-)) must be SIP (speed-independence)
– I-partition must not have illegal arcs (well-formedness), e.g. S0 → S1, S+ → S0, S1 → S+ are illegal
[Figure: I-partition into the blocks S0, S+, S1, S- with the inserted transitions x+ and x-]

Using regions to find bipartitions
Bricks:
– minimal regions
– intersections of pre-regions and post-regions
Blocks = unions of bricks, e.g. (r1 ∩ r2) ∪ (r3 ∩ r4)
[Figure: transition system with events x+, x-, y+, y- and candidate blocks]

From bipartition to I-partition
– Find a bipartition {b, b̄}
– Find (by expansion) exit borders which are well-formed and SIP
– Add the new state signal z
[Figure: transition system with events x+, x-, y+, y- before and after inserting the state signal z]

The cost function
Comparison among pairs of I-partitions:
– correctness (SIP, well-formed, ...)
– number of solved CSC conflicts
– estimation of logic
Trade-off between CSC conflicts and estimated logic

Conclusions
– Regions guide the search among blocks of states
– Regions are the base of symbolic manipulations of TSs
– Property-preserving TS transformations
– Completeness: all CSC conflicts are solvable for excitation-closed TSs
– Designer/tool interaction made easier by STG → TS → STG transformations
– Public domain PN and asynchronous circuit synthesis tool Petrify:

Complex-gate implementation

Implementability conditions
– Boundedness (reachability space is finite)
– Consistency: rising and falling transitions of each signal alternate in any trace
– Complete state coding (CSC): next-state functions correctly defined
– Persistency: no event can be disabled by another event (unless they are both inputs)
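
Of these conditions, persistency is easy to illustrate on an explicit state graph. The following is a minimal sketch under stated assumptions: both state graphs are hypothetical, states are plain names, and `persistency_violations` is an illustrative helper rather than part of any tool. It reports every case where firing one event disables another, except when both belong to input signals.

```python
def persistency_violations(graph, inputs):
    """Return (state, disabled_event, firing_event) triples."""
    violations = []
    for state, arcs in graph.items():
        for fired, target in arcs.items():
            for other in arcs:
                # `other` survives if it is still enabled after `fired`.
                if other == fired or other in graph[target]:
                    continue
                both_inputs = other[:-1] in inputs and fired[:-1] in inputs
                if not both_inputs:
                    violations.append((state, other, fired))
    return violations


# Hypothetical graph: a choice between two inputs, which is allowed.
INPUT_CHOICE = {
    "s0": {"a+": "s1", "b+": "s2"},
    "s1": {"a-": "s0"},
    "s2": {"b-": "s0"},
}

# Hypothetical graph: input a+ disables output c+, which is not allowed.
DISABLED_OUTPUT = {
    "s0": {"a+": "s1", "c+": "s2"},
    "s1": {"a-": "s0"},          # c+ is no longer enabled after a+
    "s2": {"a+": "s3"},
    "s3": {"c-": "s4"},
    "s4": {"a-": "s0"},
}

print(persistency_violations(INPUT_CHOICE, {"a", "b"}))
print(persistency_violations(DISABLED_OUTPUT, {"a"}))
```

The first graph yields no violations; the second reports that c+ is disabled by a+ in s0, the situation that can produce a pulse on the gate driving c.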

Persistency [Figure: STG fragment with transitions a-, c+, b+ and waveforms of a, c, b: is this a pulse?]
Speed independence ⇒ glitch-free output behavior under any delay

Implementability conditions
Bounded + Consistent + CSC + Persistent ⇒ there exists a speed-independent circuit that implements the behavior of the STG (under the assumption that any Boolean function can be implemented with one complex gate)

Implementability analysis
– Boundedness (reachability graph with ω markings)
– Consistency (concurrency + alternation): polynomial for free-choice
– Persistency (conflicts): polynomial for free-choice
– CSC (requires SG extraction)
Implementability analysis is not hard

Part 3. Advanced topics
– Logic decomposition
– Optimization based on timing information

[Figure: STG over signals a, b, c, d and its state graph]

[Figure: state graph and Karnaugh map over ab / cd showing ER(d+) and ER(d-)]

Complex gate [Figure: Karnaugh map over ab / cd and the state graph used to derive a complex gate for d]

Implementation with C elements [Figure: C element with inputs S, R and output z]
… → S+ → z+ → S- → R+ → z- → R- → …
– S (set) and R (reset) must be mutually exclusive
– S must cover ER(z+) and must not intersect ER(z-) ∪ QR(z-)
– R must cover ER(z-) and must not intersect ER(z+) ∪ QR(z+)
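
The cover conditions above can be written as set checks over state codes. The sketch below is an assumption-laden illustration: the state list and regions describe a hypothetical signal z driven by inputs a and b (codes are abz), and `c_element`, `cover` and `covers_are_correct` are names invented here. It checks only the three listed conditions, not monotonicity of the covers.

```python
def c_element(s, r, z):
    """Next output of a C element with set input s and reset input r."""
    if s and not r:
        return 1
    if r and not s:
        return 0
    return z  # otherwise hold the previous value


# Hypothetical reachable states (codes abz) and the regions of signal z.
STATES = ["000", "100", "110", "111", "011", "001"]
REGIONS = {
    "ER+": {"110"}, "QR+": {"111", "011"},
    "ER-": {"001"}, "QR-": {"000", "100"},
}


def cover(predicate):
    """Reachable codes on which a Boolean function of (a, b, z) is true."""
    return {code for code in STATES if predicate(*map(int, code))}


def covers_are_correct(regions, set_cover, reset_cover):
    """Check the set/reset conditions listed on the slide."""
    off_for_set = regions["ER-"] | regions["QR-"]
    off_for_reset = regions["ER+"] | regions["QR+"]
    set_ok = regions["ER+"] <= set_cover and not (set_cover & off_for_set)
    reset_ok = regions["ER-"] <= reset_cover and not (reset_cover & off_for_reset)
    exclusive = not (set_cover & reset_cover)
    return set_ok and reset_ok and exclusive


reset = cover(lambda a, b, z: not a and not b)
good = covers_are_correct(REGIONS, cover(lambda a, b, z: a and b), reset)
bad = covers_are_correct(REGIONS, cover(lambda a, b, z: a), reset)
print(good, bad)
```

Here S = ab passes, while S = a fails because it is already on in state 100, which belongs to QR(z-).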

[Figure: Karnaugh map over ab / cd, state graph, and C-element implementation of d with set S and reset R]

[Figure: the same state graph and C-element implementation of d] but...

[Figure: state graph and C-element implementation of d with set S and reset R]
Assume that R = a'c' (both inputs complemented) has an unbounded delay. Starting from state 0000 (R=1 and S=0): a+ ; R- ; b+ ; a- ; c+ ; S+ ; d+ ; R+ disabled (potential glitch)

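Monotonic covers [Figure: Karnaugh map over ab / cd, state graph, and C-element implementation of d with set S and reset R]

C-based implementations [Figure: C element with set S and reset R driving d; equivalent C element over inputs a, b, c; generalized C elements (gC) with weak feedback]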

Speed-independent implementations
Implementability conditions:
– Consistency
– Complete state coding
– Persistency
Circuit architectures:
– Complex (hazard-free) gates
– C elements with monotonic covers
– ...

Synthesis exercise: derive circuits for signals x and z (complex gates and monotonic covers) [Figure: STG and state graph over signals w, x, y, z]

Synthesis exercise: signal x [Figure: state graph and Karnaugh map over wx / yz]

Synthesis exercise: signal z [Figure: state graph and Karnaugh map over wx / yz]

Logic decomposition: example [Figure: STG and state graph over signals w, x, y, z]

Logic decomposition: example [Figure: state graph split into the regions yz=1 and yz=0; netlist with two C elements over w, x, y, z]

Logic decomposition: example [Figure: state graph after inserting signal s (transitions s+ and s-, regions s=1 and s=0); netlist with two C elements using s]

Logic decomposition: example [Figure: STG and state graph with the inserted transitions s+ and s-]

Speed-independent netlist [Figure: STG of the READ cycle and netlist over DTACK, D, DSr, LDS, LDTACK, labeled csc and map]

Adding timing assumptions [Figure: the same STG and netlist; assumption LDTACK- before DSr+, with paths marked FAST and SLOW]

Adding timing assumptions [Figure: the same netlist and STG with the assumption LDTACK- before DSr+]

State space domain [Figure: state graph under the assumption LDTACK- before DSr+, with the arcs LDTACK- and DSr+ highlighted]

State space domain [Figure: the same state graph under LDTACK- before DSr+] Two more unreachable states

Boolean domain [Figure: Karnaugh maps over DTACK, DSr, D, LDTACK for LDS = 0 and LDS = 1, with one cell marked 0/1?]

Boolean domain [Figure: the same Karnaugh maps under the timing assumption]
– One more DC vector for all signals
– One state conflict is removed
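
The effect of the assumption "LDTACK- before DSr+" on the state space can be reproduced with a short Python sketch. Everything structural here is an assumption: the arcs and initial marking are the usual textbook form of the READ-cycle STG (the transcript gives only transition names), DSr and LDTACK are taken to be the inputs, and the timing assumption is modeled simply as one extra, initially marked arc from LDTACK- to DSr+. `explore` and `csc_conflicts` are hypothetical helper names.

```python
SIGNALS = ["DSr", "DTACK", "LDTACK", "LDS", "D"]
OUTPUTS = {"DTACK", "LDS", "D"}   # assumed: DSr and LDTACK are inputs

ARCS = [   # assumed arcs of the READ-cycle STG; each arc acts as a place
    ("DSr+", "LDS+"), ("LDS+", "LDTACK+"), ("LDTACK+", "D+"),
    ("D+", "DTACK+"), ("DTACK+", "DSr-"), ("DSr-", "D-"),
    ("D-", "DTACK-"), ("D-", "LDS-"), ("LDS-", "LDTACK-"),
    ("DTACK-", "DSr+"), ("LDTACK-", "LDS+"),
]
MARKED = [("DTACK-", "DSr+"), ("LDTACK-", "LDS+")]   # assumed initial tokens
TIMING_ARC = ("LDTACK-", "DSr+")                     # "LDTACK- before DSr+"


def explore(arcs, marked):
    """Return {(marking, code): enabled transitions} for a safe marked graph."""
    transitions = sorted({t for arc in arcs for t in arc})
    start = (frozenset(marked), (0,) * len(SIGNALS))
    seen, todo = {}, [start]
    while todo:
        marking, code = state = todo.pop()
        if state in seen:
            continue
        seen[state] = [t for t in transitions
                       if all(a in marking for a in arcs if a[1] == t)]
        for t in seen[state]:
            consumed = {a for a in arcs if a[1] == t}
            produced = {a for a in arcs if a[0] == t}
            bits = list(code)
            bits[SIGNALS.index(t[:-1])] = int(t.endswith("+"))
            todo.append((frozenset((marking - consumed) | produced), tuple(bits)))
    return seen


def csc_conflicts(states):
    """Codes shared by states that enable different output transitions."""
    by_code = {}
    for (_, code), enabled in states.items():
        outputs = frozenset(t for t in enabled if t[:-1] in OUTPUTS)
        by_code.setdefault(code, set()).add(outputs)
    return sorted(code for code, sets in by_code.items() if len(sets) > 1)


untimed = explore(ARCS, MARKED)
timed = explore(ARCS + [TIMING_ARC], MARKED + [TIMING_ARC])
print(len(untimed), csc_conflicts(untimed))
print(len(timed), csc_conflicts(timed))
```

With these assumptions the untimed graph has 14 states and one conflicting code, 10110; the timed graph has 12 states and none, which matches the two unreachable states and the removed conflict mentioned on the slides.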

Netlist with one constraint [Figure: STG of the READ cycle and netlist over DTACK, D, DSr, LDS, LDTACK, labeled csc and map]

Netlist with one constraint [Figure: STG of the READ cycle and netlist over DTACK, D, DSr, LDS, LDTACK] TIMING CONSTRAINT: LDTACK- before DSr+

Conclusions
– STGs have high expressive power at a low level of granularity (similar to FSMs for synchronous systems)
– Synthesis from STGs can be fully automated
– Synthesis tools often suffer from the state explosion problem (symbolic techniques are used)
– The theory of logic synthesis from STGs can be found in: J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno and A. Yakovlev, Logic Synthesis for Asynchronous Controllers and Interfaces, Springer-Verlag, 2002.

Synthesis from asynchronous HDL
– CSP based languages
– CSP = communicating sequential processes [T. Hoare]
– Two synthesis techniques: based on program transformations [Caltech]; based on direct compilation [Philips]
– Complete shift in design methodology is required

Using CSP for control generation [Figure: Q element with ports li, lo on the left and ro, ri on the right]
After li goes high, do a full handshake at the right, then complete the handshake at the left and iterate.
STG: li+ → ro+ → ri+ → ro- → ri- → lo+ → li- → lo-
CSP: *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-]
– “;” = sequencing operator
– ro+ = ro goes high; ro- = ro goes low
– [li] = wait until li is high; [not li] = wait until li is low

Using CSP for control generation
CSP: *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-]
Production rules:
li -> ro+; ri -> ro-
not ri -> lo+; not li -> lo-
Conflict: ro+ and ro- are not mutually exclusive (their guards li and ri can be true at the same time)
Eliminate the conflict by state signal insertion (= CSC)
[Figure: gate for ro with inputs li and ri and weak feedback]

Conflict elimination
CSP: *[[li];ro+;[ri];x+;[x];ro-;[not ri];lo+;[not li];x-;[not x];lo-]
Production rules:
not x and li -> ro+; x or not li -> ro-
x and not ri -> lo+; not x or ri -> lo-
ri -> x+; not li -> x-
[Figure: circuit with a flip-flop (FF) for x / not x and signals li, lo, ri, ro]
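
The difference between the two rule sets can be checked by exhaustive exploration. The Python below is a rough sketch, not Martin-style synthesis: the environment rules (li follows not lo, ri follows ro) are an assumed four-phase handshake, "conflict" is taken to mean that the pull-up and pull-down guards of one signal are both true in a reachable state, and `conflicting_signals` is a hypothetical helper.

```python
# Each signal maps to (pull-up guard, pull-down guard) over the state dict.
ENVIRONMENT = {   # assumed four-phase handshake environment
    "li": (lambda s: not s["lo"], lambda s: s["lo"]),
    "ri": (lambda s: s["ro"], lambda s: not s["ro"]),
}
ORIGINAL = {      # rules before state signal insertion
    "ro": (lambda s: s["li"], lambda s: s["ri"]),
    "lo": (lambda s: not s["ri"], lambda s: not s["li"]),
}
WITH_X = {        # rules after inserting the state signal x
    "ro": (lambda s: not s["x"] and s["li"], lambda s: s["x"] or not s["li"]),
    "lo": (lambda s: s["x"] and not s["ri"], lambda s: not s["x"] or s["ri"]),
    "x": (lambda s: s["ri"], lambda s: not s["li"]),
}


def conflicting_signals(circuit):
    """Explore all reachable states from all-zero; report interfering signals."""
    rules = {**ENVIRONMENT, **circuit}
    start = tuple(sorted((name, 0) for name in rules))
    seen, todo, conflicts = set(), [start], set()
    while todo:
        key = todo.pop()
        if key in seen:
            continue
        seen.add(key)
        state = dict(key)
        for name, (up, down) in rules.items():
            targets = set()
            if up(state):
                targets.add(1)
            if down(state):
                targets.add(0)
            if len(targets) == 2:          # both guards true: interference
                conflicts.add(name)
            for value in targets - {state[name]}:
                todo.append(tuple(sorted({**state, name: value}.items())))
    return conflicts, len(seen)


print(conflicting_signals(ORIGINAL))
print(conflicting_signals(WITH_X))
```

Under these assumptions the original rules are flagged for ro (and also for lo, whose two guards both hold in the all-zero state), while the rules with x explore 10 states without any conflict.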

Conclusions
– Generating circuits from a CSP control program is similar to STG synthesis
– One can be reduced to the other
– The particular technique may vary: direct CSP program transformations can be (and were) used instead of methods based on state space generation
– See reference list for more details

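Buffer example in Tangram
(a?byte & b!byte)
begin x0: var byte
| forever do a?x0 ; b!x0 od
end
[Figure: handshake circuit of the Buffer with components *, ;, T, T and x between the passive port a and the active port b; x is the data path, “;” corresponds to the Q element, and each circle is mapped to a netlist]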

Summary
– A Tangram program is partitioned into data path and control
– The data path is implemented as dual or single rail
– Control is mapped to a composition of standard elements (“;”, “||”, etc.)
– Each standard element is mapped to a circuit
– Post-optimization is done
– Composing islands of control elements and re-synthesizing them with STGs can give more aggressive optimization
– Philips made a few chips using Tangram, including a product: an 8051 micro-controller in the low-power pager Myna (25 weeks of battery life from one AAA battery)
– A public domain Tangram compiler is available (Balsa, U. Manchester)

Burst mode FSM [Figure: machine with states s1, s2, s3, s4 and transitions b-/x-, a+b+/y+, a-/x+y-, c+/y-, c-/y+]
– Close to synchronous FSMs with binary encoded I/O
– Work in bursts: input transitions fire, then output transitions fire, then state signals change
– Mostly limited to fundamental mode: the next input burst cannot arrive before the outputs have stabilized

Extended Burst mode [Figure: machine with states s1, s2, s3, s4 and transitions b-/x-, a+b*/y+, a-/x+y-, c+/y-, c-/y+]
– Directed don’t cares (b*): some concurrency is allowed for input transitions that do not influence an output burst
– Conditional guards = “if b=1 then …”

Synthesis of XBM
– Next-state and output functions must be free of functional and logic hazards
– Sequential feedbacks should not introduce new hazards
– State assignment: one state of the BM spec is assigned to one layer of the Karnaugh map; compatible layers are merged; layers are compatible if merging does not introduce CSC violations or hazards; layers are encoded using race-free encoding

XBM and STG [Figure: the XBM machine (states s1, s2, s3, s4; transitions b-/x-, a+b*/y+, a-/x+y-, c+/y-, c-/y+) next to an equivalent STG with transitions x-, a+, y+, b+, eps, c-, a-, c+, y-, y+, x+, y-, b-]

Summary
– Specification: XBM is a subclass of STGs
– Synthesis: techniques are extensions of synchronous state assignment and logic minimization
– Timing: the environment is limited to fundamental mode (difficult for pipelined and highly concurrent systems); internals are delay insensitive
– See reference list for details