Presentation is loading. Please wait.

Presentation is loading. Please wait.

Bridging the gap between asynchronous design and designers Part II: Logic synthesis from concurrent specifications.

Similar presentations


Presentation on theme: "Bridging the gap between asynchronous design and designers Part II: Logic synthesis from concurrent specifications."— Presentation transcript:

1 Bridging the gap between asynchronous design and designers Part II: Logic synthesis from concurrent specifications

2 Outline Overview of the synthesis flow Specification State graph and next-state functions State encoding Implementability conditions Review of some advanced topics

3 Book and synthesis tool J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno and A. Yakovlev, Logic synthesis for asynchronous controllers and interfaces, Springer-Verlag, 2002 petrify: http://www.lsi.upc.es/~petrify

4 4 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Design flow

5 x y z x+ x- y+ y- z+ z- Signal Transition Graph (STG) (Petri Net with interpreted events (often free-choice)) x y z Specification

6 x y z x+ x- y+ y- z+ z- Token flow

7 x+ x- y+ y- z+ z- xyz 000 x+ 100 y+ z+ y+ 101 110 111 x- 001 011 y+ z- 010 y- State graph

8 Next-state functions xyz 000 x+ 100 y+ z+ y+ 101 110 111 x- 001 011 y+ z- 010 y-

9 x z y Gate netlist

10 10 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Design flow

11 VME bus Device LDS LDTACK D DSr DSw DTACK VME Bus Controller Data Transceiver Bus DSr LDS LDTACK D DTACK Read Cycle

12 STG for the READ cycle LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ LDS LDTACK D DSr DTACK VME Bus Controller

13 Choice: Read and Write cycles DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D- DTACK- LDS- LDTACK- DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw- DTACK- LDS- LDTACK-

14 Choice: Read and Write cycles DTACK- DSr+ LDS+ LDTACK+ D+ DTACK+ DSr- D- LDS- LDTACK- DSw+ D+ LDS+ LDTACK+ D- DTACK+ DSw-

15 Circuit synthesis Goal: –Derive a hazard-free circuit under a given delay model and mode of operation

16 Speed independence Delay model –Unbounded gate / environment delays –Certain wire delays shorter than certain paths in the circuit Conditions for implementability: –Consistency –Complete State Coding –Persistency

17 17 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Design flow

18 STG for the READ cycle LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ LDS LDTACK D DSr DTACK VME Bus Controller

19 Binary encoding of signals DSr+ DTACK- LDS- LDTACK- D- DSr-DTACK+ D+ LDTACK+ LDS+

20 Binary encoding of signals DSr+ DTACK- LDS- LDTACK- D- DSr-DTACK+ D+ LDTACK+ LDS+ 10000 10010 10110 0111001110 01100 00110 10110 (DSr, DTACK, LDTACK, LDS, D)

21 QR (LDS+) QR (LDS-) Excitation / Quiescent Regions ER (LDS+) ER (LDS-) LDS- LDS+ LDS-

22 Next-state function 0  1 LDS- LDS+ LDS- 1  0 0  0 1  1

23 Next-state function. Exercise A+ C- A- C+ B+ B-

24 Next-state function 0  1 LDS- LDS+ LDS- 1  0 0  0 1  1 10110

25 Karnaugh map for LDS DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 000000/1? 1 111 - - - --- ---- - ---- ---

26 26 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Design flow

27 Concurrency reduction LDS- LDS+ LDS- 10110 DSr+

28 Concurrency reduction LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+

29 State encoding conflicts LDS- LDTACK- LDTACK+ LDS+ 10110

30 Signal Insertion LDS- LDTACK- D- DSr- LDTACK+ LDS+ CSC- CSC+ 101101 101100

31 Regions and excitation regions Region = set of states r s.t. each event a either enters, or exits or does not cross it (Nielsen et el. 92) Pre-region: a exits r Post-region: a enters r Excitation region: all states exited by a Any region is a union of minimal regions a a a a a a a a enterexitnon-cross excitation region

32 Event insertion (Vanbekbergen ‘92) abc ER(x) bc a b SR(x) xxxx delay exit transitions by x S - ER(x)

33 Properties to preserve during insertion: –trace equivalence –speed-independence necessary and sufficient conditions: persistency (Speed-Independence Preserving set) u commutativity Event insertion (Vanbekbergen ‘92) a b a b ab ab

34 Regions and SIP sets Legal SIP set: –region –persistent excitation region –exit border of persistent region –intersection of pre-regions (if forward connected)

35 State signal insertion (Vanbekbergen ‘92) S+ (ER(x+)) and S- (ER(x-)) must be SIP (speed-independence) I-partition must not have illegal arcs (well- formedness) e.g. S0  S1, S+  S0, S1  S+ are illegal S0 S1 S-S+ x+x- S0 S1 S-S+

36 Using regions to find bipartitions Bricks: –minimal regions –intersections of pre-regions and post- regions Blocks =  bricks x+y+ x+ y- x- x+y+ x+ y- x-  (r1  r2)  (r3  r4) x+y+ x+ y- x-

37 From bipartition to I-partition Find a bipartition {b, b} Find (by expansion) exit borders which are well-formed and SIP Add the new state signal z x+y+ x+ y- x- x+y+ x+ y- x- x+y+ x+ y- x- 100010 110 111 011101 001 000 z-

38 The cost function Comparison among pairs of I-partitions: –correctness (SIP, well-formed,...) –number of solved CSC conflicts –estimation of logic Trade-off between CSC conflicts and estimated logic

39 Conclusions Regions guide the search among blocks of states Regions are the base of symbolic manipulations of TSs Property-preserving TS transformations Completeness: all CSC conflicts are solvable for excitation-closed TSs Designer/tool interaction made easier by STG  TS  STG transformations Public domain PN and asynchronous circuit synthesis tool Petrify: http://www.ac.upc.es/~vlsi/petrify/petrify.html

40 40 Specification (STG) State Graph SG with CSC Next-state functions Decomposed functions Gate netlist Reachability analysis State encoding Boolean minimization Logic decomposition Technology mapping Design flow

41 Complex-gate implementation

42 Implementability conditions Boundedness (reachability space is finite) Consistency –Rising and falling transitions of each signal alternate in any trace Complete state coding (CSC) –Next-state functions correctly defined Persistency –No event can be disabled by another event (unless they are both inputs)

43 Persistency 100000001 a- c+ b+b+ b+b+ a c b a c b is this a pulse ? Speed independence  glitch-free output behavior under any delay

44 Implementability conditions Bound + Consistent + CSC + Persistent There exists a speed-independent circuit that implements the behavior of the STG (under the assumption that any Boolean function can be implemented with one complex gate)

45 Implementability analysis Boundedness (reachability graph with  markings) Consistency (concurrency + alternation) Polynomial for free-choice Persistency (conflicts) Polynomial for free-choice CSC (requires SG extraction) Implementability analysis is not hard

46 Part3. Advanced topics Logic Decomposition Optimization based on timing information

47 a+ b+ c+ d+ a- b- d- a+ c-a- 0000 1000 1100 0100 0110 0111 1111 10111011 00111001 0001 a+ b+ c+ a- b- c- a+ c- a- d- d+

48 0000 1000 1100 0100 0110 0111 1111 10111011 00111001 0001 a+ b+ c+ a- b- c- a+ c- a- d- d+ ab cd 00011110 00 01 11 10 1 11 11 1 0 0000 ER(d+) ER(d-)

49 ab cd 00011110 00 01 11 10 1 11 11 1 0 0000 0000 1000 1100 0100 0110 0111 1111 10111011 00111001 0001 a+ b+ c+ a- b- c- a+ c- a- d- d+ Complex gate

50 Implementation with C elements C R S z  S+  z+  S-  R+  z-  R-  S (set) and R (reset) must be mutually exclusive S must cover ER(z+) and must not intersect ER(z-)  QR(z-) R must cover ER(z-) and must not intersect ER(z+)  QR(z+)

51 ab cd 00011110 00 01 11 10 1 11 11 1 0 0000 0000 1000 1100 0100 0110 0111 1111 10111011 00111001 0001 a+ b+ c+ a- b- c- a+ c- a- d- d+ C S R d

52 0000 1000 1100 0100 0110 0111 1111 10111011 00111001 0001 a+ b+ c+ a- b- c- a+ c- a- d- d+ C S R d but...

53 0000 1000 1100 0100 0110 0111 1111 10111011 00111001 0001 a+ b+ c+ a- b- c- a+ c- a- d- d+ C S R d Assume that R=ac has an unbounded delay Starting from state 0000 (R=1 and S=0): a+ ; R- ; b+ ; a- ; c+ ; S+ ; d+ ; R+ disabled (potential glitch)

54 ab cd 00011110 00 01 11 10 1 11 11 1 0 0000 0000 1000 1100 0100 0110 0111 1111 10111011 00111001 0001 a+ b+ c+ a- b- c- a+ c- a- d- d+ C S R d Monotonic covers

55 C-based implementations C S R d C d a b c a b c d weak a c d generalized C elements (gC) weak

56 Speed-independent implementations Implementability conditions –Consistency –Complete state coding –Persistency Circuit architectures –Complex (hazard-free) gates –C elements with monotonic covers –...

57 Synthesis exercise y- z-w- y+x+ z+ x- w+ 1011 0111 0011 1001 1000 1010 0001 00000101 00100100 0110 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ Derive circuits for signals x and z (complex gates and monotonic covers)

58 Synthesis exercise 1011 0111 0011 1001 1000 1010 0001 00000101 00100100 0110 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ wx yz 00011110 00 01 11 10 - - - - Signal x 1 0 1 1 1 1 1 0 0 0 0 0

59 Synthesis exercise 1011 0111 0011 1001 1000 1010 0001 00000101 00100100 0110 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ wx yz 00011110 00 01 11 10 - - - - Signal z 1 0 0 0 0 1 1 1 0 0 0 0

60 y- z-w- y+x+ z+ x- w+ 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ Logic decomposition: example

61 yz=1 yz=0 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ 10011011 1000 1010 0001 00000101 00100100 01100111 0011 y- y+ x- x+ w+ w- z+ z- w- z- y+ x+ C C x y x y w z x y z y z w z w z y Logic decomposition: example

62 s- s+ s- s=1 s=0 10011011 1000 1010 0111 0011 y+ x- w+ z+ z- 0001 00000101 00100100 0110 x+ w- z- y+ x+ 1001 1000 1010 y+ z- 0111 C C x y x y w z x y z w z w z y s y- Logic decomposition: example

63 y- z-w- y+x+ z+ x- w+ s- s+ s- s+ s- s=1 s=0 10011011 1000 1010 0111 0011 y+ x- w+ z+ z- 0001 00000101 00100100 0110 x+ w- z- y+ x+ 1001 1000 1010 y+ z- 0111 y- Logic decomposition: example

64 Speed-independent Netlist LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK csc map

65 Adding timing assumptions LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK csc map LDTACK- before DSr+ FAST SLOW

66 Adding timing assumptions DTACK D DSr LDS LDTACK csc map LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ LDTACK- before DSr+

67 State space domain LDTACK- before DSr+ LDTACK- DSr+

68 State space domain LDTACK- before DSr+ LDTACK- DSr+

69 State space domain LDTACK- before DSr+ LDTACK- DSr+ Two more unreachable states

70 Boolean domain DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 000000/1? 1 111 - - - --- ---- - ---- ---

71 Boolean domain DTACK DSr D LDTACK 00011110 00 01 11 10 DTACK DSr D LDTACK 00011110 00 01 11 10 LDS = 0 LDS = 1 01-0 00-001 1 111 - - - --- ---- - ---- --- One more DC vector for all signalsOne state conflict is removed

72 Netlist with one constraint LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK csc map

73 Netlist with one constraint LDS+LDTACK+D+DTACK+DSr-D- DTACK- LDS-LDTACK- DSr+ DTACK D DSr LDS LDTACK LDTACK- before DSr+ TIMING CONSTRAINT

74 Conclusions STGs have a high expressiveness power at a low level of granularity (similar to FSMs for synchronous systems) Synthesis from STGs can be fully automated Synthesis tools often suffer from the state explosion problem (symbolic techniques are used) The theory of logic synthesis from STGs can be found in: J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno and A. Yakovlev, Logic Synthesis of Asynchronous Controllers and Interfaces, Springer Verlag, 2002.

75 Synthesis from asynchronous HDL CSP based languages CSP = communicating sequential processes [T. Hoare] Two synthesis techniques –based on program transformations [Caltech] –based on direct compilation [Philips] Complete shift in design methodology is required

76 Using CSP for control generation After li goes high do full handshake at the right, then complete handshake at the left and iterate. li+ro+ri+ro-ri-lo+li-lo- ro ri li lo Q element *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-] “;” = sequencing operator ro+ = ro goes high; ro- = ro goes low [li] = wait until li is high; [not li] = wait until li is low CSP: STG:

77 Using CSP for control generation *[[li];ro+;[ri];ro-;[not ri];lo+;[not li];lo-] Conflict: ro+ and ro- are not mutually exclusive Eliminate conflict by state signal insertion (= CSC) CSP: Production rules: li -> ro+; ri -> ro- not ri -> lo+; not li -> lo- ri li ro weak

78 Conflict elimination *[[li];ro+;[ri];x+;[x];ro-;[not ri];lo+;[not li];x-;[not x];lo-] CSP: Production rules: not x and li -> ro+; x or not li -> ro- x and not ri -> lo+; not x or ri -> lo- ri -> x+; not li -> x- FF x not x li lo ri ro

79 Conclusions Generating circuits from CSP control program is similar to STG synthesis One can be reduced to the other Particular technique may vary. Direct CSP program transformations can be (and were) used instead of methods based on state space generation See reference list for more details

80 Buffer example in Tangram (a?byte & b!byte) begin x0: var byte | forever do a?x0 ; b!x0 od end Buffer * x a b T ; T a b passive port active port Each circle mapped to a netlist Data path Q element

81 Summary Tangram program is partitioned into data path and control Data path is implemented as dual or single rail Control is mapped to composition of standard elements (“;” “||” etc) Each standard element is mapped to a circuit Post-optimization is done Composing islands of control elements and re-synthesis with STG can give more aggressive optimization Philips made a few chips using Tangram, including a product: 8051 micro-controller in low-power pager Muna (25 wks battery life from one AAA battery) Public domain Tangram compiler is available (Balsa, U. Manchester)

82 Burst mode FSM s1 s2 s3 s4 b-/x- a+b+/y+ a-/x+y- c+/y- c-/y+ Close to synchronous FSMs with binary encoded I/O Work in bursts: –Input transitions fire –Output transitions fire –State signals change Mostly limited to fundamental mode: next input burst cannot arrive before stabilization at the outputs

83 Extended Burst mode s1 s2 s3 s4 b-/x- a+b*/y+ a-/x+y- c+/y- c-/y+ Directed don’t cares (b*): some concurrency is allowed for input transitions that do not influence an output burst Conditional guards = “if b=1 then …”

84 Synthesis of XBM Next state and output functions free of functional and logic hazards Sequential feedbacks should not introduce new hazards State assignment –one state of the BM spec to one layer of Karnaugh map –compatible layers are merged –layers are compatible if merging does not introduce CSC violations or hazards –Layers are encoded using race free encoding

85 XBM and STG s1 s2 s3 s4 b-/x- a+b*/y+ a-/x+y- c+/y- c-/y+ x- a+ y+ b+ eps c- a- c+ y- y+ x+ y- b-

86 Summary Specification: XBM is subclass of STGs Synthesis: techniques are extensions of synchronous state assignment and logic minimization Timing: –environment is limited to fundamental mode (difficult for pipelined and highly concurrent systems) –internals are delay insensitive See reference list for details


Download ppt "Bridging the gap between asynchronous design and designers Part II: Logic synthesis from concurrent specifications."

Similar presentations


Ads by Google