Asynchronous Design Using Commercial HDL Synthesis Tools Michiel Ligthart Karl Fant Ross Smith Alexander Taubin Alex Kondratyev.


1 Asynchronous Design Using Commercial HDL Synthesis Tools Michiel Ligthart Karl Fant Ross Smith Alexander Taubin Alex Kondratyev

2 Outline
- Added value of NCL: simplification of design
- Canonical form of gates: the key for optimization
- NCL in CAD flow: an example
- Validation of optimization
- Experimental results
- Conclusion and future work

3 Outline

4 Potential NCL Advantages
- Inherent to asynchronous
  - no clock system
  - low EMI
  - free stand-by mode, etc.
- Particular to NULL Convention Logic (NCL)
  - ease of design (reduced time to market)
  - use standard HDL and commercial tools to simulate and synthesize asynchronous circuits
  - nicely fits current/future (DSM) technology
- Inherent to delay-insensitive
  - easy to reuse design
  - plug-’n’-play SoC design
  - easily portable among technologies

5 Outline

6 Communication Based on DI Encoding
DI protocol with spacer (NULL):
- NULL propagation / NULL acknowledge
- Data propagation / Data acknowledge
[Figure labels: Register, Combinational circuitry, Completion detection, Request for DATA/NULL, NULL, DATA, Completion by codeword]
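A minimal Python sketch of the protocol on this slide, assuming the dual-rail encoding introduced later in the deck (rail1/rail0 per bit). The names `encode`, `is_data`, `is_null` and `completion` are hypothetical and only illustrate "completion by codeword": the detector reports DATA when every bit carries a codeword, reports NULL when every bit is the spacer, and otherwise holds its previous value.

```python
# Each bit is a pair (rail1, rail0); these constants are illustrative assumptions.
NULL = (0, 0)   # spacer: no data present
DATA0 = (0, 1)  # logical 0
DATA1 = (1, 0)  # logical 1


def encode(bits):
    """Encode a list of 0/1 values as dual-rail codewords."""
    return [DATA1 if b else DATA0 for b in bits]


def is_data(word):
    """True when every bit has exactly one rail asserted (complete DATA)."""
    return all(r1 + r0 == 1 for r1, r0 in word)


def is_null(word):
    """True when every bit is the NULL spacer (complete NULL)."""
    return all(pair == NULL for pair in word)


def completion(word, previous):
    """Completion detector with hysteresis: 1 on complete DATA,
    0 on complete NULL, otherwise keep the previous value."""
    if is_data(word):
        return 1
    if is_null(word):
        return 0
    return previous
```

A partially arrived word such as `[DATA1, NULL]` leaves the detector unchanged, which is what lets the request for DATA/NULL wait for the slowest bit.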

7 NCL: Pushing Two-phase Behavior Down to the Level of Each Gate
[Figure: logic gate, no data present]

8 NCL: Pushing Two-phase Behavior Down to the Level of Each Gate
- Gate output acknowledges input changes
- Simplest DI encoding: dual-rail [Sims’58]
[Figure: logic gate, complete data present]

9 General Implementation of Hysteresis Gates in CMOS
g = S + gR
Dual-rail circuits under two-phase operation:
- A transition from NULL to Data is monotonic
- An input transition to NULL resets all gates to NULL
- Set is positively unate
[Figure labels: p-tree, n-tree, Set function, Reset, x1 ... xn, g]

10 Refined Implementation of NCL Hysteresis Gates in CMOS
g = S + gR
- Reset: depends only on the number of inputs
- Canonical form of reset is the key to using synchronous optimization tools
- Reset of each individual gate scales up to the whole network
[Figure labels: n-tree, Set function, x1 ... xn, g]
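The equation g = S + gR can be modelled in a few lines of Python. This is a sketch under one assumption, which is a reading of the slide and of the 2-of-3 example later in the deck: the canonical hold term R is the OR of all inputs, so a gate that has switched to 1 stays there until every input returns to 0. The helper names `hysteresis_gate` and `and2` are hypothetical.

```python
def hysteresis_gate(set_fn, inputs, g_prev):
    """Next output of a hysteresis gate: g = S + g_prev * R.

    set_fn  -- the gate's set function S (assumed positively unate)
    inputs  -- tuple of 0/1 input values
    g_prev  -- previous output value
    R is assumed to be the OR of all inputs (canonical reset).
    """
    s = bool(set_fn(*inputs))
    r = any(inputs)
    return int(s or (bool(g_prev) and r))


def and2(a, b):
    """Set function of a two-input gate that needs both inputs."""
    return a and b


# Walk one gate through a set phase and a reset phase.
trace = []
g = 0
for inputs in [(1, 0), (1, 1), (0, 1), (0, 0)]:
    g = hysteresis_gate(and2, inputs, g)
    trace.append(g)
```

In this trace the output rises only when both inputs are 1 and falls only when both are 0, so only the set function differs from gate to gate.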

11 Family of Logic Gates
- M-of-N threshold gates with hysteresis behavior
- C-element equivalents
- OR gate equivalents
- Room for optimization
- DIMS [Muller’62] [Sparso’92]
[Figure: array of threshold gate symbols labelled with their thresholds]

12 Example: 2-of-3 Threshold Gate with Hysteresis
z = ab + ac + bc + z(a + b + c)
The gate switches:
- to data when M inputs are data
- to NULL when all inputs are NULL
It is possible to use “negative logic” by reversing the pull-up and pull-down networks.
[Figure: transistor-level schematic with inputs a, b, c and output z]
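As a quick check, the equation on this slide can be compared exhaustively against the verbal rule (switch to data when M inputs are data, to NULL when all inputs are NULL, otherwise hold). The Python below is a sketch; `th_m_of_n` and `th23_equation` are hypothetical names.

```python
from itertools import product


def th_m_of_n(m, inputs, z_prev):
    """Behavioral M-of-N threshold gate with hysteresis."""
    if sum(inputs) >= m:
        return 1          # at least M inputs are data
    if sum(inputs) == 0:
        return 0          # all inputs are NULL
    return z_prev         # otherwise hold the previous output


def th23_equation(a, b, c, z):
    """The slide's equation: z = ab + ac + bc + z(a + b + c)."""
    return int(bool((a and b) or (a and c) or (b and c) or (z and (a or b or c))))


# Compare both descriptions for every input combination and previous output.
mismatches = [
    (a, b, c, z)
    for a, b, c, z in product((0, 1), repeat=4)
    if th_m_of_n(2, (a, b, c), z) != th23_equation(a, b, c, z)
]
```

An empty `mismatches` list means the equation and the rule agree on all 16 combinations.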

13 Outline

14 RTL Design Flow – Combinational Optimization
Separate combinational logic and registers:
- Sequential process: replaced by NCL registration in RTL code
- Combinational process: subject of synthesis and optimization; the topic of this presentation
[Figure labels: Request for data/null, reset]

15 Two-Step Synthesis Flow (Using Synopsys' Design Compiler)
- Step 1. Translate HDL into “synchronous” netlist
- Step 2. Convert intermediate netlist into NCL netlist
[Figure labels: VHDL, Generic library, Synthesis, Intermediate netlist, Dual-rail definition, NCL library, Synthesis, NCL netlist]

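16 Input to Step 1: RTL Description (Multiplexer Example)
RTL description (MUX):
    entity test is
      port (a, b, s : in  ncl_logic;
            z       : out ncl_logic);
    end entity test;

    -- The slide omits the architecture name; "rtl" is a placeholder.
    architecture rtl of test is
    begin
      process (a, b, s) is
      begin
        if s = '1' then
          z <= a;
        else
          z <= b;
        end if;
      end process;
    end architecture rtl;
[Figure: multiplexer with inputs a, b, select s, and output z]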

17 MUX Example: Output of Step 1 / Input to Step 2: Intermediate Netlist
[Figure: netlist of two-input NAND gates with inputs a, b, s, internal nets x, y, and output z]

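18 Dual-rail Package
Define type:
    type dual_rail_logic is record
      rail1 : std_logic;
      rail0 : std_logic;
    end record;
A signal a with values {0, 1, N} is carried on two rails, a.0 and a.1, each with values {0, 1}.
Overload operators:
- function “not” (inputs a.0, a.1; outputs z.0, z.1)
- function “nand” (inputs a.0, a.1, b.0, b.1; outputs z.0, z.1), built from th22 and th13 gates
th22 = two-input C-element
th13 = three-input OR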
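The overloaded operators can be sketched behaviorally in Python. This is not the package's VHDL; it assumes a DIMS-style NAND with four th22 gates (one per input combination) and one th13 gate, which matches the gate count given later for the MUX expansion. Values are pairs (rail1, rail0), and the function names are hypothetical.

```python
def th22(a, b, z_prev):
    """Two-input C-element: 1 when both inputs are 1, 0 when both are 0, else hold."""
    if a and b:
        return 1
    if not (a or b):
        return 0
    return z_prev


def th13(a, b, c):
    """1-of-3 threshold gate: behaves as a three-input OR, so it needs no state."""
    return int(bool(a or b or c))


def dr_not(a):
    """Dual-rail inversion is a rail swap; no gate is needed."""
    rail1, rail0 = a
    return (rail0, rail1)


def dr_nand(a, b, state=(0, 0, 0, 0)):
    """Dual-rail NAND. `state` holds the previous outputs of the four C-elements.
    Returns ((rail1, rail0), new_state)."""
    a1, a0 = a
    b1, b0 = b
    c_ff = th22(a0, b0, state[0])   # a = 0, b = 0
    c_ft = th22(a0, b1, state[1])   # a = 0, b = 1
    c_tf = th22(a1, b0, state[2])   # a = 1, b = 0
    c_tt = th22(a1, b1, state[3])   # a = 1, b = 1
    z = (th13(c_ff, c_ft, c_tf), c_tt)  # NAND is 1 unless both inputs are 1
    return z, (c_ff, c_ft, c_tf, c_tt)
```

With both inputs at NULL and all C-elements reset, the output is NULL as well, so the spacer propagates through the gate.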

19 Optimizing with Design Compiler
- Dual-rail expansion
- Two phases (set and reset) are separated
- Set phase ensures circuit functionality
- Reset phase is implied
- Optimizations are applied to the set phase

20 Dual-rail Expansion of MUX
Naive semi-static DIMS implementation: 114 transistors (can be reduced to 63 transistors by merging C-elements with OR-gates), versus 14 for a synchronous circuit.
[Figure: three dual-rail (D-R) NAND gates; signals a.t, a.f, b.t, b.f, s.t, s.f, x.t, x.f, y.t, y.f, z.t, z.f]

21 “Images”: Boolean Gates Implementing Set Functions
NCL gates (hysteresis, sequential behavior):
- th22: z = ab + z(a + b)
- OR gate: z = a + b
- th33w2: z = a(b + c) + z(a + b + c)
- …
Boolean gates (images; combinational behavior):
- z = ab
- z = a + b
- z = a(b + c)
- …
The two families are equivalent for the set phase, in the initial state z = a = b = c = 0.
Projection from NCL gates to images is used for optimization; mapping from images to NCL gates is used for implementation.

22 Image of Dual-rail NAND Gate
C-element equation: z = ab + z(a + b).
[Figure: D-R NAND built from four C-elements (C); inputs a.t, a.f, b.t, b.f; outputs out.t, out.f]

23 Image of Dual-rail NAND Gate
C-element equation: z = ab + z(a + b), initially z = a = b = 0.
In a set phase it behaves like an AND gate: z = ab.
[Figure: image circuit with inputs a.t, a.f, b.t, b.f and outputs out.t, out.f]
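The set-phase claim can be checked by brute force. The Python sketch below assumes the hold term is the OR of the inputs, starts every gate at zero, raises inputs one at a time in every possible order, and compares the hysteresis gate with its combinational image after each step. The helper names are hypothetical.

```python
from itertools import permutations, product


def gate(set_fn, inputs, z):
    """Hysteresis gate: z = S + z * (OR of inputs)."""
    return int(bool(set_fn(*inputs) or (z and any(inputs))))


def image_matches_in_set_phase(set_fn, n):
    """True if, from the all-zero state, the gate equals its image S
    after every step of every monotonic rising input sequence."""
    for final in product((0, 1), repeat=n):
        rising = [i for i in range(n) if final[i]]
        for order in permutations(rising):
            inputs = [0] * n
            z = 0
            for i in order:
                inputs[i] = 1
                z = gate(set_fn, inputs, z)
                if z != int(bool(set_fn(*inputs))):
                    return False
    return True


and2 = lambda a, b: a and b                  # image of th22
th33w2_image = lambda a, b, c: a and (b or c)  # image of th33w2

ok_th22 = image_matches_in_set_phase(and2, 2)
ok_th33w2 = image_matches_in_set_phase(th33w2_image, 3)

# In the reset phase the two differ: the gate holds 1 while an input is still 1.
held = gate(and2, (1, 0), 1)
```

The final line shows why the equivalence is restricted to the set phase: with one input already withdrawn, the AND image would give 0 while the C-element still outputs 1.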

24 Dual-rail Expansion for MUX
Twelve 2-input C-gates and three 3-input OR-gates.
[Figure: signals a.t, a.f, b.t, b.f, s.t, s.f, x.t, x.f, y.t, y.f, z.t, z.f]
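The gate count follows from expanding each of the three NAND gates into four C-elements and one 3-input OR. A Python sketch of that expansion is given below. The wiring z = nand(nand(a, s), nand(b, not s)) is an assumption consistent with the RTL description; the slide's figure may connect the select rails differently. Values are pairs (true rail, false rail), and all names are hypothetical.

```python
from itertools import product


def th22(a, b, z):
    """Two-input C-element: z = ab + z(a + b)."""
    return int(bool((a and b) or (z and (a or b))))


def dr_nand(a, b, state=(0, 0, 0, 0)):
    """Dual-rail NAND from four C-elements and one 3-input OR.
    Defaults to the all-NULL initial state (set phase)."""
    (at, af), (bt, bf) = a, b
    c = (th22(af, bf, state[0]), th22(af, bt, state[1]),
         th22(at, bf, state[2]), th22(at, bt, state[3]))
    return (int(any(c[:3])), c[3]), c


def dr_not(a):
    """Inversion swaps the rails and costs no gates."""
    return (a[1], a[0])


def enc(bit):
    """Encode 0/1 as (true rail, false rail)."""
    return (1, 0) if bit else (0, 1)


def dr_mux(a, b, s):
    """MUX as three dual-rail NANDs (assumed wiring)."""
    x, _ = dr_nand(a, s)
    y, _ = dr_nand(b, dr_not(s))
    z, _ = dr_nand(x, y)
    return z


# Set-phase results for all data inputs.
results = {(a, b, s): dr_mux(enc(a), enc(b), enc(s))
           for a, b, s in product((0, 1), repeat=3)}

# Gate count of the naive expansion: 3 NANDs x (4 C-elements + 1 OR).
c_elements, or_gates = 3 * 4, 3 * 1
```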

25 Image Circuit of Dual-rail Expansion for MUX
[Figure: signals a.t, a.f, b.t, b.f, s.t, s.f, x.t, x.f, y.t, y.f, z.t, z.f]

26 Optimized with Design Compiler
The MUX circuit passes technology-independent optimization and is mapped to “images” of gates from the NCL library.
- image of th33w2: A(B+C)
- image of thXOR: AB+CD
[Figure: signals a.t, a.f, b.t, b.f, s.t, s.f, z.t, z.f]

27 Technology Mapping with Design Compiler
NCL circuit: images are replaced by gates with hysteresis.
44 transistors, 30% better than optimized DIMS.
[Figure: mapped circuit using thXOR and th33w2 gates; signals a.t, a.f, b.t, b.f, s.t, s.f, z.t, z.f]
[Figure: semi-static CMOS implementation of thXOR; labels e, f, m, n, k, th22, th24w2]

28 Outline

29 Optimization Flow
[Figure labels: Boolean circuit, translation (dual-rail package), Dual-rail image, optimization (Design Compiler), Optimized circuit, Mapped to images, tech. mapping (Design Compiler), Hysteresis gates, DIMS circuit, DI equivalence, Synchronous, Asynchronous, Virtual object, Real object]

30 Validation of Optimization
The validity of transformations (DI equivalence) is based on two properties:
- Functional equivalence of optimized and original circuits (under two-phase operation)
- Maintenance of DI properties in optimized circuit
Both are based on the properties of prime and irredundant networks and properties of algebraic factorization [Brayton’90, Hachtel’92].

31 Validation of Optimization: Idea of the Proof
- Starting point: prime and irredundant Boolean network (known to be 100% stuck-at testable [Scherz’72])
- Algebraic transformations: the set of test vectors for stuck-at faults is maintained [Hachtel’92]
- Induction by topological order
- Testability: each gate acknowledges input changes (delay insensitivity)
- Same for tree-based technology mapping

32 Outline

33 Manual vs. Synthesized Designs
Area (number of transistors).
For bigger circuits the Synthesis/Manual ratio is better (22% improvement for the biggest example).

34 Synchronous vs. NCL Design
Penalty in transistors:
- Dual-rail implementation
- Effective delay insensitivity
To reduce transistor count:
- Use four-rail encoding
- Improve architectural solutions, e.g., OR instead of MUX
- Compromise delay insensitivity
[Chart labels: gates, transistors]

35 Outline

36 Conclusions
- First methodology to use standard HDL and commercial tools both to simulate and synthesize asynchronous circuits
- The methodology is formally validated
- The results of the synthesis are acceptable

37 Future Tasks
- Reduce area/power without losing delay insensitivity (e.g., four-rail design)
- Relax DI requirements to reduce area (e.g., using timing assumptions)
- Use peephole optimizations (e.g., merge gates used for registration with their input gates)
- Write DesignWare components to get better performance for arithmetic units (infer hand-designed components)

38 Structural View on Sequential NCL
- Completion detection (request signal)
- Inverter (acknowledgement signal)

39 NCL Delay-sensitivity: Orphans
- Propagation of DATA/NULL through orphans is not acknowledged by the output
- Timing assumption: orphan paths are faster than the circuit cycle time
Orphans are:
- more local than the fundamental mode assumption (they concern particular paths)
- safer than isochronic forks (they compare wire delays to cycle time)
[Figure: circuit with orphans marked; signals A0, A1, B0, B1, S0, S1, C0, C1]

40 Orphans (continued)
Orphans:
- do not cross the completion boundaries
- could be avoided by adding observability points
[Figure: full adder with an orphan-containing gate; signals A0, A1, B0, B1, C0, C1, S0, S1]

