CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Lecture 4: Adders.

1 CSE 246: Computer Arithmetic Algorithms and Hardware Design Instructor: Prof. Chung-Kuan Cheng Lecture 4: Adders

2 Topics: Adders
- AND/OR gate vs. circuit
- Logic design
- Graph design (prefix adder)

3 Chapter 2: Adders
- Half adders
  A half adder adds two 1-bit binary numbers when there is no carry-in. If the inputs are x[i] and y[i], the sum and carry-out are given by
    s[i] = x[i] ^ y[i]
    c[i+1] = x[i].y[i]
- We use the following notation throughout the slides:
    .  means logical AND
    +  means logical OR
    ^  means logical XOR
    '  means complementation

4 Full Adder
- The inputs are x[i], y[i] (operand bits) and c[i] (carry-in).
- The outputs are s[i] (result bit) and c[i+1] (carry-out).
- Inputs and outputs are related by
    s[i] = x[i] ^ y[i] ^ c[i]
    c[i+1] = x[i].y[i] + c[i].(x[i] + y[i])
           = x[i].y[i] + c[i].(x[i] ^ y[i])
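The two carry expressions above can be checked exhaustively. The following is a minimal Python sketch, not part of the original slides; the function names `full_adder` and `carry_xor_form` are chosen here for illustration.

```python
from itertools import product

def full_adder(x, y, c):
    """One-bit full adder; returns (sum bit, carry-out)."""
    s = x ^ y ^ c
    c_out = (x & y) | (c & (x | y))      # carry written with OR
    return s, c_out

def carry_xor_form(x, y, c):
    """The same carry-out written with XOR instead of OR."""
    return (x & y) | (c & (x ^ y))

# Compare both forms with ordinary arithmetic on all eight input combinations.
for x, y, c in product((0, 1), repeat=3):
    s, c_out = full_adder(x, y, c)
    assert 2 * c_out + s == x + y + c
    assert c_out == carry_xor_form(x, y, c)
```

The two forms agree because they can only differ when x = y = 1, and in that case the term x.y already forces the carry to 1.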

5 Full Adder
- If the carry-in bit is zero, the full adder becomes a half adder.
- If the carry-in bit is one, then
    s[i] = (x[i] ^ y[i])'
    c[i+1] = x[i] + y[i]
- To add two n-bit numbers, we can chain n full adders to build a ripple carry adder.

6 Ripple Carry Adder
[Figure: chain of n full adders; stage i takes x[i], y[i] and c[i] and produces s[i] and c[i+1], with c[0] = cin and the carry out of the last stage = cout]
Overflow happens when the operands have the same sign and the result has a different sign. If we use 2's complement to represent negative numbers, overflow occurs when (cout ^ c[n-1]) is 1.
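As an illustration of the chain and of the overflow rule cout ^ c[n-1], here is a small Python sketch. It is not from the slides; `ripple_carry_add` and `to_signed` are hypothetical helper names, and words are modelled as Python integers.

```python
def ripple_carry_add(x, y, n, cin=0):
    """Add two n-bit words one full adder at a time.

    Returns (sum word, cout, overflow), where overflow = cout ^ c[n-1].
    """
    c = cin
    s = 0
    c_into_msb = cin
    for i in range(n):
        xi = (x >> i) & 1
        yi = (y >> i) & 1
        if i == n - 1:
            c_into_msb = c                    # c[n-1], the carry into the sign bit
        s |= (xi ^ yi ^ c) << i
        c = (xi & yi) | (c & (xi | yi))
    return s, c, c ^ c_into_msb

def to_signed(v, n):
    """Interpret an n-bit word as a 2's complement number."""
    return v - (1 << n) if v & (1 << (n - 1)) else v
```

For example, with n = 4 the call `ripple_carry_add(0b0111, 0b0001, 4)` adds 7 and 1; the result word 0b1000 reads as -8 in 2's complement, and the overflow flag is 1.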

7 Ripple Carry Adder
- For the sake of brevity, we use the following notation:
    g[i] = x[i].y[i]
    p[i] = x[i] + y[i]
- In terms of this notation, we can rewrite the carry equations as
    c[1] = g[0] + p[0].c[0]
    c[2] = g[1] + p[1].c[1]
  and so on. We use this notation from here on when discussing the design of other kinds of adders.
- It has been observed that the expected length of a carry chain is 2, while the expected maximal length of a carry chain is lg n. Hence ripple carry adders are fast on average, even though the worst-case chain spans all n bits.

8 Ripple Carry Adder
- How do we know that an adder has completed the operation?
    Worst-case scenario: wait for the longest chain in the carry propagation network.
    We might inspect c[i+1] and its complement b[i+1] to determine the status of the adder:

    c[i+1]  b[i+1]  Remark
      0       0     Not complete
      1       0     Complete
      0       1     Complete
      1       1     Don't care

9 Improvement to Ripple Carry Adder: Manchester Adders
- By intelligently using our device properties, we can reduce the complexity of the circuit used to compute carries in a ripple carry adder.
- Define: a[i] = (x[i])'.(y[i])'
- Next we observe that c[i+1] is 1 in exactly these scenarios:
    g[i] is 1, i.e. both x[i] and y[i] are 1
    c[i] is 1 and it is propagated because p[i] is 1
- c[i+1] is 'pulled down' to logic 0 irrespective of the value of c[i] when a[i] is 1, i.e. both x[i] and y[i] are 0.
- From these conditions, and keeping in mind the general characteristics of transistor devices, we can design simplified circuits for computing carries, as shown in the next slide.
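The three cases (generate, pull down, propagate) can be modelled at the logic level as follows. This is only a behavioural Python sketch under the definitions above, not a model of the transistor circuit; `manchester_carries` is a name introduced here.

```python
def manchester_carries(x, y, n, c0=0):
    """Return [c[0], ..., c[n]] using generate / annihilate / propagate cases."""
    c = [c0]
    for i in range(n):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        g = xi & yi                  # generate: c[i+1] is forced to 1
        a = (1 - xi) & (1 - yi)      # a[i]: c[i+1] is pulled down to 0
        if g:
            c.append(1)
        elif a:
            c.append(0)
        else:                        # exactly one operand bit is 1: pass c[i] along
            c.append(c[i])
    return c
```

For x = 0110 and y = 0011 the sketch returns the carries [0, 0, 1, 1, 0]: bit 1 generates, bit 2 propagates, and bit 3 pulls the carry down.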

10 Improvement to Ripple Carry Adder: Manchester Adders

11 Implementation of Manchester Adder using MOS transistors
This is essentially the same circuit for computing the carry, but implemented with MOS devices.

12 Manchester Adder: Alternate design
- We divide the computation cycle into two distinct half-cycles: 'precharge' and 'evaluate'. In the precharge half-cycle, g[i] and c[i+1] are assigned a tentative value of logic 1. This is evaluated in the next half-cycle with the actual value of a[i].
- The actual circuit for computing carries is shown in the next slide.

13 Manchester Adder: Alternate design
[Figure: circuit with a timing diagram over time showing the precharge and evaluation phases; signal Q]

14 Carry Look-ahead Adder
- In a carry look-ahead adder, m full adders are grouped together (m is usually equal to 4). Once the carry-in to the group is known, all the internal carries and the output carry are calculated simultaneously.
- We can use some algebraic manipulation to minimize hardware complexity.
- Consider the carry out of the group:
    c[i] = g[i-1] + p[i-1].c[i-1]
  Substituting the value of c[i-1], we can rewrite this as
    c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].c[i-2]
  Proceeding in this manner we get
    c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i-2].p[i-3].g[i-4] + p[i-1].p[i-2].p[i-3].p[i-4].c[i-4]
  To further simplify the equation, we note that g[i-1] = g[i-1].p[i-1] (because p[i-1] = x[i-1] + y[i-1]), so p[i-1] can be factored out.
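The expanded carry equations for a group of m = 4 can be compared with the bit-by-bit recurrence. The Python sketch below is illustrative only; `lookahead_group`, `ripple_group` and `gp_bits` are names introduced here, and bit 0 of each list is the least significant bit of the group.

```python
def gp_bits(x, y, m=4):
    """Generate and propagate bits of an m-bit group, with p[i] = x[i] + y[i]."""
    g = [(x >> i) & (y >> i) & 1 for i in range(m)]
    p = [((x >> i) | (y >> i)) & 1 for i in range(m)]
    return g, p

def lookahead_group(g, p, c_in):
    """Carries c1..c4, each written directly in terms of the group carry-in."""
    c1 = g[0] | (p[0] & c_in)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c_in))
    return [c1, c2, c3, c4]

def ripple_group(g, p, c_in):
    """The same carries computed one after another."""
    carries, c = [], c_in
    for gi, pi in zip(g, p):
        c = gi | (pi & c)
        carries.append(c)
    return carries
```

Every carry in `lookahead_group` depends only on g, p and the group carry-in, which is what allows all of them to be computed simultaneously in hardware.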

15 Ling's Adder
    c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i-2].p[i-3].g[i-4] + p[i-1].p[i-2].p[i-3].p[i-4].c[i-4]
We replace p[i] = x[i] ^ y[i] with t[i] = x[i] + y[i]. Because g[i] = g[i].t[i], we have
    c[i] = g[i-1].t[i-1] + t[i-1].g[i-2] + t[i-1].t[i-2].g[i-3] + t[i-1].t[i-2].t[i-3].g[i-4] + t[i-1].t[i-2].t[i-3].t[i-4].c[i-4]
Let
    h[i] = g[i-1] + g[i-2] + t[i-2].g[i-3] + t[i-2].t[i-3].g[i-4] + t[i-2].t[i-3].t[i-4].t[i-5].h[i-4]
Then
    c[i] = h[i].t[i-1]

16 Ling's Adder
    h[0] = c[0]
    h[3] = g[2] + g[1] + t[1].g[0] + t[1].t[0].h[0]
    s[3] = p[3] ^ c[3] = p[3] ^ (h[3].t[2])
         = p[3]'.h[3].t[2] + p[3].(h[3]' + t[2]')
         = h[3]'.p[3] + h[3].(p[3] ^ t[2])
    h[6] = g[5] + g[4] + t[4].g[3] + t[4].t[3].t[2].h[3]
    s[6] = h[6]'.p[6] + h[6].(p[6] ^ t[5])
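The pseudo-carry relations can be sanity-checked against ordinary addition. The Python sketch below is an illustration, not the lecture's implementation: it uses 3-bit groups as in the h[3] and h[6] equations, selects the sum bit as p when h = 0 and p ^ t when h = 1, and the names `bit` and `ling_sum_bits` are introduced here.

```python
def bit(v, i):
    return (v >> i) & 1

def ling_sum_bits(x, y, c0):
    """Return (s[3], s[6]) computed from the pseudo-carries h[3] and h[6]."""
    g = [bit(x, i) & bit(y, i) for i in range(7)]
    t = [bit(x, i) | bit(y, i) for i in range(7)]
    p = [bit(x, i) ^ bit(y, i) for i in range(7)]
    h0 = c0
    h3 = g[2] | g[1] | (t[1] & g[0]) | (t[1] & t[0] & h0)
    # c[3] = h[3].t[2], so s[3] is p[3] when h[3] = 0 and p[3] ^ t[2] when h[3] = 1
    s3 = ((1 - h3) & p[3]) | (h3 & (p[3] ^ t[2]))
    h6 = g[5] | g[4] | (t[4] & g[3]) | (t[4] & t[3] & t[2] & h3)
    s6 = ((1 - h6) & p[6]) | (h6 & (p[6] ^ t[5]))
    return s3, s6
```

The point of the construction is that h[i] has fewer literals per term than c[i]; the factor t[i-1] is folded into the sum-bit selection instead.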

17 Generalized Design for Adders: Prefix Adder
- Prefix computation
    Given n inputs x[1], x[2], x[3], …, x[n] and an associative operator ×, we want to compute
      y[i] = x[i] × x[i-1] × x[i-2] × … × x[2] × x[1]   for all i, 1 ≤ i ≤ n
    x can be a scalar, vector or matrix.
- For the design of adders, we define the operator × in the following manner:
    (g, p) = (g'', p'') × (g', p')
    g = g'' + p''.g'
    p = p''.p'
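Associativity is what allows the operator to be applied in any grouping. A brute-force Python check over all (g, p) pairs is sketched below; `combine` is a name introduced here, with its first argument standing for the more significant block.

```python
from itertools import product

def combine(hi, lo):
    """(g, p) of a block built from a more significant part hi and a less significant part lo."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)

pairs = list(product((0, 1), repeat=2))

# Associative: the grouping of three operands does not matter.
for a, b, c in product(pairs, repeat=3):
    assert combine(combine(a, b), c) == combine(a, combine(b, c))

# Not commutative: the order of the operands does matter.
assert combine((1, 1), (0, 0)) != combine((0, 0), (1, 1))
```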

18 Alternate modeling of Prefix Computer: Finite State Machine
- A finite state machine has a set of states, and it 'moves' from one state to another according to its input. Mathematically,
    s[k] = f(s[k-1], a[k-1])
- The problem is to determine the final state s[n] in O(lg n) parallel steps, given the initial state s[0] and the sequence of inputs (a[0], a[1], …, a[n-1]).
- This problem can be formulated in terms of prefix computation.

19 Alternate modeling of Prefix Computer: Finite State Machine
- We assume that the number of states is small and finite.
- Let s[k] = f_a[k-1](s[k-1]); the function f_a[k-1] can be represented by a matrix M_a[k-1].
- Now we are ready to represent our problem in terms of prefix computation.

20 Alternate Modeling of Prefix Computer: Finite State Machine
- The algorithm (with the inputs indexed a[1], …, a[n]):
    1. Compute M_a[i] in parallel.
    2. Compute
         N[1] = M_a[1]
         N[2] = M_a[2].M_a[1]
         …
         N[n] = M_a[n].M_a[n-1] … M_a[1]
    3. Compute S[i] = N[i](S[0]).

21 Prefix Computation
- FSM example. Given:
    initial state S[0] = A
    a sequence of inputs: (0 0 1 1 1 0 1 0 1)
  Derive the sequence of outputs.
[Figure: state diagram over states A, B, C with input/output edge labels 0/0, 1/0, 0/0, 1/0, 0/0, 1/1]

State table:
    PS   Next state (X=0)   Next state (X=1)
    A    B                  A
    B    B                  C
    C    B                  A

M0 (X=0):        M1 (X=1):
    PS   NS          PS   NS
    A    B           A    A
    B    B           B    C
    C    B           C    A

Compute the N's for the inputs 0, 0, 1, 1, …:
    N[1] = M0
    N[2] = M0.M0
    N[3] = M1.M0.M0
    N[4] = M1.M1.M0.M0
    …

    PS   NS under N[1], N[2]   NS under N[3]   NS under N[4]
    A    B                     C               A
    B    B                     C               A
    C    B                     C               A
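The example can be replayed in a few lines of Python. This is a sketch: the transition maps are stored as dictionaries rather than matrices, the prefix compositions are evaluated one after another rather than by a parallel prefix network (associativity of composition is what would permit the latter), and `compose` and `prefix_maps` are names introduced here.

```python
STATES = ("A", "B", "C")
M = {
    0: {"A": "B", "B": "B", "C": "B"},   # transition map M0 (input 0)
    1: {"A": "A", "B": "C", "C": "A"},   # transition map M1 (input 1)
}

def compose(later, earlier):
    """Map obtained by applying `earlier` first and `later` second."""
    return {s: later[earlier[s]] for s in STATES}

def prefix_maps(inputs):
    """[N1, N2, ...], where the i-th map composes the first i transition maps."""
    maps, acc = [], None
    for a in inputs:
        acc = M[a] if acc is None else compose(M[a], acc)
        maps.append(acc)
    return maps

inputs = (0, 0, 1, 1, 1, 0, 1, 0, 1)
N = prefix_maps(inputs)
states = [n["A"] for n in N]             # state after each input, starting from A
print("".join(states))
```

Starting from S[0] = A, the sketch prints the visited states B, B, C, A, A, B, C, B, C. The outputs are not reproduced because the edge labels of the state diagram cannot be assigned to transitions from the transcript alone.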

22 Graph Based Approach
- Consider the (g, p) chain; break the long paths.
[Figure: carry chain from C1 to C4 through (g1, p1), (g2, p2), (g3, p3)]

23 Graph Based Approach
- Generating g32 and p32
[Figure: the chain from C1 to C4, with (g3, p3) and (g2, p2) combined into (g32, p32)]

24 Graph Based Approach
- Generating g10 and p10
[Figure: the chain from c_in to C4, with (g1, p1) and c_in combined into (g10, p10)]

25 Graph Based Approach
- Generating g30 and p30
[Figure: (g32, p32) and (g10, p10) combined into (g30, p30); (g32, p32) comes from (g3, p3) and (g2, p2), and (g10, p10) from (g1, p1) and c_in]

26 Boolean Approach
    g4 + p4 (g3 + p3 (g2 + p2 (g1 + p1 (g0 + p0 c_in))))
Combining adjacent pairs step by step:
    (g4, p4)  (g3, p3)  (g2, p2)  (g1, p1)  (g0, p0)  c_in
    (g4 + p4 g3, p4 p3)  (g2 + p2 g1, p2 p1)  (g0, p0)  c_in
    (g4 + p4 g3 + p4 p3 (g2 + p2 g1), p4 p3 p2 p1)  (g0, p0)  c_in
    (g4 + p4 g3 + p4 p3 (g2 + p2 g1) + (p4 p3 p2 p1) g0, (p4 p3 p2 p1) p0)  c_in

27 Prefix Adder
- Given:
    n inputs (g[i], p[i])
    an operation o
- Compute: y[i] = (g[i], p[i]) o … o (g[1], p[1])   (1 <= i <= n)
- Associativity: (A o B) o C = A o (B o C)
- (g'', p'') o (g', p') = (g, p)
    g = g'' + p''.g'
    p = p''.p'
- Inputs:
    g[i] = a             if i = 1
         = a[i].b[i]     otherwise
    p[i] = 1             if i = 1
         = a[i] xor b[i] otherwise
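A complete adder built on the operation o can be sketched in Python as follows. Several choices here are assumptions made for illustration and are not taken from the slides: the scan uses recursive doubling (one of many valid prefix networks), every bit uses g[i] = a[i].b[i] and p[i] = a[i] xor b[i] with bits numbered from 0, and the carry-in is folded in at the end as (g, p) o c = g + p.c.

```python
def combine(hi, lo):
    """(g'', p'') o (g', p') with hi the more significant operand."""
    return (hi[0] | (hi[1] & lo[0]), hi[1] & lo[1])

def prefix_scan(pairs):
    """y[i] = pairs[i] o ... o pairs[0], computed by recursive doubling."""
    y = list(pairs)
    step = 1
    while step < len(y):
        # Every position reads the values of the previous round only.
        y = [combine(y[i], y[i - step]) if i >= step else y[i]
             for i in range(len(y))]
        step *= 2
    return y

def prefix_add(a, b, n, c_in=0):
    """Add two n-bit words; returns (n-bit sum, carry-out)."""
    gp = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    y = prefix_scan(gp)
    carries = [c_in] + [g | (p & c_in) for g, p in y]   # carry into bit i
    s = 0
    for i in range(n):
        s |= (gp[i][1] ^ carries[i]) << i
    return s, carries[n]
```

The while loop runs about lg n times, which corresponds to the depth of the network; a different network would trade depth against the number of `combine` nodes.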

28 Prefix Adder: Graph Representation
- Example: Ripple Carry Adder
[Figure: prefix graph; inputs a[i], b[i] produce (g[i], p[i]); a node with inputs x and y outputs x o y]

29 Prefix Adders: Conditional Sum Adder
[Figure: prefix graph over inputs 8, 7, …, 1]

30 Prefix Adders: Conditional Sum Adder
- For output y[i], there is an alphabetical tree covering inputs (x[i], x[i-1], …, x[1]).
- Alphabetical tree:
    binary tree
    edges do not cross
[Figure: prefix graph over inputs 8, 7, …, 1]

31 Prefix Adders: Conditional Sum Adder
- From input x[1], there is a tree covering all outputs (y[i], y[i-1], …, y[1]).
- The nodes in this tree can be reduced to (g, p) o c = g + p.c
[Figure: prefix graph over inputs 8, 7, …, 1]

32 Prefix Adders: size and depth
- Objective:
    Minimize the number of nodes, s_c(n)
    Minimize the depth, d_c(n)
- Ripple Carry Adder: s_c(8) = 7, d_c(8) = 7, total = 14
- Conditional Sum Adder: s_c(8) = 12, d_c(8) = 3, total = 15
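The counts above can be reproduced by building the two networks and counting nodes and levels. The Python sketch below makes an assumption: the conditional sum adder is modelled by the Sklansky divide-and-conquer network, which has the same size and depth for n = 8. The names `serial_network`, `sklansky_network` and `size_and_depth` are introduced here, and positions are numbered from 0.

```python
def serial_network(n):
    """Ripple-carry style: one node per level, chaining position i onto i-1."""
    return [[(i, i - 1)] for i in range(1, n)]

def sklansky_network(n):
    """Divide-and-conquer network for n a power of two."""
    levels, k = [], 0
    while (1 << k) < n:
        levels.append([(i, ((i >> k) << k) - 1)
                       for i in range(n) if (i >> k) & 1])
        k += 1
    return levels

def size_and_depth(levels, n):
    """Count nodes and levels, checking that output i ends up covering inputs 0..i."""
    span = [(i, i) for i in range(n)]            # (lowest, highest) input covered
    for level in levels:
        new = list(span)
        for i, j in level:                       # node at i combines with position j
            assert span[i][0] == span[j][1] + 1  # operands must be adjacent
            new[i] = (span[j][0], span[i][1])
        span = new
    assert all(span[i] == (0, i) for i in range(n))
    return sum(len(level) for level in levels), len(levels)

print(size_and_depth(serial_network(8), 8))      # ripple carry
print(size_and_depth(sklansky_network(8), 8))    # divide and conquer
```

The adjacency check inside `size_and_depth` mirrors the alphabetical-tree condition: a node may only merge two neighbouring ranges of inputs.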

33 Prefix Adder – Well-known and Well-developed?
- Classic prefix networks: Sklansky, Kogge-Stone, Brent-Kung, Ladner-Fischer, Han-Carlson, Knowles, etc.

34 Prefix Adders: Brent–Kung Adder
[Figure: Brent-Kung prefix graph over inputs 15, 14, …, 0]
s_c(16) = 26, d_c(16) = 6, total = 32
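The figures s_c(16) = 26 and d_c(16) = 6 can be reproduced by constructing the network. The Python sketch below is one common way to lay out a Brent-Kung network, written for illustration and not taken from the slide's figure; it assumes n is a power of two with n >= 4, and the function names are introduced here.

```python
def brent_kung_network(n):
    """Levels of (i, j) nodes, where the node at position i combines with position j."""
    m = n.bit_length() - 1                       # log2(n)
    # Up-sweep: build blocks of size 2, 4, 8, ...
    up = [[(i, i - (1 << k)) for i in range(n) if (i + 1) % (1 << (k + 1)) == 0]
          for k in range(m)]
    # Down-sweep: fill in the remaining prefixes, largest stride first.
    down = [[(i, i - (1 << k))
             for i in range((1 << (k + 1)) + (1 << k) - 1, n, 1 << (k + 1))]
            for k in range(m - 2, -1, -1)]
    # The last up-sweep node and the first down-sweep level do not depend
    # on each other, so they can share one level.
    return up[:-1] + [up[-1] + down[0]] + down[1:]

def size_and_depth(levels, n):
    """Count nodes and levels, checking that output i ends up covering inputs 0..i."""
    span = [(i, i) for i in range(n)]
    for level in levels:
        new = list(span)
        for i, j in level:
            assert span[i][0] == span[j][1] + 1  # operands must be adjacent
            new[i] = (span[j][0], span[i][1])
        span = new
    assert all(span[i] == (0, i) for i in range(n))
    return sum(len(level) for level in levels), len(levels)

print(size_and_depth(brent_kung_network(16), 16))
```

For n = 16 this gives 15 up-sweep nodes and 11 down-sweep nodes in 6 levels, so size + depth = 32, against the lower bound 2n - 2 = 30.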

35 Prefix Adder – New Respects, New Method
- Realistic design considerations: timing, power and area.
- Integer linear programming for prefix adders:
    Logical effort timing model (gate cap. + wire cap.)
    Activity-statistic power model
    Non-uniform signal arrival/required times
[Figure: logic levels, max fanouts and max wire tracks related to timing, power and area]

36 Prefix Adder – Optimum Prefix adders
- Uniform signal arrival/required times
[Figure: four panels: Sklansky adder; Kogge-Stone adder; fastest depth-4 optimal prefix adder; fastest depth-3 optimal prefix adder]

37 Prefix Adder – Optimum Prefix adders
- Uniform signal arrival/required times

38 The Big Picture
What is the minimum depth of zero-deficiency circuits for a given width?

39 Proof for Snir's Theorem
Theorem: Given an arbitrary prefix graph of width n, we have depth + size ≥ 2n - 2.
Proof:
- Consider the alphabetical tree rooted at the MSB output with all the input nodes being its leaves.
- The size of this tree is n-1, while its depth is d_M.
- At most d_M prefix outputs can be generated from this tree.
- At least one extra node is needed for each column where the prefix result is not ready.
Consequently size ≥ (n-1) + (n - (d_M + 1)) = 2n - 2 - d_M, which gives size + depth ≥ 2n - 2.

40 Definitions
For a prefix circuit, define:
- Backbone: the binary alphabetical tree generating the MSB prefix output.
- Affiliated tree: the tree rooted at the LSB input, with all the prefix outputs (except the MSB output) as its tree nodes.
- Ridge: the path from the LSB input to the MSB output.
[Figure: prefix circuit with the backbone and the affiliated tree marked]

41 How to … ?
- Look from the MSB output.
- Since the circuit is of zero deficiency, the ridge has exactly d nodes (excluding the first input node), one node per level.
- The idea: try to stretch the ridge as long as possible while maintaining zero deficiency.

42 T-tree
- Definition of T_k(k) tree

43 T-tree example – T_3(5)

44 A-tree
- Definition of A_k(t) tree

45 A-tree example – A_3(5)

46 Compound of A-tree and T-tree

47 Example

48 Proposed Prefix Circuit

49 An Example: Z(d)|d=8
[Figure: Z(8) circuit with column labels 59, 80, 81, 88 and components T_2(6) + A_2(6) and T_1(7) + A_1(7); width = 88]

50 The width of Z(d) Circuit
- The width of the Z(d) circuit is N_z(d) = F(d+3) - 1 (d ≥ 1), where F(i) are the Fibonacci numbers.
- Numerical comparison (width for a given depth d):

     d     LS    LYD       Z
     3      7      7       7
     4     11             12
     5     16             20
     6     23             33
     7     33             54
     8     47     77      88
     9     66     95     143
    10     95    135     232
    11    131    169     376
    12    191    242     609
    13    260    308     986
    14    383    446    1596
    15    517    576    2583
    16    575    843    4180
    17   1030   1101    6764
    18   1535   1625   10945
    19   2055   2139   17710
    20   3071   3176   28656
    21   4104   4202   46367
    22   6143   6264   75024

  (Rows d = 4 to 7 list only two values in the source; the Z column there follows from the formula above.)
  LS: design by Lin & Shih, 1999
  LYD: design by S. Lakshmivarahan, C.M. Yang & S.K. Dhall, 1987
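The width formula is easy to tabulate. The Python sketch below assumes the indexing F(1) = F(2) = 1, which is the convention that reproduces the width of 88 quoted for d = 8; `fib` and `z_width` are names introduced here.

```python
def fib(i):
    """Fibonacci numbers with F(1) = F(2) = 1."""
    a, b = 1, 1
    for _ in range(i - 1):
        a, b = b, a + b
    return a

def z_width(d):
    """Width of the Z(d) circuit, N_z(d) = F(d+3) - 1."""
    return fib(d + 3) - 1

for d in (3, 8, 13, 18):
    print(d, z_width(d))
```

A 64-bit adder therefore needs d = 8, since z_width(7) = 54 is too narrow and z_width(8) = 88 is wide enough.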

51 Comparison
- 64-bit case
- Based on the logical effort method, to include the fan-out effect and interconnect capacitance
- Five adders:
    Z64: a 64-bit Z(d) circuit derived from Z(d)|d=8
    BK: Brent-Kung adder
    Sklansky adder
    KS: Kogge-Stone adder
    HC: Han-Carlson adder

52 Results
- w is the weight for lateral interconnect capacitance; KS and HC have a large w value to compensate for the coupling effect.
- The Z64 and BK adders have similar delay and area, but Z64 could be more power efficient because it has fewer logic levels.

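53 Carry Skip Adder
[Figure: three adder blocks in a chain. A0 takes a[3,0], b[3,0] and c_in; A1 takes a[7,4], b[7,4] and c4; A2 takes a[11,8], b[11,8] and c8. Each block is followed by a 0/1 MUX controlled by the block propagate signal (p[3,0], p[7,4], p[11,8]); the MUX outputs are c4, c8 and c12. A signal in the figure is labelled x.]
If p[3,0] = p3.p2.p1.p0 = 1, then x = c_in.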
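The behaviour of the skip MUX can be modelled in Python. This is a functional sketch only and says nothing about timing: each block is stood in for by integer addition, the block propagate signal uses p[i] = a[i] xor b[i], and `carry_skip_add` with its parameters is a name introduced here.

```python
def carry_skip_add(a, b, n_blocks=3, block=4, c_in=0):
    """Add two words block by block; returns (sum word, final carry)."""
    mask = (1 << block) - 1
    s, c = 0, c_in
    for k in range(n_blocks):
        lo = k * block
        ab, bb = (a >> lo) & mask, (b >> lo) & mask
        total = ab + bb + c                  # stands in for the ripple block A_k
        s |= (total & mask) << lo
        block_p = (ab ^ bb) == mask          # all propagate bits of the block are 1
        ripple_out = total >> block          # carry-out produced by the block
        # The MUX: take the skip path when the whole block propagates,
        # otherwise take the carry-out of the block.
        c = c if block_p else ripple_out
    return s, c
```

When the whole block propagates, the block's own carry-out equals its carry-in, so both MUX inputs carry the same value; the skip path only makes that value available sooner.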

54 Carry Propagation Paths
- A2 <- MUX <- MUX <- c_in
- A2 <- MUX <- A1
- A2 <- MUX <- MUX <- A0
- c12 <- MUX <- A2
- c12 <- MUX <- MUX <- A1
- c12 <- MUX <- MUX <- MUX <- A0
- c12 <- MUX <- MUX <- MUX <- MUX <- c_in

55 False Path
- A1 <- MUX <- A0 <- c_in is a false path.
    If the carry comes from c_in, then the block must have p3.p2.p1.p0 = 1.
    Since p[3,0] = 1, g[3,0] must be 0.
    The carry is not generated in A0.
    The carry need not propagate through A0; it goes through the MUX instead.

56 Label Algorithm
- Problem:
    Given a digraph and a set of false paths, derive the longest path of the graph.
- Algorithm:
    Assign a label to the edges on each false path.
    While a walk stays on edges with the same label, its length is accumulated.
    Otherwise, the walk changes to no label.
