Presentation is loading. Please wait.

Presentation is loading. Please wait.

Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders ECE 645: Lecture 5.

Similar presentations


Presentation on theme: "Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders ECE 645: Lecture 5."— Presentation transcript:

1 Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders ECE 645: Lecture 5

2 Required Reading Chapter 6.4, Carry Determination as Prefix Computation Chapter 6.5, Alternative Parallel Prefix Networks Chapter 7.4, Conditional-Sum Adder Chapter 7.5, Hybrid Adder Designs Chapter 7.1, Simple Carry-Skip Adders Note errata at: http://www.ece.ucsb.edu/~parhami/text_comp_arit_1ed.htm#errors Behrooz Parhami, Computer Arithmetic: Algorithms and Hardware Design

3 Recommended Reading J-P. Deschamps, G. Bioul, G. Sutter, Synthesis of Arithmetic Circuits: FPGA, ASIC and Embedded Systems Chapter 11.1.9, Prefix Adders Chapter 11.1.3, Carry-Skip Adder Chapter 11.1.4, Optimization of Carry-Skip Adders Chapter 11.1.10, FPGA Implementations of Adders Chapter 11.1.11, Long-Operand Adders

4 Parallel Prefix Network Adders

5 5 Basic component - Carry operator (1) B”B’ g” p” g’ p’ gp g = g” + g’p” p = p’p” (g, p) = (g’, p’) ¢ (g”, p”) = (g” + g’p”, p’p”) B Parallel Prefix Network Adders

6 6 Basic component - Carry operator (2) B”B’ g” p” g’ p’ gp g = g” + g’p” p = p’p” (g, p) = (g’, p’) ¢ (g”, p”) = (g” + g’p”, p’p”) B overlap okay! Parallel Prefix Network Adders

7 7 Associative Not commutative [(g 1, p 1 ) ¢ (g 2, p 2 )] ¢ (g 3, p 3 ) = (g 1, p 1 ) ¢ [(g 2, p 2 ) ¢ (g 3, p 3 )] (g 1, p 1 ) ¢ (g 2, p 2 )  (g 2, p 2 ) ¢ (g 1, p 1 ) Properties of the carry operator ¢

8 8 Major concept Given: Find: (g 0, p 0 ) (g 1, p 1 ) (g 2, p 2 ) …. (g k-1, p k-1 ) (g [0,0], p [0,0] ) (g [0,1], p [0,1] ) (g [0,2], p [0,2] ) … (g [0,k-1], p [0,k-1] ) c i = g [0,i-1] + c 0 p [0,i-1] Parallel Prefix Network Adders block generate from index 0 to k-1

9 9 Parallel Prefix Sum Problem Given: Find: Parallel Prefix Adder Problem Given: Find: x 0 x 0 +x 1 x 0 +x 1 +x 2 … x 0 +x 1 +x 2 + …+ x k-1 x 0 x 1 x 2 … x k-1 x 0 x 0 ¢ x 1 x 0 ¢ x 1 ¢ x 2 … x 0 ¢ x 1 ¢ x 2 ¢ … ¢ x k-1 x 0 x 1 x 2 … x k-1 where x i = (g i, p i ) Similar to Parallel Prefix Sum Problem

10 10 Parallel Prefix Sums Network I

11 11 Cost = C(k) = 2 C(k/2) + k/2 = = 2 [2C(k/4) + k/4] + k/2 = 4 C(k/4) + k/2 + k/2 = = …. = = 2 log k-1 C(2) + k/2 (log 2 k-1) = = k/2 log 2 k 2 Example: C(16) = 2 C(8) + 8 = 2[2 C(4) + 4] + 8 = = 4 C(4) + 16 = 4 [2 C(2) + 2] + 16 = = 8 C(2) + 24 = 8 + 24 = 32 = (16/2) log 2 16 C(2) = 1 Parallel Prefix Sums Network I – Cost (Area) Analysis

12 12 Delay = D(k) = D(k/2) + 1 = = [D(k/4) + 1] + 1 = D(k/4) + 1 + 1 = = …. = = log 2 k Example: D(16) = D(8) + 1 = [D(4) + 1] + 1 = = D(4) + 2 = [D(2) + 1] + 2 = = 4 = log 2 16 D(2) = 1 Parallel Prefix Sums Network I – Delay Analysis

13 13 Parallel Prefix Sums Network II (Brent-Kung)

14 14 Cost = C(k) = C(k/2) + k-1 = = [C(k/4) + k/2-1] + k-1 = C(k/4) + 3k/2 - 2 = = …. = = C(2) + (2k - 2k/2 (log k-1) ) - (log 2 k-1) = = 2k - 2 - log 2 k Example: C(16) = C(8) + 16-1 = [C(4) + 8-1] + 16-1 = = C(2) + 4-1 + 24-2 = 1 + 28 - 3 = 26 = 2·16 - 2 - log 2 16 C(2) = 1 2 Parallel Prefix Sums Network II – Cost (Area) Analysis

15 15 Delay = D(k) = D(k/2) + 2 = = [D(k/4) + 2] + 2 = D(k/4) + 2 + 2 = = …. = = 2 log 2 k - 1 Example: D(16) = D(8) + 2 = [D(4) + 2] + 2 = = D(4) + 4 = [D(2) + 2] + 4 = = 7 = 2 log 2 16 - 1 D(2) = 1 Parallel Prefix Sums Network II – Delay Analysis

16 16 8-bit Brent-Kung Parallel Prefix Network

17 17 x1’x1’x3’x3’ s1’s1’s3’s3’ 2 –bit B-K PPN x5’x5’x7’x7’ s5’s5’s7’s7’ 4-bit Brent-Kung Parallel Prefix Network

18 18 8-bit Brent-Kung Parallel Prefix Network Critical Path

19 19 Critical Path GP c C S g i = x i y i p i = x i  y i g = g” + g’ p” p = p’ p” c i+1 = g [0,i] + c 0 p [0,i] s i = p i  c i 1 gate delay 2 gate delays 1 gate delay

20 20 Brent-Kung Parallel Prefix Graph for 16 Inputs

21 21 Kogge-Stone Parallel Prefix Graph for 16 Inputs

22 22 Comparison of architectures Hybrid Network 2 Brent-Kung Kogge-Stone Cost(k) Delay(k) Cost(16) Delay(16) Cost(32) Delay(32) 2 log 2 k - 2 2k - 2 - log 2 k log 2 k+1 log 2 k k/2 log 2 k k log 2 k - k + 1 6 5 4 26 32 49 8 6 5 57 80 129 Parallel Prefix Network Adders

23 23 Latency vs. Area Tradeoff

24 24 Hybrid Brent-Kung/Kogge-Stone Parallel Prefix Graph for 16 Inputs

25 Conditional-Sum Adders

26 One-level k-bit Carry-Select Adder

27 Two-level k-bit Carry Select Adder

28 28 Conditional Sum Adder Extension of carry-select adder Carry select adder One-level using k/2-bit adders Two-level using k/4-bit adders Three-level using k/8-bit adders Etc. Assuming k is a power of two, eventually have an extreme where there are log 2 k-levels using 1-bit adders This is a conditional sum adder

29 29 Conditional Sum Adder: Top-Level Block for One Bit Position

30 30 Three Levels of a Conditional Sum Adder 2 2 2 2 x i+3 y i+3 c=0 c=1 2 2 x i+2 y i+ 2 c=0 c=1 1 1 1 1 3 c=0 3 c=1 2 2 2 2 x i+1 y i+1 c=0 c=1 2 2 1-bit conditional sum block xixi yiyi c=0 c=1 1 1 1 1 3 c=0 3 c=1 22 1 1 3 3 5 5 c=0 1+1 2+1 4+1 branch point concatenation block carry-in determines selection

31 31 16-Bit Conditional Sum Adder Example

32 32 Conditional Sum Adder Metrics

33 Hybrid Adders

34 A Hybrid Ripple-Carry/Carry-Lookahead Adder

35 A Hybrid Carry-Lookahead/Carry-Select Adder

36 Carry-Skip Adders Fixed-Block-Size

37 Apr. 2008Computer Arithmetic, Addition/SubtractionSlide 37 7.1 Simple Carry-Skip Adders Fig. 7.1 Converting a 16-bit ripple-carry adder into a simple carry-skip adder with 4-bit skip blocks.

38 Apr. 2008Computer Arithmetic, Addition/SubtractionSlide 38 Another View of Carry-Skip Addition Street/freeway analogy for carry-skip adder.

39 Apr. 2008Computer Arithmetic, Addition/SubtractionSlide 39 Mux-Based Skip Carry Logic The carry-skip adder with “OR combining” works fine if we begin with a clean slate, where all signals are 0s at the outset; otherwise, it will run into problems, which do not exist in mux-based version 0101 p [4j, 4j+3] c 4j+4 Fig. 10.7 of arch book

40 Apr. 2008Computer Arithmetic, Addition/SubtractionSlide 40 Carry-Skip Adder with Fixed Block Size Block width b; k/b blocks to form a k-bit adder (assume b divides k) Example: k = 32, b opt = 4, T opt = 12.5 stages (contrast with 32 stages for a ripple-carry adder) T fixed-skip-add = (b – 1) + 0.5 + (k/b – 2) + (b – 1) in block 0 OR gate skips in last block  2b + k/b – 3.5 stages dT/db = 2 – k/b 2 = 0  b opt =  k/2 T opt = 2  2k – 3.5... 1

41 Fixed-Block-Size Carry-Skip Adder (1) Notation & Assumptions Delay of skip logic = Delay of one stage of ripple-carry adder = 1 delay unit Latency of the carry-skip adder with fixed block width Fixed block size - b bits Latency fixed-carry-skip = ( b - 1 ) + 0.5 + - 2 + ( b - 1 ) in block 0 OR gate skipsin last block Adder size - k-bits = 2b + - 3.5 k b k b Number of stages - t

42 Fixed-Block-Size Carry-Skip Adder (2) Optimal fixed block size dLatency fixed-carry-skip db =2 - k b2b2 =0 b opt =  k 2 t opt = k b opt =  2 k Latency fixed-carry-skip opt = 2  k 2 + k  k 2 - 3.5 = =  2 k +  - 3.5=  2 k 2 - 3.5

43 k b opt Latency fixed-carry-skip 32 16 4 Latency ripple-carry Latency look-ahead 64 128 Fixed-Block-Size Carry-Skip Adder (3) 8 t opt 8 16 12.5 28.5 32 128 6.5 4.5 8.5 2 3 8 5 16 5 6 13 11 18.5 64 7.5 18.5

44 Carry-Chain & Carry-Skip Adders in Xilinx FPGAs

45 LUT 01 xixi yiyi c i+1 00110011 01010101 yiyiyiyi cici cici xixi yiyi pipi cici sisi Basic Cell of a Carry-Chain Adder in Xilinx FPGAs

46 LUT 01 Carry-Skip Adder b-bit block LUT 01 01 xibxib....... yibyib pibpib cibcib sibsib c i  b+1 s i  b+1 x i  b+1 y i  b+1 p i  b+1 x i  b+(b-1) y i  b+(b-1) p i  b+(b-1) cc (i+1)  b s i  b+(b-1)

47 LUT 01 Carry-Skip Adder: Carry Skip Multiplexers LUT 01 01....... p [0, b-1] c0c0 cbcb p [b, 2  b-1] p [k-b, k-1] ckck c0c0 cbcb c k-b p [k-b, k-1] p [b, 2  b-1] p [0, b-1] cc b cc 2  b cc k-b

48 LUT 01 Carry-Skip Adder: Computation of the Block Propagate Signals LUT 01....... cbcb p [i  b, i  b+(b-1)] 0 1 p [i  b, i  b+1] p [i  b, i  b+3] p [i  b, i  b+(b-3)] 01 0 0 x i  b+1 y i  b+1 x i  b y i  b x i  b+3 y i  b+3 x i  b+2 y i  b+2 x i  b+(b-1) y i  b+(b-1) x i  b+(b-2) y i  b+(b-2)

49 Complete Carry-Skip Adder x b-1..0 y b-1..0 x 2  b-1..b y 2  b-1..b x 3  b-1..2  b y 3  b-1..2  b x k-1..k-b y k-1..k-b..... cc 2  b 0 1 cbcb 0 1 c2bc2b cc 3  b cc b 0 1 c0c0 cc k c k-b... s s-1..0 s 2  s-1..s s 3  b-1..2  b s k-1..k-b c0c0 0 1 ckck x b-1..0 y b-1..0 x 2  b-1..b y 2  b-1..b x 3  b-1..2  b y 3  b-1..2  b x k-1..k-b y k-1..k-b..... p [0,b-1] p [b,2  b-1] p [2  b,3  b-1] p [k-b,k-1]

50 Carry-Skip Adders in Xilinx Spartan II FPGAs by J-P. Deschamps, G. Bioul, G. Sutter Delay in ns b=k b=8 b=16 b=32 Max. Frequency Increase 64 14 13 12 13% 96 16 14 13 21% 128 23 14 14 63% 256 38 - 16 17 141% 512 77 - 20 20 296% 1024 159 - 28 25 531% k

51 Carry-Skip Adders in Xilinx Spartan II FPGAs by J-P. Deschamps, G. Bioul, G. Sutter Area in CLB slices b=k b=8 b=16 b=32 b=8 b=16 b=32 Area Overhead 64 32 47 41 - 47% 28% - 96 48 73 66 - 52% 38% - 128 64 99 91 - 55% 42% - 256 128 - 191 179 - 49% 40% 512 256- 391 375 - 53% 46% 1024 512 - 791 767 - 54% 50% k

52 Carry-Skip Adders Variable-Block-Size

53 Apr. 2008Computer Arithmetic, Addition/SubtractionSlide 53 Carry-Skip Adder with Variable-Width Blocks Fig. 7.2 Carry-skip adder with variable-size blocks and three sample carry paths. The total number of bits in the t blocks is k: 2[b + (b + 1) +... + (b + t/2 – 1)] = t(b + t/4 – 1/2) = k b = k/t – t/4 + 1/2 T var-skip-add = 2(b – 1) + 0.5 + t – 2 = 2k/t + t/2 – 2.5 dT/db = –2k/t 2 + 1/2 = 0  t opt = 2  k T opt = 2  k – 2.5 (a factor of  2 smaller than for fixed-block)

54 P xjxj yjyj pjpj P x j+1 y j+1 p j+1 P p j+bi-2 P x j+b -2 y j+b -2 x j+b -1 y j+b -1 p j+bi-1 i i i i CP p [i,i|+b i -1] c in =0 xjxj yjyj FA x j+1 y j+1 FA x j+b -2 y j+b -2 FA... b i stages x j+b -1 y j+b -1 i i i i c out s j+b -1 s j+b -2 s j+1 sjsj Most critical path to produce carry i i

55 P xjxj yjyj pjpj P x j+1 y j+1 p j+1 P p j+b -2 P x j+b -2 y j+b -2 x j+b -1 y j+b -1 p j+b -1 i i i i CP p [i,i|+b i -1] c in xjxj yjyj FA x j+1 y j+1 FA x j+b -2 y j+b -2 FA... b i stages x j+b -1 y j+b -1 i i i c out s j+b -1 s j+b -2 s j+1 sjsj Most critical path to assimilate carry i i i i i

56 Variable-Block-Size Carry-Skip Adder (1) Notation & Assumptions Delay of skip logic = Delay of one stage of ripple-carry adder = 1 delay unit Adder size - k-bits Number of stages - t Block size - variable First and last block size - b bits

57 Variable-Block-Size Carry-Skip Adder (2) Optimum block sizes b t-1 b t-2 b t-3... b t/2+1 b t/2-1... b 2 b 1 b 0 b b+1 b+2... b+ t 2 b+ t 2 b+2 b+1 b Total number of bits k = 2 [ b + (b+1) + (b+2) + … + (b + + 1 ) ] = = t ( b + - ) t 2 t 4 1 2

58 Variable-Block-Size Carry-Skip Adder (3) Number of bits in the first and last block b = k t - t 4 + 1 2 Latency of the carry-skip adder with variable block width Latency fixed-carry-skip = ( b - 1 ) + 0.5 + t - 2 + ( b - 1 ) in block 0 OR gate skipsin last block = 2 b + t - 3.5 = 2 k t - t 4 + 1 2 + t - 3.5 = = 2k t 1 2 t- 2.5 +

59 Variable-Block-Size Carry-Skip Adder (4) Optimal number of blocks dLatency variable-carry-skip dt = 2k t2t2 t opt =  4 k -+ 1 2 = 0 =  k 2 b opt = k t opt - 4 + 1 2 = = k - 4 + 1 2  k 2  k 2 = 1 2 b opt =1

60 Variable-Block-Size Carry-Skip Adder (5) Optimal latency = 2k t 1 2 t- 2.5 + Latency variable-carry-skip opt = = 2k + 2  k 2  k 2 = - 2.5 =  k 2 Latency variable-carry-skip opt  Latency fixed-carry-skip opt  2


Download ppt "Fast Adders: Parallel Prefix Network Adders, Conditional-Sum Adders, & Carry-Skip Adders ECE 645: Lecture 5."

Similar presentations


Ads by Google