VLSI Arithmetic: Adders & Multipliers. Prof. Vojin G. Oklobdzija, University of California


2 Introduction: Digital computer arithmetic belongs to computer architecture; however, it is also an aspect of logic design. The objective of computer arithmetic is to develop appropriate algorithms that use the available hardware in the most efficient way. Ultimately, speed, power, and chip area are the most commonly used measures, which creates a strong link between the algorithms and the implementation technology.

3 Basic Operations: Addition; Multiplication; Multiply-Add; Division; Evaluation of Functions

4 Addition of Binary Numbers: Full Adder. The full adder is the fundamental building block of most arithmetic circuits. With inputs $a_i$, $b_i$, $c_i$ (carry-in) and outputs $s_i$, $c_{i+1}$ (carry-out), the sum and carry outputs are described as $s_i = a_i \oplus b_i \oplus c_i$ and $c_{i+1} = a_i b_i + c_i (a_i + b_i)$.

5 Addition of Binary Numbers: full-adder truth table (inputs $c_i$, $a_i$, $b_i$; outputs $s_i$, $c_{i+1}$), with the carry-propagate and carry-generate rows marked.

6 Full-Adder Implementation: The full-adder operation is defined by the equations $s_i = p_i \oplus c_i$ and $c_{i+1} = g_i + p_i c_i$, with carry-propagate $p_i = a_i \oplus b_i$ and carry-generate $g_i = a_i b_i$. A one-bit adder can be implemented directly from these signals.
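The propagate/generate formulation of the full adder can be checked with a small behavioral model. This is a minimal Python sketch, not circuit code from the lecture; the function name `full_adder` is an illustrative choice.

```python
def full_adder(a, b, c_in):
    """One-bit full adder expressed through propagate and generate signals."""
    p = a ^ b               # carry-propagate: an incoming carry passes through
    g = a & b               # carry-generate: a carry is produced regardless of c_in
    s = p ^ c_in            # sum bit
    c_out = g | (p & c_in)  # carry-out
    return s, c_out


# Exhaustive check against ordinary integer addition.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, c_out = full_adder(a, b, c)
            assert 2 * c_out + s == a + b + c
```

Because `p` and `g` depend only on the operands, they can be computed for every bit position at once; only the carry has to travel along the word.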

7 High-Speed Addition: A one-bit adder can be implemented more efficiently with a multiplexer, because a MUX is faster.
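The slide's figure is not reproduced in the transcript, so the following is an assumption about the structure: one common multiplexer formulation lets the propagate signal select between the incoming carry and one of the operands. The names `mux2` and `mux_full_adder` are hypothetical.

```python
def mux2(select, when_0, when_1):
    """Two-input multiplexer: returns when_1 if select is 1, otherwise when_0."""
    return when_1 if select else when_0


def mux_full_adder(a, b, c_in):
    """Full adder whose carry-out is produced by a multiplexer (assumed structure)."""
    p = a ^ b
    # p == 1: the operands differ, so the carry-out equals the carry-in.
    # p == 0: the operands are equal, so the carry-out equals either operand.
    c_out = mux2(p, a, c_in)
    s = p ^ c_in
    return s, c_out
```

In this form the carry-in only passes through the multiplexer's data path, which is the property the slide relies on when it says the MUX is faster.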

8 The Ripple-Carry Adder

9 The Ripple-Carry Adder (from Rabaey)
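A ripple-carry adder simply chains full adders so that each carry-out feeds the next carry-in. The sketch below is a behavioral model under that standard definition; the helper names `to_bits` and `from_bits` are illustrative.

```python
def to_bits(value, width):
    """Unsigned integer to a list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """List of bits (least significant first) back to an unsigned integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists; the carry ripples from bit 0 upward."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry
```

For example, `ripple_carry_add(to_bits(200, 8), to_bits(100, 8))` yields the low eight bits of 300 together with a carry-out of 1. In the worst case the carry crosses every bit position, so the delay grows linearly with the word length.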

10 Inversion Property (from Rabaey)

11 Minimize Critical Path by Reducing Inverting Stages (from Rabaey)

12 Manchester Carry-Chain Realization of the Carry Path: a simple and very popular scheme for implementing the carry signal path

13 Manchester Carry Chain (Kilburn et al., IEE Proc.): Implement P with pass-transistors. Implement G with a pull-up and kill (delete) with a pull-down. Use dynamic logic to reduce complexity and increase speed.
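The three actions on each carry node (pull up on generate, pull down on kill, pass on propagate) can be mimicked at the logic level. This is only a behavioral sketch with a hypothetical function name; it says nothing about the dynamic-logic timing or transistor sizing of a real Manchester chain.

```python
def manchester_carry_chain(a_bits, b_bits, c_in=0):
    """Return the carry at every node of the chain (index i = carry into bit i)."""
    carries = [c_in]
    node = c_in
    for a, b in zip(a_bits, b_bits):
        generate = a & b
        kill = (1 - a) & (1 - b)
        if generate:
            node = 1      # pull-up device drives the node high
        elif kill:
            node = 0      # pull-down device drives the node low
        # otherwise propagate: the pass transistor forwards the previous node
        carries.append(node)
    return carries
```

Exactly one of generate, kill, and propagate is active for any operand pair, which is why a node is never driven in two directions at once.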

14 Ripple Carry Adder: carry chain of an RCA implemented using a multiplexer from the standard cell library. Critical path. (Oklobdzija, ISCAS ’88)

15 Pass-Transistor Realization in DPL

16 Carry-Skip Adder (MacSorley, Proc. IRE, 1/61; Lehman and Burla, IRE Trans. on Comp., 12/61)

17 Carry-Skip Adder: Bypass (from Rabaey)

18 Carry-Skip Adder: N bits, k bits/group, r = N/k groups
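With N bits split into r = N/k groups of k bits, each group ripples internally and offers a bypass when all of its bits propagate. The following is a functional sketch under that description; the function name and the list-of-bits representation are illustrative choices.

```python
def carry_skip_add(a_bits, b_bits, k, c_in=0):
    """Carry-skip addition of two N-bit lists (LSB first) in groups of k bits."""
    n = len(a_bits)
    assert n == len(b_bits) and n % k == 0, "N must be a multiple of k"
    sum_bits = []
    carry = c_in
    for start in range(0, n, k):              # r = N / k groups
        group_in = carry
        group_propagate = 1
        for i in range(start, start + k):     # ripple inside the group
            p = a_bits[i] ^ b_bits[i]
            g = a_bits[i] & b_bits[i]
            sum_bits.append(p ^ carry)
            carry = g | (p & carry)
            group_propagate &= p
        if group_propagate:                   # bypass: carry-in skips the group
            carry = group_in
    return sum_bits, carry
```

The bypass never changes the result, since a fully propagating group would have delivered the same carry anyway. Its effect is on timing: the carry crosses such a group in one skip delay instead of k ripple delays.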

19 Carry-Skip Adder

20 Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

21 Carry-chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

22 Carry-chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985): any-point-to-any-point delay = 9Δ, as compared to 12Δ for CSKA

23 Carry-chain block size determination for a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

24 Delay Calculation for Variable Block Adder (Oklobdzija, Barnes: IBM 1985): delay model

25 Variable Block Adder (Oklobdzija, Barnes: IBM 1985): variable group length (Oklobdzija, Barnes, Arith ’85)

26 Carry-chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985): Variable block lengths. No closed-form solution for delay. It is a dynamic programming problem.
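To see why block sizes matter, a worst-case carry path can be evaluated numerically for any candidate partition. The delay model below is a toy assumption made for this sketch (one unit per bit rippled, one unit per block skipped); it is not the delay model from the slides, and the numbers it produces are not the 9Δ and 12Δ figures quoted there. The block-size list `variable` is likewise just an example.

```python
def worst_case_carry_delay(block_sizes, ripple=1, skip=1):
    """Longest carry path under a toy model (assumption, not the authors' model).

    A carry is generated at the bottom of block i, ripples to the top of
    block i, skips every block in between, and ripples to the top of block j.
    """
    worst = max(block_sizes) * ripple         # path that stays inside one block
    for i, start_size in enumerate(block_sizes):
        for j in range(i + 1, len(block_sizes)):
            skipped_blocks = j - i - 1
            delay = (start_size + block_sizes[j]) * ripple + skipped_blocks * skip
            worst = max(worst, delay)
    return worst


uniform = [4] * 8                             # eight 4-bit blocks, 32 bits
variable = [1, 2, 3, 4, 5, 6, 5, 3, 2, 1]     # small blocks at both ends, 32 bits
assert sum(uniform) == sum(variable) == 32
assert worst_case_carry_delay(variable) < worst_case_carry_delay(uniform)
```

Even in this simplified model, tapering the blocks toward both ends shortens the longest path, and a search over partitions (for instance by dynamic programming) is needed to find the best sizes for a given ripple-to-skip delay ratio.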

27 Delay Comparison: Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

28 Delay Comparison: Variable Block Adder. Curves: VBA, multi-level CLA.

29 Fan-Out Dependency

30 Fan-In Dependency

31 Delay Comparison: Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

33 Carry-Lookahead Adder (Weinberger and Smith): Weinberger and J. L. Smith, “A Logic for High-Speed Addition”, National Bureau of Standards, Circ. 591, pp. 3-12, 1958.

34 Carry-Lookahead Adder (Weinberger and Smith)

35 Carry-Lookahead Adder: One gate delay Δ to calculate p, g. One Δ to calculate P and two for G. Three gate delays to calculate $C_{4(j+1)}$. Compare that to 8Δ in RCA!
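The gate-delay count above follows from writing every carry of a 4-bit group as a flat sum of products of p and g. The sketch below spells out those standard lookahead equations for one group; the function name and return layout are illustrative.

```python
def cla4_group(a_bits, b_bits, c_in):
    """4-bit carry-lookahead group (bit lists are LSB first).

    Returns (sum_bits, carry_out, group_P, group_G).
    """
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # one gate level
    g = [a & b for a, b in zip(a_bits, b_bits)]
    # Each carry is a two-level AND-OR function of p, g and c_in only.
    c1 = g[0] | (p[0] & c_in)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in)
    group_p = p[3] & p[2] & p[1] & p[0]
    group_g = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    c4 = group_g | (group_p & c_in)
    carries = [c_in, c1, c2, c3]
    sum_bits = [p[i] ^ carries[i] for i in range(4)]
    return sum_bits, c4, group_p, group_g
```

The group signals `group_p` and `group_g` play the same role for a 4-bit block that p and g play for a single bit, which is what allows the scheme to be repeated at the next level.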

36 Carry-Lookahead Adder (Weinberger and Smith): Additional two gate delays: $C_{16}$ will take a total of 5Δ vs. 32Δ for RCA!

37 32-bit Carry Lookahead Adder

38-39 Carry-Lookahead Adder (Weinberger and Smith: original derivation)

40-41 Carry-Lookahead Adder (Weinberger and Smith): please notice the similarity with Parallel-Prefix Adders!

42 Delay Optimized CLA: B. Lee, V. G. Oklobdzija, Journal of VLSI Signal Processing, Vol. 3, No. 4, October 1991

43 Delay Optimized CLA (Lee-Oklobdzija ’91): (a) fixed groups and levels; (b) variable-sized groups, fixed levels; (c) variable-sized groups and fixed levels; (d) variable-sized groups and levels

44 Two-Levels of Logic Implementation of the Carry Block

45 Two-Levels of Logic Implementation of the Carry-Lookahead Block

46 Three-Levels of Logic Implementation of the Carry Block (restricted fan-in)

47 Three-Levels of Logic Implementation of the Carry Lookahead (restricted fan-in)

48 Delay Optimized CLA (Lee-Oklobdzija ’91): Delay: two-level BCLA. Delay: three-level BCLA.

49 Delay Optimized CLA (Lee-Oklobdzija ’91): (a) 2-level BCLA, Δ = 8.5 nS; (b) 3-level BCLA, Δ = 8.9 nS

50 Motorola: CLA Implementation Example. A. Naini, D. Bearden and W. Anderson, “A 4.5nS 96b CMOS Adder Design”, Proceedings of the IEEE Custom Integrated Circuits Conference, May 3-6, 1992.

51 Critical path in Motorola's 64-bit CLA

52 Motorola's 64-bit CLA: conventional PG block

53 Motorola's 64-bit CLA: modified PG block. Intermediate propagate signals $P_{i:0}$ are generated to speed up $C_3$.

54 Ling’s Adder: Huey Ling, “High-Speed Binary Adder”, IBM Journal of Research and Development, Vol. 25, No. 3, 1981.

55 Ling Adder: variation of CLA (Ling, IBM J. Res. Dev., 5/81). Ling’s equations.

56 Ling Adder: Ling’s equation (Doran, Trans. on Comp., 9/88). Propagates information on two bits.

57 Ling Adder: conventional vs. Ling
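The equations on these slides did not survive in the transcript, so the sketch below uses the textbook form of Ling's recurrence as an assumption: with $g_i = a_i b_i$ and $t_i = a_i + b_i$, the pseudo-carry is $H_i = g_i + t_{i-1} H_{i-1}$ and the true carry out of bit $i$ is $t_i H_i$. The indexing convention and all function names are choices made here and may differ from the slides.

```python
def ling_add(a_bits, b_bits):
    """Add two equal-length bit lists (LSB first, carry-in 0) via Ling's pseudo-carry."""
    sum_bits = []
    h_prev = 0  # H_{i-1}; zero below bit 0 because there is no carry-in
    t_prev = 0  # t_{i-1}
    for a, b in zip(a_bits, b_bits):
        g, t, p = a & b, a | b, a ^ b
        carry_in = t_prev & h_prev     # carry into bit i = t_{i-1} H_{i-1}
        sum_bits.append(p ^ carry_in)
        h_prev = g | carry_in          # H_i = g_i + t_{i-1} H_{i-1}
        t_prev = t
    return sum_bits, t_prev & h_prev   # carry-out = t_{n-1} H_{n-1}


def conventional_group_carry(g, t):
    """Carry out of bits 0..3 in conventional lookahead form."""
    return g[3] | (t[3] & g[2]) | (t[3] & t[2] & g[1]) | (t[3] & t[2] & t[1] & g[0])


def ling_group_pseudo_carry(g, t):
    """Ling pseudo-carry H_3 of the same bits: t[3] is factored out of every term."""
    return g[3] | g[2] | (t[2] & g[1]) | (t[2] & t[1] & g[0])
```

Comparing the two 4-bit expressions shows the trade: the Ling form has fewer literals in its widest product term, and the factor `t[3]` that was removed is applied later, off the critical lookahead path.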

58-68 S. Naffziger, ISSCC ’96

69 Results: S. Naffziger, “A Subnanosecond 64-b Adder”, ISSCC ’96. Technology: … u. Speed: … nS. Nominal process, 80 °C, V = 3.3 V.

70 Conditional-Sum Adder: J. Sklansky, “Conditional-Sum Addition Logic”, IRE Transactions on Electronic Computers, EC-9, 1960.

71-72 Conditional-Sum Adder
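Conditional-sum addition computes, for every block, the result for both possible carry-ins and then doubles the block width by selection. The sketch below is a behavioral illustration of that idea, assuming a word length that is a power of two; the data layout (a dict keyed by the assumed carry-in) is a choice made here.

```python
def conditional_sum_add(a_bits, b_bits, c_in=0):
    """Conditional-sum addition of two bit lists (LSB first, power-of-two length)."""
    n = len(a_bits)
    assert n == len(b_bits) and n > 0 and n & (n - 1) == 0
    # Level 0: each 1-bit block holds (sum_bits, carry_out) for carry-in 0 and 1.
    blocks = [
        {c: ([a ^ b ^ c], (a & b) | (c & (a ^ b))) for c in (0, 1)}
        for a, b in zip(a_bits, b_bits)
    ]
    # Each level merges neighbouring blocks: the low block's carry-out
    # selects which precomputed result of the high block is kept.
    while len(blocks) > 1:
        merged = []
        for low, high in zip(blocks[0::2], blocks[1::2]):
            block = {}
            for c in (0, 1):
                low_sum, low_carry = low[c]
                high_sum, high_carry = high[low_carry]
                block[c] = (low_sum + high_sum, high_carry)
            merged.append(block)
        blocks = merged
    return blocks[0][c_in]
```

The loop runs log2(N) times, and every merge is a pure selection, which in hardware corresponds to one multiplexer level per doubling.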

73 Carry-Select Adder: O. J. Bedrij, “Carry-Select Adder”, IRE Transactions on Electronic Computers, June 1962.

74 Carry-Select Adder: addition under the assumptions $C_{in} = 0$ and $C_{in} = 1$.
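The idea of adding under both carry-in assumptions and selecting afterwards can be sketched as follows. The block adder used inside each block is a plain ripple adder here, which is an assumption for illustration; any adder could take its place.

```python
def ripple_block(a_bits, b_bits, c_in):
    """Ripple-carry addition of one block (bit lists are LSB first)."""
    sum_bits, carry = [], c_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_select_add(a_bits, b_bits, block, c_in=0):
    """Carry-select addition: each block is added for C_in = 0 and C_in = 1."""
    sum_bits, carry = [], c_in
    for start in range(0, len(a_bits), block):
        a_blk = a_bits[start:start + block]
        b_blk = b_bits[start:start + block]
        sum0, carry0 = ripple_block(a_blk, b_blk, 0)   # assume C_in = 0
        sum1, carry1 = ripple_block(a_blk, b_blk, 1)   # assume C_in = 1
        # The real carry acts only as a multiplexer select.
        if carry:
            sum_bits += sum1
            carry = carry1
        else:
            sum_bits += sum0
            carry = carry0
    return sum_bits, carry
```

In hardware both block additions run in parallel, so the incoming carry only has to pass through one multiplexer per block.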

75 Carry Select Adder: combining two 32-b VBAs in select mode. Delay = $\Delta_{VBA32} + \Delta_{MUX}$

76 Addition Under Non-equal Signal Arrival Profile Assumption: P. Stelling, V. G. Oklobdzija, "Design Strategies for Optimal Hybrid Final Adders in a Parallel Multiplier", special issue on VLSI Arithmetic, Journal of VLSI Signal Processing, Kluwer Academic Publishers, Vol. 14, No. 3, December 1996

77 Signal Arrival Profile from the Parallel Multiplier Partial-Product Reduction Tree

78-79 Oklobdzija and Villeger, IEEE Transactions on VLSI Systems, June 1995

88 Performing Multiply-Add Operation in the Multiply Time: P. Stelling, V. G. Oklobdzija, "Achieving Multiply-Accumulate Operation in the Multiply Time", Thirteenth International Symposium on Computer Arithmetic, Pacific Grove, California, July 5-9, 1997.

90-93 Final Adder: Implementation

94 Recurrence Solver Based Adders: Kogge and Stone, IEEE Trans. on Computers, August 1973; Bilgory and Gajski, 18th DAC, 1981; Brent and Kung, IEEE Trans. on Computers, March 1982

95 Recurrence Solver Based Adders: 1973, Kogge and Stone published a general recurrence scheme for parallel computation; 1979, Brent and Kung published a Tech. Report on regular layout for parallel adders; 1980, Guibas and Vuillemin developed a layout scheme based on the recurrence equation for addition; 1980, Ladner and Fischer published “parallel prefix computation”, Journal of the ACM; 1981, Bilgory and Gajski published a paper on recurrence structures for automatic cell generation

96 Recurrence Solver Based Adders: They are based on the recurrence equations for P and G (what is new there since Weinberger?!)

97 Recurrence Solver Based Adders
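The slide's equations are missing from the transcript; the sketch below assumes the usual associative operator on (generate, propagate) pairs, $(g, p) \circ (g', p') = (g + p\,g',\; p\,p')$, and applies it in the Kogge-Stone pattern. Function names are illustrative.

```python
def prefix_op(high, low):
    """Combine the (g, p) pair of a more significant span with a less significant one."""
    g_high, p_high = high
    g_low, p_low = low
    return g_high | (p_high & g_low), p_high & p_low


def kogge_stone_add(a_bits, b_bits):
    """Parallel-prefix addition of two bit lists (LSB first, carry-in 0)."""
    n = len(a_bits)
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    gp = [(a & b, a ^ b) for a, b in zip(a_bits, b_bits)]
    distance = 1
    while distance < n:
        # Every position combines with the one `distance` below it; all
        # combinations within one level are independent of each other.
        gp = [
            prefix_op(gp[i], gp[i - distance]) if i >= distance else gp[i]
            for i in range(n)
        ]
        distance *= 2
    carries = [0] + [g for g, _ in gp]        # carries[i] = carry into bit i
    sum_bits = [p[i] ^ carries[i] for i in range(n)]
    return sum_bits, carries[n]
```

After about log2(N) levels each position holds the group generate over all bits below it, which is the same quantity the carry-lookahead equations produce; the difference lies in how the terms are shared and wired.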

98 Carry-Lookahead Adder (Weinberger and Smith): Just to remind you! Please notice the similarity with Parallel-Prefix Adders!

99 Multiplexer Based Adder: Farooqui and Oklobdzija, 1999 Int’l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999

100 Multiplexer Based Adder: Based on the realization that a MUX circuit is faster than a logic gate due to its transmission-gate implementation. Based on the carry-lookahead method (W-S), or recurrence solver.

101-103 Multiplexer Based Adder: A. A. Farooqui, V. G. Oklobdzija, F. Chechrazi, 1999 Int’l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999.

104 Multiplexer Based Adder: A. A. Farooqui, V. G. Oklobdzija, F. Chechrazi, 1999 Int’l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999. Results in a very fast structure: 7 MUX delays for a 64-b adder. Delay using standard cell, 0.25u, 2.5 V, 25 °C (table: adder size in bits vs. delay in pS).

105 DEC "Alpha" Adder. Combination: 8-bit tapered pre-discharged Manchester carry chains, with $C_{in} = 0$ and $C_{in} = 1$; 32-bit LSB carry-lookahead adder; 32-bit MSB conditional-sum adder; carry-select on the most significant 32 bits; latches in the middle: pipelined addition

106 DEC "Alpha" Adder

107 DEC "Alpha" Adder: Results. The first 200 MHz processor. Built using 0.75u technology. V = 3.3 V, 30 W. Pipelined (two latches), allowing 5 nS throughput and 10 nS latency.

108 Conclusion: VLSI Implementation of Addition

109 Conclusion: VLSI Implementation of Addition. Currently, implementation parameters are not reflected in the algorithms used for development. Layout and wire-delay effects are largely neglected, and this is becoming intolerable in the next generation of technology. Transistor sizing has a large effect, which can outweigh the algorithm. There is a great disconnect between algorithm and implementation. New rules and measures of goodness are needed.

110 Multiplication: Parallel Multiplier Implementation

111 Multiplication Algorithm: a recurrence evaluated for j = 0, ..., n-1, starting from an initial partial product; after n steps, p(n) = XY
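The recurrence itself is missing from the transcript. A standard right-shift form that ends with p(n) = XY, assumed here for illustration, is $p^{(0)} = 0$ and $p^{(j+1)} = (p^{(j)} + x_j\,Y\,2^n)/2$, where $x_j$ is bit j of the n-bit unsigned multiplier X.

```python
def shift_add_multiply(x, y, n):
    """Unsigned shift-and-add multiplication of an n-bit multiplier x by y.

    Assumed recurrence: p(0) = 0, p(j+1) = (p(j) + x_j * y * 2**n) / 2.
    """
    assert 0 <= x < (1 << n) and y >= 0
    p = 0
    for j in range(n):
        x_j = (x >> j) & 1
        p = (p + x_j * (y << n)) >> 1   # add the aligned multiplicand, then shift right
    return p                            # p(n) = x * y
```

The right shift is always exact: after step j the partial product is a multiple of 2^(n-j), so no bits are lost before the final step.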

112 Parallel Multipliers

113 4:2 Compressor

114 Re-designed 4:2 Compressor with 3 XOR Delay (figure labels: inputs $C_{in}$, I1, I2, I3, I4; multiplexer inputs 0, 1; outputs S, C, $C_{out}$)
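The circuit figure is not available, so the following is an assumption: a commonly described 4:2 compressor that reaches the sum in three XOR levels and uses multiplexers for both carries. The names are hypothetical and the structure may differ in detail from the slide.

```python
def compressor_4_2(i1, i2, i3, i4, c_in):
    """4:2 compressor with XOR and multiplexer stages (assumed structure).

    Satisfies i1 + i2 + i3 + i4 + c_in == s + 2 * (c + c_out).
    """
    x1 = i1 ^ i2
    x2 = i3 ^ i4
    x3 = x1 ^ x2
    s = x3 ^ c_in                 # third XOR level on the longest path
    c_out = i3 if x1 else i1      # multiplexer; independent of c_in
    c = c_in if x3 else i4        # multiplexer
    return s, c, c_out
```

Since `c_out` does not depend on `c_in`, compressors placed side by side in a column-compression tree do not form a ripple chain between columns.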

115 Three-Dimensional optimization Method: TDM (Oklobdzija, Villeger, Liu, 1996)

116 Generation of the Partial Product Reduction Tree in TDM multiplier

117 Speed of Partial Product Reduction for Various Schemes

118 Booth Recoding Algorithm:
$x_{i+2}$  $x_{i+1}$  $x_i$   Add to partial product
0  0  0   +0Y
0  0  1   +1Y
0  1  0   +1Y
0  1  1   +2Y
1  0  0   -2Y
1  0  1   -1Y
1  1  0   -1Y
1  1  1   -0Y
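The recoding table can be expressed compactly: each overlapping bit triple maps to the digit $x_i + x_{i+1} - 2x_{i+2}$. The sketch below applies that to an n-bit two's-complement multiplier, with n assumed even and an implicit 0 appended below the least significant bit; the function names are illustrative.

```python
def booth_radix4_digits(x, n):
    """Recode an n-bit two's-complement x (n even) into digits in {-2, ..., +2}.

    Digits are returned least significant first; digit k has weight 4**k.
    """
    assert n % 2 == 0
    bits = [0] + [(x >> i) & 1 for i in range(n)]   # bits[0] is the appended zero
    digits = []
    for i in range(0, n, 2):
        low, mid, high = bits[i], bits[i + 1], bits[i + 2]
        digits.append(low + mid - 2 * high)         # matches the table row by row
    return digits


def booth_multiply(x, y, n):
    """Multiply by summing the recoded multiples of y (one per digit)."""
    return sum(d * y * 4 ** k for k, d in enumerate(booth_radix4_digits(x, n)))
```

Recoding halves the number of partial products from n to n/2, and every multiple needed (0, ±Y, ±2Y) is obtained by shifting and negating Y.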

119 Organization of Hitachi's DPL multiplier

120 Hitachi's 4:2 compressor structure

121 DPL multiplexer circuit

122 Conclusion. References:
1. E. Swartzlander, "Computer Arithmetic", Vol. 1 & 2, IEEE Computer Society Press.
2. K. Hwang, "Computer Arithmetic: Principles, Architecture and Design", John Wiley and Sons.
3. M. Ercegovac, “Digital Systems and Hardware/Firmware Algorithms”, Chapter 12: Arithmetic Algorithms and Processors, John Wiley & Sons.
4. A. Chandrakasan, W. Bowhill, F. Fox, Editors, "Design of High Performance Microprocessors Circuits", IEEE Press, July.
5. V. G. Oklobdzija, “High-Performance System Design: Circuits and Logic”, IEEE Press, July.

