Published by John Nickols. Modified about 1 year ago.

1
VLSI Arithmetic: Adders & Multipliers
Prof. Vojin G. Oklobdzija
University of California

2
Introduction
Digital computer arithmetic belongs to computer architecture; however, it is also an aspect of logic design. The objective of computer arithmetic is to develop appropriate algorithms that use the available hardware in the most efficient way. Ultimately, speed, power, and chip area are the most often used measures, which makes a strong link between the algorithms and the technology of implementation.

3
Basic Operations
- Addition
- Multiplication
- Multiply-Add
- Division
- Evaluation of Functions

4
Addition of Binary Numbers
Full Adder. The full adder is the fundamental building block of most arithmetic circuits. The sum and carry outputs are described as:
[Figure: full adder block with inputs a_i, b_i, C_in and outputs s_i, C_out]

5
Addition of Binary Numbers
[Table: full-adder truth table with inputs c_i, a_i, b_i and outputs s_i and carry; rows annotated as Propagate and Generate]

6
Full-Adder Implementation
The full-adder operation is defined by equations in terms of the carry-propagate signal p_i and the carry-generate signal g_i. A one-bit adder could be implemented as shown.
[Figure: one-bit adder implementation]
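The slide's equations did not survive extraction, so the sketch below uses the standard textbook definitions as an assumption: p_i = a_i XOR b_i, g_i = a_i AND b_i, s_i = p_i XOR c_i, and c_{i+1} = g_i OR (p_i AND c_i). The function name `full_adder` is illustrative only.

```python
from itertools import product


def full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """One-bit full adder expressed through propagate/generate signals."""
    p = a ^ b               # carry-propagate: an incoming carry is passed on
    g = a & b               # carry-generate: a carry is produced regardless of c_in
    s = p ^ c_in            # sum bit
    c_out = g | (p & c_in)  # carry out
    return s, c_out


# Exhaustive check of all eight input combinations against integer addition.
for a, b, c in product((0, 1), repeat=3):
    s, c_out = full_adder(a, b, c)
    assert a + b + c == s + 2 * c_out
```

The exhaustive loop doubles as the truth table: every row satisfies a + b + c = s + 2·c_out.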

7
High-Speed Addition
A one-bit adder can be implemented more efficiently with a multiplexer, because the MUX is faster.
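One common way to build the one-bit adder around a multiplexer is sketched below. It is an assumption about the structure, not a transcription of the slide's circuit: the propagate signal selects between the incoming carry and one of the operands. The helper names `mux` and `mux_full_adder` are hypothetical.

```python
from itertools import product


def mux(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def mux_full_adder(a: int, b: int, c_in: int) -> tuple[int, int]:
    """Full adder whose carry and sum are both produced by multiplexers."""
    p = a ^ b
    # If p = 1 the carry-in propagates; if p = 0 then a == b, and the
    # carry-out equals a (1 for generate, 0 for kill).
    c_out = mux(p, a, c_in)
    # s = p XOR c_in, written as a selection between p and NOT p.
    s = mux(c_in, p, p ^ 1)
    return s, c_out


for a, b, c in product((0, 1), repeat=3):
    s, c_out = mux_full_adder(a, b, c)
    assert a + b + c == s + 2 * c_out
```

The carry path then contains only the multiplexer, which is the part of the cell that sits on the critical path of a ripple chain.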

8
The Ripple-Carry Adder

9
The Ripple-Carry Adder (from Rabaey)
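A bit-level model of the ripple-carry adder can make the serial carry dependence visible. This is a minimal sketch that assumes the standard propagate/generate full-adder equations; bit lists are LSB first, and the helper names are illustrative.

```python
def to_bits(x: int, n: int) -> list[int]:
    """Return the n low-order bits of x, least significant bit first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits: list[int]) -> int:
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists; each stage waits for the previous carry."""
    sum_bits = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        p, g = a ^ b, a & b
        sum_bits.append(p ^ carry)
        carry = g | (p & carry)   # this dependency is the critical path
    return sum_bits, carry
```

For example, `ripple_carry_add(to_bits(11, 4), to_bits(6, 4))` yields the bits of 1 with a carry-out of 1, since 11 + 6 = 17 = 16 + 1.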

10
Inversion Property (from Rabaey)

11
Minimize Critical Path by Reducing Inverting Stages (from Rabaey)

12
Manchester Carry-Chain Realization of the Carry Path
A simple and very popular scheme for implementing the carry signal path.

13
Manchester Carry Chain
Kilburn et al., IEE Proc.
- Implement P with pass transistors
- Implement G with a pull-up, and kill (delete) with a pull-down
- Use dynamic logic to reduce the complexity and speed up the chain
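At the logic level, each node of the chain is driven by exactly one of three mutually exclusive signals. The sketch below models only that behavior and assumes active-high signals; it says nothing about transistor-level polarity, precharge, or timing, and the function name is hypothetical.

```python
def manchester_carries(a_bits, b_bits, c_in=0):
    """Carry out of each bit position (LSB first) using generate/propagate/kill."""
    carries = []
    node = c_in
    for a, b in zip(a_bits, b_bits):
        g = a & b          # generate: node pulled to 1
        p = a ^ b          # propagate: pass transistor connects to previous node
        k = 1 - (a | b)    # kill: node pulled to 0
        assert g + p + k == 1   # exactly one of the three is active
        if g:
            node = 1
        elif k:
            node = 0
        # when p is active the node simply keeps the previous node's value
        carries.append(node)
    return carries
```

For inputs a = [1, 1, 0, 0] and b = [1, 0, 1, 0] (LSB first), bit 0 generates, bits 1 and 2 propagate that carry, and bit 3 kills it.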

14
Ripple Carry Adder
Carry chain of an RCA implemented using a multiplexer from the standard cell library, with the critical path marked (Oklobdzija, ISCAS'88).

15
Pass-Transistor Realization in DPL

16
Carry-Skip Adder
MacSorley, Proc. IRE, 1/61
Lehman, Burla, IRE Trans. on Comp., 12/61

17
Carry-Skip Adder: Bypass (from Rabaey)

18
Carry-Skip Adder: N bits, k bits per group, r = N/k groups
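A functional sketch of the fixed-group carry-skip organization follows, assuming ripple addition inside each k-bit group and a bypass multiplexer controlled by the group propagate signal. The names are illustrative, and the model captures logic only, not delay.

```python
def carry_skip_add(a_bits, b_bits, k, c_in=0):
    """Carry-skip addition over LSB-first bit lists with k bits per group."""
    sum_bits = []
    carry = c_in
    for start in range(0, len(a_bits), k):
        group_in = carry
        group_p = 1
        for a, b in zip(a_bits[start:start + k], b_bits[start:start + k]):
            p, g = a ^ b, a & b
            sum_bits.append(p ^ carry)
            carry = g | (p & carry)
            group_p &= p              # 1 only if every bit in the group propagates
        # Bypass multiplexer: when the whole group propagates, the group's
        # carry-out is its carry-in, available without waiting for the ripple.
        carry = group_in if group_p else carry
    return sum_bits, carry
```

The bypass never changes the logical result (when every bit propagates, the rippled carry already equals the group carry-in); its value is purely in shortening the worst-case path.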

19
Carry-Skip Adder (group size k)

20
Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

21
Carry chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

22
Carry chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
Any-point-to-any-point delay = 9, as compared to 12 for the carry-skip adder (CSKA).

23
Carry-chain block size determination for a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

24
Delay Calculation for Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
Delay model:

25
Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
Variable group length (Oklobdzija, Barnes, Arith'85)

26
Carry chain of a 32-bit Variable Block Adder (Oklobdzija, Barnes: IBM 1985)
Variable block lengths:
- No closed-form solution for the delay
- Choosing the block sizes is a dynamic programming problem
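The effect of block sizing can be explored with a small delay evaluator. The model here is an assumption, not the one from the slides: one unit of delay per ripple stage inside a block and one unit per skipped block, with the worst path starting at the first bit of one block and ending at the last bit of a later block. The variable block sizes are likewise illustrative and were not read from the figure.

```python
def worst_case_carry_delay(block_sizes):
    """Longest carry path (in unit delays) through a carry-skip chain.

    Assumed model: a carry generated at the first bit of block i ripples
    through the rest of that block, skips every block strictly between
    i and j in one unit each, then ripples through block j.
    """
    worst = max(size - 1 for size in block_sizes)   # path inside one block
    n = len(block_sizes)
    for i in range(n):
        for j in range(i + 1, n):
            ripple_out = block_sizes[i] - 1
            skips = j - i - 1
            ripple_in = block_sizes[j] - 1
            worst = max(worst, ripple_out + skips + ripple_in)
    return worst


fixed_blocks = [4] * 8                             # 32 bits, equal groups
variable_blocks = [1, 2, 3, 4, 5, 6, 5, 3, 2, 1]   # 32 bits, illustrative sizes
assert sum(fixed_blocks) == sum(variable_blocks) == 32
```

Under this assumed model the equal-group chain evaluates to 12 units and the tapered chain to 9, which is consistent with the 12-versus-9 comparison quoted earlier. An optimizer would search over block-size lists using an evaluator of this kind.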

27
Delay Comparison: Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

28
Delay Comparison: Variable Block Adder
[Figure labels: VBA, Multi-Level CLA, VBA]

29
Fan-Out Dependency

30
Fan-In Dependency

31
Delay Comparison: Variable Block Adder (Oklobdzija, Barnes: IBM 1985)

33
Carry-Lookahead Adder (Weinberger and Smith)
Weinberger and J. L. Smith, “A Logic for High-Speed Addition”, National Bureau of Standards, Circ. 591, pp. 3-12, 1958.

34
Carry-Lookahead Adder (Weinberger and Smith)

35
Carry-Lookahead Adder
- One gate delay to calculate p, g
- One to calculate P and two for G
- Three gate delays to calculate C_{4(j+1)}
- Compare that to 8 in RCA!

36
Carry-Lookahead Adder (Weinberger and Smith)
With two additional gate delays, C_16 takes a total of 5 gate delays, vs. 32 for RCA!
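The two-level structure behind these gate counts can be sketched as follows, assuming 4-bit groups, the standard definitions p_i = a_i XOR b_i and g_i = a_i AND b_i, and the group recurrence C_{4(j+1)} = G_j + P_j·C_{4j}. Function names are illustrative. The loop over groups evaluates the group carries one after another for readability; in hardware a second lookahead level would expand them in parallel.

```python
def cla_group(a, b, c_in):
    """4-bit lookahead group: every carry is a flat two-level expression."""
    p = [x ^ y for x, y in zip(a, b)]
    g = [x & y for x, y in zip(a, b)]
    c = [
        c_in,
        g[0] | (p[0] & c_in),
        g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in),
        g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in),
    ]
    group_p = p[0] & p[1] & p[2] & p[3]
    group_g = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]))
    sums = [p[i] ^ c[i] for i in range(4)]
    return sums, group_p, group_g


def cla16(x, y, c_in=0):
    """16-bit adder built from four lookahead groups."""
    result, carry = 0, c_in
    for j in range(4):
        a = [(x >> (4 * j + i)) & 1 for i in range(4)]
        b = [(y >> (4 * j + i)) & 1 for i in range(4)]
        sums, group_p, group_g = cla_group(a, b, carry)
        result |= sum(s << i for i, s in enumerate(sums)) << (4 * j)
        carry = group_g | (group_p & carry)   # C_{4(j+1)} = G_j + P_j C_{4j}
    return result, carry
```

For instance, `cla16(0xFFFF, 1)` returns `(0, 1)`: the lowest group generates a carry and the three upper groups propagate it.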

37
32-bit Carry Lookahead Adder

38
Carry-Lookahead Adder (Weinberger and Smith: original derivation)

39
Carry-Lookahead Adder (Weinberger and Smith: original derivation)

40
Carry-Lookahead Adder (Weinberger and Smith)
Please notice the similarity with parallel-prefix adders!

41
Carry-Lookahead Adder (Weinberger and Smith)
Please notice the similarity with parallel-prefix adders!

42
Delay Optimized CLA
B. Lee, V. G. Oklobdzija, Journal of VLSI Signal Processing, Vol. 3, No. 4, October 1991

43
Delay Optimized CLA: Lee-Oklobdzija '91
(a.) Fixed groups and levels
(b.) Variable-sized groups, fixed levels
(c.) Variable-sized groups and fixed levels
(d.) Variable-sized groups and levels

44
Two-Levels of Logic Implementation of the Carry Block

45
Two-Levels of Logic Implementation of the Carry-Lookahead Block

46
Three-Levels of Logic Implementation of the Carry Block (restricted fan-in)

47
Three-Levels of Logic Implementation of the Carry Lookahead (restricted fan-in)

48
Delay Optimized CLA: Lee-Oklobdzija '91
Delay: Two-level BCLA
Delay: Three-level BCLA

49
Delay Optimized CLA: Lee-Oklobdzija '91
(a.) 2-level BCLA = 8.5 ns
(b.) 3-level BCLA = 8.9 ns

50
Motorola: CLA Implementation Example
A. Naini, D. Bearden and W. Anderson, “A 4.5nS 96b CMOS Adder Design”, Proceedings of the IEEE Custom Integrated Circuits Conference, May 3-6, 1992.

51
Critical path in Motorola's 64-bit CLA

52
Motorola's 64-bit CLA: conventional PG block

53
Motorola's 64-bit CLA: modified PG block
Intermediate propagate signals P_{i:0} are generated to speed up C_3.

54
Ling's Adder
Huey Ling, “High-Speed Binary Adder”, IBM Journal of Research and Development, Vol. 25, No. 3, 1981.

55
Ling Adder
Variation of CLA: Ling, IBM J. Res. Dev., 5/81
Ling's equations:

56
Ling Adder
Ling's equation (Doran, Trans. on Comp., 9/88) propagates information on two bits.

57
Ling Adder
Conventional:
Ling:
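The equations on these slides were lost, so the sketch below relies on the commonly cited form of Ling's recurrence as an assumption: with t_i = a_i OR b_i and g_i = a_i AND b_i, the pseudo-carry is H_i = g_i + t_{i-1}·H_{i-1}, and the true carry is recovered as c_{i+1} = t_i·H_i. The symbols t, H and the function name are not taken from the slides.

```python
def ling_carries(a_bits, b_bits, c_in=0):
    """Return (pseudo-carries H_i, true carries c_{i+1}) for LSB-first bit lists."""
    pseudo, carries = [], []
    carry_into_bit = c_in          # equals t_{i-1} * H_{i-1} for i > 0
    for a, b in zip(a_bits, b_bits):
        g, t = a & b, a | b
        h = g | carry_into_bit     # H_i = g_i + t_{i-1} H_{i-1}
        carry = t & h              # c_{i+1} = t_i H_i
        pseudo.append(h)
        carries.append(carry)
        carry_into_bit = carry
    return pseudo, carries
```

Because H_i equals c_{i+1} OR c_i, each pseudo-carry carries information about two adjacent bit positions, and the term that must be computed on the critical path is simpler than the conventional carry g_i + t_i·c_i. The loop is written serially only to keep the model short.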

58
S. Naffziger, ISSCC'96 (slides 58-68)

69
Results: S. Naffziger, “A Subnanosecond 64-b Adder”, ISSCC '96
- Technology: [value missing] u
- Speed: [value missing] nS
- Nominal process, 80 °C, V = 3.3 V

70
Conditional-Sum Adder
J. Sklansky, “Conditional-Sum Addition Logic”, IRE Transactions on Electronic Computers, EC-9, 1960.

71
Conditional-Sum Adder

72
Conditional-Sum Adder
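The conditional-sum idea can be sketched recursively: every block computes its sum and carry-out for both possible carry-in values, and the carry of the lower half selects which upper-half result to keep. This is an illustrative model with hypothetical function names, not a reproduction of the diagrams on the slides.

```python
def conditional_sum(a_bits, b_bits):
    """Return ((sum_bits, c_out) for c_in=0, (sum_bits, c_out) for c_in=1)."""
    if len(a_bits) == 1:
        a, b = a_bits[0], b_bits[0]
        return ([a ^ b], a & b), ([a ^ b ^ 1], a | b)
    mid = len(a_bits) // 2
    low = conditional_sum(a_bits[:mid], b_bits[:mid])
    high = conditional_sum(a_bits[mid:], b_bits[mid:])
    results = []
    for c_in in (0, 1):
        low_sum, low_carry = low[c_in]
        # Multiplexer: the low half's carry picks the precomputed high half.
        high_sum, high_carry = high[low_carry]
        results.append((low_sum + high_sum, high_carry))
    return results[0], results[1]


def conditional_sum_add(x, y, n, c_in=0):
    """Add two n-bit integers; returns (n-bit sum, carry-out)."""
    a = [(x >> i) & 1 for i in range(n)]
    b = [(y >> i) & 1 for i in range(n)]
    sum_bits, carry = conditional_sum(a, b)[c_in]
    return sum(bit << i for i, bit in enumerate(sum_bits)), carry
```

With 8-bit operands, `conditional_sum_add(200, 100, 8)` returns `(44, 1)`, since 300 = 256 + 44. In hardware the two halves are evaluated in parallel, so the selection depth grows with the logarithm of the word length.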

73
Carry-Select Adder
O. J. Bedrij, “Carry-Select Adder”, IRE Transactions on Electronic Computers, June 1962.

74
Carry-Select Adder
Addition is performed under both assumptions, C_in = 0 and C_in = 1.
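A minimal functional sketch of carry-select addition follows. Each block is added twice, once per assumed carry-in, and the real carry picks the result; block width and function name are assumptions chosen for illustration.

```python
def carry_select_add(x, y, n=16, block=4, c_in=0):
    """Add two n-bit integers block by block; returns (n-bit sum, carry-out)."""
    mask = (1 << block) - 1
    result, carry = 0, c_in
    for shift in range(0, n, block):
        a = (x >> shift) & mask
        b = (y >> shift) & mask
        total0 = a + b          # block result assuming carry-in = 0
        total1 = a + b + 1      # block result assuming carry-in = 1
        chosen = total1 if carry else total0   # multiplexer driven by the real carry
        result |= (chosen & mask) << shift
        carry = chosen >> block
    return result, carry
```

In hardware both block sums are ready before the carry arrives, so each block adds only one multiplexer delay to the carry path, which matches the "adder delay + MUX" form of the delay expression.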

75
Carry-Select Adder: combining two 32-b VBAs in select mode
Delay = VBA32 + MUX

76
Addition Under Non-equal Signal Arrival Profile Assumption
P. Stelling, V. G. Oklobdzija, "Design Strategies for Optimal Hybrid Final Adders in a Parallel Multiplier", special issue on VLSI Arithmetic, Journal of VLSI Signal Processing, Kluwer Academic Publishers, Vol. 14, No. 3, December 1996

77
Signal Arrival Profile from the Parallel Multiplier Partial-Product Reduction Tree

78
Oklobdzija, Villeger, IEEE Transactions on VLSI Systems, June 1995

79
Oklobdzija and Villeger, IEEE Transactions on VLSI Systems, June 1995

88
Performing Multiply-Add Operation in the Multiply Time
P. Stelling, V. G. Oklobdzija, "Achieving Multiply-Accumulate Operation in the Multiply Time", Thirteenth International Symposium on Computer Arithmetic, Pacific Grove, California, July 5-9, 1997.

90
Final Adder: Implementation

91
Final Adder: Implementation

92
Final Adder: Implementation

93
Final Adder: Implementation

94
Recurrence Solver Based Adders
- Kogge and Stone, IEEE Trans. on Computers, August 1973
- Bilgory and Gajski, 18th DAC, 1981
- Brent and Kung, IEEE Trans. on Computers, March 1982

95
Recurrence Solver Based Adders
- 1973: Kogge and Stone published a general recurrence scheme for parallel computation
- 1979: Brent and Kung published a technical report on regular layout for parallel adders
- 1980: Guibas and Vuillemin developed a layout scheme based on the recurrence equation for addition
- 1980: Ladner and Fischer published “Parallel Prefix Computation”, Journal of the ACM
- 1981: Bilgory and Gajski published a paper on recurrence structures for automatic cell generation

96
Recurrence Solver Based Adders
They are based on the recurrence equation for P and G (what is new there since Weinberger?!).
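The recurrence itself was not recovered from the slide. The sketch below assumes the usual associative operator on (G, P) pairs, (G, P) ∘ (G', P') = (G + P·G', P·P'), and arranges it in a Kogge-Stone style with doubling distances. The function names and the choice of arrangement are illustrative.

```python
def combine(high, low):
    """Prefix operator: merge a more significant (G, P) pair with a less significant one."""
    g_high, p_high = high
    g_low, p_low = low
    return (g_high | (p_high & g_low), p_high & p_low)


def prefix_add(x, y, n=16):
    """Add two n-bit integers with a parallel-prefix carry computation (c_in = 0)."""
    a = [(x >> i) & 1 for i in range(n)]
    b = [(y >> i) & 1 for i in range(n)]
    gp = [(ai & bi, ai ^ bi) for ai, bi in zip(a, b)]
    p_bits = [p for _, p in gp]
    dist = 1
    while dist < n:
        # Every position combines with the one `dist` places below it; all
        # positions in one pass are independent, i.e. one level of logic.
        gp = [combine(gp[i], gp[i - dist]) if i >= dist else gp[i]
              for i in range(n)]
        dist *= 2
    carries = [0] + [g for g, _ in gp]   # carry into bit i = G over bits i-1..0
    total = sum((p_bits[i] ^ carries[i]) << i for i in range(n))
    return total, carries[n]
```

The group generate and propagate terms are the same quantities used by the carry-lookahead adder; what changes is the regular, logarithmic-depth way they are combined.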

97
Recurrence Solver Based Adders

98
Carry-Lookahead Adder (Weinberger and Smith)
Just to remind you! Please notice the similarity with parallel-prefix adders!

99
Multiplexer Based Adder
Farooqui and Oklobdzija, 1999 Int'l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999

100
Multiplexer Based Adder
- Based on the realization that a MUX circuit is faster than a logic gate, due to its transmission-gate implementation
- Based on the carry-lookahead method (W-S), or recurrence solver

101
Multiplexer Based Adder
A. A. Farooqui, V. G. Oklobdzija, F. Chechrazi, 1999 Int'l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999.

102
Multiplexer Based Adder
A. A. Farooqui, V. G. Oklobdzija, F. Chechrazi, 1999 Int'l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999.

103
Multiplexer Based Adder
A. A. Farooqui, V. G. Oklobdzija, F. Chechrazi, 1999 Int'l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999.

104
Multiplexer Based Adder
A. A. Farooqui, V. G. Oklobdzija, F. Chechrazi, 1999 Int'l Sym. on VLSI Technology, Taipei, Taiwan, June 8-10, 1999.
- Results in a very fast structure: 7 MUX delays for a 64-b adder
- Delay using standard cell, 0.25u, 2.5 V, 25 °C
[Table: Adder Size (bits) vs. Delay (pS)]

105
DEC "Alpha" Adder
Combination:
- 8-bit tapered pre-discharged Manchester carry chains, with C_in = 0 and C_in = 1
- 32-bit LSB carry-lookahead adder
- 32-bit MSB conditional-sum adder
- Carry-select on the most significant 32 bits
- Latches in the middle: pipelined addition

106
DEC "Alpha" Adder

107
DEC "Alpha" Adder: Results
- The first 200 MHz processor
- Built using 0.75u technology
- V = 3.3 V, 30 W
- Pipelined (two latches), allowing 5 ns throughput and 10 ns latency

108
Conclusion
VLSI Implementation of Addition

109
Conclusion: VLSI Implementation of Addition
- Currently, implementation parameters are not reflected in the algorithms used for development
- Layout and wire-delay effects are largely neglected, and this is becoming intolerable in the next generation of technology
- Transistor sizing has a large effect, which can outweigh the algorithm
- There is a great disconnect between algorithm and implementation
- New rules and measures of goodness are needed

110
Multiplication
Parallel Multiplier Implementation

111
Multiplication
Algorithm: the partial product p(j) is updated for j = 0, ..., n-1 from its initial value, so that p(n) = XY after n steps.
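The recurrence and the initial value did not survive extraction. A commonly used right-shift form, taken here as an assumption, is p(j+1) = (p(j) + x_j·Y·2^n) / 2 with p(0) = 0, which gives p(n) = XY for unsigned n-bit X. The function name is hypothetical.

```python
def shift_add_multiply(x: int, y: int, n: int) -> int:
    """Multiply unsigned x (n bits) by y with the right-shift recurrence."""
    p = 0                                   # assumed initial value p(0) = 0
    for j in range(n):
        x_j = (x >> j) & 1                  # multiplier bit examined in step j
        p = (p + x_j * y * (1 << n)) >> 1   # add aligned multiplicand, shift right
    return p                                # p(n) = x * y
```

The shift is exact at every step, because after j steps p is a multiple of 2^(n-j). For example, `shift_add_multiply(13, 11, 4)` returns 143.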

112
Parallel Multipliers

113
4:2 Compressor

114
Re-designed 4:2 Compressor with 3 XOR Delay
[Figure: inputs I1, I2, I3, I4 and C_in; multiplexers with 0/1 inputs; outputs S, C and C_out]
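The arithmetic contract of a 4:2 compressor can be checked with a short model. This sketch uses the conventional construction from two cascaded full adders, which is an assumption made for illustration; it is not the re-designed 3-XOR-delay circuit from the slide, though both must satisfy the same identity.

```python
from itertools import product


def full_adder(a, b, c):
    """Returns (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def compressor_4_2(i1, i2, i3, i4, c_in):
    """Conventional 4:2 compressor: returns (S, C, C_out)."""
    s1, c_out = full_adder(i1, i2, i3)   # C_out does not depend on c_in
    s, c = full_adder(s1, i4, c_in)
    return s, c, c_out


# Identity every 4:2 compressor satisfies:
#   I1 + I2 + I3 + I4 + C_in = S + 2 * (C + C_out)
for bits in product((0, 1), repeat=5):
    s, c, c_out = compressor_4_2(*bits)
    assert sum(bits) == s + 2 * (c + c_out)
```

Because C_out is independent of C_in, compressors placed side by side in a row do not form a ripple chain.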

115
Three-Dimensional optimization Method: TDM (Oklobdzija, Villeger, Liu, 1996)

116
Generation of the Partial Product Reduction Tree in TDM multiplier

117
Speed of Partial Product Reduction for Various Schemes

118
Booth Recoding Algorithm

x_{i+2}  x_{i+1}  x_i    Add to partial product
   0        0      0     +0Y
   0        0      1     +1Y
   0        1      0     +1Y
   0        1      1     +2Y
   1        0      0     -2Y
   1        0      1     -1Y
   1        1      0     -1Y
   1        1      1     -0Y
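The table corresponds to the digit value -2·x_{i+2} + x_{i+1} + x_i. The sketch below applies it to overlapping bit triplets of a two's-complement multiplier, assuming an even word length n and an implicit 0 below the least significant bit; the function names are illustrative.

```python
def booth_radix4_digits(x: int, n: int) -> list[int]:
    """Recode an n-bit two's-complement x (n even) into digits in {-2, ..., 2}.

    Digits are returned least significant first; digit k has weight 4**k.
    """
    digits = []
    low = 0                               # implicit bit below the LSB
    for i in range(0, n, 2):
        mid = (x >> i) & 1
        high = (x >> (i + 1)) & 1
        digits.append(-2 * high + mid + low)   # one row of the recoding table
        low = high                        # triplets overlap by one bit
    return digits


def booth_multiply(x: int, y: int, n: int) -> int:
    """Multiply by summing the recoded multiples of y (n/2 partial products)."""
    return sum(d * y * 4 ** k for k, d in enumerate(booth_radix4_digits(x, n)))
```

For example, the 4-bit value 7 (0111) recodes to the digits [-1, 2], that is, 7 = -1 + 2·4. Recoding halves the number of partial products, and every multiple needed (0, ±Y, ±2Y) can be formed by shifting and negation.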

119
Organization of Hitachi's DPL multiplier

120
Hitachi's 4:2 compressor structure

121
DPL multiplexer circuit

122
Conclusion
References:
1. E. Swartzlander, "Computer Arithmetic", Vol. 1 & 2, IEEE Computer Society Press.
2. K. Hwang, "Computer Arithmetic: Principles, Architecture and Design", John Wiley and Sons.
3. M. Ercegovac, “Digital Systems and Hardware/Firmware Algorithms”, Chapter 12: Arithmetic Algorithms and Processors, John Wiley & Sons.
4. A. Chandrakasan, W. Bowhill, F. Fox, Editors, "Design of High Performance Microprocessors Circuits", IEEE Press, July [year missing].
5. V. G. Oklobdzija, “High-Performance System Design: Circuits and Logic”, IEEE Press, July [year missing].
