Using Carry-Save Adders: For Radix-4, Can Be Used to Generate 3a – No Booth’s; Slight Delay Penalty from CSA – 3 Gates

1 Using Carry-Save Adders
For Radix-4, a CSA Can Be Used to Generate 3a – No Booth Recoding Needed
Slight Delay Penalty from the CSA – 3 Gate Delays
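
As a rough illustration of the slide above, the sketch below handles two multiplier bits per cycle by feeding a and 2a to a carry-save adder as separate operands, so 3a is never formed by a dedicated adder. The names `csa` and `multiply_radix4_csa` are made up for this example, operands are assumed unsigned, and Python integers stand in for the hardware registers.

```python
def csa(x, y, z):
    """3:2 carry-save adder: returns (s, c) with x + y + z == s + c."""
    s = x ^ y ^ z                              # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1     # majority bits, shifted left
    return s, c


def multiply_radix4_csa(a, x, k):
    """Unsigned a times k-bit unsigned x, two multiplier bits per cycle."""
    p = 0
    for i in range(0, k, 2):
        m1 = (a << i) if (x >> i) & 1 else 0              # x_i * a
        m2 = (a << (i + 1)) if (x >> (i + 1)) & 1 else 0  # x_(i+1) * 2a
        s, c = csa(p, m1, m2)   # both multiples enter the CSA together
        p = s + c               # one carry-propagate addition per cycle
    return p
```

When both bits of a digit are 1, the CSA receives a and 2a together, which is how the multiple 3a arises without a separate precomputation.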

2 Upper Half of P in Stored-Carry Form
For Radix-2, a Better Use Is Keeping the Cumulative Partial Product in Redundant Form for the First k-1 Cycles
Then Use a CPA in the Last Cycle
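
A minimal sketch of this idea, assuming unsigned operands: the running product is held as a (sum, carry) pair and only resolved by a single carry-propagate addition at the end. The function names are hypothetical, and the loop simply applies the CSA in every cycle and adds once afterwards, which is a simplification of the k-1 cycles plus CPA schedule on the slide.

```python
def csa(x, y, z):
    """3:2 carry-save adder: returns (s, c) with x + y + z == s + c."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def multiply_carry_save(a, x, k):
    """Unsigned a times k-bit unsigned x, one multiplier bit per cycle."""
    s, c = 0, 0                       # cumulative product in redundant form
    for i in range(k):
        multiple = (a << i) if (x >> i) & 1 else 0
        s, c = csa(s, c, multiple)    # no carry propagation inside the loop
    return s + c                      # the only carry-propagate addition
```

The invariant is that s + c always equals the sum of the partial products accumulated so far, so each cycle costs only one CSA delay.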

3 CSA With Booth Recoding
Better Usage when Combined with Booth’s Recoding
– Reduces Cycles by 50%
Each Cycle Faster Due to CSA
Sign of ±a, ±2a Incorporated Directly in Recoder/Selector Instead of Add/Subtract Signal Generation
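
The recoding itself can be sketched as follows. This is the standard radix-4 Booth rule (digit = -2·x_(i+1) + x_i + x_(i-1)), written with hypothetical function names and assuming an even word length k and a two's-complement multiplier; it models the arithmetic only, not the recoder/selector circuit.

```python
def booth_radix4_digits(x, k):
    """Recode a k-bit two's-complement x (k even) into k/2 digits in -2..2."""
    x &= (1 << k) - 1          # work on the k-bit two's-complement pattern
    digits = []
    prev = 0                   # x_(-1) is taken as 0
    for i in range(0, k, 2):
        b0 = (x >> i) & 1
        b1 = (x >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits              # least significant digit first


def booth_multiply(a, x, k):
    """Sum the selected multiples 0, ±a, ±2a, each weighted by 4**j."""
    return sum(d * a * 4 ** j for j, d in enumerate(booth_radix4_digits(x, k)))
```

A k-bit multiplier yields k/2 digits, which is where the 50% reduction in cycles comes from.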

4 CSA Combined with Booth Recoding

5 Booth Recoder/Selector
Circuitry Shown on Following Slide
Negative Multiples –a, –2a in 2’s Complement
a, 2a Aligned at Right with Position i Must Be Padded with i Zeros to the Right
Bitwise Complement (when –a, –2a Needed) Converts These Zeros to Ones
Following LSb Add of 1 Converts Them Back to Zeros and Causes a Carry-in of 1 into Position i
Can Ignore Positions 0 through i-1 (in Negative Multiples)
Insert Carry-in Directly (dot)
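
The shortcut described on this slide can be checked numerically. In the sketch below (the function name and the `width` parameter are assumptions for illustration), only positions i and above are complemented, and the carry-in of 1 is injected directly at position i; the result matches the full two's-complement negation of the shifted multiple.

```python
def negative_multiple(m, i, width):
    """Two's-complement pattern of -(m << i) in `width` bits, via the shortcut."""
    upper_mask = (1 << (width - i)) - 1
    complemented = (~m & upper_mask) << i   # complement positions i and above only
    carry_in = 1 << i                       # the extra "dot" inserted at position i
    return (complemented + carry_in) & ((1 << width) - 1)
```

For example, with m = 5, i = 2 and width = 8, the complemented upper bits give 232 and the carry-in adds 4, producing 236, which is the 8-bit pattern of -20.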

6 Booth Recoder – Selector Circuit

7 Radix-4 with CSA – No Booth

8 Radices > 4
Radix-8 (3 Bits at a Time, k/3 Multiples) Requires 3-Level CSA Tree
– Might as Well Use Radix-16 (4 Bits at a Time)
– Still 3-Level Tree with One More CSA
MUXes Can Be Replaced with Booth Recoder/Selector Circuits in Higher-Radix Multipliers
Can Continue to Increase Radix (256 – 8 Bits at a Time), Leading to Wider Trees
Tradeoff Is Speed Versus Area
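
The "3-level tree with one more CSA" remark can be reproduced with a small count. The sketch assumes, as an interpretation of the slide, that each cycle must reduce one mux-selected multiple per multiplier bit plus the stored sum and carry of the running product, and that every level applies as many 3:2 CSAs as possible. The function name is hypothetical.

```python
def csa_tree_cost(operands):
    """Levels and CSA count needed to reduce `operands` values to 2."""
    levels = csas = 0
    while operands > 2:
        used = operands // 3      # each CSA turns 3 operands into 2
        operands -= used
        csas += used
        levels += 1
    return levels, csas


radix8 = csa_tree_cost(3 + 2)     # a, 2a, 4a plus stored sum and carry
radix16 = csa_tree_cost(4 + 2)    # a, 2a, 4a, 8a plus stored sum and carry
```

Under these assumptions radix-8 needs 3 levels and 3 CSAs, while radix-16 needs the same 3 levels and 4 CSAs.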

9 Radix-16 Multiplication

10 Classification of Multipliers

11 Twin-Beat Mult. with Radix-8 Booth Recoding

12 Full Tree Multipliers
All k PPs Produced Simultaneously
Input to k-input Multioperand Tree
Multiples of a (Binary, High-Radix or Recoded) Formed at Top of Tree
Multiple-Forming Circuits
– AND Gates (binary multiplier)
– Radix-4 Booth (recoded multiplier)
Tree Results in Product in Redundant Form (2 Values – Carry-Save, for Example)
Final Product Formed With Converter (Fast CPA, for Example)

13 General Parallel Multiplier

14 Tree Type Multiplier Classification
Distinguished by Design of:
1. Partial Product Forming Circuits (e.g., Booth, Hi-Rad, etc.)
2. Reduction Tree Type
3. Redundant-to-Binary Converter
If Redundant Result in Carry-Save Form, Converter is Just a CPA
Could Use Other Redundant Adders Such as Signed Binary (4:2 Compressors)
High Radix Multipliers Lead to Fewer Values to Accumulate
– Sequential Design – Fewer Cycles
– Parallel Design – Smaller Tree
– Tradeoff: Tree Complexity Versus Multiple-Forming Circuit

15 Wallace and Dadda Tree Multipliers
Wallace – Combine Partial Products as Soon as Possible
Dadda – Maintain Critical Path Length (Tree Depth) but Combine as Late as Possible
Wallace – Fastest Possible Design Since Typically Smaller CPA at End
Dadda – Simpler Tree but Wider CPA at End

16 4 × 4 Example
16 AND Gates Used to Form x_i a_j Terms (dots)
Column Heights: 1 2 3 4 3 2 1

17 Wallace Example
1 2 3 4 3 2 1
5 FAs, 3 HAs, 4-bit CPA

18 Dadda Examples
1 2 3 4 3 2 1
3 FAs, 3 HAs, 6-bit CPA
1 2 3 4 3 2 1
4 FAs, 2 HAs, 6-bit CPA
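
The first Dadda count above can be reproduced from the column heights alone. The sketch below is one common formulation of Dadda's rule (reduce each column only as far as the next target height in the sequence 2, 3, 4, 6, 9, ...), not necessarily the exact procedure behind the slide; the function name is made up, and heights are listed least significant column first.

```python
def dadda_counts(heights):
    """Return (full adders, half adders, CPA width) for Dadda reduction."""
    targets = [2]
    while targets[-1] * 3 // 2 < max(heights):
        targets.append(targets[-1] * 3 // 2)       # 2, 3, 4, 6, 9, ...
    heights = list(heights) + [0, 0]               # room for carries at the top
    fas = has = 0
    for d in reversed(targets):                    # tallest target first
        carries = 0
        for col in range(len(heights)):
            h = heights[col] + carries             # include carries from the right
            carries = 0
            while h > d:
                if h - d >= 2:
                    fas += 1                       # FA: 3 bits in, 1 stays here
                    h -= 2
                else:
                    has += 1                       # HA: 2 bits in, 1 stays here
                    h -= 1
                carries += 1                       # one carry to the next column
            heights[col] = h
    cpa_width = sum(1 for h in heights if h == 2)  # columns still holding 2 bits
    return fas, has, cpa_width
```

For the heights 1 2 3 4 3 2 1 this gives 3 FAs, 3 HAs and a 6-bit CPA, in line with the first example on the slide. The second example on the slide trades one HA for an FA and is not generated by this particular rule.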

19 Trees in Numeric Representation
Many Times a Hybrid Approach Is Used to Find the Smallest-Width CPA
MS Thesis Topic – Optimize Tree With Different Counter Types

20 Implementation Issues
Logarithmic Depth Tree – Irregular Structure
Design/Layout Difficult
Various Length Signal Propagation Paths
Hazards and Signal Skew
Need Iterated Recursive Structures
Automatic Synthesis and Layout
Motivates Search for Alternative Reduction Tree Structures

21 Other Tree Architectures
Can Compose from Larger Counters, e.g. (7:2)
– Use “0” Inputs for Some
– Or Prune the Tree for Some
Use “slices” – Example is (11:2) – Next Slide
– Can be Laid Out to Occupy Narrow Vertical Slice and Replicated
– All Carries Produced in Level i Enter Level i+1
– Balanced Delay Tree Results
3 Columns – 1, 3, 5 FAs
Can Expand from 11 to 18 – Append Col. of 7
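
The 11 and 18 figures follow from a simple count: within a slice, each full adder lowers the number of bits by one (its carry leaves for the neighboring slice while an equivalent carry arrives from the other side), so reducing n inputs to 2 takes n - 2 full adders. A small check, using a hypothetical helper name:

```python
def slice_inputs(fa_columns):
    """Inputs handled by an (n:2) slice built from the given columns of FAs."""
    return sum(fa_columns) + 2    # n - 2 full adders reduce n bits to 2


eleven = slice_inputs([1, 3, 5])        # the (11:2) slice
eighteen = slice_inputs([1, 3, 5, 7])   # after appending a column of 7 FAs
```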

22 (11:2) Tree Slice

23 Other Tree Blocks
Converter Stage is Fast CPA
Can Also Use SBD
With SBD the Converter Stage is a Fast Subtractor

24 Array Multipliers
Can Eliminate Top CSA With 0 Input
Can Replace 0 With y to Compute ax+y

25 Array Multipliers
Tree is One-Sided
Longest Delay is 4 CSA Plus k-bit CPA
Slower than Wallace/Dadda Tree
Regular Structure
– short wires in horiz., vert., diag. positions
– simple, efficient layout
– easily pipelined (latches after each CSA row)

26 Methods for Reducing Array Size

27 Reducing Array Size (cont.)

28 5 by 5 Array Multiplier (unsigned)

29 Signed Array Multiplier
Array with 2’s Complement
Alternative is Pezaris Array with Different Cell Types
Need Array of AND Gates for Multiple Generation
Critical Path is Main Diagonal then Ripple Thru CPA
Can skip “h” Cells Along Main Diag
– lower right cell now has 4 inputs
– move to “extra” input in second cell in diag.
– less regular layout now but faster

30 5 by 5 Array Multiplier (signed)

31 5 by 5 Array Multiplier AND Gates Embedded inside FA Blocks

32 Pipelined Partial Tree Multiplier

33 Pipelined Array Multiplier
