Integer Multipliers

Presentation transcript:

1 Integer Multipliers

2 Multipliers
A must-have circuit in most DSP applications.
A variety of multipliers exists, and one can be chosen based on its performance: serial, serial/parallel, shift-and-add, array, Booth, Wallace tree, ...

3 16x16 multiplier
[Block diagram: registers RA, RB, and RC, each with reset and en (enable) inputs, and converter blocks.]

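4 Multiplication Algorithm
Multiplicand: X = X_{n-1} X_{n-2} \dots X_0
Multiplier: Y = Y_{n-1} Y_{n-2} \dots Y_0
Partial-product rows (row k is shifted k positions to the left of row 0):
\[
\begin{array}{ccccc}
Y_{n-1}X_0 & Y_{n-2}X_0 & \cdots & Y_1X_0 & Y_0X_0 \\
Y_{n-1}X_1 & Y_{n-2}X_1 & \cdots & Y_1X_1 & Y_0X_1 \\
Y_{n-1}X_2 & Y_{n-2}X_2 & \cdots & Y_1X_2 & Y_0X_2 \\
\vdots & \vdots & & \vdots & \vdots \\
Y_{n-1}X_{n-2} & Y_{n-2}X_{n-2} & \cdots & Y_1X_{n-2} & Y_0X_{n-2} \\
Y_{n-1}X_{n-1} & Y_{n-2}X_{n-1} & \cdots & Y_1X_{n-1} & Y_0X_{n-1}
\end{array}
\]
Product: P_{2n-1} P_{2n-2} P_{2n-3} \dots P_2 P_1 P_0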

5 1. Multiplication Algorithms
Implementing the multiplication of binary numbers boils down to how the additions are done. Consider two 8-bit numbers A and B that generate the 16-bit product P. First generate the 64 partial products, then add them up.
Operands:
A7 A6 A5 A4 A3 A2 A1 A0
B7 B6 B5 B4 B3 B2 B1 B0
Partial products (row j is shifted j positions to the left of row 0):
A7.B0 A6.B0 A5.B0 A4.B0 A3.B0 A2.B0 A1.B0 A0.B0
A7.B1 A6.B1 A5.B1 A4.B1 A3.B1 A2.B1 A1.B1 A0.B1
A7.B2 A6.B2 A5.B2 A4.B2 A3.B2 A2.B2 A1.B2 A0.B2
A7.B3 A6.B3 A5.B3 A4.B3 A3.B3 A2.B3 A1.B3 A0.B3
A7.B4 A6.B4 A5.B4 A4.B4 A3.B4 A2.B4 A1.B4 A0.B4
A7.B5 A6.B5 A5.B5 A4.B5 A3.B5 A2.B5 A1.B5 A0.B5
A7.B6 A6.B6 A5.B6 A4.B6 A3.B6 A2.B6 A1.B6 A0.B6
A7.B7 A6.B7 A5.B7 A4.B7 A3.B7 A2.B7 A1.B7 A0.B7
Product:
P15 P14 P13 P12 P11 P10 P9 P8 P7 P6 P5 P4 P3 P2 P1 P0
The equation is:
\[ P = \sum_{j=0}^{7} \sum_{i=0}^{7} A_i B_j \, 2^{i+j} \]
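The "generate the partial products, then add them up" step can be sketched in a few lines of Python. This is only an illustrative software model, not a hardware description; the function names `partial_products` and `sum_partial_products` are made up for this sketch, and the operand values 0x89 and 0xAB are borrowed from the simulation example later in the deck.

```python
def partial_products(a: int, b: int, n: int = 8) -> list[list[int]]:
    """Return the n x n matrix of AND terms; entry [j][i] is A_i AND B_j."""
    return [[(a >> i) & (b >> j) & 1 for i in range(n)] for j in range(n)]


def sum_partial_products(pp: list[list[int]]) -> int:
    """Add every partial-product bit at its weight 2^(i+j)."""
    return sum(bit << (i + j)
               for j, row in enumerate(pp)
               for i, bit in enumerate(row))


pp = partial_products(0x89, 0xAB)
product = sum_partial_products(pp)
print(len(pp) * len(pp[0]))   # number of partial-product bits: 64
print(hex(product))           # 0x5b83
```

Every multiplier architecture in the remaining slides computes this same sum; they differ only in how the additions are ordered and shared.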

6 Multiplier Design
[Block diagram: MU (16x16 multiplier unit), input registers (REGIN1), output registers (REG OUT), control unit, and storage.]

7 Slide 1
X: x_3 x_2 x_1 x_0
Y: y_3 y_2 y_1 y_0
Input sequence for G1:
0 0 x_3 x_2 x_1 x_0 0 x_3 x_2 x_1 x_0 0 x_3 x_2 x_1 x_0 0 x_3 x_2 x_1 x_0
0 0 y_3 y_3 y_3 y_3 0 y_2 y_2 y_2 y_2 0 y_1 y_1 y_1 y_1 0 y_0 y_0 y_0 y_0
Reset:

8-11 Slides 2 to 5 (animation frames; same X, Y, input sequence, and Reset rows as slide 7)
S_i: the ith bit of the final result

12 Slide 6 (same inputs), adding:
C_i: the only carry from column i

13-28 Slides 7 to 21 (same inputs; the label "Slide 21" appears on both slide 27 and slide 28), adding:
S_ij: the jth partial sum for column i
C_ij: the jth partial carry from column i

29 Slide 1
S_i: the ith bit of the final result

30 Slide 2
S_i: the ith bit of the final result
C_i: the only carry from column i

31-35 Slides 3 to 7
S_i: the ith bit of the final result
C_i: the only carry from column i
S_ij: the jth partial sum for column i
C_ij: the jth partial carry from column i

36 Slide 8
S_i: the ith bit of the final result
C_i: the only carry from column i

37 Shift-and-Add Multiplier Design Implementation
[Block diagram: inputs Ain (7 downto 0) and Bin (7 downto 0), CLOCK, registers REGA, REGB, and REGC, an 8-bit adder, a MUX with a constant 0 input, and outputs Result (7 downto 0) and Result (15 downto 8).]

38 Synchronous Shift-and-Add Multiplier Controller
• Multiplication process: 5 states: Idle, Init, Test, Add, and Shift&Count.
• Idle: the process starts on receiving the Start signal.
• Init: the multiplicand and the multiplier are loaded into a load register and a shift register, respectively.
• Test: the LSB of the shift register that holds the multiplier is tested to decide the next state.

39 Synchronous Shift-and-Add Multiplier Controller Design
• Add: if the LSB is '1', the next state adds the new partial product to the accumulated result, and the state machine then moves to the Shift&Count state.
• Shift&Count: if the LSB is '0' (or once Add has finished), the two shift registers shift their contents one bit to the right and the counter counts up by one. The state machine then returns to the Test state.
• When the counter reaches N, a Stop signal is asserted and the state machine goes to the Idle state.
• Idle: in the Idle state, a Done signal is asserted to indicate the end of the multiplication.
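The five-state controller described on the two slides above can be modelled in software roughly as follows. This is a behavioural sketch under assumptions, not the deck's actual design: the names `State`, `shift_add_multiply`, `load_reg`, `shift_reg`, and `acc` are invented here, and the accumulator is assumed to shift right together with the multiplier register so that the product ends up in the pair (acc, shift_reg).

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    INIT = auto()
    TEST = auto()
    ADD = auto()
    SHIFT_COUNT = auto()


def shift_add_multiply(multiplicand: int, multiplier: int, n: int = 8):
    """Run the state machine once; return (product, list of visited states)."""
    mask = (1 << n) - 1
    state, trace = State.IDLE, []
    load_reg = shift_reg = acc = count = 0
    start = True                        # models the Start signal
    while True:
        trace.append(state)
        if state is State.IDLE:
            if not start:               # back in Idle: Done is asserted
                return (acc << n) | shift_reg, trace
            start = False
            state = State.INIT
        elif state is State.INIT:
            load_reg, shift_reg = multiplicand & mask, multiplier & mask
            acc = count = 0
            state = State.TEST
        elif state is State.TEST:       # test the LSB of the multiplier
            state = State.ADD if shift_reg & 1 else State.SHIFT_COUNT
        elif state is State.ADD:
            acc += load_reg             # sum may temporarily need n+1 bits
            state = State.SHIFT_COUNT
        else:                           # SHIFT_COUNT
            shift_reg = (shift_reg >> 1) | ((acc & 1) << (n - 1))
            acc >>= 1
            count += 1                  # counter reaching N acts as Stop
            state = State.IDLE if count == n else State.TEST


product, trace = shift_add_multiply(0x89, 0xAB)
print(hex(product))   # 0x5b83
```

A multiplier bit of 1 costs two states (Add, then Shift&Count) and a bit of 0 costs one, so in this model the run time depends on the number of ones in the multiplier.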

40 Slide 1
n-bit Multiplier:
Q_0 = 1: the multiplicand is added to register A and the result is stored in register A; registers C, A, Q are then shifted one bit to the right.
Q_0 = 0: registers C, A, Q are shifted one bit to the right.

41 Slide 2 Example: 4-bit Multiplier, Initial Values

42 Slide 3 Example: 4-bit Multiplier, First Cycle: Add

43 Slide 4 Example: 4-bit Multiplier, First Cycle: Shift

44 Slide 5 Example: 4-bit Multiplier, Second Cycle: Shift

45 Slide 6 Example: 4-bit Multiplier, Third Cycle: Add

46 Slide 7 Example: 4-bit Multiplier, Third Cycle: Shift

47 Slide 8 Example: 4-bit Multiplier, Fourth Cycle: Add

48 Slide 9 Example: 4-bit Multiplier, Fourth Cycle: Shift
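The register values of the 4-bit example did not survive in the transcript, so the trace below is a reconstruction under assumptions rather than the slides' own numbers. The cycle pattern in the slide titles (add in cycles 1, 3, and 4, shift only in cycle 2) implies a multiplier of 1101; the multiplicand 1011 is an arbitrary choice made here. The function name `caq_trace` is hypothetical.

```python
def caq_trace(multiplicand: int, multiplier: int, n: int = 4):
    """Return a list of (step, C, A, Q) tuples for the C/A/Q register scheme."""
    mask = (1 << n) - 1
    c, a, q = 0, 0, multiplier & mask
    steps = [("initial values", c, a, q)]
    for cycle in range(1, n + 1):
        if q & 1:                                # Q0 = 1: A <- A + multiplicand
            total = a + (multiplicand & mask)
            c, a = total >> n, total & mask      # C holds the carry-out
            steps.append((f"cycle {cycle} add", c, a, q))
        # Shift C, A, Q right by one bit as a single long register.
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (c << (n - 1))
        c = 0
        steps.append((f"cycle {cycle} shift", c, a, q))
    return steps


steps = caq_trace(0b1011, 0b1101)
for name, c, a, q in steps:
    print(f"{name:15s} C={c} A={a:04b} Q={q:04b}")
_, _, a_final, q_final = steps[-1]
print((a_final << 4) | q_final)   # 143
```

The product is read from A (high half) and Q (low half) after the last shift; the C bit is what lets the fourth-cycle addition overflow four bits without losing the carry.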

49 4×4 Synchronous Shift-and-Add Multiplier: Layout Design
[Figure: floor plan of the 4×4 synchronous shift-and-add multiplier.]

50 Comparison between Synchronous and Asynchronous Approaches.

51 Example (simulated by Ovais Ahmed, Fall 03 project)
Multiplicand = 89_16
Multiplier = AB_16
Expected result = 5B83_16

52 Array Multiplier
• Regular structure based on the add-and-shift algorithm.
• Addition is mainly done with the carry-save algorithm.
• Sign-bit extension results in a higher capacitive load and slows the circuit down.

53 Addition with CLA

54 Array Multiplier with CSA

55 Critical Path with Array Multipliers
[Figure: rows of half-adder (HA) and full-adder (FA) cells showing two of the possible critical paths for the ripple-carry based 4×4 multiplier.]
Area = N^2 AND gates + N(N-1) full adders
Delay = \tau_{HA} + (2N-1)\,\tau_{FA}
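To make the area and delay expressions concrete, here is a small bit-level model of a ripple-carry array multiplier that counts its own cells. It is a sketch with assumptions: every adder cell is modelled as a full adder (the slide's figure uses half adders where an input is constant), and the names `full_adder`, `array_multiply`, and `critical_path_delay` are invented for this example. The delay function simply evaluates the slide's formula.

```python
def full_adder(x: int, y: int, cin: int):
    """One-bit full adder: returns (sum, carry_out)."""
    return x ^ y ^ cin, (x & y) | (cin & (x ^ y))


def array_multiply(a: int, b: int, n: int = 4):
    """Unsigned n x n array multiplier; returns (product, AND gates, adder cells)."""
    and_gates = adders = 0
    row = [(a >> i) & b & 1 for i in range(n)]    # first row: a_i AND b_0
    and_gates += n
    top = 0                                       # carry-out of the previous row
    product_bits = []
    for j in range(1, n):
        product_bits.append(row[0])               # lowest running-sum bit is final
        upper = row[1:] + [top]                   # aligned with partial product j
        pp = [(a >> i) & (b >> j) & 1 for i in range(n)]
        and_gates += n
        carry, new_row = 0, []
        for i in range(n):                        # ripple-carry across the row
            s, carry = full_adder(upper[i], pp[i], carry)
            adders += 1
            new_row.append(s)
        row, top = new_row, carry
    product_bits += row + [top]
    return sum(bit << k for k, bit in enumerate(product_bits)), and_gates, adders


def critical_path_delay(n: int, t_ha: float, t_fa: float) -> float:
    """Delay estimate from the slide: tau_HA + (2N - 1) * tau_FA."""
    return t_ha + (2 * n - 1) * t_fa


print(array_multiply(13, 11))   # (143, 16, 12)
```

For N = 4 the model uses 16 AND gates and 12 adder cells, matching N^2 and N(N-1); both the cell count and the carry path grow with N, which is the motivation for the tree and Booth structures that follow.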

56

57 Wallace Tree

58 Array Multiplier + Wallace Tree

59 Background (4/12/2015, Concordia VLSI Lab)
• Baugh-Wooley Algorithm
Convert negative partial products to a positive representation.
No sign extension required.

60 Examples of 5-by-5 Baugh-Wooley
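Since the 5-by-5 figures are not reproduced in the transcript, the following sketch shows one common textbook formulation (often called the modified Baugh-Wooley form), which may differ in detail from the slides' examples. The partial products that involve exactly one sign bit are complemented, and the constants 2^n and 2^(2n-1) are added, so every term in the sum is non-negative. The function name `baugh_wooley` is an assumption of this sketch.

```python
def baugh_wooley(a: int, b: int, n: int = 5) -> int:
    """Multiply two n-bit two's-complement integers using only positive terms."""
    ua, ub = a & ((1 << n) - 1), b & ((1 << n) - 1)   # raw bit patterns

    def bit(v: int, k: int) -> int:
        return (v >> k) & 1

    total = 0
    # Ordinary partial products of the non-sign bits.
    for i in range(n - 1):
        for j in range(n - 1):
            total += (bit(ua, i) & bit(ub, j)) << (i + j)
    # Terms with exactly one sign bit are complemented (NAND instead of AND).
    for k in range(n - 1):
        total += (1 - (bit(ua, n - 1) & bit(ub, k))) << (k + n - 1)
        total += (1 - (bit(ua, k) & bit(ub, n - 1))) << (k + n - 1)
    # Product of the two sign bits, plus the two correction constants.
    total += (bit(ua, n - 1) & bit(ub, n - 1)) << (2 * n - 2)
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1                       # keep 2n result bits
    # Interpret the 2n-bit pattern as a signed value.
    return total - (1 << (2 * n)) if total >> (2 * n - 1) else total


print(baugh_wooley(-7, 11))   # -77
```

No row is ever sign-extended: the negative weights are absorbed by the complemented bits and the two constant ones, which keeps the array regular.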

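61 Squaring an 8-bit number: a7 a6 a5 a4 a3 a2 a1 a0 * a7 a6 a5 a4 a3 a2 a1 a0

Full partial-product matrix (row k is shifted k positions to the left of row 0):
a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 a0*a0
a7*a1 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 a1*a1 a0*a1
a7*a2 a6*a2 a5*a2 a4*a2 a3*a2 a2*a2 a1*a2 a0*a2
a7*a3 a6*a3 a5*a3 a4*a3 a3*a3 a2*a3 a1*a3 a0*a3
a7*a4 a6*a4 a5*a4 a4*a4 a3*a4 a2*a4 a1*a4 a0*a4
a7*a5 a6*a5 a5*a5 a4*a5 a3*a5 a2*a5 a1*a5 a0*a5
a7*a6 a6*a6 a5*a6 a4*a6 a3*a6 a2*a6 a1*a6 a0*a6
a7*a7 a6*a7 a5*a7 a4*a7 a3*a7 a2*a7 a1*a7 a0*a7

Folded matrix (columns S14 down to S0; "." marks an empty position):
S14   S13   S12   S11   S10   S9    S8    S7    S6    S5    S4    S3    S2    S1    S0
a7*a6 a7*a5 a7*a4 a7*a3 a7*a2 a7*a1 a7*a0 a6*a0 a5*a0 a4*a0 a3*a0 a2*a0 a1*a0 '0'   a0
a7*a7 .     a6*a5 a6*a4 a6*a3 a6*a2 a6*a1 a5*a1 a4*a1 a3*a1 a2*a1 .     a1*a1
.     .     a6*a6 .     a5*a4 a5*a3 a5*a2 a4*a2 a3*a2 .     a2*a2
.     .     .     .     a5*a5 .     a4*a3 .     a3*a3
.     .     .     .     .     .     a4*a4

Result: S15, S14 S13 S12 S11 S10 S9 S8 S7 S6 S5 S4 S3 S2 S1 S0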

62 Example of an 8-bit squarer (N×N, N = 8 bits)
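The folding used by the squarer rests on two identities: a_i*a_j + a_j*a_i = 2*a_i*a_j (one AND term moved one column to the left) and a_i*a_i = a_i. The sketch below checks that the folded sum still equals the square; the function name `folded_square` is an assumption of this example.

```python
def folded_square(a: int, n: int = 8) -> int:
    """Square an n-bit unsigned number using the folded partial-product matrix."""
    bits = [(a >> i) & 1 for i in range(n)]
    total = 0
    for i in range(n):
        total += bits[i] << (2 * i)                      # a_i*a_i = a_i at 2^(2i)
        for j in range(i):
            total += (bits[i] & bits[j]) << (i + j + 1)  # 2*a_i*a_j, one column up
    return total


# Terms in the folded matrix: n diagonal bits plus n(n-1)/2 cross terms.
n = 8
print(n + n * (n - 1) // 2)     # 36 terms instead of 64
print(folded_square(0x89))      # 18769
```

No term ever lands at weight 2^1, which is consistent with the constant '0' shown in that column of the folded matrix.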

63 Array Multiplier: 32-bit by 32-bit multiplier

64 Booth (Radix-4) Multiplier
• Radix-4 (3-bit recoding) halves the number of partial products to be added.
• Great saving in area and increased speed.
\[ A = -a_{n-1}2^{n-1} + a_{n-2}2^{n-2} + a_{n-3}2^{n-3} + \dots + a_1 2 + a_0 \]
\[ B = -b_{n-1}2^{n-1} + b_{n-2}2^{n-2} + b_{n-3}2^{n-3} + \dots + b_1 2 + b_0 \]
• The base-4 redundant signed-digit representation of B is
\[ B = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i \]

65
• K_i is calculated by the following equation:
\[ K_i = -2b_{2i+1} + b_{2i} + b_{2i-1}, \qquad i = 0, 1, 2, \dots, (n-2)/2 \]
• Three bits of the multiplier B (b_{2i+1}, b_{2i}, b_{2i-1}) are examined and the corresponding K_i is calculated.
• B is always appended on the right with a zero (b_{-1} = 0), and n is always even (B is sign-extended if needed).
• The product A × B is then obtained by adding n/2 partial products:
\[ A \times B = P = \sum_{i=0}^{(n/2)-1} 2^{2i} K_i A \]
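The recoding equation for K_i and the product sum translate almost directly into code. The following is a minimal sketch, assuming two's-complement operands and an even n; the names `booth_digits` and `booth_multiply` are invented here. The operands 25 and -35 are taken from the template example later in the deck.

```python
def booth_digits(b: int, n: int) -> list[int]:
    """Radix-4 Booth digits K_i of an n-bit two's-complement multiplier (n even)."""
    ub = b & ((1 << n) - 1)               # raw n-bit pattern of B

    def bit(k: int) -> int:
        return 0 if k < 0 else (ub >> k) & 1   # b_{-1} = 0

    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(n // 2)]


def booth_multiply(a: int, b: int, n: int) -> int:
    """P = sum over i of 2^(2i) * K_i * A, using n/2 partial products."""
    return sum((k * a) << (2 * i) for i, k in enumerate(booth_digits(b, n)))


print(booth_digits(-35, 8))        # [1, -1, 2, -1]
print(booth_multiply(25, -35, 8))  # -875
```

Each digit is in {-2, -1, 0, +1, +2}, so every partial product is obtained from A by at most a one-bit shift and a negation, and only n/2 of them need to be added.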

66 Booth Algorithm
Decoding of the multiplier to generate signals for hardware use.
[Table with columns X_{i+1}, X_i, X_{i-1}, OP, NEG, ZERO, TWO.]

67 Booth Algorithm
A Booth-recoded multiplier examines three bits of the multiplier at a time. It determines whether to add 0, +1, -1, +2, or -2 times the multiplicand at that rank. The operation to be performed is based on the current two bits of the multiplier and the previous bit.
[Table with columns X_{i+1}, X_i, X_{i-1}, Z_i/]

68
Bits X_i X_{i+1} X_{i+2}   Operation                                        M is multiplied by
000                        add zero (no string)                             +0
001                        add multiplicand (end of string)                 +X
010                        add multiplicand (a string)                      +X
011                        add twice the multiplicand (end of string)       +2X
100                        subtract twice the multiplicand (beg. of string) -2X
101                        subtract the multiplicand (-2X and +X)           -X
110                        subtract the multiplicand (beg. of string)       -X
111                        subtract zero (center of string)                 -0

69 Booth Algorithm: a higher-radix multiplication
Multiplicand A = ● ● ● ●
Multiplier B = (● ●)(● ●)
Partial product bits: ● ● ● ● (B_1 B_0)_2 A 4^0
Partial product bits: ● ● ● ● (B_3 B_2)_2 A 4^1
Product P = ● ● ● ● ● ● ● ●
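The dot diagram above groups the multiplier bits in pairs, so each pair acts as one radix-4 digit. A small unsigned sketch of that idea follows; the function name `radix4_partials` and the operand values are assumptions made for illustration.

```python
def radix4_partials(a: int, b: int, n: int = 4):
    """Return the n/2 partial products (B_{2k+1} B_{2k})_2 * A * 4^k and their sum."""
    partials = []
    for k in range(n // 2):
        digit = (b >> (2 * k)) & 0b11      # two multiplier bits form one digit 0..3
        partials.append(digit * a * 4 ** k)
    return partials, sum(partials)


partials, product = radix4_partials(0b1011, 0b1101)
print(partials, product)   # [11, 132] 143
```

A plain radix-4 digit can be 3, and 3A is not a simple shift of A. Booth recoding replaces the digit set {0, 1, 2, 3} with {-2, -1, 0, +1, +2} to avoid generating that multiple.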

70 Example
The following example shows how the calculation is carried out for a multiplicand X and a multiplier Y. After Booth recoding, Y is recoded into the digits +2, -1, +1. X is multiplied by each digit separately, each partial product is shifted two bits relative to the previous one, and the partial products are added together.
[Worked figure: the three partial products of X, and the bit added to the multiplier.]

71 Sign Extension

72 Sign extension
• Traditional sign-extension scheme
Segment the input operands based on the size of the embedded blocks.
Multiply the segmented inputs and extend the sign bit of each partial product.
Sum all partial products.
[Figure: segmented input operands, multiplication (×), sign extension of the partial products, addition (+), final result.]
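The three steps of the traditional scheme can be sketched as follows. The block sizes are assumptions chosen for illustration (16-bit signed operands split into 8-bit segments, each segment product treated as a 16-bit pattern), and the names `sign_extend` and `segmented_multiply` are hypothetical.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Replicate bit (from_bits - 1) of a two's-complement pattern up to to_bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def segmented_multiply(a: int, b: int) -> int:
    """16x16 signed multiply built from four 8x8 block products."""
    block, width = 8, 32
    lo_mask = (1 << block) - 1
    a_lo, a_hi = a & lo_mask, a >> block     # low segment unsigned, high signed
    b_lo, b_hi = b & lo_mask, b >> block
    # (block product, left shift, needs sign extension)
    terms = [(a_lo * b_lo, 0, False),
             (a_hi * b_lo, block, True),
             (a_lo * b_hi, block, True),
             (a_hi * b_hi, 2 * block, True)]
    total = 0
    for value, shift, signed in terms:
        pattern = value & ((1 << (2 * block)) - 1)   # 16-bit block output
        if signed:
            pattern = sign_extend(pattern, 2 * block, width - shift)
        total += pattern << shift
    total &= (1 << width) - 1
    # Interpret the 32-bit pattern as a signed value.
    return total - (1 << width) if total >> (width - 1) else total


print(segmented_multiply(-1234, 5678))   # -7006652
```

Every signed partial product has to be widened to the full result width before the final addition; that widening is the extra load that the template and Baugh-Wooley approaches in this deck try to avoid.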

73 Booth Algorithm: Example 1

74 Booth Algorithm Example 2 Notice sign extensions

75 Booth Algorithm-Example 3 Notice the sign extensions

76 Comparison of Booth and parallel multiplier shift and Add

77 Template to reduce sign extensions for the Booth algorithm (for hardware implementation)
Note that each operand is 17 bits wide, i.e. the 17th bit is the sign bit. Negative numbers are entered in 1's complement, which is why the S must be added on the right-hand side of the diagram. If 2's complement is used, the S's on the right side of the diagram can be removed.

78 Comparison of Template and the sign extension

79 Partial-product matrix generated for a 16 × 16-bit multiplication, using Booth recoding and the template given in the previous slide

80 Using the Template: 25 × (-35)
Example of using the template for 25 × (-35), with -35 as the multiplier, using an 8-bit representation.
[Worked figure with the annotations: sign bit; add SS; add inverted S; add inverted sign bit; no sign bit. The result is a negative number; converting it gives a magnitude of 875.]

81 Booth Multiplier Components
[Block diagram: Multiplier, Multiplicand, Booth Encoder, PPU (partial products unit), PPA (partial products adding unit), Product.]

82 Wallace tree and ripple-carry adder structure of an 8×8 multiplier with pipelining

83 Hardware implementation of Booth with shift and add

84 Simulation Plan

85 Testing the Design

86 Simulation For Parallel Multipliers Signed Number: Unsigned Number:

87 Simulation for Signed S/P Multipliers. There is a 340 ns delay between the operands and the result because of the delay of the D flip-flops.

88 FPGA after implementation, areas of programming shown clearly

89 Another implementation of the above after pipelining; place and route has placed the design in different locations.

90 Spartacus FPGA board

91 Testing the multiplication system

92 Comparison of Multipliers
Table 7. Performance comparison for two's complement multipliers (by Chen Yaoquan, M.Eng)
Columns: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Area, total CLBs (#):
Maximum Delay D (ns): (3.36x32) 49.33
Total Dynamic Power P (W):
Delay·Power Product DP (ns·W):
Area·Power Product AP (#·W):
Area·Delay Product AD (#·ns): 1.10E E E E E E+05
Area·Delay² Product AD² (#·ns²): 3.94E E E E E E+06
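The derived rows of this table follow directly from area, delay, and power. A small sketch of how they would be computed is shown below; the function name is hypothetical, and the input numbers are made-up placeholders for illustration, not values from the table.

```python
def figures_of_merit(area_clbs, delay_ns, power_w):
    """Derived comparison metrics for one multiplier design."""
    return {
        "DP": delay_ns * power_w,            # Delay·Power product (ns·W)
        "AP": area_clbs * power_w,           # Area·Power product (#·W)
        "AD": area_clbs * delay_ns,          # Area·Delay product (#·ns)
        "AD2": area_clbs * delay_ns ** 2,    # Area·Delay² product (#·ns²)
    }


# Placeholder inputs only: 100 CLBs, 50 ns, 0.5 W.
print(figures_of_merit(100, 50.0, 0.5))
```

AD² weights delay more heavily than AD, so it favours fast designs even when they cost extra area.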

93 Comparison of Multipliers
Table 7. Performance comparison for unsigned multipliers (by Chen Yaoquan, M.Eng)
Columns: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Area, total CLBs (#):
Maximum Delay D (ns):
Total Dynamic Power P (W):
Delay·Power Product DP (ns·W):
Area·Power Product AP (#·W):
Area·Delay Product AD (#·ns): 1.22E E E E E E+05
Area·Delay² Product AD² (#·ns²): 4.55E E E E E E+06

94 Comparison of Multipliers: the relation of area and delay for the behavioral multiplier (the "banana curve"), obtained by changing the value of "set_max_delay" in the script file. Table rows: set_max_delay (ns, up to >60), Area (#), Power (W), Delay (ns).

95 Comparison of Multipliers (by Chen Yaoquan, M.Eng)
Columns: Array Multiplier | Modified Booth Multiplier | Wallace-Tree Multiplier | Modified Booth-Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier | Behavioral Multiplier
Area: Medium | Small | Large | Small | Smallest | Medium
Critical Delay: Medium | Fast | Very Fast | Fastest | Very Large | Large
Power Consumption: Large | Medium | Large | Medium | Smallest | Medium
Complexity: Simple | Complex | More Complex | Simple | Simplest
Implement: Easy | Medium | Difficult | Easy | Easiest

96 Pipelining Simulation

97 Synthesis for Signed Multipliers Array Modified Booth Wallace Tree Modified Booth -Wallace Tree Twin Pipe S/P Behavioral

98 Synthesis for Unsigned Multipliers Array Modified Booth Wallace Tree Modified Booth -Wallace Tree Twin Pipe S/P Behavioral

99 Conclusion Modified Booth and Wallace Tree are the best techniques for high speed multiplication. Wallace Tree has the best performance, but it is hard to implement. Booth algorithm based multipliers have lower area among parallel multipliers. For behavioral multipliers, the area will increase while the delay decreases.

100 Comparison
Columns: Array Multiplier | Modified Booth Multiplier | Wallace Tree Multiplier | Modified Booth & Wallace Tree Multiplier | Twin Pipe Serial-Parallel Multiplier
Area, total CLBs (#):
Maximum Delay (ns): ns | ns | ns | ns | 22.58 ns (722.56 ns)
Power Consumption at highest speed (mW): mW (at 188 ns) | mW (at 140 ns) | 30.95 mW (at ns) | mW (at ns) | 2.089 mW (at ns)
Delay·Power Product DP (ns·mW):
Area·Power Product AP (#·mW): x | x | x | x
Area·Delay Product AD (#·ns): x | x | x | x | x 10^3
Area·Delay² Product AD² (#·ns²): x | x | x | x | x 10^6

101 NOTICE  The rest of these slides are for extra information only and are not part of the lecture

102 Array Addition

103 Addition of 8 binary numbers using the Wallace tree principle
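Since the figure for this slide did not survive, here is a hedged software model of the principle: groups of three rows are compressed into a sum word and a carry word by bitwise 3:2 counters until two rows remain, and only then is a carry-propagate addition performed. The function names and the operand values are assumptions for illustration.

```python
def carry_save_add(a, b, c):
    """Bitwise 3:2 counter: returns (sum word, carry word) with a + b + c == s + carry."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def wallace_sum(operands):
    """Add non-negative integers by repeated 3:2 compression.
    Returns (total, number of compression levels)."""
    rows = list(operands)
    levels = 0
    while len(rows) > 2:
        groups = len(rows) // 3
        next_rows = []
        for g in range(groups):
            s, c = carry_save_add(*rows[3 * g:3 * g + 3])
            next_rows += [s, c]
        next_rows += rows[3 * groups:]   # leftover rows pass through unchanged
        rows = next_rows
        levels += 1
    return sum(rows), levels             # final carry-propagate addition of two rows


total, levels = wallace_sum([13, 7, 200, 45, 99, 1, 255, 128])
print(total, levels)
```

For eight operands the row count goes 8, 6, 4, 3, 2, so no carry has to ripple until the final two-row addition.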

104

105

106

107 Baugh-Wooley two's complement multiplier:
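The figure for this slide is missing, so the following is a sketch of the commonly taught (modified) Baugh-Wooley formulation, not necessarily the exact arrangement shown in the lecture. Partial product bits that involve exactly one sign bit are inverted, and the constants 2^n and 2^(2n-1) are added, so that every row can be summed as if it were unsigned. The function name is hypothetical.

```python
def baugh_wooley_multiply(a, b, n):
    """Signed n x n multiply built only from unsigned additions of
    (possibly inverted) AND terms plus two constant bits."""
    def bit(v, i):
        return (v >> i) & 1              # works for negative Python ints too

    total = (1 << n) + (1 << (2 * n - 1))            # correction constants
    for i in range(n - 1):
        for j in range(n - 1):
            total += (bit(a, i) & bit(b, j)) << (i + j)       # ordinary terms
    total += (bit(a, n - 1) & bit(b, n - 1)) << (2 * n - 2)   # sign x sign term
    for i in range(n - 1):
        # Terms with exactly one sign bit enter the array inverted.
        total += (1 - (bit(a, i) & bit(b, n - 1))) << (i + n - 1)
        total += (1 - (bit(a, n - 1) & bit(b, i))) << (i + n - 1)
    total &= (1 << (2 * n)) - 1
    # Interpret the 2n-bit result as two's complement.
    return total - (1 << (2 * n)) if total >> (2 * n - 1) else total


print(baugh_wooley_multiply(-3, 5, 4))
```

The benefit is a regular array of identical adder cells with no subtractors and no sign extension.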

108

109 Cluster Multipliers Divide the multiplier into smaller multipliers

110 Cluster Multipliers 8-bit cluster low power multiplier The circuit used to generate the enable signal

111 Cluster Multipliers: divide the multiplication circuit into clusters (blocks) of smaller multipliers, and apply clock gating to disable the blocks that would produce a zero result. Feature: low power (13.4% savings claimed).
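A behavioural model of the idea is sketched below, assuming an unsigned 8-bit multiplier split into four 4×4 clusters. The enable rule used here (a cluster is enabled only when both of its input halves are non-zero) is an assumption for illustration; the actual enable circuit of the slides is not recoverable from the text.

```python
def cluster_multiply(a, b, n=8):
    """Unsigned n x n multiply built from four (n/2) x (n/2) clusters.
    Returns (product, number of clusters that had to be enabled)."""
    half = n // 2
    lo_mask = (1 << half) - 1
    a_parts = [a & lo_mask, a >> half]   # low half, high half
    b_parts = [b & lo_mask, b >> half]
    product = 0
    active = 0
    for i, ap in enumerate(a_parts):
        for j, bp in enumerate(b_parts):
            enable = ap != 0 and bp != 0     # a zero input half gives a zero result
            if enable:
                product += (ap * bp) << (half * (i + j))
                active += 1
    return product, active


print(cluster_multiply(0x0F, 0x03))   # small operands: only the low cluster is needed
```

In hardware the saving comes from the gated clusters not toggling; in this model that is represented only by the count of enabled clusters.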

112 Multiplexer-Based Array Multipliers (figure labels: Z_j and x_j y_j)

113 Multiplexer-Based Array Multipliers Two types of cells: Cell 1: produce the terms Z i j 2 j and includes a full adder of carry save adder array Cell 2: produce the terms x j y j 2 j and includes a full adder of carry save adder array

114 Multiplexer-Based Array Multipliers Characteristics –Faster than Modified Booth –Unlike Booth, does not require encoding logic –Requires approximately N^2/2 cells –Has a zigzag shape, thus not layout-friendly

115 Multiplexer-Based Array Multipliers Improvement –More rectangular layout –Save up to 40 percent area without penalties –Outperforms the modified Booth multiplier in both speed and power by 13% to 26%

116 Gray-Encoded Array Multiplier (table of decimal values, Dec, against their hybrid codes, Hyb, alongside 2's complement). Hybrid coding: consecutive values differ in a single bit, which reduces the number of transitions, and thus power (for highly correlated streams).
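The single-bit-difference property can be illustrated with the standard binary-reflected Gray code. Note that this is not the hybrid code of the slide, whose table did not survive; it is swapped in here only to show why such codes reduce switching activity on slowly changing data. The function names are hypothetical.

```python
def to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g):
    """Inverse of to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def transitions(seq):
    """Total number of bit flips between consecutive words of a sequence."""
    return sum(bin(x ^ y).count("1") for x, y in zip(seq, seq[1:]))


counting = list(range(16))                  # a highly correlated stream: 0, 1, 2, ...
gray = [to_gray(v) for v in counting]
print(transitions(counting), transitions(gray))
```

For a stream that steps by one each cycle, the Gray-coded bus flips exactly one bit per step, while the plain binary bus flips several bits on every carry.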

117 Gray-Encoded Array Multiplier An 8-bit wide 2’s complement radix-4 array multiplier

118 Gray-Encoded Array Multiplier Characteristics –Uses Gray code to reduce the switching activity of the multiplier –Uses 45.6% less power than Modified Booth –Uses 26.4% more area than Modified Booth

119 Ultra-high Speed Parallel Multiplier: how is ultra-high speed achieved? –Based on the Modified Booth algorithm and a tree structure (column compression) –Chooses efficient counters (3:2 and 5:3) –Uses the new compressor (20% faster) –Uses the First Partial product Addition (FPA) algorithm (reducing the bits of the CLA by 50%)

120 Ultra-high Speed Parallel Multiplier: calculation process using parallel counters for the 16x16 case. Calculate the partial products as soon as possible. The final CLA is only 16 bits wide instead of 32. Divide into 3 rows or 5 rows only (most efficient). Overall, the delay is reduced by about 30%.

121 ULLRLF Multiplier ULLRLF stands for Upper/Lower Left-to- Right Leapfrog. Combine the following techniques: –Signal flow optimization in [3:2] adder array for partial product reduction, –Left-to-right leapfrog (LRLF) signal flow, –Splitting of the reduction array into upper/lower parts.

122 ULLRLF Multiplier 1) Signal flow optimization in the [3:2] adder array: for n = 32, the delay is reduced by 30 percent, and power is also saved. PP_ij is always connected to pin A; S_in/C_in are connected to pins B/C, with most S_in signals connected to C.

123 ULLRLF Multiplier 2) Left-to-Right Leapfrog (LRLF) structure: the sum signals skip over alternate rows. Signal delays are better balanced. Low power.

124 ULLRLF Multiplier 3) Upper/Lower split structure: the long data path is broken into parallel short paths, which saves power, and the delay of partial product reduction is reduced. (Figure label: only n+2 bits.)

125 ULLRLF Multiplier Floorplan of ULLRLF (n = 32) ULLRLF multipliers have less power than optimized tree multipliers for n ≤ 32 while keeping similar delay and area. With more regularity and inherently shorter interconnects, the ULLRLF structure presents a competitive alternative to tree structures.

126 Signed Array Multiplier

127 Unsigned Array Multiplier

128 Signed Modified Booth Multiplier

129 Signed Modified Booth Multiplier

130 Unsigned Modified Booth Multiplier

131 Unsigned Modified Booth Multiplier

132 Wallace Tree multipliers

133 Wallace Tree multipliers: use 3:2 counters and 2:2 counters. Number of levels: the estimate log(32/2) / log(3/2) ≈ 6.8 is a lower bound; reducing 32 rows with 3:2 counters takes 8 levels. Irregular structure. Fast.
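The level count can be checked by simulating the row reduction: at each level every full group of three rows becomes two rows, and leftover rows pass through. The sketch below compares that count with the logarithmic estimate; the function names are assumptions for illustration.

```python
import math


def wallace_levels(rows):
    """Number of 3:2 compression levels needed to reduce `rows` rows to two."""
    levels = 0
    while rows > 2:
        rows = 2 * (rows // 3) + rows % 3   # each full group of 3 rows becomes 2
        levels += 1
    return levels


def log_estimate(rows):
    """Lower-bound estimate: each level shrinks the row count by at most 3/2."""
    return math.log(rows / 2) / math.log(3 / 2)


for n in (16, 32):
    print(n, wallace_levels(n), round(log_estimate(n), 2))
```

For 32 rows the sequence is 32, 22, 15, 10, 7, 5, 4, 3, 2, which gives 8 levels; for the 16 rows left after Modified Booth recoding it is 16, 11, 8, 6, 4, 3, 2, which gives 6 levels.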

134 Wallace Tree multipliers 2-level hierarchical

135 Modified Booth-Wallace Tree Multipliers

136 Modified Booth-Wallace Tree Multipliers: use 3:2 counters and 2:2 counters. Number of levels: the estimate log(16/2) / log(3/2) ≈ 5.1 is a lower bound; reducing 16 rows with 3:2 counters takes 6 levels. Irregular structure. Fast. Less area.

137 Twin pipe serial-parallel multipliers

138 Signed twin pipe serial-parallel multipliers “Sign” control line and the sign-change hardware

139 Unsigned twin pipe serial-parallel multipliers Don’t need the “Sign” control line and the sign-change hardware