Digital Integrated Circuits 2e: Chapter 11.1-11.3 Copyright © 2002 Prentice Hall PTR, Adapted by Yunsi Fei ECE 300 Advanced VLSI Design Fall 2006 Lecture.

Presentation transcript:

Digital Integrated Circuits 2e: Chapter Copyright © 2002 Prentice Hall PTR, Adapted by Yunsi Fei
ECE 300 Advanced VLSI Design, Fall 2006
Lecture 17: Datapath Design & Adders
Yunsi Fei
[Adapted from Jan Rabaey et al.'s Digital Integrated Circuits © 2002, PSU Irwin & Vijay © 2002, and Princeton Wayne Wolf's Modern VLSI Design © 2002]

Major Components of a Computer
[Figure: Processor (Control, Datapath), Memory, Devices (Input, Output)]
• Modern processor architecture styles
  – Pipelined, single issue (e.g., ARM)
  – Pipelined, hardware-controlled multiple issue: superscalar
  – Pipelined, software-controlled multiple issue: VLIW
  – Pipelined, multiple issue from multiple process threads: multithreaded

Basic Building Blocks
• Datapath
  – Execution units
    » Adder, multiplier, divider, shifter, etc.
  – Register file and pipeline registers
  – Multiplexers, decoders
• Control
  – Finite state machines (PLA, ROM, random logic)
• Interconnect
  – Switches, arbiters, buses
• Memory
  – Caches, TLBs, DRAM, buffers

MIPS 5-Stage Pipelined (Single Issue) Datapath
[Figure: stages Fetch, Decode, Execute, Memory, WriteBack separated by pipeline stage isolation registers; signals clk, Icache precharge, Dcache precharge, RegWrite]

Datapath Bit-Sliced Organization
• Tile identical bit-slice elements
[Figure: bit slices Bit 0 to Bit 3 with axes labeled Control Flow and Data Flow; blocks: Pipeline Register (from I$), Register File, Pipeline Register, Adder, Shifter, Pipeline Register, Multiplexer, Pipeline Register (to/from D$)]

Adders
• Carry-ripple
• Manchester carry chain
• Carry skip
• Carry select
• Carry look ahead

The 1-bit Binary Adder
[Figure: 1-bit Full Adder (FA) with inputs A, B, C_in and outputs S, C_out]

$S = A \oplus B \oplus C_{in}$
$C_{out} = A \& B \mid A \& C_{in} \mid B \& C_{in}$ (majority function)

A B C_in | C_out S | carry status
0 0 0    | 0     0 | kill
0 0 1    | 0     1 | kill
0 1 0    | 0     1 | propagate
0 1 1    | 1     0 | propagate
1 0 0    | 0     1 | propagate
1 0 1    | 1     0 | propagate
1 1 0    | 1     0 | generate
1 1 1    | 1     1 | generate

$G = A \& B$, $P = A \oplus B$, $K = !A \& !B$
$S = P \oplus C_{in}$
$C_{out} = G \mid P \& C_{in}$

• How can we use it to build a 64-bit adder?
• How can we modify it easily to build an adder/subtractor?
• How can we make it better (faster, lower power, smaller)?
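
The full-adder equations on this slide can be checked exhaustively. The following is a minimal Python sketch; the function name `full_adder` and the status strings are illustrative assumptions, not part of the slides. It derives S and C_out from generate, propagate and kill, and compares them with the direct sum and majority forms.

```python
from itertools import product

def full_adder(a, b, cin):
    """Return (s, cout, status) for one bit position."""
    g = a & b              # generate: carry out is 1 regardless of cin
    p = a ^ b              # propagate: carry out equals cin
    k = (1 - a) & (1 - b)  # kill: carry out is 0 regardless of cin
    assert g + p + k == 1  # exactly one of the three conditions holds
    s = p ^ cin
    cout = g | (p & cin)
    status = "generate" if g else "propagate" if p else "kill"
    return s, cout, status

for a, b, cin in product((0, 1), repeat=3):
    s, cout, status = full_adder(a, b, cin)
    # Cross-check against the direct definitions.
    assert s == (a + b + cin) % 2
    assert cout == (a & b) | (a & cin) | (b & cin)
    print(a, b, cin, cout, s, status)
```

The loop prints one truth-table row per input combination, in the column order A, B, C_in, C_out, S, carry status.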

FA Gate Level Implementations
[Figure: two gate-level full-adder implementations with inputs A, B, C_in, outputs S, C_out, and internal nodes t0, t1, t2]

Review: XOR FA
[Figure: XOR-based full adder with inputs A, B, C_in and outputs S, C_out; 16 transistors]

Review: CPL FA
[Figure: complementary pass-transistor logic full adder with dual-rail inputs A, !A, B, !B, C_in, !C_in and outputs S, !S, C_out, !C_out]
20+8 transistors, dual rail; beware of threshold drops

Delay Balanced FA
• Identical delays for carry and sum
[Figure: three sections labeled signal set-up (forms P and !P from A, B, !B), carry generation (!C_out), and sum generation (S); 20+2 transistors]

Review: Mirror Adder
[Figure: mirror adder schematic with inputs A, B, C_in and outputs !C_out, !S; paths labeled kill, generate, 0-propagate, 1-propagate; 24+4 transistors]

$C_{out} = A \& B \mid B \& C_{in} \mid A \& C_{in}$
$SUM = A \& B \& C_{in} \mid \,!C_{out} \& (A \mid B \mid C_{in})$

Sizing: Each input in the carry circuit has a logical effort of 2, so the optimal fan-out for each is also 2. Since !C_out drives 2 internal and 2 inverter transistor gates (to form C_in for the next-most-significant-bit adder), the carry circuit should be oversized. PMOS/NMOS ratio of 2.
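
The mirror adder computes the carry first and then reuses it to form the sum. A small Python check, assuming the sum term uses the complemented carry !C_out as in the standard mirror adder (the helper name `mirror_adder` is hypothetical), confirms that this form matches binary addition for all inputs.

```python
from itertools import product

def mirror_adder(a, b, cin):
    """Carry first, then the sum built from the inverted carry."""
    cout = (a & b) | (b & cin) | (a & cin)
    not_cout = 1 - cout
    s = (a & b & cin) | (not_cout & (a | b | cin))
    return s, cout

for a, b, cin in product((0, 1), repeat=3):
    s, cout = mirror_adder(a, b, cin)
    # The two output bits together must encode a + b + cin.
    assert 2 * cout + s == a + b + cin
```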

Mirror Adder Features
• The NMOS and PMOS chains are completely symmetrical, with a maximum of two series transistors in the carry circuitry, guaranteeing identical rise and fall transitions if the NMOS and PMOS devices are properly sized.
• When laying out the cell, the most critical issue is minimizing the capacitances at node !C_out (four diffusion capacitances, two internal gate capacitances, and two inverter gate capacitances). Shared diffusions can reduce the stack node capacitances.
• The transistors connected to C_in are placed closest to the output.
• Only the transistors in the carry stage have to be optimized for speed. All transistors in the sum stage can be minimal size.

A 64-bit Adder/Subtractor
[Figure: chain of 64 1-bit FAs with inputs A_0..A_63 and B_0..B_63, outputs S_0..S_63, carries from C_0 = C_in to C_64 = C_out, and an add/subt control input]
• Ripple Carry Adder (RCA) built out of 64 FAs
• Subtraction: complement all subtrahend bits (XOR gates) and set the low-order carry-in
• RCA advantage: simple logic, so small (low cost)
• RCA disadvantage: slow (O(N) for N bits) and lots of glitching (so lots of energy consumption)
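
A bit-level model makes the adder/subtractor wiring concrete. This is a minimal sketch, not a circuit description: the function name `ripple_add_sub` and its parameters are assumptions for illustration. The add/subt control both complements each B bit through an XOR and sets the low-order carry-in.

```python
def ripple_add_sub(a, b, subtract=False, n=64):
    """N-bit ripple-carry add/subtract; returns (result, carry_out)."""
    ctrl = 1 if subtract else 0
    carry = ctrl                      # low-order carry-in is set when subtracting
    result = 0
    for i in range(n):                # the carry ripples from bit 0 to bit n-1
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ ctrl    # XOR gate complements B when subtracting
        s = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        result |= s << i
    return result, carry

print(ripple_add_sub(5, 3))                  # (8, 0)
print(ripple_add_sub(5, 3, subtract=True))   # (2, 1)
```

The loop runs once per bit, which mirrors the O(N) worst-case delay of the hardware carry chain.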

Ripple Carry Adder (RCA)
[Figure: four FAs in a chain with inputs A_0..A_3, B_0..B_3, outputs S_0..S_3, C_0 = C_in and C_out = C_4]
• T = O(N) worst-case delay
• $T_{adder} \approx T_{FA}(A,B \to C_{out}) + (N-2)\,T_{FA}(C_{in} \to C_{out}) + T_{FA}(C_{in} \to S)$
• Real goal: make the fastest possible carry path

Inversion Property
[Figure: an FA with inputs A, B, C_in next to the same FA with all inputs and outputs inverted]
• $!S(A, B, C_{in}) = S(!A, !B, !C_{in})$
• $!C_{out}(A, B, C_{in}) = C_{out}(!A, !B, !C_{in})$
• Inverting all inputs to a FA results in inverted values for all outputs
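
The inversion property is easy to confirm by enumeration. A short Python sketch, with `fa` as an assumed helper name:

```python
from itertools import product

def fa(a, b, cin):
    """Plain full adder: returns (s, cout)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)

for a, b, cin in product((0, 1), repeat=3):
    s, cout = fa(a, b, cin)
    s_inv, cout_inv = fa(1 - a, 1 - b, 1 - cin)
    # Inverting every input inverts both outputs.
    assert s_inv == 1 - s
    assert cout_inv == 1 - cout
```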

Exploiting the Inversion Property
[Figure: 4-bit ripple chain alternating FA and FA' cells, with inputs A_0..A_3, B_0..B_3, outputs S_0..S_3, C_0 = C_in and C_out = C_4]
• Now need two "flavors" of FAs: a regular cell and an inverted cell
• Minimizes the critical path (the carry chain) by eliminating inverters between the FAs

Fast Carry Chain Design
• The key to fast addition is a low-latency carry network
• What matters is whether, in a given position, a carry is
  – generated: $G_i = A_i \& B_i = A_i B_i$
  – propagated: $P_i = A_i \oplus B_i$ (sometimes $A_i \mid B_i$ is used)
  – annihilated (killed): $K_i = !A_i \& !B_i$
• Giving a carry recurrence of $C_{i+1} = G_i \mid P_i C_i$
  $C_1 = G_0 \mid P_0 C_0$
  $C_2 = G_1 \mid P_1 G_0 \mid P_1 P_0 C_0$
  $C_3 = G_2 \mid P_2 G_1 \mid P_2 P_1 G_0 \mid P_2 P_1 P_0 C_0$
  $C_4 = G_3 \mid P_3 G_2 \mid P_3 P_2 G_1 \mid P_3 P_2 P_1 G_0 \mid P_3 P_2 P_1 P_0 C_0$
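
The expanded expressions follow from substituting the recurrence into itself. The sketch below, with hypothetical helper names `carries` and `c4_flat`, compares the step-by-step recurrence with the fully expanded C_4 for every 4-bit input pair.

```python
from itertools import product

def carries(a_bits, b_bits, c0):
    """Carries C1..CN from C[i+1] = G[i] | P[i] & C[i]; lists are LSB first."""
    c = [c0]
    for a, b in zip(a_bits, b_bits):
        g, p = a & b, a ^ b
        c.append(g | (p & c[-1]))
    return c[1:]

def c4_flat(g, p, c0):
    """Fully expanded C4 for 4-element g and p lists (index 0 = LSB)."""
    return (g[3]
            | p[3] & g[2]
            | p[3] & p[2] & g[1]
            | p[3] & p[2] & p[1] & g[0]
            | p[3] & p[2] & p[1] & p[0] & c0)

for bits in product((0, 1), repeat=9):
    a, b, c0 = list(bits[0:4]), list(bits[4:8]), bits[8]
    g = [x & y for x, y in zip(a, b)]
    p = [x ^ y for x, y in zip(a, b)]
    # The flattened two-level form equals the rippled recurrence.
    assert c4_flat(g, p, c0) == carries(a, b, c0)[3]
```

The flattened form removes the serial dependency between carries at the cost of wider gates.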

Manchester Carry Chain
• Switches controlled by G_i and P_i
• Total delay of
  – time to form the switch control signals G_i and P_i
  – setup time for the switches
  – signal propagation delay through N switches in the worst case
[Figure: one chain stage with inputs G_i, P_i, clk and nodes !C_i, !C_{i+1}]

4-bit Sliced MCC Adder
[Figure: four Manchester carry chain slices driven by clk; each slice forms G and P from A_i and B_i and produces S_i; chain nodes !C_0 through !C_4]

Domino Manchester Carry Chain Circuit
[Figure: domino chain with carry input C_{i,0}, clock clk, and controls P_0..P_3, G_0..G_3. Successive chain nodes evaluate to:]
  $!(G_0 \mid P_0 C_{i,0})$
  $!(G_1 \mid P_1 G_0 \mid P_1 P_0 C_{i,0})$
  $!(G_2 \mid P_2 G_1 \mid P_2 P_1 G_0 \mid P_2 P_1 P_0 C_{i,0})$
  $!(G_3 \mid P_3 G_2 \mid P_3 P_2 G_1 \mid P_3 P_2 P_1 G_0 \mid P_3 P_2 P_1 P_0 C_{i,0})$

Binary Adder Landscape
[Figure: taxonomy of synchronous word parallel adders: ripple carry adders (RCA), carry prop min adders, signed-digit adders, fast carry prop adders, residue adders. Fast carry prop variants: Manchester carry chain, carry select, parallel prefix, conditional sum, carry skip. Complexity annotations in the figure: T = O(N), A = O(N); T = O(1), A = O(N); T = O(log N), A = O(N log N); T = O(√N), A = O(N); T = O(N), A = O(N).]

Carry-Skip (Carry-Bypass) Adder
[Figure: 4-bit FA chain with inputs A_0..A_3, B_0..B_3, outputs S_0..S_3, carry-in C_{i,0} and carry-out C_{o,3}]
• $BP = P_0 P_1 P_2 P_3$ ("Block Propagate")
• If $P_0 \& P_1 \& P_2 \& P_3 = 1$ then $C_{o,3} = C_{i,0}$; otherwise the block itself kills or generates the carry internally

Carry-Skip Chain Implementation
[Figure: carry chain with block carry-in C_in, controls P_0..P_3 and G_0..G_3, chain output !C_out, and a bypass switch controlled by BP that produces the block carry-out]

4-bit Block Carry-Skip Adder
[Figure: four blocks (bits 0 to 3, bits 4 to 7, bits 8 to 11, bits 12 to 15), each with Setup, Carry Propagation, and Sum stages; carry-in C_{i,0}]
• Worst-case delay: carry from bit 0 to bit 15 = carry generated in bit 0, ripples through bits 1, 2, and 3, skips the middle two groups (B is the group size in bits), ripples in the last group from bit 12 to bit 15
• $T_{add} = t_{setup} + B\,t_{carry} + ((N/B) - 1)\,t_{skip} + B\,t_{carry} + t_{sum}$

Optimal Block Size and Time
• Assuming one stage of ripple ($t_{carry}$) has the same delay as one skip logic stage ($t_{skip}$) and both are 1:
  $T_{CSkA} = 1 + B + (N/B - 1) + B + 1 = 2B + N/B + 1$
  (terms in order: $t_{setup}$, ripple in block 0, skips, ripple in last block, $t_{sum}$)
• So the optimal block size $B$ follows from $dT_{CSkA}/dB = 0$, giving $B_{opt} = \sqrt{N/2}$
• And the optimal time is $T_{CSkA,opt} = 2\sqrt{2N} + 1$
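
Setting the derivative $2 - N/B^2$ to zero gives the optimum, and substituting back gives the optimal time. A numeric sketch under the slide's unit-delay assumption (function names `t_cska` and `optimal_block` are illustrative) confirms both for N = 32:

```python
import math

def t_cska(n, b):
    """Carry-skip delay in unit delays, using the model 2B + N/B + 1."""
    return 2 * b + n / b + 1

def optimal_block(n):
    """Closed-form optimum: B_opt = sqrt(N/2), T_opt = 2*sqrt(2N) + 1."""
    return math.sqrt(n / 2), 2 * math.sqrt(2 * n) + 1

n = 32
b_opt, t_opt = optimal_block(n)
print(b_opt, t_opt)                 # 4.0 17.0
# No integer block size does better than the closed-form optimum.
assert all(t_cska(n, b) >= t_opt for b in range(1, n + 1))
```

For word lengths where $\sqrt{N/2}$ is not an integer, the nearest practical block size would be chosen instead.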

Carry-Skip Adder Extensions
• Variable block sizes
  – A carry that is generated in, or absorbed by, one of the inner blocks travels a shorter distance through the skip blocks, so the inner blocks can be bigger without increasing the overall delay
• Multiple levels of skip logic
  – The level 2 skip signal is the AND of the first-level skip signals (BPs)
[Figure: C_in to C_out chains showing variable block sizes and skip level 1 / skip level 2]

Carry-Skip Adder Comparisons
[Figure: comparison curves for block sizes B = 2, 3, 4, 5, 6]

Carry Select Adder
[Figure: 4-bit block with Setup (A's and B's in, P's and G's out), "0" carry propagation, "1" carry propagation, multiplexer selected by C_in producing C_out and the C's, and Sum generation producing the S's]
• Precompute the carry out of each block for both carry_in = 0 and carry_in = 1 (can be done for all blocks in parallel) and then select the correct one

Carry Select Adder: Critical Path
[Figure: four carry-select blocks (bits 0 to 3, bits 4 to 7, bits 8 to 11, bits 12 to 15), each with Setup, "0" carry, "1" carry, mux, and Sum gen stages; C_in enters the first mux and C_out leaves the last]

Square Root Carry Select Adder
[Figure: carry-select blocks of increasing size (bits 0 to 1, bits 2 to 4, bits 5 to 8, bits 9 to 13, bits 14 onward), each with Setup, "0" carry, "1" carry, mux, and Sum gen stages]
• $T_{add} = t_{setup} + 2\,t_{carry} + \sqrt{N}\,t_{mux} + t_{sum}$

Parallel Prefix Adders (PPAs)
• Define the carry operator $\circ$ on (G,P) signal pairs:
  $(G,P) = (G'',P'') \circ (G',P')$ where $G = G'' \mid P''G'$ and $P = P''P'$
• $\circ$ is associative, i.e.,
  $[(g''',p''') \circ (g'',p'')] \circ (g',p') = (g''',p''') \circ [(g'',p'') \circ (g',p')]$
[Figure: carry operator cell with inputs (G'',P'') and (G',P') and output (G,P), and its gate-level implementation]
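
Associativity of the carry operator can be verified by brute force over all (G,P) pairs. In this sketch the function name `carry_op` is an assumption; the left operand is the more significant pair.

```python
from itertools import product

def carry_op(left, right):
    """Combine (G'', P'') with (G', P'); left is the more significant pair."""
    g2, p2 = left
    g1, p1 = right
    return g2 | (p2 & g1), p2 & p1

pairs = list(product((0, 1), repeat=2))
for x, y, z in product(pairs, repeat=3):
    # Grouping does not matter, so the prefixes can be formed as a tree.
    assert carry_op(carry_op(x, y), z) == carry_op(x, carry_op(y, z))

# Operand order does matter: kill above generate differs from generate above kill.
kill, generate = (0, 0), (1, 0)
assert carry_op(kill, generate) == (0, 0)
assert carry_op(generate, kill) == (1, 0)
```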

PPA General Structure
• Given P and G terms for each bit position, computing all the carries is equal to finding all the prefixes in parallel:
  $(G_0,P_0) \circ (G_1,P_1) \circ (G_2,P_2) \circ \dots \circ (G_{N-2},P_{N-2}) \circ (G_{N-1},P_{N-1})$
• Since $\circ$ is associative, we can group them in any order
  – but note that it is not commutative
• Measures to consider
  – number of $\circ$ cells
  – tree cell depth (time)
  – tree cell area
  – cell fan-in and fan-out
  – max wiring length
  – wiring congestion
  – delay path variation (glitching)
[Figure: three layers: P_i, G_i logic (1 unit delay); C_i parallel prefix logic tree (1 unit delay per level); S_i logic (1 unit delay)]

Brent-Kung PPA
[Figure: 16-bit Brent-Kung parallel prefix computation with inputs (G_0,P_0) through (G_15,P_15) and C_in, and outputs C_1 through C_16; figure annotations: T = log_2 N, T = log_2 N - 2, A = 2 log_2 N, A = N/2]

Kogge-Stone PPF Adder
[Figure: 16-bit Kogge-Stone parallel prefix computation with inputs (G_0,P_0) through (G_15,P_15) and C_in, and outputs C_1 through C_16; figure annotations: T = log_2 N, A = log_2 N, A = N]
• $T_{add} = t_{setup} + \log_2 N \cdot t_{\circ} + t_{sum}$
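
The Kogge-Stone structure can be modeled as log2(N) passes in which every position combines with the position a power-of-two distance below it. This is a behavioral sketch only; the function name `kogge_stone_carries`, and the choice to fold C_in into bit 0, are assumptions made for illustration.

```python
def kogge_stone_carries(a, b, cin=0, n=16):
    """Return ([C1..Cn], levels) using a Kogge-Stone prefix tree."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    # Fold the carry-in into bit 0 so the prefix generate at i equals C[i+1].
    g[0] |= p[0] & cin
    dist, levels = 1, 0
    while dist < n:
        new_g, new_p = g[:], p[:]
        for i in range(dist, n):
            # Carry operator: upper pair (g[i], p[i]) combined with lower pair.
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2
        levels += 1
    return g, levels

carries, levels = kogge_stone_carries(0xFFFF, 0x0001)
print(carries, levels)   # sixteen 1s, 4 levels
```

For N = 16 the loop runs four times, matching the log2(N) levels in the delay expression.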

More Adder Comparisons
[Figure]

Adder Speed Comparisons
[Figure]

Adder Average Power Comparisons
[Figure]

PDP of Adder Comparisons
[Figure; from Nagendra, 1996]

Topics
• Adders and ALUs (§6.4, §6.5)
  – Carry-ripple
  – Carry look ahead
  – Manchester carry chain
  – Carry skip
  – Carry select
• Multipliers (§6.6)
• Subsystem design principles (§6.2)

Adders
• 1-bit full adder
  – $S_i = a_i \oplus b_i \oplus c_i$
  – $c_{i+1} = a_i b_i + a_i c_i + b_i c_i$
• Carry-ripple adder
  – n-bit adder built from full adders
• Adder delay is dominated by carry chain
  – Naming: Carry- … adder

1-bit Full Adder: the Mirror Adder
[Figure: mirror adder schematic with inputs A, B, C_i, supply V_DD, and outputs S, C_o; 24 transistors]

Carry-lookahead Adder
• First compute carry propagate, generate:
  – $P_i = a_i + b_i$
  – $G_i = a_i b_i$
• Compute sum and carry from P and G:
  – $S_i = c_i \oplus P_i \oplus G_i = a_i \oplus b_i \oplus c_i$
  – $c_{i+1} = G_i + P_i c_i = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \dots + P_i \cdots P_j c_j$

Depth-4 Carry-lookahead
• $C_1 = G_0 + P_0 C_{in}$
• $C_2 = G_1 + P_1 G_0 + P_1 P_0 C_{in}$
• $C_3 = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{in}$
• $C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{in}$

Analysis
• Deepest carry expansion requires gates with large fan-in: large, slow
  – Generally use 4-bit groups
  – Domino logic implementation
• Carry look ahead tree
  – $C_4 = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_{in}$
    » $G^* = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0$
    » $P^* = P_3 P_2 P_1 P_0$
    » $C_4 = G^* + P^* C_{in}$
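
Group generate and propagate let each 4-bit group be treated like a single bit one level up the tree. The sketch below uses hypothetical names (`group_gp`, `group_carries`) and assumes the inclusive propagate P_i = a_i + b_i; it computes only the carries at the group boundaries.

```python
def group_gp(g, p):
    """Group generate G* and propagate P* for a 4-bit group (index 0 = LSB)."""
    g_star = (g[3]
              | p[3] & g[2]
              | p[3] & p[2] & g[1]
              | p[3] & p[2] & p[1] & g[0])
    p_star = p[3] & p[2] & p[1] & p[0]
    return g_star, p_star

def group_carries(a, b, cin=0, n=16):
    """Carries at 4-bit group boundaries: [C4, C8, ...]."""
    out, c = [], cin
    for base in range(0, n, 4):
        g = [(a >> (base + i)) & (b >> (base + i)) & 1 for i in range(4)]
        p = [((a >> (base + i)) | (b >> (base + i))) & 1 for i in range(4)]
        g_star, p_star = group_gp(g, p)
        c = g_star | (p_star & c)     # C_out = G* + P* C_in
        out.append(c)
    return out

print(group_carries(0x0FFF, 0x0001))   # [1, 1, 1, 0]
```

Here the group carries are still chained serially for brevity; a full lookahead tree would apply the same G*, P* construction again across groups.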

Manchester Carry Chain Circuit
[Figure: two chain stages, stage i-1 (controls G_{i-1}, P_{i-1}) and stage i (controls G_i, P_i), connecting carry nodes C_{i-1}, C_i, and C_{i+1}]

Manchester Carry Chain
• Precharge/evaluate carry chain
• Principles
  – If $G_i = a_i b_i = 1$, $P_i = a_i \oplus b_i = 0$: $C_{i+1} = 1$
  – If $G_i = a_i b_i = 0$, $P_i = a_i \oplus b_i = 0$: $C_{i+1} = 0$
  – If $G_i = a_i b_i = 0$, $P_i = a_i \oplus b_i = 1$: $C_{i+1} = C_i$
• Worst-case discharge path goes through the entire carry chain.

Carry-skip Adder
• For m-bit addition, its C_out can be
  – Inherited from C_in
    » $a_i \oplus b_i = 1$ for every bit in the stage
  – Generated locally within the m bits
    » i.e., the C_out when C_in = 0
• Optimum group size: $m = \sqrt{n/2}$
• Longest path:
  – Similar to Manchester chain

Two-bit Carry-skip Structure
[Figure: two-bit skip logic with inputs a_i, b_i, a_{i+1}, b_{i+1}, C_i and output C_{i+2}, built with gates or using a mux]
• Locally generated carry term: $a_{i+1} b_{i+1} + (a_{i+1} + b_{i+1})\, a_i b_i$

Carry-skip Group Structure
[Figure: chained M-bit FA groups]

Carry-select Adder
• Computes two results in parallel, each for a different carry input assumption.
• Uses actual carry in to select correct result.
• Reduces delay to multiplexer.

Carry-select Structure
[Figure]

DEC "alpha" Adder
• 64-bit adder, 0.75 µm technology, 5 ns delay
• On the 8-bit level: Manchester chain
• On the 32-bit sub-block: Carry look ahead
• On the 64-bit block: Carry select

Serial Adder
• May be used in signal-processing arithmetic where fast computation is important but latency is unimportant.
• Data format (LSB first): bit 0, bit 1, bit 2, bit 3, ...

Serial Adder Structure
• The LSB control signal clears the carry shift register.
[Figure: full adder with a carry register fed back to its carry input]
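
A cycle-by-cycle model shows why the carry register must be cleared at each word boundary. This is a behavioral sketch with assumed names (`serial_add`, `lsb_flags`): one bit of each operand arrives per cycle, LSB first, and the flag stream marks the first bit of every word.

```python
def serial_add(a_bits, b_bits, lsb_flags):
    """Bit-serial addition of LSB-first streams; returns the sum stream."""
    carry = 0
    out = []
    for a, b, lsb in zip(a_bits, b_bits, lsb_flags):
        if lsb:
            carry = 0                 # LSB control clears the stored carry
        out.append(a ^ b ^ carry)
        carry = (a & b) | (a & carry) | (b & carry)
    return out

# Two 4-bit words back to back: 15 + 1 (overflows), then 1 + 1.
a = [1, 1, 1, 1,  1, 0, 0, 0]
b = [1, 0, 0, 0,  1, 0, 0, 0]
lsb = [1, 0, 0, 0,  1, 0, 0, 0]
print(serial_add(a, b, lsb))   # [0, 0, 0, 0, 0, 1, 0, 0]
```

Without the clear, the carry left over from the first word would corrupt the low bit of the second.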

Subtraction
• $a - b = a + (-b)$
• For an n-bit number b, how do we get its two's complement?
  – $(-b) = \bar{b} + 1$
  – $a + (-b) = a + \bar{b} + 1$
    » Using "1" as the carry-in to avoid two additions

ALUs
• An ALU computes a variety of logical and arithmetic functions based on the opcode.
  – Shift
    » Arithmetic/logical shift left, shift right
  – Logic operations
    » AND, OR, NOT, …
  – Add/subtract
    » Signed/unsigned, …

Opcodes
• The control bits that determine the datapath
  – Whether it is a shift, add, subtract …
• Must be carefully designed to ease decoding
  – Use decoder/de-multiplexer to select the correct datapath
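
As a software analogy for opcode decoding, a lookup table can play the role of the decoder that selects one datapath function. The encoding below is entirely hypothetical (the slides do not define one), as are the 32-bit width and the names `OPS` and `alu`.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit datapath

# Hypothetical opcode encoding, chosen only for illustration.
OPS = {
    0b000: lambda a, b: a & b,                   # AND
    0b001: lambda a, b: a | b,                   # OR
    0b010: lambda a, b: ~a,                      # NOT (b ignored)
    0b011: lambda a, b: a + b,                   # add
    0b100: lambda a, b: a + (~b & MASK) + 1,     # subtract: a + !b + 1
    0b101: lambda a, b: a << (b & 31),           # logical shift left
    0b110: lambda a, b: a >> (b & 31),           # logical shift right
}

def alu(opcode, a, b=0):
    """Select the operation by opcode and truncate the result to 32 bits."""
    return OPS[opcode](a & MASK, b & MASK) & MASK

print(alu(0b011, 2, 3))        # 5
print(hex(alu(0b100, 2, 3)))   # 0xffffffff
```

Subtraction reuses the adder path with complemented b and a carry-in of 1, so only one addition is performed.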

An ALU Adder Structure
[Figure]