1 Lecture 4: Arithmetic for Computers (Part 3) CS 447 Jason Bakos.



2 Chapter 4 Review
So far, we’ve covered the following topics for this chapter:
– Binary representation of signed integers
    16-bit to 32-bit signed conversion
– Binary addition/subtraction
    Overflow detection/overflow exception handling
– Shift and logical operations
– Parts of the CPU
– AND, OR, XOR, and inverter gates
– Multiplexor (mux) and full adder
– Sum-of-products logic equations (truth tables)
– Logic minimization techniques
    Don’t cares and Karnaugh maps

3 1-bit ALU Design
A 1-bit ALU can be constructed
– Components
    AND, OR, and adder
    4-to-1 mux
    “Binverter” (inverter and 2-to-1 mux)
– Interface
    Inputs: A, B, Binvert, Operation (2 bits), CarryIn, and Less
    Outputs: CarryOut and Result
– Digital functions are performed in parallel and the outputs are routed into a mux
    The mux also accepts a Less input, which is supplied from outside the 1-bit ALU
– The select lines of the mux make up the “Operation” input to the ALU
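The 1-bit ALU described above can be sketched in Python as a rough behavioral model. The function name `one_bit_alu` and the ordering of the mux inputs (0 = AND, 1 = OR, 2 = adder, 3 = Less) are assumptions chosen to match the Operation codes used later in the lecture, not something the slide specifies.

```python
def one_bit_alu(a, b, binvert, operation, carry_in, less):
    """Behavioral sketch of one ALU bit slice (hypothetical helper).

    a, b, binvert, carry_in, less are single bits (0 or 1);
    operation is the 2-bit mux select (0..3).
    Returns (result, carry_out).
    """
    b_eff = b ^ binvert                  # "Binverter": passes b, or NOT b when binvert is 1
    and_out = a & b_eff                  # all functions are computed in parallel...
    or_out = a | b_eff
    sum_out = a ^ b_eff ^ carry_in       # full adder sum
    carry_out = (a & b_eff) | (a & carry_in) | (b_eff & carry_in)
    # ...and the 4-to-1 mux picks one of them (assumed input order)
    result = (and_out, or_out, sum_out, less)[operation]
    return result, carry_out
```

For example, `one_bit_alu(1, 1, 0, 2, 0, 0)` selects the adder output for 1 + 1 with no carry in.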

4 32-bit ALU
To create a multi-bit ALU, array 32 1-bit ALUs
– Connect the CarryOut of each bit to the CarryIn of the next bit
– A and B of each 1-bit ALU are connected to successive bits of the 32-bit A and B
– The Result outputs of the 1-bit ALUs form the 32-bit result
We need to add an SLT unit and connect its output to the least significant 1-bit ALU’s Less input
– Hardwire the other “Less” inputs to 0
We need to add an Overflow unit
We need to add a Zero detection unit

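5 SLT Unit
To compute SLT, we need to make sure that when the 1-bit ALU’s Operation is set to 11, a subtract operation (A-B) is also being computed
– With this happening, the SLT unit can compute Less based on the MSB (sign) of A, B, and Result
– When the signs of A and B differ, Less is simply the sign of A; when they match, the subtraction cannot overflow and Less is the sign of Result

Asign  Bsign  Rsign  Less
  0      0      0     0
  0      0      1     1
  0      1      X     0
  1      0      X     1
  1      1      0     0
  1      1      1     1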
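As a sanity check on the sign-based rule, here is a small Python sketch. The helper name `slt_less` and the 4-bit width used for the exhaustive check are assumptions made for illustration; the rule itself is the usual two's-complement comparison (differing signs: A is less exactly when A is negative; equal signs: the sign of A-B decides).

```python
def slt_less(a_sign, b_sign, r_sign):
    """Less output from the sign bits of A, B, and Result = A - B (hypothetical helper)."""
    if a_sign != b_sign:
        return a_sign   # signs differ: A < B exactly when A is the negative one
    return r_sign       # signs match: A - B cannot overflow, so its sign decides

# Exhaustive check over all 4-bit two's-complement pairs (width is an assumption)
BITS = 4
errors = []
for a in range(-8, 8):
    for b in range(-8, 8):
        r = (a - b) & ((1 << BITS) - 1)          # subtraction result, truncated to 4 bits
        signs = (int(a < 0), int(b < 0), (r >> (BITS - 1)) & 1)
        if slt_less(*signs) != int(a < b):
            errors.append((a, b))
```

After the loop, `errors` is empty if the sign rule agrees with an ordinary signed comparison for every pair.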

6 Overflow Unit
When doing signed arithmetic, overflow occurs in the cases listed in this table, as we covered previously…
How do we implement this in hardware?

Operation  Operand A  Operand B  Result
A+B        Positive   Positive   Negative
A+B        Negative   Negative   Positive
A-B        Positive   Negative   Negative
A-B        Negative   Positive   Positive

7 Overflow Unit
We need a truth table…
Since we’ll be computing the logic equation with SOP, we only need the rows where the output is 1

Operation  A(31)  B(31)  R(31)  Overflow
010 (add)    0      0      1       1
010 (add)    1      1      0       1
110 (sub)    0      1      1       1
110 (sub)    1      0      0       1
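The four rows translate directly into a sum-of-products expression with one product term per row. The sketch below is an illustration only: the function name `overflow` and the use of a single `is_sub` bit (0 for add, 1 for subtract) in place of the full 3-bit operation code are assumptions.

```python
def overflow(is_sub, a_msb, b_msb, r_msb):
    """SOP overflow detector: one AND term per truth-table row, ORed together."""
    is_add = 1 - is_sub
    na, nb, nr = 1 - a_msb, 1 - b_msb, 1 - r_msb   # inverted inputs
    return ((is_add & na & nb & r_msb)       # add: pos + pos gave neg
            | (is_add & a_msb & b_msb & nr)  # add: neg + neg gave pos
            | (is_sub & na & b_msb & r_msb)  # sub: pos - neg gave neg
            | (is_sub & a_msb & nb & nr))    # sub: neg - pos gave pos

# Cross-check against real arithmetic on 4-bit signed values (width is an assumption)
errors = []
for a in range(-8, 8):
    for b in range(-8, 8):
        for is_sub in (0, 1):
            exact = a - b if is_sub else a + b
            r_msb = ((exact & 0xF) >> 3) & 1
            expected = int(not -8 <= exact <= 7)
            if overflow(is_sub, int(a < 0), int(b < 0), r_msb) != expected:
                errors.append((a, b, is_sub))
```

An empty `errors` list means the four product terms flag exactly the cases where the true result does not fit in the signed range.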

8 Zero Detection Unit
“Or” together all the 1-bit ALU outputs and invert the result (a 32-input NOR). The inverted value is the Zero output of the ALU: it is 1 only when every Result bit is 0

9 32-bit ALU Operation
We need a 3-bit ALU Operation input into our 32-bit ALU
The two least significant bits can be routed into all the 1-bit ALUs internally
The most significant bit can be routed into the least significant 1-bit ALU’s CarryIn, and to Binvert of all the 1-bit ALUs

10 32-bit ALU Operation
Here’s the final ALU Operation table:

ALU Operation  Function
000            and
001            or
010            add
110            subtract
111            set on less than

11 32-bit ALU
In the end, our ALU will have the following interface:
– Inputs:
    A and B (32 bits each)
    ALU Operation (3 bits)
– Outputs:
    CarryOut (1 bit)
    Zero (1 bit)
    Result (32 bits)
    Overflow (1 bit)
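Putting the pieces together, the following Python sketch models the whole 32-bit ALU at the behavioral level. It is a rough illustration rather than the lecture's design: the name `alu32`, the dictionary return value, and the choice to report Overflow only for add and subtract are assumptions. The overflow test used here (the operands entering the adder have the same sign and the sum has the opposite sign) covers the same four cases as the overflow truth table.

```python
WIDTH = 32

def alu32(a, b, alu_op):
    """Behavioral sketch of the 32-bit ALU (hypothetical helper).

    a, b are 32-bit patterns given as non-negative ints; alu_op is the 3-bit code.
    """
    binvert = (alu_op >> 2) & 1       # MSB of ALU Operation drives Binvert...
    operation = alu_op & 0b11         # ...the low two bits select the mux input
    carry = binvert                   # ...and the MSB is also CarryIn of bit 0
    result = 0
    sum_bit = 0
    for i in range(WIDTH):            # ripple through the 32 bit slices
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ binvert
        sum_bit = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        # Less is hardwired to 0 for every slice; bit 0 is patched in below
        result |= (ai & bi, ai | bi, sum_bit, 0)[operation] << i
    a_sign = (a >> (WIDTH - 1)) & 1
    b_sign = (b >> (WIDTH - 1)) & 1
    r_sign = sum_bit                  # adder output of the most significant slice
    if operation == 3:                # SLT unit feeds Less of the least significant slice
        result |= a_sign if a_sign != b_sign else r_sign
    overflow = 0
    if operation == 2:                # add or subtract
        overflow = int(a_sign == (b_sign ^ binvert) and r_sign != a_sign)
    return {"result": result, "carry_out": carry,
            "overflow": overflow, "zero": int(result == 0)}
```

For instance, `alu32(3, 5, 0b111)["result"]` evaluates set-on-less-than for 3 and 5, and `alu32(5, 5, 0b110)["zero"]` checks the Zero output after a subtraction.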

12 Carry Lookahead
The ripple-carry adder architecture we previously looked at requires n*2 gate delays to compute its result (worst case)
– The longest path that a digital signal must propagate through is called the “critical path”
– This is WAAAYYYY too slow!
There are other ways to build an adder that require only lg n delay
Obviously, using SOP, we can build a circuit that will compute ANY function in 2 gate delays (2 levels of logic)
– Just as obviously, in the case of a 64-input system, the resulting design will be too big and too complex

13 Carry Lookahead
For example, we can easily see that the CarryIn values for bits 1 and 2 are computed as:
– c1 = (a0b0) + (a0c0) + (b0c0)
– c2 = (a1b1) + (a1c1) + (b1c1)
Hardware executes in parallel, so using the following fast CarryIn computation, we can perform an add with 3 gate delays
– c2 = (a1b1) + (a1a0b0) + (a1a0c0) + (a1b0c0) + (b1a0b0) + (b1a0c0) + (b1b0c0)
I used the logical distributive law to compute this (substitute c1 into c2 and expand)
As you can see, the CarryIn logic gets bigger and bigger for consecutive bits
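The expansion can be checked by brute force. This short Python sketch (function names are made up for illustration) compares the two-step rippled form of c2 with the flattened seven-term form over all 32 input combinations.

```python
from itertools import product

def c2_ripple(a1, b1, a0, b0, c0):
    """c2 computed the ripple way: first c1, then c2 from c1."""
    c1 = (a0 & b0) | (a0 & c0) | (b0 & c0)
    return (a1 & b1) | (a1 & c1) | (b1 & c1)

def c2_flat(a1, b1, a0, b0, c0):
    """c2 as the flattened two-level sum of products."""
    return ((a1 & b1) | (a1 & a0 & b0) | (a1 & a0 & c0) | (a1 & b0 & c0)
            | (b1 & a0 & b0) | (b1 & a0 & c0) | (b1 & b0 & c0))

# Collect every input combination where the two forms disagree
mismatches = [bits for bits in product((0, 1), repeat=5)
              if c2_ripple(*bits) != c2_flat(*bits)]
```

If the distributive-law expansion is right, `mismatches` stays empty.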

14 Carry Lookahead
Carry Lookahead adders are faster than ripple-carry adders
Recall:
– c_{i+1} = (a_i b_i) + (a_i c_i) + (b_i c_i)
c_i can be factored out…
– c_{i+1} = (a_i b_i) + (a_i + b_i) c_i
So…
– c2 = (a1b1) + (a1+b1)((a0b0) + (a0+b0)c0)

15 Carry Lookahead
Note the repeated appearance of (a_i b_i) and (a_i + b_i)
They are called generate (g_i) and propagate (p_i)
– g_i = a_i b_i, p_i = a_i + b_i
– c_{i+1} = g_i + p_i c_i
– This means if g_i = 1, a CarryOut is generated
– If p_i = 1, a CarryOut is propagated from CarryIn
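A quick way to convince yourself that the generate/propagate form is the same function as the original carry equation is to enumerate all eight input combinations. The helper names below are illustrative only.

```python
from itertools import product

def carry_majority(a, b, c):
    """Original carry equation: (ab) + (ac) + (bc)."""
    return (a & b) | (a & c) | (b & c)

def carry_gp(a, b, c):
    """Same carry written with generate and propagate: g + p*c."""
    g = a & b      # generate: this bit produces a carry on its own
    p = a | b      # propagate: this bit passes an incoming carry along
    return g | (p & c)

# True when the two forms agree on every input combination
equivalent = all(carry_majority(a, b, c) == carry_gp(a, b, c)
                 for a, b, c in product((0, 1), repeat=3))
```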

16 Carry Lookahead
c1 = g0 + (p0c0)
c2 = g1 + (p1g0) + (p1p0c0)
c3 = g2 + (p2g1) + (p2p1g0) + (p2p1p0c0)
c4 = g3 + (p3g2) + (p3p2g1) + (p3p2p1g0) + (p3p2p1p0c0)
…
This system will give us an adder with 5 gate delays, but it is still too complex
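The four carry equations are enough to build a 4-bit adder in which no carry waits on another carry. The following Python sketch is a behavioral model under that assumption; the name `cla4` and its integer interface are invented for the example.

```python
def cla4(a, b, c0):
    """4-bit carry-lookahead add (hypothetical helper).

    a, b are 4-bit values (0..15), c0 is the carry in.
    Returns (4-bit sum, carry out c4).
    """
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]      # generate bits
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]    # propagate bits
    # Every carry depends only on g, p, and c0, never on another carry
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        # sum bit i is a XOR b XOR carry-in of bit i
        total |= (((a >> i) ^ (b >> i) ^ carries[i]) & 1) << i
    return total, c4
```

For example, `cla4(9, 7, 0)` adds 9 and 7, which wraps to a 4-bit sum of 0 with a carry out of 1.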

17 Carry Lookahead
To solve this, we’ll build our adder using 4-bit adders with carry lookahead, and connect them using “super”-propagate and generate logic
The superpropagate is only true if all the bits propagate a carry
– P0 = p0p1p2p3
– P1 = p4p5p6p7
– P2 = p8p9p10p11
– P3 = p12p13p14p15

18 Carry Lookahead
The supergenerate follows a similar equation:
G0 = g3 + (p3g2) + (p3p2g1) + (p3p2p1g0)
G1 = g7 + (p7g6) + (p7p6g5) + (p7p6p5g4)
G2 = g11 + (p11g10) + (p11p10g9) + (p11p10p9g8)
G3 = g15 + (p15g14) + (p15p14g13) + (p15p14p13g12)
The supergenerate and superpropagate logic for the four 4-bit Carry Lookahead adders is contained in a Carry Lookahead Unit
This yields a worst-case delay of 7 gate delays
– Reason?
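A behavioral sketch of the 16-bit arrangement follows. Two things here are assumptions not spelled out on the slides: the group carries C1 to C4 are formed from the super signals with the same pattern as c1 to c4 (the usual arrangement for a carry lookahead unit), and ordinary integer addition stands in for each 4-bit lookahead adder to keep the example short. The names `group_pg` and `cla16` are invented for illustration.

```python
def group_pg(a4, b4):
    """Superpropagate P and supergenerate G for one 4-bit group."""
    g = [(a4 >> i) & (b4 >> i) & 1 for i in range(4)]
    p = [((a4 >> i) | (b4 >> i)) & 1 for i in range(4)]
    P = p[0] & p[1] & p[2] & p[3]                      # all four bits propagate
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return P, G

def cla16(a, b, c0):
    """16-bit add from four 4-bit groups plus a carry lookahead unit (sketch)."""
    a4 = [(a >> (4 * k)) & 0xF for k in range(4)]
    b4 = [(b >> (4 * k)) & 0xF for k in range(4)]
    P, G = zip(*(group_pg(x, y) for x, y in zip(a4, b4)))
    # Carry lookahead unit: group carries, assumed to mirror c1..c4 one level up
    C1 = G[0] | (P[0] & c0)
    C2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    C3 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    C4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))
    carries = [c0, C1, C2, C3]
    total = 0
    for k in range(4):
        # integer addition stands in for the 4-bit lookahead adder of group k
        total |= ((a4[k] + b4[k] + carries[k]) & 0xF) << (4 * k)
    return total, C4
```

Calling `cla16(0xFFFF, 0x0001, 0)` exercises the case where a carry generated in the lowest group is propagated through all three higher groups.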

19 Carry Lookahead
We’ve covered all ALU functions except for the shifter
We’ll talk about the shifter later
