Csci 136 Computer Architecture II – Constructing An Arithmetic Logic Unit Xiuzhen Cheng


1 Csci 136 Computer Architecture II – Constructing An Arithmetic Logic Unit Xiuzhen Cheng cheng@gwu.edu

2 Announcement
Homework assignment #4: due Feb 22, before class. Readings: Sections 3.2, 3.3, B.5, B.6. Problems: 3.7, 3.9, 3.10, 3.12, 3.18-3.22.
Project #1 is due at 11:59 PM, Feb 13. Project #2 is due at 11:59 PM, March 10.

3 What Are We Going To Do?
Implementing the MIPS ISA! The ALU must support the arithmetic/logic operations. Tradeoffs of cost and speed are based on frequency of occurrence and hardware budget.
[Figure: an ALU with 32-bit inputs a and b, an operation control input, and a 32-bit result.]

4 Review on Two's Complement Operations
Negating a two's complement number: invert all bits and add 1. Remember: “negate” and “invert” are quite different!
Converting n-bit numbers into numbers with more than n bits: a MIPS 16-bit immediate gets converted to 32 bits for arithmetic by copying the most significant bit (the sign bit) into the other bits. This is "sign extension" (compare lbu vs. lb):
0010 -> 0000 0010
1010 -> 1111 1010
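The two operations reviewed on this slide can be sketched in a few lines of Python. This is only an illustration: the function names negate and sign_extend are made up for this example, and values are treated as raw bit patterns of a stated width.

```python
def negate(x, bits):
    """Two's complement negation: invert all bits, then add 1."""
    mask = (1 << bits) - 1
    return ((x ^ mask) + 1) & mask


def sign_extend(x, from_bits, to_bits):
    """Copy the sign bit of a from_bits-wide pattern into the upper bits."""
    sign = (x >> (from_bits - 1)) & 1
    if sign:
        upper = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        x |= upper
    return x


print(format(negate(0b0110, 4), "04b"))          # 1010
print(format(sign_extend(0b0010, 4, 8), "08b"))  # 00000010
print(format(sign_extend(0b1010, 4, 8), "08b"))  # 11111010
```

Inverting alone would turn 0110 into 1001; the extra +1 is what makes the result the negative of the input.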

5 Review on Addition & Subtraction
Just like in grade school (carry/borrow 1s):
0111 + 0110 = 1101
0111 - 0110 = 0001
0110 - 0101 = 0001
Two's complement operations: easy subtraction using addition of negative numbers, e.g. 0111 + 1010 = 0001 (the carry out of the top bit is discarded).
Overflow: the result is too large or too small for a finite computer word. Example: -8 <= 4-bit binary number <= 7.

6 Detecting Overflow
No overflow when adding a positive and a negative number.
No overflow when signs are the same for subtraction.
Overflow occurs when the value affects the sign:
adding two positives yields a negative,
or adding two negatives gives a positive,
or subtracting a negative from a positive gives a negative,
or subtracting a positive from a negative gives a positive.
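The four sign rules above can be written down directly. The sketch below is an illustration only; add_overflows and sub_overflows are hypothetical helper names, and operands are bit patterns of the given width (4 bits by default, to match the review examples).

```python
def sign_bit(x, bits):
    """Return the most significant bit of a bits-wide pattern."""
    return (x >> (bits - 1)) & 1


def add_overflows(a, b, bits=4):
    """Overflow on a + b: operands share a sign, the sum has the other sign."""
    s = (a + b) & ((1 << bits) - 1)
    sa, sb = sign_bit(a, bits), sign_bit(b, bits)
    return sa == sb and sign_bit(s, bits) != sa


def sub_overflows(a, b, bits=4):
    """Overflow on a - b: operands differ in sign, the result's sign differs from a's."""
    d = (a - b) & ((1 << bits) - 1)
    sa, sb = sign_bit(a, bits), sign_bit(b, bits)
    return sa != sb and sign_bit(d, bits) != sa


print(add_overflows(0b0111, 0b0110))  # True:  7 + 6 does not fit in 4 bits
print(add_overflows(0b0111, 0b1010))  # False: 7 + (-6) = 1
print(sub_overflows(0b0111, 0b1010))  # True:  7 - (-6) does not fit in 4 bits
```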

7 Effects of Overflow
An exception (interrupt) occurs: control jumps to a predefined address for the exception, and the interrupted address is saved for possible resumption. Details are based on the software system / language.
Don't always want to detect overflow. New MIPS instructions: addu, addiu, subu.
Note: addiu still sign-extends!
Note: sltu, sltiu are for unsigned comparisons.

8 Different Implementations for ALU
Not easy to decide the “best” way to build something: we don't want too many inputs to a single gate, and we don't want to have to go through too many gates. For our purposes, ease of comprehension is important.
Building blocks: AND, OR, Inverter, MUX, etc.
Let's look at a 1-bit ALU for addition:
cout = a.b + a.cin + b.cin
sum = a xor b xor cin
How could we build a 1-bit ALU for add, and, and or? How could we build a 32-bit ALU?
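The carry-out and sum equations on this slide translate one-to-one into bitwise operators. A minimal sketch, with full_adder as an illustrative name and all inputs assumed to be 0 or 1:

```python
def full_adder(a, b, cin):
    """1-bit full adder built from the slide's two equations."""
    total = a ^ b ^ cin                      # sum = a xor b xor cin
    cout = (a & b) | (a & cin) | (b & cin)   # cout = a.b + a.cin + b.cin
    return total, cout


# Print the whole truth table: a b cin -> sum cout
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, "->", *full_adder(a, b, cin))
```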

9 A 1-bit ALU
[Figure: a logic unit performing logic AND and OR.]
[Figure: a 1-bit full adder, also called a (3,2) adder.]
[Figure: implementation of a 1-bit adder.]
[Figure B.5.6: a 1-bit ALU that performs AND, OR, and addition.]

10 A 32-bit ALU, Ripple Carry Adder
A 32-bit ALU for the AND, OR, and ADD operations: connect 32 1-bit ALUs, with the CarryOut of each bit feeding the CarryIn of the next.
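As a rough software model of this structure, the sketch below chains 1-bit ALUs so that each carry-out feeds the next carry-in. The names alu_1bit and alu_32bit and the encoding of the operation select (0 = AND, 1 = OR, 2 = ADD) are assumptions made for the example.

```python
def alu_1bit(a, b, cin, op):
    """1-bit ALU: a multiplexor picks AND, OR, or the adder's sum."""
    total = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    result = (a & b, a | b, total)[op]   # op: 0 = AND, 1 = OR, 2 = ADD
    return result, cout


def alu_32bit(a, b, op, width=32):
    """Ripple-carry ALU: bit i's carry-out becomes bit i+1's carry-in."""
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = alu_1bit((a >> i) & 1, (b >> i) & 1, carry, op)
        result |= bit << i
    return result, carry


print(alu_32bit(5, 3, 0))           # (1, 0)  AND (the carry is meaningless here)
print(alu_32bit(5, 3, 1))           # (7, 0)  OR
print(alu_32bit(5, 3, 2))           # (8, 0)  ADD
print(alu_32bit(0xFFFFFFFF, 1, 2))  # (0, 1)  ADD with a carry out of bit 31
```

The loop makes the sequential dependence visible: bit 31 cannot be computed until the carry has rippled through all lower bits.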

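11 What About Subtraction?
Invert each bit of b (with an inverter) and add 1. How do we implement the "add 1"?
A very clever solution: a + (-b) = a + (b' + 1), so select the inverted b and set the CarryIn of the least significant bit to 1.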
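The identity a + (b' + 1) can be modeled by reusing the ripple adder with one extra control bit. In this sketch, add_sub is a hypothetical name and binvert is assumed to drive both the inversion of b and the carry-in of bit 0.

```python
def add_sub(a, b, binvert, width=32):
    """binvert = 0 computes a + b; binvert = 1 computes a - b as a + b' + 1."""
    carry = binvert                          # the "+ 1" enters as CarryIn of bit 0
    result = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ binvert        # XOR with 1 inverts the bit
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    return result


print(format(add_sub(0b0111, 0b0110, 1, width=4), "04b"))  # 0001  (7 - 6)
print(format(add_sub(0b0110, 0b0111, 1, width=4), "04b"))  # 1111  (6 - 7 = -1)
```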

12 What About NOR Operation?
Explore existing hardware in the ALU: NOR(a,b) = not(a or b) = not(a) and not(b).
The inverter for b and the AND gate are already there, so we only need to add an inverter for input a.
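De Morgan's rule used here is easy to check for a single bit. The sketch assumes two control inputs, ainvert and binvert, in front of the existing AND gate; and_with_inverts is an illustrative name.

```python
def and_with_inverts(a, b, ainvert, binvert):
    """AND gate whose inputs can each be inverted by a control bit."""
    return (a ^ ainvert) & (b ^ binvert)


for a in (0, 1):
    for b in (0, 1):
        nor = and_with_inverts(a, b, ainvert=1, binvert=1)
        print(a, b, "NOR ->", nor)
```

With both inverts off, the same gate still produces the plain AND.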

13 32-Bit ALU for AND, OR, ADD, SUB, NOR
[Figure: the 32-bit ALU with Ainvert and Binvert control inputs.]

14 In-Class Question
Prove that you can detect overflow by CarryIn31 xor CarryOut31; that is, an overflow occurs if the CarryIn to the most significant bit is not the same as the CarryOut of the most significant bit.
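Before attempting the proof, the claim can be checked by brute force on a small word size. The sketch below tests every pair of 4-bit operands; it is a sanity check under that assumption, not the proof the question asks for, and the function names are made up for the example.

```python
def msb_carries(a, b, bits=4):
    """Ripple-add two bit patterns; return (carry into MSB, carry out of MSB)."""
    carry = 0
    carry_in_msb = 0
    for i in range(bits):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if i == bits - 1:
            carry_in_msb = carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    return carry_in_msb, carry


def to_signed(x, bits=4):
    """Interpret a bits-wide pattern as a two's complement value."""
    return x - (1 << bits) if (x >> (bits - 1)) & 1 else x


def rule_holds(bits=4):
    """True if CarryIn(msb) xor CarryOut(msb) matches real overflow for all inputs."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for a in range(1 << bits):
        for b in range(1 << bits):
            true_sum = to_signed(a, bits) + to_signed(b, bits)
            overflow = not (lo <= true_sum <= hi)
            cin, cout = msb_carries(a, b, bits)
            if overflow != bool(cin ^ cout):
                return False
    return True


print(rule_holds(4))  # True
```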

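15 Set on Less Than Operation
Idea: for slt $t0, $s1, $s2, check $s1-$s2. If the difference is negative, set the LSB of $t0 to 1 and all other bits of $t0 to 0; otherwise, set all bits of $t0 to 0.
How to set these bits? Add a Less input line to each 1-bit ALU.
Implementation: connect the sign bit of the difference (the adder output of bit 31, called Set) to the Less input of bit 0. The ALU for bit 31 also contains the overflow detection.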

16 Set on Less Than Operation
Question: In figure B.5.11, Set connects directly to Less0. Can you find any problem with this implementation? How would you fix it?
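One way to explore this question is to compare the direct connection with a candidate fix in software. The sketch assumes the issue is overflow in $s1-$s2 and that the fix is to feed Less0 with Set xor Overflow; the slide leaves both as an exercise, so treat this as one possible answer. slt_bits is a hypothetical name.

```python
def slt_bits(a, b, bits=32):
    """Return (naive, fixed) slt outputs for bit patterns a and b."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    diff = (a - b) & mask
    sa = a >> (bits - 1)
    sb = b >> (bits - 1)
    set_bit = diff >> (bits - 1)               # sign bit of a - b
    overflow = int(sa != sb and set_bit != sa)  # subtraction overflow rule
    return set_bit, set_bit ^ overflow


# 4-bit example: -8 < 1, but -8 - 1 overflows to +7, so the sign bit is 0.
print(slt_bits(0b1000, 0b0001, bits=4))  # (0, 1)
```

When the subtraction overflows, the sign bit of the difference is wrong, so inverting it in exactly those cases gives the correct comparison.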

17 Conditional Branch
MIPS Instruction: beq $s1, $s2, label
Idea: test $s1-$s2. OR all the result bits together to test whether the result is 0 or not. If $s1=$s2, the zero detector output is set.
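The zero detector amounts to OR-ing all result bits and inverting the outcome. A minimal sketch, with zero_flag and beq_taken as illustrative names:

```python
def zero_flag(result, width=32):
    """OR all result bits together, then invert: 1 only if every bit is 0."""
    any_set = 0
    for i in range(width):
        any_set |= (result >> i) & 1
    return any_set ^ 1


def beq_taken(s1, s2, width=32):
    """beq branches when s1 - s2 is zero."""
    diff = (s1 - s2) & ((1 << width) - 1)
    return zero_flag(diff, width) == 1


print(beq_taken(42, 42))  # True
print(beq_taken(42, 41))  # False
```

For bne, the same Zero signal is used with the opposite sense.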

18 A Final 32-bit ALU
Operations supported: and, or, nor, add, sub, slt, beq/bne
ALU control lines: 2 operation control signals select among and, or, add, and slt; 2 more control lines (Ainvert, Binvert) are used for sub, nor, and slt.
ALU Control Lines    Function
0000                 And
0001                 Or
0010                 Add
0110                 Sub
0111                 Slt
1100                 NOR
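The control table can be exercised with a small behavioral model. The sketch assumes the four control bits are, from left to right, Ainvert, Binvert (also used as the carry-in of bit 0), and the 2-bit operation select, which is consistent with the table's encodings; alu is a hypothetical name, and the slt branch corrects the sign bit with the overflow signal, which is one possible design.

```python
def alu(a, b, control, width=32):
    """Behavioral model of the final ALU; returns (result, zero)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    ainvert = (control >> 3) & 1
    binvert = (control >> 2) & 1
    op = control & 0b11
    if ainvert:
        a = ~a & mask
    if binvert:
        b = ~b & mask
    total = (a + b + binvert) & mask          # binvert doubles as CarryIn0
    if op == 0b00:
        result = a & b
    elif op == 0b01:
        result = a | b
    elif op == 0b10:
        result = total
    else:                                     # slt: corrected sign of a - b
        sa, sb = a >> (width - 1), b >> (width - 1)
        sign = total >> (width - 1)
        overflow = int(sa == sb and sign != sa)
        result = sign ^ overflow
    return result, int(result == 0)


print(alu(5, 3, 0b0000))  # (1, 0)  and
print(alu(5, 3, 0b0001))  # (7, 0)  or
print(alu(5, 3, 0b0010))  # (8, 0)  add
print(alu(5, 3, 0b0110))  # (2, 0)  sub
print(alu(3, 5, 0b0111))  # (1, 0)  slt
print(alu(4, 4, 0b0110))  # (0, 1)  sub with Zero set, as used by beq
```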

19 SPEED of the 32-bit Ripple Carry Adder
The critical path for CarryOut and Result is the path traversed by the carry: it contains 32 AND gates and 32 OR gates. We must sequentially evaluate all 1-bit adders.
Ripple carry is too slow to be used in time-critical hardware.
Speedup: anticipate the carry! This needs more hardware for parallel operation.
An extreme case is sum of products: “infinite” hardware through two-level logic. An example implementation is too expensive: the number of gates grows exponentially!
How many gates are needed to compute each of c1, c2, c3, c4?

20 Carry-Lookahead Adder
The concept of propagate and generate:
c(i+1) = (ai.bi) + (ai.ci) + (bi.ci) = (ai.bi) + ((ai+bi).ci)
pi = ai + bi; gi = ai.bi
so c(i+1) = gi + (pi.ci)
Write down c1, c2, c3, c4. Why fast?
First level of abstraction: pi, gi
Still too expensive! Why? There is still a large number of gates.
How many gates are needed to compute each of c1, c2, c3, c4?
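Expanding c(i+1) = gi + (pi.ci) repeatedly gives every carry as a two-level function of the pi, gi, and c0, which is what the following sketch computes for a 4-bit adder. cla_carries is an illustrative name; each carry below depends only on p, g, and c0, never on another carry.

```python
def cla_carries(a, b, c0):
    """Return (c1, c2, c3, c4) for 4-bit operands using lookahead equations."""
    p = [((a >> i) & 1) | ((b >> i) & 1) for i in range(4)]  # pi = ai + bi
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]  # gi = ai.bi
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return c1, c2, c3, c4


print(cla_carries(0b0111, 0b0110, 0))  # (0, 1, 1, 0)
```

Because no carry waits on another, all four can be evaluated in parallel; the price is that the terms get wider as i grows.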

21 Carry-Lookahead Adder CarryOut is 1 if some earlier adder generates a carry and all intermediary adders propagate the carry.

22 Build Bigger Adders
Can't build a 16-bit adder with a single level of carry lookahead: the hardware would be too expensive!
Could use ripple carry of 4-bit adders.
Better: use carry lookahead at higher levels.
“Super” Propagate Pi vs. “Super” Generate Gi
Group concept: a 4-bit adder as a building block.
“Super” Propagate/Generate definition:
c4 = g3 + (p3.g2) + (p3.p2.g1) + (p3.p2.p1.g0) + (p3.p2.p1.p0.c0) = C1

23 Super Propagate and Generate
A “super” propagate is true only if all propagates in the same group are true:
P0 =    P1 =    P2 =    P3 =
A “super” generate is true only if at least one generate in its group is true and all the propagates downstream from that generate are true:
G0 =    G1 =    G2 =    G3 =
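The two definitions can be tried out on a single 4-bit group. The sketch below is one reading of them, with group_pg as a hypothetical name; it uses the bit-level pi = ai + bi and gi = ai.bi from the earlier slide.

```python
def group_pg(a4, b4):
    """Return (P, G) for one 4-bit group of operand bits."""
    p = [((a4 >> i) & 1) | ((b4 >> i) & 1) for i in range(4)]
    g = [((a4 >> i) & 1) & ((b4 >> i) & 1) for i in range(4)]
    # Super propagate: every bit position propagates.
    P = p[3] & p[2] & p[1] & p[0]
    # Super generate: some bit generates and all higher bits propagate it out.
    G = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
         | (p[3] & p[2] & p[1] & g[0]))
    return P, G


print(group_pg(0b0011, 0b1110))  # (1, 1)
```

With these two signals, the group's carry-out is G + (P.cin), the same form as the bit-level equation.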

24 A 16-Bit Adder
Give the equations for C1, C2, C3, C4.
Better: use the CLA principle again! Second level of abstraction!

25 An Example
Determine gi, pi, Gi, Pi, and C1, C2, C3, C4 for the following two 16-bit numbers:
a: 0010 1001 0011 0010
b: 1101 0101 1110 1011
Do it yourself.
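After working the example by hand, the result can be checked with a short script. This is a sketch under the assumptions that group 0 is the least significant 4 bits, that the carry into bit 0 is 0, and that the second-level equations mirror the bit-level ones; the function names are made up for the example.

```python
def group_pg(a4, b4):
    """Return (P, G) for one 4-bit group."""
    p = [((a4 >> i) & 1) | ((b4 >> i) & 1) for i in range(4)]
    g = [((a4 >> i) & 1) & ((b4 >> i) & 1) for i in range(4)]
    P = p[3] & p[2] & p[1] & p[0]
    G = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
         | (p[3] & p[2] & p[1] & g[0]))
    return P, G


def lookahead_16(a, b, c0=0):
    """Return ([P0..P3], [G0..G3], [C1..C4]) for 16-bit operands."""
    P, G = [], []
    for k in range(4):
        Pk, Gk = group_pg((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF)
        P.append(Pk)
        G.append(Gk)
    C1 = G[0] | (P[0] & c0)
    C2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    C3 = (G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0])
          | (P[2] & P[1] & P[0] & c0))
    C4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))
    return P, G, [C1, C2, C3, C4]


a = 0b0010_1001_0011_0010
b = 0b1101_0101_1110_1011
P, G, C = lookahead_16(a, b)
print("P0..P3 =", P)  # [0, 1, 0, 1]
print("G0..G3 =", G)  # [0, 1, 0, 0]
print("C1..C4 =", C)  # [0, 1, 0, 0]
```

C4 = 0 agrees with plain addition: the two numbers sum to 1111 1111 0001 1101 with no carry out of bit 15.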

26 Speed of Ripple Carry vs. Carry Lookahead
Example: Assume each AND and OR gate takes the same time. Gate delay is defined to be the number of gates along the critical path. Compare the gate delays of three 16-bit adders: one using ripple carry, one using first-level carry lookahead, and one using two-level carry lookahead.

27 Summary
A traditional ALU can be built from a multiplexor plus a few gates that are replicated 32 times.
To tailor it to the MIPS ISA, we expand the traditional ALU with hardware for slt, beq, and overflow detection.
Carry lookahead is much faster than ripple carry!
The CLA principle can be applied multiple times!

28 Questions?

