# Phase 1 Presentation 2/14/2011


Phase 1 Presentation 2/14/2011

Wisdom of the Day: 学而不思则罔，思而不学则殆 ("Learning without thought is useless; thought without learning is perilous.") Today we will encourage thought while you learn about the execution stage. Confucius (551 BC-479 BC) was a Chinese thinker and social philosopher whose teachings and philosophy have deeply influenced Chinese, Korean, Japanese, and Vietnamese thought and life.

Execution Engine Overview: Responsible for producing the result of almost every instruction. Three design points: the ALU, PC manipulation, and memory accesses.

Block Diagram

Execute Stage Block Diagram

The ALU: Powerhouse of the CPU

ALU Overview: The ALU is the main math unit of the CPU. It takes two inputs and returns the results of various operations on them.

ALU Datapath

ALU Operations: Add (ADDU), Subtract (SUBU), Or (OR), And (AND), Nor (NOR), Set Less Than (SLT), Set Less Than Unsigned (SLTU), Shift Left Logical (SLL), Shift Right Logical (SRL), Load Upper Immediate (LUI).

Addition: Uses a ripple-carry adder to produce the sum. Other options include a carry-lookahead adder. We believe the ripple-carry adder is sufficiently fast for our implementation. C code: `int ADDU(int a, int b) { return a + b; }`
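The ripple-carry idea can be sketched in C as a chain of 32 one-bit full adders, where each stage waits for the carry of the stage below it. This is only an illustrative software model, not the presenters' circuit; the names `add_result` and `ripple_add` are hypothetical.

```c
#include <stdint.h>

/* Result of the adder model: 32-bit sum plus the carry out of bit 31. */
typedef struct {
    uint32_t sum;
    uint32_t carry_out;
} add_result;

/* Software model of a 32-bit ripple-carry adder (hypothetical helper). */
add_result ripple_add(uint32_t a, uint32_t b, uint32_t carry_in)
{
    add_result r = { 0u, carry_in & 1u };
    for (int i = 0; i < 32; i++) {
        uint32_t x = (a >> i) & 1u;
        uint32_t y = (b >> i) & 1u;
        uint32_t c = r.carry_out;                /* carry arriving from bit i - 1 */
        r.sum |= (x ^ y ^ c) << i;               /* sum bit of this full adder */
        r.carry_out = (x & y) | (c & (x ^ y));   /* carry rippling on to bit i + 1 */
    }
    return r;
}
```

The loop makes the cost visible: bit 31 cannot settle until the carry has passed through all 31 lower stages, which is the delay a carry-lookahead adder is designed to shorten.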

Subtraction: We modified the existing adder to handle both Add and Subtract. Subtraction is the addition of one number and the two's complement of a second number. Two's complement: invert the bits, then add 1, so `A - B = A + (~B + 1)`. C code: `int SUBU(int a, int b) { return a - b; }`

Subtraction: For our adder to handle both Add and Subtract, we place a mux in front of the adder's B input. Choice 1: B unmodified. Choice 2: B inverted. Then, to add one, we set the carry-in bit of the adder high. C code: `int SUBU(int a, int b) { return a - b; }`
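The mux-plus-carry-in trick can be modeled in a few lines of C. This is a sketch under the assumption that one control bit (called `sub` here, a made-up name) drives both the mux and the adder's carry-in; the 64-bit addition merely stands in for the hardware adder.

```c
#include <stdint.h>

/* Stand-in for the hardware adder: a + b + carry_in, truncated to 32 bits. */
static uint32_t adder(uint32_t a, uint32_t b, uint32_t carry_in)
{
    return (uint32_t)((uint64_t)a + b + (carry_in & 1u));
}

/* sub = 0 computes a + b; sub = 1 computes a + ~b + 1, which is a - b. */
uint32_t add_sub(uint32_t a, uint32_t b, uint32_t sub)
{
    uint32_t b_mux = sub ? ~b : b;   /* the mux in front of the adder */
    return adder(a, b_mux, sub);     /* the same bit doubles as the carry-in */
}
```

For example, `add_sub(12, 3, 1)` yields 9 and `add_sub(12, 3, 0)` yields 15, using a single adder for both operations.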

Set Less Than: On a set less than instruction, the ALU determines if the first input is less than the second. To implement it, we subtract the two operands and then determine if the output is negative. The sign becomes the result: 1 means the first input is less than the second, 0 indicates that it is not. C code: `boolean SLT(int a, int b) { return a < b; }`
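The subtract-then-take-the-sign approach can be sketched as follows. The function name `slt_sign` is hypothetical, and the unsigned arithmetic is used only so that the subtraction wraps the way a 32-bit adder does.

```c
#include <stdint.h>

/* SLT as described on the slide: subtract, then use the sign bit of the difference. */
uint32_t slt_sign(int32_t a, int32_t b)
{
    uint32_t diff = (uint32_t)a - (uint32_t)b;  /* wraps like the 32-bit adder */
    return diff >> 31;                          /* 1 if the difference is negative */
}
```

One caveat: the sign of a wrapped 32-bit difference is misleading when the subtraction overflows (for example, comparing the most negative value with 1), so a complete design would also take the overflow condition into account. The slides do not say how that case is handled.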

Set Less Than Unsigned: First, let's try using the exact same logic as SLT. Is 12 < 3? Is 12 - 3 < 0? 1100 - 0011 = 1100 + 1101 = 1001 = -7. The sign bit claims 12 is less than 3, so we have determined this won't work. Ideas? C code: `boolean SLTU(uint a, uint b) { return a < b; }`

Set Less Than Unsigned: Let's add an extra zero to make them positive. Is 12 < 3? Is 12 - 3 < 0? 01100 - 00011 = 01100 + 11101 = 01001 = 9. This does work. The sign bit is only the XOR of 0 (the extra bit of the first operand), 1 (the extra bit of the inverted second operand), and the last carry-out bit. If you're savvy, you'll notice this is simply the NOT of the carry-out bit. C code: `boolean SLTU(uint a, uint b) { return a < b; }`
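A short C model of the carry-out observation, assuming 32-bit operands and using a 64-bit sum purely to expose the carry out of bit 31 (the name `sltu_carry` is made up for this sketch):

```c
#include <stdint.h>

/* SLTU via the zero-extended subtraction a + ~b + 1. */
uint32_t sltu_carry(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a + (uint64_t)(uint32_t)~b + 1u;
    uint32_t carry_out = (uint32_t)(wide >> 32) & 1u;  /* carry out of bit 31 */
    return carry_out ^ 1u;                              /* a < b exactly when no carry comes out */
}
```

With the slide's numbers, 12 and 3 produce a carry-out of 1, so the result is 0 (12 is not less than 3); swapping the operands gives no carry and a result of 1.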

Shifting: Each output bit chooses its input bit based on the shift amount, shamt. Implemented using MUXes. We made it faster by converting shamt to one-hot encoding: there is no cascading of MUXes, just a bunch of AND gates in parallel. Any "leftover" bits are 0s. C code: `uint SLL(uint a, int shamt) { return a << shamt; }`
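One way to picture the one-hot scheme in software: decode shamt into 32 select lines, AND every candidate shift with its line, and OR the results together. This is an illustrative model with a hypothetical function name, not the presenters' gate-level design.

```c
#include <stdint.h>

/* Left shift driven by a one-hot decoded shift amount. */
uint32_t sll_one_hot(uint32_t a, uint32_t shamt)
{
    uint32_t one_hot = 1u << (shamt & 31u);   /* decoder output: exactly one bit set */
    uint32_t result = 0u;
    for (uint32_t s = 0; s < 32; s++) {
        uint32_t select = 0u - ((one_hot >> s) & 1u);  /* all ones if line s is hot, else 0 */
        result |= (a << s) & select;                   /* AND with the select line; vacated bits are 0 */
    }
    return result;
}
```

In hardware all 32 AND terms are evaluated at once, so the loop here represents parallel gates rather than sequential steps.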

Logic Operations (AND, OR, NOR): To complete these operations we use bitwise AND, OR, and NOR. Each operation uses 32 gates, one per bit. C code: `int AND(int a, int b) { return a & b; }`

Load Upper Immediate: Returns the value of the immediate shifted left by 16 bits. Reusing SLL would complicate the logic of the ALU circuit, so we use a dedicated shift-left-by-16 module. C code: `int LUI(int a) { return a << 16; }`

ALU Control How is the ALU controlled? How can we accomplish this? Alternative options?

ALU Control: Nearly every operation is computed in parallel. When ADDU is performed, SUBU, SLT, and SLTU are not, and vice versa. There is a MUX inside the ALU. The MUX chooses the output based on the requested operation.

ALU Control: How was this accomplished? With one-hot encoding. SIG0 indicates that the adder must do subtraction.

| Operation | Code | SIG0 |
| --- | --- | --- |
| ADD | 000000001 | 0 |
| SUB | 000000001 | 1 |
| OR | 000000010 | 0 |
| AND | 000000100 | 0 |
| NOR | 000001000 | 0 |
| SLT | 000010000 | 1 |
| SLTU | 000100000 | 1 |
| SLL | 001000000 | 0 |
| SRL | 010000000 | 0 |
| LUI | 100000000 | 0 |
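A rough C model of a one-hot output mux is sketched below. It assumes the control word splits into a 9-bit one-hot code plus a separate SIG0 bit, that the shifters operate on the first input, and that LUI shifts the second input; these are assumptions, and all names (`SEL_*`, `gate`, `alu`) are hypothetical.

```c
#include <stdint.h>

/* One-hot select lines (assumed 9-bit code; SIG0 is passed separately). */
enum {
    SEL_ADDER = 1 << 0, SEL_OR = 1 << 1, SEL_AND = 1 << 2,
    SEL_NOR = 1 << 3, SEL_SLT = 1 << 4, SEL_SLTU = 1 << 5,
    SEL_SLL = 1 << 6, SEL_SRL = 1 << 7, SEL_LUI = 1 << 8
};

/* Pass a unit's result only when its select line is hot. */
static uint32_t gate(uint32_t value, uint32_t code, uint32_t line)
{
    return (code & line) ? value : 0u;
}

uint32_t alu(uint32_t a, uint32_t b, uint32_t shamt, uint32_t code, uint32_t sig0)
{
    uint32_t b_mux = sig0 ? ~b : b;                     /* SIG0 inverts the second input */
    uint64_t wide = (uint64_t)a + b_mux + (sig0 & 1u);  /* SIG0 is also the carry-in */
    uint32_t sum = (uint32_t)wide;
    uint32_t carry = (uint32_t)(wide >> 32) & 1u;

    /* Every unit computes its result; the one-hot code picks which one survives. */
    return gate(sum, code, SEL_ADDER)
         | gate(a | b, code, SEL_OR)
         | gate(a & b, code, SEL_AND)
         | gate(~(a | b), code, SEL_NOR)
         | gate(sum >> 31, code, SEL_SLT)       /* sign of a - b */
         | gate(carry ^ 1u, code, SEL_SLTU)     /* NOT of the carry-out */
         | gate(a << (shamt & 31u), code, SEL_SLL)
         | gate(a >> (shamt & 31u), code, SEL_SRL)
         | gate(b << 16, code, SEL_LUI);
}
```

Because exactly one select line is set, the final OR behaves like a mux without any decoding step, which is the simplification the one-hot code buys.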

ALU Control, Alternative Options: Binary encoding or one-cold encoding. We use one-hot to make the MUX less complex, just as we did for SLL and SRL.

Immediates: Adding special instructions for immediates would be painstaking and wasteful. Instead, the second input is determined outside the ALU.

Immediate Datapath

PC Manipulation: Moving About the Code

PC Manipulation Overview: Dynamic alteration of the PC allows programs to be non-linear. This greatly increases the capability of computers.

Branching: Branch instructions modify the program counter to skip over sections of code or to go back and repeat previous code. Our branches allow for conditional movement to an offset.

Conditional Branch: BEQ branches when the values in the two registers are equal. BNE branches when the values in the two registers aren't equal. Two things must be calculated: the new address and the comparison. C code: `if (a == b) goto c; /* BEQ */` and `if (a != b) goto c; /* BNE */`

Conditional Branch: Solution: a dedicated adder calculates the new address, so the ALU can do the comparison. An extra ALU output indicates whether the subtraction results in 0. C code: `if (a == b) goto c; /* BEQ */` and `if (a != b) goto c; /* BNE */`
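The split between the dedicated adder and the ALU comparison might look like this in C. The target formula follows the usual MIPS32 convention (delay-slot address plus the sign-extended offset times 4), which the slides do not spell out, so treat it as an assumption; the function names are hypothetical.

```c
#include <stdint.h>

/* Dedicated adder: (PC + 4) plus the sign-extended 16-bit offset scaled to bytes. */
uint32_t branch_target(uint32_t pc, uint16_t offset)
{
    int32_t signed_offset = (int32_t)(offset ^ 0x8000u) - 0x8000;  /* sign-extend 16 bits */
    return pc + 4u + (uint32_t)(signed_offset * 4);
}

/* Extra ALU output: 1 when the subtraction a - b results in 0. */
uint32_t alu_zero(uint32_t a, uint32_t b)
{
    return (uint32_t)((a - b) == 0u);
}

/* BEQ takes the branch on zero, BNE on non-zero. */
uint32_t branch_taken(uint32_t is_beq, uint32_t is_bne, uint32_t a, uint32_t b)
{
    uint32_t zero = alu_zero(a, b);
    return (uint32_t)((is_beq && zero) || (is_bne && !zero));
}
```

Both computations are independent, which is why giving the address its own adder lets them happen in the same cycle.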

Branching Datapath

Jumping: Jumps unconditionally change the PC. Their addresses are absolute rather than offsets from the current PC. C code: `goto c;`

Jumps: There are two unlinked jumps. Jump (J) jumps directly to the instruction in the immediate field. Jump Register (JR) jumps to the instruction whose location is the value of the given register. C code: `goto 0xDEADBEEF; /* J */` and `goto c; /* JR */`
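For reference, a sketch of how the two jump targets are formed under the standard MIPS32 encoding, where J carries a 26-bit instruction index rather than a full 32-bit address. The slides do not show this detail, so the formula and the function names are assumptions.

```c
#include <stdint.h>

/* J: 26-bit index shifted left by 2, combined with the top 4 bits of PC + 4. */
uint32_t j_target(uint32_t pc, uint32_t instr_index)
{
    return ((pc + 4u) & 0xF0000000u) | ((instr_index & 0x03FFFFFFu) << 2);
}

/* JR: the new PC is simply the register's value. */
uint32_t jr_target(uint32_t rs_value)
{
    return rs_value;
}
```

Under this encoding J can only reach addresses within the current 256 MB region, while JR can reach any address because the full 32 bits come from a register.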

Jumping Datapath

Linked Jumps: There are two linked jumps. Jump and Link (JAL) jumps to the instruction in the immediate field and saves the return address in `$ra`. Jump and Link Register (JALR) jumps to the instruction whose location is the value of the given register and saves the return address in the specified register. C code: `$ra = PC+8; goto 0xDEADBEEF;`

Linked Jumps: Linked jumps record the address PC+8. This is the instruction after the delay slot instruction. More MUXes on the ALU inputs choose when to return the link address. C code: `$ra = PC+8; goto 0xDEADBEEF;`

PC Change Determination: The PC will change on a successful branch or a jump command. We use a combination of AND gates and an OR gate to detect this, and the JUMP? bit to choose the new address. C code: `if ((BEQ && !NotZero) || (BNE && NotZero) || Jump) { goto NewAddress; }`
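The gate arrangement can be written out as a small C function. The parameter names (`sequential_pc`, `branch_addr`, `jump_addr`) are illustrative assumptions; only the AND/OR condition and the JUMP? selection come from the slide.

```c
#include <stdint.h>

/* Choose the next PC from the branch/jump control bits. */
uint32_t next_pc(uint32_t sequential_pc, uint32_t branch_addr, uint32_t jump_addr,
                 uint32_t beq, uint32_t bne, uint32_t not_zero, uint32_t jump)
{
    /* Two AND gates and the jump bit feed one OR gate. */
    uint32_t change = (uint32_t)((beq && !not_zero) || (bne && not_zero) || jump);
    /* The JUMP? bit selects between the jump address and the branch address. */
    uint32_t new_addr = jump ? jump_addr : branch_addr;
    return change ? new_addr : sequential_pc;
}
```

A BEQ whose operands differ (`not_zero` set) leaves `change` at 0, so the sequential PC passes through unchanged.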

Memory Interfacing: Loading and Storing

Memory Interfacing Overview: There are a limited number of registers in the CPU. To hold and retrieve more data, we need to be able to access a larger pool of storage.

Load Word: Loads a word into a register from memory. The address is determined by adding an offset to the first operand, which is easily implemented in the ALU. C code: `b = mem[a+0x000016];`

Store Word: Stores a register to memory. We use the same method as load word to calculate the address. C code: `mem[a+0x000016] = b;`
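A toy model of the shared address calculation, assuming a sign-extended 16-bit offset as in standard MIPS32 and a tiny word-indexed memory array invented for this sketch (`mem_words`, 256 words, addresses wrapped to fit):

```c
#include <stdint.h>

static uint32_t mem_words[256];   /* toy memory: 256 words, hypothetical size */

/* Base register plus sign-extended offset: the addition the ALU performs. */
uint32_t effective_address(uint32_t base, uint16_t offset)
{
    int32_t signed_offset = (int32_t)(offset ^ 0x8000u) - 0x8000;  /* sign-extend 16 bits */
    return base + (uint32_t)signed_offset;
}

/* Load: read the word at the computed address (wrapped into the toy memory). */
uint32_t load_word(uint32_t base, uint16_t offset)
{
    return mem_words[(effective_address(base, offset) / 4u) % 256u];
}

/* Store: the same address calculation, then a write. */
void store_word(uint32_t base, uint16_t offset, uint32_t value)
{
    mem_words[(effective_address(base, offset) / 4u) % 256u] = value;
}
```

Storing with base `0x20` and offset `0x10`, then loading with base `0x30` and offset `0`, reaches the same word, since both calls compute address `0x30`.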

Memory Datapath

Conclusion: We covered three design points (the ALU, PC manipulation, and memory accesses). Take advantage of repeated logic. If all else fails, add more hardware.

Questions