Gary Marsden, University of Cape Town: Chapter 5 - The Processor


Slide 1: Chapter 5 - The Processor (Gary Marsden, University of Cape Town)
• Machine performance factors
  – Instruction count, clock cycle time, clock cycles per instruction (CPI)
• Both clock cycle time and CPI are determined by the processor implementation
• We will construct a datapath and a control unit for 2 different processor implementations of the 'core' instructions
  – Memory reference: lw/sw
  – Arithmetic: add/sub/and/or/slt
  – Control: beq/j
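
The three performance factors named on this slide multiply together to give execution time. A minimal sketch, assuming made-up workload numbers (the function name `cpu_time` and all values are illustrative, not from the slides):

```python
def cpu_time(instruction_count, cpi, clock_cycle_time):
    """Execution time in seconds = instructions * cycles/instruction * seconds/cycle."""
    return instruction_count * cpi * clock_cycle_time

# Hypothetical numbers: a single-cycle design has CPI = 1 but a long cycle,
# a multi-cycle design has a higher CPI but a shorter cycle.
single_cycle = cpu_time(1_000_000, 1.0, 8e-9)
multi_cycle = cpu_time(1_000_000, 4.0, 2e-9)
print(single_cycle, multi_cycle)
```

The instruction count is fixed by the program and compiler; the implementation only trades CPI against clock cycle time.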

Slide 2: Implementation Overview
• Consider a core subset of MIPS instructions:
  – Integer arithmetic-logical instructions
  – Memory-reference instructions
  – Branch instructions
• The good news is that much is similar across different instructions
• For every instruction
  – Send the PC to memory to fetch the instruction at that address
  – Read one or two registers, using instruction fields to choose the registers

Slide 3: Differing Instructions
• After the previous 2 steps, instructions diverge
• All instruction classes (apart from jump) use the ALU next
  – Arith-log: for executing the operation
  – Mem-ref: for effective address calculation
  – Branches: for comparison
• After using the ALU
  – Arith-log: write data from the ALU to a register
  – Mem-ref: access memory to complete a store or to retrieve the word being loaded
  – Branch: may need to change the next instruction address based on the comparison

Slide 4: High-level view
• Two types of functional units:
  – elements that operate on data values (combinational)
  – elements that contain state (sequential)

Slide 5: Clocking methodology
• Defines when signals can be read and when they can be written
• Assume an edge-triggered clock
  – The clock signal alternates between high and low
  – Clock period: time for one full cycle

Slide 6: MIPS subset implementation
• Develop 2 implementations
  – Single long clock cycle for each instruction (simple)
  – Multiple clock cycles per instruction (complex)
• Input / Output
  – Nearly all elements have 32-bit-wide inputs/outputs
  – Buses: signals wider than 1 bit (drawn as thick lines)
  – Control signals vs data signals (notation: control in colour)

Slide 7: Building Blocks
1. Instruction Memory: a place to store program instructions
2. Program Counter (PC): holds the address of the current instruction
3. Adder: increments the PC to the address of the next instruction

Slide 8: The common bit
• Instruction execution
  1. Fetch instruction from memory
  2. Increment PC to next instruction (PC += 4)
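
The fetch step common to every instruction can be sketched as follows. This is only an illustration: instruction memory is modelled as a Python dict keyed by byte address, and the two instruction words are encodings of `add $1,$2,$3` and `lw $1,8($2)` chosen for the example.

```python
def fetch(instruction_memory, pc):
    """Return (instruction word at pc, incremented PC)."""
    instruction = instruction_memory[pc]
    return instruction, pc + 4  # every MIPS instruction is 4 bytes long

# Hypothetical program: add $1,$2,$3 followed by lw $1,8($2)
imem = {0: 0x00430820, 4: 0x8C410008}
instr, pc = fetch(imem, 0)
print(hex(instr), pc)
```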

Slide 9: R-format
• add, sub, slt, and, or
  – E.g. add $1,$2,$3 ($1 = $2 + $3)
• Need a fourth element: Register file
  – Contains the register state of the machine
  – A register can be read or written by specifying its number
  – 2 read 'ports' and 1 write 'port'
  – 32 registers => 5-bit register number
• Fifth element: ALU
  – 3-bit operation signal

Slide 10: R-type elements (figure only)

Slide 11: R-format execution
• Only two elements required (register file and ALU)
  – Read 2 registers
  – Perform the ALU operation on the contents of the registers
  – Write the result
[Figure: register file (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2, RegWrite) connected to the ALU (3-bit ALU operation, Zero, ALU result)]

Slide 12: Load and Store Operations
• lw $1, offset_value($2)
• sw $1, offset_value($2)
  – Address found by adding the offset to the contents of $2
• Besides the previous elements, we need:
• Sixth element: Data Memory Unit
  – State element with inputs (read address, write address, write data) and a 'read data' output
• Seventh element: Sign Extension Unit
  – Memory addresses are all 32 bits, so the 'offset' is extended from 16 to 32 bits

Slide 13: Sign extension
• Consider the 16-bit version of 2
  – 0000 0000 0000 0010
• Sign extend by copying the most significant bit into the upper 16 bits of the new 32-bit word
  – 0000 0000 0000 0000 0000 0000 0000 0010
• Consider the 16-bit version of -2 (negate 2: 1->0, 0->1 and add 1)
  – 1111 1111 1111 1110
  – -> 1111 1111 1111 1111 1111 1111 1111 1110
• One of the 'magic' reasons for using 2's complement
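
The sign-extension rule can be checked against the two bit patterns on this slide. A minimal sketch (the function name `sign_extend16` is an illustrative choice; values are held as unsigned Python integers):

```python
def sign_extend16(value):
    """Extend a 16-bit two's complement value to 32 bits by copying bit 15 upward."""
    value &= 0xFFFF                 # keep only the low 16 bits
    if value & 0x8000:              # most significant bit set: negative number
        value |= 0xFFFF0000         # fill the upper 16 bits with ones
    return value

print(format(sign_extend16(0b0000000000000010), "032b"))
print(format(sign_extend16(0b1111111111111110), "032b"))
```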

Slide 14: Sixth and Seventh logic elements (figure only)

Slide 15: Executing load and store
• Address in memory is sign-extend(offset) + contents of $2
• Store: the value from $1 is put in this location
• Load: the value from this location is put into $1
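
Load and store share the same address calculation and differ only in the direction of the transfer. A rough sketch, assuming a register file modelled as a list of 32 integers and data memory as a dict keyed by byte address (all helper names here are illustrative):

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    value &= 0xFFFF
    return value | 0xFFFF0000 if value & 0x8000 else value

def effective_address(base, offset16):
    """Base register contents plus the sign-extended 16-bit offset, wrapped to 32 bits."""
    return (base + sign_extend16(offset16)) & MASK32

def store_word(regs, dmem, rt, rs, offset16):
    dmem[effective_address(regs[rs], offset16)] = regs[rt]

def load_word(regs, dmem, rt, rs, offset16):
    # Unwritten locations read as 0 in this toy model.
    regs[rt] = dmem.get(effective_address(regs[rs], offset16), 0)

regs, dmem = [0] * 32, {}
regs[2], regs[1] = 100, 7
store_word(regs, dmem, 1, 2, 8)   # sw $1, 8($2)
regs[1] = 0
load_word(regs, dmem, 1, 2, 8)    # lw $1, 8($2)
print(dmem, regs[1])
```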

Slide 16: Branch instruction
• beq $1, $2, offset
• Need to compare the contents of $1 and $2
• If they are equal, we need to calculate a new value for the PC using the offset
• The offset is relative to the branch instruction
  – So we need to add it to the current PC (which the fetch step has already incremented to PC + 4)
• The offset is a word offset, not a byte offset!

Slide 17: Word offset
• If the offset were a byte offset, its last two bits would always be '00', as instructions take 4 bytes of memory:
  – 0, 4, 8, 12, 16, 20 etc.
  – 00000, 00100, 01000, 01100, 10000, 10100 etc.
• This is wasteful
• By using a word offset, the range is extended by a factor of four

Slide 18: Executing branch
• If ($1 == $2) PC = PC + (offset << 2)
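
The branch rule can be sketched as below. One assumption is made explicit: as in standard MIPS, the PC used in the addition is the already incremented value (address of the branch plus 4). The function names are illustrative.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    value &= 0xFFFF
    return value | 0xFFFF0000 if value & 0x8000 else value

def next_pc_beq(pc, rs_value, rt_value, offset16):
    """Next PC for beq: fall through to PC + 4, or branch by a word offset."""
    pc_plus_4 = (pc + 4) & MASK32
    if rs_value == rt_value:
        # Word offset -> byte offset by shifting left two bits, then wrap to 32 bits.
        return (pc_plus_4 + (sign_extend16(offset16) << 2)) & MASK32
    return pc_plus_4

print(next_pc_beq(100, 5, 5, 3))       # taken, forward by 3 words
print(next_pc_beq(100, 5, 5, 0xFFFF))  # taken, backward by 1 word
print(next_pc_beq(100, 5, 6, 3))       # not taken
```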

Slide 19: Putting it all together - a simple implementation
• We know what elements we need, but we also need control (the mysterious orange lines)
• If creating a single datapath
  – Execute everything in one cycle
  – No datapath resource can be used more than once per instruction (so some elements are duplicated)
• Elements common to different instructions can be shared, which implies a multiplexor
  – Selector for multiple inputs to the same element port
[Figure: MUX symbol with labels A, B, S, C]
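
A two-input multiplexor is just a selector. A minimal sketch, where the convention that select = 0 picks the first input is an assumption made for illustration:

```python
def mux2(input0, input1, select):
    """Two-way multiplexor: pass input0 when select is 0, input1 when select is 1."""
    return input1 if select else input0

# Example: choosing the second ALU operand.
read_data_2 = 7          # value read from the register file
extended_offset = 8      # sign-extended offset field of the instruction
print(mux2(read_data_2, extended_offset, 0))  # arithmetic-logical: register value
print(mux2(read_data_2, extended_offset, 1))  # load/store: offset
```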

Slide 20: Combined path
• Key differences between arith-log and mem-ref:
  – Second ALU input
  – Result register input

Slide 21: Adding branch path
• Use an adder to compute the target address
• Another Mux for the PC

Slide 22: Control - the ALU
• 5 of the 8 possible options are used
• Need to generate the 3-bit input code to the ALU for each instruction type
• 3 types of code implies a 2-bit control (ALUOp)
ALU input   Function
000         AND
001         OR
010         Add
110         Subtract
111         SLT
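
The five ALU operations in the table can be modelled directly. This is a sketch only: operands are treated as unsigned 32-bit values, SLT compares them as signed two's complement numbers, and the function returns the result together with the Zero output.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x

def alu(control, a, b):
    """Return (result, zero) for the 3-bit control codes in the table above."""
    a &= MASK32
    b &= MASK32
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control == 0b010:
        result = (a + b) & MASK32
    elif control == 0b110:
        result = (a - b) & MASK32
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unused ALU control code")
    return result, result == 0

print(alu(0b010, 2, 3))
print(alu(0b110, 5, 5))
```

The Zero output of a subtraction is what beq uses for its equality test.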

Slide 23: ALU control for instruction types (figure only)
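
The content of this slide is not captured in the transcript. As a stand-in, here is a sketch of ALU control under the usual textbook convention, which is an assumption here: ALUOp 00 (lw/sw) forces add, 01 (beq) forces subtract, and 10 (R-type) decodes the funct field. The funct values are the standard MIPS encodings.

```python
# funct field (bits 5-0) -> 3-bit ALU control for the R-type subset
FUNCT_TO_ALU = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}

def alu_control(alu_op, funct=0):
    """Map the 2-bit ALUOp (plus funct for R-type) to the 3-bit ALU input code."""
    if alu_op == 0b00:      # lw / sw: address calculation
        return 0b010
    if alu_op == 0b01:      # beq: compare by subtracting
        return 0b110
    if alu_op == 0b10:      # R-type: operation chosen by funct
        return FUNCT_TO_ALU[funct]
    raise ValueError("unused ALUOp value")

print(format(alu_control(0b10, 0b101010), "03b"))
```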

Slide 24: Main control
• ALU control is relatively easy (not temporal)
  – PLA / simple custom controller
• To define the rest of the control circuit
  – Identify control lines and instruction components
• Before we do that, we need to look at the instruction types to understand the data bus requirements

Slide 25: Instruction analysis
Type  31-26  25-21  20-16  15-11  10-6   5-0
R     op     rs     rt     rd     shamt  funct
LS    op     rs     rt     address (bits 15-0)
B     op     rs     rt     address (bits 15-0)
• For LS, rs is the base register and address is the offset
• Target register*: rd for R, rt for LS
* This implies a Mux
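
The fields in these formats can be pulled out of a 32-bit instruction word with shifts and masks. A minimal sketch (the function name `decode` and the dict layout are illustrative); every field is extracted regardless of type, which mirrors how the hardware simply wires the bits out:

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its bit fields."""
    return {
        "op":      (instr >> 26) & 0x3F,   # bits 31-26
        "rs":      (instr >> 21) & 0x1F,   # bits 25-21
        "rt":      (instr >> 16) & 0x1F,   # bits 20-16
        "rd":      (instr >> 11) & 0x1F,   # bits 15-11 (R-type only)
        "shamt":   (instr >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct":   instr & 0x3F,           # bits 5-0   (R-type only)
        "address": instr & 0xFFFF,         # bits 15-0  (load/store and branch)
    }

print(decode(0x00430820))  # add $1,$2,$3
print(decode(0x8C410008))  # lw $1,8($2)
```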

Slide 26: What does that look like? (figure only)

Slide 27: What do the orange bits do?
• RegDest
  – Source of the destination register for the operation
• RegWrite
  – Enables writing a register in the register file
• ALUSrc
  – Source of the second ALU operand; can be a register or part of the instruction
• PCSrc
  – Source of the PC (increment [PC + 4] or branch)
• MemRead / MemWrite
  – Reading from / writing to memory
• MemtoReg
  – Source of the data written to the register (ALU result or memory)

Slide 28: Building the control unit
• All but one of the 7 lines can be set using the opcode bits
  – PCSrc is determined by the output from the ALU as well as the opcode (need an AND gate)
• Besides these 7, there are 2 for the ALUOp
• To set these, all we need are the 6 bits that determine the opcode

Slide 29: Bunch up - inserting the control unit (figure only)

Slide 30: Truth table (table only)
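
The table itself is not captured in the transcript. The sketch below uses the standard textbook settings for this MIPS subset, which may differ in detail from the slide; `None` marks a don't-care. The signal name `Branch` (the opcode's contribution to PCSrc) and the convention that RegDest = 1 selects rd are assumptions.

```python
# opcode -> control line settings (opcodes: R-type 0, lw 35, sw 43, beq 4)
CONTROL = {
    0:  dict(RegDest=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
             MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    35: dict(RegDest=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
             MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    43: dict(RegDest=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    4:  dict(RegDest=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
             MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}

def main_control(opcode):
    """Look up the control lines for a 6-bit opcode."""
    return CONTROL[opcode]

def pc_src(branch, zero):
    """PCSrc is the AND of the opcode-derived Branch line and the ALU Zero output."""
    return bool(branch and zero)

print(main_control(35))
print(pc_src(main_control(4)["Branch"], True))
```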

Slide 31: Sample R-type execution
• Instruction fetched and PC incremented
• $2 and $3 are read from the register file
• ALU operates on the data
• The result from the ALU is written to the register file
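
The four steps can be traced in a small simulation. This is a toy model, not the hardware: memory is a dict, the register file is a list, and the example assumes the instruction is `add $1,$2,$3` (encoded as 0x00430820) with made-up register contents.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x):
    return x - (1 << 32) if x & 0x80000000 else x

# funct code -> operation, for the R-type subset
R_TYPE_OPS = {
    0x20: lambda a, b: a + b,                              # add
    0x22: lambda a, b: a - b,                              # sub
    0x24: lambda a, b: a & b,                              # and
    0x25: lambda a, b: a | b,                              # or
    0x2A: lambda a, b: int(to_signed(a) < to_signed(b)),   # slt
}

def step_r_type(imem, regs, pc):
    """Execute one R-type instruction and return the next PC."""
    instr = imem[pc]                       # 1. fetch the instruction ...
    next_pc = pc + 4                       #    ... and increment the PC
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    rd = (instr >> 11) & 0x1F
    funct = instr & 0x3F
    a, b = regs[rs], regs[rt]              # 2. read two registers
    result = R_TYPE_OPS[funct](a, b) & MASK32   # 3. ALU operates on the data
    regs[rd] = result                      # 4. write the result back
    return next_pc

imem = {0: 0x00430820}                     # add $1,$2,$3
regs = [0] * 32
regs[2], regs[3] = 5, 7
pc = step_r_type(imem, regs, 0)
print(pc, regs[1])
```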

