Stages (Gary Marsden, University of Cape Town)


1 Stages

2 Load instruction - lw $1, offset($2)

3 Branch instruction - beq $1, $2, offset

4 Finalising control
• Actual op code

5 Final truth table

6 PLA implementation

7 Limitations of single cycle
• Clock cycle identical for every instruction
  – CPI = 1
• Bound by longest instruction (load word)
  – Instruction memory, register read, ALU, data memory, register write
• Not all instructions will take this long
  – Memory access: 8 ns
  – Register access: 2 ns
  – ALU: 4 ns

8 Instruction timing
Inst. class   Inst. mem   Reg read   ALU op   Data mem   Reg write   Total
R-type        200         50         100      0          50          400 ps
Load word     200         50         100      200        50          600 ps
Store word    200         50         100      200        0           550 ps
Branch        200         50         100      0          0           350 ps
Jump          200         0          0        0          0           200 ps
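Each total in the timing table is the sum of the latencies of the units that instruction class uses. A minimal Python sketch of that sum, using the table's figures (the dictionary and function names are illustrative and not from the slides):

```python
# Latency of each functional unit in picoseconds, taken from the timing table.
UNIT_PS = {
    "inst_mem": 200,
    "reg_read": 50,
    "alu": 100,
    "data_mem": 200,
    "reg_write": 50,
}

# Units each instruction class passes through (names are illustrative).
USES = {
    "R-type": ["inst_mem", "reg_read", "alu", "reg_write"],
    "Load word": ["inst_mem", "reg_read", "alu", "data_mem", "reg_write"],
    "Store word": ["inst_mem", "reg_read", "alu", "data_mem"],
    "Branch": ["inst_mem", "reg_read", "alu"],
    "Jump": ["inst_mem"],
}


def total_ps(inst_class):
    """Sum the latencies of the units used by one instruction class."""
    return sum(UNIT_PS[unit] for unit in USES[inst_class])


TOTALS = {inst_class: total_ps(inst_class) for inst_class in USES}

# A single-cycle clock must be long enough for the slowest class (load word).
SINGLE_CYCLE_CLOCK_PS = max(TOTALS.values())
```

The maximum over the totals is what fixes the single-cycle clock period, which is why load word bounds every other instruction.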

9 Variable timing
• If we looked at a typical instruction profile (25% loads, 10% stores, 45% R-type, 15% branches, 5% jumps), we could estimate how inefficient the fixed 600 ps clock is by computing the average cycle a variable-length clock would need:
• Average CPU clock cycle = 600 x 25% + 550 x 10% + 400 x 45% + 350 x 15% + 200 x 5%
• Average CPU clock cycle = 447.5 ps
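The weighted average on this slide can be checked with a few lines of Python. This is only a sketch: the `MIX` dictionary pairs each per-class time with the percentage it is multiplied by in the slide's formula, and the speedup figure is derived here rather than stated in the slides.

```python
# (time in ps, share of the instruction mix in percent), as used in the formula.
MIX = {
    "Load word": (600, 25),
    "Store word": (550, 10),
    "R-type": (400, 45),
    "Branch": (350, 15),
    "Jump": (200, 5),
}


def average_cycle_ps(mix):
    """Average cycle time if each instruction took only as long as it needs."""
    assert sum(percent for _, percent in mix.values()) == 100
    return sum(time * percent for time, percent in mix.values()) / 100


variable_ps = average_cycle_ps(MIX)            # average with a variable clock
fixed_ps = max(time for time, _ in MIX.values())  # fixed single-cycle clock
speedup = fixed_ps / variable_ps               # how much faster variable timing is
```

With these numbers the fixed clock is roughly a third slower than the variable-clock average, which is the inefficiency the slide is pointing at.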

10 Multicycle implementation
• Previously, each instruction was broken into a series of steps corresponding to the functional unit operations needed
• Can use these steps to create a multi-cycle implementation where each step in the execution takes one clock cycle
  – A unit can be used more than once (on different cycles)
  – Can help reduce the total amount of hardware required
• Trade-off with complex control

11 Differences
• Single instruction / data memory
• Single ALU
• Some extra registers for buffers (more later)

12 Implications
• Need to add more muxes and registers (cheap)
• New control signals
  – Write signal for each state element (PC, memory, register file, instruction register)
  – Read signal for memory
  – ALU control unit (as before)
• But we can ditch two adders and a memory unit

13 New Instruction Path

14 With Control Unit

15 Breaking into Clock Cycles
• Examine what happens in each clock cycle of each instruction to make sure we have enough elements (e.g. registers, control lines)
• Registers introduced when
  – Value computed in one cycle and used in another
  – Inputs to a block change before output can be written to a state element
    Mem -> ALU -> Mem

16 Goal of execution cycles
• Balance the amount of work done each cycle to minimize the cycle time
• In our case, we use 5 steps
• Each step limited to
  – At most one ALU op
  – One register access
  – One memory access
• Clock cycle will be the same as the longest of these

17 Instruction steps
1. Instruction fetch
2. Instruction decode and register fetch
3. Execution, mem address completion or branch completion
4. Memory access or R-type write back
5. Write back
• Using this information we can determine what control must do in each clock cycle
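Because instructions finish at different steps, the multi-cycle CPI depends on the instruction mix. The sketch below assumes the step counts implied by the list (loads use all five steps, stores and R-type finish in step 4, branches in step 3), assumes jumps also finish in step 3, and borrows the instruction mix from the variable-timing slide. None of these cycle counts is stated numerically in the slides.

```python
# Clock cycles per instruction class in the five-step scheme (assumed, see above).
CYCLES = {"Load word": 5, "Store word": 4, "R-type": 4, "Branch": 3, "Jump": 3}

# Instruction mix in percent, borrowed from the variable-timing slide.
MIX_PERCENT = {"Load word": 25, "Store word": 10, "R-type": 45, "Branch": 15, "Jump": 5}


def average_cpi(cycles, mix_percent):
    """Weighted average number of clock cycles per instruction."""
    assert sum(mix_percent.values()) == 100
    return sum(cycles[c] * mix_percent[c] for c in cycles) / 100


cpi = average_cpi(CYCLES, MIX_PERCENT)
```

For this mix the average works out to just over four cycles per instruction, compared with the worst case of five for a load.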

18 Control line effects

20 Instruction fetch
• Load instruction from memory
• IR = Memory[PC]
  – Set read address mux (IorD) = 0: select instruction
  – Set MemRead = 1
• Increment PC
• PC = PC + 4
  – Set ALUSrcA = 0: get operand from PC
  – Set ALUSrcB = 01: get operand '4'
  – Set ALUOp = 00: add
  – Allow storing new PC in PC register

21 Instruction decode and fetch
• Switch registers to the output of the register block
  – A = register[IR[25-21]] (rs)
  – B = register[IR[20-16]] (rt)
  – No signal setting required
• Calculate the branch target address
• ALUOut = PC + (sign-ext.(IR[15-0]) << 2)
  – Stored in the ALUOut register
  – Set ALUSrcA = 0: get operand from PC
  – Set ALUSrcB = 11: get operand from sign extension unit, shifted left by 2
  – Set ALUOp = 00: add
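The register reads and the speculative branch-target calculation in this step can be sketched in Python as follows. The function names are illustrative, `regs` is assumed to be a list of 32 register values, and `pc` is assumed to already hold PC + 4 from the fetch step.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_step(pc, ir, regs):
    """Return (A, B, ALUOut) as computed in the decode / register fetch step."""
    rs = (ir >> 21) & 0x1F            # IR[25-21]
    rt = (ir >> 16) & 0x1F            # IR[20-16]
    a = regs[rs]
    b = regs[rt]
    # Branch target: incremented PC plus the word offset converted to bytes.
    alu_out = (pc + (sign_extend16(ir) << 2)) & 0xFFFFFFFF
    return a, b, alu_out


# Usage: beq $1, $2, -1 fetched from address 0x100, so pc is already 0x104.
regs = [0] * 32
regs[1], regs[2] = 7, 7
beq = (0x04 << 26) | (1 << 21) | (2 << 16) | 0xFFFF
a, b, target = decode_step(0x104, beq, regs)
```

The target is computed for every instruction, whether or not it turns out to be a branch, because the ALU is otherwise idle in this cycle.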

23 Execution: memory access
• Step depends on the instruction
• Selection performed by interpretation of the op + function field of the instruction
• Calculate memory reference address
• ALUOut = A + sign-ext.(IR[15-0])
  – Set ALUSrcA = 1: get operand from A
  – Set ALUSrcB = 10: get operand from sign extension unit
  – Set ALUOp = 00: add

25 Execution II
• Arithmetic-logical instruction (R-type)
  – ALUOut = A op B
  – Set ALUSrcA = 1: get operand from A
  – Set ALUSrcB = 00: get operand from B
  – Set ALUOp = 10: operation code from the function field of IR
• Branch: if (A == B) PC = ALUOut
  – Set ALUSrcA = 1: get operand from A
  – Set ALUSrcB = 00: get operand from B
  – Set ALUOp = 01: subtraction
  – Write ALUOut to PC register

26 Mem access complete
• Memory access
  – ALU controls must remain stable
  – Set IorD = 1: address from ALU
• memory-data = memory[ALUOut] (load from memory)
  – Set MemRead = 1
• memory[ALUOut] = B (store to memory)
  – Set MemWrite = 1

28 R-type complete
• Arithmetic-logical instruction complete
• Register[IR[15-11]] = ALUOut
  – Set RegDst = 1: select write register
  – Set RegWrite = 1: allow write operation
  – Set MemToReg = 0: select ALU data
  – ALUOp, ALUSrcA, ALUSrcB = constant

29 Write-back
• Write data from memory to the register
  – Reg[IR[20-16]] = memory-data
  – Set RegDst = 0: select rt as target register
  – Set RegWrite = 1: allow write operation
  – Set MemToReg = 1: select memory data
  – ALUOp, ALUSrcA, ALUSrcB = constant
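Putting the five steps together, a load word can be traced as a sequence of register transfers. The following Python sketch mirrors the transfers named on the preceding slides. The function name, the dictionary-based memory (word-aligned byte address to 32-bit word) and the list of registers are assumptions made for illustration.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_lw(pc, regs, memory):
    """Execute one lw through the five steps; returns the updated PC."""
    # Step 1, instruction fetch: IR = Memory[PC]; PC = PC + 4
    ir = memory[pc]
    pc = pc + 4
    # Step 2, decode and register fetch: A = Reg[rs]; B = Reg[rt]
    rs = (ir >> 21) & 0x1F
    rt = (ir >> 16) & 0x1F
    a = regs[rs]
    b = regs[rt]                                  # read, but not used by lw
    alu_out = pc + (sign_extend16(ir) << 2)       # branch target, unused by lw
    # Step 3, memory address computation: ALUOut = A + sign-ext(IR[15-0])
    alu_out = a + sign_extend16(ir)
    # Step 4, memory access: memory-data = Memory[ALUOut]
    memory_data = memory[alu_out]
    # Step 5, write back: Reg[IR[20-16]] = memory-data
    regs[rt] = memory_data
    return pc


# Usage: lw $1, 8($2) at address 0, with $2 = 0x1000 and the word 1234 at 0x1008.
regs = [0] * 32
regs[2] = 0x1000
memory = {0: (0x23 << 26) | (2 << 21) | (1 << 16) | 8, 0x1008: 1234}
new_pc = run_lw(0, regs, memory)
```

Each commented step corresponds to one clock cycle, and the local variables `a`, `b`, `alu_out` and `memory_data` play the role of the extra buffer registers that carry values between cycles.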

30 Summary

31 Defining Control
• Single cycle path
  – Construct a truth table and map it to logic gates
• Multi-cycle
  – Tricky because of temporal aspect
  – Control must specify
    – Signal settings
    – Next step in execution
  – Two techniques
    – Finite state machines (usually graphically represented)
    – Microprogramming (code representation)

32 Finite State Machines
• Consists of
  – Set of states
  – Rules for moving between states
• Details
  – Each state has a set of asserted outputs
    – Those not explicitly asserted are de-asserted
  – States correspond to the 5 stages of execution
  – Each step takes one clock cycle
  – Initial two states are common

33 Overview

34 FSM for fetch

35 Complete diagram

36 FSM Implementation
• A register to hold current state
• A block of combinational logic to determine:
  – Datapath signals to be asserted
  – The next state
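A state register plus combinational next-state and output logic can be sketched in Python as below. This is an illustration under assumptions rather than the slides' own design: the state names are invented, the signal names IRWrite, PCWrite and PCWriteCond follow the usual textbook multi-cycle MIPS datapath, and multi-bit selects such as ALUSrcB and ALUOp are left out for brevity.

```python
# Standard MIPS opcodes for the subset covered here.
OP_RTYPE, OP_J, OP_BEQ, OP_LW, OP_SW = 0x00, 0x02, 0x04, 0x23, 0x2B

# Single-bit outputs asserted in each state; anything not listed is de-asserted.
ASSERTED = {
    "FETCH": {"MemRead", "IRWrite", "PCWrite"},
    "DECODE": set(),
    "MEM_ADDR": {"ALUSrcA"},
    "MEM_READ": {"MemRead", "IorD"},
    "MEM_WB": {"RegWrite", "MemToReg"},
    "MEM_WRITE": {"MemWrite", "IorD"},
    "EXECUTE": {"ALUSrcA"},
    "RTYPE_WB": {"RegDst", "RegWrite"},
    "BRANCH": {"ALUSrcA", "PCWriteCond"},
    "JUMP": {"PCWrite"},
}


def next_state(state, opcode):
    """Combinational next-state logic: depends on current state and opcode."""
    if state == "FETCH":
        return "DECODE"
    if state == "DECODE":
        return {OP_LW: "MEM_ADDR", OP_SW: "MEM_ADDR", OP_RTYPE: "EXECUTE",
                OP_BEQ: "BRANCH", OP_J: "JUMP"}[opcode]
    if state == "MEM_ADDR":
        return "MEM_READ" if opcode == OP_LW else "MEM_WRITE"
    if state == "MEM_READ":
        return "MEM_WB"
    if state == "EXECUTE":
        return "RTYPE_WB"
    return "FETCH"                    # every final state returns to fetch


def states_for(opcode):
    """List the states visited while executing one instruction."""
    state, trace = "FETCH", ["FETCH"]
    while True:
        state = next_state(state, opcode)   # models clocking the state register
        if state == "FETCH":
            return trace
        trace.append(state)
```

Calling `states_for(OP_LW)` walks through five states, while a branch needs only three, which matches the step counts of the multi-cycle scheme.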

37 Microprogramming
• Design the control as a program that implements the machine instructions in terms of simpler microinstructions
  – For our subset, FSMs are fine
  – For a full instruction set (>100 instructions), which vary from 1 to 20 cycles, more complexity is required (diagrams insufficient)
  – Use ideas from programming to create a simpler way to define control
  – Control instructions are referred to as microinstructions (as opposed to MIPS instructions)

38 More Microprogramming
• Each microinstruction defines 'the set of datapath control signals that must be asserted in a given state'
• 'Executing' a microinstruction has the effect of asserting the specified control lines
• Format
  – Symbolic representation of the control that is translated into control logic
  – Can choose the number of microinstruction fields and what control signals are affected by each field

39 Fields

40 Choices
• Format is chosen to simplify representation
  – Improving programmer comprehension
  – A lot better than pure binary to specify how a mux is set
• Besides the format of the instruction, we need to figure out the order of execution

41 Choosing next microinstruction
• Increment address of current microinstruction to get next microinstruction (Seq) - default
• Branch to the microinstruction that begins execution of the next MIPS instruction (Fetch)
• Choose next microinstruction based on control unit input (Dispatch)
  – Implemented via a lookup (dispatch) table containing addresses of target microinstructions
  – Often multiple tables
  – Kind of like a switch statement
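The three sequencing choices (Seq, Fetch, Dispatch) can be sketched as a tiny microsequencer in Python. The microprogram layout, labels and the two dispatch tables below are assumptions modelled on the usual textbook microprogram for this MIPS subset; the slides' own program is on a figure slide and is not reproduced here.

```python
# Each entry: (label, sequencing field). The list index is the microaddress.
MICROPROGRAM = [
    ("Fetch", "seq"),            # 0
    ("Decode", "dispatch1"),     # 1
    ("Mem1", "dispatch2"),       # 2
    ("LW2", "seq"),              # 3
    ("LW3", "fetch"),            # 4
    ("SW2", "fetch"),            # 5
    ("Rformat1", "seq"),         # 6
    ("Rformat2", "fetch"),       # 7
    ("BEQ1", "fetch"),           # 8
    ("JUMP1", "fetch"),          # 9
]

# Dispatch tables: opcode -> microaddress, much like a switch statement.
DISPATCH = {
    "dispatch1": {0x00: 6, 0x02: 9, 0x04: 8, 0x23: 2, 0x2B: 2},
    "dispatch2": {0x23: 3, 0x2B: 5},
}


def next_address(address, opcode):
    """Pick the next microinstruction address from the sequencing field."""
    _, sequencing = MICROPROGRAM[address]
    if sequencing == "seq":
        return address + 1       # default: fall through to the next one
    if sequencing == "fetch":
        return 0                 # start the next MIPS instruction
    return DISPATCH[sequencing][opcode]


def microtrace(opcode):
    """Labels of the microinstructions executed for one MIPS instruction."""
    address, labels = 0, []
    while True:
        labels.append(MICROPROGRAM[address][0])
        address = next_address(address, opcode)
        if address == 0:
            return labels
```

Using two dispatch tables keeps each lookup small: the first selects on instruction class after decode, the second separates loads from stores after the shared address calculation.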

42 Sample microinstruction

43 Full program

44 Finally - exceptions
• Hardest part of control: implementing exceptions and interrupts (events other than branches that change flow of execution)
• Interrupt
  – Unexpected change in flow of control generated by event outside processor (usually I/O device)
• Exception
  – Any unexpected change of flow control regardless of source
• Often, interrupt and exception are not distinguished

45 Exception Handling
• Samples include
  – Invocation of operating system from user
  – Arithmetic overflow
  – Undefined instruction
  – Hardware malfunction
• In our subset
  – Undefined instruction
  – Arithmetic overflow

46 Responding to an exception
• Save address of offending instruction in EPC (exception program counter)
• Transfer control to operating system with error handling code
• Return to original code (using EPC) and continue. The handler's action could be:
  – Providing service to the user program
  – Coping with overflow
  – Stopping execution to report an error

47 Extra info
• Operating system must know why the exception happened, not just where. Therefore could have either:
  – Cause register: a status register which holds a field indicating the reason for the exception
  – Vectored interrupts: pair of cause and address to which control is transferred

48 Implication
• Can perform exception handling by adding some control lines and some registers to the processor
  – EPC - 32 bit obviously (with EPCWrite control line)
  – Cause - 32 bit (with CauseWrite and IntCause control lines)
    – IntCause is 0 for undefined instruction and 1 for overflow
  – Also need to write PC - 4 to EPC
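The register updates performed when an exception is taken can be sketched in Python as follows. The dictionary-based processor state and the function name are illustrative, and the handler address is a hypothetical placeholder since the slides do not give one.

```python
# IntCause encodings as given on the slide.
CAUSE_UNDEFINED_INSTRUCTION = 0
CAUSE_ARITHMETIC_OVERFLOW = 1

# Hypothetical handler entry point; the slides do not specify an address.
HANDLER_ADDRESS = 0x80000180


def take_exception(state, int_cause, handler_address=HANDLER_ADDRESS):
    """Record where and why the exception happened, then jump to the handler."""
    # PC was already incremented during fetch, so the offending instruction
    # lives at PC - 4 (EPCWrite asserted).
    state["EPC"] = (state["PC"] - 4) & 0xFFFFFFFF
    # CauseWrite asserted, IntCause selects the value written.
    state["Cause"] = int_cause
    # Transfer control to the operating system's handler.
    state["PC"] = handler_address
    return state


# Usage: overflow detected while executing the instruction at 0x404.
cpu = {"PC": 0x408, "EPC": 0, "Cause": 0}
take_exception(cpu, CAUSE_ARITHMETIC_OVERFLOW)
```

After handling, the operating system can use the saved EPC to resume the program or to report which instruction failed.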

49 Gratuitous scary picture

50 Into Practice - Pentium Datapath
• Pentium based on complex (CISC) IA-32 instruction set
  – Some instructions take over 100 clock cycles!
  – Some only take 3 or 4 clock cycles
• Trick is to support the long instructions without impacting the common core of instructions
• Control works by
  – Using microcode for the control of long instructions
  – Hard-wired control for short instructions

51 Summary
• Single cycle path has low control overhead but needs a lot of resources and is slow
• Multi-cycle much more efficient (speed and resources) but has more complex control
• Can use FSM or microcode to specify control
  – FSM not good for large instruction sets
• Also need mechanism to handle interrupts and exceptions

