Presentation is loading. Please wait.

Presentation is loading. Please wait.

EEM 486 EEM 486: Computer Architecture Lecture 4 Designing a Multicycle Processor.

Similar presentations


Presentation on theme: "EEM 486 EEM 486: Computer Architecture Lecture 4 Designing a Multicycle Processor."— Presentation transcript:

1 EEM 486 EEM 486: Computer Architecture Lecture 4 Designing a Multicycle Processor

2 Lec 4.2 The Big Picture  Designing a Multiple Clock Cycle Datapath Control Datapath Memory Processor Input Output

3 Lec 4.3 Single-Cycle Processor In our single-cycle processor, each instruction is realized by exactly one control command or microinstruction Control Logic / Store (PLA, ROM) OPcode Datapath Instruction Decode Conditions Control Points microinstruction

4 Lec 4.4 Abstract View of Single Cycle-Processor PC Next PC Register Fetch ALU Reg. Wrt Mem Access Data Mem Instruction Fetch ALUctr RegDst ALUSrc ExtOp MemWr Equal nPC_sel RegWr MemWr MemRd Main Control ALU control op fun Ext

5 Lec 4.5 What’s Wrong with CPI=1 Processor?  Long Cycle Time  All instructions take as much time as the slowest  Real memory is not as nice as our idealized memory Cannot always get the job done in one (short) cycle PCInst Memory mux ALUData Mem mux PCReg FileInst Memory mux ALU mux PC Inst Memory mux ALUData Mem PCInst Memorycmp mux Reg File Arithmetic & Logical Load Store Branch Critical Path setup

6 Lec 4.6 Memory Access Time  Physics => fast memories are small (large memories are slow)  => Use a hierarchy of memories Storage Array selected word line address storage cell bit line sense amps address decoder Cache Processor 1 time-period proc. bus L2 Cache mem. bus 2-3 time-periods time-periods memory

7 Lec 4.7  Break up the instructions into steps: Let each step take one “smaller” clock cycle - Balance the amount of work to be done - Restrict each cycle to use only one major functional unit Major functional units: Memory, Register File, and ALU Let different instructions take different numbers of cycles  Use a functional unit more than once within execution of one instruction (Less hardware) A single memory unit for both instructions and data A single ALU, rather than an ALU and two adders  At the end of a cycle store values for use in later cycles introduce additional “internal” registers Multicycle Approach

8 Lec 4.8 Partitioning the CPI=1 Datapath  Add registers between smallest steps PC Next PC Operand Fetch Exec Reg. File Mem Access Data Mem Instruction Fetch ALUct r RegDst ALUSrc ExtOp MemWr nPC_sel RegWr MemWr MemRd Equal Instruction fetch Decode and Operand fetch Execution Memory access Write back

9 Lec 4.9 Recall: Step-by-step Processor Design Step 1: ISA => Logical Register Transfers Step 2: Components of the Datapath Step 3: RTL + Components => Datapath Step 4: Datapath + Logical RTs => Physical RTs Step 5: Physical RTs => Control

10 Lec 4.10 Step 4: R-type (add, sub,...) inst Logical Register Transfers ADDUR[rd]<–R[rs] + R[rt]; PC <– PC + 4 Step 1.Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2.Instruction Decode and Register Fetch A ← R[rs], B ← R[rt] Step 3.Execution ALUOut ← A op B Step 4.Write-back R[rd] ← ALUOut

11 Lec 4.11 R-type - Fetch 4 ALU Instruction register Address MemData Memory MemRead=1 IRWrite=1 ALUctr=Add nPCWrite=1 PC Write Data

12 Lec 4.12 R-type – Decode/Register Fetch PC A B 4 ALU Rs Rw Rt Registers Write data Read data 1 Read data 2 Instruction [25-21] Instruction [20-16] Instruction [15-11] Instruction register Address MemData Write data Memory MemRead=0 IRWrite=0 RegWrite=0 ALUctr=x nPCWrite=0

13 Lec 4.13 R-type - Execution PC A B ALU Out Rs Rw Rt Registers Write data Read data 1 Read data 2 Instruction [25-21] Instruction [20-16] Instruction [15-11] Instruction register Address MemData Write data Memory MemRead=0 IRWrite=0 RegWrite=0 ALUSrcA=1 ALUSrcB=0 ALUctr= Func nPCWrite=0

14 Lec 4.14 R-type – Write Back PC A B ALU Out Rs Rw Rt Registers Write data Read data 1 Read data 2 Instruction [25-21] Instruction [20-16] Instruction [15-11] Instruction register Address MemData Write data Memory MemRead=0 IRWrite=0 RegWrite=1 ALUSrcA=x ALUSrcB=x ALUctr=x nPCWrite=0

15 Lec 4.15 Step 4: Logical immed inst Logical Register Transfers ORI R[rt] <– R[rs] OR ZExt(Im16); PC <– PC + 4 Step 1.Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2.Instruction Decode and Register Fetch A ← R[rs] Step 3.Execution ALUOut ← A OR ZExt(Im16) Step 4.Write-back R[rt] ← ALUOut

16 Lec 4.16 Logical immediate - Execution PC Inst [15-11] A B ALU Out Rs Rw Rt Registers Write data Read data 1 Read data 2 Instruction [25-21] Instruction [20-16] Instruction [15-0] Instruction register Address MemData Write data Memory MemRead=0 IRWrite=0 RegWrite=0 ALUSrcA=1 ALUSrcB=2 ALUctr=Or nPCWrite=0 2 Zero extend 16 32

17 Lec 4.17 Logical immediate – Write Back

18 Lec 4.18 Step 4 : Load inst Logical Register Transfers LWR[rt] <– MEM[R[rs] + SExt(Im16)]; PC <– PC + 4 Step 1.Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2.Instruction Decode and Register Fetch A ← R[rs] Step 3.Memory address computation ALUOut ← A + SExt(Im16) Step 4.Memory access MDR ← Memory[ALUOut] Step 5.Load completion R[rt] ← MDR

19 Lec 4.19 Load: Address Calculation PC Inst [15-11] A B ALU Out Rs Rw Rt Registers Write data Read data 1 Read data 2 Instruction [25-21] Instruction [20-16] Instruction [15-0] Instruction register Address MemData Write data Memory MemRead=0 IRWrite=0 RegWrite=0 ALUSrcA=1 ALUSrcB=2 ALUctr=Add nPCWrite=0 2 Zero/ Sign extend 0 1 RegDst=x ExtOp=Sign

20 Lec 4.20 Load: Memory Read PC Inst [15-11] A B ALU Out Rs Rw Rt Registers Write data Read data 1 Read data 2 Instruction [31-26] Instruction [25-21] Instruction [20-16] Instruction [15-0] Instruction register Address MemData Write data Memory MemRead=1 IRWrite=0 RegWrite=0 ALUSrcA=x ALUSrcB=x ALUctr=x nPCWrite=0 2 Extender 0 1 RegDst=x MDR IorD=1 ExtOp=x

21 Lec 4.21 Load: Write Back

22 Lec 4.22 Step 4 : Store inst Logical Register Transfers SWMEM[R[rs] + SExt(Im16)] <– R[rt]; PC <– PC + 4 Step 1.Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2.Instruction Decode and Register Fetch A ← R[rs], B ← R[rt] Step 3.Memory address computation ALUOut ← A + SExt(Im16) Step 4.Memory access Memory[ALUOut] ← B

23 Lec 4.23 Store: Address Calculation

24 Lec 4.24 Store: Memory Write

25 Lec 4.25 Step 4 : Branch inst Logical Register Transfers BEQif R[rs] == R[rt] then PC <= PC SExt(Im16) || 00 else PC <= PC + 4 Step 1.Instruction Fetch IR ← MEM[PC], PC ← PC + 4 Step 2.Instruction Decode and Register Fetch A ← R[rs], B ← R[rt] ALUOut ← PC + SExt(Im16) || 00 Step 3.Branch completion If A = B, PC ← ALUOut

26 Lec 4.26 Branch – Address Calculation

27 Lec 4.27 Branch:Execution

28 Lec 4.28 Multicycle Processor

29 Lec 4.29 Summary of Instruction Steps

30 Lec 4.30  How many cycles will it take to execute this code? lw $t2, 0($t3) lw $t3, 4($t3) beq $t2, $t3, Label #assume not add $t5, $t2, $t3 sw $t5, 8($t3) Label:...  What is going on during the 8th cycle of execution?  In what cycle does the actual addition of $t2 and $t3 takes place? Simple Questions

31 Lec 4.31 Finite State Machine (FSM) Controller  State specifies control points for Register Transfer  Transfer occurs upon exiting state (same falling edge) Control State Next State Logic Output Logic inputs (conditions) outputs (control points)

32 Lec 4.32 FSM for Control PCWrite PCWriteCond IorD MemtoReg PCSource ALUOp ALUSrcB ALUSrcA RegWrite RegDst NS3 NS2 NS1 NS0 O p 5 O p 4 O p 3 O p 2 O p 1 O p 0 S 3 S 2 S 1 S 0 State register IRWrite MemRead MemWrite Instruction register opcode field Outputs Control logic Inputs

33 Lec 4.33 Step 4  Control Specification IR <= MEM[PC] PC <= PC + 4 R-type A <= R[rs], B <= R[rt] S <= PC + SX || 00 S <= A fun B R[rd] <= S S <= A or ZX R[rt] <= S ORi S <= A + SX R[rt] <= M M<=MEM[S] LW MEM[S] <= B BEQ PC <= Next(PC,Equal ) SW instruction fetch decode / operand fetch execute memory write-back

34 Lec 4.34 Step 5  (datapath + state diagram  control)  Translate RTs into control points  Assign states  Then go build the controller

35 Lec 4.35 Mapping RTs to Control Points PCSource= 0 PCWrite IorD= 0 MemRead IRWrite ALUSrcA= 0 ALUSrcB= 01 ALUOp= 000 ALUSrcA= 0 ALUSrcB= 11 ALUOp= 000 ExtOp= 1 Instruction fetch Instruction decode / register fetch ALUSrcA= 1 ALUSrcB= 00 ALUOp= 100 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 010 ExtOp= 0 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 001 PCSource= 1 PCWriteCond RegDst= 1 RegWrite MemtoReg= 0 RegDst= 0 RegWrite MemtoReg= 0 IorD= 1 MemRead IorD= 1 MemWrite RegDst= 0 RegWrite MemtoReg= 1 R-type ORi LW / SW LW SW Branch

36 Lec 4.36 Assigning States IR <= MEM[PC] PC <= PC + 4 R-type A <= R[rs], B <= R[rt] S <= PC + SX || 00 S <= A fun B R[rd] <= S S <= A or ZX R[rt] <= S ORi S <= A + SX R[rt] <= M M<=MEM[S] LW MEM[S] <= B BEQ PC <= Next(PC,Equal ) SW

37 Lec 4.37 OutputCurrent State PCSourcestate3 PCWritestate0 PCWriteCondstate3 IorDstate9 + state11 MemReadstate0 + state9 MemWritestate11 IRWritestate0 RegDststate4 MemtoRegstate10 RegWrite state5 + state7 + state10 ALUSrcA state3+ state4 + state6 + state8 ALUSrcB1 state1 + state6 + state8 ALUSrcB0state0 + state1 ExtOpstate1 + state8 ALUOp2state4 ALUOp1state6 ALUOp0state3 Control Logic – Datapath Control Outputs PCSource= 0 PCWrite IorD= 0 MemRead IRWrite ALUSrcA= 0 ALUSrcB= 01 ALUOp= 000 ALUSrcA= 0 ALUSrcB= 11 ALUOp= 000 ExtOp= 1 Instruction fetch Instruction decode / register fetch ALUSrcA= 1 ALUSrcB= 00 ALUOp= 100 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 010 ExtOp= 0 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 001 PCSource= 1 PCWriteCond RegDst= 1 RegWrite MemtoReg= 0 RegDst= 0 RegWrite MemtoReg= 0 IorD= 1 MemRead IorD= 1 MemWrite RegDst= 0 RegWrite MemtoReg= 1 R-type ORi LW / SW LW SW Branch

38 Lec 4.38 Control Logic – Next State Function Output Current State Opcode NextState0 state3 + state5+ state7+ state10+ state11 NextState1state0 NextState3state1BEQ NextState4state1R-type NextState5state4 NextState6state1ORi NextState7state6 NextState8state1 LW + SW NextState9state8LW NextState10state9 NextState11state5SW PCSource= 0 PCWrite IorD= 0 MemRead IRWrite ALUSrcA= 0 ALUSrcB= 01 ALUOp= 000 ALUSrcA= 0 ALUSrcB= 11 ALUOp= 000 ExtOp= 1 Instruction fetch Instruction decode / register fetch ALUSrcA= 1 ALUSrcB= 00 ALUOp= 100 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 010 ExtOp= 0 ALUSrcA= 1 ALUSrcB= 10 ALUOp= 000 ExtOp= 1 ALUSrcA= 1 ALUSrcB= 00 ALUOp= 001 PCSource= 1 PCWriteCond RegDst= 1 RegWrite MemtoReg= 0 RegDst= 0 RegWrite MemtoReg= 0 IorD= 1 MemRead IorD= 1 MemWrite RegDst= 0 RegWrite MemtoReg= 1 R-type ORi LW / SW LW SW Branch

39 Lec 4.39 PLA Implementation

40 Lec 4.40 Performance Evaluation  What is the average CPI? State diagram gives CPI for each instruction type Workload gives frequency of each type TypeCPI i for typeFrequency CPI i x freqI i Arith/Logic 4 40% 1.6 Load 5 30% 1.5 Store 4 10% 0.4 branch 3 20% 0.6 Average CPI: 4.1

41 Lec 4.41 Another Implementation Style

42 Lec 4.42 Address Select Logic

43 Lec 4.43 Address Select Logic Current State Address Control ActionAddrCtl 0 Use incremented state 3 1 Use dispatch ROM Use dispatch ROM Use incremented state 3 4 Replace state number by Use incremented state 3 7 Replace state number by PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA =1 ALUSrcB = 00 ALUOp= 10 RegDst = 1 RegWrite MemtoReg = 0 MemWrite IorD = 1 MemRead IorD = 1 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 RegDst=0 RegWrite MemtoReg=1 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 MemRead ALUSrcA = 0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSource = 00 Instruction fetch Instruction decode/ register fetch Jump completion Branch completionExecution Memory address computation Memory access Memory access R-type completion Write-back step ( O p = ' L W ' ) o r ( O p = ' S W ' ) ( O p = R - t y p e ) ( O p = ' B E Q ' ) ( O p = ' J ' ) ( O p = ' S W ' ) ( O p = ' L W ' ) Start

44 Lec 4.44 Dispatch ROMs Dispatch ROM 1 Dispatch ROM 2 OpOpcode nameValue OpOpcode nameValue R-format lw jmp sw beq lw sw 0010 PCWrite PCSource = 10 ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond PCSource = 01 ALUSrcA =1 ALUSrcB = 00 ALUOp= 10 RegDst = 1 RegWrite MemtoReg = 0 MemWrite IorD = 1 MemRead IorD = 1 ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00 RegDst=0 RegWrite MemtoReg=1 ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00 MemRead ALUSrcA = 0 IorD = 0 IRWrite ALUSrcB = 01 ALUOp = 00 PCWrite PCSource = 00 Instruction fetch Instruction decode/ register fetch Jump completion Branch completionExecution Memory address computation Memory access Memory access R-type completion Write-back step ( O p = ' L W ' ) o r ( O p = ' S W ' ) ( O p = R - t y p e ) ( O p = ' B E Q ' ) ( O p = ' J ' ) ( O p = ' S W ' ) ( O p = ' L W ' ) Start

45 Lec 4.45 Microprogramming: Designing the control as a program that implements the machine instructions in terms of microinstructions Microprogramming

46 Lec 4.46 Main Memory execution unit control memory CPU ADD SUB AND DATA User program plus Data this can change! AND microsequence e.g., Fetch Calc Operand Addr Fetch Operand(s) Calculate Save Answer(s) one of these is mapped into one of these Microinstruction ???

47 Lec 4.47 Microprogramming a Multicycle Processor 1) Choose datapath and sequencer architecture 2) Assign states and sequence of each (multicycle) instruction (i.e., define the controller FSM) 3) Choose microinstruction format (minimum bits to describe all allowable functions of sequencer and datapath) 4) Map instructions into microinstruction sequences

48 Lec 4.48 Designing a Microinstruction Set 1) Start with list of control signals 2) Group signals together that make sense: called “fields” 3) Place fields in some logical order (e.g., ALU operation & ALU operands first and microinstruction sequencing last) 4) Create a symbolic legend for the microinstruction format, showing name of field values and how they set the control signals 5) To minimize the width, encode operations that will never be used at the same time

49 Lec 4.49 Multicycle Processor

50 Lec 4.50 Microinstruction fields

51 Lec 4.51 Sequencer Dispatch ROM 1 OpOpcode nameValue R-format Rformat jmp JUMP beq BEQ lw Mem sw Mem1 Dispatch ROM 2 OpOpcode nameValue lw LW sw SW2

52 Lec 4.52 Microinstructions Label ALU control SRC1SRC2 Register control Memory PCWrite control Sequencing FetchAddPC4 Read PC ALUSeq AddPC Extshft ReadDispatch 1 Fetch and Decode: R-type instructions Label ALU control SRC1SRC2 Register control Memory PCWrite control Sequencing Rformat1 Func code ABSeq Write ALU Fetch

53 Lec 4.53 Microinstructions Label ALU control SRC1SRC2 Register control Memory PCWrite control Sequencing Mem1AddA Extend Dispatch 2 LW2 Read ALU Seq Write MDR Fetch SW2 Write ALU Fetch Memory-reference: Branch Label ALU control SRC1SRC2 Register control Memory PCWrite control Sequencing BEQ1SubtA B ALUOut - cond Fetch

54 Lec 4.54 Microprogram

55 Lec 4.55 Summary  Disadvantages of the Single Cycle Processor Long cycle time Cycle time is too long for all instructions except the Load  Multiple Cycle Processor: Divide the instructions into smaller steps Execute each step (instead of the entire instruction) in one cycle  Partition datapath into equal size chunks to minimize cycle time  Follow same 5-step method for designing “real” processor


Download ppt "EEM 486 EEM 486: Computer Architecture Lecture 4 Designing a Multicycle Processor."

Similar presentations


Ads by Google