CSE 308 – Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Multicycle Implementation.

Slides:



Advertisements
Similar presentations
1 Datapath and Control (Multicycle datapath) CDA 3101 Discussion Section 11.
Advertisements

Microprocessor Design Multi-cycle Datapath Nia S. Bradley Vijay.
1 Chapter Five The Processor: Datapath and Control.
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Lecture 14 - Multi-Cycle.
The Processor Data Path & Control Chapter 5 Part 2 - Multi-Clock Cycle Design N. Guydosh 2/29/04.
Chapter 5 The Processor: Datapath and Control Basic MIPS Architecture Homework 2 due October 28 th. Project Designs due October 28 th. Project Reports.
EECE476 Lecture 9: Multi-cycle CPU Datapath Chapter 5: Section 5.5 The University of British ColumbiaEECE 476© 2005 Guy Lemieux.
CSE378 Multicycle impl,.1 Drawbacks of single cycle implementation All instructions take the same time although –some instructions are longer than others;
1 5.5 A Multicycle Implementation A single memory unit is used for both instructions and data. There is a single ALU, rather than an ALU and two adders.
©UCB CS 161Computer Architecture Chapter 5 Lecture 11 Instructor: L.N. Bhuyan Adapted from notes by Dave Patterson (http.cs.berkeley.edu/~patterson)
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
331 W10.1Spring :332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s.
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Multi-Cycle Processor.
1. 2 Multicycle Datapath  As an added bonus, we can eliminate some of the extra hardware from the single-cycle datapath. —We will restrict ourselves.
Dr. Iyad F. Jafar Basic MIPS Architecture: Multi-Cycle Datapath and Control.
Computer Architecture Chapter 5 Fall 2005 Department of Computer Science Kent State University.
Datapath and Control: MultiCycle Implementation. Performance of Single Cycle Machines °Assume following operation times: Memory units : 200 ps ALU and.
CPE232 Basic MIPS Architecture1 Computer Organization Multi-cycle Approach Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides
1 CS/COE0447 Computer Organization & Assembly Language Multi-Cycle Execution.
C HAPTER 5 T HE PROCESSOR : D ATAPATH AND C ONTROL M ULTICYCLE D ESIGN.
1 A single-cycle MIPS processor  An instruction set architecture is an interface that defines the hardware operations which are available to software.
1 Processor: Datapath and Control Single cycle processor –Datapath and Control Multicycle processor –Datapath and Control Microprogramming –Vertical and.
Multicycle Implementation
IT 251 Computer Organization and Architecture Multi Cycle CPU Datapath Chia-Chi Teng.
LECTURE 6 Multi-Cycle Datapath and Control. SINGLE-CYCLE IMPLEMENTATION As we’ve seen, single-cycle implementation, although easy to implement, could.
ECE-C355 Computer Structures Winter 2008 The MIPS Datapath Slides have been adapted from Prof. Mary Jane Irwin ( )
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
Design a MIPS Processor
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
RegDst 1: RegFile destination No. for the WR Reg. comes from rd field. 0: RegFile destination No. for the WR Reg. comes from rt field.
Chapter 4 From: Dr. Iyad F. Jafar Basic MIPS Architecture: Multi-Cycle Datapath and Control.
1 CS/COE0447 Computer Organization & Assembly Language Chapter 5 Part 3.
1 The final datapath. 2 Control  The control unit is responsible for setting all the control signals so that each instruction is executed properly. —The.
1 CS/COE0447 Computer Organization & Assembly Language Chapter 5 Part 3 In-Class Exercises.
Design a MIPS Processor (II)
Multi-Cycle Datapath and Control
Chapter 5: A Multi-Cycle CPU.
IT 251 Computer Organization and Architecture
Systems Architecture I
Multi-Cycle CPU.
Single Cycle Processor
D.4 Finite State Diagram for the Multi-cycle processor
Multi-Cycle CPU.
Basic MIPS Architecture
CS/COE0447 Computer Organization & Assembly Language
Multiple Cycle Implementation of MIPS-Lite CPU
Multicycle Approach Break up the instructions into steps
The Multicycle Implementation
CS/COE0447 Computer Organization & Assembly Language
Chapter Five The Processor: Datapath and Control
Drawbacks of single cycle implementation
The Multicycle Implementation
Systems Architecture I
Vishwani D. Agrawal James J. Danaher Professor
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
Processor: Multi-Cycle Datapath & Control
Computer Architecture Processor: Datapath
Multi-Cycle Datapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
Chapter Four The Processor: Datapath and Control
Control Unit for Multiple Cycle Implementation
5.5 A Multicycle Implementation
Systems Architecture I
Control Unit for Multiple Cycle Implementation
FloorPlan for Multicycle MIPS
Control Unit (single cycle implementation)
Alternative datapath (book): Multiple Cycle Datapath
The Processor: Datapath & Control.
CS161 – Design and Architecture of Computer Systems
Single Cycle Processor Design
Presentation transcript:

CSE 308 – Computer Architecture Prof. Muhamed Mudawar Computer Engineering Department King Fahd University of Petroleum and Minerals Multicycle Implementation

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 2 Drawbacks of Single Cycle Processor  Long cycle time  All instructions take as much time as the slowest  Functional units are duplicated raising cost  Each functional unit can be used once per clock cycle Instruction Fetch Store ALUMemory Write Instruction Fetch Arithmetic & Logical Register ReadALUReg Write Instruction Fetch Branch Load Memory Read Instruction Fetch longest delay ALURegister Read ALU Reg Write

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 3 Solution = Multicycle Implementation  Break instruction execution into five steps  Instruction fetch  Instruction decode and register read  Execution, memory address calculation, or branch completion  Memory access or ALU instruction completion  Load instruction completion  One step = One clock cycle (clock cycle is reduced)  First 2 steps are the same for all instructions Instruction # cyclesInstruction# cycles ALU4Branch3 Load5Store4

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 4 MIPS Multicycle Datapath ALU is used to increment upper 30 bits of PC, to compute branch target and load/store address, and to execute ALU instructions Registers are used to store values at the end of each clock cycle for use during next cycle Same memory is used for instructions and data 4 Imm16 32 ALUALU RA RB BusA BusB RW BusW Address Instruction or data Memory 30 PC Rs Extender 5 Rt muxmux 0 1 muxmux 0 1 mux 0 1 Rd 5 muxmux 0 1 zero Data_in MDR IR Registers A B muxmux ALUout 32 PC[31:28], Imm26 30 muxmux

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 5 Multicycle Datapath Changes  Eliminating some of the components  Single memory unit for both instructions and data  Single ALU eliminating branch address adder and PC adder Note: modern CPUs maintain separate instruction and data memories as well as separate address adders, but we reduce them here because the same component can be used for different purposes in different cycles  Adding temporary registers  Instruction Register: IR  Memory Data Register: MDR  Register file output data registers: A and B  ALU output register: ALUout  Required to store major unit output values for use in next cycle

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 6 Multicycle Datapath Changes – cont’d  This multicycle design can accommodate  One memory access per cycle  IR register saves fetched instruction  MDR register saves the read memory data  One register file access per cycle  Two registers can be read concurrently into A and B registers  One ALU operation per cycle  ALUout register saves the ALU output  Additional multiplexers are also needed  Mux before the memory address to select PC or ALUout address  Mux before 1 st ALU input to select PC to increment or A register  Extended mux before PC to increment PC, branch, or jump

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 7 Multicycle Datapath + Control Signals 4 Imm16 32 ALUALU RA RB BusA BusB RW BusW Address MemData Memory 30 PC Rs Extender Rt muxmux 0 1 muxmux 0 1 mux 0 1 Rd muxmux 0 1 zero Data_in MDR IR Registers A B muxmux ALUout 32 PC[31:28], Imm MemReadMemWrite ExtOp RegDst MemtoReg IorDIRWriteRegWriteALUSrcA ALUSrcB ALUCtrl PCWrite PCSource muxmux PCWrite and IRWrite to enable the writing of PC and IR registers IorD to select memory address as either PC for instruction or ALUout for data address PCSource to select PC input ALUSrcA and ALUSrcB to select ALU inputs More control signals than single-cycle CPU NextPC

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 8 SignalEffect when ‘0’Effect when ‘1’ RegDstDestination register = RtDestination register = Rd RegWriteNoneRegister(RW) ← BusW ExtOp16-bit immediate is zero-extended16-bit immediate is sign-extended ALUSrcA1 st ALU operand is PC (upper 30-bit)1 st ALU operand is the A register ALUSrcB2 nd ALU operand is the B register2 nd ALU input is extended-imm16 MemReadNoneMemData ← Memory[address] MemWriteNoneMemory[address] ← Data_in MemtoRegBusW = ALUoutBusW = MDR IorDMemory Address = PCMemory Address = ALUout IRWriteNoneIR ← MemData PCWriteNonePC ← NextPC Control Signals SignalValueEffect PCSource 00NextPC = PC[31:2] + 1 (increment upper 30 bits of PC) 01NextPC = ALUout = PC[31:2] sign-extend(imm16) (for branch) 10NextPC = PC[31:28], imm26 (for jump)

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 9 ALUALU PC 00 muxmux 0 1 muxmux Instruction Fetch Cycle 4 Imm16 32 RA RB BusA BusB RW BusW Rs Extender Rt muxmux 0 1 muxmux 0 1 Rd zero MDR Registers A B 32 ALUout 32 PC[31:28], Imm IorD = 0 MemRead = 1 MemWrite = 0 IRWrite = 1 RegWrite = 0 ALUSrcA = 0 ALUSrcB = x ALUCtrl = INC RegDst = x PCWrite = 1 PCSource = 0 mux 0 1 MemtoReg = x IR←Memory[PC] PC← PC + 4 muxmux 0 1 Address MemData Memory Data_in IR ExtOp = x IorD = 0 MemRead = 1 IRWrite = 1 ALUSrcA = 0,ALU = INC PCSource = 0,PCWrite = 1 MemWrite = 0, RegWrite = 0 Don’t care about rest 30 NextPC 1 st ALU input = 00, PC[31:2]

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide Decode and Register Fetch Address MemData Memory muxmux 0 1 Rd muxmux 0 1 zero Data_in MDR IR Imm16 32 ALUALU 30 Extender muxmux ALUout 32 muxmux 0 1 RA RB BusA BusB RW BusW Registers A B Rs Rt ALUSrcB = 1 ALUCtrl = ADD mux 0 1 MemtoReg = x A ← Reg[Rs], B ← Reg[Rt] A, B are written on every cycle ExtOp = 1 ALUout ← PC[31:2] + sign-ext(Im16) (branch address) ALUSrcA = 0, ALUSrcB = 1, ExtOp = 1, ALU = ADD MemRead = MemWrite = IRWrite = RegWrite = 0 PCSource = 10, PCWrite = J (Jump Completion)Compute branch address in advance During this cycle, the instruction is decoded to determine the control signals for the next cycles 1 st ALU input = 00, PC[31:2] PC[31:28], Imm muxmux NextPC 30 PC 00 IorD = x MemRead = 0 MemWrite = 0 IRWrite = 0 RegWrite = 0 ALUSrcA = 0 RegDst = x PCWrite = J PCSource = 10

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 11 Rs Rt A B 3a. Execute Cycle for R-type 4 Imm16 RA RB BusA BusB RW BusW Address MemData Memory 30 Extender muxmux 0 1 Rd muxmux 0 1 zero Data_in MDR IR Registers 32 PC[31:28], Imm IorD = x MemRead = 0 MemWrite = 0 IRWrite = 0 RegWrite = 0 ALUSrcA = 1 ALUSrcB = 0 ALUCtrl = funct RegDst = x PCWrite = 0 PCSource = x muxmux mux 0 1 MemtoReg = x NextPC For R-type ALU instructions: ALUout ← A funct B ALUCtrl depends on the function field ExtOp = x ALUSrcA = 1, ALUSrcB = 0, ALU = funct PCWrite = MemRead = MemWrite = IRWrite = RegWrite = 0 Don’t care about rest PC 00 During this cycle, the ALU can perform different operations depending on the instruction class ALUALU muxmux 0 1 muxmux 0 1 ALUout 32

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 12 Rs Rt 3b. Compute Address for Load/Store 4 Address MemData Memory 30 muxmux 0 1 Rd muxmux 0 1 zero Data_in MDR 32 PC[31:28], Imm IorD = x MemRead = 0 MemWrite = 0 IRWrite = 0 RegWrite = 0 ALUSrcA = 1 ALUSrcB = 1 ALUCtrl = ADD RegDst = x PCWrite = 0 PCSource = x muxmux mux 0 1 MemtoReg = x NextPC ALUout ← A + sign-extend(Immediate16) ExtOp = sign ALUSrcA = 1,ALUSrcB = 1 ExtOp = sign,ALU = ADD PCWrite = MemRead = MemWrite = IRWrite = RegWrite = 0 PC 00 For load/store instructions, ALU computes memory address during 3 rd cycle IR A B RA RB BusA BusB RW BusW Registers Imm16 Extender ALUALU 32 muxmux 0 1 muxmux 0 1 ALUout 32 Same control signals can be used with I-type ALU

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 13 ALUout Rs Rt A B 3c. Branch Completion 4 Imm16 RA RB BusA BusB RW BusW Address MemData Memory 30 Extender muxmux 0 1 Rd muxmux 0 1 Data_in MDR IR Registers 32 PC[31:28], Imm ALUSrcB = 0 ALUCtrl = SUB mux 0 1 MemtoReg = x if (branch) PC ← ALUout ALUout is branch target address computed during the second cycle ExtOp = x ALUSrcA = 1, ALUSrcB = 0,PCSource = 01 ALUCtrl = SUB,PCWrite = Branch MemRead = MemWrite = IRWrite = RegWrite = 0 For a branch, ALU compares A with B and if the branch is taken then PC becomes ALUout zero ALUALU 32 muxmux 0 1 muxmux 0 1 Branch depends on the zero condition Branch = beq. zero + bne. zero muxmux NextPC PC IorD = x MemRead = 0 MemWrite = 0 IRWrite = 0 RegWrite = 0 ALUSrcA = 1 RegDst = x PCWrite = Branch PCSource = 01

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide Rs A B 4a. ALU Instruction Completion 4 Imm16 Address MemData Memory 30 Extender muxmux 0 1 Data_in MDR IR 32 PC[31:28], Imm26 30 IorD = x MemRead = 0 MemWrite = 0 IRWrite = 0 RegWrite = 1 ALUSrcA = x ALUSrcB = x ALUCtrl = x RegDst = 1 or 0 PCWrite = 0 PCSource = x MemtoReg = 0 Reg[Rd]←ALUout (for R-type) Reg[Rt]←ALUout (for I-type ALU instruction) ExtOp = x RegDst = 1 (for R-type and 0 for I-type) MemtoReg = 0,RegWrite = 1 PCWrite = MemRead = MemWrite = IRWrite = 0, and don’t care about rest During 4 th cycle, ALU instruction completes writing its result into the destination register zero ALUALU 32 muxmux 0 1 muxmux muxmux NextPC PC ALUout mux RA RB BusA BusB RW BusW Registers Rt muxmux 0 1 Rd

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide Rs 4b. Memory Access for Load & Store 4 Imm16 30 Extender PC[31:28], Imm26 30 IorD = 1 MemRead = 1 MemWrite = 0 IRWrite = 0 RegWrite = 0 ALUSrcA = x ALUSrcB = x ALUCtrl = x RegDst = x PCWrite = 0 PCSource = x MemtoReg = x MDR ← Memory[ALUout](for load) Memory[ALUout] ← B(for store) ExtOp = x IorD = 1,MemRead = 1 (load) MemWrite = 1 (store) PCWrite = IRWrite = RegWrite = 0 Load & store access memory during 4 th cycle zero ALUALU 32 muxmux 0 1 muxmux muxmux NextPC PC ALUout mux 0 1 Rt muxmux 0 1 Rd IR Address MemData Memory Data_in MDR muxmux A B RA RB BusA BusB RW BusW Registers

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide Rs A B 5. Load Instruction Completion 4 Imm16 Address MemData Memory 30 Extender muxmux 0 1 Data_in MDR IR 32 PC[31:28], Imm26 30 IorD = x MemRead = 0 MemWrite = 0 IRWrite = 0 RegWrite = 1 ALUSrcA = x ALUSrcB = x ALUCtrl = x RegDst = 0 PCWrite = 0 PCSource = x MemtoReg = 1 Reg[Rt]←MDR ExtOp = x RegDst = 0 (Rt) MemtoReg = 1 RegWrite = 1 PCWrite = IRWrite = 0, MemRead = MemWrite = 0 Don’t care about rest During the 5 th cycle, the load instruction completes writing its result into register Rt zero ALUALU 32 muxmux 0 1 muxmux muxmux NextPC PC ALUout 32 Rt Rd mux 0 1 RA RB BusA BusB RW BusW Registers muxmux 0 1

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 17 Instruction Execution Summary CycleActionRegister Transfers 1Fetch instructionIR ← Memory[PC], PC ← PC Decode instruction Fetch registers Compute branch address in advance Jump completion (case of a jump) Generate control signals A ← Reg[Rs], B ← Reg[Rt] ALUout ← PC[31:2] + sign-extend(Imm16) PC ← PC[31:28], Im26 3 Case 1: Execute R-type ALU Case 2: Execute I-type ALU Case 3: Compute load/store address Case 4: Branch completion ALUout ← A funct B ALUout ← A op extend(Imm16) ALUout ← A + sign-extend(Imm16) if (Branch) PC ← ALUout 4 Case 1: Write ALU result for R-type Case 2: Write ALU result for I-type Case 3: Access memory for load Case 4: Access memory for store Reg[Rd] ← ALUout Reg[Rt] ← ALUout MDR ← Memory[ALUout] Memory[ALUout] ← B 5Load instruction completionReg[Rt] ← MDR

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 18 Defining the Control  Control for multicycle datapath is more complex  Because instruction is executed as a sequence of steps  Values of control signals depend upon:  What instruction is being executed  Which cycle is being performed  Multicycle control is a Finite State Machine (FSM)  While single-cycle control is a combinational logic  Two implementation techniques for multicycle control  Set of states and transitions implemented directly in logic  Microprogramming: a programming representation for control

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 19 Multicycle Datapath + Control 4 Imm16 32 ALUALU RA RB BusA BusB RW BusW Address MemData Memory 30 PC Rs Extender Rt muxmux 0 1 muxmux 0 1 mux 0 1 Rd muxmux 0 1 Data_in MDR IR Registers A B muxmux ALUout 32 PC[31:28], Im ExtOp MemtoReg ALUSrcB ALUCtrl muxmux ALU Control NextPC Op 6 PCSourceALUSrcARegWriteRegDst PCWrite IorDMemReadMemWrite zero funct 6 ALUop Main Control IRWrite

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 20 High Level View of FSM Control Instruction fetch R-typeLoad/Store Branch Start Instruction decode, register fetch, and branch address computation (op = R-type) I-type ALU (op = LW) or (op = SW) (op = J) (op = BEQ) or (op = BNE) (op = ANDI) or (op = ORI) or …

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 21 Branch Completion State Diagram for Multicycle Control Start BEQ or BNE Control signal values default to zero when they are not specified IorD = 0 MemRead = 1 IRWrite = 1 ALUSrcA = 0 ALUop = INC PCSource = 00 PCWrite = 1 0 Instruction fetch ALUSrcA = 0 ALUSrcB = 1 ALUop = ADD Extop = 1 PCSource = 10 PCWrite = J 1 ALUSrcA = 1 ALUSrcB = 0 ALUop = SUB PCSource = 01 PCWrite = Branch 8 67 ALUSrcA = 1 ALUSrcB = 0 ALUop = R-type R-type ALU RegDst = 1 (Rd) MemtoReg = 0 RegWrite = 1 R-type Completion IorD = 1 MemWrite = 1 5 Store Access RegDst = 0 (Rt) MemtoReg = 1 RegWrite = 1 4 Load Completion ALUSrcA = 1 ALUSrcB = 1 Extop = sign ALUop = ADD 2 Address Computation IorD = 1 MemRead = 1 3 Load Access LW SW R-type LW SW Decode, Register fetch, Jump completion J 910 ALUSrcA = 1 ALUSrcB = 1 Extop = zero ALUop = op I-type ALU RegDst = 0 (Rt) MemtoReg = 0 RegWrite = 1 I-type Completion ORI, ANDI, …

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 22 Finite State Machine Controller  Implemented as …  Comb control logic  State register RegDst RegWrite ExtOp ALUSrcA ALUSrcB MemRead MemWrite MemtoReg IorD IRWrite PCWrite PCSource State register next state Op current state zero clk ALUop Combinational Control logic 0xx1xx00x10x01100INC 1lw, swx2x100100xx0010ADD 1Rtypex6x100100xx0010ADD 1 beq bne x8x100100xx0010ADD 1jx0x100100xx0110ADD 1ori, …x9x100100xx0010ADD 2lwx3x101100xx00xADD 2swx5x101100xx00xADD 3xx4xx0xx10x100xx 4xx00x1xx001x00xx 5xx0xx0xx01x100xx 6xx7xx01000xx00xRtype 7xx01x1xx000x00xx 8 bne beq xx01000xx0Br01SUB 9xx10x001100xx00xOp 10xx00x1xx000x00xx current statenext state RegDst zero RegWriteALUSrcAALUSrcB MemRead MemWriteMemtoRegIorDIRWrite PCWrite PCSource ALUop Op ExtOp State Transition and Output Table

Multicycle Implementation© Muhamed Mudawar, COE 308 – KFUPMSlide 23  Reduces hardware  One unified memory for instruction and data, and one ALU  Reduces clock cycle and time  When compared to single-cycle implementation  Breaks instruction execution into steps (step = 1 cycle)  Internal registers in datapath  Save intermediate data for later cycles  Finite State Machine (FSM) specification of control  Implementation of control  Hardwired control  a sequential machine  Microprogramming (covered in textbook) Multicycle Implementation Summary