1 Processor: Datapath and Control Single cycle processor –Datapath and Control Multicycle processor –Datapath and Control Microprogramming –Vertical and.

Slides:



Advertisements
Similar presentations
331 W08.1Spring :332:331 Computer Architecture and Assembly Language Spring 2006 Week 8: Datapath Design [Adapted from Dave Patterson’s UCB CS152.
Advertisements

The Processor: Datapath & Control
Chapter 5 The Processor: Datapath and Control Basic MIPS Architecture Homework 2 due October 28 th. Project Designs due October 28 th. Project Reports.
331 W9.1Spring :332:331 Computer Architecture and Assembly Language Spring 2006 Week 9 Building a Single-Cycle Datapath [Adapted from Dave Patterson’s.
Levels in Processor Design
331 Lec 14.1Fall 2002 Review: Abstract Implementation View  Split memory (Harvard) model - single cycle operation  Simplified to contain only the instructions:
Preparation for Midterm Binary Data Storage (integer, char, float pt) and Operations, Logic, Flip Flops, Switch Debouncing, Timing, Synchronous / Asynchronous.
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
Computer Structure - Datapath and Control Goal: Design a Datapath  We will design the datapath of a processor that includes a subset of the MIPS instruction.
Inst.eecs.berkeley.edu/~cs61c UCB CS61C : Machine Structures Lecture 25 CPU design (of a single-cycle CPU) Intel is prototyping circuits that.
331 W10.1Spring :332:331 Computer Architecture and Assembly Language Spring 2005 Week 10 Building a Multi-Cycle Datapath [Adapted from Dave Patterson’s.
Spring W :332:331 Computer Architecture and Assembly Language Spring 2005 Week 11 Introduction to Pipelined Datapath [Adapted from Dave Patterson’s.
CSE431 L05 Basic MIPS Architecture.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 05: Basic MIPS Architecture Review Mary Jane Irwin.
The Processor: Datapath & Control. Implementing Instructions Simplified instruction set memory-reference instructions: lw, sw arithmetic-logical instructions:
Dr. Iyad F. Jafar Basic MIPS Architecture: Multi-Cycle Datapath and Control.
Chapter 4 Sections 4.1 – 4.4 Appendix D.1 and D.2 Dr. Iyad F. Jafar Basic MIPS Architecture: Single-Cycle Datapath and Control.
CS3350B Computer Architecture Winter 2015 Lecture 5.6: Single-Cycle CPU: Datapath Control (Part 1) Marc Moreno Maza [Adapted.
CSE331 W10&11.1Irwin Fall 2007 PSU CSE 331 Computer Organization and Design Fall 2007 Week 10 & 11 Section 1: Mary Jane Irwin (
COSC 3430 L08 Basic MIPS Architecture.1 COSC 3430 Computer Architecture Lecture 08 Processors Single cycle Datapath PH 3: Sections
Computer Architecture Chapter 5 Fall 2005 Department of Computer Science Kent State University.
CSE431 Chapter 4A.1Irwin, PSU, 2008 CSE 431 Computer Architecture Fall 2008 Chapter 4A: The Processor, Part A Mary Jane Irwin ( )
Gary MarsdenSlide 1University of Cape Town Chapter 5 - The Processor  Machine Performance factors –Instruction Count, Clock cycle time, Clock cycles per.
Computer Organization CS224 Chapter 4 Part b The Processor Spring 2010 With thanks to M.J. Irwin, T. Fountain, D. Patterson, and J. Hennessy for some lecture.
CPE232 Basic MIPS Architecture1 Computer Organization Multi-cycle Approach Dr. Iyad Jafar Adapted from Dr. Gheith Abandah slides
EEM 486: Computer Architecture Designing a Single Cycle Datapath.
Computer Architecture and Design – ECEN 350 Part 6 [Some slides adapted from A. Sprintson, M. Irwin, D. Paterson and others]
Datapath and Control Unit Design
1 Processor: Datapath and Control Single cycle processor –Datapath and Control Multicycle processor –Datapath and Control Microprogramming –Vertical and.
CSE331 W10.1Irwin&Li Fall 2006 PSU CSE 331 Computer Organization and Design Fall 2006 Week 10 Section 1: Mary Jane Irwin (
ECE-C355 Computer Structures Winter 2008 The MIPS Datapath Slides have been adapted from Prof. Mary Jane Irwin ( )
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
Chapter 4 From: Dr. Iyad F. Jafar Basic MIPS Architecture: Single-Cycle Datapath and Control.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
CS.305 Computer Architecture The Processor: Datapath and Control Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from.
Design a MIPS Processor
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
COM181 Computer Hardware Lecture 6: The MIPs CPU.
Chapter 4 From: Dr. Iyad F. Jafar Basic MIPS Architecture: Multi-Cycle Datapath and Control.
Computer Architecture Lecture 6.  Our implementation of the MIPS is simplified memory-reference instructions: lw, sw arithmetic-logical instructions:
Design a MIPS Processor (II)
CS Computer Architecture Week 10: Single Cycle Implementation
Multi-Cycle Datapath and Control
CS 230: Computer Organization and Assembly Language
Computer Architecture
Multi-Cycle CPU.
Single Cycle Processor
Multi-Cycle CPU.
Basic MIPS Architecture
Processor (I).
CS/COE0447 Computer Organization & Assembly Language
Multiple Cycle Implementation of MIPS-Lite CPU
MIPS processor continued
CSCI206 - Computer Organization & Programming
Chapter Five The Processor: Datapath and Control
Chapter Five The Processor: Datapath and Control
Levels in Processor Design
The Processor Lecture 3.2: Building a Datapath with Control
Vishwani D. Agrawal James J. Danaher Professor
COMS 361 Computer Organization
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
COSC 2021: Computer Organization Instructor: Dr. Amir Asif
Lecture 14: Single Cycle MIPS Processor
Processor: Multi-Cycle Datapath & Control
Multi-Cycle Datapath Lecture notes from MKP, H. H. Lee and S. Yalamanchili.
Chapter Four The Processor: Datapath and Control
5.5 A Multicycle Implementation
Control Unit (single cycle implementation)
The Processor: Datapath & Control.
COMS 361 Computer Organization
Processor: Datapath and Control
Presentation transcript:

1 Processor: Datapath and Control Single cycle processor –Datapath and Control Multicycle processor –Datapath and Control Microprogramming –Vertical and Horizontal Microcodes

2 Processor Design Processor design –datapath and control unit design –processor design determines »clock cycle time »clock cycles per instruction Performance of a machine is determined by –Instruction count –clock cycle time –clock cycles per instruction

3 Review: THE Performance Equation Our basic performance equation is then CPU time = Instruction_count x CPI x clock_cycle Instruction_count x CPI clock_rate CPU time = or These equations separate the three key factors that affect performance –Can measure the CPU execution time by running the program –The clock rate is usually given in the documentation –Can measure instruction count by using profilers/simulators without knowing all of the implementation details –CPI varies by instruction type and ISA implementation for which we must know the implementation details

4 How to Design a Processor: step-by-step 1. Analyze instruction set => datapath requirements the meaning of each instruction is given by the register transfers datapath must include storage element for ISA registers possibly more datapath must support each register transfer 2. Select set of datapath components and establish clocking methodology 3. Assemble datapath meeting the requirements 4. Analyze implementation of each instruction to determine setting of control points that effects the register transfer. 5. Assemble the control logic

5 Single Cycle Processor Single cycle processor –Pros: one clock cycle per instruction –Cons: too long cycle time, too low clocking frequency Design a processor –analyze instruction set (the meaning of each instruction is given by the register transfers) –timing of each instruction –datapath support each register transfer –select datapath components and establish clocking methodology –analyze implementation of each instruction to determine setting of control points that affect register transfer –assemble control logic and datapath components

6 Clocking Methodology Edge-triggered clock setup time hold time all storage elements clocked by the same clock combinational logic block: –inputs are updated at each clock tick –all outputs must be stable before the next clock tick

7 The MIPS Instruction Formats All MIPS instructions are 32 bits long. The three instruction formats: –R-type –I-type –J-type The different fields are: –op: operation of the instruction –rs, rt, rd: the source and destination register specifiers –shamt: shift amount –funct: selects the variant of the operation in the “op” field –address / immediate: address offset or immediate value –target address: target address of the jump instruction optarget address bits26 bits oprsrtrdshamtfunct bits 5 bits oprsrt immediate bits16 bits5 bits

8 Register Transfers add $1, $2, $3; rs = $2, rt = $3, rd = $1 R[rd] <- R[rs] + R[rt}PC <- PC + 4 sub $1, $2, $3; rs = $2, rt = $3, rd = $1 R[rd] <- R[rs] - R[rt]PC <- PC + 4 ori $1, $2, 20; rs = $2, rt = $1 R[rt] <- R[rs] + zero_ext(imm16)PC <- PC + 4 lw $1, 200($2); rs = $2, rt = $1 R[rt] <- MEM{R[rs] + sign_ext(imm16)} PC <- PC + 4 sw $1, 200($2); rs = $2, rt = $1 MEM{R[rs] + sign_ext(imm16)} <- R[rt]PC <- PC + 4

9 Components Memory: hold instruction and data Registers: bit registers –read rs –read rt –write rd –write rt Program counter Extender Add and Sub registers or extended immediates Add 4 to PC or Add extended immediate to PC (jump inst)

10 Combinational Logic Elements Adder MUX (multi-plexor) ALU 32 A B Sum Carry 32 A B Result OP 32 A B Y Select Adder MUX ALU CarryIn (to add values) (to chose between values) (to do add, subtract, or)

11 Storage Element: Register (Basic Building Block) Register –Similar to the D Flip Flop except »N-bit input and output »Write Enable input –Write Enable: »negated (0): Data Out will not change »asserted (1): Data Out will become Data In Clk Data In Write Enable NN Data Out

12 Sequential Logic Elements Registers: n-bit input and output, D F/F, write enable rs, rt, rd : register specifiers registers read register1 read register2 write register write data read data1 read data2

13 Fetching Instructions Fetching instructions involves –reading the instruction from the Instruction Memory –updating the PC to hold the address of the next instruction Read Address Instruction Memory Add PC 4 –PC is updated every cycle, so it does not need an explicit write control signal –Instruction Memory is read every cycle, so it doesn’t need an explicit read control signal

14 Decoding Instructions Decoding instructions involves –sending the fetched instruction’s opcode and function field bits to the control unit Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 Control Unit –reading two values from the Register File »Register File addresses are contained in the instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2

15 Executing R Format Operations R format operations ( add, sub, slt, and, or ) –perform the (op and funct) operation on values in rs and rt –store the result back into the Register File (into location rd) Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU overflow zero ALU controlRegWrite R-type: oprsrtrdfunctshamt 10 –The Register File is not written every cycle (e.g. sw ), so we need an explicit write control signal for the Register File

16 Executing Load and Store Operations Load and store operations involves –compute memory address by adding the base register (read from the Register File during decode) to the 16-bit signed-extended offset field in the instruction –store value (read from the Register File during decode) written to the Data Memory –load value, read from the Data Memory, written to the Register File Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU overflow zero ALU controlRegWrite Data Memory Address Write Data Read Data Sign Extend MemWrite MemRead 1632

17 Executing Branch Operations Branch operations involves –compare the operands read from the Register File during decode for equality ( zero ALU output) –compute the branch target address by adding the updated PC to the 16-bit signed-extended offset field in the instr Instruction Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU zero ALU control Sign Extend 1632 Shift left 2 Add 4 PC Branch target address (to branch control logic)

18 Executing Jump Operations Jump operation involves –replace the lower 28 bits of the PC with the lower 26 bits of the fetched instruction shifted left by 2 bits Read Address Instruction Memory Add PC 4 Shift left 2 Jump address

19 Creating a Single Datapath from the Parts Assemble the datapath segments and add control lines and multiplexors as needed Single cycle design – fetch, decode and execute each instructions in one clock cycle –no datapath resource can be used more than once per instruction, so some must be duplicated (e.g., separate Instruction Memory and Data Memory, several adders) –multiplexors needed at the input of shared elements with control lines to do the selection –write signals to control writing to the Register File and Data Memory Cycle time is determined by length of the longest path

20 Fetch, R, and Memory Access Portions MemtoReg Read Address Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero ALU controlRegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 ALUSrc

21 Adding the Control Selecting the operations to perform (ALU, Register File and Memory read/write) Controlling the flow of data (multiplexor inputs) I-Type: oprsrt address offset R-type: oprsrtrdfunctshamt 10 Observations –op field always in bits –addr of registers to be read are always specified by the rs field (bits 25-21) and rt field (bits 20-16); for lw and sw rs is the base register –addr. of register to be written is in one of two places – in rt (bits 20-16) for lw; in rd (bits 15-11) for R-type instructions –offset for beq, lw, and sw always in bits 15-0 J-type: optarget address

22 Single Cycle Datapath with Control Unit Read Address Instr[31-0] Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Shift left 2 Add PCSrc RegDst ALU control ALUOp Instr[5-0] Instr[15-0] Instr[25-21] Instr[20-16] Instr[15 -11] Control Unit Instr[31-26] Branch

23 R-type Instruction Data/Control Flow Read Address Instr[31-0] Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Shift left 2 Add PCSrc RegDst ALU control ALUOp Instr[5-0] Instr[15-0] Instr[25-21] Instr[20-16] Instr[15 -11] Control Unit Instr[31-26] Branch

24 Load Word Instruction Data/Control Flow Read Address Instr[31-0] Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Shift left 2 Add PCSrc RegDst ALU control ALUOp Instr[5-0] Instr[15-0] Instr[25-21] Instr[20-16] Instr[15 -11] Control Unit Instr[31-26] Branch

25 Load Word Instruction Data/Control Flow Read Address Instr[31-0] Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Shift left 2 Add PCSrc RegDst ALU control ALUOp Instr[5-0] Instr[15-0] Instr[25-21] Instr[20-16] Instr[15 -11] Control Unit Instr[31-26] Branch

26 Branch Instruction Data/Control Flow Read Address Instr[31-0] Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Shift left 2 Add PCSrc RegDst ALU control ALUOp Instr[5-0] Instr[15-0] Instr[25-21] Instr[20-16] Instr[15 -11] Control Unit Instr[31-26] Branch

27 Branch Instruction Data/Control Flow Read Address Instr[31-0] Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Shift left 2 Add PCSrc RegDst ALU control ALUOp Instr[5-0] Instr[15-0] Instr[25-21] Instr[20-16] Instr[15 -11] Control Unit Instr[31-26] Branch

28 Adding the Jump Operation Read Address Instr[31-0] Instruction Memory Add PC 4 Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU ovf zero RegWrite Data Memory Address Write Data Read Data MemWrite MemRead Sign Extend 1632 MemtoReg ALUSrc Shift left 2 Add PCSrc RegDst ALU control ALUOp Instr[5-0] Instr[15-0] Instr[25-21] Instr[20-16] Instr[15 -11] Control Unit Instr[31-26] Branch Shift left Jump 32 Instr[25-0] 26 PC+4[31-28] 28

29 Single Cycle Control Unit: ALU control control unit Instr[31-26] ALUOp ALU control to ALU operation Instr[5-0] In 5 th edition, The above Figure is in Appendix D

30

31 ALU Control Implementation

32 instruction opcode ALUOPInstruction operation Funct field Desired ALU action ALU control input, i.e., Operation LW00load wordxxxxxxadd0010 SW00store wordxxxxxxadd0010 Branch equal 01branch equal xxxxxxsubtract0110 R type10add100000add0010 R type10subtract100010subtract0110 R type10AND100100and0000 R type10OR100101or0001 R type10set on less than set on less than 0111

33 Setting of the control signals Instru- ction RegDstALUSrcMemto Reg Write Mem Read Mem Write BranchALU Op1 ALU Op0 R type lw swx1x beqx0x000101

34

35 Control Unit PLA Implementation

36 Single Cycle Disadvantages & Advantages Uses the clock cycle inefficiently – the clock cycle must be timed to accommodate the slowest instruction –especially problematic for more complex instructions like floating point multiply May be wasteful of area since some functional units (e.g., adders) must be duplicated since they can not be shared during a clock cycle but Is simple and easy to understand Clk lwswWaste Cycle 1Cycle 2

37 Multicycle Datapath Approach Let an instruction take more than 1 clock cycle to complete –Break up instructions into steps where each step takes a cycle while trying to »balance the amount of work to be done in each step »restrict each cycle to use only one major functional unit –Not every instruction takes the same number of clock cycles In addition to faster clock rates, multicycle allows functional units that can be used more than once per instruction as long as they are used on different clock cycles, as a result –only need one memory – but only one memory access per cycle –need only one ALU/adder – but only one ALU operation per cycle

38 Multicycle Datapath Approach Let an instruction take more than 1 clock cycle to complete –Break up instructions into steps where each step takes a cycle while trying to »balance the amount of work to be done in each step »restrict each cycle to use only one major functional unit –Not every instruction takes the same number of clock cycles In addition to faster clock rates, multicycle allows functional units that can be used more than once per instruction as long as they are used on different clock cycles, as a result –only need one memory – but only one memory access per cycle –need only one ALU/adder – but only one ALU operation per cycle

39 At the end of a cycle –Store values needed in a later cycle by the current instruction in an internal register (not visible to the programmer). All (except IR) hold data only between a pair of adjacent clock cycles (no write control signal needed) IR – Instruction RegisterMDR – Memory Data Register A, B – regfile read data registers ALUout – ALU output register Multicycle Datapath Approach, con’t Address Read Data (Instr. or Data) Memory PC Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU Write Data IR MDR A B ALUout –Data used by subsequent instructions are stored in programmer visible registers (i.e., register file, PC, or memory)

40 The Multicycle Datapath with Control Signals Address Read Data (Instr. or Data) Memory PC Write Data Read Addr 1 Read Addr 2 Write Addr Register File Read Data 1 Read Data 2 ALU Write Data IR MDR A B ALUout Sign Extend Shift left 2 ALU control Shift left 2 ALUOp Control IRWrite MemtoReg MemWrite MemRead IorD PCWrite PCWriteCond RegDst RegWrite ALUSrcA ALUSrcB zero PCSource Instr[5-0] Instr[25-0] PC[31-28] Instr[15-0] Instr[31-26] 32 28

41 Multicycle datapath control signals are not determined solely by the bits in the instruction –e.g., op code bits tell what operation the ALU should be doing, but not what instruction cycle is to be done next Must use a finite state machine (FSM) for control –a set of states (current state stored in State Register) –next state function (determined by current state and the input) –output function (determined by current state and the input) Multicycle Control Unit Combinational control logic State Reg Inst Opcode Datapath control points Next State...

42 The Five Steps of the Load Instruction IFetch: Instruction Fetch and Update PC Dec: Instruction Decode, Register Read, Sign Extend Offset Exec: Execute R-type; Calculate Memory Address; Branch Comparison; Branch and Jump Completion Mem: Memory Read; Memory Write Completion; R-type Completion (RegFile write) WB: Memory Read Completion (RegFile write) Cycle 1Cycle 2Cycle 3Cycle 4Cycle 5 IFetchDecExecMemWB lw INSTRUCTIONS TAKE FROM CYCLES!

43 Single Cycle vs. Multiple Cycle Timing Clk Cycle 1 Multiple Cycle Implementation: IFetchDecExecMemWB Cycle 2Cycle 3Cycle 4Cycle 5Cycle 6Cycle 7Cycle 8Cycle 9Cycle 10 IFetchDecExecMem lwsw IFetch R-type Clk Single Cycle Implementation: lwsw Waste Cycle 1Cycle 2 multicycle clock slower than 1/5 th of single cycle clock due to state register overhead

44

45

46

47

48

49

50

51

52

53

54

55

56