EECC550 - Shaaban #1 Lec # 4 Winter 2003 12-16-2003 CPU Organization Datapath Design: –Capabilities & performance characteristics of principal Functional.

Presentation transcript:

EECC550 - Shaaban #1 Lec # 4 Winter CPU Organization
Datapath Design:
– Capabilities & performance characteristics of principal Functional Units (FUs) (e.g., registers, ALU, shifters, logic units, ...).
– Ways in which these components are interconnected (bus connections, multiplexors, etc.).
– How information flows between components.
Control Unit Design:
– Logic and means by which such information flow is controlled.
– Control and coordination of FU operation to realize the targeted Instruction Set Architecture (can be implemented using either a finite state machine or a microprogram).
Hardware description with a suitable language, possibly using Register Transfer Notation (RTN).

EECC550 - Shaaban #2 Lec # 4 Winter Major CPU Design Steps
1. Using independent RTN, write the micro-operations required for target ISA instructions.
2. Construct the datapath required by the micro-operations identified in step 1.
3. Identify and define the function of all control signals needed by the datapath.
4. Design the control unit, based on the micro-operation timing and control signals identified:
– Combinational logic: for a single-cycle CPU.
– Hard-wired: finite-state machine implementation.
– Microprogrammed.
(Chapter )

EECC550 - Shaaban #3 Lec # 4 Winter CPU Design & Implementation Process
Top-down design:
– Specify component behavior from high-level requirements (ISA).
Bottom-up design:
– Assemble components in target technology to establish critical timing (hardware delays).
Iterative refinement:
– Establish a partial solution, expand and improve.
[Figure: design hierarchy, from the Instruction Set Architecture (ISA), which provides the requirements, through Processor, Datapath and Control, down to Reg. File, Mux, ALU, Reg, Mem, Decoder, and Sequencer, and then Cells and Gates.]

EECC550 - Shaaban #4 Lec # 4 Winter Datapath Design Steps
1. Write the micro-operation sequences required for a number of representative target ISA instructions using independent RTN. Independent RTN statements specify the required datapath components and how they are connected.
2. From the above, create an initial datapath by determining possible destinations for each data source (i.e., registers, ALU).
– This establishes connectivity requirements (data paths, or connections) for datapath components.
– Whenever multiple sources are connected to a single input, a multiplexor of appropriate size is added.
3. Find the worst-case propagation delay in the datapath to determine the datapath clock cycle (CPU clock cycle).
4. Complete the micro-operation sequences for all remaining instructions, adding datapath components and connections/multiplexors as needed.

EECC550 - Shaaban #5 Lec # 4 Winter MIPS Instruction Formats
R-Type: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
I-Type (ALU, Load/Store, Branch): op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
J-Type (Jumps): op (6 bits) | target address (26 bits)
op: Opcode, operation of the instruction.
rs, rt, rd: The source and destination register specifiers.
shamt: Shift amount.
funct: Selects the variant of the operation in the "op" field.
address / immediate: Address offset or immediate value.
target address: Target address of the jump instruction.
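The field layout above can be illustrated with a short Python sketch that slices a 32-bit instruction word into the R-, I-, and J-type fields. The function name `decode_fields` and the dictionary keys are hypothetical names chosen for illustration; only the bit positions come from the formats on the slide.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its format fields.

    All fields are extracted unconditionally; which ones are meaningful
    depends on the instruction type (R, I, or J).
    """
    word &= 0xFFFFFFFF
    return {
        "op":     (word >> 26) & 0x3F,   # bits 31-26 (6 bits)
        "rs":     (word >> 21) & 0x1F,   # bits 25-21 (5 bits)
        "rt":     (word >> 16) & 0x1F,   # bits 20-16 (5 bits)
        "rd":     (word >> 11) & 0x1F,   # bits 15-11 (5 bits, R-type)
        "shamt":  (word >> 6) & 0x1F,    # bits 10-6  (5 bits, R-type)
        "funct":  word & 0x3F,           # bits 5-0   (6 bits, R-type)
        "imm16":  word & 0xFFFF,         # bits 15-0  (16 bits, I-type)
        "target": word & 0x3FFFFFF,      # bits 25-0  (26 bits, J-type)
    }


# add $1,$2,$3 as an R-type word: op = 0, rs = 2, rt = 3, rd = 1, funct = 32
add_word = (0 << 26) | (2 << 21) | (3 << 16) | (1 << 11) | (0 << 6) | 32
fields = decode_fields(add_word)
```

In hardware these fields are just wire groups taken from the instruction word, so no logic is needed to "extract" them; the shifts and masks only mimic that wiring.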

EECC550 - Shaaban #6 Lec # 4 Winter MIPS R-Type (ALU) Instruction Fields
R-Type: All ALU instructions that use three registers.
OP (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
op: Opcode, basic operation of the instruction.
– For R-Type, op = 0.
rs: The first register source operand.
rt: The second register source operand.
rd: The register destination operand.
shamt: Shift amount used in constant shift operations.
funct: Function, selects the specific variant of the operation in the op field.
Examples:
add $1,$2,$3
sub $1,$2,$3
and $1,$2,$3
or $1,$2,$3
In each example the destination register ($1) is in rd, and the operand registers ($2, $3) are in rs and rt.

EECC550 - Shaaban #7 Lec # 4 Winter MIPS ALU I-Type Instruction Fields
I-Type: ALU instructions that use two registers and an immediate value; also loads/stores and conditional branches.
OP (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
op: Opcode, operation of the instruction.
rs: The register source operand.
rt: The result destination register.
immediate: Constant second operand for ALU instruction.
Examples:
add immediate: addi $1,$2,100
and immediate: andi $1,$2,10
In each example the result register ($1) is in rt, the source operand register ($2) is in rs, and the constant operand is in immediate.

EECC550 - Shaaban #8 Lec # 4 Winter MIPS Load/Store I-Type Instruction Fields
OP (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
op: Opcode, operation of the instruction.
– For load op = 35, for store op = 43.
rs: The register containing the memory base address.
rt: For loads, the destination register. For stores, the source register of the value to be stored.
address: 16-bit signed memory address offset in bytes, added to the base register.
Examples:
Store word: sw $3, 500($4) (source register $3 in rt, base register $4 in rs, offset 500)
Load word: lw $1, 30($2) (destination register $1 in rt, base register $2 in rs, offset 30)

EECC550 - Shaaban #9 Lec # 4 Winter MIPS Branch I-Type Instruction Fields
OP (6 bits) | rs (5 bits) | rt (5 bits) | address (16 bits)
op: Opcode, operation of the instruction.
rs: The first register being compared.
rt: The second register being compared.
address: 16-bit signed branch target offset in words, added to the PC to form the branch address.
Examples:
Branch on equal: beq $1,$2,100
Branch on not equal: bne $1,$2,100
In each example $1 is in rs and $2 is in rt. The offset in bytes equals the instruction address field x 4.

EECC550 - Shaaban #10 Lec # 4 Winter MIPS J-Type Instruction Fields
J-Type: Includes jump j and jump and link jal.
OP (6 bits) | jump target (26 bits)
op: Opcode, operation of the instruction.
– Jump j: op = 2
– Jump and link jal: op = 3
jump target: jump memory address in words.
Examples:
Jump: j
Jump and link: jal
The jump memory address in bytes equals the instruction field jump target x 4.
Effective 32-bit jump address: PC(31-28),jump_target,00, that is, the upper 4 bits of the PC (taken from PC+4), followed by the 26-bit jump target, followed by 2 zero bits.
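The formation of the effective jump address can be sketched in Python as follows. The function name `jump_address` is hypothetical; the computation follows the slide's rule PC(31-28),jump_target,00 with the upper bits taken from PC+4.

```python
def jump_address(pc, jump_target):
    """Form the 32-bit jump address: PC+4 bits 31-28, 26-bit target, then 00."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    upper = pc_plus_4 & 0xF0000000             # keep bits 31-28 of PC+4
    lower = (jump_target & 0x3FFFFFF) << 2     # word address -> byte address
    return upper | lower
```

For instance, a jump fetched at byte address 0x90000000 with a jump target field of 1 goes to 0x90000004: the upper nibble 0x9 is preserved and the word address 1 becomes byte address 4.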

EECC550 - Shaaban #11 Lec # 4 Winter A Subset of MIPS Instructions
ADD and SUB (format: op (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)):
addU rd, rs, rt
subU rd, rs, rt
OR Immediate (format: op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)):
ori rt, rs, imm16
LOAD and STORE Word (same format as OR Immediate):
lw rt, rs, imm16
sw rt, rs, imm16
BRANCH (same format as OR Immediate):
beq rs, rt, imm16

EECC550 - Shaaban #12 Lec # 4 Winter Basic Instruction Processing Steps
Instruction Fetch: Obtain instruction from program storage. Instruction ← Mem[PC]
Instruction Decode: Determine instruction type (done by the Control Unit); obtain operands from registers.
Execute: Compute result value or status.
Result Store: Store result in register/memory if needed (usually called Write Back).
Next Instruction: Update program counter to address of next instruction. PC ← PC + 4
Instruction ← Mem[PC] and PC ← PC + 4 are common steps for all instructions.

EECC550 - Shaaban #13 Lec # 4 Winter Overview of MIPS Instruction Micro-operations
All instructions go through these common steps:
– Send program counter to instruction memory and fetch the instruction (fetch): Instruction ← Mem[PC]
– Update the program counter to point to the next instruction: PC ← PC + 4
– Read one or two registers, using instruction fields (decode). Load reads one register only.
Additional instruction execution actions (execution) depend on the instruction in question, but similarities exist:
– All instruction classes use the ALU after reading the registers: memory reference instructions use it for address calculation; arithmetic and logic instructions (R-Type) use it for the specified operation; branches use it for comparison.
Additional execution steps where instruction classes differ:
– Memory reference instructions: Access memory for a load or store.
– Arithmetic and logic instructions: Write ALU result back in register.
– Branch instructions: Change next instruction address based on comparison.

EECC550 - Shaaban #14 Lec # 4 Winter A Single Cycle MIPS CPU Implementation
Design target: A single-cycle-per-instruction MIPS CPU implementation. All micro-operations of an instruction are to be carried out in a single system clock cycle.
Cycles Per Instruction = CPI = 1

EECC550 - Shaaban #15 Lec # 4 Winter R-Type Example: Micro-Operation Sequence For ADDU
addU rd, rs, rt
OP (6 bits) | rs (5 bits) | rt (5 bits) | rd (5 bits) | shamt (5 bits) | funct (6 bits)
Independent RTN:
Instruction Word ← Mem[PC]: Fetch the instruction.
PC ← PC + 4: Increment PC.
R[rd] ← R[rs] + R[rt]: Add register rs to register rt, result in register rd.

EECC550 - Shaaban #16 Lec # 4 Winter Initial Datapath Components
Three components needed by:
– Instruction Fetch: Instruction ← Mem[PC]
– Program Counter Update: PC ← PC + 4
Two state elements (memory) needed to store and access instructions:
1. Instruction memory: Only read access (by user code). No read control signal. Outputs a 32-bit instruction word.
2. Program counter (PC): 32-bit register. Written at the end of every clock cycle: no write control signal.
3. 32-bit Adder: To compute the next instruction address.

EECC550 - Shaaban #17 Lec # 4 Winter More Datapath Components
ISA Register File:
– Contains all ISA registers.
– Two read ports and one write port.
– Registers are written by asserting the write control signal.
– Writes are edge-triggered.
– Can read and write the same register in the same clock cycle.
Main 32-bit ALU: 32-bit Arithmetic and Logic Unit (ALU).

EECC550 - Shaaban #18 Lec # 4 Winter Register File Details
Register File consists of 32 registers:
– Two 32-bit output busses: busA and busB
– One 32-bit input bus: busW
Register is selected by:
– RA (number) selects the register to put on busA (data): busA = R[RA]
– RB (number) selects the register to put on busB (data): busB = R[RB]
– RW (number) selects the register to be written via busW (data) when Write Enable is 1. Write Enable: R[RW] ← busW
Clock input (CLK):
– The CLK input is a factor ONLY during write operations.
– During read operations, the register file behaves as a combinational logic block: RA or RB valid => busA or busB valid after "access time."
[Figure: register file block with 32 32-bit registers; 5-bit RW, RA, and RB inputs; 32-bit busW input; 32-bit busA and busB outputs; Write Enable and Clk inputs.]
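A behavioral sketch of this register file in Python is given below. The class and method names (`RegisterFile`, `read`, `clock_edge`) are hypothetical. Reads are modeled as a plain function call (combinational), and writes happen only when `clock_edge` is called with Write Enable set. The sketch does not special-case register 0.

```python
class RegisterFile:
    """Behavioral model: 32 registers, two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        """Combinational read: busA = R[RA], busB = R[RB]."""
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        """Edge-triggered write: R[RW] <- busW only when Write Enable is 1."""
        if write_enable:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=5, bus_w=42, write_enable=0)   # ignored: Write Enable is 0
before = rf.read(5, 0)
rf.clock_edge(rw=5, bus_w=42, write_enable=1)   # R[5] <- 42
after = rf.read(5, 5)
```

Separating `read` from `clock_edge` mirrors the slide's point that CLK matters only for writes: within a cycle the old value can be read while the new value is being presented on busW.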

EECC550 - Shaaban #19 Lec # 4 Winter A Possible Register File Implementation (Appendix B - The Basics of Logic Design)
[Figure: Registers 0 through 31, each with Write, Data In, and Data Out. A 5-to-32 decoder on Write Register (RW), enabled by Register Write Enable (RegWrite), selects the register that loads Register Write Data (Bus W). Two 32-to-1 MUXes, selected by Read Register 1 (RA) and Read Register 2 (RB), drive Register Read Data 1 (Bus A) and Register Read Data 2 (Bus B).]

EECC550 - Shaaban #20 Lec # 4 Winter Idealized Memory
Memory (idealized):
– One input bus: Data In.
– One output bus: Data Out.
Memory word is selected by:
– Address selects the word to put on the Data Out bus.
– Write Enable = 1: Address selects the memory word to be written via the Data In bus.
Clock input (CLK):
– The CLK input is a factor ONLY during write operations.
– During read operations, this memory behaves as a combinational logic block: Address valid => Data Out valid after "access time."
Ideal Memory = Short access time.
[Figure: memory block with 32-bit Data In, Address, Write Enable, and Clk inputs and a 32-bit DataOut output.]

EECC550 - Shaaban #21 Lec # 4 Winter Building The Datapath
Portion of the datapath used for fetching instructions and incrementing the program counter.
Instruction Fetch & PC Update:
Instruction ← Mem[PC]
PC ← PC + 4

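EECC550 - Shaaban #22 Lec # 4 Winter Simplified Datapath For MIPS R-Type Instructions
Components and connections as specified by the RTN statement R[rd] ← R[rs] + R[rt].
[Figure: the rs, rt, and rd fields from instruction memory address the register file; R[rs] and R[rt] feed the 32-bit ALU, whose result is written back to the register file.]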

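EECC550 - Shaaban #23 Lec # 4 Winter More Detailed Datapath For R-Type Instructions With Control Points Identified
R[rd] ← R[rs] + R[rt]
[Figure: register file with 5-bit Rw, Ra, and Rb inputs driven by Rd, Rs, and Rt; 32-bit busA (R[rs]) and busB (R[rt]) feed the ALU; the 32-bit ALU Result returns on busW. Control points: RegWr (register write) and ALUctr (ALU operation); Clk drives the register file.]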

EECC550 - Shaaban #24 Lec # 4 Winter R-Type Register-Register Timing
[Timing diagram: after the clock edge, PC takes its new value after Clk-to-Q; Rs, Rt, Rd, Op, and Func become valid after the instruction memory access time; ALUctr and RegWr after the delay through control logic; busA and busB after the register file access time; busW after the ALU delay. The register write occurs at the end of the cycle.]

EECC550 - Shaaban #25 Lec # 4 Winter Logical Operations with Immediate Example: Micro-Operation Sequence For ORI
ori rt, rs, imm16
op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits)
Instruction Word ← Mem[PC]: Fetch the instruction.
PC ← PC + 4: Increment PC.
R[rt] ← R[rs] OR ZeroExt[imm16]: OR register rs with the immediate field zero-extended to 32 bits, result in register rt.

EECC550 - Shaaban #26 Lec # 4 Winter Datapath For Logical Instructions With Immediate
R[rt] ← R[rs] OR ZeroExt[imm16]
[Figure: the R-type datapath extended with a 2x1 MUX (width 5 bits), controlled by RegDst, that selects Rt or Rd as the write register Rw; a ZeroExt unit on imm16; and a 2x1 MUX (width 32 bits), controlled by ALUSrc, that selects busB (R[rt]) or the zero-extended immediate as the second ALU input.]

EECC550 - Shaaban #27 Lec # 4 Winter Load Operations Example: Micro-Operation Sequence For LW
lw rt, rs, imm16
op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits; address offset in bytes)
Instruction Word ← Mem[PC]: Fetch the instruction (from instruction memory).
PC ← PC + 4: Increment PC.
R[rt] ← Mem[R[rs] + SignExt[imm16]]: Immediate field sign-extended to 32 bits and added to register rs to form the memory load address (effective address); the word at the load address in data memory is written to register rt.

EECC550 - Shaaban #28 Lec # 4 Winter Additional Datapath Components For Loads & Stores
Data memory: inputs for address and write (store) data; output for read (load) result.
Sign-extension unit, for SignExt[imm16]: 16-bit input sign-extended into a 32-bit value at the output.
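The difference between the ZeroExt[imm16] used by ORI and the SignExt[imm16] used by loads and stores can be shown with a small Python sketch. The function names `zero_ext` and `sign_ext` are hypothetical; both return the 32-bit pattern as an unsigned Python integer.

```python
def zero_ext(imm16):
    """ZeroExt[imm16]: upper 16 bits of the 32-bit result are 0."""
    return imm16 & 0xFFFF


def sign_ext(imm16):
    """SignExt[imm16]: upper 16 bits replicate bit 15 of the immediate."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:                 # negative 16-bit value
        return imm16 | 0xFFFF0000
    return imm16


# The 16-bit pattern 0xFFFC is -4 when treated as a signed offset.
as_zero = zero_ext(0xFFFC)   # 0x0000FFFC
as_sign = sign_ext(0xFFFC)   # 0xFFFFFFFC
```

The two results differ only when bit 15 of the immediate is 1, which is why a single extender with a select input can serve both instruction classes.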

EECC550 - Shaaban #29 Lec # 4 Winter Datapath For Loads R[rt] ← Mem[R[rs] + SignExt[imm16]]

EECC550 - Shaaban #30 Lec # 4 Winter Store Operations Example: Micro-Operation Sequence For SW
sw rt, rs, imm16
op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits; address offset in bytes)
Instruction Word ← Mem[PC]: Fetch the instruction.
PC ← PC + 4: Increment PC.
Mem[R[rs] + SignExt[imm16]] ← R[rt]: Immediate field sign-extended to 32 bits and added to register rs to form the memory store address (effective address); register rt is written to memory at the store address.

EECC550 - Shaaban #31 Lec # 4 Winter Datapath For Stores Mem[R[rs] + SignExt[imm16]] ← R[rt]

EECC550 - Shaaban #32 Lec # 4 Winter Conditional Branch Example: Micro-Operation Sequence For BEQ
beq rs, rt, imm16
op (6 bits) | rs (5 bits) | rt (5 bits) | immediate (16 bits; PC offset in words)
Instruction Word ← Mem[PC]: Fetch the instruction.
PC ← PC + 4: Increment PC.
Zero ← R[rs] - R[rt]: Calculate the branch condition R[rs] == R[rt] (i.e., R[rs] - R[rt] = 0).
Zero : PC ← PC + (SignExt(imm16) x 4): If Zero is set, calculate the next instruction's PC address (the branch target).
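The BEQ micro-operation sequence can be traced with a minimal Python sketch. The names `next_pc_beq` and `sign_ext16` are hypothetical helpers, and `pc` is the address of the branch instruction itself, so the increment by 4 happens inside the function before the offset is added.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    """Interpret a 16-bit field as a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def next_pc_beq(pc, rs_val, rt_val, imm16):
    """Return the next PC for beq, following the RTN sequence above."""
    pc_plus_4 = (pc + 4) & MASK32                  # PC <- PC + 4
    zero = ((rs_val - rt_val) & MASK32) == 0       # Zero <- R[rs] - R[rt]
    if zero:                                       # Zero : PC <- PC + SignExt(imm16) x 4
        return (pc_plus_4 + (sign_ext16(imm16) << 2)) & MASK32
    return pc_plus_4


taken = next_pc_beq(0x1000, 7, 7, 100)       # registers equal: branch taken
not_taken = next_pc_beq(0x1000, 7, 8, 100)   # registers differ: fall through
```

With an offset field of 100 words, the taken branch lands 400 bytes past PC+4, and an offset field of 0xFFFF (that is, -1 word) branches back to the branch instruction itself.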

EECC550 - Shaaban #33 Lec # 4 Winter Datapath For Branch Instructions
– Main ALU evaluates the branch condition (subtract) on R[rs] and R[rt]: Zero flag = 1 if R[rs] - R[rt] = 0.
– New 32-bit adder (third ALU) computes the branch target: the sum of the incremented PC and the sign-extended lower 16 bits of the instruction, shifted left by 2.

EECC550 - Shaaban #34 Lec # 4 Winter More Detailed Datapath For Branch Operations
[Figure: register file outputs busA and busB (registers Rs and Rt) feed the main ALU (subtract), which produces Zero ("Equal?"). An adder computes PC+4; the branch target ALU adds PC+4 to imm16 after sign extension and shift left 2. A new 2x1 32-bit MUX, controlled by PCSrc (derived from Branch and Zero), selects PC+4 (input 0) or the branch target (input 1) as the next PC value.]

EECC550 - Shaaban #35 Lec # 4 Winter Combining The Datapaths For Memory Instructions and R-Type Instructions
Highlighted multiplexors and connections are added to combine the datapaths of memory and R-Type instructions into one datapath. (This is the book version: ORI not supported; rt/rd MUX not shown.)

EECC550 - Shaaban #36 Lec # 4 Winter Instruction Fetch Datapath Added to ALU R-Type and Memory Instructions Datapath
(This is the book version: ORI not supported, no zero extend of immediate needed; rt/rd MUX not shown.)

EECC550 - Shaaban #37 Lec # 4 Winter A Simple Datapath For The MIPS Architecture
The datapath of branches and a program counter multiplexor (selecting PC+4 or the branch target) are added. The resulting datapath can execute the basic MIPS instructions in a single cycle:
– load/store word
– ALU operations
– Branches
(This is the book version: ORI not supported, no zero extend of immediate needed; rt/rd MUX not shown.)

EECC550 - Shaaban #38 Lec # 4 Winter Single Cycle MIPS Datapath Necessary multiplexors and control lines are identified here: (This is book version ORI not supported, no zero extend of immediate needed)

EECC550 - Shaaban #39 Lec # 4 Winter Putting It All Together: A Single Cycle Datapath (Includes ORI)
[Figure: complete single-cycle datapath. Instruction fetch: PC, Inst Memory, the PC+4 adder, the branch target adder with PC Ext, and the next-PC mux controlled by PCSrc (from Branch and Zero). The instruction supplies Rs, Rt, Rd, and Imm16. The register file (RegWr, Rw, Ra, Rb, busW, busA, busB) takes its write register from the RegDst mux. The Extender (ExtOp) and the ALUSrc mux feed the main ALU (ALUctr, Zero). Data memory (MemWr, Adr, Data In) and the MemtoReg mux drive busW.]
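To see the combined datapath behave as one unit, here is a minimal behavioral sketch in Python that executes one instruction per call, mirroring CPI = 1. It is an illustration only, not a model of the hardware structure. The names `step`, `r_type`, and `i_type`, and the use of dictionaries keyed by byte address for the two memories, are assumptions made for this sketch. The opcode and funct values (addu 0x21, subu 0x23, ori 13, lw 35, sw 43, beq 4, j 2) are the standard MIPS encodings.

```python
MASK32 = 0xFFFFFFFF


def sign_ext16(imm16):
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def step(pc, regs, imem, dmem):
    """Execute the instruction at pc in one 'cycle'; return the next PC."""
    word = imem[pc]                                   # Instruction <- Mem[PC]
    op = word >> 26
    rs, rt, rd = (word >> 21) & 31, (word >> 16) & 31, (word >> 11) & 31
    funct, imm16 = word & 0x3F, word & 0xFFFF
    next_pc = (pc + 4) & MASK32                       # PC <- PC + 4
    if op == 0 and funct == 0x21:                     # addu
        regs[rd] = (regs[rs] + regs[rt]) & MASK32
    elif op == 0 and funct == 0x23:                   # subu
        regs[rd] = (regs[rs] - regs[rt]) & MASK32
    elif op == 13:                                    # ori (zero-extended imm)
        regs[rt] = regs[rs] | imm16
    elif op == 35:                                    # lw
        regs[rt] = dmem.get((regs[rs] + sign_ext16(imm16)) & MASK32, 0)
    elif op == 43:                                    # sw
        dmem[(regs[rs] + sign_ext16(imm16)) & MASK32] = regs[rt]
    elif op == 4:                                     # beq
        if regs[rs] == regs[rt]:
            next_pc = (next_pc + (sign_ext16(imm16) << 2)) & MASK32
    elif op == 2:                                     # j
        next_pc = (next_pc & 0xF0000000) | ((word & 0x3FFFFFF) << 2)
    else:
        raise ValueError(f"unsupported instruction word {word:#010x}")
    return next_pc


def r_type(rs, rt, rd, funct):
    return (rs << 21) | (rt << 16) | (rd << 11) | funct


def i_type(op, rs, rt, imm16):
    return (op << 26) | (rs << 21) | (rt << 16) | (imm16 & 0xFFFF)


program = [
    i_type(13, 0, 1, 5),        # ori  $1, $0, 5
    i_type(13, 0, 2, 7),        # ori  $2, $0, 7
    r_type(1, 2, 3, 0x21),      # addu $3, $1, $2
    i_type(43, 0, 3, 0),        # sw   $3, 0($0)
    i_type(35, 0, 4, 0),        # lw   $4, 0($0)
]
imem = {4 * i: w for i, w in enumerate(program)}
regs, dmem, pc = [0] * 32, {}, 0
while pc in imem:                # stop when PC leaves the program
    pc = step(pc, regs, imem, dmem)
```

After the loop, registers $3 and $4 and memory word 0 all hold 12. The sketch does not treat register 0 as hardwired to zero and models data memory at word granularity only.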

EECC550 - Shaaban #40 Lec # 4 Winter Adding Support For Jump: Micro-Operation Sequence For Jump: J
j jump_target
OP (6 bits) | Jump_target (26 bits; jump address in words)
Instruction Word ← Mem[PC]: Fetch the instruction.
PC ← PC + 4: Increment PC.
PC ← PC(31-28),jump_target,00: Update PC with jump address.
Jump address: PC(31-28) (4 bits), jump_target (26 bits), 00 (2 bits).

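EECC550 - Shaaban #41 Lec # 4 Winter Datapath For Jump
[Figure: next-PC logic with jump support. As before, one mux controlled by PCSrc (from Branch and Zero) selects PC+4 or the branch target, computed from imm16 = Instruction(15-0) through PC Ext. A second mux, controlled by JUMP, selects between that result and the jump address, formed from PC+4(31-28) and jump_target = Instruction(25-0) shifted left 2. The selected value is the next instruction address loaded into PC.]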

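EECC550 - Shaaban #42 Lec # 4 Winter
[Figure: Control Unit and datapath. The instruction read from instruction memory supplies Op and Fun to the Control Unit, and Imm16, Rd, Rs, Rt, and Jump_target to the datapath. Control lines from the Control Unit to the datapath: RegDst, ALUctr, ALUSrc, MemtoReg, ExtOp, MemWr, Branch, RegWr, Jump.]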

EECC550 - Shaaban #43 Lec # 4 Winter Single Cycle MIPS Datapath Extended To Handle Jump with Control Unit Added (This is book version ORI not supported, no zero extend of immediate needed)

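EECC550 - Shaaban #44 Lec # 4 Winter Control Signal Generation (See Appendix A)
[Table: one column per instruction (add, sub, ori, lw, sw, beq, jump), selected by the op and func fields; one row per control signal (RegDst, ALUSrc, MemtoReg, RegWrite, MemWrite, Branch, Jump, ExtOp, ALUctr); x = don't care.]
ALUctr row (control lines for the main ALU): add: Add; sub: Subtract; ori: Or; lw: Add; sw: Add; beq: Subtract; jump: xxx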
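One way to read the control table is as a lookup from instruction to signal values, sketched below in Python. The signal polarities are assumptions, since the transcript does not preserve them: RegDst = 1 selects rd, ALUSrc = 1 selects the extended immediate, MemtoReg = 1 selects data memory, and ExtOp = 1 means sign extension. The name `control` and the use of `None` for don't-care are also choices made for this sketch.

```python
X = None  # don't care

SIGNALS = ("RegDst", "ALUSrc", "MemtoReg", "RegWrite",
           "MemWrite", "Branch", "Jump", "ExtOp", "ALUctr")

# Assumed polarities: RegDst=1 -> rd, ALUSrc=1 -> immediate,
# MemtoReg=1 -> data memory, ExtOp=1 -> sign extend.
CONTROL = {
    "add":  (1, 0, 0, 1, 0, 0, 0, X, "Add"),
    "sub":  (1, 0, 0, 1, 0, 0, 0, X, "Subtract"),
    "ori":  (0, 1, 0, 1, 0, 0, 0, 0, "Or"),
    "lw":   (0, 1, 1, 1, 0, 0, 0, 1, "Add"),
    "sw":   (X, 1, X, 0, 1, 0, 0, 1, "Add"),
    "beq":  (X, 0, X, 0, 0, 1, 0, X, "Subtract"),
    "jump": (X, X, X, 0, 0, 0, 1, X, X),
}


def control(instr):
    """Return the control signal settings for one instruction as a dict."""
    return dict(zip(SIGNALS, CONTROL[instr]))


writers = [i for i in CONTROL if control(i)["RegWrite"] == 1]
```

Signals that change state (RegWrite, MemWrite) and signals that redirect the PC (Branch, Jump) are never don't-care, because an accidental 1 would corrupt a register, memory, or the instruction flow. Only mux selects and ExtOp can be left as x when their output is unused.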

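EECC550 - Shaaban #45 Lec # 4 Winter The Concept of Local Decoding
[Figure: Main Control decodes op (6 bits) into ALUop (N bits); the local ALU Control combines ALUop with func (6 bits) to produce ALUctr (3 bits), which drives the ALU.]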

EECC550 - Shaaban #46 Lec # 4 Winter Local Decoding of "func" Field
Symbolic ALUop generated by Main Control from op:
R-type: "R-type"; ori: Or; lw: Add; sw: Add; beq: Subtract; jump: xxx
Main Control decodes op (6 bits) into ALUop (N bits); the local ALU Control combines ALUop with func (6 bits) to produce ALUctr (3 bits) for the ALU.
For R-type instructions, funct selects the ALU operation (from Chapter 4): add: Add; subtract: Subtract; and: And; or: Or; set-on-less-than: Set-on-less-than.

EECC550 - Shaaban #47 Lec # 4 Winter The Truth Table for ALUctr
[Table: ALUctr as a function of ALUop and funct. Symbolic ALUop by instruction: R-type: "R-type"; ori: Or; lw: Add; sw: Add; beq: Subtract. For R-type, funct selects add, subtract, and, or, or set-on-less-than.]
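The two-level decoding can be sketched in Python with symbolic values, avoiding any assumption about the binary ALUop and ALUctr encodings, which the transcript does not preserve. The funct values used below are the standard MIPS encodings for add, sub, and, or, and slt; they are supplied here as an assumption rather than taken from the slide. The names `FUNCT_TO_OP` and `alu_control` are hypothetical.

```python
# Standard MIPS funct encodings (assumed; not preserved in the transcript).
FUNCT_TO_OP = {
    0b100000: "Add",
    0b100010: "Subtract",
    0b100100: "And",
    0b100101: "Or",
    0b101010: "Set-on-less-than",
}


def alu_control(alu_op, funct=None):
    """Local ALU control: map symbolic ALUop (and funct) to the ALU operation.

    For "R-type" the funct field decides; otherwise Main Control has
    already named the operation (Or, Add, or Subtract).
    """
    if alu_op == "R-type":
        return FUNCT_TO_OP[funct]
    return alu_op


r_sub = alu_control("R-type", 0b100010)   # R-type subtract
lw_op = alu_control("Add")                # lw/sw address calculation
```

The benefit of local decoding is that Main Control only needs the 6-bit op field; the funct field is examined by a small separate block near the ALU.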

EECC550 - Shaaban #48 Lec # 4 Winter The Truth Table For The Main Control

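EECC550 - Shaaban #49 Lec # 4 Winter Example: RegWrite Logic Equation
RegWrite is 1 for R-type, ori, and lw, and 0 for sw, beq, and jump.
RegWrite = R-type + ori + lw
= !op<5> & !op<4> & !op<3> & !op<2> & !op<1> & !op<0> (R-type)
+ !op<5> & !op<4> & op<3> & op<2> & !op<1> & op<0> (ori)
+ op<5> & !op<4> & !op<3> & !op<2> & op<1> & op<0> (lw)
[Figure: the op bits are decoded into one term per instruction (R-type, ori, lw, sw, beq, jump); the R-type, ori, and lw terms are ORed to produce RegWrite.]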
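The sum-of-products equation can be checked exhaustively with a short Python sketch. The function name `reg_write` is hypothetical; `b[i]` stands for op bit i and `n[i]` for its complement.

```python
def reg_write(op):
    """Evaluate RegWrite = R-type + ori + lw from the six opcode bits."""
    b = [(op >> i) & 1 for i in range(6)]    # b[i] is op<i>
    n = [1 - bit for bit in b]               # n[i] is !op<i>
    r_type = n[5] & n[4] & n[3] & n[2] & n[1] & n[0]   # op = 000000
    ori    = n[5] & n[4] & b[3] & b[2] & n[1] & b[0]   # op = 001101
    lw     = b[5] & n[4] & n[3] & n[2] & b[1] & b[0]   # op = 100011
    return r_type | ori | lw


# Opcodes (0..63) for which the equation asserts RegWrite.
asserted = [op for op in range(64) if reg_write(op)]
```

Enumerating all 64 opcodes shows the equation is 1 only for opcodes 0 (R-type), 13 (ori), and 35 (lw), and in particular 0 for sw (43) and j (2).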

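EECC550 - Shaaban #50 Lec # 4 Winter PLA Implementation of the Main Control
[Figure: PLA. The AND plane decodes the op bits into one product term per instruction (R-type, ori, lw, sw, beq, jump); the OR plane combines these terms to produce RegWrite, ALUSrc, MemtoReg, MemWrite, Branch, Jump, RegDst, ExtOp, and ALUop.]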

EECC550 - Shaaban #51 Lec # 4 Winter Worst Case Timing (Load)
[Timing diagram: after the clock edge, PC takes its new value after Clk-to-Q; Rs, Rt, Rd, Op, and Func become valid after the instruction memory access time; ALUctr, RegWr, ExtOp, ALUSrc, and MemtoReg after the delay through control logic; busA after the register file access time; busB after the delay through extender & mux; Address after the ALU delay; busW after the data memory access time. The register write then occurs.]

EECC550 - Shaaban #52 Lec # 4 Winter Instruction Timing Comparison
Arithmetic & Logical: PC → Inst Memory → Reg File → mux → ALU → mux → setup
Load (critical path): PC → Inst Memory → Reg File → mux → ALU → Data Mem → mux → setup
Store: PC → Inst Memory → Reg File → mux → ALU → Data Mem
Branch: PC → Inst Memory → Reg File → cmp → mux

EECC550 - Shaaban #53 Lec # 4 Winter Simplified Single Cycle Datapath Timing
Assuming the following datapath/control hardware component delays (obtained from low-level target technology implementation of components):
– Memory Units: 2 ns
– ALU and adders: 2 ns
– Register File: 1 ns
– Control Unit: < 1 ns
Ignoring mux and clk-to-Q delays, critical path analysis:
Critical Path (Load): Instruction Memory (0 to 2 ns) → Register Read (2 to 3 ns) → Main ALU (3 to 5 ns) → Data Memory (5 to 7 ns) → Register Write (7 to 8 ns)
Off the critical path: PC + 4 ALU, Branch Target ALU, Control Unit.
[Figure: time axis marked at 0, 2 ns, 3 ns, 4 ns, 5 ns, 7 ns, and 8 ns.]

EECC550 - Shaaban #54 Lec # 4 Winter Performance of Single-Cycle (CPI=1) CPU
Assuming the following datapath hardware component delays:
– Memory Units: 2 ns
– ALU and adders: 2 ns
– Register File: 1 ns
The delays needed for each instruction type can be found:
Instruction   Instruction   Register   ALU         Data     Register   Total
Class         Memory        Read       Operation   Memory   Write      Delay
ALU           2 ns          1 ns       2 ns                 1 ns       6 ns
Load          2 ns          1 ns       2 ns        2 ns     1 ns       8 ns
Store         2 ns          1 ns       2 ns        2 ns                7 ns
Branch        2 ns          1 ns       2 ns                            5 ns
Jump          2 ns                                                     2 ns
The clock cycle is determined by the instruction with the longest delay: the load in this case, which takes 8 ns, so the CPU clock cycle is 8 ns.
Clock rate = 1 / 8 ns = 125 MHz
A program with I = 1,000,000 instructions executed takes:
Execution Time = T = I x CPI x C = 10^6 x 1 x 8x10^-9 = 0.008 s = 8 msec
(Nanosecond, ns = 10^-9 second)
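The delay table and the resulting clock rate and execution time can be reproduced with a few lines of Python. The unit names in `DELAY_NS` and the per-class unit lists in `PATHS` are hypothetical labels for the table's columns and rows; the numbers are the ones assumed on the slide.

```python
# Component delays in ns, as assumed on the slide.
DELAY_NS = {"imem": 2, "reg_read": 1, "alu": 2, "dmem": 2, "reg_write": 1}

# Units each instruction class passes through (one row of the table each).
PATHS = {
    "ALU":    ["imem", "reg_read", "alu", "reg_write"],
    "Load":   ["imem", "reg_read", "alu", "dmem", "reg_write"],
    "Store":  ["imem", "reg_read", "alu", "dmem"],
    "Branch": ["imem", "reg_read", "alu"],
    "Jump":   ["imem"],
}

total_ns = {cls: sum(DELAY_NS[u] for u in units) for cls, units in PATHS.items()}
cycle_ns = max(total_ns.values())          # slowest class sets the clock cycle
clock_mhz = 1000 / cycle_ns                # 1 / (cycle in ns) expressed in MHz
instructions, cpi = 1_000_000, 1
exec_time_s = instructions * cpi * cycle_ns * 1e-9   # T = I x CPI x C
```

Changing a single entry in `DELAY_NS` shows how sensitive the single-cycle design is to its slowest path: every instruction class pays for any increase in the load path.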

EECC550 - Shaaban #55 Lec # 4 Winter Drawback of Single Cycle Processor
Long cycle time:
– Cycle time must be long enough for the load instruction: PC's Clock-to-Q + Instruction Memory Access Time + Register File Access Time + ALU Delay (address calculation) + Data Memory Access Time + Register File Setup Time + Clock Skew
All instructions must take as much time as the slowest:
– Cycle time for load is longer than needed for all other instructions.
Real memory is not as well-behaved as idealized memory:
– Cannot always complete data access in one (short) cycle.
Does not allow overlap of instruction processing (instruction pipelining, chapter 6).