Single-Cycle CPU DataPath.


Single-Cycle CPU DataPath

Building A CPU (5.1)

We’ve built a small ALU:
- Add, Subtract, SLT, And, Or
- Could figure out Multiply and Divide...

What about the rest?
- How do we deal with memory and registers?
- What about control operations (branches)?
- How do we interpret instructions?

The whole thing...
- A CPU’s datapath deals with moving data around.
- A CPU’s control manages the data.

Datapath Overview (5.1)

- Current instruction: the PC supplies the read address to the instruction memory, which outputs Instruction [31-0].
- Instructions: R-type: 3 registers. I-type: 2 registers, data.
- ALU computes on: R-type: 2 registers. I-type: register and data.
- Data memory: address from ALU, data to/from registers.
- Data to write into the destination register comes from the ALU or memory.

[Figure: PC, Instruction Memory, Registers (read reg. num A/B, write reg. num, write reg. data, read reg. data A/B), ALU (result), Data Memory (read address, write address, write data, read data)]

Instruction Datapath (5.2)

Instructions are held in the instruction memory. The instruction to fetch is at the location specified by the PC:

    Instr. = M[PC]

After we fetch one instruction, the PC must be incremented to point to the next instruction. All instructions are 4 bytes:

    PC = PC + 4

Note: the regular instruction width (32 bits for MIPS) makes this easy.

[Figure: the PC drives the instruction memory read address and an adder that adds 4]
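The fetch step above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the dictionary-based `instr_mem`, the function name `fetch`, and the addresses are assumptions chosen for the example.

```python
# Hypothetical instruction memory: byte address -> 32-bit instruction word.
# The two words encode "add $9, $0, $0" and "lw $8, 0($9)".
instr_mem = {
    0x00400000: 0x00004820,
    0x00400004: 0x8D280000,
}

def fetch(instr_mem, pc):
    """Return (instruction, next_pc) for one fetch step."""
    instruction = instr_mem[pc]        # Instr. = M[PC]
    next_pc = (pc + 4) & 0xFFFFFFFF    # PC = PC + 4, wrapped to 32 bits
    return instruction, next_pc

instr, pc = fetch(instr_mem, 0x00400000)
print(hex(instr), hex(pc))
```

Because every instruction is 4 bytes wide, the next address never depends on what was fetched, which is why a fixed adder is enough.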

R-type Instruction Datapath (5.2)

R-type instructions have three registers:
- Two read (Rs, Rt) to provide data to the ALU
- One write (Rd) to receive data from the ALU

We’ll need to specify the operation to the ALU (later...).
We might be interested in whether the result of the ALU is zero (later...).

[Figure: the instruction supplies read reg. num A, read reg. num B, and write reg. num to the register file; read reg. data A and B feed the ALU; the ALU result returns as write reg. data; the ALU also outputs Zero]

Memory Operations (5.2)

Memory operations first need to compute the effective address:

    LW $t1, 450($s3)    # E.A. = 450 + $s3

- Add together one register and 16 bits of immediate data.
- The immediate data needs to be converted from 16 bits to 32 bits (sign extend).
- Memory then performs the load or store: a load writes the read data into the destination register, and a store writes the second register's data into memory.

[Figure: register file, sign extend (16 to 32), ALU, and data memory (read address, write address, write data, read data)]
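As a small sketch of the address calculation described above, the following Python mimics the sign-extend unit and the ALU add. The helper names and the base register value are assumptions made for illustration only.

```python
def sign_extend16(imm16):
    """Convert a 16-bit immediate to a signed Python integer (16 -> 32 bits)."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def effective_address(base, imm16):
    """E.A. = base register + sign-extended immediate, wrapped to 32 bits."""
    return (base + sign_extend16(imm16)) & 0xFFFFFFFF

s3 = 0x10010000                      # assumed contents of $s3
print(hex(effective_address(s3, 450)))
```

A negative immediate such as 0xFFFC (that is, -4) moves the address backwards, which is why zero extension would not be enough here.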

Branches (5.2)

Branches conditionally change the next instruction:

    BEQ $2, $1, 42

The offset is specified as the number of words to be added to the address of the next instruction (PC+4):
- Take the offset and multiply it by 4 (shift left two).
- Add this to PC+4 (from the PC logic).

Control logic has to decide if the branch is taken. It uses the ‘zero’ output of the ALU.

[Figure: the 16-bit offset is sign extended to 32 bits, shifted left 2, and added to PC+4; the register file feeds the ALU, whose Zero output goes to the control logic]
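The branch target arithmetic can be checked with a short Python sketch. The function names and the example PC value are assumptions, and the `zero` argument stands in for the ALU output that the control logic consults.

```python
def sign_extend16(imm16):
    """Convert a 16-bit field to a signed Python integer."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def branch_target(pc, offset16):
    """Target = (PC + 4) + (sign-extended offset << 2), wrapped to 32 bits."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF

def next_pc(pc, offset16, is_beq, zero):
    """Take the branch only when the instruction is BEQ and the ALU reports zero."""
    if is_beq and zero:
        return branch_target(pc, offset16)
    return (pc + 4) & 0xFFFFFFFF

pc = 0x00400000                      # assumed address of "BEQ $2, $1, 42"
print(hex(next_pc(pc, 42, True, True)))   # registers equal: branch taken
print(hex(next_pc(pc, 42, True, False)))  # registers differ: fall through
```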

Integrating the R-types and Memory (5.3)

R-types and loads/stores are similar in many respects. Differences:
- 2nd ALU source: R-types use a register, I-types use the immediate.
- Write data: R-types use the ALU result, I-types (loads) use the data read from memory.

Mux the conflicting datapaths together.

[Figure: register file, sign extend (16 to 32), ALU, and data memory, with one mux on the 2nd ALU input and one mux on the register write data]

Adding the instruction memory (5.3)

Simply add the instruction memory and PC to the beginning of the datapath.

Separate instruction and data memories are needed in order to allow the entire datapath to complete its job in a single clock cycle.

[Figure: PC, adder (+4), and instruction memory placed in front of the combined R-type/memory datapath]

Adding the Branch Datapath (5.3)

Now we have the datapath for R-type, I-type, and branch instructions. On to the control logic!

[Figure: the previous datapath plus a shift-left-2 unit, a branch adder, and a mux selecting the next PC]

When does everything happen? (5.3)

Single-Cycle Design
- Combinational logic: just does it! Outputs are always just a function of the inputs (with some delay).
- Registers: written at the end of the clock cycle (rising-edge triggered).

[Figure: complete single-cycle datapath with clk inputs on the state elements]

Example

Suppose it takes:
- memory 100 nsec to read a word,
- the ALU and adders 4 nsec,
- the register file 1 nsec to be read or written,
- the PC 0.2 nsec to be read or written,
- all multiplexors 0.1 nsec.

Assume everything else takes 0 time (control, shift, sign extend, wires, etc.).

- How long will it take to execute an add instruction?
- How long will it take to execute a lw instruction?
- How long will it take to execute a beq instruction?
- How long will it take to execute a j instruction?

Single-cycle CPU Control

What do we need to control? (5.3)

- Mux: are we branching or not?
- Registers: should we write data?
- Mux: result from ALU or memory?
- Mux: where does the 2nd ALU operand come from?
- Memory: read/write/neither?
- ALU: what is the operation?

Almost all of the information we need is in the instruction!

The ALU (5.3)

The ALU is stuck right in the middle of everything... It must:
- Add, Subtract, And, or Or for arithmetic instructions
- Subtract for a branch on equal
- Subtract and set for a SLT
- Add for a memory access

Function   BInvert  Op  CarryIn  Result
And        0        00  0        R = A • B
Or         0        01  0        R = A ∨ B
Add        0        10  0        R = A + B
Subtract   1        10  1        R = A - B
SLT        1        11  1        R = 1 if A < B, 0 if A ≥ B

BInvert and CarryIn are always the same: combine them into one signal called “sub”.

[Figure: 1-bit ALU slice with inputs A, B, CarryIn, BInvert, and Less, a 2-bit Operation select, and outputs Result and CarryOut]
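The function table above maps directly onto a small behavioural model. The following Python is only a sketch under stated assumptions: the function name `alu`, the 32-bit width, and the use of a signed comparison for SLT (a real bit-sliced ALU derives it from the sign of A - B) are choices made here, not taken from the slides.

```python
MASK = 0xFFFFFFFF

def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x

def alu(a, b, sub, op):
    """Return (result, zero) for the combined 'sub' bit and 2-bit op code."""
    a &= MASK
    b &= MASK
    b_eff = (~b & MASK) if sub else b      # BInvert
    total = (a + b_eff + sub) & MASK       # CarryIn equals sub
    if op == 0b00:
        result = a & b_eff                 # And (sub is 0, so b_eff == b)
    elif op == 0b01:
        result = a | b_eff                 # Or
    elif op == 0b10:
        result = total                     # Add, or Subtract when sub == 1
    else:
        result = 1 if to_signed(a) < to_signed(b) else 0   # SLT
    return result, result == 0

print(alu(5, 3, 0, 0b10))   # add
print(alu(5, 3, 1, 0b10))   # subtract
print(alu(3, 5, 1, 0b11))   # set on less than
```

Subtraction falls out of the adder for free: inverting B and setting the carry-in to 1 computes A + (not B) + 1, which is A - B in two's complement.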

Setting the ALU controls (5.3)

The instruction opcode and function code give us the info we need:
- For R-type instructions, the opcode is zero and the function code determines the ALU controls.
- For I-type instructions, the opcode determines the ALU controls.

New control signal: ALUOp is 00 for memory, 01 for branch, and 10 for R-type.

                                                          ALU control
Instruction   Opcode  ALUOp  Funct. Code  ALU action     sub  op
add           R-type  10     100000       add            0    10
sub           R-type  10     100010       subtract       1    10
and           R-type  10     100100       and            0    00
or            R-type  10     100101       or             0    01
SLT           R-type  10     101010       SLT            1    11
load word     LW      00     xxxxxx       add            0    10
store word    SW      00     xxxxxx       add            0    10
branch equal  BEQ     01     xxxxxx       subtract       1    10

Controlling the ALU (5.3)

ALUOp is determined by the opcode; separate logic will generate ALUOp.
For ALUOp = 00 or 01, the function code is unused.

ALUOp  F5 F4 F3 F2 F1 F0  Function  ALU Ctrl (sub op)
00     x  x  x  x  x  x   Add       0 10
x1     x  x  x  x  x  x   Sub       1 10
1x     x  x  0  0  0  0   Add       0 10
1x     x  x  0  0  1  0   Sub       1 10
1x     x  x  0  1  0  0   And       0 00
1x     x  x  0  1  0  1   Or        0 01
1x     x  x  1  0  1  0   SLT       1 11

Since ALUOp can only be 00, 01, or 10, we don’t care what ALUOp0 is when ALUOp1 is 1, and we don’t care what ALUOp1 is when ALUOp0 is 1.

A 6-input truth table: use standard minimization techniques.

[Figure: ALU control logic with inputs ALUOp1, ALUOp0, F3, F2, F1, F0 and outputs A2, A1, A0]
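One way to read the truth table is as a lookup on ALUOp first and the low four function bits second. The Python below is a sketch of that reading; the function name `alu_control` and the `(sub, op)` tuple it returns are assumptions made for the example.

```python
# Low four function-code bits (F3..F0) -> (sub, op), as in the table rows for ALUOp = 1x.
FUNCT_TABLE = {
    0b0000: (0, 0b10),   # add
    0b0010: (1, 0b10),   # subtract
    0b0100: (0, 0b00),   # and
    0b0101: (0, 0b01),   # or
    0b1010: (1, 0b11),   # SLT
}

def alu_control(alu_op, funct):
    """Return (sub, op) from the 2-bit ALUOp and the 6-bit function code."""
    if alu_op == 0b00:                 # memory access: always add
        return (0, 0b10)
    if alu_op & 0b01:                  # x1, branch: always subtract
        return (1, 0b10)
    return FUNCT_TABLE[funct & 0xF]    # 1x, R-type: F5 and F4 are don't-cares

print(alu_control(0b10, 0b100010))     # R-type sub
print(alu_control(0b00, 0b000000))     # lw/sw
```

Ignoring F5 and F4 mirrors the don't-care columns in the table. It is safe only for the five R-type functions listed there.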

Decoding the Instruction - Data (5.3)

The instruction holds the key to all of the data signals.

R-type:
Field  Opcode         RS           RT           RD          ShAmt     Function
Bits   31-26          25-21        20-16        15-11       10-6      5-0
Use    To ctrl logic  Read reg. A  Read reg. B  Write reg.  Not used  To ALU control

Memory, Branch:
Field  Opcode         RS           RT                      Immediate Data
Bits   31-26          25-21        20-16                   15-0
Use    To ctrl logic  Read reg. A  Write reg./Read reg. B  Memory address or branch offset

One problem: the write register number must come from two different places.

Instruction Decoding (5.3)

We can decode the data simply by dividing up the instruction bus:
- Opcode: [31-26]
- Read Reg A: Rs [25-21]
- Read Reg B: Rt [20-16]
- Write Reg: either Rd [15-11] or Rt [20-16]
- Immediate Data: [15-0]

[Figure: single-cycle datapath with the instruction bus split into Op, Rs, Rt, Rd, and Imm fields, and a mux selecting the write register]
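Dividing up the instruction bus amounts to shifts and masks. The sketch below assumes a hypothetical `decode` helper returning a dictionary; the two test words are hand-encoded `add $11, $10, $8` and `lw $8, 0($9)`.

```python
def decode(instr):
    """Split a 32-bit MIPS instruction word into its fixed bit fields."""
    return {
        "op":    (instr >> 26) & 0x3F,   # [31-26]
        "rs":    (instr >> 21) & 0x1F,   # [25-21]
        "rt":    (instr >> 16) & 0x1F,   # [20-16]
        "rd":    (instr >> 11) & 0x1F,   # [15-11]
        "shamt": (instr >> 6) & 0x1F,    # [10-6]
        "funct": instr & 0x3F,           # [5-0]
        "imm":   instr & 0xFFFF,         # [15-0]
    }

def write_register(fields, reg_dest):
    """Write register number: Rd when RegDest is 1, otherwise Rt."""
    return fields["rd"] if reg_dest else fields["rt"]

add_fields = decode(0x01485820)          # add $11, $10, $8
lw_fields = decode(0x8D280000)           # lw $8, 0($9)
print(add_fields)
print(write_register(add_fields, 1), write_register(lw_fields, 0))
```

All fields are extracted unconditionally, just as the wires are always connected in hardware; the control signals decide later which of them are actually used.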

Control Signals (5.3)

Signal    Asserted for
RegWrite  Load, R-type
RegDest   R-type
ALUSrc    Memory (load/store)
MemRead   Load
MemWrite  Store
MemToReg  Load
PCSrc     BEQ and zero
ALUOp     00: Memory, 01: Branch, 10: R-type

ALU Control: a function of ALUOp and the function code, FC [5-0].

[Figure: single-cycle datapath with the Ctrl unit driven by Op [31-26] and the ALU Ctrl unit driven by ALUOp and FC [5-0]]

Inside the control oval (5.3)

Instruction  Opcode  RegWrite  ALUSrc  MemToReg  RegDest  MemRead  MemWrite  PCSrc  ALUOp
R-format     000000  1         0       0         1        0        0         0      10
LW           100011  1         1       1         0        1        0         0      00
SW           101011  0         1       x         x        0        1         0      00
BEQ          000100  0         0       x         x        0        0         1      01

Encodings: ALUSrc 0: Reg, 1: Imm. MemToReg 0: ALU, 1: Mem. RegDest 0: Rt, 1: Rd. PCSrc 1: Branch. ALUOp 00: Mem, 01: Branch, 10: R-type.

This control logic can be decoded in several ways:
- Random logic, PLA, PAL
- Just build hardware that looks for the 4 opcodes
- For each opcode, assert the appropriate signals

Note: BEQ must also check the zero output of the ALU...
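In software, "hardware that looks for the 4 opcodes" becomes a lookup table. The following Python transcribes the table as a sketch; the dictionary layout, the use of `None` for don't-care entries, and the key name `BEQ` for the branch column (before it is combined with Zero) are assumptions for illustration.

```python
X = None  # don't-care entry ("x" in the table)

# opcode -> control signals, transcribed from the control table
CONTROL = {
    0b000000: {"RegWrite": 1, "ALUSrc": 0, "MemToReg": 0, "RegDest": 1,
               "MemRead": 0, "MemWrite": 0, "BEQ": 0, "ALUOp": 0b10},  # R-format
    0b100011: {"RegWrite": 1, "ALUSrc": 1, "MemToReg": 1, "RegDest": 0,
               "MemRead": 1, "MemWrite": 0, "BEQ": 0, "ALUOp": 0b00},  # LW
    0b101011: {"RegWrite": 0, "ALUSrc": 1, "MemToReg": X, "RegDest": X,
               "MemRead": 0, "MemWrite": 1, "BEQ": 0, "ALUOp": 0b00},  # SW
    0b000100: {"RegWrite": 0, "ALUSrc": 0, "MemToReg": X, "RegDest": X,
               "MemRead": 0, "MemWrite": 0, "BEQ": 1, "ALUOp": 0b01},  # BEQ
}

def main_control(opcode):
    """Look up the control signals for one of the four supported opcodes."""
    return CONTROL[opcode & 0x3F]

def pc_src(signals, zero):
    """The branch is taken only when the BEQ signal AND the ALU zero output are set."""
    return signals["BEQ"] & int(zero)

print(main_control(0b100011))
print(pc_src(main_control(0b000100), True))
```

The don't-care entries for SW and BEQ are harmless because RegWrite is 0 for both, so neither the write register number nor the write data is ever used.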

Control Signals (5.3)

We must AND BEQ and Zero to produce PCSrc.

[Figure: single-cycle datapath with the Ctrl unit driven by Op [31-26] and generating BEQ, MemToReg, MemRead, MemWrite, ALUOp, ALUSrc, RegWrite, and RegDest; the ALU Ctrl unit is driven by ALUOp and FC [5-0]]

Jumping (5.3)

[Figure: the previous datapath extended for jumps. The J field [25-0] is shifted left 2 (26 bits to 28 bits) and concatenated with bits [31-28] of PC+4 to form the 32-bit jump address; a new Jump control signal from the Ctrl unit selects it through an added mux on the next-PC path]
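The jump path in the diagram (shift the 26-bit field left by 2, then concatenate with the top 4 bits of PC+4) can be written out as a short Python sketch. The function name `jump_target` and the example addresses are assumptions made here.

```python
def jump_target(pc, instr):
    """Form the 32-bit jump address from PC+4 [31-28] and the J field [25-0]."""
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    j_field = instr & 0x03FFFFFF            # J: [25-0], 26 bits
    shifted = j_field << 2                  # shift left 2: 26 -> 28 bits
    return (pc_plus_4 & 0xF0000000) | shifted   # concatenate with [31-28]

# Hand-encoded "j 0x00400020": opcode 000010, target field = 0x00400020 >> 2
j_instr = (0b000010 << 26) | (0x00400020 >> 2)
print(hex(j_instr), hex(jump_target(0x00400000, j_instr)))
```

Because only 28 address bits come from the instruction, a jump can only reach targets within the same 256 MB region as PC+4.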

Performance

What major functional units are used by different instructions?

Assume the following times: memory access: 2 ns, ALU: 2 ns, registers: 1 ns.

Instruction  Functional units used                                         Time
R-type       Instr. Fetch, Register Read, ALU, Register Write              6 ns
LW           Instr. Fetch, Register Read, ALU, Memory Read, Register Write 8 ns
SW           Instr. Fetch, Register Read, ALU, Memory Write                7 ns
Branch       Instr. Fetch, Register Read, ALU                              5 ns
Jump         Instr. Fetch                                                  2 ns

Since the longest time is 8 ns (LW), the cycle time must be at least 8 ns.
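The per-instruction times follow from summing the delays of the units each instruction uses. The Python below reproduces that arithmetic as a sketch; the dictionary names are illustrative, and instruction fetch is treated as one memory access.

```python
# Unit delays in ns: memory access 2, ALU 2, register file 1
UNIT_NS = {
    "Instr. Fetch": 2, "Register Read": 1, "ALU": 2,
    "Memory Read": 2, "Memory Write": 2, "Register Write": 1,
}

# Functional units used by each instruction class
PATHS = {
    "R-type": ["Instr. Fetch", "Register Read", "ALU", "Register Write"],
    "LW":     ["Instr. Fetch", "Register Read", "ALU", "Memory Read", "Register Write"],
    "SW":     ["Instr. Fetch", "Register Read", "ALU", "Memory Write"],
    "Branch": ["Instr. Fetch", "Register Read", "ALU"],
    "Jump":   ["Instr. Fetch"],
}

latency = {name: sum(UNIT_NS[u] for u in units) for name, units in PATHS.items()}
min_cycle_ns = max(latency.values())   # the slowest instruction sets the clock

print(latency)
print(min_cycle_ns)
```

In a single-cycle design every instruction pays for the slowest one, so a jump that needs only 2 ns still occupies a full 8 ns cycle.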

Example

Calculate the execution time for the following program in a single-cycle datapath with a cycle time of 50 ns:

    main:   add  $9, $0, $0        # clear $9
            lw   $8, Tonto($9)     # put Tonto[0] in $8
            addi $9, $9, 4         # increment $9
            lw   $10, Tonto($9)    # put Tonto[1] in $10
            add  $11, $10, $8

Example 2

Calculate the execution time for the following program in a single-cycle datapath with a cycle time of 50 ns:

            .data
    ARRAY:  .word 3, 5, 7, 9, 2    # random values
    SUM:    .word 0                # initialize sum to zero
            .text
    main:   addi $6, $0, 5         # initialize loop counter to 5
            addi $7, $0, 0         # initialize array index to zero
            addi $8, $0, 0         # set $8 (sum temp) to zero
    REPEAT: lw   $5, ARRAY($7)     # R5 = ARRAY[i]
            add  $8, $8, $5        # SUM += ARRAY[i]
            addi $7, $7, 4         # increment index (i++)
            addi $6, $6, -1        # decrement loop counter
            bne  $6, $0, REPEAT    # check if 5 repetitions
            sw   $8, SUM($0)       # copy sum to memory
            addi $v0, $0, 10       # exit program
            syscall
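One possible way to work both examples is to count executed instructions and multiply by the cycle time, since a single-cycle datapath spends exactly one cycle per instruction. The Python sketch below does this under loudly stated assumptions that the slides do not confirm: every source line is a single machine instruction (no pseudo-instruction expansion of the label-based `lw`/`sw`), and `syscall` is counted as one cycle.

```python
CYCLE_NS = 50

def example1_instructions():
    """Straight-line program: add, lw, addi, lw, add."""
    return 5

def example2_instructions(loop_count=5):
    """Count executed instructions by stepping the loop counter like $6."""
    executed = 3                 # three addi instructions before REPEAT
    counter = loop_count
    while True:
        executed += 5            # lw, add, addi, addi, bne
        counter -= 1
        if counter == 0:         # bne falls through when $6 reaches zero
            break
    executed += 3                # sw, addi, syscall (assumed one cycle each)
    return executed

time1_ns = example1_instructions() * CYCLE_NS
time2_ns = example2_instructions() * CYCLE_NS
print(time1_ns, time2_ns)
```

If an assembler expands the label-based loads and stores into several instructions, the counts, and therefore the times, grow accordingly.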