Pipeline Timing Issues

Slides:



Advertisements
Similar presentations
Pipeline Example: cycle 1 lw R10,9(R1) sub R11,R2, R3 and R12,R4, R5 or R13,R6, R7.
Advertisements

1 A few words about the quiz Closed book, but you may bring in a page of handwritten notes. –You need to know what the “core” MIPS instructions do. –I.
MIPS Pipelined Datapath
Kevin Walsh CS 3410, Spring 2010 Computer Science Cornell University RISC Pipeline See: P&H Chapter 4.6.
Part 2 - Data Hazards and Forwarding 3/24/04++
Review: MIPS Pipeline Data and Control Paths
CSCE 212 Quiz 9 – 3/30/11 1.What is the clock cycle time based on for single-cycle and for pipelining? 2.What two actions can be done to resolve data hazards?
Lec 8: Pipelining Kavita Bala CS 3410, Fall 2008 Computer Science Cornell University.
Computer Architecture - A Pipelined Datapath A Pipelined Datapath  Resisters are used to save data between stages. 1/14.
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
Supplementary notes for pipelining LW ____,____ SUB ____,____,____ BEQ ____,____,____ ; assume that, condition for branch is not satisfied OR ____,____,____.
Pipeline Hazard CT101 – Computing Systems. Content Introduction to pipeline hazard Structural Hazard Data Hazard Control Hazard.
Pipelined Datapath and Control
CPE432 Chapter 4B.1Dr. W. Abu-Sufah, UJ Chapter 4B: The Processor, Part B-2 Read Section 4.7 Adapted from Slides by Prof. Mary Jane Irwin, Penn State University.
CSIE30300 Computer Architecture Unit 05: Overcoming Data Hazards Hsin-Chou Chi [Adapted from material by and
PROCESSOR PIPELINING YASSER MOHAMMAD. SINGLE DATAPATH DESIGN.
Datapath and Control AddressInstruction Memory Write Data Reg Addr Register File ALU Data Memory Address Write Data Read Data PC Read Data Read Data.
Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.
Interstage Buffers 1 Computer Organization II © McQuain Pipeline Timing Issues Consider executing: add $t2, $t1, $t0 sub $t3, $t1, $t0 or.
Computer Architecture Lecture 6.  Our implementation of the MIPS is simplified memory-reference instructions: lw, sw arithmetic-logical instructions:
Computer Organization
Stalling delays the entire pipeline
Note how everything goes left to right, except …
Pipelining Chapter 6.
CSCI206 - Computer Organization & Programming
Test 2 review Lectures 5-10.
Single Clock Datapath With Control
Appendix C Pipeline implementation
Chapter 4 The Processor Part 4
ECS 154B Computer Architecture II Spring 2009
\course\cpeg323-08F\Topic6b-323
Design of the Control Unit for Single-Cycle Instruction Execution
Forwarding Now, we’ll introduce some problems that data hazards can cause for our pipelined processor, and show how to handle them with forwarding.
Review: MIPS Pipeline Data and Control Paths
Morgan Kaufmann Publishers The Processor
Morgan Kaufmann Publishers The Processor
Chapter 4 The Processor Part 2
SOLUTIONS CHAPTER 4.
Pipelining review.
Single-cycle datapath, slightly rearranged
Pipelining Chapter 6.
Current Design.
Design of the Control Unit for One-cycle Instruction Execution
A pipeline diagram Clock cycle lw $t0, 4($sp) IF ID
Pipelining in more detail
Systems Architecture II
CSCI206 - Computer Organization & Programming
\course\cpeg323-05F\Topic6b-323
Pipelined Control (Simplified)
Pipeline control unit (highly abstracted)
The Processor Lecture 3.6: Control Hazards
The Processor Lecture 3.2: Building a Datapath with Control
Control unit extension for data hazards
The Processor Lecture 3.5: Data Hazards
Instruction Execution Cycle
Pipeline control unit (highly abstracted)
Pipeline Control unit (highly abstracted)
Pipelining Basic concept of assembly line
Pipelining (II).
Control unit extension for data hazards
Pipelining Basic concept of assembly line
Morgan Kaufmann Publishers The Processor
Introduction to Computer Organization and Architecture
Control unit extension for data hazards
MIPS Pipelined Datapath
©2003 Craig Zilles (derived from slides by Howard Huang)
Pipelined datapath and control
Need to stall for one cycle.
ELEC / Computer Architecture and Design Spring 2015 Pipeline Control and Performance (Chapter 6) Vishwani D. Agrawal James J. Danaher.
Pipelining Hazards.
Presentation transcript:

Pipeline Timing Issues Consider executing: add $t2, $t1, $t0 sub $t3, $t1, $t0 or $t4, $t1, $t0 sw $t2, 0($t0) time 1 ... sw 2 or 3 sub 4 add

Pipeline Timing Issues What happens during cycle 1? Among other things… - sw reaches the ID stage, and Control sets MemWrite to 1 - so, a memory write will occur while sub is in the MEM stage - and that’s bad news… time 1 ... sw 2 or 3 sub 4 add

Pipeline Timing Issues What needs to happen instead? - the value of MemWrite that goes with sw… - … needs to travel forward, stage to stage as sw does 1 2 3 4 time add sub or sw ... 1 2 3 4 time sub or sw ...

Pipeline Timing Issues What needs to happen instead? - the value of MemWrite that goes with sw… - … needs to travel forward, stage to stage as sw does time 1 ... ... 2 ... 3 sw 4 or So how do we make this happen?

Adding Buffers Put storage buffers between adjacent stages: Control writes/reads with the clock signal. Write values exiting a stage to the “outbound” buffer. Read values entering a stage from the “inbound” buffer. So no signal (or data value) arrives before its time…

Pipeline Operation Cycle-by-cycle flow of instructions through the pipelined datapath “Single-clock-cycle” pipeline diagram Shows pipeline usage in a single cycle Highlight resources used c.f. “multi-clock-cycle” diagram Graph of operation over time We’ll look at “single-clock-cycle” diagrams for load & store

IF for Load, Store, … PC+4 is computed, stored back into the PC, stored in the IF/ID buffer although it will not be needed in a later stage for LW or SW Instruction word is fetched from memory, and stored in the IF/ID buffer because it will be needed in the next stage. Write into the buffer

ID for Load Bits of load instruction are taken from IF/ID buffer, while new instruction is being fetched back in stage 1. PC+4 is passed forward to ID/EX buffer... Read register #1 and #2 contents are fetched and stored in ID/EX buffer until needed in next stage… #2 won't be needed. 16-bit field is fetched from IF/ID buffer, then sign-extended, then stored in the ID/EX buffer for use in a later stage. Read from the buffer

EX for Load PC+4 is taken from ID/EX buffer and added to branch offset… Computed branch target address is stored in EX/MEM buffer to await decision in next stage... but won't be needed. Read register #1 contents are taken from ID/EX buffer and provided to ALU. 16-bit literal is provided to ALU as second operand ALU result and Zero line are stored in EX/MEM buffer for use as memory address in next stage. Read register #2 is passed forward to EX/MEM buffer, for possible use in later stage… but won't be needed.

MEM for Load Zero line taken from EX/MEM buffer for branch control logic in this stage… Value on Read data port of data memory is stored in MEM/WB buffer, awaiting decision in last stage.. ALU result is taken from EX/MEM buffer and passed to Address port of data memory. ALU result also stored in MEM/WB buffer for possible use in last stage… Read register #2 contents taken from EX/MEM buffer and passed to Write data port of data memory.

WB for Load But the Write register port is now seeing the register number from a different, later instruction. Since load instruction, value from data memory is selected and passed back to register file.

Corrected Datapath for Load So we fix the register number problem by passing the Write register # from the load instruction through the various inter-stage buffers… …and then back, on the correct clock cycle.

EX for Store Almost the same as for LW… Read register #2 is passed forward to EX/MEM buffer, for use in later stage… for SW this will be needed.

MEM for Store Zero line taken from EX/MEM buffer for branch control logic in this stage… Value on Read data port of data memory is stored in MEM/WB buffer, awaiting decision in last stage.. ALU result is taken from EX/MEM buffer and passed to Address port of data memory. ALU result also stored in MEM/WB buffer for possible use in last stage… Read register #2 contents taken from EX/MEM buffer and passed to Write data port of data memory.

WB for Store Since SW instruction, neither value will be written to the register file… doesn't really matter which value we send back…

Questions to Ponder Can you repeat this analysis for other sorts of instructions, identifying in each stage what's relevant and what's not? How much storage space does each interstage buffer need? Why? Do the interstage buffers have any effect on the overall time required for an instruction to migrate through the pipeline? Why?

Summary Here’s our preliminary configuration for the buffers: