1 Appendix A Pipeline implementation Pipeline hazards, detection and forwarding Multiple-cycle operations MIPS R4000 CDA5155 Spring, 2007, Peir / University.

Presentation transcript:

1 Appendix A Pipeline implementation Pipeline hazards, detection and forwarding Multiple-cycle operations MIPS R4000 CDA5155 Spring, 2007, Peir / University of Florida

2 Limits of Pipelining Increasing the number of pipeline stages in a given logic block by a factor of n generally allows increasing clock speed & throughput by a factor of almost n. –Usually less than n because of overheads such as latch delay and imbalance of delay among the stages. But pipelining has a natural limit: –At least 1 layer of logic gates per pipeline stage! –Practical minimum is usually several gates (2-10). –Commercial designs are approaching this point!!
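The "almost n" claim can be illustrated with a small sketch. It assumes a simple model that is not on the slide: the logic delay divides evenly across the stages and each stage adds a fixed latch overhead. The function name and the numbers are hypothetical.

```python
def pipeline_speedup(total_logic_delay: float, stages: int, latch_overhead: float) -> float:
    """Speedup of an n-stage pipeline over the unpipelined logic block.

    Assumed model: cycle time = (logic delay / stages) + per-stage latch overhead.
    """
    if stages < 1:
        raise ValueError("stages must be >= 1")
    cycle_time = total_logic_delay / stages + latch_overhead
    return total_logic_delay / cycle_time


# Illustrative numbers only: 10 ns of logic, 0.5 ns latch overhead per stage.
for n in (5, 10, 20):
    print(n, "stages ->", round(pipeline_speedup(10.0, n, 0.5), 2), "x")
```

With zero overhead the speedup is exactly n; any fixed overhead makes each added stage buy less, which is the limit the slide describes.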

3 Simple RISC Datapath

4 Basic RISC Pipelining Basic idea: –Each instruction spends 1 clock cycle in each of the 5 execution stages. –During 1 clock cycle, the pipeline can be processing (different stages of) 5 different instructions.

5 Adding Pipeline Registers

6 Operations of Pipe Stages

7 Pipeline Hazards Hazards are circumstances that may lead to stalls (delays, “bubbles”) in the pipeline if not addressed. Three major types: –Structural hazards: Lack of HW resources to keep all instructions moving. –Data hazards: Results of earlier instrs. not yet available when needed. –Control hazards: Control decisions resulting from earlier instrs. (branches) not yet made; don’t know which new instrs. to execute.

8 Structural Hazard Example Suppose you had a combined instruction+data memory with only 1 read port

9 Hazards Produce “Bubbles”

10 Another View

11 Example Data Hazard

12 Forwarding for Data Hazards

13 Another Forwarding Example

14 Three Types of Data Hazards Let i be an earlier instruction, j a later one. RAW (read after write) –j tries to read a value before i writes it WAW (write after write) –i and j write to same place, but in the wrong order. –Only occurs if >1 pipeline stage can write. WAR (write after read) –j writes a new value to a location before i has read the old one. –Only occurs if writes can happen before reads in pipeline.
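The three definitions can be written as register comparisons between an earlier instruction i and a later instruction j. This is a minimal sketch: the `Instr` type and `dependences` function are hypothetical names, and the check only reports the dependence. Whether it becomes an actual hazard depends on the pipeline, as the slide notes.

```python
from typing import NamedTuple, Optional, Set, Tuple


class Instr(NamedTuple):
    dest: Optional[str]      # register written, or None (e.g. a store)
    srcs: Tuple[str, ...]    # registers read


def dependences(i: Instr, j: Instr) -> Set[str]:
    """Classify dependences from earlier instruction i to later instruction j."""
    found = set()
    if i.dest is not None and i.dest in j.srcs:
        found.add("RAW")     # j reads what i writes
    if i.dest is not None and i.dest == j.dest:
        found.add("WAW")     # both write the same register
    if j.dest is not None and j.dest in i.srcs:
        found.add("WAR")     # j overwrites what i still has to read
    return found


# add r1, r2, r3 followed by sub r4, r1, r5
print(dependences(Instr("r1", ("r2", "r3")), Instr("r4", ("r1", "r5"))))
```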

15 An Unavoidable Stall - Load

16 Stalling for Load Dependent

17 Data Hazard Prevention A clever compiler can often reschedule instructions (code motion) to avoid a stall. –A simple example:
Original code:
  lw  r2, 0(r4)
  add r1, r2, r3    <- Note: Stall happens here!
  lw  r5, 4(r4)
Transformed code:
  lw  r2, 0(r4)
  lw  r5, 4(r4)
  add r1, r2, r3    <- No stall needed!
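The effect of the rescheduling can be checked by counting load-use stalls in both orderings. The sketch assumes the classic 5-stage pipeline with forwarding, where only an instruction immediately after a load that reads the loaded register stalls, for one cycle. The tuple encoding and function name are hypothetical.

```python
from typing import List, Tuple

# (opcode, destination register, source registers)
Op = Tuple[str, str, Tuple[str, ...]]


def load_use_stalls(program: List[Op]) -> int:
    """Count one-cycle stalls caused by using a load result in the very next instruction."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_srcs = cur
        if prev_op == "lw" and prev_dest in cur_srcs:
            stalls += 1
    return stalls


original = [
    ("lw", "r2", ("r4",)),
    ("add", "r1", ("r2", "r3")),
    ("lw", "r5", ("r4",)),
]
# Same instructions, with the independent load moved between the load and its use.
transformed = [original[0], original[2], original[1]]

print(load_use_stalls(original), load_use_stalls(transformed))
```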

18 Data Hazard Detection

19 Hazard Detection Logic for Load Example: Detecting whether an instruction that has just been fetched needs to be stalled because it depends on a preceding load. Note: the right-hand side of the equation should use IF/ID.IR.
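The slide's equation is not in the transcript, so the following is only a sketch of what such a load interlock check typically compares. It assumes that the load in ID/EX writes its rt field, that register-register ALU instructions in IF/ID read rs and rt, and that the other instruction classes are checked on rs only. The `IR` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IR:
    opclass: str   # "load", "store", "alu_rr", "alu_imm" or "branch" (assumed classes)
    rs: int
    rt: int
    rd: int = 0


def load_interlock(id_ex: IR, if_id: IR) -> bool:
    """True if the instruction in IF/ID must stall behind a load in ID/EX."""
    if id_ex.opclass != "load":
        return False
    loaded = id_ex.rt                       # MIPS loads write the rt register
    if if_id.opclass == "alu_rr":
        return loaded in (if_id.rs, if_id.rt)
    return loaded == if_id.rs               # other classes: compare rs only


lw_r2 = IR("load", rs=4, rt=2)              # lw  r2, 0(r4)
add_r1 = IR("alu_rr", rs=2, rt=3, rd=1)     # add r1, r2, r3
lw_r5 = IR("load", rs=4, rt=5)              # lw  r5, 4(r4)
print(load_interlock(lw_r2, add_r1), load_interlock(lw_r2, lw_r5))
```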

20 Forwarding Situations in MIPS Same as Figure A.22

21 Forwarding to The ALU

22 Branch Hazard Suppose the new PC value is not computed until the MEM stage. Then we must stall 3 clocks after every branch!
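The cost of that 3-cycle stall can be estimated with the usual CPI formula. This is a sketch: the base CPI of 1 and the 20% branch frequency are assumed for illustration and do not come from the slide.

```python
def cpi_with_branch_stalls(branch_fraction: float, stall_cycles: int, base_cpi: float = 1.0) -> float:
    """Effective CPI when every branch costs a fixed number of stall cycles."""
    return base_cpi + branch_fraction * stall_cycles


# Assumed workload: 20% of instructions are branches.
print(cpi_with_branch_stalls(0.20, 3))   # branch resolved in MEM: 3 stall cycles
print(cpi_with_branch_stalls(0.20, 1))   # branch resolved in ID: 1 stall cycle
```

Under these assumptions, moving branch resolution from MEM to ID cuts the branch contribution to CPI by two thirds.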

23 Early Branch Resolution Branch resolution at ID stage. See Fig. A.24: resolving the branch in the ID stage without latching saves another cycle!!

24 Predict-Not-Taken Same as Fig. A.12 (Branch resolves in ID)

25 Delayed Branches Machine code sequence: Branch instruction Delay slot instruction(s) Post-branch instructions Branch is taken (if taken) at this point Same as Fig. A.13

26 Filling the Branch-Delay Slot For (b) and (c), the delay-slot instruction must have no side effects when the branch goes in the unexpected direction.

27 Multi-Cycle Execution Same as Fig. A.29

28 Latency & Initiation Interval Latency: –Extra delay cycles before result is available. Initiation interval: –Minimum number of cycles before a new input can be given to that functional unit.
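The two definitions can be turned into a small sketch. It reads latency as the number of intervening cycles between a producer and its consumer, and the initiation interval as the spacing between successive operations on one unit. The function names and example values are illustrative assumptions.

```python
from typing import List


def earliest_dependent_issue(producer_issue: int, latency: int) -> int:
    """Earliest cycle in which an instruction using the result can issue without stalling."""
    return producer_issue + 1 + latency


def start_cycles(num_ops: int, initiation_interval: int, first_cycle: int = 0) -> List[int]:
    """Cycles in which back-to-back operations can enter the same functional unit."""
    return [first_cycle + k * initiation_interval for k in range(num_ops)]


# Assumed values: a fully pipelined unit (interval 1) versus a non-pipelined one (interval 25).
print(start_cycles(3, 1))
print(start_cycles(3, 25))
print(earliest_dependent_issue(0, 3))
```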

29 Pipelined Multiple-FP Operations Same as Fig. A.31

30 Pipelining FP Instructions Notice instructions may complete out-of-order:
  MULTD  IF ID M1 M2 M3 M4 M5 M6 M7 ME WB
  ADDD      IF ID A1 A2 A3 A4 ME WB
  LD           IF ID EX ME WB
  SD              IF ID EX ME WB
Raises the possibility of WAW hazards, and structural hazards in MEM & WB stages. Structural hazards may occur especially often with non-pipelined DIV unit. Out-of-order completion impacts exception handling.
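The out-of-order completion in this sequence can be reproduced by computing the WB cycle of each instruction. The sketch assumes one instruction is fetched per cycle with no stalls, and takes the execution-stage counts from the diagram (7 for MULTD, 4 for ADDD, 1 for LD and SD). The function names are hypothetical.

```python
from typing import List

# Number of execution stages per opcode, as drawn on the slide.
EXEC_STAGES = {"MULTD": 7, "ADDD": 4, "LD": 1, "SD": 1}


def wb_cycle(issue_index: int, exec_stages: int) -> int:
    """Cycle of the WB stage for the instruction fetched in cycle issue_index + 1."""
    fetch = issue_index + 1
    # IF, then ID (+1), then the execution stages, then MEM and WB (+2).
    return fetch + 1 + exec_stages + 2


def completion_order(program: List[str]) -> List[str]:
    """Opcodes sorted by the cycle in which they reach WB."""
    wb = [(wb_cycle(i, EXEC_STAGES[op]), op) for i, op in enumerate(program)]
    return [op for _, op in sorted(wb)]


program = ["MULTD", "ADDD", "LD", "SD"]
for i, op in enumerate(program):
    print(op, "reaches WB in cycle", wb_cycle(i, EXEC_STAGES[op]))
print(completion_order(program))
```

In this stall-free model MULTD is fetched first but reaches WB last, which is why precise exceptions and write ordering need the extra care described in the slides that follow.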

31 Issues in Multi-Cycle Operations Stalls for RAW hazards are longer and more frequent (Fig. A.33). WAW is possible; WAR is not (why?). Structural hazards are possible for non-pipelined units. Multiple WBs in the same cycle are likely (Fig. A.34). Handling hazards: –At issue (ID) stage: Check structural hazards: functional unit, WB port. Check RAW hazards: issue with forwarding. Check WAW hazards: do not issue, so that writes occur in order. –Detect and stall the instruction before it enters the MEM or WB stage.

32 Maintaining Precise Exceptions Options: Settle for imprecise exceptions. Buffer results and complete in order: –Requires large buffers and comparators –History file, future file approaches. Software trap handling when an exception occurs. Hybrid scheme: issue only when it is certain that no earlier instruction will raise an exception: –All instructions before can be completed –No instructions after can be completed

33 Real MIPS R4000 Pipeline IF,IS - Instruction cache fetch, First & Second halves. RF - Inst. decode, Register Fetch, hazard check… EX - Execution (EA calc, ALU op, target calc…) DF,DS - Data cache access, First & Second halves. TC - Tag Check, did cache access hit? WB - Write-Back for loads & register-register ops. Read through A.38 – A.49

34 2-Cycle Load Delay

35 Branch Delay