1 Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)

Slides:



Advertisements
Similar presentations
1 Lecture: Pipelining Hazards Topics: Basic pipelining implementation, hazards, bypassing HW2 posted, due Wednesday.
Advertisements

ILP: IntroductionCSCE430/830 Instruction-level parallelism: Introduction CSCE430/830 Computer Architecture Lecturer: Prof. Hong Jiang Courtesy of Yifeng.
1 Lecture: Out-of-order Processors Topics: out-of-order implementations with issue queue, register renaming, and reorder buffer, timing, LSQ.
Pipeline Hazards Pipeline hazards These are situations that inhibit that the next instruction can be processed in the next stage of the pipeline. This.
Lecture 6: Pipelining MIPS R4000 and More Kai Bu
Instruction-Level Parallelism (ILP)
1 Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations.
1 IF IDEX MEM L.D F4,0(R2) MUL.D F0, F4, F6 ADD.D F2, F0, F8 L.D F2, 0(R2) WB IF IDM1 MEM WBM2M3M4M5M6M7 stall.
1 Lecture: Pipeline Wrap-Up and Static ILP Topics: multi-cycle instructions, precise exceptions, deep pipelines, compiler scheduling, loop unrolling, software.
S. Barua – CPSC 440 CHAPTER 6 ENHANCING PERFORMANCE WITH PIPELINING This chapter presents pipelining.
1 Lecture 17: Basic Pipelining Today’s topics:  5-stage pipeline  Hazards and instruction scheduling Mid-term exam stats:  Highest: 90, Mean: 58.
1 Lecture 3: Pipelining Basics Biggest contributors to performance: clock speed, parallelism Today: basic pipelining implementation (Sections A.1-A.3)
1 Lecture 7: Out-of-Order Processors Today: out-of-order pipeline, memory disambiguation, basic branch prediction (Sections 3.4, 3.5, 3.7)
1 Lecture 8: Branch Prediction, Dynamic ILP Topics: branch prediction, out-of-order processors (Sections )
1  2004 Morgan Kaufmann Publishers Chapter Six. 2  2004 Morgan Kaufmann Publishers Pipelining The laundry analogy.
1 Lecture 5: Pipeline Wrap-up, Static ILP Basics Topics: loop unrolling, VLIW (Sections 2.1 – 2.2) Assignment 1 due at the start of class on Thursday.
1 Lecture 8: Instruction Fetch, ILP Limits Today: advanced branch prediction, limits of ILP (Sections , )
1 Lecture 18: Pipelining Today’s topics:  Hazards and instruction scheduling  Branch prediction  Out-of-order execution Reminder:  Assignment 7 will.
1 Lecture 10: ILP Innovations Today: handling memory dependences with the LSQ and innovations for each pipeline stage (Section 3.5)
DLX Instruction Format
1 Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)
1 Lecture 4: Advanced Pipelines Control hazards, multi-cycle in-order pipelines, static ILP (Appendix A.4-A.10, Sections )
1 Lecture 9: Dynamic ILP Topics: out-of-order processors (Sections )
Appendix A Pipelining: Basic and Intermediate Concepts
ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.
1 Lecture 7: Static ILP and branch prediction Topics: static speculation and branch prediction (Appendix G, Section 2.3)
Pipeline Hazard CT101 – Computing Systems. Content Introduction to pipeline hazard Structural Hazard Data Hazard Control Hazard.
Chapter 2 Summary Classification of architectures Features that are relatively independent of instruction sets “Different” Processors –DSP and media processors.
1 Appendix A Pipeline implementation Pipeline hazards, detection and forwarding Multiple-cycle operations MIPS R4000 CDA5155 Spring, 2007, Peir / University.
Spring 2003CSE P5481 Precise Interrupts Precise interrupts preserve the model that instructions execute in program-generated order, one at a time If an.
Computer Architecture Pipeline. Motivation  Non-pipelined design  Single-cycle implementation  The cycle time depends on the slowest instruction 
Lecture 16: Basic Pipelining
11 Pipelining Kosarev Nikolay MIPT Oct, Pipelining Implementation technique whereby multiple instructions are overlapped in execution Each pipeline.
LECTURE 10 Pipelining: Advanced ILP. EXCEPTIONS An exception, or interrupt, is an event other than regular transfers of control (branches, jumps, calls,
1 Lecture: Out-of-order Processors Topics: a basic out-of-order processor with issue queue, register renaming, and reorder buffer.
1 Lecture 3: Pipelining Basics Today: chapter 1 wrap-up, basic pipelining implementation (Sections C.1 - C.4) Reminders:  Sign up for the class mailing.
CSC 4250 Computer Architectures September 22, 2006 Appendix A. Pipelining.
1 Lecture: Pipelining Extensions Topics: control hazards, multi-cycle instructions, pipelining equations.
CS203 – Advanced Computer Architecture Pipelining Review.
1 Lecture 20: OOO, Memory Hierarchy Today’s topics:  Out-of-order execution  Cache basics.
1 Lecture: Pipeline Wrap-Up and Static ILP Topics: multi-cycle instructions, precise exceptions, deep pipelines, compiler scheduling, loop unrolling, software.
CDA3101 Recitation Section 8
Lecture 17: Pipelining Today’s topics: 5-stage pipeline Hazards
Lecture: Pipelining Extensions
Appendix C Pipeline implementation
Exceptions & Multi-cycle Operations
Pipelining: Advanced ILP
Lecture 6: Advanced Pipelines
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
Lecture: Pipelining Hazards
Lecture 18: Pipelining Today’s topics:
Lecture: Pipelining Hazards
Lecture 5: Pipelining Basics
Lecture 18: Pipelining Today’s topics:
Pipelining in more detail
CSC 4250 Computer Architectures
Lecture: Pipelining Hazards
Lecture: Out-of-order Processors
Lecture 8: Dynamic ILP Topics: out-of-order processors
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
How to improve (decrease) CPI
Lecture: Pipelining Extensions
Lecture 20: OOO, Memory Hierarchy
Lecture: Pipelining Extensions
Lecture 18: Pipelining Today’s topics:
CS203 – Advanced Computer Architecture
Lecture 4: Advanced Pipelines
Lecture: Pipelining Hazards
Lecture 5: Pipeline Wrap-up, Static ILP
Lecture 9: Dynamic ILP Topics: out-of-order processors
Presentation transcript:

1 Lecture 4: Advanced Pipelines Data hazards, control hazards, multi-cycle in-order pipelines (Appendix A.4-A.10)

2 Hazards Structural hazards: different instructions in different stages (or the same stage) conflicting for the same resource Data hazards: an instruction cannot continue because it needs a value that has not yet been generated by an earlier instruction Control hazard: fetch cannot continue because it does not know the outcome of an earlier branch – special case of a data hazard – separate category because they are treated in different ways

3 Data Hazards

4 Bypassing Some data hazard stalls can be eliminated: bypassing

5 Example add R1, R2, R3 lw R4, 8(R1)

6 Example lw R1, 8(R2) lw R4, 8(R1)

7 Data Dependence Example lw R1, 8(R2) sw R1, 8(R3)

8 Summary For the 5-stage pipeline, bypassing can eliminate delays between the following example pairs of instructions: add/sub R1, R2, R3 add/sub/lw/sw R4, R1, R5 lw R1, 8(R2) sw R1, 4(R3) The following pairs of instructions will have intermediate stalls: lw R1, 8(R2) add/sub/lw R3, R1, R4 or sw R3, 8(R1) fmul F1, F2, F3 fadd F5, F1, F4

9 Control Hazards Simple techniques to handle control hazard stalls:  for every branch, introduce a stall cycle (note: every 6 th instruction is a branch!)  assume the branch is not taken and start fetching the next instruction – if the branch is taken, need hardware to cancel the effect of the wrong-path instruction  fetch the next instruction (branch delay slot) and execute it anyway – if the instruction turns out to be on the correct path, useful work was done – if the instruction turns out to be on the wrong path, hopefully program state is not lost

10 Branch Delay Slots

11 Slowdowns from Stalls Perfect pipelining with no hazards  an instruction completes every cycle (total cycles ~ num instructions)  speedup = increase in clock speed = num pipeline stages With hazards and stalls, some cycles (= stall time) go by during which no instruction completes, and then the stalled instruction completes Total cycles = number of instructions + stall cycles Slowdown because of stalls = 1/ (1 + stall cycles per instr)

12 Pipeline Implementation Signals for the muxes have to be generated – some of this can happen during ID Need look-up tables to identify situations that merit bypassing/stalling – the number of inputs to the muxes goes up

13 SituationExample codeAction No dependenceLD R1, 45(R2) DADD R5, R6, R7 DSUB R8, R6, R7 OR R9, R6, R7 No hazards Dependence requiring stall LD R1, 45(R2) DADD R5, R1, R7 DSUB R8, R6, R7 OR R9, R6, R7 Detect use of R1 during ID of DADD and stall Dependence overcome by forwarding LD R1, 45(R2) DADD R5, R6, R7 DSUB R8, R1, R7 OR R9, R6, R7 Detect use of R1 during ID of DSUB and set mux control signal that accepts result from bypass path Dependence with accesses in order LD R1, 45(R2) DADD R5, R6, R7 DSUB R8, R6, R7 OR R9, R1, R7 No action required Detecting Control Signals

14 Multicycle Instructions Functional unitLatencyInitiation interval Integer ALU11 Data memory21 FP add41 FP multiply71 FP divide25

15 Effects of Multicycle Instructions Structural hazards if the unit is not fully pipelined (divider) Frequent RAW hazard stalls Potentially multiple writes to the register file in a cycle WAW hazards because of out-of-order instr completion Imprecise exceptions because of o-o-o instr completion Note: Can also increase the “width” of the processor: handle multiple instructions at the same time: for example, fetch two instructions, read registers for both, execute both, etc.

16 Precise Exceptions On an exception:  must save PC of instruction where program must resume  all instructions after that PC that might be in the pipeline must be converted to NOPs (other instructions continue to execute and may raise exceptions of their own)  temporary program state not in memory (in other words, registers) has to be stored in memory  potential problems if a later instruction has already modified memory or registers A processor that fulfils all the above conditions is said to provide precise exceptions (useful for debugging and of course, correctness)

17 Dealing with these Effects Multiple writes to the register file: increase the number of ports, stall one of the writers during ID, stall one of the writers during WB (the stall will propagate) WAW hazards: detect the hazard during ID and stall the later instruction Imprecise exceptions: buffer the results if they complete early or save more pipeline state so that you can return to exactly the same state that you left at

18 ILP Instruction-level parallelism: overlap among instructions: pipelining or multiple instruction execution What determines the degree of ILP?  dependences: property of the program  hazards: property of the pipeline

19 Types of Dependences Data dependences: an instr produces a result for another (true dependence, results in RAW hazards in a pipeline) Name dependences: two instrs that use the same names (anti and output dependences, result in WAR and WAW hazards in a pipeline) Control dependences: an instruction’s execution depends on the result of a branch – re-ordering should preserve exception behavior and dataflow

20 Title Bullet