Control unit extension for data hazards

Slides:

Advertisements

Similar presentations

Morgan Kaufmann Publishers The Processor

Advertisements

Instruction-Level Parallelism (ILP)

MIPS Pipelined Datapath

ECE 445 – Computer Organization

Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE Computer Organization Pipelined Processor.

Review: MIPS Pipeline Data and Control Paths

ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )

ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )

1  2004 Morgan Kaufmann Publishers Chapter Six. 2  2004 Morgan Kaufmann Publishers Pipelining The laundry analogy.

1 Stalling  The easiest solution is to stall the pipeline  We could delay the AND instruction by introducing a one-cycle delay into the pipeline, sometimes.

Computer Organization Lecture Set – 06 Chapter 6 Huei-Yung Lin.

Pipelined Datapath and Control (Lecture #15) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer.

ENGS 116 Lecture 51 Pipelining and Hazards Vincent H. Berk September 30, 2005 Reading for today: Chapter A.1 – A.3, article: Patterson&Ditzel Reading for.

CSE378 Pipelining1 Pipelining Basic concept of assembly line –Split a job A into n sequential subjobs (A 1,A 2,…,A n ) with each A i taking approximately.

Lecture 15: Pipelining and Hazards CS 2011 Fall 2014, Dr. Rozier.

Pipeline Hazard CT101 – Computing Systems. Content Introduction to pipeline hazard Structural Hazard Data Hazard Control Hazard.

Pipeline Data Hazards: Detection and Circumvention Adapted from Computer Organization and Design, Patterson & Hennessy, © 2005, and from slides kindly.

Pipelined Datapath and Control

CPE432 Chapter 4B.1Dr. W. Abu-Sufah, UJ Chapter 4B: The Processor, Part B-2 Read Section 4.7 Adapted from Slides by Prof. Mary Jane Irwin, Penn State University.

Chapter 4 CSF 2009 The processor: Pipelining. Performance Issues Longest delay determines clock period – Critical path: load instruction – Instruction.

Comp Sci pipelining 1 Ch. 13 Pipelining. Comp Sci pipelining 2 Pipelining.

11/13/2015 8:57 AM 1 of 86 Pipelining Chapter 6. 11/13/2015 8:57 AM 2 of 86 Overview of Pipelining Pipelining is an implementation technique in which.

Winter 2002CSE Topic Branch Hazards in the Pipelined Processor.

1 (Based on text: David A. Patterson & John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 3 rd Ed., Morgan Kaufmann,

CSE431 L07 Overcoming Data Hazards.1Irwin, PSU, 2005 CSE 431 Computer Architecture Fall 2005 Lecture 07: Overcoming Data Hazards Mary Jane Irwin (

Branch Hazards and Static Branch Prediction Techniques

1/24/ :00 PM 1 of 86 Pipelining Chapter 6. 1/24/ :00 PM 2 of 86 Overview of Pipelining Pipelining is an implementation technique in which.

CSIE30300 Computer Architecture Unit 05: Overcoming Data Hazards Hsin-Chou Chi [Adapted from material by and

Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 13: Branch prediction (Chapter 4/6)

PROCESSOR PIPELINING YASSER MOHAMMAD. SINGLE DATAPATH DESIGN.

Pipelining: Implementation CPSC 252 Computer Organization Ellen Walker, Hiram College.

CSE 340 Computer Architecture Spring 2016 Overcoming Data Hazards.

Interstage Buffers 1 Computer Organization II © McQuain Pipeline Timing Issues Consider executing: add $t2, $t1, $t0 sub $t3, $t1, $t0 or.

Pipeline Timing Issues

Computer Organization CS224

Stalling delays the entire pipeline

Single Clock Datapath With Control

Pipeline Implementation (4.6)

Appendix C Pipeline implementation

Morgan Kaufmann Publishers The Processor

Chapter 4 The Processor Part 4

ECS 154B Computer Architecture II Spring 2009

ECE232: Hardware Organization and Design

Pipelining: Advanced ILP

Chapter 4 The Processor Part 3

Review: MIPS Pipeline Data and Control Paths

Morgan Kaufmann Publishers The Processor

Morgan Kaufmann Publishers The Processor

Pipelining review.

Single-cycle datapath, slightly rearranged

The processor: Pipelining and Branching

Current Design.

Pipelining in more detail

Data Hazards Data Hazard

Pipelining Basic concept of assembly line

Pipeline control unit (highly abstracted)

The Processor Lecture 3.6: Control Hazards

The Processor Lecture 3.5: Data Hazards

Instruction Execution Cycle

Pipeline control unit (highly abstracted)

Pipeline Control unit (highly abstracted)

Pipelining Basic concept of assembly line

Pipelining (II).

Control unit extension for data hazards

Pipelining Basic concept of assembly line

Control unit extension for data hazards

MIPS Pipelined Datapath

ELEC / Computer Architecture and Design Spring 2015 Pipeline Control and Performance (Chapter 6) Vishwani D. Agrawal James J. Danaher.

Pipelining Hazards.

Presentation transcript:

Control unit extension for data hazards Hazard detection unit Control Unit ID/EX EX/Mem Mem/WB IF/ID IF ID EX Mem WB Forwarding unit 1/2/2019 CSE378 Pipelining hazards

CSE378 Pipelining hazards Forwarding unit Forwarding is done prior to ALU computation in EX stage If we have an R-R instruction, the forwarding unit will need to check whether EX/Mem result register = IF/ID rs EX/Mem result register = IF/ID rt and if so set up muxes to ALU source appropriately and also whether Mem/WB result register = IF/ID rs Mem/WB result register = IF/ID rt 1/2/2019 CSE378 Pipelining hazards

Forwarding unit (continued) For a Load/Store or Immediate instruction Need to check forwarding for rs only For a branch instruction Need to check forwarding for the registers involved in the comparison 1/2/2019 CSE378 Pipelining hazards

Forwarding in consecutive instructions What happens if we have add $10,$10,$12 Forwarding priority is given to the most recent result, that is the one generated by the ALU in the EX/Mem, not the one passed to Mem/Wb So same conditions as before for forwarding from EX/MEM but when forwarding from MEM/WB check if the forwarding is also done for the same register from EX/MEM 1/2/2019 CSE378 Pipelining hazards

CSE378 Pipelining hazards Hazard detection unit If a Load (instruction i-1) is followed by instruction i that needs the result of the load, we need to stall the pipeline for one cycle, that is instruction i-1 should progress normally instruction i should not progress no new instruction should be fetched The hazard detection unit should operate during the ID stage When processing instruction i, how do we know instruction i-1 is a Load ? MemRead signal is asserted in ID/EX 1/2/2019 CSE378 Pipelining hazards

Hazard detection unit (continued) How do we know we should stall instruction i-1 is a Load and either ID/EX rt = IF/ID rs, or ID/EX rt = IF/ID rt How do we prevent instruction i to progress Put 0’s in all control fields of ID/EX (becomes a no-op) Don’t change the IF/ID field (have a control line be asserted at every cycle to write it unless we have to stall) How do we prevent fetching a new instruction Have a control line asserted only when we want to write a new value in the PC 1/2/2019 CSE378 Pipelining hazards

Overall picture (almost) for data hazards What is missing... Forwarding when Load followed by a Store (mem to mem copy) forwarding from MEM/WB stage to memory input 1/2/2019 CSE378 Pipelining hazards

CSE378 Pipelining hazards Control hazards Pipelining and branching don’t get along Transfer of control (jumps, procedure call/returns, successful branches) cause control hazards When a branch is known to succeed, at the Mem stage (but could be done one stage earlier), there are instructions in the pipeline in stages before Mem that need to be converted into “no-op” and we need to start fetching the correct instructions by using the right PC 1/2/2019 CSE378 Pipelining hazards

Example of control hazard 1/2/2019 CSE378 Pipelining hazards

Resolving control hazards Detecting a potential control hazard is easy Look at the opcode We must insure that the state of the program is not changed until the outcome of the branch is known. Possibilities are: Stall as soon as opcode is detected (cost 3 bubbles; same type of logic as for the load stall but for 3 cycles instead of 1) Assume that branch won’t be taken (cost only if branch is taken; see next slides) Use some predictive techniques 1/2/2019 CSE378 Pipelining hazards

Assume branch not taken strategy We have a problem if branch is taken! “No-op” the “wrong” instructions Once the new PC is known (in Mem stage) Zero out the instruction field in the IF/ID pipeline register For the instruction in the ID stage, use the signals that were set-up for data dependencies in the Load case For the instruction in the EX stage, zero out the result of the ALU (e.g, make the result register be register $0) 1/2/2019 CSE378 Pipelining hazards

CSE378 Pipelining hazards Optimizations Move up the result of branch execution Do target address computation in ID stage (like in multiple cycle implementation) Comparing registers is “fast”; can be done in first phase of the clock and setting PC in the second phase. Thus we can reduce stalling time by 1 bubble If the forwarding is set up right, 2 bubbles can be saved 1/2/2019 CSE378 Pipelining hazards

CSE378 Pipelining hazards Branch prediction Instead of assuming “branch not taken” you can have a table keeping the history of past branches We’ll see how to build such tables when we study caches History can be restricted to 2-bit “saturating counters” such that it takes two wrong prediction outcomes before changing your prediction If predicted taken, will need only 1 bubble since PC can be computed during ID stage. There even exists schemes where you can predict and not lose any cycle on predicted taken, of course if the prediction is correct Note that if prediction is incorrect, you need to flush the pipe as before 1/2/2019 CSE378 Pipelining hazards

Saturating Counter Predictor Taken Predict Taken Predict Taken Not Taken Taken Taken Not Taken Predict Not Taken Predict Not Taken Not Taken Taken Not Taken 1/2/2019 CSE378 Pipelining hazards

Flushing Instructions 1/2/2019 CSE378 Pipelining hazards

Current trends in microprocessor design Superscalar processors Several pipelines, e.g., integer pipeline(s), floating-point, load/store unit etc Several instructions are fetched and decoded at once. They can be executed concurrently if there are no hazards Out-of-order execution (also called dynamically scheduled processors) While some instructions are stalled because of dependencies or other causes (cache misses, see later), other instructions down he stream can still proceed. However results must be stored in program order! 1/2/2019 CSE378 Pipelining hazards

Current trends (continued) Speculative execution Predict the outcome of branches and continue processing with (of course) a recovery mechanism. Because branches occur so often, the branch prediction mechanisms have become very sophisticated Assume that Load/Stores don’t conflict (of course need to be able to recover) VLIW (or EPIC) (Very Long Instruction Word) In “pure VLIW”, each pipeline (functional unit) is assigned a task at every cycle. The compiler does it. A little less ambitious: have compiler generate long instructions (e.g., using 3 pipes; cf. Intel IA-64 or Itanium) 1/2/2019 CSE378 Pipelining hazards