Computer Organization Lecture Set – 06 Chapter 6 Huei-Yung Lin.


H.Y. Lin, CCUEE Computer Organization
Data Hazards Revisited
- Data hazards occur when an instruction uses data before an earlier instruction has stored it in the register file (Fig. 6.28)

Data Hazard Solution: Forwarding
- Key idea: route a result internally to the instruction that needs it before it is stored in the register file (Fig. 6.29)

Data Hazard Solution: Forwarding
- Add hardware to feed back ALU and MEM results to both ALU inputs (Fig. 6.32)

Controlling Forwarding
- Need to test when register numbers match in the rs, rt, and rd fields stored in the pipeline registers
- "EX" hazard:
  - EX/MEM: test whether the instruction writes the register file and examine its rd register
  - ID/EX: test whether the instruction reads the rs or rt register and it matches the rd register in EX/MEM
- "MEM" hazard:
  - MEM/WB: test whether the instruction writes the register file and examine its rd (rt) register
  - ID/EX: test whether the instruction reads the rs or rt register and it matches the rd (rt) register in MEM/WB

Forwarding Unit Detail: EX Hazard
  if (EX/MEM.RegWrite
      and (EX/MEM.RegisterRd ≠ 0)
      and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10
  if (EX/MEM.RegWrite
      and (EX/MEM.RegisterRd ≠ 0)
      and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10

Forwarding Unit Detail: MEM Hazard
  if (MEM/WB.RegWrite
      and (MEM/WB.RegisterRd ≠ 0)
      and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
  if (MEM/WB.RegWrite
      and (MEM/WB.RegisterRd ≠ 0)
      and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01

EX Hazard Complication
- What if a register is changed more than once?
    add $1, $1, $2
    add $1, $1, $3
    add $1, $1, $4
- Answer: forward the most recent result (from the instruction in the MEM stage), so the EX hazard takes priority over the MEM hazard
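
The EX and MEM hazard tests, together with the "most recent result wins" rule, can be sketched in software. This is only an illustrative model, not the hardware from the textbook figures: the `PipeRegs` class, its field names, and the helper functions are hypothetical names introduced here. Checking the EX hazard first is how this sketch gives priority to the result in EX/MEM.

```python
from dataclasses import dataclass


@dataclass
class PipeRegs:
    """Only the pipeline-register fields the forwarding unit inspects.

    Field names are hypothetical; they mirror the slide's ID/EX.RegisterRs,
    EX/MEM.RegWrite, and so on.
    """
    id_ex_rs: int
    id_ex_rt: int
    ex_mem_reg_write: bool
    ex_mem_rd: int
    mem_wb_reg_write: bool
    mem_wb_rd: int


def forward_select(src: int, r: PipeRegs) -> int:
    """Return the 2-bit mux select for one ALU input that reads register src."""
    # EX hazard: the newest result sits in EX/MEM, so test it first.
    if r.ex_mem_reg_write and r.ex_mem_rd != 0 and r.ex_mem_rd == src:
        return 0b10
    # MEM hazard: only reached when EX/MEM does not supply this register.
    if r.mem_wb_reg_write and r.mem_wb_rd != 0 and r.mem_wb_rd == src:
        return 0b01
    # No hazard: use the value read from the register file.
    return 0b00


def forwarding_unit(r: PipeRegs) -> tuple[int, int]:
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""
    return forward_select(r.id_ex_rs, r), forward_select(r.id_ex_rt, r)


# Third add ($1 = $1 + $4) is in EX; both earlier adds also write $1.
regs = PipeRegs(id_ex_rs=1, id_ex_rt=4,
                ex_mem_reg_write=True, ex_mem_rd=1,
                mem_wb_reg_write=True, mem_wb_rd=1)
forward_a, forward_b = forwarding_unit(regs)
```

For the three-add sequence, `forward_a` comes out as `0b10` (take the value from EX/MEM) even though MEM/WB also holds a write to `$1`.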

Data Hazards and Stalls
- We still have to stall when a register is loaded from memory and used in the following instruction (Fig. 6.34)

Data Hazards and Stalls
- Add a hazard detection unit to detect this and stall (Fig. 6.35)
- Typo: Should read and

Pipelined Processor with Hazard Detection (Fig. 6.36)

Hazard Detection Unit: Control Detail
  if (ID/EX.MemRead
      and ((ID/EX.RegisterRt = IF/ID.RegisterRs)
           or (ID/EX.RegisterRt = IF/ID.RegisterRt)))
    stall

Hazard Detection Unit: What Happens
- MUX zeros out control signals for instruction in ID
  - "squashes" the instruction
  - "no-op" (nop) propagates through following stages
- IF/ID holds stalled instruction until next clock cycle
- PC holds current value until next clock cycle (re-fetches the same instruction)
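
The load-use test and its three effects can be modeled in a few lines. This is a sketch under assumptions: the function names and the output keys `pc_write`, `if_id_write`, and `zero_control` are hypothetical labels for "PC holds", "IF/ID holds", and "MUX zeros the control signals"; the slides do not name these signals.

```python
def load_use_stall(id_ex_mem_read: bool, id_ex_rt: int,
                   if_id_rs: int, if_id_rt: int) -> bool:
    """True when the instruction in ID reads the register that the load
    currently in EX is about to write (the slide's stall condition)."""
    return id_ex_mem_read and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt)


def stall_outputs(stall: bool) -> dict:
    """Model what the hazard detection unit does for one clock cycle.

    Key names are hypothetical, chosen only for this sketch.
    """
    return {
        "pc_write": not stall,      # PC holds its value while stalling
        "if_id_write": not stall,   # IF/ID keeps the stalled instruction
        "zero_control": stall,      # control signals zeroed, giving a nop
    }


# lw $2, 20($1) is in EX; and $4, $2, $5 is in ID and reads $2.
stall = load_use_stall(id_ex_mem_read=True, id_ex_rt=2, if_id_rs=2, if_id_rt=5)
outputs = stall_outputs(stall)
```

Here `stall` is true, so the PC and IF/ID are frozen for one cycle and a nop enters EX.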

Branch Hazards
- Just stalling for each branch is not practical
- Common assumption: branch not taken
- When assumption fails: flush three instructions (Fig. 6.37)

Reducing Branch Delay
- Key idea: move branch logic to ID stage of pipeline
  - New adder calculates branch target (PC + (extend(IMM) << 2))
  - New hardware tests rs == rt after register read
  - Add flush signal to squash instruction in IF/ID register
- Reduced penalty (1 cycle) when branch taken
- Example: Figure 6.38, p. 420
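
The target calculation and the early equality test can be written out as arithmetic. This is a minimal sketch assuming a 16-bit immediate and 32-bit addresses; the function names are hypothetical, and `pc` stands for whatever PC value the new adder receives (in MIPS that is the already incremented PC).

```python
def sign_extend16(imm: int) -> int:
    """Interpret the low 16 bits of imm as a signed two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc: int, imm: int) -> int:
    """Target = PC + (sign-extended immediate shifted left by 2), mod 2**32."""
    return (pc + (sign_extend16(imm) << 2)) & 0xFFFFFFFF


def beq_taken(rs_value: int, rt_value: int) -> bool:
    """The equality test moved into ID: the branch is taken when rs == rt."""
    return rs_value == rt_value


forward_target = branch_target(0x40, 3)        # three words ahead
backward_target = branch_target(0x40, 0xFFFF)  # immediate -1: one word back
```

With these inputs `forward_target` is `0x4C` and `backward_target` is `0x3C`.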

Pipelining Outline
- Introduction
- Pipelined Processor Design
  - Datapath
  - Control
  - Dealing with Hazards & Forwarding
  - Branch Prediction  <--
  - Exceptions
  - Performance
- Advanced Pipelining
  - Superscalar
  - Dynamic Pipelining
  - Examples

Branch Prediction
- Key idea: instead of always assuming branch not taken, use a prediction based on previous history
- Branch history table: small memory
  - Index using the lower bits of the instruction address
  - Save "what happened" on last execution: branch taken OR branch not taken
- Use history to make prediction

More about Branch Prediction
- Consider nested loops:
    for (i=1; i<M; i++) {          oloop: ...
      for (j=1; j<N; j++) {        iloop: ...
      }                                   bne $1,$2, iloop
    }                                     bne $3,$4, oloop
- Prediction fails on first and last branch (Why?)
- More history can improve performance
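
One way to explore the "Why?" is to simulate a predictor that remembers only the last outcome of the inner-loop branch. This is a sketch with hypothetical function names; it assumes the inner branch is taken on every iteration except the last one of each pass, and that the predictor starts out predicting "not taken".

```python
def inner_branch_outcomes(inner_iters: int, outer_iters: int) -> list[bool]:
    """Outcomes of the inner-loop branch: taken (True) until the loop exits,
    repeated once per outer-loop pass. Assumes inner_iters >= 2."""
    one_pass = [True] * (inner_iters - 1) + [False]
    return one_pass * outer_iters


def mispredicts_1bit(outcomes: list[bool], initial: bool = False) -> int:
    """Count mispredictions when the prediction is simply the last outcome."""
    prediction = initial
    misses = 0
    for taken in outcomes:
        if taken != prediction:
            misses += 1
        prediction = taken  # 1-bit history: remember only what just happened
    return misses


outcomes = inner_branch_outcomes(inner_iters=10, outer_iters=5)
misses = mispredicts_1bit(outcomes)
```

With 10 inner iterations and 5 outer passes, `misses` is 10: two per pass. The loop exit is mispredicted because the history says "taken", and the first branch of the next pass is mispredicted because the history now says "not taken".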

Branch Prediction with 2-Bit History
- Key idea: must be wrong twice before changing prediction
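
The "wrong twice" rule can be modeled with a 2-bit saturating counter, which is one common realization of this scheme. The slide's state diagram is not reproduced in this transcript, so the state encoding (0 and 1 predict not taken, 2 and 3 predict taken), the starting state, and the function name are assumptions made for this sketch.

```python
def mispredicts_2bit(outcomes: list[bool], state: int = 0) -> int:
    """Count mispredictions of a 2-bit saturating counter predictor.

    States 0..1 predict not taken and 2..3 predict taken (assumed encoding).
    """
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Move one step toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return misses


# Inner-loop branch: taken 9 times, then not taken at loop exit, for 5 passes.
outcomes = ([True] * 9 + [False]) * 5
misses = mispredicts_2bit(outcomes)
```

Starting from state 0, `misses` is 7: three in the first pass while the counter warms up, then only the loop exit in each later pass, because a single not-taken outcome no longer flips the prediction.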

Pipelining Outline
- Introduction
- Pipelined Processor Design
  - Datapath
  - Control
  - Dealing with Hazards & Forwarding
  - Branch Prediction
  - Exceptions  <--
  - Performance

Pipelining and Exceptions
- Exceptions require suspension of execution
- Complicating factors
  - Several instructions are in pipeline
  - Exception may occur before instruction is complete
  - Must flush pipeline to suspend execution, but may lose information about the exception
- Exceptions make life difficult; take a computer architecture course to learn more.

Pipelining Outline
- Introduction
- Pipelined Processor Design
  - Datapath
  - Control
  - Dealing with Hazards & Forwarding
  - Branch Prediction
  - Exceptions
  - Performance  <--

Performance of the Pipelined Implementation
- Use "gcc" instruction mix to calculate CPI

    Instruction   Frequency   Cycles
    lw            25%         1 (2 when load-use hazard)
    sw            10%         1
    R-type        52%         1
    branch        11%         1 (2 when prediction wrong)
    jump           2%         2

- Assumptions:
  - 50% of load instructions are followed by immediate use
  - 25% of branch predictions are wrong
- Calculating CPI
  - CPI = (1.5 cycles * 0.25) + (1 cycle * 0.10) + (1 cycle * 0.52) + (1.25 cycles * 0.11) + (2 cycles * 0.02)
  - CPI = 1.17 cycles per instruction
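
The CPI figure can be re-derived from the instruction mix and the two assumptions. This is a small sketch; the variable names are hypothetical, and the per-class averages (1.5 cycles for loads, 1.25 for branches) follow from adding the one-cycle penalty weighted by how often it occurs.

```python
LOAD_USE_RATE = 0.50        # loads followed by an immediate use
BRANCH_MISS_RATE = 0.25     # branch predictions that are wrong

# (fraction of instructions, average cycles) for each instruction class
mix = {
    "lw":     (0.25, 1 + LOAD_USE_RATE * 1),     # 1.5 cycles on average
    "sw":     (0.10, 1.0),
    "R-type": (0.52, 1.0),
    "branch": (0.11, 1 + BRANCH_MISS_RATE * 1),  # 1.25 cycles on average
    "jump":   (0.02, 2.0),
}

# Weighted sum over the mix gives the average cycles per instruction.
cpi = sum(fraction * cycles for fraction, cycles in mix.values())
print(f"CPI = {cpi:.2f}")
```

The weighted sum is 0.375 + 0.10 + 0.52 + 0.1375 + 0.04 = 1.1725, which the slide reports as 1.17.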

Performance of the Pipelined Implementation
- Calculate the average execution time:

    Pipelined      1.17 CPI * 200 ps/clock = 234 ps
    Single-cycle   1 CPI * 600 ps/clock = 600 ps
    Multicycle     4.12 CPI * 200 ps/clock = 824 ps

- Speedup of pipelined implementation
  - 2.56× faster than single cycle (600 / 234)
  - 3.52× faster than multicycle (824 / 234)
- "Your mileage may differ" as instruction mix changes
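
The average instruction times and the resulting speedups can be recomputed directly from the CPI and clock figures on this slide. This is a sketch with hypothetical names, using only the numbers given above.

```python
# (CPI, clock period in picoseconds) for each implementation, from the slide
designs = {
    "pipelined":    (1.17, 200),
    "single-cycle": (1.00, 600),
    "multicycle":   (4.12, 200),
}

# Average time per instruction = CPI * clock period
avg_time_ps = {name: cpi * period for name, (cpi, period) in designs.items()}

# Speedup of the pipelined design = other design's time / pipelined time
speedup_vs_single = avg_time_ps["single-cycle"] / avg_time_ps["pipelined"]
speedup_vs_multi = avg_time_ps["multicycle"] / avg_time_ps["pipelined"]

print(f"vs single-cycle: {speedup_vs_single:.2f}x")
print(f"vs multicycle:   {speedup_vs_multi:.2f}x")
```

From these figures, 600 / 234 is about 2.56 and 824 / 234 is about 3.52.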

References
Portions of these slides are derived from:
- Textbook figures © 1998 Morgan Kaufmann Publishers, all rights reserved
- Tod Amon's COD2e Slides © 1998 Morgan Kaufmann Publishers, all rights reserved
- Dave Patterson's CS 152 Slides, Fall 1997 © UCB
- Rob Rutenbar's Slides, Fall 1999, CMU
- John Nestor's ECE 313 Slides, Fall 2004, LC
- T.S. Chang's DEE 1050 Slides, Fall 2004, NCTU
- Other sources as noted