1 Dynamic Branch Prediction. 2 Why do we want to predict branches? MIPS based pipeline – 1 instruction issued per cycle, branch hazard of 1 cycle. –Delayed.

Slides:



Advertisements
Similar presentations
Lecture 8 Dynamic Branch Prediction, Superscalar and VLIW Advanced Computer Architecture COE 501.
Advertisements

Dynamic Branch Prediction (Sec 4.3) Control dependences become a limiting factor in exploiting ILP So far, we’ve discussed only static branch prediction.
Pipelining and Control Hazards Oct
Dynamic Branch PredictionCS510 Computer ArchitecturesLecture Lecture 10 Dynamic Branch Prediction, Superscalar, VLIW, and Software Pipelining.
Computer Organization and Architecture (AT70.01) Comp. Sc. and Inf. Mgmt. Asian Institute of Technology Instructor: Dr. Sumanta Guha Slide Sources: Based.
CPE 631: Branch Prediction Electrical and Computer Engineering University of Alabama in Huntsville Aleksandar Milenkovic,
Dynamic Branch Prediction
Pipeline Hazards Pipeline hazards These are situations that inhibit that the next instruction can be processed in the next stage of the pipeline. This.
CPE 731 Advanced Computer Architecture ILP: Part II – Branch Prediction Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University.
CPE 631: Branch Prediction Electrical and Computer Engineering University of Alabama in Huntsville Aleksandar Milenkovic,
W04S1 COMP s1 Seminar 4: Branch Prediction Slides due to David A. Patterson, 2001.
EECC551 - Shaaban #1 lec # 5 Spring Reduction of Control Hazards (Branch) Stalls with Dynamic Branch Prediction So far we have dealt with.
EECC551 - Shaaban #1 lec # 5 Fall Static Conditional Branch Prediction Branch prediction schemes can be classified into static (at compilation.
EECS 470 Branch Prediction Lecture 6 Coverage: Chapter 3.
1 COMP 206: Computer Architecture and Implementation Montek Singh Wed., Oct. 8, 2003 Topic: Instruction-Level Parallelism (Dynamic Branch Prediction)
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon., Oct. 7, 2002 Topic: Instruction-Level Parallelism (Dynamic Branch Prediction)
1 Lecture 7: Out-of-Order Processors Today: out-of-order pipeline, memory disambiguation, basic branch prediction (Sections 3.4, 3.5, 3.7)
EECE476: Computer Architecture Lecture 20: Branch Prediction Chapter extra The University of British ColumbiaEECE 476© 2005 Guy Lemieux.
EECC551 - Shaaban #1 lec # 5 Fall Reduction of Control Hazards (Branch) Stalls with Dynamic Branch Prediction So far we have dealt with.
EENG449b/Savvides Lec /17/04 February 17, 2004 Prof. Andreas Savvides Spring EENG 449bG/CPSC 439bG.
1 Lecture 8: Branch Prediction, Dynamic ILP Topics: branch prediction, out-of-order processors (Sections )
1 Lecture 5 Branch Prediction (2.3) and Scoreboarding (A.7)
Goal: Reduce the Penalty of Control Hazards
EECC551 - Shaaban #1 lec # 5 Winter Reduction of Control Hazards (Branch) Stalls with Dynamic Branch Prediction So far we have dealt with.
Computer Architecture Instruction Level Parallelism Dr. Esam Al-Qaralleh.
Branch Prediction Dimitris Karteris Rafael Pasvantidιs.
COMP381 by M. Hamdi 1 (Recap) Control Hazards. COMP381 by M. Hamdi 2 Control (Branch) Hazard A: beqz r2, label B: label: P: Problem: The outcome.
1 COMP 740: Computer Architecture and Implementation Montek Singh Thu, Feb 19, 2009 Topic: Instruction-Level Parallelism III (Dynamic Branch Prediction)
So far we have dealt with control hazards in instruction pipelines by:
Dynamic Branch Prediction
EENG449b/Savvides Lec /25/05 March 24, 2005 Prof. Andreas Savvides Spring g449b EENG 449bG/CPSC 439bG.
CIS 429/529 Winter 2007 Branch Prediction.1 Branch Prediction, Multiple Issue.
1 Lecture 7: Branch prediction Topics: bimodal, global, local branch prediction (Sections )
ENGS 116 Lecture 91 Dynamic Branch Prediction and Speculation Vincent H. Berk October 10, 2005 Reading for today: Chapter 3.2 – 3.6 Reading for Wednesday:
EECC551 - Shaaban #1 lec # 5 Fall Static Conditional Branch Prediction Branch prediction schemes can be classified into static and dynamic.
CSCI 6461: Computer Architecture Branch Prediction Instructor: M. Lancaster Corresponding to Hennessey and Patterson Fifth Edition Section 3.3 and Part.
Branch.1 10/14 Branch Prediction Static, Dynamic Branch prediction techniques.
CPE 631 Session 17 Branch Prediction Electrical and Computer Engineering University of Alabama in Huntsville.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 13: Branch prediction (Chapter 4/6)
Dynamic Branch Prediction
Instruction-Level Parallelism Dynamic Branch Prediction
Instruction-Level Parallelism and Its Dynamic Exploitation
CS203 – Advanced Computer Architecture
Dynamic Branch Prediction
COMP 740: Computer Architecture and Implementation
UNIVERSITY OF MASSACHUSETTS Dept
CS 704 Advanced Computer Architecture
Lecture 6 Score Board And Tomasulo’s Algorithm
CMSC 611: Advanced Computer Architecture
Module 3: Branch Prediction
So far we have dealt with control hazards in instruction pipelines by:
Dynamic Hardware Branch Prediction
CPE 631: Branch Prediction
Branch statistics Branches occur every 4-6 instructions (16-25%) in integer programs; somewhat less frequently in scientific ones Unconditional branches.
Dynamic Branch Prediction
Advanced Computer Architecture
/ Computer Architecture and Design
Pipelining and control flow
So far we have dealt with control hazards in instruction pipelines by:
Lecture 10: Branch Prediction and Instruction Delivery
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
Adapted from the slides of Prof
Dynamic Hardware Prediction
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
So far we have dealt with control hazards in instruction pipelines by:
CPE 631 Lecture 12: Branch Prediction
Presentation transcript:

1 Dynamic Branch Prediction

2 Why do we want to predict branches? MIPS based pipeline – 1 instruction issued per cycle, branch hazard of 1 cycle. –Delayed branch Modern processor and next generation – multiple instructions issued per cycle, more branch hazard cycles will incur. –Cost of branch misfetch goes up –Pentium Pro – 3 instructions issued per cycle, 12+ cycle misfetch penalty  HUGE penalty for a misfetched path following a branch

3 Branch Prediction Static Prediction –Always taken, always not taken –Opcode based –Displacement based (forward not taken, backward taken) –Compiler directed (branch likely, branch not likely) Dynamic Prediction – 1 bit predictor – remember last taken/not taken per branch  Use a branch-prediction buffer or branch-history table with 1 bit entry  Use part of the PC (low-order bits) to index buffer/table – Why? –Multiple branches may share the same bit  Invert the bit if the prediction is wrong  Backward branches for loops will be mispredicted twice -- once upon loop exit and then again on loop entry

4 2-bit Branch Prediction Has 4 states instead of 2 A prediction must miss twice before it is changed Good for backward branches of loops

5 Branch History Table 01 BHT branch PC Has limited size 2 bits by N (e.g. 4K entries) Uses low-order bits of branch PC to choose entry Plot misprediction instead of prediction

6 Observations Prediction Accuracy ranges from 99% to 82% or a misprediction rate of 1% to 18% Misprediction for integer programs (gcc, espresso, eqntott, li) is substantially higher than FP programs (nasa7, matrix300, tomcatv, doduc, spice, fppp) Branch penalty involves both misprediction rate and branch frequency, and is higher for integer benchmarks Prediction accuracy improves with buffer size, but does not improve beyond 4K entries

7 Correlating or Two-level Predictors Correlating branch predictors also look at other branches for clues. Consider the following example. if (aa==2) -- branch b1 aa = 0; if (bb==2) --- branch b2 bb = 0; if(aa!=bb) { … --- branch b3 – Clearly depends on the results of b1 and b2 Prediction if the last branch is NT Prediction if the last branch is T (1,1) predictor – uses history of 1 branch and uses a 1-bit predictor

8 Another Example If (d==0) d=1; If (d=1) --- Code Sequence assuming d is assigned to R1: BNEZ R1, L1 ; branch b1 (d!=0) DADDU R1,R0,#1 ; d==0, so d=1 L1: DADDIU R3,R1,#-1 BNEZ R3,L2 ; branch b2 (d!=1) Possible Execution Sequence for the code fragment; Initial d d==0? B1 d before b2 d==1? b2 0 yes not taken 1 yes not taken 1 no taken 1 yes not taken 2 no taken Clearly, if b1 is not taken b2 will not be taken => correlation

9 Correlating Branch Predictor If we use 2 branches as histories, then there are 4 possibilities (T-T, NT-T, NT-NT, NT-T). For each possibility, we need to use a predictor (1-bit or 2-bit). And this repeats for every branch. (2,2) branch prediction

10 Performance of Correlating Branch Prediction With same number of state bits, (2,2) performs better than non- correlating 2-bit predictor. Outperforms a 2-bit predictor with infinite number of entries

11 General (m,n) Branch Predictors The global history register is an m-bit shift register that records the last m branches encountered by the processor Usually use both the PC address and the GHR (2-level) 01 m-bit ghr 00 n-bit predictors PC Combining funciton

12 Is Branch Predictor Enough? When is using branch prediction beneficial? –When the outcome is known later than the target If we predict the branch as taken, and suppose that is correct, what is the target address? –Need a mechanism to provide target address as well Can we eliminate the one cycle delay for the 5-stage pipeline? –Need to fetch from branch target immediately after branch

13 Branch Target Buffer (BTB) BTB is a cache that contains the predicted PC value for taken branches. Is the current instruction a branch ? BTB provides the answer before the current instruction is decoded and therefore enables fetching to begin after IF-stage. What is the branch target ? BTB provides the branch target if the prediction is a taken direct branch (for not taken branches the target is simply PC+4 ).

BTB

15 BTB operations

16 BTB Performance Two things can go wrong –BTB miss (misfetch) –Mispredicted a branch (mispredict) Suppose for branches, BTB hit rate of 85% and predict accuracy of 90%, misfetch penalty of 2 cycles and mispredict penalty of 5 cycles. What is the average branch penalty? 2*(15%) + 5*(85%*10%) BTB and BPT can be used together to perform better prediction – for example 2 bit predictor can be used for branch prediction.

17 Integrated Instruction Fetch Unit Separate out IF from the pipeline and integrate with the following components. So, the pipeline consists of Issue, Read, EX, and WB (scoreboarding) ; Or Issue, EX and WB stages (Tomasulo). 1.Integrate Branch Prediction – Branch predictor is part of the IFU. 2.Instruction Prefetch – Fetch instruction from instruction memory ahead of PC computation with the help of branch predictor & store in a prefetch buffer. 3.Instruction Memory Access and Buffering - Keep on filling the Instruction Queue independent of the execution => Decoupled Execution.

18 Branch Prediction Summary The better we predict, the lesser penalty we might incur 2-bit predictors capture branch behavior well Correlating predictors improve accuracy, particularly when combined with 2-bit predictors Accurate branch prediction does no good if we don’t know there was a branch to predict BTB identifies branches in IF stage BTB combined with branch prediction table