Dynamic Branch Prediction (Sec 4.3)

Presentation transcript:

Dynamic Branch Prediction (Sec 4.3)
Control dependences become a limiting factor in exploiting ILP.
So far, we’ve discussed only static branch prediction schemes.
Here, we talk about using hardware to dynamically predict branch outcome.
The effectiveness of a branch prediction scheme depends on
  – Its accuracy of prediction
  – Its cost when the prediction is correct and when it is incorrect

Branch Prediction Buffer
In its simplest form, a small memory holds one bit per entry, called the prediction bit, saying whether the branch was recently taken or not.
The memory is indexed by the lower portion of the address of the branch instruction.
Fetching begins in the predicted direction.
If the prediction is wrong, the prediction bit is inverted.
The simple one-bit scheme has performance shortcomings: a loop branch is mispredicted twice per loop execution, once on exit and once on the next entry (example on page 263).
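To make the shortcoming concrete, here is a minimal Python sketch of a one-bit prediction buffer. The table size, the 4-byte instruction alignment, the branch address, and the loop pattern (taken nine times, then not taken once) are assumptions chosen for illustration, not details taken from the slides.

```python
def simulate_one_bit(outcomes, table_bits=4, pc=0x40, initial=False):
    """Count mispredictions of a one-bit prediction buffer for one branch.

    table_bits, pc and initial are illustrative assumptions.
    """
    table = [initial] * (1 << table_bits)
    # Index with the low-order bits of the branch address
    # (assuming 4-byte aligned instructions, so drop the two lowest bits).
    index = (pc >> 2) & ((1 << table_bits) - 1)
    mispredictions = 0
    for taken in outcomes:
        if table[index] != taken:
            mispredictions += 1
            # Inverting a single bit on a miss is the same as
            # storing the actual outcome.
            table[index] = taken
    return mispredictions


# Hypothetical loop branch: taken 9 times, then falls through, repeated 10 times.
loop_branch = ([True] * 9 + [False]) * 10
misses = simulate_one_bit(loop_branch)
print(misses, "mispredictions out of", len(loop_branch))
```

Under these assumptions the branch is taken 90% of the time but is predicted correctly only 80% of the time, because each pass through the loop costs two misses.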

Branch Prediction Buffer (Cont’d)
In a two-bit prediction scheme, a prediction must miss twice in a row before it is changed (Fig. 4.13).
An n-bit predictor uses an n-bit saturating counter; the branch is predicted taken when the counter is in the upper half of its range.
The branch prediction buffer is accessed during the IF stage.
If the instruction is decoded as a branch, the next fetch is based on the prediction.
See Figure 4.14 for the prediction accuracy.
Prediction accuracy becomes more important in programs with high branch frequency.
We may improve prediction accuracy if we also look at the recent behavior of other branches.
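The counter-based scheme can be sketched as follows. This is a minimal sketch that assumes a plain saturating counter (increment on taken, decrement on not taken, predict taken when the counter is at least half of its maximum range); the exact state transitions drawn in Fig. 4.13 may differ in detail. The function name, the initial counter value, and the loop pattern are hypothetical.

```python
def simulate_counter(outcomes, n_bits=2, initial=0):
    """Count mispredictions of one n-bit saturating counter (assumed scheme)."""
    max_value = (1 << n_bits) - 1
    threshold = 1 << (n_bits - 1)   # predict taken when counter >= threshold
    counter = initial
    mispredictions = 0
    for taken in outcomes:
        if (counter >= threshold) != taken:
            mispredictions += 1
        # Saturating update: never leave the range 0..max_value.
        if taken:
            counter = min(counter + 1, max_value)
        else:
            counter = max(counter - 1, 0)
    return mispredictions


# Hypothetical loop branch: taken 9 times, then not taken, repeated 10 times.
loop_branch = ([True] * 9 + [False]) * 10
print(simulate_counter(loop_branch, n_bits=1))  # 20
print(simulate_counter(loop_branch, n_bits=2))  # 12
```

With two bits, the single not-taken outcome at each loop exit no longer flips the prediction, so after warm-up only one miss per loop execution remains.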

Branch Prediction Buffer (Cont’d)
Consider the following code fragment:
    if (aa == 2) aa = 0;
    if (bb == 2) bb = 0;
    if (aa != bb) {
DLX code for the above is
            SUBI  R3, R1, #2
            BNEZ  R3, L1        ; branch b1 (aa != 2)
            ADD   R1, R0, R0    ; aa = 0
    L1:     SUBI  R3, R2, #2
            BNEZ  R3, L2        ; branch b2 (bb != 2)
            ADD   R2, R0, R0    ; bb = 0
    L2:     SUB   R3, R1, R2    ; R3 = aa - bb
            BEQZ  R3, L3        ; branch b3 (aa == bb)
The behavior of b3 is correlated with the behavior of b1 and b2: if b1 and b2 are both not taken, aa and bb are both set to 0, so b3 is taken.
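The correlation can be checked by brute force. The sketch below mirrors the three branches of the fragment in Python; the function name and the range of test values are hypothetical choices for illustration.

```python
def branch_outcomes(aa, bb):
    """Return (b1_taken, b2_taken, b3_taken) for the fragment above."""
    b1_taken = aa != 2        # BNEZ R3, L1 skips the assignment when aa != 2
    if not b1_taken:
        aa = 0
    b2_taken = bb != 2        # BNEZ R3, L2 skips the assignment when bb != 2
    if not b2_taken:
        bb = 0
    b3_taken = aa == bb       # BEQZ R3, L3 is taken when aa - bb == 0
    return b1_taken, b2_taken, b3_taken


# Whenever b1 and b2 both fall through, b3 must be taken.
for aa in range(4):
    for bb in range(4):
        b1, b2, b3 = branch_outcomes(aa, bb)
        if not b1 and not b2:
            assert b3
```

A predictor that looks only at the past behavior of b3 cannot capture this relationship, which is the motivation for using the history of other branches.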

Correlating Branch Predictors
Consider the code:
    if (d == 0) d = 1;
    if (d == 1)
The instruction sequence generated is as follows:
            BNEZ  R1, L1        ; branch b1 (d != 0)
            ADDI  R1, R0, #1    ; d == 0, so d = 1
    L1:     SUBI  R3, R1, #1
            BNEZ  R3, L2        ; branch b2 (d != 1)
    L2:
See Figures 4.16, 4.17, 4.18 and 4.19.
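The benefit of correlation on this sequence can be sketched in a few lines of Python. This is an illustrative simulation, not a reproduction of the figures: it assumes d alternates between 2 and 0, all prediction bits start as not taken, and the branch preceding the sequence was not taken. The function name and data layout are hypothetical.

```python
def count_misses(d_values, correlated):
    """Mispredictions for b1 and b2 with one-bit predictors.

    correlated=False: one prediction bit per branch.
    correlated=True:  two bits per branch, selected by the outcome of the
                      most recently executed branch (one bit of global history).
    """
    bits = {"b1": [False, False], "b2": [False, False]}  # all start not taken
    last = False   # assumed: the previous branch was not taken
    misses = 0
    for d in d_values:
        b1_taken = d != 0
        if d == 0:
            d = 1
        b2_taken = d != 1
        for name, taken in (("b1", b1_taken), ("b2", b2_taken)):
            slot = int(last) if correlated else 0
            if bits[name][slot] != taken:
                misses += 1
                bits[name][slot] = taken
            last = taken
    return misses


sequence = [2, 0, 2, 0]
print(count_misses(sequence, correlated=False))  # 8
print(count_misses(sequence, correlated=True))   # 2
```

Under these assumptions the plain one-bit predictor misses every branch, while the correlating version misses only during the first iteration.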

Correlating Branch Predictors (cont’d.)
(m, n) predictor (Figure 4.20)
  – Uses the behavior of the last m branches (global history)
  – n-bit predictor for each branch
  – 2^m branch predictors to choose from
  – Global history can be recorded in an m-bit shift register
  – Concatenate the low-order bits of the branch address with the m-bit global history to index the buffer (see Figure 4.20)
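The indexing described above can be sketched as a small Python class. This is a sketch under stated assumptions: n-bit saturating counters, 4-byte aligned instructions, and counters initialized to zero. The class name, method names, and the addr_bits parameter are hypothetical.

```python
class CorrelatingPredictor:
    """Illustrative (m, n) predictor: m bits of global history, n-bit counters."""

    def __init__(self, m, n, addr_bits):
        self.m, self.n, self.addr_bits = m, n, addr_bits
        self.history = 0                                # m-bit shift register
        self.counters = [0] * (1 << (addr_bits + m))    # 2^m counters per address entry

    def index(self, pc):
        # Low-order branch address bits concatenated with the global history.
        low = (pc >> 2) & ((1 << self.addr_bits) - 1)
        return (low << self.m) | self.history

    def predict(self, pc):
        return self.counters[self.index(pc)] >= (1 << (self.n - 1))

    def update(self, pc, taken):
        i = self.index(pc)
        if taken:
            self.counters[i] = min(self.counters[i] + 1, (1 << self.n) - 1)
        else:
            self.counters[i] = max(self.counters[i] - 1, 0)
        # Shift the newest outcome into the history, keeping only m bits.
        self.history = ((self.history << 1) | int(taken)) & ((1 << self.m) - 1)

    def total_bits(self):
        return (1 << self.m) * self.n * (1 << self.addr_bits)


# A (2, 2) predictor with 1K address entries uses as many bits
# as a plain 2-bit buffer with 4K entries.
print(CorrelatingPredictor(2, 2, 10).total_bits())  # 8192
print(CorrelatingPredictor(0, 2, 12).total_bits())  # 8192
```

Setting m to 0 reduces the sketch to an ordinary n-bit prediction buffer, which makes equal-storage comparisons between the two schemes straightforward.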

Branch Target Buffers
A branch target buffer stores the predicted address of the next instruction after a branch.
The intent is to know the branch target address by the end of the IF stage (see Fig. 4.22).
We access the buffer during the IF stage.
If we get a hit, we fetch the next instruction from the predicted PC value.
If there is no match, proceed normally.
A branch prediction field can also be added for extra prediction.
See Fig. 4.23 and Fig. 4.24, and do the example on page 274.
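The lookup can be sketched as follows. This is a minimal sketch, assuming an unbounded, fully tagged buffer modeled as a Python dictionary, 4-byte instructions, and a policy that keeps entries only for branches that were last taken; real buffers have a fixed number of entries. The class and method names are hypothetical.

```python
class BranchTargetBuffer:
    """Illustrative branch target buffer keyed by the branch instruction's PC."""

    def __init__(self):
        self.entries = {}   # branch PC -> predicted target PC

    def next_fetch_pc(self, pc):
        # Hit: fetch from the predicted target.
        # Miss: proceed normally with the sequential PC.
        return self.entries.get(pc, pc + 4)

    def resolve(self, pc, taken, target):
        # Called once the branch outcome is known.
        if taken:
            self.entries[pc] = target      # insert or refresh the entry
        else:
            self.entries.pop(pc, None)     # assumed policy: drop not-taken branches


btb = BranchTargetBuffer()
print(hex(btb.next_fetch_pc(0x100)))   # miss, sequential fetch
btb.resolve(0x100, taken=True, target=0x200)
print(hex(btb.next_fetch_pc(0x100)))   # hit, predicted target
```

Because the lookup uses only the fetch address, the predicted PC is available before the instruction has even been decoded as a branch.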

Multiple-Issue Processors
So far, we tried to achieve the ideal CPI of 1.
How can we improve performance further, to achieve CPI < 1?
Multiple-issue processors are used to improve performance further:
  – Superscalar processors: issue varying numbers of instructions per clock
      Could be statically scheduled (Sun UltraSPARC II/III)
      Or dynamically scheduled (Pentium III/4, MIPS R10K)
  – VLIW (Very Long Instruction Word) processors
      Fixed number of instructions per clock
      Statically scheduled by the compiler (TriMedia, i860, Itanium)

Superscalar Processors
A superscalar processor has dynamic issue capability.
The hardware may issue from one to eight instructions in a clock cycle.
Usually the instructions must be independent and satisfy certain constraints, such as limits on memory accesses.
If an instruction has a dependency or structural hazard, only the instructions preceding it are issued in that cycle.
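The in-order issue rule can be sketched in Python. This is a simplified illustration that checks only register dependences within one issue group; structural hazards and other issue constraints are ignored. The instruction encoding as (destination, sources) tuples, the function name, and the max_issue parameter are hypothetical.

```python
def issue_group(window, max_issue=4):
    """Return how many leading instructions can issue in the same cycle.

    Each instruction is a (dest_register, source_registers) tuple.
    Issue stops at the first instruction that reads or rewrites a register
    written earlier in the same group; later instructions wait as well,
    because issue is in order.
    """
    written = set()
    count = 0
    for dest, sources in window[:max_issue]:
        if dest in written or any(src in written for src in sources):
            break
        written.add(dest)
        count += 1
    return count


program = [
    ("R1", ("R2", "R3")),
    ("R4", ("R1", "R5")),   # reads R1, which the first instruction writes
    ("R6", ("R7", "R8")),   # independent, but sits behind the stalled instruction
]
print(issue_group(program))  # 1
```

The third instruction is independent yet still waits, which illustrates why dependences between neighboring instructions limit the achievable issue rate.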