Arvind and Joel Emer Computer Science and Artificial Intelligence Laboratory M.I.T. Branch Prediction.

Slides:



Advertisements
Similar presentations
Pipelining V Topics Branch prediction State machine design Systems I.
Advertisements

Pipelining and Control Hazards Oct
Pipeline Computer Organization II 1 Hazards Situations that prevent starting the next instruction in the next cycle Structural hazards – A required resource.
Dynamic Branch Prediction
Pipeline Hazards Pipeline hazards These are situations that inhibit that the next instruction can be processed in the next stage of the pipeline. This.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture ILP II Steve Ko Computer Sciences and Engineering University at Buffalo.
CPE 731 Advanced Computer Architecture ILP: Part II – Branch Prediction Dr. Gheith Abandah Adapted from the slides of Prof. David Patterson, University.
W04S1 COMP s1 Seminar 4: Branch Prediction Slides due to David A. Patterson, 2001.
EECE476: Computer Architecture Lecture 21: Faster Branches Branch Prediction with Branch-Target Buffers (not in textbook) The University of British ColumbiaEECE.
CS 152 Computer Architecture and Engineering Lecture 13 - Out-of-Order Issue, Register Renaming, & Branch Prediction Krste Asanovic Electrical Engineering.
CSE 490/590, Spring 2011 CSE 490/590 Computer Architecture ILP III Steve Ko Computer Sciences and Engineering University at Buffalo.
March 11, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 14 - Advanced Superscalars Krste Asanovic Electrical Engineering.
CS252 Graduate Computer Architecture Lecture 9 Prediction (Con’t) (Dependencies, Load Values, Data Values) February 22 nd, 2010 John Kubiatowicz Electrical.
CS 152 Computer Architecture and Engineering Lecture 14 - Advanced Superscalars Krste Asanovic Electrical Engineering and Computer Sciences University.
ENEE350 Ankur Srivastava University of Maryland, College Park Based on Slides from Mary Jane Irwin ( )
1 Lecture 7: Out-of-Order Processors Today: out-of-order pipeline, memory disambiguation, basic branch prediction (Sections 3.4, 3.5, 3.7)
March 4, 2010CS152, Spring 2010 CS 152 Computer Architecture and Engineering Lecture 13 - Out-of-Order Issue, Register Renaming, & Branch Prediction Krste.
CS 252 Graduate Computer Architecture Lecture 5: Instruction-Level Parallelism (Part 2) Krste Asanovic Electrical Engineering and Computer Sciences University.
Goal: Reduce the Penalty of Control Hazards
1 Lecture 18: Pipelining Today’s topics:  Hazards and instruction scheduling  Branch prediction  Out-of-order execution Reminder:  Assignment 7 will.
The PowerPC Architecture  IBM, Motorola, and Apple Alliance  Based on the IBM POWER Architecture ­Facilitate parallel execution ­Scale well with advancing.
Dynamic Branch Prediction
Pipelined Datapath and Control (Lecture #15) ECE 445 – Computer Organization The slides included herein were taken from the materials accompanying Computer.
March 2, 2011CS152, Spring 2011 CS 152 Computer Architecture and Engineering Lecture 11 - Out-of-Order Issue, Register Renaming, & Branch Prediction Krste.
Computer Architecture: A Constructive Approach Branch Prediction - 1 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of.
Evaluation of Dynamic Branch Prediction Schemes in a MIPS Pipeline Debajit Bhattacharya Ali JavadiAbhari ELE 475 Final Project 9 th May, 2012.
© Krste Asanovic, 2014CS252, Spring 2014, Lecture 7 CS252 Graduate Computer Architecture Spring 2014 Lecture 7: Branch Prediction and Load-Store Queues.
Computer Architecture: A Constructive Approach Branch Direction Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab.
February 2, 2012CS152, Spring 2012 CS 152 Computer Architecture and Engineering Lecture 5 - Pipelining II (Branches, Exceptions) Krste Asanovic Electrical.
Krste Asanovic Electrical Engineering and Computer Sciences
1 Dynamic Branch Prediction. 2 Why do we want to predict branches? MIPS based pipeline – 1 instruction issued per cycle, branch hazard of 1 cycle. –Delayed.
CS 352 : Computer Organization and Design University of Wisconsin-Eau Claire Dan Ernst Pipelining Basics.
Branch.1 10/14 Branch Prediction Static, Dynamic Branch prediction techniques.
© Krste Asanovic, 2015CS252, Fall 2015, Lecture 7 CS252 Graduate Computer Architecture Spring 2014 Lecture 7: Advanced Out-of-Order Superscalar Designs.
Out-of-Order Execution & Register Renaming Krste Asanovic Laboratory for Computer Science Massachusetts Institute of Technology Asanovic/Devadas Spring.
Superscalar - summary Superscalar machines have multiple functional units (FUs) eg 2 x integer ALU, 1 x FPU, 1 x branch, 1 x load/store Requires complex.
Computer Architecture: A Constructive Approach Next Address Prediction – Six Stage Pipeline Joel Emer Computer Science & Artificial Intelligence Lab. Massachusetts.
Yiorgos Makris Professor Department of Electrical Engineering University of Texas at Dallas EE (CE) 6304 Computer Architecture Lecture #12 (10/27/15) Course.
Branch Prediction Prof. Mikko H. Lipasti University of Wisconsin-Madison Lecture notes based on notes by John P. Shen Updated by Mikko Lipasti.
Computer Architecture: A Constructive Approach Branch Direction Prediction – Pipeline Integration Joel Emer Computer Science & Artificial Intelligence.
Adapted from Computer Organization and Design, Patterson & Hennessy, UCB ECE232: Hardware Organization and Design Part 13: Branch prediction (Chapter 4/6)
COMPSYS 304 Computer Architecture Speculation & Branching Morning visitors - Paradise Bay, Bay of Islands.
PROCESSOR PIPELINING YASSER MOHAMMAD. SINGLE DATAPATH DESIGN.
Yiorgos Makris Professor Department of Electrical Engineering University of Texas at Dallas EE (CE) 6304 Computer Architecture Lecture #13 (10/28/15) Course.
Computer Architecture: A Constructive Approach Data Hazards and Multistage Pipelines Teacher: Yoav Etsion Taken (with permission) from Arvind et al.*,
Dynamic Branch Prediction
CS203 – Advanced Computer Architecture
PowerPC 604 Superscalar Microprocessor
CS252 Graduate Computer Architecture Spring 2014 Lecture 8: Advanced Out-of-Order Superscalar Designs Part-II Krste Asanovic
Branch Prediction Constructive Computer Architecture: Arvind
Chapter 4 The Processor Part 4
CMSC 611: Advanced Computer Architecture
The processor: Pipelining and Branching
Lecture 19: Branches, OOO Today’s topics: Instruction scheduling
CS252 Graduate Computer Architecture Lecture 8 Prediction/Speculation (Branches, Return Addrs) February 14th, 2011 John Kubiatowicz Electrical Engineering.
Branch statistics Branches occur every 4-6 instructions (16-25%) in integer programs; somewhat less frequently in scientific ones Unconditional branches.
CS 152 Computer Architecture and Engineering Lecture 13 - Out-of-Order Issue, Register Renaming, & Branch Prediction Krste Asanovic Electrical Engineering.
Branch Prediction Constructive Computer Architecture: Arvind
Krste Asanovic Electrical Engineering and Computer Sciences
Dynamic Branch Prediction
Branch Prediction: Direction Predictors
Branch Prediction: Direction Predictors
/ Computer Architecture and Design
Control unit extension for data hazards
Lecture 10: Branch Prediction and Instruction Delivery
Branch Prediction: Direction Predictors
Adapted from the slides of Prof
Pipelining (II).
Dynamic Hardware Prediction
CS 152 Computer Architecture and Engineering CS252 Graduate Computer Architecture Lecture 12 – Branch Prediction and Advanced Out-of-Order Superscalars.
Presentation transcript:

Arvind and Joel Emer Computer Science and Artificial Intelligence Laboratory M.I.T. Branch Prediction

Arvind & Emer L12-2 I-cache Fetch Buffer Issue Buffer Func. Units Arch. State Execute Decode Result Buffer Commit PC Fetch Branch executed Next fetch started Modern processors may have > 10 pipeline stages between next PC calculation and branch resolution ! Control Flow Penalty How much work is lost if pipeline doesn’t follow correct instruction flow ? ~ Loop length x pipeline width

Arvind & Emer L12-3 Average Run-Length between Branches Average dynamic instruction mix from SPEC92: SPECint92 SPECfp92 ALU39 %13 % FPU Add 20 % FPU Mult13 % load26 %23 % store 9 % 9 % branch16 % 8 % other10 %12 % SPECint92: compress, eqntott, espresso, gcc, li SPECfp92: doduc, ear, hydro2d, mdijdp2, su2cor What is the averagerun length between branches

Arvind & Emer L12-4 InstructionTaken known?Target known? J JR BEQZ/BNEZ MIPS Branches and Jumps Each instruction fetch depends on one or two pieces of information from the preceding instruction: 1) Is the preceding instruction a taken branch? 2) If so, what is the target address? After Reg. Fetch * After Inst. Decode After Reg. Fetch * Assuming zero detect on register read

Arvind & Emer L12-5 Branch Prediction Bits Assume 2 BP bits per instruction Use saturating counter On ¬taken   On taken 11Strongly taken 10Weakly taken 01Weakly ¬taken 00Strongly ¬taken

Arvind & Emer L12-6 Branch History Table 4K-entry BHT, 2 bits/entry, ~80-90% correct predictions 00 Fetch PC Branch? Target PC + I-Cache Opcodeoffset Instruction k BHT Index 2 k -entry BHT, 2 bits/entry Taken/¬Taken?

Arvind & Emer L12-7 Overview of branch prediction PCPC Need next PC immediately Decode Reg Read Execute Instr type, PC relative targets available Simple conditions, register targets available Complex conditions available BTB BP, JMP, Ret Loose loop Tight loop Must speculation check always be correct? No… Best predictors reflect program behavior

Arvind & Emer L12-8 Branch Target Buffer BP bits are stored with the predicted target address. IF stage: If (BP=taken) then nPC=target else nPC=PC+4 later: check prediction, if wrong then kill the instruction and update BTB & BPb else update BPb IMEM PC Branch Target Buffer (2 k entries) k BPb predicted targetBP target

Arvind & Emer L12-9 Address Collisions What will be fetched after the instruction at 1028? BTB prediction= Correct target=  Assume a 128-entry BTB BPb target take Add Jump 100 Instruction Memory kill PC=236 and fetch PC=1032 Is this a common occurrence? Can we avoid these bubbles?

Arvind & Emer L12-10 BTB is only for Control Instructions BTB contains useful information for branch and jump instructions only  Do not update it for other instructions For all other instructions the next PC is (PC)+4 ! How to achieve this effect without decoding the instruction?

Arvind & Emer L12-11 Branch Target Buffer (BTB) Keep both the branch PC and target PC in the BTB PC+4 is fetched if match fails Only taken branches and jumps held in BTB Next PC determined before branch fetched and decoded 2 k -entry direct-mapped BTB (can also be associative) I-Cache PC k Valid valid Entry PC = match predicted target target PC

Arvind & Emer L12-12 Consulting BTB Before Decoding 1028 Add Jump 100 BPb target take 236 entry PC 132 The match for PC=1028 fails and is fetched  eliminates false predictions after ALU instructions BTB contains entries only for control transfer instructions  more room to store branch targets