TAGE-SC-L Branch Predictors


TAGE-SC-L Branch Predictors André Seznec INRIA/IRISA

The TAGE-SC-L branch predictor
Sorry, nothing really new...
TAGE (JILP 2006): considered the state-of-the-art global history predictor
Can be augmented with small adjunct predictors:
  Loop predictor: CBP-2 (2006)
  Statistical Corrector + Loop Predictor:
    Global history: CBP-3 (2011)
    Local history: MICRO 2011

Optimized all parameters
Number, size, width of the tables
Types of the histories for the statistical components
All that to decrease the number of mispredictions by 3 %!!

[Figure: TAGE-SC-L block diagram. Labels: (Main) TAGE Predictor, fed by PC + global history; Stat. Cor. and Loop Predictor, fed by global, local and skeleton histories; the TAGE output is labelled "Prediction + Confidence".]

TAGE: multiple tables, global history predictor
The set of history lengths forms a geometric series: {0, 2, 4, 8, 16, 32, 64, 128}
Captures correlation on very long histories
Most of the storage is devoted to short histories!!
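As a rough illustration of the geometric series on this slide, the sketch below generates history lengths between a minimum and a maximum. The function name and its parameters are assumptions made for the example; they do not come from the slides. The rounding follows the usual L(i) = int(alpha^(i-1) * L(1) + 0.5) form.

```python
def geometric_history_lengths(n_tables, l_min, l_max):
    """Return n_tables history lengths growing geometrically from l_min to l_max.

    Hypothetical helper: names and signature are illustrative only.
    """
    if n_tables < 2:
        return [l_min]
    # Common ratio so that the last table uses exactly l_max.
    ratio = (l_max / l_min) ** (1.0 / (n_tables - 1))
    return [int(l_min * ratio ** i + 0.5) for i in range(n_tables)]


# The tagless base predictor uses no history (length 0);
# the tagged tables use the geometric series.
lengths = [0] + geometric_history_lengths(7, 2, 128)
print(lengths)
```

With 7 tagged tables between 2 and 128 the ratio is 2, which reproduces the set shown on the slide.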

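TAGE: tagged tables, prediction by the longest-history matching entry
[Figure: a tagless base predictor indexed with pc, plus tagged tables indexed with pc and h[0:L1], h[0:L2], h[0:L3]. Each tagged entry holds ctr, u and tag; a tag comparison (=?) per table feeds the final prediction.]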

[Figure: prediction selection. Each tag comparison (=?) reports Hit or Miss; the outputs are Pred and Altpred.]

Prediction computation
General case: the longest matching component provides the prediction (Pred)
Special case: newly allocated entries (weak Ctr) cause many mispredictions
  On many applications, Altpred is more accurate than Pred for such entries
  This property is monitored dynamically through 4-bit counters
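The selection rule above can be sketched as follows. This is a minimal model, not the championship code: the function names, the signed counter encoding (taken when ctr >= 0, weak when ctr is -1 or 0) and the use of a single 4-bit monitoring counter are assumptions made for the example.

```python
def tage_predict(base_taken, hit_ctrs, use_alt_on_weak):
    """Pick the final TAGE prediction.

    base_taken      -- prediction of the tagless base predictor (bool)
    hit_ctrs        -- 3-bit signed counters (-4..3) of the matching tagged
                       entries, ordered from shortest to longest history
    use_alt_on_weak -- 4-bit signed counter (-8..7); >= 0 means Altpred has
                       recently been more accurate than weak providers
    """
    if not hit_ctrs:
        return base_taken
    provider = hit_ctrs[-1]                 # longest matching history
    pred = provider >= 0
    altpred = (hit_ctrs[-2] >= 0) if len(hit_ctrs) > 1 else base_taken
    provider_is_weak = provider in (-1, 0)  # typical of a fresh entry
    if provider_is_weak and use_alt_on_weak >= 0:
        return altpred
    return pred


def update_use_alt(counter, pred, altpred, taken):
    """Train the monitoring counter; call it only when the provider was weak."""
    if pred == altpred:
        return counter                      # no information in this case
    if altpred == taken:
        return min(counter + 1, 7)
    return max(counter - 1, -8)
```

The monitoring counter is only trained when Pred and Altpred disagree, since that is the only case in which the choice matters.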

A tagged table entry
[Layout: Tag | Ctr | U]
Ctr: 3-bit prediction counter
U: 2-bit useful counter (was the entry recently useful?)
Tag: partial tag
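A tagged entry with these three fields could be modelled as below. The field widths for Ctr and U come from the slide; the class name, the signed counter encoding, the 9-bit tag width and the XOR-based partial_tag hash are assumptions chosen for illustration.

```python
from dataclasses import dataclass

CTR_MIN, CTR_MAX = -4, 3  # 3-bit signed prediction counter
U_MAX = 3                 # 2-bit useful counter saturates at 3


@dataclass
class TaggedEntry:
    tag: int = 0
    ctr: int = 0          # taken when ctr >= 0
    u: int = 0            # 0 means "not recently useful"

    def predict(self) -> bool:
        return self.ctr >= 0

    def is_weak(self) -> bool:
        # The two counter values closest to the taken/not-taken boundary.
        return self.ctr in (-1, 0)

    def train(self, taken: bool) -> None:
        # Saturating update toward the observed outcome.
        if taken:
            self.ctr = min(self.ctr + 1, CTR_MAX)
        else:
            self.ctr = max(self.ctr - 1, CTR_MIN)


def partial_tag(pc: int, folded_history: int, tag_bits: int = 9) -> int:
    """Hypothetical tag hash: XOR of PC and folded history, truncated."""
    return (pc ^ folded_history) & ((1 << tag_bits) - 1)
```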

Allocate entries on mispredictions
Allocate entries in tables with longer history lengths
  Only in tables where U is unset
  Set Ctr to weak and U to 0
Limited storage budget:
  Allocate 2 entries for 256 Kbits
  Allocate 1 or 2 entries for 32 Kbits
Unlimited storage budget: multiple entries allocated in different tables
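The allocation policy can be sketched as follows, under simplifying assumptions: each table is represented only by the one entry selected by the current branch, entries are plain dictionaries, and candidate tables are scanned in order. The function name and its arguments are hypothetical, and randomization of the starting table is deliberately left out.

```python
def allocate_on_mispredict(entries, provider, new_tags, taken, max_alloc=2):
    """Allocate up to max_alloc entries after a misprediction.

    entries   -- one dict {'tag', 'ctr', 'u'} per tagged table (the entry this
                 branch indexes), ordered by increasing history length
    provider  -- index of the table that provided the prediction,
                 or -1 if the base predictor provided it
    new_tags  -- the tag this branch would carry in each table
    taken     -- actual branch outcome
    Returns the indices of the tables where an entry was allocated.
    """
    allocated = []
    for i in range(provider + 1, len(entries)):  # longer histories only
        if entries[i]['u'] != 0:
            continue                             # entry still useful: keep it
        entries[i]['tag'] = new_tags[i]
        entries[i]['ctr'] = 0 if taken else -1   # weak, in the right direction
        entries[i]['u'] = 0
        allocated.append(i)
        if len(allocated) == max_alloc:
            break
    return allocated
```

With max_alloc=2 this corresponds to the 256 Kbits setting on the slide; max_alloc=1 gives the more conservative choice mentioned for 32 Kbits.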

Managing the (U)seful counter
Increment when the entry avoids a misprediction: (Pred = taken) & (Alt ≠ taken)
256 Kbits: global decrement if it is « difficult » to allocate
32 Kbits: probabilistic decrement on conflict
Unlimited: don't care
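One way to picture the increment rule and the global decrement is the sketch below. The slide does not say how « difficult to allocate » is detected, so the tick counter, its threshold and all names here are assumptions made for the example.

```python
U_MAX = 3  # 2-bit useful counter


def update_useful(entry, pred, altpred, taken):
    """Increment U when the provider was right and Altpred would have been wrong."""
    if pred == taken and altpred != taken:
        entry['u'] = min(entry['u'] + 1, U_MAX)


class UsefulManager:
    """Hypothetical global-decrement policy driven by allocation failures."""

    def __init__(self, threshold=8):
        self.tick = 0
        self.threshold = threshold

    def after_allocation(self, all_entries, failures, successes):
        """Record one allocation attempt; age every U counter if allocation
        has been failing too often. Returns True when aging happened."""
        self.tick = max(0, self.tick + failures - successes)
        if self.tick < self.threshold:
            return False
        for e in all_entries:
            e['u'] = max(0, e['u'] - 1)
        self.tick = 0
        return True
```

Counting failures against successes means the decrement only fires when free entries are persistently hard to find, which keeps useful entries alive under light pressure.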

Adjunct predictors
TAGE tracks strong correlation with the global branch history
Small adjunct predictors capture some of the missed correlation:
  Loop predictor
  Statistical Corrector

The loop predictor
Predicts loops with a constant number of iterations:
  16/32 entries, less than 5 bytes per entry
Captures loops with long bodies and/or irregular internal branches
S: 1.2 %, M: 1 %, U: 0.4 %
Good tradeoff for the Championship
Implementation: not that great
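A single loop-predictor entry can be sketched as below, assuming the loop branch is taken on every iteration and not taken on exit. The class, its field names and the confidence threshold are illustrative assumptions; tags, ages and the table lookup are omitted.

```python
class LoopEntry:
    """One loop-predictor entry (illustrative sketch, no tag or age field)."""

    def __init__(self, conf_max=3):
        self.trip_count = 0   # taken iterations seen in the last complete run
        self.current = 0      # taken iterations seen so far in this run
        self.confidence = 0   # consecutive runs that matched trip_count
        self.conf_max = conf_max

    def predict(self):
        """Return the predicted direction, or None when not confident."""
        if self.confidence < self.conf_max:
            return None
        return self.current < self.trip_count

    def update(self, taken):
        if taken:
            if self.current >= self.trip_count:
                self.confidence = 0          # loop ran past the learned count
            self.current += 1
        else:
            if self.current == self.trip_count:
                self.confidence = min(self.confidence + 1, self.conf_max)
            else:
                self.trip_count = self.current  # learn the new trip count
                self.confidence = 0
            self.current = 0
```

The entry only overrides the main predictor once the same trip count has been observed several times in a row, which is why predict returns None until confidence saturates.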

The Statistical Corrector predictor
Branches with poor correlation with global history:
  Sometimes better predicted by a single wide PC-indexed counter than by TAGE
More generally, track cases such that:
  « In this case (PC, history, prediction), TAGE is likely (>50 %) to mispredict »

Small predictor: very limited budget for the SC predictor
Just track the statistically PC-biased branches
  « TAGE predicts this direction on this branch, but in most cases this was wrong »
The corrector filter:
  A small partially tagged associative table
  1.5 % misprediction reduction
  Much simpler than a loop predictor

Medium predictor
« Statistically » correlated branches:
  Not strongly correlated with the global history, but exhibit a bias
  Better predicted by averaging (neural) than by tags
Branches correlated with local history, but with an irregular global history pattern (on other branches):
  TAGE does not learn the pattern

Multi-GEHL Statistical Corrector predictor
[Figure: block diagram. TAGE (inputs: PC, H) passes its prediction + ctr value to the Stat. Corr., a GEHL-like predictor (adder, "+") with inputs PC, Pred, H and LH (local hist.).]

Why does it work
The bias table is indexed with PC + TAGE output:
  TAGE correct (most of the time): high counter value, dominates, not many updates
  TAGE wrong: the other counters can be trained, and correlation (if it exists) can be captured
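The mechanism described above can be sketched as a small GEHL-style adder: a bias table indexed with PC plus the TAGE output, and a few tables indexed with PC and global history. Everything concrete here (class name, table sizes, history lengths, 6-bit counters, threshold, index hashing, initialization) is an assumption for illustration, and local histories are left out.

```python
class StatCorrector:
    CTR_MIN, CTR_MAX = -32, 31  # 6-bit signed counters (assumed width)

    def __init__(self, log_size=10, hist_lengths=(4, 8, 16), threshold=6):
        self.size = 1 << log_size
        self.hist_lengths = hist_lengths
        self.threshold = threshold
        # Bias table indexed with (PC, TAGE prediction): starts out agreeing
        # with TAGE (odd index = TAGE predicted taken).
        self.bias = [0 if (i & 1) else -1 for i in range(self.size)]
        self.tables = [[0] * self.size for _ in hist_lengths]

    def _counters(self, pc, ghist, tage_pred):
        """Yield (table, index) for every counter selected by this branch."""
        yield self.bias, ((pc << 1) | int(tage_pred)) % self.size
        for table, n in zip(self.tables, self.hist_lengths):
            yield table, (pc ^ (ghist & ((1 << n) - 1))) % self.size

    def _sum(self, pc, ghist, tage_pred):
        # Centered counters (2c + 1) so that no counter ever contributes 0.
        return sum(2 * t[i] + 1 for t, i in self._counters(pc, ghist, tage_pred))

    def predict(self, pc, ghist, tage_pred):
        s = self._sum(pc, ghist, tage_pred)
        sc_pred = s >= 0
        if sc_pred != tage_pred and abs(s) >= self.threshold:
            return sc_pred          # confident disagreement: revert TAGE
        return tage_pred

    def update(self, pc, ghist, tage_pred, taken):
        s = self._sum(pc, ghist, tage_pred)
        if (s >= 0) == taken and abs(s) >= self.threshold:
            return                  # correct and confident: no update
        step = 1 if taken else -1
        for t, i in self._counters(pc, ghist, tage_pred):
            t[i] = max(self.CTR_MIN, min(self.CTR_MAX, t[i] + step))
```

Because updates stop once the sum is correct and above the threshold, counters trained while TAGE is right see few further updates, in line with the "not many updates" remark on the slide.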

Multi-GEHL Statistical Corrector predictor for the Championship
+ RAS-associated history
+ 2 different local histories
+ simple chooser
6.8 % misprediction reduction
[Figure: block diagram. Labels: TAGE (inputs: PC, H); Prediction + ctr value; Stat. Corr.; Local hist.]

« Realistic » 256 Kbits TAGE-SC-L
« Only » 12 equal-size TAGE tables
+ 4-table SC (local hist., global hist.)
+ loop predictor
No history tuning
Only 2.8 % extra mispredictions

SC for the unlimited predictor
GEHL-based SC predictor:
  Use any form of history information
    Very long global
    Multiple local
    « Skeleton » global history: ignore some branches
Recycle old ideas from the MAC-RHSP predictor (2004)

SC for the unlimited predictor
460 predictor tables + 10 chooser tables
Globally about 20 % fewer mispredictions than TAGE alone
If one removes only:
  The bias: 1.6 % for a single table
  All global history components: 3.7 %
  All local history components: 3.9 %
  The chooser: 3.2 %

Conclusion
TAGE-SC-L fits (nearly) all storage sizes:
  32 Kbits ≈ 64 Kbits CBP1 champion on CBP1 traces
  256 Kbits ≈ 512 Kbits CBP3 champion on CBP4 traces
Unlimited predictor: poTAGE-SC does better