Optimized Hybrid Scaled Neural Analog Predictor Daniel A. Jiménez Department of Computer Science The University of Texas at San Antonio.


Branch Prediction with Perceptrons

Branch Prediction with Perceptrons (cont.)

SNP/SNAP [St. Amant et al. 2008]
- A version of piecewise linear neural prediction [Jiménez 2005]
- Based on perceptron prediction
- SNAP is a mixed digital/analog version of SNP
  - Uses an analog circuit for the costly dot-product operation
  - Enables interesting tricks, e.g. scaling

Weight Scaling
- Scaling weights by coefficients
- Different history positions have different importance!

The Algorithm: Parameters and Variables
- C – array of scaling coefficients
- h – the global history length
- H – a global history shift register
- A – a global array of previous branch addresses
- W – an n × (h + 1) array of small integers
- θ – a threshold to decide when to train

The Algorithm: Making a Prediction
- Weights are selected based on the current branch and the i-th most recent branch
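
The prediction step can be sketched roughly as follows. This is a minimal sketch, not the predictor's actual datapath: the row-selection hash `(pc ^ A[i - 1]) % n`, the function name `predict`, and the encoding of history bits as +1/-1 are assumptions made for illustration. The slide only states that each weight is chosen from the current branch and the i-th most recent branch.

```python
def predict(pc, W, C, H, A):
    """Return (taken, output) for the branch at address pc.

    W: n rows of h + 1 small-integer weights; column 0 holds bias weights
    C: h + 1 scaling coefficients, one per history position
    H: h past outcomes, +1 for taken and -1 for not taken, most recent first
    A: h past branch addresses, most recent first
    """
    n = len(W)
    h = len(H)
    # The bias weight depends only on the current branch.
    output = C[0] * W[pc % n][0]
    for i in range(1, h + 1):
        # Assumed hash: combine the current PC with the i-th most recent
        # branch address to pick the row for history position i.
        row = (pc ^ A[i - 1]) % n
        output += C[i] * W[row][i] * H[i - 1]
    return output >= 0, output
```

The scaling coefficients in `C` let positions in the history contribute unequally to the sum, which is the weight-scaling idea from the earlier slide.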

The Algorithm: Training
- If the prediction is wrong or |output| ≤ θ, then:
  - For the i-th correlating weight used to predict this branch:
    - Increment it if the branch outcome = the outcome of the i-th branch in the history
    - Decrement it otherwise
  - Increment the bias weight if the branch is taken
  - Decrement it otherwise
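
The training rule can be written out as a short sketch. The saturation bounds in `clamp`, the function names, and the row hash `(pc ^ A[i - 1]) % n` are assumptions; the slide specifies only when to train and in which direction each weight moves.

```python
def clamp(value, low=-64, high=63):
    """Saturate a weight to a small signed range (the bounds are assumed)."""
    return max(low, min(high, value))


def train(pc, taken, output, W, H, A, theta):
    """Update W in place once the branch at pc has resolved.

    taken:  the actual outcome (True or False)
    output: the value the predictor computed for this branch
    H, A:   the outcome history (+1/-1) and address history used to predict
    theta:  training threshold
    """
    n = len(W)
    mispredicted = (output >= 0) != taken
    if not mispredicted and abs(output) > theta:
        return  # correct and confident: leave the weights alone
    t = 1 if taken else -1
    # Bias weight: increment if taken, decrement otherwise.
    bias_row = pc % n
    W[bias_row][0] = clamp(W[bias_row][0] + t)
    for i in range(1, len(H) + 1):
        row = (pc ^ A[i - 1]) % n  # assumed hash; must match prediction
        # Increment when the outcome agrees with the i-th history outcome.
        step = 1 if H[i - 1] == t else -1
        W[row][i] = clamp(W[row][i] + step)
```

Training on correct but low-magnitude outputs, not just on mispredictions, keeps pushing weights until the output clears θ.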

SNP/SNAP Datapath

Tricks
- Use alloyed [Skadron 2000] global and per-branch history
  - Separate table of local perceptrons
  - Output from this stage is multiplied by an empirically determined coefficient
- Training coefficients vector(s)
  - Multiple vectors initialized to f(i) = 1 / (A + B × i)
  - Minimum coefficient value determined empirically
  - Indexed by branch PC
  - Each vector trained with perceptron-like learning on-line
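
As an illustration of the coefficient initialization, the sketch below builds several identical vectors from f(i) = 1 / (A + B × i) with a floor on the smallest value. The parameters are named `a` and `b` here to avoid confusion with the address array A. The function names, the modulo indexing by PC, and the choice to start i at 0 are assumptions, and the on-line training of the vectors is not shown because the slide does not give its update rule.

```python
def init_coefficients(h, a, b, floor, num_vectors):
    """Build num_vectors coefficient vectors of length h + 1.

    Each entry starts at f(i) = 1 / (a + b * i), clamped from below by
    floor (the empirically chosen minimum coefficient value).
    """
    base = [max(floor, 1.0 / (a + b * i)) for i in range(h + 1)]
    # Every vector gets its own copy so each can be trained separately.
    return [list(base) for _ in range(num_vectors)]


def select_vector(vectors, pc):
    """Pick the coefficient vector for a branch (modulo indexing is assumed)."""
    return vectors[pc % len(vectors)]
```

With positive `a` and `b`, the coefficients decay with distance in the history, so recent branches start with more influence than older ones.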

Tricks (2)
- Branch cache
  - Highly associative cache with entries for branch information
  - Each entry contains:
    - A partial tag for this branch PC
    - The bias weight for this branch
    - An “ever taken” bit
    - A “never taken” bit
  - The “ever/never” bits avoid needless use of weight resources
  - The bias weight is protected from destructive interference
  - LRU replacement
  - >99% hit rate
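
A branch cache along these lines could be modeled as below. This is a structural sketch only: the class names, the set count, associativity, and tag width are hypothetical, and the rules for setting and consuming the “ever taken” and “never taken” bits are not given on the slide, so the sketch stores them without interpreting them.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class BranchEntry:
    bias: int = 0              # bias weight private to this branch
    ever_taken: bool = False   # stored only; update policy not modeled
    never_taken: bool = False  # stored only; update policy not modeled


class BranchCache:
    """Set-associative cache of per-branch entries with LRU replacement."""

    def __init__(self, num_sets=4, ways=16, tag_bits=8):
        self.num_sets = num_sets
        self.ways = ways
        self.tag_mask = (1 << tag_bits) - 1
        # Each set maps partial tag -> entry, ordered from LRU to MRU.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def _locate(self, pc):
        entries = self.sets[pc % self.num_sets]
        tag = (pc // self.num_sets) & self.tag_mask  # partial tag
        return entries, tag

    def lookup(self, pc):
        """Return the entry for pc, or None on a miss."""
        entries, tag = self._locate(pc)
        entry = entries.get(tag)
        if entry is not None:
            entries.move_to_end(tag)  # mark as most recently used
        return entry

    def insert(self, pc):
        """Return the entry for pc, allocating it and evicting LRU if needed."""
        entries, tag = self._locate(pc)
        if tag in entries:
            entries.move_to_end(tag)
            return entries[tag]
        if len(entries) >= self.ways:
            entries.popitem(last=False)  # evict the least recently used
        entries[tag] = BranchEntry()
        return entries[tag]
```

Keeping the bias weight in a tagged entry, instead of a shared table row, is what shields it from interference by other branches that would otherwise map to the same row.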

Tricks (3)
- Hybrid predictor
  - When perceptron output is below some threshold:
    - If a 2-bit counter gshare predictor has high confidence, use it
    - Else use a 1-bit counter PAs predictor
- Multiple θs indexed by branch PC
  - Each trained adaptively [Seznec 2005]
- Ragged array
  - Not all rows of the matrix are the same size
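
The hybrid selection and the adaptive threshold can be sketched together. Several details here are assumptions and not taken from the slides: "below some threshold" is read as a small output magnitude, "high confidence" for the 2-bit gshare counter is read as a saturated state (0 or 3), and the threshold update follows the counter-based scheme commonly associated with [Seznec 2005], with a hypothetical saturation `limit`.

```python
def hybrid_predict(output, low_confidence, gshare_counter, pas_bit):
    """Choose the final prediction among perceptron, gshare, and PAs.

    output:         perceptron output for this branch
    low_confidence: magnitude below which the perceptron is not trusted
    gshare_counter: 2-bit saturating counter value, 0..3
    pas_bit:        1-bit PAs prediction, 0 or 1
    """
    if abs(output) >= low_confidence:
        return output >= 0
    if gshare_counter in (0, 3):  # assumed meaning of "high confidence"
        return gshare_counter == 3
    return bool(pas_bit)


class AdaptiveThreshold:
    """One adaptively trained theta; a table of these is indexed by PC."""

    def __init__(self, theta, limit=64):
        self.theta = theta
        self.counter = 0
        self.limit = limit  # hypothetical saturation point of the counter

    def update(self, mispredicted, output):
        if mispredicted:
            # Too many mispredictions: train more often by raising theta.
            self.counter += 1
            if self.counter >= self.limit:
                self.theta += 1
                self.counter = 0
        elif abs(output) <= self.theta:
            # Too many low-confidence correct predictions: lower theta.
            self.counter -= 1
            if self.counter <= -self.limit:
                self.theta -= 1
                self.counter = 0
```

A per-branch table could then be built as `thetas = [AdaptiveThreshold(20) for _ in range(k)]` and indexed with `thetas[pc % k]`, where the initial value 20 and the table size `k` are placeholders.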

Benefit of Tricks
- Graph shows the effect of one trick in isolation
- Training coefficients yields the most benefit

References
- Jiménez & Lin, HPCA 2001 (perceptron predictor)
- Jiménez & Lin, TOCS 2002 (global/local perceptron)
- Jiménez, ISCA 2005 (piecewise linear branch predictor)
- Skadron, Martonosi & Clark, PACT 2000 (alloyed history)
- Seznec 2005 (adaptively trained threshold)
- St. Amant, Jiménez & Burger, MICRO 2008 (SNP/SNAP)
- McFarling 1993 (gshare)
- Yeh & Patt 1991 (PAs)

The End