MACHINE LEARNING AND BRANCH PREDICTION
Speakers, in order: Guru Nishok Radhakrishnan, Crystin Rodrick, and Anuj Wali
You can follow along at:
OVERVIEW
Introduction
Previous research
Simulation
Machine learning
Design and implementation
Future work
INTRODUCTION
A number of different branch predictors exist.
Decreasing hardware costs make more sophisticated branch predictors feasible.
Machine learning: recognizing patterns within the data.
PERFORMANCE OVERVIEW
Bimodal branch prediction:
Used in the non-MMX Intel Pentium processor.
Saturates at 93.5% correct [4].
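To make the mechanism concrete, here is a minimal C sketch of a bimodal predictor: a table of 2-bit saturating counters indexed by the branch address. The table size and modulo indexing are illustrative assumptions, not details from the slides.

```c
#include <stdint.h>
#include <stdbool.h>

#define BIMODAL_ENTRIES 4096            /* assumed table size */

static uint8_t counters[BIMODAL_ENTRIES];   /* 2-bit counters: 0..3 */

/* Predict taken when the counter is in one of the two "taken" states. */
bool bimodal_predict(uint32_t pc) {
    return counters[pc % BIMODAL_ENTRIES] >= 2;
}

/* Saturating update: move toward 3 on taken, toward 0 on not-taken. */
void bimodal_update(uint32_t pc, bool taken) {
    uint8_t *c = &counters[pc % BIMODAL_ENTRIES];
    if (taken  && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
}
```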
LOCAL BRANCH PREDICTION
Separate history buffer per branch.
The pattern history table may or may not be shared.
Used in the Pentium II and Pentium III.
Saturates at 97.1% correct [4].
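A sketch of the local scheme in the same style: per-branch history registers index a shared pattern history table of 2-bit counters. All sizes here are assumptions for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

#define HIST_ENTRIES 1024               /* assumed: per-branch history registers */
#define HIST_BITS    10                 /* assumed local history length */
#define PHT_ENTRIES  (1 << HIST_BITS)

static uint16_t local_hist[HIST_ENTRIES];   /* last HIST_BITS outcomes per branch */
static uint8_t  pht[PHT_ENTRIES];           /* shared pattern table of 2-bit counters */

bool local_predict(uint32_t pc) {
    uint16_t h = local_hist[pc % HIST_ENTRIES] & (PHT_ENTRIES - 1);
    return pht[h] >= 2;
}

void local_update(uint32_t pc, bool taken) {
    uint16_t *h = &local_hist[pc % HIST_ENTRIES];
    uint8_t  *c = &pht[*h & (PHT_ENTRIES - 1)];
    if (taken  && *c < 3) (*c)++;
    if (!taken && *c > 0) (*c)--;
    *h = (uint16_t)((*h << 1) | taken);     /* shift the newest outcome into the history */
}
```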
GLOBAL BRANCH PREDICTION
Shared history of all recent branches.
Advantage: correlation between different conditional jumps contributes to the predictions.
Disadvantage: the history contributes irrelevant information when the conditional jumps are uncorrelated.
Used in the Intel Pentium M, Core, and Core 2.
Saturates at 96% correct [4].
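One common realization of global prediction is gshare, from McFarling [4], which XORs the global history with the branch address to index a counter table. A minimal sketch, with assumed sizes:

```c
#include <stdint.h>
#include <stdbool.h>

#define GHIST_BITS   12                 /* assumed global history length */
#define GPHT_ENTRIES (1 << GHIST_BITS)

static uint32_t ghist;                  /* single shared history of recent outcomes */
static uint8_t  gpht[GPHT_ENTRIES];     /* 2-bit saturating counters */

/* gshare-style indexing: XOR the global history with the branch address,
   so correlated branches share history but still get distinct entries. */
bool global_predict(uint32_t pc) {
    uint32_t idx = (pc ^ ghist) & (GPHT_ENTRIES - 1);
    return gpht[idx] >= 2;
}

void global_update(uint32_t pc, bool taken) {
    uint32_t idx = (pc ^ ghist) & (GPHT_ENTRIES - 1);
    if (taken  && gpht[idx] < 3) gpht[idx]++;
    if (!taken && gpht[idx] > 0) gpht[idx]--;
    ghist = (ghist << 1) | taken;
}
```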
NEURAL BRANCH PREDICTION
Advantage: exploits long histories while requiring only linear resource growth.
Disadvantage: the perceptron predictor has high latency.
Used in the AMD Ryzen processor.
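A software sketch of a perceptron predictor: one signed weight per history bit, so storage grows linearly with history length. The training threshold formula is the one commonly used in the perceptron-predictor literature; the table and history sizes are assumptions, and weight saturation is omitted for brevity.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

#define P_HIST   32                          /* assumed history length */
#define P_TABLES 1024                        /* assumed number of perceptrons */
#define THETA    ((int)(1.93 * P_HIST + 14)) /* training threshold */

static int8_t weights[P_TABLES][P_HIST + 1]; /* w[0] is the bias weight */
static int    hist[P_HIST];                  /* +1 = taken, -1 = not taken */

/* Output y = w0 + sum(wi * hi); predict taken when y >= 0. */
int perceptron_output(uint32_t pc) {
    int8_t *w = weights[pc % P_TABLES];
    int y = w[0];
    for (int i = 0; i < P_HIST; i++) y += w[i + 1] * hist[i];
    return y;
}

/* Train on a misprediction or a low-confidence output, then shift the history. */
void perceptron_update(uint32_t pc, bool taken, int y) {
    int t = taken ? 1 : -1;
    int8_t *w = weights[pc % P_TABLES];
    if ((y >= 0) != taken || abs(y) <= THETA) {
        w[0] += t;
        for (int i = 0; i < P_HIST; i++) w[i + 1] += t * hist[i];
    }
    for (int i = P_HIST - 1; i > 0; i--) hist[i] = hist[i - 1];
    hist[0] = t;
}
```

The latency disadvantage on the slide comes from the wide dot product: every prediction sums P_HIST weight terms, which is slow relative to a single table lookup.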
PREVIOUS RESEARCH
“Dynamic Branch Prediction using Machine Learning Algorithms”, Kedar Bellare, Pallika Kanani, and Shiraj Sen, Department of Computer Science, University of Massachusetts Amherst, May 17, 2006.
“Branch Prediction with Bayesian Networks”, Jeremy Singer, Gavin Brown, and Ian Watson, University of Manchester, UK.
“A 2-Clock-Cycle Naïve Bayes Classifier for Dynamic Branch Prediction in Pipelined RISC Microprocessors”, Itaru Hida, Masayuki Ikebe, Tetsuya Asai, and Masato Motomura, Graduate School of Information Science and Technology.
MOTIVATION
The two-level predictor has high overhead.
Neural predictors are already commercially available.
Goal: overcome these issues through machine-learning concepts.
SIMULATION
Parsing the trace files
Branch history
Branch prediction
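The simulation stage reduces to a loop of this shape: parse each trace record, predict, compare against the recorded outcome, and update the history. The one-record-per-line trace format and the file name are assumptions, since the slides do not specify them; the stub predictor stands in for any of the sketches above.

```c
#include <stdio.h>
#include <inttypes.h>

/* Stubs for the predictor under test: swap in one of the sketches above. */
static int  predict(uint32_t pc)            { (void)pc; return 1; }  /* always taken */
static void update(uint32_t pc, int taken)  { (void)pc; (void)taken; }

int main(void) {
    FILE *f = fopen("branch.trace", "r");   /* hypothetical trace file name */
    if (!f) return 1;

    uint32_t pc;
    int taken;
    long total = 0, correct = 0;
    /* Assumed record format: "<hex branch address> <0|1 outcome>" per line. */
    while (fscanf(f, "%" SCNx32 " %d", &pc, &taken) == 2) {
        correct += (predict(pc) == taken);
        update(pc, taken);                  /* feed the real outcome back as history */
        total++;
    }
    if (total) printf("accuracy: %.2f%%\n", 100.0 * correct / total);
    fclose(f);
    return 0;
}
```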
WHAT IS MACHINE LEARNING?
Artificial intelligence: no fixed rule set; systems create their own rules.
Machine learning: learns from training sets.
Deep learning: uses multiple layers.
TYPES OF MACHINE LEARNING ALGORITHMS
Supervised learning
Semi-supervised learning
Unsupervised learning
Reinforcement learning
ALGORITHM: NAIVE BAYES
LOGISTIC REGRESSION
Developed in the 1950s.
Directly learns P(Y|X).
LOGISTIC REGRESSION
h_w(x) = g(wᵀx) = g(w₀x₀ + w₁x₁ + ··· + wₙxₙ)
g(z) = 1 / (1 + e^(−z))
Thus h_w(x) = 1 / (1 + e^(−wᵀx)).
wᵀx should be a large negative value for negative instances and a large positive value for positive instances.
With a threshold of 0.5: if h_w(x) ≥ 0.5 we predict 1, else 0.
LOGISTIC REGRESSION: COST FUNCTION
J(w) = −Σᵢ [ y⁽ⁱ⁾ log h_w(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − h_w(x⁽ⁱ⁾)) ]
Gradient-descent update: wⱼ = wⱼ − α ∂J(w)/∂wⱼ
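A small end-to-end sketch of these equations in C, training by gradient descent on a made-up four-sample dataset (a simple AND). The learning rate, iteration count, and data are illustrative; the weights start at 0.33 as in the run-through slide later in the deck.

```c
#include <stdio.h>
#include <math.h>

#define N_FEATURES 3    /* x0 = 1 for the bias, plus two inputs */
#define N_SAMPLES  4

static double sigmoid(double z) { return 1.0 / (1.0 + exp(-z)); }

int main(void) {
    double X[N_SAMPLES][N_FEATURES] = { {1,0,0}, {1,0,1}, {1,1,0}, {1,1,1} };
    double y[N_SAMPLES] = {0, 0, 0, 1};          /* made-up labels: AND of the inputs */
    double w[N_FEATURES] = {0.33, 0.33, 0.33};   /* initialize as on the slides */
    double alpha = 0.5;                          /* assumed learning rate */

    for (int iter = 0; iter < 5000; iter++) {
        for (int i = 0; i < N_SAMPLES; i++) {
            double z = 0;
            for (int j = 0; j < N_FEATURES; j++) z += w[j] * X[i][j];
            double err = sigmoid(z) - y[i];      /* gradient of the log loss w.r.t. z */
            for (int j = 0; j < N_FEATURES; j++)
                w[j] -= alpha * err * X[i][j];   /* wj = wj - alpha * dJ/dwj */
        }
    }
    for (int i = 0; i < N_SAMPLES; i++) {
        double z = w[0]*X[i][0] + w[1]*X[i][1] + w[2]*X[i][2];
        printf("x=(%g,%g) -> P(y=1)=%.3f\n", X[i][1], X[i][2], sigmoid(z));
    }
    return 0;
}
```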
EXAMPLE: DUCK VS. TOY
EXAMPLE DATASET
(Table of ten items with binary attributes QUACK and WADDLE and label TYPE: DUCK or TOY.)
PROBABILITIES
P(QUACK=1|DUCK) = 5/7 = 0.71    P(QUACK=1|TOY) = 2/7 = 0.29
P(WADDLE=1|DUCK) = 2/5 = 0.4    P(WADDLE=1|TOY) = 3/5 = 0.6
P(DUCK) = 5/10 = 0.5    P(TOY) = 5/10 = 0.5
NAIVE BAYES SOLUTION
Predict by finding the argmax of P(y | WADDLE=1, QUACK=1):
DUCK: P(DUCK) × P(QUACK=1|DUCK) × P(WADDLE=1|DUCK) = 0.5 × 0.71 × 0.4 = 0.14
TOY: P(TOY) × P(QUACK=1|TOY) × P(WADDLE=1|TOY) = 0.5 × 0.29 × 0.6 = 0.09
Therefore, we predict DUCK.
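The same argmax, done in a few lines of C with the probabilities from the previous slide:

```c
#include <stdio.h>

/* Naive Bayes on the duck/toy example: score each class as
   P(class) * P(quack=1|class) * P(waddle=1|class) and pick the argmax. */
int main(void) {
    double p_duck = 0.5 * 0.71 * 0.4;   /* = 0.14 */
    double p_toy  = 0.5 * 0.29 * 0.6;   /* = 0.09 */
    printf("score(DUCK)=%.3f score(TOY)=%.3f -> predict %s\n",
           p_duck, p_toy, p_duck > p_toy ? "DUCK" : "TOY");
    return 0;
}
```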
LOGISTIC REGRESSION: RUN-THROUGH
Initialize w with 0.33 spread over all weights.
After the first iteration: h_w(x) = 1 / (1 + e^(−(0.33·(1) + 0.33·(1) + ···)))
Update: wⱼ = wⱼ − α ∂J(w)/∂wⱼ
Once you have the optimal weights, plug in the x values to get your prediction y.
DESIGN: NAIVE BAYES
Why naive Bayes?
Faster than other regression/classification models
Relatively less complex circuitry
Requires less physical space
DESIGN: NAIVE BAYES
What do we need to build one?
A set of known attributes
A conditional probability table (CPT)
Posterior probabilities
DESIGN: ATTRIBUTES
Set of known attributes: the last 30 branch outcomes, maintained as a queue.
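A minimal sketch of the attribute queue, assuming the outcomes are kept as bits in a shift register:

```c
#include <stdint.h>
#include <stdbool.h>

#define HIST_LEN 30   /* last 30 branch outcomes, as on the slide */

/* A 32-bit shift register is enough for a queue of 30 one-bit outcomes:
   shift in the newest outcome, mask off anything older than 30 branches. */
static uint32_t history;

void history_push(bool taken) {
    history = ((history << 1) | taken) & ((1u << HIST_LEN) - 1);
}

bool history_bit(int i) {   /* outcome of the i-th most recent branch */
    return (history >> i) & 1u;
}
```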
DESIGN: CPT
Bimodal counters instead of probabilities, updated after each branch (see the sketch after the posterior-probability slide).
DESIGN: POSTERIOR PROBABILITIES
P(y = 0 | X) and P(y = 1 | X)
Made faster using a look-up table.
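Putting the attribute queue, the per-attribute counters, and the posterior comparison together, here is a software sketch of the predictor. Floating-point logs with Laplace smoothing stand in for the saturating counters and log look-up table of the hardware design in [1]; the sizes and smoothing are assumptions.

```c
#include <stdint.h>
#include <stdbool.h>
#include <math.h>

#define HIST_LEN 30

/* cnt[i][v][y]: how often history bit i had value v when the branch outcome
   was y. The hardware keeps small counters here instead of probabilities. */
static unsigned cnt[HIST_LEN][2][2];
static unsigned class_cnt[2];
static uint32_t history;

/* Compare log-posteriors: log P(y) + sum_i log P(x_i | y). In hardware the
   log() would come from a small look-up table, which is what makes the
   2-clock-cycle design of [1] feasible. */
bool nb_predict(void) {
    double score[2];
    for (int y = 0; y < 2; y++) {
        score[y] = log((class_cnt[y] + 1.0) / (class_cnt[0] + class_cnt[1] + 2.0));
        for (int i = 0; i < HIST_LEN; i++) {
            int v = (history >> i) & 1;
            /* Laplace smoothing keeps zero counts from producing log(0). */
            score[y] += log((cnt[i][v][y] + 1.0) / (class_cnt[y] + 2.0));
        }
    }
    return score[1] > score[0];
}

/* Update the counters after each branch, then shift in the new outcome. */
void nb_update(bool taken) {
    class_cnt[taken]++;
    for (int i = 0; i < HIST_LEN; i++)
        cnt[i][(history >> i) & 1][taken]++;
    history = ((history << 1) | taken) & ((1u << HIST_LEN) - 1);
}
```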
DESIGN: NAIVE BAYES PROCESS FLOW
(Process-flow diagram.)
DESIGN: PROCESSOR ARCHITECTURE
LatticeMico32, a 6-stage pipeline.
Where does the predictor fit in the pipeline?
DESIGN: EXPECTED MODEL PARAMETERS
Counter size and history length (figures taken from [1]).
DESIGN: EXPECTED RESULTS
Performance measurement: misprediction rate (%).
The naive Bayes predictor performs better than the others (figure taken from [1]).
FUTURE WORK: LOGISTIC REGRESSION
Issues: not much research available; feasibility is uncertain.
Possible implementation model: same attribute set; performance expected to be similar to the perceptron.
REFERENCES
[1] “A 2-Clock-Cycle Naïve Bayes Classifier for Dynamic Branch Prediction in Pipelined RISC Microprocessors”, Itaru Hida, Masayuki Ikebe, Tetsuya Asai, and Masato Motomura, Graduate School of Information Science and Technology.
[2] “Dynamic Branch Prediction using Machine Learning Algorithms”, Kedar Bellare, Pallika Kanani, and Shiraj Sen, Department of Computer Science, University of Massachusetts Amherst, May 17, 2006.
[3] “Branch Prediction with Bayesian Networks”, Jeremy Singer, Gavin Brown, and Ian Watson, University of Manchester, UK.
[4] “Combining Branch Predictors”, Scott McFarling, June 1993.
THANK YOU