Perceptrons for Dummies

Perceptrons for Dummies
Daniel A. Jiménez
Department of Computer Science, Rutgers University

Conditional Branch Prediction is a Machine Learning Problem
- The machine learns to predict conditional branches, so why not apply a machine learning algorithm?
- Artificial neural networks: a simple model of the neural networks in brain cells that learns to recognize and classify patterns.
- We used fast and accurate perceptrons [Rosenblatt '62, Block '62] for dynamic branch prediction [Jiménez & Lin, HPCA 2001].

Input and Output of the Perceptron
- The inputs to the perceptron are branch outcome histories, just like in 2-level adaptive branch prediction.
- Histories can be global, local (per-branch), or both (alloyed).
- Conceptually, branch outcomes are represented as +1 for taken and -1 for not taken.
- The output of the perceptron is non-negative if the branch is predicted taken, and negative if the branch is predicted not taken.
- Ideally, each static branch is allocated its own perceptron.

Branch-Predicting Perceptron
- Inputs (x's) are from branch history and are -1 or +1.
- n + 1 small integer weights (w's) are learned by on-line training.
- Output (y) is the dot product of the x's and w's; predict taken if y >= 0.
- Training finds correlations between history and outcome.
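To make the prediction step concrete, here is a minimal sketch in Python; the names `predict`, `weights`, and `history` are ours, not from the slides:

```python
def predict(weights, history):
    """Perceptron prediction for one branch.

    weights: n+1 small integers; weights[0] is the bias weight w0.
    history: n branch outcomes, each +1 (taken) or -1 (not taken).
    Returns (taken?, y); y is kept because training needs it later.
    """
    # y = w0 + sum(wi * xi); the bias input x0 is implicitly 1.
    y = weights[0] + sum(w * x for w, x in zip(weights[1:], history))
    return y >= 0, y

# Example: w = (1, -2, 3, 0), history = (+1, -1, +1)
# y = 1 + (-2)(+1) + (3)(-1) + (0)(+1) = -4, so predict not taken.
taken, y = predict([1, -2, 3, 0], [+1, -1, +1])
```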

Training Algorithm
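The algorithm itself did not survive the transcript, so here is a sketch of the standard perceptron update rule the slide refers to: once the branch resolves, train only if the prediction was wrong or |y| fell at or below the threshold θ, then add the outcome t (+1 or -1) times each input to the corresponding weight. (Hardware versions also saturate the weights to a small integer range, omitted here.)

```python
def train(weights, history, y, taken, theta):
    """Update the weights after the branch outcome is known.

    y: the dot-product output computed at prediction time.
    taken: the actual outcome; encoded below as t = +1 or -1.
    theta: training threshold that keeps the perceptron from
           overtraining (see the next slide).
    """
    t = 1 if taken else -1
    mispredicted = (y >= 0) != taken
    if mispredicted or abs(y) <= theta:
        weights[0] += t                   # bias input x0 is implicitly 1
        for i, x in enumerate(history):
            weights[i + 1] += t * x       # strengthen or weaken correlation
```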

What Do The Weights Mean?
- The bias weight, w0, is proportional to the probability that the branch is taken. It doesn't take other branches into account; on its own it behaves like a Smith predictor.
- The correlating weights, w1 through wn: wi is proportional to the probability that the predicted branch agrees with the ith branch in the history.
- In the dot product of the w's and x's, each term wi × xi is proportional to the probability that the predicted branch is taken, based on the correlation between this branch and the ith branch; the sum takes all of these estimated probabilities into account.
- What's θ? It keeps the perceptron from overtraining, so it can adapt quickly to changing behavior.

Organization of the Perceptron Predictor
- The predictor keeps a table of m perceptron weight vectors.
- The table is indexed by branch address modulo m [Jiménez & Lin, HPCA 2001].
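Putting lookup, prediction, and update together, the whole organization might look like the sketch below; the class name, the global-history choice, and the θ value are our assumptions (the paper derives θ from the history length):

```python
class PerceptronPredictor:
    """Minimal sketch: m weight vectors indexed by PC mod m,
    with a shared global history register; weight saturation omitted."""

    def __init__(self, m, history_length, theta=20):
        self.table = [[0] * (history_length + 1) for _ in range(m)]
        self.history = [-1] * history_length   # global history, +1/-1
        self.m = m
        self.theta = theta

    def predict(self, pc):
        w = self.table[pc % self.m]
        y = w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))
        return y >= 0, y

    def update(self, pc, taken, y):
        w = self.table[pc % self.m]
        t = 1 if taken else -1
        if (y >= 0) != taken or abs(y) <= self.theta:
            w[0] += t
            for i, x in enumerate(self.history):
                w[i + 1] += t * x
        # shift the actual outcome into the global history register
        self.history = self.history[1:] + [t]
```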

Mathematical Intuition
A perceptron defines a hyperplane in (n+1)-dimensional space:

    w0 + w1 x1 + ... + wn xn = 0

For instance, in 2D space we have:

    w0 + w1 x1 + w2 x2 = 0

This is the equation of a line, the same as y = mx + b.

Mathematical Intuition continued
In 3D space, we have:

    w0 + w1 x1 + w2 x2 + w3 x3 = 0

Or you can think of it as:

    ax + by + cz = d

i.e., the equation of a plane in 3D space. This hyperplane forms a decision surface separating predicted-taken from predicted-not-taken histories. The surface intersecting the feature space is a linear surface: a line in 2D, a plane in 3D, a hyperplane in higher dimensions, etc.

Example: AND
Here is a representation of the AND function. White means false, black means true for the output; -1 means false, +1 means true for the input:

    -1 AND -1 = false
    -1 AND +1 = false
    +1 AND -1 = false
    +1 AND +1 = true

Example: AND continued
A linear decision surface (i.e., a plane in 3D space) intersecting the feature space (i.e., the 2D plane where z = 0) separates false from true instances.

Example: AND continued
Watch a perceptron learn the AND function:
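The slide's animation can't be reproduced in a transcript; as a stand-in, this small script trains a perceptron on AND with the ±1 encoding above and prints the weights after each pass (θ = 1 is our choice, not the slide's):

```python
def train_epoch(weights, samples, theta=1):
    """One training pass over a truth table; t is +1 (true) or -1 (false)."""
    for x1, x2, t in samples:
        y = weights[0] + weights[1] * x1 + weights[2] * x2
        if (y >= 0) != (t > 0) or abs(y) <= theta:
            weights[0] += t
            weights[1] += t * x1
            weights[2] += t * x2

AND = [(-1, -1, -1), (-1, +1, -1), (+1, -1, -1), (+1, +1, +1)]
w = [0, 0, 0]
for epoch in range(5):
    train_epoch(w, AND)
    print(epoch, w)
# Converges within a couple of epochs (e.g. to [-2, 2, 2]):
# y >= 0 only for the input (+1, +1), exactly the AND function.
```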

Example: XOR Here’s the XOR function: -1 XOR -1 = false -1 XOR +1 = true +1 XOR -1 = true +1 XOR +1 = false Perceptrons cannot learn such linearly inseparable functions

Example: XOR continued
Watch a perceptron try to learn XOR:
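Again substituting code for the lost animation: running the same training loop on XOR targets shows the failure, with the weights cycling instead of settling (same `train_epoch` as in the AND sketch):

```python
def train_epoch(weights, samples, theta=1):
    for x1, x2, t in samples:
        y = weights[0] + weights[1] * x1 + weights[2] * x2
        if (y >= 0) != (t > 0) or abs(y) <= theta:
            weights[0] += t
            weights[1] += t * x1
            weights[2] += t * x2

XOR = [(-1, -1, -1), (-1, +1, +1), (+1, -1, +1), (+1, +1, -1)]
w = [0, 0, 0]
for epoch in range(5):
    train_epoch(w, XOR)
    print(epoch, w)   # every epoch ends back at [0, 0, 0]; it never converges
```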

Concluding Remarks
- The perceptron is an alternative to traditional branch predictors; the literature speaks for itself in terms of better accuracy.
- Perceptrons were nice, but they had some problems:
  - Latency: computing a dot product takes longer than a simple table lookup.
  - Linear inseparability: branches whose behavior is not linearly separable are predicted poorly.

The End