Branch Prediction with Neural Networks: Hidden Layers and Recurrent Connections
Andrew Smith, CSE Dept.
June 10, 2004

Outline
What is a Perceptron? – Learning?
What is a Feed-Forward Network? – Learning?
What is a Recurrent Network? – Learning?
How to do it on hardware?
Results – adding hidden units
Results – modeling the latency of slow networks
Results – varying the hardware budget

The Perceptron
Linear (affine) combination of inputs → decision
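As a concrete sketch (not from the slides; the names, the history length, and the taken/not-taken reading of the decision are illustrative assumptions), a perceptron predictor computes an affine combination of its inputs and thresholds it at zero:

    #include <stdio.h>

    #define NUM_INPUTS 8   /* number of inputs (e.g. branch-history bits); an assumed size */

    /* Perceptron decision: affine combination of inputs, thresholded at zero.
     * Inputs are bipolar (+1/-1); w[0] is the bias, paired with a constant +1 input. */
    int perceptron_predict(const float w[NUM_INPUTS + 1], const int x[NUM_INPUTS])
    {
        float sum = w[0];                 /* bias term */
        for (int i = 0; i < NUM_INPUTS; i++)
            sum += w[i + 1] * (float)x[i];
        return (sum >= 0.0f) ? +1 : -1;   /* +1 could be read as "taken", -1 as "not taken" */
    }

    int main(void)
    {
        float w[NUM_INPUTS + 1] = { 0.5f, 1.0f, -0.5f, 0.25f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
        int   x[NUM_INPUTS]     = { +1, -1, +1, +1, -1, -1, +1, -1 };
        printf("decision: %d\n", perceptron_predict(w, x));
        return 0;
    }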

Perceptron Learning
Inputs x_j, outputs y_i, and targets t_i are in {-1, +1}.
Cycle through the training set:
  if X_i = (x_1, x_2, …, x_d) is misclassified, then
    w_j ← w_j + a * t_i * x_j   (for each weight j)
  end if
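Continuing the perceptron sketch above, the slide's update rule might look like this in code (the learning rate a and the reuse of perceptron_predict are assumptions):

    /* One perceptron learning step, following the slide's rule:
     * if the example is misclassified, set w_j <- w_j + a * t * x_j for every weight.
     * x[] and the target t are in {-1, +1}; a is the learning rate. */
    void perceptron_train_step(float w[NUM_INPUTS + 1], const int x[NUM_INPUTS],
                               int t, float a)
    {
        int y = perceptron_predict(w, x);     /* current decision */
        if (y != t) {                         /* misclassified: apply the update */
            w[0] += a * (float)t;             /* bias weight sees a constant +1 input */
            for (int j = 0; j < NUM_INPUTS; j++)
                w[j + 1] += a * (float)t * (float)x[j];
        }
    }

Cycling this step over the training set implements the loop described on the slide.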

Feed-Forward Network
A network of perceptrons…
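A minimal sketch of such a network with one hidden layer (the sizes and the tanh activation are illustrative choices, not taken from the slides):

    #include <math.h>

    #define N_IN      8
    #define N_HIDDEN  4

    /* One hidden layer of perceptron-like units feeding a single output unit.
     * Each row of w_hidden holds a bias (index 0) followed by N_IN input weights. */
    float ffnet_forward(const float w_hidden[N_HIDDEN][N_IN + 1],
                        const float w_out[N_HIDDEN + 1],
                        const float x[N_IN])
    {
        float h[N_HIDDEN];
        for (int j = 0; j < N_HIDDEN; j++) {
            float s = w_hidden[j][0];             /* hidden-unit bias */
            for (int i = 0; i < N_IN; i++)
                s += w_hidden[j][i + 1] * x[i];
            h[j] = tanhf(s);                      /* hidden activation */
        }
        float out = w_out[0];                     /* output-unit bias */
        for (int j = 0; j < N_HIDDEN; j++)
            out += w_out[j + 1] * h[j];
        return tanhf(out);                        /* network output */
    }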

Feed-Forward Network Learning
Use a gradient-descent algorithm.
Network output is: [equation on slide]
Error is: [equation on slide]
Derivatives of the error are: [equation on slide]
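The three equations on this slide were figures. Assuming the usual setup (a single output unit with activation f, hidden activations h_j, and squared error), the standard forms would be:

    y = f(\sum_j v_j\, h_j), \qquad
    E = \frac{1}{2}(t - y)^2, \qquad
    \frac{\partial E}{\partial v_j} = -(t - y)\, f'(\sum_k v_k h_k)\, h_j

Gradient descent then moves each weight a small step against its derivative: v_j ← v_j − η ∂E/∂v_j.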

Feed-Forward Networks: Backprop
But no error is defined for the hidden units. Solution: assign responsibility for the output units' error to each hidden unit, then descend the gradient. This is called "back-propagation".
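In the standard back-propagation rule (the notation here is conventional, not taken from the slides), the responsibility assigned to hidden unit j is its delta, computed from the deltas of the units it feeds:

    \delta_k = (t_k - y_k)\, f'(\mathrm{net}_k)                \quad \text{(output units)}
    \delta_j = f'(\mathrm{net}_j)\, \sum_k w_{kj}\, \delta_k   \quad \text{(hidden units)}
    \Delta w_{ji} = \eta\, \delta_j\, x_i                      \quad \text{(gradient-descent step)}

where net denotes a unit's weighted input sum and x_i is the input arriving on weight w_{ji}.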

Recurrent Networks
Now the network has state…

Learning weights for an RNN
Unroll it and use back-propagation? No! Too slow, and wrong…

Use Real-Time Recurrent Learning (RTRL)
Keep a list: at each time T,
  – for each unit u,
  – for each weight w,
  – keep the partial derivative ∂u/∂w.
Update it with a recurrence relation: [equation on slide]
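The recurrence itself was a figure on the slide; the standard Williams–Zipser RTRL recurrence for the sensitivities p^k_{ij}(t) = ∂y_k(t)/∂w_{ij} is (stated here from the general RTRL literature, not from the slide):

    p^k_{ij}(t+1) = f'(s_k(t+1)) \Big[ \sum_l w_{kl}\, p^l_{ij}(t) + \delta_{ik}\, z_j(t) \Big],
    \qquad
    \Delta w_{ij}(t) = \eta \sum_k e_k(t)\, p^k_{ij}(t)

where s_k is unit k's net input, z_j(t) is the j-th signal feeding the units at time t (an external input or a unit output from the previous step), δ_{ik} is the Kronecker delta, and e_k(t) is the error at output unit k. Because every unit keeps a sensitivity for every weight, the storage and per-step work grow quickly, which is why mapping this onto a fixed hardware budget is the next concern.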

But on hardware?
Idea: represent real numbers in [-4, +4] with integers in [-4096, +4096] (a fixed-point scale factor of 1024).
Adding is OK:
  – 1024*i + 1024*j = (i + j)*1024
Multiplying requires a divide (a shift):
  – (1024*i) * (1024*j) = (i*j)*1024^2
Compute the activation function by looking it up in a discretized table.
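A minimal sketch of that scheme, assuming the scale factor is 1024 (2^10) so that reals in [-4, +4] map to integers in [-4096, +4096]; the 10-bit shift, the table size, and the tanh activation are assumptions rather than details from the slides:

    #include <math.h>
    #include <stdint.h>

    #define SCALE       1024            /* fixed-point scale factor (2^10) */
    #define SHIFT       10              /* log2(SCALE) */
    #define TABLE_SIZE  8192            /* one entry per representable input in [-4096, +4095] */

    typedef int32_t fixed_t;            /* a real r is stored as (fixed_t)(r * SCALE) */

    /* Addition needs no correction: 1024*i + 1024*j = (i + j)*1024. */
    static fixed_t fx_add(fixed_t a, fixed_t b) { return a + b; }

    /* Multiplication picks up an extra factor of SCALE, so shift it back out:
     * (1024*i) * (1024*j) = (i*j)*1024^2; dividing by 1024 is a right shift. */
    static fixed_t fx_mul(fixed_t a, fixed_t b)
    {
        return (fixed_t)(((int64_t)a * (int64_t)b) >> SHIFT);  /* arithmetic shift assumed */
    }

    /* Activation by lookup in a discretized table covering [-4, +4]. */
    static fixed_t activation_table[TABLE_SIZE];

    static void init_activation_table(void)
    {
        for (int i = 0; i < TABLE_SIZE; i++) {
            double x = (double)(i - TABLE_SIZE / 2) / SCALE;   /* back to a real in [-4, 4) */
            activation_table[i] = (fixed_t)(tanh(x) * SCALE);  /* tanh chosen for illustration */
        }
    }

    static fixed_t fx_activation(fixed_t x)
    {
        int idx = (int)x + TABLE_SIZE / 2;              /* map [-4096, 4095] to [0, 8191] */
        if (idx < 0) idx = 0;
        if (idx >= TABLE_SIZE) idx = TABLE_SIZE - 1;    /* clamp out-of-range sums */
        return activation_table[idx];
    }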

Results, different numbers of hidden units

Results, different latencies

Results, different HW budgets (crafty)

Results, different HW budgets (BZIP-PROGRAM)

Conclusions DON’T use a RNN! Maybe use a NNet with a few hidden units, but don’t over do it Future work: explore trade-off between –Number, size (hidden units), inputs