Neural Networks II CMPUT 466/551 Nilanjan Ray

Outline
– Radial basis function network
– Bayesian neural network

Radial Basis Function Network
Output: \( f(x) = w_0 + \sum_{m=1}^{M} w_m h_m(x) \)
Basis function: \( h_m(x) = \exp\!\left( -\frac{\lVert x - \mu_m \rVert^2}{2\sigma_m^2} \right) \)
Or, with a full covariance: \( h_m(x) = \exp\!\left( -\tfrac{1}{2} (x - \mu_m)^T \Sigma_m^{-1} (x - \mu_m) \right) \)
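A minimal sketch of this forward pass in NumPy, assuming Gaussian basis functions with a single shared width; the array shapes and function name are illustrative, not from the slides.

```python
import numpy as np

def rbf_forward(X, centers, sigma, weights, bias):
    """Output of a Gaussian RBF network: f(x) = w0 + sum_m w_m * h_m(x).

    X       : (N, d) input points
    centers : (M, d) basis-function centers (the mu_m)
    sigma   : scalar width shared by all basis functions
    weights : (M,) output weights w_m
    bias    : scalar w0
    """
    # Squared distances between every input and every center: (N, M)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    H = np.exp(-d2 / (2.0 * sigma ** 2))   # basis-function activations h_m(x)
    return bias + H @ weights              # linear combination of the activations

# Tiny usage example with random parameters
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
centers = rng.normal(size=(3, 2))
print(rbf_forward(X, centers, sigma=1.0, weights=rng.normal(size=3), bias=0.1))
```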

MLP and RBFN: comparison figure (taken from Bishop)

Learning an RBF Network
Parameters of an RBF network:
– Basis function parameters: the μ's, σ's, or Σ's
– Weights of the network
Learning proceeds in two distinct steps:
– Basis function parameters are learned first
– Next, the network weights are learned

Learning RBF Network Weights
Training set: \((x_i, t_i)\), \(i = 1, 2, \ldots, N\)
RBFN output: \( f(x_i) = \sum_{m=0}^{M} w_m h_m(x_i) = h(x_i)^T w \), or in matrix form \( f = H w \) with \( H_{im} = h_m(x_i) \)
Squared error: \( E(w) = \lVert t - H w \rVert^2 \)
Differentiating and setting the gradient to zero gives the pseudo-inverse solution: \( w = (H^T H)^{-1} H^T t \)
(For matrix differentiation, see a standard matrix-calculus reference.)
So, that's easy!
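A small sketch of this pseudo-inverse fit, assuming the Gaussian basis functions above and using np.linalg.lstsq rather than forming \((H^T H)^{-1}\) explicitly; the helper names are illustrative.

```python
import numpy as np

def design_matrix(X, centers, sigma):
    """Rows are [1, h_1(x), ..., h_M(x)], so the bias is learned with the weights."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    H = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), H])

def fit_rbf_weights(X, t, centers, sigma):
    """Minimize ||t - H w||^2; the minimizer is the pseudo-inverse solution w = (H^T H)^{-1} H^T t."""
    H = design_matrix(X, centers, sigma)
    w, *_ = np.linalg.lstsq(H, t, rcond=None)   # numerically stable least-squares solve
    return w

# Example: once the basis functions have been chosen,
# w = fit_rbf_weights(X_train, t_train, centers, sigma)
```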

Learning Basis Function Parameters
Several unsupervised methods are available (a sketch of the clustering option follows this list):
– Subsets of data points: set the basis function centers (the μ's) to randomly chosen data points; set the σ's equal, to some multiple of the average distance between centers
– Orthogonal least squares: a principled way to choose a subset of data points ("Orthogonal least squares learning algorithm for radial basis function networks," Chen, Cowan, and Grant)
– Clustering: k-means, mean shift, etc.
– Gaussian mixture model: fit with the expectation-maximization technique
Supervised technique: form the squared error and differentiate with respect to the μ's and σ's; then use gradient descent.
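A hedged sketch of the clustering option: centers from k-means and a shared width set to a multiple of the average inter-center distance, as the slide suggests. It assumes scikit-learn and SciPy are available; the function name and default arguments are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.spatial.distance import pdist

def choose_basis_functions(X, n_centers=10, width_multiple=1.0, seed=0):
    """Unsupervised step: pick centers by k-means, then set a shared width
    from the average distance between the chosen centers."""
    km = KMeans(n_clusters=n_centers, n_init=10, random_state=seed).fit(X)
    centers = km.cluster_centers_
    sigma = width_multiple * pdist(centers).mean()   # multiple of the average inter-center distance
    return centers, sigma

# These centers and sigma can then be passed to the weight-fitting step above.
```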

MLP vs. RBFN
– MLP: global hyperplanes | RBFN: local modeling
– MLP: back-propagation training | RBFN: subset choice + LMS
– MLP: shorter online computation time | RBFN: typically longer online computation time
– MLP: longer learning time | RBFN: shorter learning time
The recent research trend is more toward MLPs than RBFNs.

Bayesian NN: Basics
Neal, R. M. (1992), "Bayesian training of backpropagation networks by the hybrid Monte Carlo method," Technical Report CRG-TR-92-1, Dept. of Computer Science, University of Toronto.
Consider a neural network with output f and weights w, and let \((x_i, y_i)\), \(i = 1, 2, \ldots, N\) be the training set.
Then, for a new input \(x_{new}\), the output can be thought of as an expectation over the posterior probability of the weights w:
\( \hat{f}(x_{new}) = \int f(x_{new}; w)\, \Pr(w \mid \text{data})\, dw \)
How do we get Pr(w | data)? How do we carry out this integration?

Posterior Probability of Weights
An example posterior: \( \Pr(w \mid \text{data}) \propto \exp\!\big( -E(w) \big) \), where \( E(w) = E_D(w) + E_W(w) \) is the sum of a data (squared-error) term and a weight-decay term.
Note that Pr(w | data) is highly peaked, with the peaks located at the local minima of E(w).
One such peak can be obtained by, say, error back-propagation (EBP) training of the network.
So, in principle, the previous expectation can overcome at least two things:
(1) the local-minimum problem of, say, EBP, and, more importantly,
(2) it can reduce the effect of overfitting that typically occurs in EBP, even with weight decay.
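A small helper computing the unnormalized log posterior described above, i.e. minus the sum of the data term and the weight-decay term. The precision hyperparameters alpha and beta, and the generic predict(X, w) callback, are assumptions added for illustration.

```python
import numpy as np

def log_posterior(w, X, t, predict, alpha=1.0, beta=1.0):
    """Unnormalized log posterior: -(beta * E_D(w) + alpha * E_W(w)).

    predict(X, w) is any function giving network outputs for weight vector w;
    alpha and beta are assumed fixed weight-decay and noise precisions."""
    E_D = 0.5 * np.sum((t - predict(X, w)) ** 2)   # data (squared-error) term
    E_W = 0.5 * np.sum(w ** 2)                     # weight-decay term
    return -(beta * E_D + alpha * E_W)
```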

How To Compute The Expectation?
Typically, computing the integral analytically is impossible. An approximation can be obtained by the Monte Carlo method: generate samples \(w^{(k)}\), \(k = 1, \ldots, K\), from the posterior distribution Pr(w | data) and take the average:
\( \hat{f}(x_{new}) \approx \frac{1}{K} \sum_{k=1}^{K} f(x_{new}; w^{(k)}) \)
Well, of course, the next question is how to efficiently generate samples from Pr(w | data). This is precisely where the challenge, and the art, of Bayesian neural networks lies.
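A tiny sketch of the Monte Carlo average, assuming the weight samples have already been drawn and that predict(x, w) evaluates the network for a given weight vector; both names are illustrative.

```python
import numpy as np

def predictive_mean(x_new, weight_samples, predict):
    """Approximate E[f(x_new)] by averaging the network output over
    posterior weight samples w^(1), ..., w^(K)."""
    return np.mean([predict(x_new, w) for w in weight_samples], axis=0)
```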

Efficiently Generating Samples
For a complex network with 2-3 hidden layers and many hidden nodes, one almost always has to resort to a Markov chain Monte Carlo (MCMC) method. Even then, designing an MCMC sampler is quite an art.
Neal considers a hybrid (Hamiltonian) MCMC, where the gradient of E(w) is used to make sampling efficient (a minimal sketch of this idea follows below).
Another advantage is that automatic relevance determination (ARD) can be used within MCMC to suppress irrelevant inputs; this is very effective for high-dimensional problems.
Neal, R. M. (1992), "Bayesian training of backpropagation networks by the hybrid Monte Carlo method," Technical Report CRG-TR-92-1, Dept. of Computer Science, University of Toronto.
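A minimal, textbook-style sketch of one way such a hybrid (Hamiltonian) Monte Carlo sampler could look, not Neal's actual implementation: it assumes the user supplies log_post(w) and grad_log_post(w) (e.g. the gradient obtained from back-propagation), and the step size and leapfrog count are purely illustrative.

```python
import numpy as np

def hmc_sample(w0, log_post, grad_log_post, n_samples=1000,
               step_size=0.01, n_leapfrog=20, seed=0):
    """Very small hybrid/Hamiltonian Monte Carlo sampler over a weight vector w.

    log_post(w) and grad_log_post(w) are the unnormalized log posterior and its
    gradient with respect to the flat weight vector w."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    samples = []
    for _ in range(n_samples):
        p = rng.normal(size=w.shape)                   # resample momentum
        w_new, p_new = w.copy(), p.copy()
        # Leapfrog integration of the Hamiltonian dynamics
        p_new += 0.5 * step_size * grad_log_post(w_new)
        for _ in range(n_leapfrog - 1):
            w_new += step_size * p_new
            p_new += step_size * grad_log_post(w_new)
        w_new += step_size * p_new
        p_new += 0.5 * step_size * grad_log_post(w_new)
        # Metropolis accept/reject based on the change in total "energy"
        current = log_post(w) - 0.5 * p @ p
        proposed = log_post(w_new) - 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < proposed - current:
            w = w_new
        samples.append(w.copy())
    return np.array(samples)
```

The returned samples can be fed directly to the predictive_mean helper above.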

Hmm… Is There Any Success Story With BNN?
Winner of the NIPS 2003 feature-selection competition! Input sizes for the 5 problems were 500, 5,000, 10,000, 20,000, and 100,000.
For the nitty-gritty, see Neal, R. M. and Zhang, J. (2006), "High dimensional classification with Bayesian neural networks and Dirichlet diffusion trees," in I. Guyon, S. Gunn, M. Nikravesh, and L. A. Zadeh (eds.), Feature Extraction: Foundations and Applications, Studies in Fuzziness and Soft Computing, Volume 207, Springer.

Related Neural Network Techniques
– A BNN is essentially a collection (an average) of neural networks.
– Similarly, you can think of "bagged" neural networks (a minimal sketch follows below). An aside: how is bagging different from a BNN?
– Boosted neural networks, etc.; typically, care should be taken to make each neural network a weak learner with a limited architecture.
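For contrast with BNN averaging, a hedged sketch of bagged neural networks: each small network is trained on a bootstrap resample of the data and the predictions are averaged. It assumes scikit-learn's MLPRegressor is available; the architecture and iteration counts are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def bagged_nn_predict(X_train, y_train, X_test, n_models=10, seed=0):
    """Bagging: train each small network on a bootstrap resample and average
    the predictions (a BNN instead averages over the weight posterior)."""
    rng = np.random.default_rng(seed)
    preds = []
    for m in range(n_models):
        idx = rng.integers(0, len(X_train), size=len(X_train))   # bootstrap resample
        net = MLPRegressor(hidden_layer_sizes=(5,), max_iter=2000,
                           random_state=m).fit(X_train[idx], y_train[idx])
        preds.append(net.predict(X_test))
    return np.mean(preds, axis=0)
```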

Some Interesting Features of BNN
– Does not use cross-validation, so the entire training data set can be used for learning
– Flexible design: can average neural networks with different architectures!
– Can work with active learning, i.e., determining which data are relevant
– Noisy and irrelevant inputs can be discarded by ARD