Varieties of Helmholtz Machine Peter Dayan and Geoffrey E. Hinton, Neural Networks, Vol. 9, No. 8, pp.1385-1403, 1996.

Presentation transcript:


Helmholtz Machines Hierarchical compression schemes should reveal the true hidden causes of the sensory data, and this should facilitate subsequent supervised learning. –Unsupervised learning is straightforward to apply because it uses unlabelled data.

Density Estimation with Hidden States The goal is maximum likelihood estimation: choose the parameters that maximize the log-likelihood of the observed data vectors d.
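A sketch of the objective the slide refers to, writing α for a configuration of the hidden units and θ for the generative parameters (the notation is assumed, following the conventions of the Helmholtz machine papers):

```latex
\mathcal{L}(\theta) \;=\; \sum_{d} \log p(d \mid \theta)
  \;=\; \sum_{d} \log \sum_{\alpha} p(\alpha \mid \theta)\, p(d \mid \alpha, \theta)
```

The sum over hidden configurations α is what makes exact maximum likelihood intractable and motivates a separate recognition model.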

The Helmholtz Machine The top-down weights –the parameters θ of the generative model –a unidirectional Bayesian network –factorial within each layer The bottom-up weights –the parameters φ of the recognition model –another unidirectional Bayesian network
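A minimal sketch of such a machine in code, assuming binary stochastic units with logistic activations; the layer sizes, initialization, and the uniform top-layer prior are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_layer(inputs, W, b):
    """Factorial layer of binary stochastic units: each unit fires
    independently with probability sigmoid(inputs . W + b)."""
    p = sigmoid(inputs @ W + b)
    return (rng.random(p.shape) < p).astype(float)

# Illustrative layer sizes: top hidden, lower hidden, visible.
sizes = [4, 8, 16]

# Generative (top-down) parameters theta and recognition (bottom-up) parameters phi.
gen_W = [rng.normal(0, 0.1, (sizes[i], sizes[i + 1])) for i in range(2)]
gen_b = [np.zeros(sizes[i + 1]) for i in range(2)]
rec_W = [rng.normal(0, 0.1, (sizes[i + 1], sizes[i])) for i in range(2)]
rec_b = [np.zeros(sizes[i]) for i in range(2)]

def generate():
    """Top-down pass: sample a 'fantasy' from the generative model."""
    states = [(rng.random(sizes[0]) < 0.5).astype(float)]  # assumed uniform top-layer prior
    for W, b in zip(gen_W, gen_b):
        states.append(sample_layer(states[-1], W, b))
    return states            # states[-1] is the fantasy data vector

def recognize(d):
    """Bottom-up pass: sample hidden explanations of data d from the recognition model."""
    states = [d]
    for W, b in zip(reversed(rec_W), reversed(rec_b)):
        states.append(sample_layer(states[-1], W, b))
    return states            # states[-1] is the top hidden layer
```

Both passes are purely feedforward and factorial within each layer, which is what makes sampling cheap in either direction.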

Another view of HM Autoencoders –the recognition model: the coding operation that turns inputs d into stochastic codes in the hidden layers –the generative model: reconstructs its best guess of the input on the basis of the code that it sees Maximizing the likelihood of the data can be interpreted as minimizing the total number of bits it takes to send the data from the sender to the receiver
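The coding-cost reading can be written as a bound; a sketch, assuming q(α | d, φ) is the recognition distribution and p(α, d | θ) the generative joint (notation follows the Helmholtz machine papers):

```latex
-\log p(d \mid \theta) \;\le\; F(d; \theta, \phi)
  \;=\; \sum_{\alpha} q(\alpha \mid d, \phi)
        \bigl[ \log q(\alpha \mid d, \phi) - \log p(\alpha, d \mid \theta) \bigr]
```

Minimizing F with respect to both θ and φ lowers the expected message length (the code for α plus the code for d given α) and thereby raises a lower bound on the log-likelihood.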

The deterministic HM - Dayan et al (Neural Computation) An approximation inspired by mean-field methods: the stochastic binary activities in the recognition model are replaced by their deterministic mean values (firing probabilities). Advantage –powerful optimization methods can be applied Disadvantage –the recognition distribution is captured incorrectly
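A sketch of the substitution, assuming binary units with logistic activation (the indices and weights here are generic, not the paper's notation):

```latex
s_j \sim \mathrm{Bernoulli}\!\left( \sigma\!\Bigl( \textstyle\sum_i w_{ij}\, s_i \Bigr) \right)
\quad\longrightarrow\quad
m_j = \sigma\!\Bigl( \textstyle\sum_i w_{ij}\, m_i \Bigr),
\qquad \sigma(x) = \frac{1}{1 + e^{-x}}
```

Because each m_j is a deterministic function of the layer below, gradients can be propagated through the recognition pass, but correlations between units in the true recognition distribution are lost.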

The stochastic HM - Hinton et al (Science) Captures the correlations between the activities in different hidden layers. Trained with the wake-sleep algorithm.
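A minimal one-hidden-layer sketch of the wake-sleep updates, assuming binary logistic units and the purely local delta rule of Hinton et al. (1995); the layer sizes, learning rate, and variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
sample = lambda p: (rng.random(p.shape) < p).astype(float)

n_vis, n_hid, lr = 16, 8, 0.05
R = rng.normal(0, 0.1, (n_vis, n_hid))   # recognition (bottom-up) weights, "phi"
G = rng.normal(0, 0.1, (n_hid, n_vis))   # generative (top-down) weights, "theta"
g_bias = np.zeros(n_hid)                 # generative bias = prior over the hidden layer

def wake_sleep_step(d):
    """One wake-sleep update for a single binary data vector d."""
    global R, G, g_bias
    # Wake phase: recognize the data, then train the GENERATIVE weights
    # so that the sampled explanation reconstructs d (local delta rule).
    h = sample(sigmoid(d @ R))
    G += lr * np.outer(h, d - sigmoid(h @ G))
    g_bias += lr * (h - sigmoid(g_bias))
    # Sleep phase: fantasize from the generative model, then train the
    # RECOGNITION weights to recover the causes of the fantasy.
    h_f = sample(sigmoid(g_bias))
    d_f = sample(sigmoid(h_f @ G))
    R += lr * np.outer(d_f, h_f - sigmoid(d_f @ R))
```

Both updates are simple delta rules driven only by locally available activities, which is what makes the algorithm attractive as a learning rule; the sleep phase, however, does not follow the exact gradient of the recognition objective, which is the issue several of the later variants address.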

Variants of the HM –unit activation functions –reinforcement learning –alternative recognition models –supervised HMs –modeling temporal structure

Unit Activation Function The wake-sleep algorithm is particularly convenient for changing the activation functions.

The Reinforcement Learning HM This is the only method that correctly optimizes the recognition weights, but it can make learning very slow.
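One way to read this: the recognition gradient can be estimated with a REINFORCE-style score-function estimator, which is unbiased but has high variance when driven by single samples. A sketch, with C(α, d) the description length of a sampled explanation (the notation is assumed, not the paper's):

```latex
\nabla_{\phi}\, \mathbb{E}_{q(\alpha \mid d, \phi)}\!\bigl[ C(\alpha, d) \bigr]
  \;=\; \mathbb{E}_{q(\alpha \mid d, \phi)}\!\bigl[ C(\alpha, d)\,
        \nabla_{\phi} \log q(\alpha \mid d, \phi) \bigr],
\qquad
C(\alpha, d) = \log q(\alpha \mid d, \phi) - \log p(\alpha, d \mid \theta)
```

The estimator is correct in expectation, but its variance is what makes learning very slow in practice.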

Alternative Recognition Models Recurrent recognition –sophisticated mean-field methods –used within an EM algorithm –only the generative weights are required –but the results are poor

Alternative Recognition Models Dangling units –handle the XOR problem (explaining-away problem) –no modification of the wake-sleep algorithm is required

Alternative Recognition Models Other sampling methods –Gibbs sampling –Metropolis algorithm
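A minimal sketch of what Gibbs sampling in place of a learned recognition model could look like for a one-hidden-layer sigmoid belief net; the parameter names (b for hidden prior biases, G for generative weights, c for visible biases) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def log_joint(h, d, b, G, c):
    """log p(h, d) for a one-hidden-layer sigmoid belief net:
    factorial prior p(h_j=1)=sigmoid(b_j), likelihood p(d_i=1|h)=sigmoid(h.G + c)."""
    prior = np.sum(h * np.log(sigmoid(b)) + (1 - h) * np.log(sigmoid(-b)))
    a = h @ G + c
    lik = np.sum(d * np.log(sigmoid(a)) + (1 - d) * np.log(sigmoid(-a)))
    return prior + lik

def gibbs_sweep(h, d, b, G, c):
    """One Gibbs sweep over the hidden units: resample each h_j from its
    exact conditional given the other hidden units and the data."""
    for j in range(len(h)):
        h1, h0 = h.copy(), h.copy()
        h1[j], h0[j] = 1.0, 0.0
        delta = log_joint(h1, d, b, G, c) - log_joint(h0, d, b, G, c)
        h[j] = float(rng.random() < sigmoid(delta))
    return h
```

Each sweep resamples every hidden unit from its exact conditional, so repeated sweeps converge to the true posterior over hidden states, at the cost of many passes per data vector.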

Alternative Recognition Models The Lateral HM –recurrent (lateral) weights within the hidden layer –in the recognition model only –recurrent connections in the generative pathway of the HM → Boltzmann machine

Alternative Recognition Models The Lateral HM –during the wake phase, stochastic Gibbs sampling is used –during the sleep phase, the generative weights are updated; samples are produced by the generative weights and the lateral weights

Alternative Recognition Models The Lateral HM –Boltzmann machine learning methods can be used –for the recognition model, the required statistics are calculated and the weights are learned with Boltzmann machine methods

Supervised HMs Supervised learning → model p(d | e) –e: input, d: output The first model –not a good architecture

Supervised HMs The Side-Information HM –e is supplied as extra input to both the recognition and the generative pathways during learning –the standard wake-sleep algorithm can be used

Supervised HMs The Clipped HM –generates samples over d –the standard wake-sleep algorithm is used to train the e pathway –the extra generative connections to d are trained during wake phases once the weights for e have converged

Supervised HMs The Inverse HM –Takes direct advantage of the capacity of the recognition model in the HM to learn inverse distributions –After learning, the units above d can be discarded

The Helmholtz Machine Through Time (HMTT) The wake-sleep algorithm is used for training.