Document Classification using Deep Belief Nets, Lawrence McAfee, 6/9/08, CS224n, Spring '08.

Overview
- Corpus: Wikipedia XML Corpus
- Single-labeled data: each document falls under a single category
- Binary feature vectors: bag-of-words, where '1' indicates the word occurred one or more times in the document (see the featurization sketch below)
[Figure: documents Doc#1, Doc#2, Doc#3 are fed to the classifier, which assigns each a category: Food, Brazil, Presidents]
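A minimal sketch of the binary bag-of-words featurization described on this slide. The toy documents and vocabulary are invented for illustration; the actual experiments used the Wikipedia XML Corpus.

```python
import numpy as np

def binary_bow(documents, vocabulary):
    """Return a |docs| x |vocab| 0/1 matrix: 1 if the word occurs at least once."""
    index = {word: i for i, word in enumerate(vocabulary)}
    features = np.zeros((len(documents), len(vocabulary)), dtype=np.int8)
    for d, doc in enumerate(documents):
        for word in doc.lower().split():
            if word in index:
                features[d, index[word]] = 1   # presence only; counts are discarded
    return features

# Toy data, purely for illustration.
docs = ["brazil exports coffee", "the president of brazil", "coffee and food"]
vocab = ["brazil", "coffee", "president", "food"]
print(binary_bow(docs, vocab))
```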

Background on Deep Belief Nets
- An RBM is an unsupervised, clustering training algorithm
- RBMs are stacked and trained on the training data: RBM 1 learns features/basis vectors for the training data, RBM 2 learns higher-level features from those, and RBM 3 learns very abstract features (see the layer-wise sketch below)
[Figure: training data feeding a stack of RBM 1, RBM 2, RBM 3, with increasingly abstract features at each level]
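A rough sketch of the greedy layer-wise pre-training pipeline shown on this slide, using scikit-learn's BernoulliRBM as a stand-in building block. The original work used its own RBM implementation, and the layer sizes and hyperparameters here are placeholders.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Random binary bag-of-words stand-in for the real corpus.
X = np.random.randint(0, 2, size=(100, 500)).astype(float)

layer_sizes = [256, 128, 64]          # hidden units per RBM, chosen arbitrarily
rbms, layer_input = [], X
for n_hidden in layer_sizes:
    rbm = BernoulliRBM(n_components=n_hidden, learning_rate=0.05, n_iter=10)
    rbm.fit(layer_input)                        # train this layer; lower layers stay frozen
    layer_input = rbm.transform(layer_input)    # hidden activations feed the next RBM
    rbms.append(rbm)

abstract_features = layer_input       # the "very abstract features" from the top RBM
```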

Inside an RBM
- The goal in training an RBM is to minimize the energy of configurations (v,h) corresponding to the input data
- The RBM is trained by repeatedly sampling hidden and visible units for a given data input (see the sketch below)
[Figure: bipartite graph of visible unit i connected to hidden unit j; energy plotted over configurations (v,h), with low-energy wells at input/training data such as "Golf" and "Cycling"]
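A hedged sketch of the machinery this slide refers to: the energy of a configuration (v,h) and a contrastive-divergence-style update, which is one common way of "repeatedly sampling hidden and visible units" to lower the energy of data configurations. Shapes, learning rate, and initialization are illustrative only, not the author's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 4, 0.1
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h):
    # E(v, h) = -v.b_v - h.b_h - v^T W h; training lowers this for data vectors
    return -v @ b_v - h @ b_h - v @ W @ h

def cd1_update(v0):
    """One CD-1 step: sample hidden given data, reconstruct visible, resample hidden."""
    global W, b_v, b_h
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(n_hidden) < p_h0).astype(float)
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(n_visible) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
    b_v += lr * (v0 - v1)
    b_h += lr * (p_h0 - p_h1)

v = np.array([1, 0, 1, 1, 0, 0], dtype=float)   # toy binary input vector
h = (rng.random(n_hidden) < sigmoid(v @ W + b_h)).astype(float)
print("energy before training:", energy(v, h))
for _ in range(100):
    cd1_update(v)
print("energy after training:", energy(v, sigmoid(v @ W + b_h).round()))
```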

Depth
- The binary representation does not capture word frequency information
- As a result, inaccurate features are learned at each level of the DBN

Training Iterations
- Accuracy increases with more training iterations
- Increasing iterations may (partially) make up for learning poor features
[Figure: energy plotted over configurations (v,h) for inputs "Lions" and "Tigers", shown before and after additional training iterations]

Comparison to SVM, NB
- Binary features do not provide a good starting point for learning higher-level features
- Binary features are still useful: 22% accuracy is better than random over 30 categories
- Training time: DBN 2 h 13 min; SVM 4 s; NB 3 s
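For context, an illustrative sketch of SVM and Naive Bayes baselines of the kind compared here, using scikit-learn stand-ins on random binary features. The data below is synthetic; the accuracy and timing numbers on the slide come from the original experiments, not from this snippet.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import BernoulliNB

X = np.random.randint(0, 2, size=(300, 500))   # binary bag-of-words stand-in
y = np.random.randint(0, 30, size=300)         # 30 categories, as on the slide

for name, clf in [("SVM", LinearSVC()), ("NB", BernoulliNB())]:
    clf.fit(X, y)
    print(name, "training accuracy:", clf.score(X, y))
```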

Lowercasing
- Supposedly richer vocabulary when lowercasing
- Overfitting: we don't need these extra words
- Other experiments show only the top 500 words are relevant

Suggestions for Improvement
- Use appropriate continuous-valued neurons (linear or Gaussian neurons)
  - Slower to train
  - Not much documentation on using continuous-valued neurons with RBMs
- Implement backpropagation to fine-tune weights and biases (see the sketch below)
  - Propagate error derivatives from the top-level RBM back to the inputs
  - Unsupervised training gives good initial weights, while backpropagation slightly modifies the weights/biases
  - Backpropagation cannot be used alone, as it tends to get stuck in local optima
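A hedged sketch of the suggested fine-tuning step: unroll pre-trained weights into a feedforward classifier and adjust them with backpropagation on labeled data. Everything here (the random "pre-trained" initialization, shapes, learning rate, and the single hidden layer) is a simplification for illustration, not the proposed implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(200, 500)).astype(float)   # binary bag-of-words stand-in
y = rng.integers(0, 30, size=200)                        # 30 categories

# Pretend these weights came from unsupervised RBM pre-training (random here).
W1, b1 = 0.01 * rng.standard_normal((500, 64)), np.zeros(64)
W2, b2 = 0.01 * rng.standard_normal((64, 30)), np.zeros(30)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.1
for _ in range(50):                                  # backpropagation fine-tuning loop
    h = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))         # hidden activations
    p = softmax(h @ W2 + b2)                         # class probabilities
    p[np.arange(len(y)), y] -= 1.0                   # dL/dz for cross-entropy loss
    p /= len(y)
    dW2, db2 = h.T @ p, p.sum(axis=0)
    dh = (p @ W2.T) * h * (1 - h)                    # propagate error derivatives back
    dW1, db1 = X.T @ dh, dh.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1                   # slightly modify pre-trained weights
    W2 -= lr * dW2; b2 -= lr * db2
```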