Document Classification using Deep Belief Nets
Lawrence McAfee
6/9/08, CS224n, Spring '08
Overview
- Corpus: Wikipedia XML Corpus
- Single-labeled data: each document falls under a single category
- Binary feature vectors: bag-of-words, where a '1' indicates the word occurred one or more times in the document
[Figure: documents (Doc#1, Doc#2, Doc#3) fed into the classifier, which assigns categories: Doc#1 → Food, Doc#2 → Brazil, Doc#3 → Presidents]
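The slides do not show the feature-extraction code; below is a minimal sketch of building the binary bag-of-words vectors described above, assuming scikit-learn's CountVectorizer. The toolkit and the document strings are illustrative, not from the original project.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Illustrative documents standing in for the Wikipedia XML Corpus.
docs = [
    "Feijoada is a traditional food in Brazil.",
    "Brazil hosted many presidents over the years.",
    "The president signed the bill.",
]

# binary=True clips counts to {0, 1}: a '1' means the word occurred
# one or more times in the document, matching the slide's description.
vectorizer = CountVectorizer(lowercase=True, binary=True)
X = vectorizer.fit_transform(docs)  # sparse (n_docs x vocab_size) 0/1 matrix

print(X.toarray())
print(vectorizer.get_feature_names_out())
```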
Background on Deep Belief Nets
[Figure: stack of RBMs; the training data feeds RBM 1, whose outputs (features/basis vectors for the training data) feed RBM 2 (higher-level features), whose outputs feed RBM 3 (very abstract features)]
- RBM: unsupervised, clustering training algorithm
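As a concrete illustration of this greedy, layer-wise stacking, here is a minimal sketch using scikit-learn's BernoulliRBM; the layer sizes, hyperparameters, and toy data are illustrative assumptions, not the values used in the project.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

# Toy binary bag-of-words data standing in for the real corpus.
rng = np.random.RandomState(0)
X = (rng.rand(100, 50) > 0.8).astype(np.float64)  # 100 docs, 50-word vocab

# Greedy layer-wise training: each RBM learns features of the
# previous layer's output, becoming more abstract with depth.
layer_sizes = [30, 20, 10]          # illustrative hidden-layer sizes
layers, inputs = [], X
for n_hidden in layer_sizes:
    rbm = BernoulliRBM(n_components=n_hidden, learning_rate=0.05,
                       n_iter=20, random_state=0)
    rbm.fit(inputs)
    inputs = rbm.transform(inputs)  # hidden activations feed the next RBM
    layers.append(rbm)

print(inputs.shape)  # (100, 10): the most abstract feature representation
```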
Inside an RBM
[Figure: bipartite graph of visible units i and hidden units j; energy landscape over configurations (v, h), with low-energy wells at the input/training data (e.g., "Golf", "Cycling")]
- The goal in training an RBM is to minimize the energy of configurations corresponding to the input data
- Train the RBM by repeatedly sampling the hidden and visible units for a given data input
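The slide does not give the energy function or update rule explicitly; below is a minimal sketch of the standard RBM energy and one step of the sampling procedure it describes (contrastive divergence, CD-1). This is the textbook update, not the project's actual implementation; sizes, learning rate, and the training vector are illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def energy(v, h, W, a, b):
    # Standard RBM energy: E(v, h) = -a.v - b.h - v.W.h
    # Training lowers the energy of configurations seen in the data.
    return -a @ v - b @ h - v @ W @ h

# Illustrative sizes: 6 visible units (words), 4 hidden units (features).
n_vis, n_hid = 6, 4
W = 0.01 * rng.randn(n_vis, n_hid)   # weights
a = np.zeros(n_vis)                  # visible biases
b = np.zeros(n_hid)                  # hidden biases
v0 = np.array([1, 0, 1, 0, 0, 1.0])  # one binary training vector

# One CD-1 step: sample hidden units from the data, reconstruct the
# visible units, then resample the hidden units.
p_h0 = sigmoid(b + v0 @ W)
h0 = (rng.rand(n_hid) < p_h0).astype(float)
p_v1 = sigmoid(a + W @ h0)
v1 = (rng.rand(n_vis) < p_v1).astype(float)
p_h1 = sigmoid(b + v1 @ W)

# The update nudges the energy of the data configuration downward.
lr = 0.1
W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
a += lr * (v0 - v1)
b += lr * (p_h0 - p_h1)

print(energy(v0, h0, W, a, b))
```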
Depth
- The binary representation does not capture word-frequency information
- As a result, inaccurate features are learned at each level of the DBN
Training Iterations
- Accuracy increases with more training iterations
- Increasing iterations may (partially) make up for learning poor features
[Figure: two energy landscapes over configurations (v, h), showing the wells for the inputs "Lions" and "Tigers" before and after additional training iterations]
Comparison to SVM, NB
- Binary features do not provide a good starting point for learning higher-level features
- Binary features are still useful, as 22% accuracy is better than random (30 categories)
- Training time: DBN 2 h 13 min; SVM 4 s; NB 3 s
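The slides do not say which implementations were used for the baselines; here is a hedged sketch of SVM and Naive Bayes baselines on binary features, assuming scikit-learn (LinearSVC and BernoulliNB are illustrative stand-ins, and the random toy data replaces the real 30-category corpus).

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy stand-in for the 30-category Wikipedia data: X is a binary
# bag-of-words matrix, y the single category label per document.
rng = np.random.RandomState(0)
X = (rng.rand(300, 500) > 0.9).astype(float)
y = rng.randint(0, 30, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for clf in (LinearSVC(), BernoulliNB()):
    clf.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, clf.predict(X_te))
    print(type(clf).__name__, acc)  # random guessing would score ~1/30
```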
Lowercasing
- Lowercasing supposedly yields a richer vocabulary
- Overfitting: we don't actually need these extra words
- Other experiments show only the top 500 words are relevant
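A minimal sketch of restricting the features to the most frequent words, illustrating the "top 500" observation; the use of CountVectorizer's max_features and the toy documents are assumptions, not the project's code.

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["Brazil Brazil food", "Presidents of Brazil", "Food in Brazil"]

# Keep only the most frequent words; the slide suggests ~500 suffice.
# max_features=500 is the cutoff (the toy docs have far fewer words).
vectorizer = CountVectorizer(lowercase=True, binary=True, max_features=500)
X = vectorizer.fit_transform(docs)
print(len(vectorizer.vocabulary_))
```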
Suggestions for Improvement
- Use appropriate continuous-valued neurons
  - Linear or Gaussian neurons
  - Slower to train
  - Not much documentation on using continuous-valued neurons with RBMs
- Implement backpropagation to fine-tune the weights and biases (see the sketch below)
  - Propagate error derivatives from the top-level RBM back to the inputs
  - Unsupervised training gives good initial weights, while backpropagation slightly modifies the weights/biases
  - Backpropagation cannot be used alone, as it tends to get stuck in local optima
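The slides describe backpropagation fine-tuning only at a high level; below is a minimal sketch of the idea, assuming PyTorch and the stack of BernoulliRBMs (`layers`) plus toy data (`X`, `y`) from the earlier sketches. All names and hyperparameters here are illustrative, not a definitive implementation.

```python
import torch
import torch.nn as nn

def build_finetune_net(layers, n_classes):
    modules = []
    for rbm in layers:
        # rbm.components_ has shape (n_hidden, n_visible); use the
        # pretrained weights and hidden biases to initialize each layer,
        # so backprop only slightly modifies the weights/biases.
        linear = nn.Linear(rbm.components_.shape[1], rbm.components_.shape[0])
        with torch.no_grad():
            linear.weight.copy_(torch.from_numpy(rbm.components_).float())
            linear.bias.copy_(torch.from_numpy(rbm.intercept_hidden_).float())
        modules += [linear, nn.Sigmoid()]
    modules.append(nn.Linear(layers[-1].components_.shape[0], n_classes))
    return nn.Sequential(*modules)

net = build_finetune_net(layers, n_classes=30)
opt = torch.optim.SGD(net.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

Xt = torch.from_numpy(X).float()
yt = torch.from_numpy(y).long()
for _ in range(10):                 # a few epochs of fine-tuning
    opt.zero_grad()
    loss = loss_fn(net(Xt), yt)     # error derivatives flow from the
    loss.backward()                 # top layer back to the inputs
    opt.step()
```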