
Autoencoders Supervised learning uses explicit labels/correct outputs in order to train a network, e.g., classification of images. Unsupervised learning relies on the data only, e.g., CBOW and skip-gram word embeddings: the output is determined implicitly from word order in the input data. The key point is to produce a useful embedding of words; the embedding encodes structure such as word similarity and some relationships. We still need to define a loss, so this is an implicit form of supervision.

Autoencoders Autoencoders are designed to reproduce their input, especially for images. The key point is to reproduce the input from a learned encoding. https://www.edureka.co/blog/autoencoders-tutorial/

Autoencoders Compare PCA/SVD with neural network autoencoders. PCA takes a collection of vectors (images) and produces a usually smaller set of vectors that can be used to approximate the input vectors via linear combination; it is very efficient for certain applications, and Fourier and wavelet compression are similar. Neural network autoencoders, in contrast, can learn nonlinear dependencies, can use convolutional layers, and can use transfer learning. https://www.edureka.co/blog/autoencoders-tutorial/
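To make the comparison concrete, here is a minimal sketch of PCA used as a linear encode/decode pair; the use of scikit-learn and random data is an illustrative assumption, not something from the slides. Projecting onto the principal components plays the role of the encoding, and the linear reconstruction plays the role of the decoding.

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 784)             # e.g. 100 flattened 28x28 images
pca = PCA(n_components=32)               # keep 32 principal components
codes = pca.fit_transform(X)             # "encode": project onto the components
X_approx = pca.inverse_transform(codes)  # "decode": linear combination of components
print(((X - X_approx) ** 2).mean())      # reconstruction error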

Autoencoders: structure Encoder: compresses the input into a latent space of usually smaller dimension, h = f(x). Decoder: reconstructs the input from the latent space, r = g(h) = g(f(x)), with r as close to x as possible. https://towardsdatascience.com/deep-inside-autoencoders-7e41f319999f
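The slide leaves the training objective implicit; a common choice is the squared reconstruction error, minimized over the training examples x:

L(x, g(f(x))) = \lVert x - g(f(x)) \rVert^2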

Autoencoders: applications Denoising: input clean image + noise and train to reproduce the clean image. https://www.edureka.co/blog/autoencoders-tutorial/

Autoencoders: Applications Image colorization: input black-and-white images and train to produce color images. https://www.edureka.co/blog/autoencoders-tutorial/

Autoencoders: Applications Watermark removal https://www.edureka.co/blog/autoencoders-tutorial/

Properties of Autoencoders Data-specific: Autoencoders are only able to compress data similar to what they have been trained on. Lossy: The decompressed outputs will be degraded compared to the original inputs. Learned automatically from examples: It is easy to train specialized instances of the algorithm that will perform well on a specific type of input. https://www.edureka.co/blog/autoencoders-tutorial/

Capacity As with other NNs, overfitting is a problem when capacity is too large for the data. Autoencoders address this through some combination of: a bottleneck layer (fewer degrees of freedom than the space of possible outputs), training to denoise, sparsity through regularization, and a contractive penalty.

Bottleneck layer (undercomplete) Suppose input images are n x n and the latent space has dimension m < n x n. Then the latent space is not sufficient to reproduce all possible images, so the network needs to learn an encoding that captures the important features of the training data, sufficient for approximate reconstruction.

Simple bottleneck layer in Keras

from keras.layers import Input, Dense
from keras.models import Model

encoding_dim = 32                                   # size of the latent space
input_img = Input(shape=(784,))                     # flattened 28x28 image
encoded = Dense(encoding_dim, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
autoencoder = Model(input_img, decoded)

Maps 28x28 images into a 32-dimensional vector. Can also use more layers and/or convolutions. https://blog.keras.io/building-autoencoders-in-keras.html
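One possible way to train this model on MNIST, in the spirit of the same Keras blog post; the data loading, optimizer, and training settings below are illustrative choices rather than part of the slide:

from keras.datasets import mnist

# Load MNIST digits and flatten each 28x28 image to a 784-vector in [0, 1]
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train,            # input and target are the same image
                epochs=50, batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))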

Denoising autoencoders Basic autoencoder trains to minimize the loss between x and the reconstruction g(f(x)). Denoising autoencoders train to minimize the loss between x and g(f(x+w)), where w is random noise. Same possible architectures, different training data. Kaggle has a dataset on damaged documents. https://blog.keras.io/building-autoencoders-in-keras.html
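A minimal sketch of how the training pairs change for denoising, continuing from the training setup above; the noise level of 0.5 and the clipping to [0, 1] are illustrative choices:

import numpy as np

noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(size=x_train.shape)
x_train_noisy = np.clip(x_train_noisy, 0.0, 1.0)

# Same architecture as before; only the training data changes:
# corrupted images are the input, clean images are the target.
autoencoder.fit(x_train_noisy, x_train, epochs=50, batch_size=256)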

Denoising autoencoders Denoising autoencoders can't simply memorize the input-output relationship. Intuitively, a denoising autoencoder learns a projection from a neighborhood of our training data back onto the training data. https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf

Sparse autoencoders Construct a loss function to penalize activations within a layer (usually we regularize the weights of a network, not the activations). Which individual nodes of a trained model activate is data-dependent: different inputs result in activations of different nodes through the network, so the network selectively activates regions depending on the input data. https://www.jeremyjordan.me/autoencoders/

Sparse autoencoders Construct a loss function to penalize activations in the network. L1 regularization: penalize the absolute values of the vector of activations a in layer h for observation i. KL divergence: penalize the divergence between the average activation of each hidden unit and a desired target activation. https://www.jeremyjordan.me/autoencoders/
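In Keras, the L1 penalty on activations can be added with an activity regularizer on the encoding layer; this sketch follows the Keras autoencoder blog post cited earlier, and the penalty weight 1e-5 is an illustrative value:

from keras import regularizers
from keras.layers import Input, Dense
from keras.models import Model

input_img = Input(shape=(784,))
# L1 activity regularizer: adds the sum of absolute activations to the loss
encoded = Dense(32, activation='relu',
                activity_regularizer=regularizers.l1(1e-5))(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
sparse_autoencoder = Model(input_img, decoded)

For the KL-divergence variant, the penalty compares the average activation \hat{\rho}_j of each hidden unit j with a target sparsity \rho, summing \rho \log(\rho / \hat{\rho}_j) + (1 - \rho) \log((1 - \rho) / (1 - \hat{\rho}_j)) over the hidden units.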

Contractive autoencoders Arrange for similar inputs to have similar activations, i.e., the derivatives of the hidden layer activations are small with respect to the input. Denoising autoencoders make the reconstruction function (encoder + decoder) resist small perturbations of the input, while contractive autoencoders make the feature extraction function (i.e., the encoder) resist infinitesimal perturbations of the input. https://www.jeremyjordan.me/autoencoders/

Contractive autoencoders Contractive autoencoders make the feature extraction function (i.e., the encoder) resist infinitesimal perturbations of the input. https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf
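Formally, the contractive penalty added to the reconstruction loss is the squared Frobenius norm of the Jacobian of the encoder, following Rifai et al. (2011), cited below:

\Omega(x) = \lambda \left\lVert \frac{\partial f(x)}{\partial x} \right\rVert_F^2 = \lambda \sum_{i,j} \left( \frac{\partial h_j(x)}{\partial x_i} \right)^2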

Autoencoders Both the denoising and the contractive autoencoder can perform well. Advantage of the denoising autoencoder: simpler to implement (it requires adding only one or two lines of code to a regular autoencoder, with no need to compute the Jacobian of the hidden layer). Advantage of the contractive autoencoder: the gradient is deterministic, so second-order optimizers (conjugate gradient, L-BFGS, etc.) can be used, and it might be more stable than the denoising autoencoder, which uses a sampled gradient. To learn more on contractive autoencoders: Contractive Auto-Encoders: Explicit Invariance During Feature Extraction. Salah Rifai, Pascal Vincent, Xavier Muller, Xavier Glorot and Yoshua Bengio, 2011. https://ift6266h17.files.wordpress.com/2017/03/14_autoencoders.pdf