Nonlinear Unsupervised Feature Learning: How Local Similarities Lead to Global Coding. Amirreza Shaban.


Slide 2: Outline
- Feature Learning
- Coding methods
- Vector Quantization
- Sparse Coding
- Local Coordinate Coding
- Locality-constrained Linear Coding
- Local Similarity Global Coding
- Experiments

Slide 3: Feature Learning
- The goal of feature learning is to convert a complex, high-dimensional, nonlinear learning problem into a much simpler linear one.
- The learned features capture the nonlinearity of the data structure so that the problem can be solved by a much simpler linear learning method.
- The topic is closely related to nonlinear dimensionality reduction.

Slide 4: Feature Learning (illustrative figure)

Slide 5: Coding Methods
- Coding methods are a class of algorithms aimed at finding high-level representations of low-level features.
- Given unlabeled input data X = {x_1, ..., x_n} and a codebook C = {c_1, ..., c_m} of m atoms, the goal is to learn for each data point a coding vector γ(x) = [γ_1(x), ..., γ_m(x)], where element γ_i(x) indicates the affinity of the data point to the i-th codebook atom.

Slide 6: Vector Quantization
- Assign each data point to its nearest dictionary basis: γ_i(x) = 1 if i = argmin_j ||x - c_j||_2, and γ_i(x) = 0 otherwise.
- The dictionary bases are the cluster centers learned by K-means.

Slide 7: Vector Quantization (figure: the space is partitioned into regions R1, R2, R3 around the three centers, and the points in each region receive the one-hot codes [1, 0, 0], [0, 1, 0], [0, 0, 1] respectively)
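A minimal sketch of this procedure in Python, assuming scikit-learn's KMeans for the codebook and a one-hot assignment to the nearest center; the toy data, the number of atoms m = 3, and the random seed are illustrative choices, not from the slides:

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data: n points in d = 2 dimensions (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))

# Learn the codebook: the m cluster centers found by K-means.
m = 3
kmeans = KMeans(n_clusters=m, n_init=10, random_state=0).fit(X)
C = kmeans.cluster_centers_          # codebook atoms, shape (m, d)

def vq_code(x, C):
    """One-hot coding: 1 for the nearest atom, 0 elsewhere."""
    dists = np.linalg.norm(C - x, axis=1)
    gamma = np.zeros(len(C))
    gamma[np.argmin(dists)] = 1.0
    return gamma

print(vq_code(X[0], C))              # e.g. [0., 1., 0.]
```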

Slide 8: Sparse Coding
- Each data point is represented by a linear combination of a small number of codebook atoms.
- The coefficients are found by solving the minimization problem min_γ ||x - Cγ||_2^2 + λ ||γ||_1.
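A small sketch of this L1-penalized objective using scikit-learn's SparseCoder; the random codebook, the data dimensionality, and the penalty transform_alpha = 0.1 are arbitrary illustrative values:

```python
import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
C = rng.normal(size=(50, 2))                    # codebook with m = 50 atoms
C /= np.linalg.norm(C, axis=1, keepdims=True)   # SparseCoder expects unit-norm atoms

x = rng.normal(size=(1, 2))                     # one data point to encode

# Solve min_gamma ||x - gamma C||^2 + alpha * ||gamma||_1 with a lasso solver.
coder = SparseCoder(dictionary=C,
                    transform_algorithm='lasso_lars',
                    transform_alpha=0.1)
gamma = coder.transform(x)[0]
print(np.count_nonzero(gamma), "of", len(gamma), "coefficients are non-zero")
```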

Slide 9: Local Coordinate Coding
- It has been observed empirically that sparse coding performs better when the non-zero coefficients correspond to bases that are local to the encoded point.
- The conclusion is that locality is more essential than sparsity.

Slide 10: Local Coordinate Coding
- Learning method: min_γ ||x - Cγ||_2^2 + μ Σ_i |γ_i| ||c_i - x||_2^2, so that large coefficients are only allowed on atoms close to x (with the coordinates ideally summing to one).
- It is proved that a linear model on these codes can learn an arbitrary (nonlinear) function on the manifold.
- The rate of convergence depends only on the intrinsic dimensionality of the manifold, not on the ambient dimension d.

Slide 11: Locality-constrained Linear Coding
- LCC has a high computational cost and is not suitable for large-scale learning problems.
- LLC first guarantees locality by incorporating only the k nearest bases in the coding process, and second minimizes the reconstruction error on these local patches: min_γ ||x - Cγ||_2^2 + λ ||d ⊙ γ||_2^2 subject to 1^T γ = 1, where d_i grows with the distance between x and the i-th basis.
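A sketch of the fast approximated LLC step described above, using the common analytic solution over the k nearest atoms; the regularization constant, k, and the toy codebook are illustrative assumptions rather than values from the slides:

```python
import numpy as np

def llc_code(x, C, k=5, reg=1e-4):
    """Approximated LLC coding of one point x against codebook C (m x d).

    Only the k nearest atoms are used, and the small constrained
    least-squares problem is solved analytically (coefficients sum to one).
    """
    dists = np.linalg.norm(C - x, axis=1)
    idx = np.argsort(dists)[:k]          # k nearest codebook atoms
    z = C[idx] - x                       # shift the atoms so x is the origin
    G = z @ z.T                          # local covariance, k x k
    G += reg * np.trace(G) * np.eye(k)   # regularize for numerical stability
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()                         # enforce the sum-to-one constraint
    gamma = np.zeros(len(C))
    gamma[idx] = w
    return gamma

rng = np.random.default_rng(0)
C = rng.normal(size=(50, 2))
print(np.nonzero(llc_code(rng.normal(size=2), C))[0])  # indices of the k active atoms
```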

Slide 12: Locality-constrained method drawback
- Locality-constrained codes are incapable of representing similarity between non-neighbor points.

Slide 13: Locality-constrained method drawback
- With a linear SVM trained on the codes, the labeling function can be written as f(x) = Σ_i α_i y_i γ(x_i)^T γ(x) + b.
- For points x that activate none of the bases used by the training data (i.e., non-neighbors of the training points), every inner product γ(x_i)^T γ(x) is zero, so the SVM fails to predict the label of x.
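A tiny illustration of this failure mode, assuming a linear SVM (scikit-learn's LinearSVC) trained on hypothetical local codes that only ever activate atoms 0 to 2; a far-away test point that activates other atoms gets a score equal to the bias alone, so its label is effectively undetermined:

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical local codes over m = 5 atoms: the training points only
# activate atoms 0-2, so the learned weights on atoms 3-4 remain zero.
train_codes = np.array([[1, 0, 0, 0, 0],
                        [0, 1, 0, 0, 0],
                        [0, 0, 1, 0, 0],
                        [1, 1, 0, 0, 0]], dtype=float)
labels = np.array([0, 0, 1, 1])

svm = LinearSVC(C=1.0).fit(train_codes, labels)

# A test point that is not a neighbor of any training point activates
# only atoms 3-4, so w^T gamma(x) = 0 and the score collapses to the bias.
far_code = np.array([[0, 0, 0, 0.5, 0.5]])
print(svm.decision_function(far_code), svm.intercept_)  # identical values
```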

Slide 14: Local Similarity Global Coding
- The idea is to propagate the coding coefficients along the data manifold through a t-step diffusion process on the neighborhood graph.
- When t = 1, the coding is similar to recent locality-constrained coding methods.

Slide 15: Inductive LSGC
- The kernel function is computed from the t-step transition probabilities of a random walk on the neighborhood graph, K_t(x, y) = [P^t]_{xy}; it is referred to as the diffusion kernel of order t.
- The similarity K_t(x, y) is high if x and y are connected to each other by many paths in the graph.
- It is known that t controls the resolution at which we look at the data.
- The computational cost is high: evaluating the kernel requires powers of an n × n matrix over all data points, which does not scale to large datasets.
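A rough sketch of such a diffusion kernel, assuming a symmetric k-NN connectivity graph and a row-normalized transition matrix; the choice of k, the affinity, and t are illustrative assumptions, not values taken from the slides:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

# Symmetric k-NN affinity graph over all data points.
W = kneighbors_graph(X, n_neighbors=10, mode='connectivity').toarray()
W = np.maximum(W, W.T)

# Row-normalize to get the one-step transition matrix P of a random walk.
P = W / W.sum(axis=1, keepdims=True)

# t-step diffusion: K_t[i, j] is high when i and j are connected by many
# short paths; larger t gives a coarser view of the data.
t = 3
K_t = np.linalg.matrix_power(P, t)
print(K_t.shape)   # (n, n): dense matrix powers, costly for large n
```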

Slide 16: Inductive LSGC
- A two-step process:
- Projection: find the vector f in which each element f_i represents the one-step similarity between the data point x and basis c_i, i.e., f_i = K_1(x, c_i).
- Mapping: propagate the one-step similarities in f to the other bases by a (t-1)-step diffusion process.

Slide 17: Inductive LSGC
- The coding coefficient of data point x in basis c_i is defined as γ_i(x) = [P^{t-1} f(x)]_i, i.e., the (t-1)-step diffusion of the one-step similarity vector.
- The overall coding can be written as γ(x) = P^{t-1} f(x).
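A hedged sketch of the two-step inductive coding, assuming a Gaussian one-step similarity for the projection step and a basis-to-basis transition matrix P for the mapping step; the similarity function, the bandwidth sigma, and the orientation of P are illustrative assumptions:

```python
import numpy as np

def one_step_similarity(x, C, sigma=1.0):
    """Projection step: Gaussian similarity between x and every basis."""
    d2 = np.sum((C - x) ** 2, axis=1)
    f = np.exp(-d2 / (2 * sigma ** 2))
    return f / f.sum()                   # normalize like one random-walk step

def lsgc_code(x, C, P, t):
    """Mapping step: propagate the one-step similarities for t-1 more steps."""
    f = one_step_similarity(x, C)
    return np.linalg.matrix_power(P, t - 1) @ f

rng = np.random.default_rng(0)
C = rng.normal(size=(50, 2))              # bases (e.g. learned cluster centers)
W = np.exp(-((C[:, None] - C[None]) ** 2).sum(-1) / 2.0)
np.fill_diagonal(W, 0.0)
P = W / W.sum(axis=1, keepdims=True)      # basis-to-basis transition matrix

gamma = lsgc_code(rng.normal(size=2), C, P, t=3)
print(gamma.shape)                        # a dense, globally informed code
```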

Slide 18: Inductive to Transductive convergence
- The inductive coding p and the transductive coding q are related, and the difference between them converges to zero as the amount of data grows.

Slide 19: Experiments (results figure)

Slide 20: Experiments (results figure)

Thank you.