Nonlinear Dimensionality Reduction


Nonlinear Dimensionality Reduction
Presented by Dragana Veljkovic

Overview
- Curse of dimensionality
- Dimensionality reduction techniques
- Isomap
- Locally linear embedding (LLE)
- Problems and improvements

Problem description
- The large amounts of data being collected lead to very large databases
- Most data mining problems involve data with a large number of measurements (dimensions), e.g. protein matching, fingerprint recognition, meteorological prediction, satellite image repositories
- Reducing the number of dimensions improves our ability to extract knowledge

Problem definition
- Original high-dimensional data: X = (x_1, …, x_n), where x_i = (x_i1, …, x_ip)^T
- Underlying low-dimensional data: Y = (y_1, …, y_n), where y_i = (y_i1, …, y_iq)^T and q << p
- Assume X forms a smooth low-dimensional manifold in the high-dimensional space
- Find the mapping that captures the important features
- Determine the q that best describes the data
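To make this concrete, a standard toy example: the "Swiss roll" is a p = 3 dataset with intrinsic dimensionality q = 2. A minimal sketch, assuming scikit-learn is available (note that it returns points as rows, i.e. the transpose of the p×n convention above):

```python
import numpy as np
from sklearn.datasets import make_swiss_roll

# X: 1000 x 3 points sampled from a 2D sheet rolled up in 3D space.
# t: the "unrolled" coordinate along the manifold (part of the hidden Y).
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)
print(X.shape)  # (1000, 3): p = 3 observed dimensions, q = 2 intrinsic
```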

Different approaches
Three families of techniques:
- Local or shape-preserving
- Global or topology-preserving
- Local embeddings
Local methods simplify the representation of each object regardless of the rest of the data; the selected features retain most of the information. Examples: Fourier decomposition, wavelet decomposition, piecewise constant approximation, etc.

Global or topology-preserving methods
Mostly used for visualization and classification:
- PCA (Karhunen-Loève decomposition)
- MDS (multidimensional scaling)
- SVD
- ICA
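As a reference point for the global, linear baseline, a minimal PCA-via-SVD sketch in NumPy (function and variable names are illustrative, not from the slides):

```python
import numpy as np

def pca(X, q):
    """Project the rows of X (n x p, one point per row) onto the top q principal components."""
    Xc = X - X.mean(axis=0)                            # center each dimension
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are principal axes
    return Xc @ Vt[:q].T                               # n x q low-dimensional coordinates
```

Applied to a curved manifold such as the Swiss roll above, this projection flattens the data along straight axes but cannot unroll it; that failure motivates the nonlinear methods that follow.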

Local embeddings (LE)
- Overlapping local neighborhoods, analyzed collectively, can provide information about the global geometry
- LE preserves the local neighborhood of each object while preserving global distances through the non-neighboring objects
- Examples: Isomap and LLE

Another classification
Linear vs. nonlinear methods.

Neighborhood
Two ways to select the neighbors of an object (see the sketch below):
- k nearest neighbors (k-NN): every point gets exactly k neighbors, but the neighborhood radius is non-uniform across the dataset
- ε-ball: prior knowledge of the data is needed to choose a reasonable ε, and the number of neighbors per point can vary
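A sketch of both neighborhood rules using SciPy's KD-tree (parameter values and function names are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_neighbors(X, k):
    """k-NN rule: each point gets exactly k neighbors; radii vary."""
    tree = cKDTree(X)
    dist, idx = tree.query(X, k=k + 1)      # +1: each point is its own nearest neighbor
    return dist[:, 1:], idx[:, 1:]

def eps_neighbors(X, eps):
    """Epsilon-ball rule: fixed radius; the number of neighbors varies."""
    tree = cKDTree(X)
    return tree.query_ball_point(X, r=eps)  # one index list per point
```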

Isomap – general idea
- Only geodesic distances reflect the true low-dimensional geometry of the manifold
- MDS and PCA see only Euclidean distances and therefore fail to detect the intrinsic low-dimensional structure
- Geodesic distances are hard to compute, even if you know the manifold
- In a small neighborhood, Euclidean distance is a good approximation of the geodesic distance
- For faraway points, the geodesic distance is approximated by adding up a sequence of "short hops" between neighboring points

Isomap algorithm
1. Find the neighborhood of each object by computing the distances between all pairs of points and selecting the closest
2. Build a graph with a node for each object and an edge between each pair of neighboring points, weighted by the Euclidean distance between the two objects
3. Use a shortest-path graph algorithm (e.g. Dijkstra's) to fill in the distances between all non-neighboring points
4. Apply classical MDS to this distance matrix (the squared-distance matrix is double-centered before the eigendecomposition)
A minimal implementation sketch follows below.
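Here is a compact sketch of the four steps, assuming a scientific-Python stack (scikit-learn's kneighbors_graph and SciPy's shortest_path); the choice of k and the variable names are this sketch's, not the paper's:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, q=2, k=7):
    # Steps 1-2: k-NN graph with Euclidean distances as edge weights.
    G = kneighbors_graph(X, n_neighbors=k, mode='distance')
    # Step 3: geodesic distances via Dijkstra's shortest paths.
    # (Assumes the k-NN graph is connected; otherwise D contains infinities.)
    D = shortest_path(G, method='D', directed=False)
    # Step 4: classical MDS -- double-center the squared-distance matrix...
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # ...and embed using the top-q eigenvectors scaled by sqrt(eigenvalue).
    w, V = np.linalg.eigh(B)
    top = np.argsort(w)[::-1][:q]
    return V[:, top] * np.sqrt(np.maximum(w[top], 0))
```

In practice one would reach for a library implementation such as sklearn.manifold.Isomap, which handles sparsity and disconnected graphs more carefully.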

Isomap

Isomap on face images

Isomap on hand images

Isomap on handwritten "2" digits

Isomap – summary
- Inherits features of MDS and PCA: guaranteed asymptotic convergence to the true structure
- Polynomial runtime
- Non-iterative
- Able to discover manifolds of arbitrary dimensionality
- Performs well when the data come from a single well-sampled cluster
- Few free parameters
- Good theoretical basis for its metric-preserving properties

Problems with Isomap
- Embeddings are biased to preserve the separation of faraway points, which can distort the local geometry
- Fails to project data spread among multiple clusters nicely
- Well-conditioned, but computationally expensive for large datasets

Improvements to Isomap
- Conformal Isomap: capable of learning the structure of certain curved manifolds
- Landmark Isomap: approximates the large global computations with a much smaller set of calculations
- Reconstruct distances using the k/2 closest objects as well as the k/2 farthest objects

Locally Linear Embedding (LLE)
- Isomap attempts to preserve geometry at all scales, mapping nearby points close together and distant points far apart
- LLE attempts to preserve only the local geometry of the data, mapping nearby points on the manifold to nearby points in the low-dimensional space
- Advantages: computational efficiency and representational capacity

LLE – general idea
- Locally, on a fine enough scale, everything looks linear
- Represent each object as a linear combination of its neighbors
- The representation is indifferent to affine transformations
- Assumption: the same linear representation holds in the low-dimensional space

LLE – matrix representation
- X ≈ WX, where X is the p×n matrix of original data and W is an n×n matrix of weights with W_ij = 0 if x_j is not a neighbor of x_i; the rows of W sum to one
- The weights are chosen to minimize the reconstruction error E(W) = Σ_i || x_i − Σ_j W_ij x_j ||²
- The embedding then solves Y ≈ WY, where Y is the q×n matrix of underlying low-dimensional data, by minimizing the cost Φ(Y) = Σ_i || y_i − Σ_j W_ij y_j ||²

LLE – algorithm
1. Find the k nearest neighbors of each point in X space
2. Solve for the reconstruction weights W
3. Compute the embedding coordinates Y using the weights W: form the sparse matrix M = (I − W)^T (I − W), compute the bottom q+1 eigenvectors of M, discard the bottom (constant) eigenvector, and take the (i+1)-st smallest eigenvector as the i-th coordinate of Y
A sketch of the full pipeline follows below.
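A minimal end-to-end LLE sketch under the same assumptions as the Isomap sketch (NumPy/SciPy available; the regularization constant and all names are illustrative). It already applies the regularization discussed on the next slide:

```python
import numpy as np
from scipy.spatial import cKDTree

def lle(X, q=2, k=10, reg=1e-3):
    n = X.shape[0]
    # Step 1: k nearest neighbors (drop each point's self-match).
    _, idx = cKDTree(X).query(X, k=k + 1)
    nbrs = idx[:, 1:]
    # Step 2: reconstruction weights, one local least-squares problem per point.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]               # neighbors centered on x_i
        C = Z @ Z.T                         # local Gram matrix (k x k)
        C += reg * np.trace(C) * np.eye(k)  # regularize: C can be singular when k > p
        w = np.linalg.solve(C, np.ones(k))  # solve C w = 1 ...
        W[i, nbrs[i]] = w / w.sum()         # ... and rescale so the weights sum to one
    # Step 3: embedding from the bottom eigenvectors of M = (I - W)^T (I - W).
    # (A real implementation would keep W and M sparse for speed.)
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, 1:q + 1]              # drop the constant bottom eigenvector
```

This returns one point per row (n×q), the transpose of the q×n convention on the previous slide.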

Numerical issues
- The covariance matrix used to compute W can be ill-conditioned, so regularization is needed
- Small eigenvalues are subject to numerical precision errors and can get mixed up
- But the sparse matrices used in this algorithm make it much faster than Isomap

LLE

LLE – effect of neighborhood size

LLE – face images

LLE – lip images

PCA vs. LLE

Problems with LLE
- If the data are noisy, sparse, or weakly connected, the coupling between faraway points can be attenuated
- The most common failure of LLE is mapping points close together that are far apart in the original space, which often arises when the manifold is undersampled
- The output depends strongly on the selection of k

References
- Roweis, S. T. and Saul, L. K. (2000). "Nonlinear dimensionality reduction by locally linear embedding." Science 290(5500): 2323-2326.
- Tenenbaum, J. B., de Silva, V., and Langford, J. C. (2000). "A global geometric framework for nonlinear dimensionality reduction." Science 290(5500): 2319-2323.
- Vlachos, M., Domeniconi, C., et al. (2002). "Non-linear dimensionality reduction techniques for classification and visualization." Proc. of 8th ACM SIGKDD, Edmonton, Canada.
- de Silva, V. and Tenenbaum, J. (2003). "Local versus global methods for nonlinear dimensionality reduction." Advances in Neural Information Processing Systems 15.

Questions?