Jan Kamenický

 Many features ⇒ many dimensions
 Dimensionality reduction
  ◦ Feature extraction (useful representation)
  ◦ Classification
  ◦ Visualization

 What manifold?
  ◦ A low-dimensional embedding of high-dimensional data lying on a smooth nonlinear manifold
 Linear methods fail
  ◦ e.g. PCA

 Unsupervised methods
  ◦ Without any a priori knowledge
 ISOMAP
  ◦ Isometric mapping
 LLE
  ◦ Locally linear embedding

 Core idea
  ◦ Use geodesic distances on the manifold instead of Euclidean distances
 Classical MDS
  ◦ Maps data to the lower-dimensional space

 Select neighbours
  ◦ K-nearest neighbours
  ◦ ε-distance neighbourhood
 Create a weighted neighbourhood graph
  ◦ Weights = Euclidean distances
 Estimate the geodesic distances as shortest paths in the weighted graph (a sketch of the whole construction follows)
  ◦ Dijkstra’s algorithm
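A minimal sketch of this construction, assuming NumPy-style data and that scikit-learn and SciPy are available; the function name isomap_geodesic_distances and the default n_neighbors=10 are illustrative choices, not taken from the slides.

```python
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap_geodesic_distances(X, n_neighbors=10):
    """Estimate geodesic distances for the (N, D) data matrix X."""
    # Weighted neighbourhood graph: edges to the K nearest neighbours,
    # weighted by Euclidean distances
    G = kneighbors_graph(X, n_neighbors, mode='distance')
    # Symmetrize so the graph is undirected
    G = G.maximum(G.T)
    # Geodesic distances = all-pairs shortest paths (method='D' selects Dijkstra)
    return shortest_path(G, method='D', directed=False)
```

The resulting N × N matrix is what classical MDS (described later) is then applied to.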

 1) Set distances (0 for the initial node, ∞ for all other nodes), mark all nodes as unvisited
 2) Select the unvisited node with the smallest distance as the active node
 3) Update all unvisited neighbours of the active node (if the newly computed distance is smaller)
 4) Mark the active node as visited (its distance is now final), repeat from 2) as necessary
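The four steps above can be written down directly; a sketch assuming a plain adjacency-list representation (adj maps each node to a list of (neighbour, weight) pairs) and Python's binary heap rather than the Fibonacci heap mentioned on the next slide:

```python
import heapq

def dijkstra(adj, source):
    """Single-source shortest-path distances in a weighted graph."""
    dist = {node: float('inf') for node in adj}     # step 1: ∞ for all nodes...
    dist[source] = 0.0                              # ...0 for the initial node
    visited = set()
    heap = [(0.0, source)]                          # priority queue of (distance, node)
    while heap:
        d, u = heapq.heappop(heap)                  # step 2: unvisited node with smallest distance
        if u in visited:
            continue
        for v, w in adj[u]:                         # step 3: update unvisited neighbours
            if v not in visited and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
        visited.add(u)                              # step 4: the distance of u is now final
    return dist
```

For example, dijkstra({'a': [('b', 2.0)], 'b': [('a', 2.0)]}, 'a') returns {'a': 0.0, 'b': 2.0}.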

 Time complexity
  ◦ O(|E|·T_dec + |V|·T_min), where T_dec and T_min are the costs of the priority queue’s decrease-key and extract-min operations
 Implementation
  ◦ Sparse edges
  ◦ Fibonacci heap as the priority queue
  ◦ O(|E| + |V| log |V|)
 Geodesic distances in ISOMAP
  ◦ O(N² log N) (Dijkstra run from every one of the N points of the sparse neighbourhood graph)

 Input
  ◦ Dissimilarities (distances)
 Output
  ◦ Data in a low-dimensional embedding, with distances corresponding to the dissimilarities
 Many types of MDS
  ◦ Classical
  ◦ Metric / non-metric (number of dissimilarity matrices, symmetry, etc.)

 Quantitative similarity
 Euclidean distances (output)
 One distance matrix (symmetric)
 Minimizing the stress function (one common form is given below)
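One common form of the stress function for metric MDS (a hedged reconstruction, not necessarily the exact form used in the talk; δ_ij are the given dissimilarities and y_i the embedded points):

```latex
\sigma(Y) \;=\; \sum_{i<j} \bigl( \lVert y_i - y_j \rVert - \delta_{ij} \bigr)^{2}
```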

 We can optimize directly (the standard steps are written out below)
  ◦ Compute the double-centered distance matrix B
  ◦ Perform SVD of B
  ◦ Compute the final data
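In the standard classical-MDS notation (assumed here; D^(2) is the matrix of squared distances, N the number of points), these steps read:

```latex
\begin{align*}
B &= -\tfrac{1}{2}\, J\, D^{(2)}\, J, \qquad J = I - \tfrac{1}{N}\,\mathbf{1}\mathbf{1}^{\top}
    && \text{(double centering)} \\
B &= U \Lambda U^{\top} && \text{(SVD / eigendecomposition of the symmetric matrix $B$)} \\
Y &= U_{q}\, \Lambda_{q}^{1/2} && \text{(final data in $q$ dimensions)}
\end{align*}
```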

 Covariance matrix S of the data
 The embedding is the projection of the centered X onto the eigenvectors of NS (i.e. the result of the PCA of X), as written out below
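Written out under the same assumed notation, with X_c the column-centered data matrix (d × N) and SVD X_c = UΣVᵀ, the equivalence with PCA is:

```latex
\begin{align*}
S &= \tfrac{1}{N}\, X_c X_c^{\top} && \text{(covariance matrix)} \\
B &= X_c^{\top} X_c = V \Sigma^{2} V^{\top} && \text{(double-centered Gram matrix for Euclidean distances)} \\
Y^{\top} &= \Sigma_{q} V_{q}^{\top} = U_{q}^{\top} X_c && \text{(MDS coordinates = projection onto eigenvectors of $N S = X_c X_c^{\top}$)}
\end{align*}
```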

 How many dimensions to use?
  ◦ Residual variance
 Short-circuiting
  ◦ Too large a neighbourhood (not enough data)
  ◦ Non-isometric mapping
  ◦ Totally destroys the final embedding

 Conformal ISOMAP
  ◦ Modified weights in the geodesic distance estimate (one common definition is given below)
  ◦ Magnifies regions with high density
  ◦ Shrinks regions with low density
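The modified weight is, in the form usually quoted for C-Isomap by de Silva and Tenenbaum (notation assumed here), the Euclidean distance rescaled by the local mean neighbour distance M(·):

```latex
w(i,j) \;=\; \frac{\lVert x_i - x_j \rVert}{\sqrt{M(i)\, M(j)}},
\qquad M(i) = \text{mean distance from } x_i \text{ to its } K \text{ nearest neighbours}
```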

 Landmark ISOMAP
  ◦ Use only geodesic distances from several landmark points (on the manifold)
  ◦ Use Landmark-MDS for finding the embedding; this involves triangulation of the non-landmark data
  ◦ Significantly faster, but a higher chance of “short-circuiting”; the number of landmarks has to be chosen carefully

 Kernel ISOMAP
  ◦ Ensures that B (the double-centered distance matrix) is positive semidefinite via the constant-shifting method

 Core idea
  ◦ Estimate each point as a linear combination of its neighbours – find the best such weights
  ◦ The same linear representation will hold in the low-dimensional space

 Find the weights W_ij by constrained minimization (the standard cost function is given below)
 Neighbourhood-preserving mapping
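In the standard Roweis–Saul formulation (notation assumed), the constrained minimization reads:

```latex
\min_{W} \;\sum_{i} \Bigl\lVert x_i - \sum_{j} W_{ij}\, x_j \Bigr\rVert^{2}
\quad \text{s.t.} \quad \sum_{j} W_{ij} = 1, \qquad
W_{ij} = 0 \;\text{ if } x_j \text{ is not a neighbour of } x_i
```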

 Low-dimensional representation Y
 We take the eigenvectors of M corresponding to its q+1 smallest eigenvalues, discarding the bottom (constant) eigenvector
 In practice, different algebra is used to improve numerical stability and speed (a compact sketch follows)
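A compact sketch of both LLE steps using NumPy only; the function name, the regularization constant reg, and the dense eigensolver are illustrative simplifications (the remark about "different algebra" refers to the sparse, shifted eigenproblems used in practice):

```python
import numpy as np

def lle(X, n_neighbors=10, q=2, reg=1e-3):
    """Minimal LLE: X is an (N, D) data matrix, returns an (N, q) embedding."""
    N = X.shape[0]
    # K nearest neighbours of every point (excluding the point itself)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    knn = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]

    # Step 1: reconstruction weights, one constrained least-squares problem per point
    W = np.zeros((N, N))
    for i in range(N):
        Z = X[knn[i]] - X[i]                              # neighbours centred on x_i
        C = Z @ Z.T                                       # local Gram matrix
        C += reg * np.trace(C) * np.eye(n_neighbors)      # regularize in case C is singular
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, knn[i]] = w / w.sum()                        # enforce the sum-to-one constraint

    # Step 2: embedding from the q+1 smallest eigenvectors of M = (I-W)^T (I-W)
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    eigvals, eigvecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    return eigvecs[:, 1:q + 1]                            # drop the constant eigenvector
```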

 ISOMAP
  ◦ Preserves global geometric properties (geodesic distances), especially for faraway points
 LLE
  ◦ Preserves local neighbourhood correspondence only
  ◦ Overcomes non-isometric mapping
  ◦ Manifold is not explicitly required
  ◦ Difficult to estimate q (the number of dimensions)

The end