Nonlinear Dimension Reduction: Semi-Definite Embedding vs. Local Linear Embedding


Nonlinear Dimension Reduction: Semi-Definite Embedding vs. Local Linear Embedding Li Zhang and Lin Liao

Outline: Nonlinear Dimension Reduction; Semi-Definite Embedding; Local Linear Embedding; Experiments

Dimension Reduction To understand images in terms of their basic modes of variability. Unsupervised learning problem: given N high-dimensional inputs X_i ∈ R^D, find a faithful one-to-one mapping to N low-dimensional outputs Y_i ∈ R^d, with d < D. Methods: linear methods (PCA, MDS) fit a subspace; nonlinear methods (SDE, LLE) fit a manifold.
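
For concreteness (an addition, not part of the original slides), here is a minimal Python sketch of this setup using the Swiss roll that appears in the experiments later in the talk; scikit-learn is assumed to be available and the sizes mirror the slides.

```python
from sklearn.datasets import make_swiss_roll

N, d = 800, 2                                  # N inputs, target dimension d
X, color = make_swiss_roll(n_samples=N, random_state=0)   # X has shape (N, 3), so D = 3

print(X.shape)                                 # (800, 3): high-dimensional inputs X_i in R^D
# Goal: a faithful one-to-one mapping to outputs Y_i in R^d with d < D.
# "color" is the coordinate along the roll, useful only for coloring plots.
```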

Semi-Definite Embedding Given inputs X = (X_1, ..., X_N) and a neighborhood size k:
1. Find the k nearest neighbors of each input X_i.
2. Formulate and solve the corresponding semi-definite programming problem to find the optimal Gram matrix of the outputs, K = Y^T Y.
3. Extract an (approximately) low-dimensional embedding Y from the eigenvectors and eigenvalues of the Gram matrix K.
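
A sketch of step 1 (the neighborhood graph), assuming scikit-learn; steps 2 and 3 are sketched after the objective-function slide below.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import NearestNeighbors

X, _ = make_swiss_roll(n_samples=800, random_state=0)
k = 4                                              # neighborhood size, as in the SDE experiments

# k + 1 because each point is returned as its own nearest neighbor.
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
_, idx = nn.kneighbors(X)
neighbors = idx[:, 1:]                             # neighbors[i] lists the k nearest neighbors of X_i
```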

Semi-Definite Programming Maximize C · X subject to AX = b and matrix(X) positive semi-definite, where X is a vector of size n^2 and matrix(X) is the n-by-n matrix reshaped from X.
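
As an illustration of this standard form (an addition, not from the slides), the toy problem below is written with the cvxpy modelling library rather than the solvers listed later in the talk. Maximizing Tr(CX) subject to Tr(X) = 1 and X positive semi-definite recovers the largest eigenvalue of C, which gives an easy correctness check.

```python
import numpy as np
import cvxpy as cp

n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
C = (A + A.T) / 2                       # symmetric cost matrix

X = cp.Variable((n, n), PSD=True)       # the reshaped matrix(X), constrained to be PSD
prob = cp.Problem(cp.Maximize(cp.trace(C @ X)),
                  [cp.trace(X) == 1])   # a single linear equality constraint (the "AX = b" part)
prob.solve()

# For this toy problem the optimum equals the largest eigenvalue of C.
print(prob.value, np.linalg.eigvalsh(C).max())
```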

Semi-Definite Programming Constraints:
1. Preserve distances between neighbors: |X_i - X_j|^2 = |Y_i - Y_j|^2 for every neighboring pair (i, j); in Gram-matrix form, K_ii + K_jj - K_ij - K_ji = G_ii + G_jj - G_ij - G_ji, where K = Y^T Y and G = X^T X.
2. Center the outputs on the origin: Σ_i Y_i = 0, which becomes Σ_ij K_ij = 0.
3. K is positive semi-definite.
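
A quick numerical check of the Gram-matrix identity in constraint 1 (illustrative code; it uses the row-per-point convention, so the slides' Y^T Y becomes Y @ Y.T).

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((5, 2))                  # 5 points in R^2, one per row
K = Y @ Y.T                                      # Gram matrix of the points

i, j = 1, 3
lhs = np.sum((Y[i] - Y[j]) ** 2)                 # |Y_i - Y_j|^2
rhs = K[i, i] + K[j, j] - K[i, j] - K[j, i]
assert np.isclose(lhs, rhs)                      # distances depend on Y only through K
```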

Semi-Definite Programming Objective function: maximize the sum of pairwise squared distances between outputs, Σ_ij |Y_i - Y_j|^2, which under the centering constraint is proportional to Tr(K).
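
Putting the constraints and this objective together, the sketch below is a small-scale stand-in for the SDE optimization, written with cvxpy instead of the CSDP/SeDuMi/SDPT3 solvers named on the next slide, followed by the eigenvector step from the algorithm slide. N is kept tiny because a generic SDP solver scales poorly, which is the scaling point made in the final comparison.

```python
import numpy as np
import cvxpy as cp
from sklearn.datasets import make_swiss_roll
from sklearn.neighbors import NearestNeighbors

N, k, d = 60, 4, 2                                 # tiny N: the SDP has N^2 unknowns
X, _ = make_swiss_roll(n_samples=N, random_state=0)
G = X @ X.T                                        # input Gram matrix (rows of X are points)

_, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
neighbors = idx[:, 1:]

K = cp.Variable((N, N), PSD=True)                  # output Gram matrix K = Y Y^T
constraints = [cp.sum(K) == 0]                     # centering: sum_ij K_ij = 0
for i in range(N):
    for j in map(int, neighbors[i]):
        # preserve squared distances between neighbors
        constraints.append(K[i, i] + K[j, j] - 2 * K[i, j]
                           == G[i, i] + G[j, j] - 2 * G[i, j])

prob = cp.Problem(cp.Maximize(cp.trace(K)), constraints)
prob.solve()

# Step 3 of the algorithm: the top-d eigenvectors/eigenvalues of K give Y.
vals, vecs = np.linalg.eigh(K.value)
top = np.argsort(vals)[::-1][:d]
Y = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))
print(Y.shape)                                     # (N, d) low-dimensional outputs
```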

Semi-Definite Programming Solve for the best K using any SDP solver: CSDP (fast, stable), SeDuMi (stable but slow), or SDPT3 (newest and fastest, but not yet well tested).

Locally Linear Embedding
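
The transcript does not spell out the LLE steps; as a hedged stand-in, the sketch below runs Roweis and Saul's algorithm via scikit-learn's implementation, using the neighborhood size k = 18 from the Swiss roll experiment (the remaining parameters are assumptions).

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, color = make_swiss_roll(n_samples=800, random_state=0)

# Reconstruct each point from its k nearest neighbors, then find 2-D outputs
# that preserve those reconstruction weights (standard LLE).
lle = LocallyLinearEmbedding(n_neighbors=18, n_components=2, random_state=0)
Y = lle.fit_transform(X)                     # shape (800, 2)
```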

Swiss Roll (N = 800): SDE, k = 4; LLE, k = 18.

LLE on Swiss Roll, varying k.
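
A sketch of how figures like these could be regenerated (scikit-learn and matplotlib assumed); the specific k values tried on the slides are not recoverable from the transcript, so the ones below are illustrative.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, color = make_swiss_roll(n_samples=800, random_state=0)

ks = [6, 12, 18, 30]                              # illustrative neighborhood sizes
fig, axes = plt.subplots(1, len(ks), figsize=(4 * len(ks), 4))
for ax, k in zip(axes, ks):
    Y = LocallyLinearEmbedding(n_neighbors=k, n_components=2).fit_transform(X)
    ax.scatter(Y[:, 0], Y[:, 1], c=color, s=5)    # color by position along the roll
    ax.set_title(f"LLE, k={k}")
plt.show()
```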

Twos (N = 638): SDE, k = 4; LLE, k = 18.

Teapots (N = 400): SDE, k = 4; LLE, k = 12.

LLE on Teapots, varying N (N = 400, 200, 100, 50).

Faces (N = 1900): SDE failed; LLE, k = 12.

SDE versus LLE: similar idea.
1. First, compute neighborhoods in the input space.
2. Second, construct a square matrix that characterizes the local relationships among the input data.
3. Finally, compute the low-dimensional embedding from the eigenvectors of that matrix.

SDE versus LLE: different performance.
SDE: good embedding quality and more robust to sparse samples, but the optimization is slow and hard to scale to large data sets.
LLE: fast and scalable to large data sets, but embedding quality degrades when samples are sparse, because of the locally linear assumption.