Nonlinear Dimensionality Reduction Approach (ISOMAP)
2006. 2. 28
Young Ki Baik
Computer Vision Lab., Seoul National University
References
  J. B. Tenenbaum, V. de Silva, J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 2000.
  Dejan Kulpinski. LLE and Isomap Analysis of Spectra and Colour Images. Thesis, 1999.
  Yoshua Bengio et al. Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering. Technical report, 2003.
Contents
  Introduction
  PCA and MDS
  ISOMAP
  Conclusion
Dimensionality Reduction
  The goal: find the meaningful low-dimensional structures hidden in high-dimensional observations.
  Classical techniques
    PCA (Principal Component Analysis): preserves the variance
    MDS (Multidimensional Scaling): preserves inter-point distances
  Nonlinear techniques
    ISOMAP
    LLE (Locally Linear Embedding)
Linear Dimensionality Reduction
  PCA: finds a low-dimensional embedding of the data points that best preserves their variance as measured in the high-dimensional input space.
  MDS: finds an embedding that preserves the inter-point distances; equivalent to PCA when the distances are Euclidean.
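The equivalence between PCA and MDS on Euclidean distances can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `pca_embed` and `classical_mds` and the random test data are illustrative and not part of the original slides. Because the two embeddings can differ by a sign flip per axis, the check compares the pairwise distances of the embedded points rather than the coordinates themselves.

```python
import numpy as np

def pca_embed(X, d):
    """Project centered data onto its top-d principal axes."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors of the centered data are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def classical_mds(D, d):
    """Classical MDS from a matrix of pairwise Euclidean distances."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * H @ (D ** 2) @ H              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:d]            # indices of the d largest eigenvalues
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def pairwise(Y):
    """Euclidean distance matrix of the rows of Y."""
    return np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)

# Illustrative data: 30 points in 5-D with very different spread per axis.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

Y_pca = pca_embed(X, 2)
Y_mds = classical_mds(pairwise(X), 2)

# The two embeddings should have identical inter-point distances.
same = np.allclose(pairwise(Y_pca), pairwise(Y_mds), atol=1e-8)
print(same)
```

Note that MDS only needs the distance matrix, while PCA needs the coordinates; this is what later lets ISOMAP swap in geodesic distances.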
Linear Dimensionality Reduction
  MDS: inter-point distances and their relation to the embedding [equations not preserved]
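The equations on this slide are not available in the text. As a hedged reconstruction, the standard classical MDS relation between Euclidean distances and inner products is given below; it is an assumption that this is what the slide showed. The symbols $b_{ij}$, $S$, and $H$ are chosen here for illustration, and the points $x_i$ are assumed to be centered.

```latex
% Assumes centered points: \sum_i x_i = 0
\begin{align*}
d_{ij}^2 &= \lVert x_i - x_j \rVert^2 = b_{ii} + b_{jj} - 2\,b_{ij},
  & b_{ij} &= x_i^{\top} x_j \\
% Double centering the squared distances recovers the Gram matrix B = [b_{ij}]
B &= -\tfrac{1}{2}\, H S H,
  & S_{ij} &= d_{ij}^2, \quad H_{ij} = \delta_{ij} - \tfrac{1}{N}
\end{align*}
```

The top eigenvectors of $B$, scaled by the square roots of their eigenvalues, give the MDS coordinates, which is why MDS on Euclidean distances coincides with PCA.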
Nonlinear Dimensionality Reduction
  Many data sets contain essential nonlinear structures that are invisible to PCA and MDS.
  Such data call for nonlinear dimensionality reduction approaches.
ISOMAP
  Example of a nonlinear structure (Swiss roll); the figure shows this example.
  Only the geodesic distances reflect the true low-dimensional geometry of the manifold.
  ISOMAP (Isometric Feature Mapping)
    Preserves the intrinsic geometry of the data.
    Uses the geodesic manifold distances between all pairs of points.
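The gap between straight-line and geodesic distance can be illustrated on a simple spiral, which is a 2-D cross-section of a Swiss roll. This is a minimal sketch using only the Python standard library; the spiral `r = t` and the two chosen parameter values are assumptions made for illustration, not data from the slides.

```python
import math

def spiral_point(t):
    """Point on the spiral r = t, a 2-D cross-section of a Swiss roll."""
    return (t * math.cos(t), t * math.sin(t))

def euclidean(p, q):
    """Straight-line distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def arc_length(t0, t1, steps=10000):
    """Approximate the along-curve (geodesic) distance by summing short chords."""
    total = 0.0
    prev = spiral_point(t0)
    for k in range(1, steps + 1):
        cur = spiral_point(t0 + (t1 - t0) * k / steps)
        total += euclidean(prev, cur)
        prev = cur
    return total

# Two points at the same angle, exactly one full turn apart on the spiral.
t_a, t_b = 1.5 * math.pi, 3.5 * math.pi
straight = euclidean(spiral_point(t_a), spiral_point(t_b))
along = arc_length(t_a, t_b)
print(f"Euclidean: {straight:.2f}, geodesic: {along:.2f}")
```

The two points look close in the input space but are far apart along the curve, which is the situation where PCA and MDS mistake distant points for neighbors.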
ISOMAP (Algorithm Description)
  Step 1: Determine which points are neighbors, based on the input-space distances d_X(i,j). These neighborhood relations are represented as a weighted graph G over the data points.
  Step 2: Estimate the geodesic distances between all pairs of points on the manifold by computing their shortest-path distances in the graph G.
  Step 3: Construct an embedding of the data in a d-dimensional Euclidean space Y that best preserves the manifold’s geometry.
ISOMAP (Algorithm Description)
  Step 1: Determine which points are neighbors, based on the input-space distances d_X(i,j), using one of two rules:
    ε-radius: connect i and j if d_X(i,j) < ε
    K-nearest neighbors: connect i to its K nearest neighbors (K = 4 in the figure)
  These neighborhood relations are represented as a weighted graph G over the data points, with edge weights d_X(i,j).
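The two neighborhood rules of Step 1 can be sketched as follows. This is an illustrative, standard-library-only version with O(N^2) distance computation; the function names, the dictionary-of-dictionaries graph representation, and the toy points are assumptions, not the authors' implementation.

```python
import math

def pairwise_distances(points):
    """Full matrix of Euclidean input-space distances."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def knn_graph(dist, k):
    """Undirected K-nearest-neighbor graph as {node: {neighbor: weight}}."""
    n = len(dist)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Sort the other points by distance from i and keep the k closest.
        nearest = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for j in nearest:
            graph[i][j] = dist[i][j]
            graph[j][i] = dist[i][j]   # add the reverse edge to stay undirected
    return graph

def epsilon_graph(dist, eps):
    """Undirected graph connecting every pair closer than eps."""
    n = len(dist)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < eps:
                graph[i][j] = dist[i][j]
                graph[j][i] = dist[i][j]
    return graph

# Toy data: four evenly spaced points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
dist = pairwise_distances(points)
g_knn = knn_graph(dist, k=1)
g_eps = epsilon_graph(dist, eps=1.5)
print(g_knn[4], g_eps[4])
```

In this toy example the K-nearest-neighbor rule still links the distant point to the rest of the graph, while the ε-radius rule leaves it isolated. A disconnected graph matters in Step 2, because shortest-path distances between components are infinite.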
ISOMAP (Algorithm Description)
  Step 2: Estimate the geodesic distances between all pairs of points on the manifold by computing their shortest-path distances in the graph G.
  This can be done using Floyd’s algorithm or Dijkstra’s algorithm.
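Step 2 can be sketched with Floyd's algorithm, which computes all-pairs shortest paths in O(N^3) time. The graph format (a dictionary mapping each node to its weighted neighbors) and the small chain graph are assumptions made for this example.

```python
import math

def floyd_warshall(graph, n):
    """All-pairs shortest-path distances for nodes 0..n-1.

    `graph` maps node -> {neighbor: edge weight}; missing edges count as infinite.
    """
    d = [[0.0 if i == j else graph.get(i, {}).get(j, math.inf) for j in range(n)]
         for i in range(n)]
    for k in range(n):              # allow paths that pass through node k
        for i in range(n):
            dik = d[i][k]
            for j in range(n):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]
    return d

# Hypothetical chain graph 0 - 1 - 2 - 3 with unit edge weights.
chain = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
d_g = floyd_warshall(chain, 4)
print(d_g[0][3])  # path 0 -> 1 -> 2 -> 3 has length 3.0
```

For sparse neighborhood graphs, running Dijkstra's algorithm from every node is usually cheaper than Floyd's algorithm; the cubic version is shown here only because it is the shortest to write.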
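ISOMAP (Algorithm Description)
  Step 3: Construct an embedding of the data in a d-dimensional Euclidean space Y that best preserves the manifold’s geometry.
  Minimize the cost function $E = \lVert \tau(D_G) - \tau(D_Y) \rVert_{L^2}$, where $D_G$ is the matrix of graph shortest-path distances, $D_Y$ is the matrix of Euclidean distances in Y, and $\tau(D) = -HSH/2$ with $S_{ij} = D_{ij}^2$ and $H_{ij} = \delta_{ij} - 1/N$.
  Solution: take the top d eigenvectors of the matrix $\tau(D_G)$.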
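Step 3 amounts to classical MDS applied to the geodesic distance matrix. The sketch below assumes NumPy and follows the operator from the Isomap paper, tau(D) = -HSH/2, where S holds the squared distances and H is the centering matrix. The function name `mds_embedding` and the toy distance matrix are illustrative assumptions.

```python
import numpy as np

def mds_embedding(D_G, d):
    """Embed points from a geodesic distance matrix D_G into d dimensions."""
    n = D_G.shape[0]
    S = D_G ** 2                                  # squared distances S_ij = D_ij^2
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix H_ij = delta_ij - 1/N
    tau = -0.5 * H @ S @ H                        # tau(D_G)
    eigvals, eigvecs = np.linalg.eigh(tau)        # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:d]           # indices of the d largest eigenvalues
    # Scale each eigenvector by the square root of its eigenvalue.
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

# Hypothetical geodesic distances for 5 points spaced evenly along a curve.
pos = np.arange(5, dtype=float)
D_G = np.abs(pos[:, None] - pos[None, :])

Y = mds_embedding(D_G, d=1)
D_Y = np.abs(Y - Y.T)                             # distances in the 1-D embedding
print(np.allclose(D_Y, D_G))
```

Because the toy distances come from points on a line, a one-dimensional embedding should reproduce them, up to a sign flip of the coordinate. On real data the eigenvalue spectrum of tau(D_G) indicates how many dimensions are needed.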
Experimental results
  Faces: face pose and illumination
  Handwriting: bottom loop and top arch
  Plot legend: MDS, open triangles; Isomap, filled circles
Discussion
  Isomap handles nonlinear manifolds.
  Isomap keeps the advantages of PCA and MDS:
    Non-iterative procedure
    Polynomial-time procedure
    Guaranteed convergence
  Isomap represents the global structure of a data set within a single coordinate system.