Spectral Methods for Dimensionality Reduction

Presentation transcript:

1 Spectral Methods for Dimensionality Reduction
ECOE580

2 Introduction
How can we discover low-dimensional structure in high-dimensional data?
Spectral methods
Recover non-linear low-dimensional sub-manifolds
Computationally tractable: shortest path problems, least squares (LSE), semidefinite programming (SDP), etc.

3 Inputs & outputs Given a high-dimensional data X = (x1 x2 …xn) xi є Rd
Compute n corresponding outputs such that yi є Rm Faithful mapping Nearby inputs mapped to nearby outputs m << d Assume inputs are centered on the origin Sum(xi) = 0
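All of the methods below assume this centering step. A minimal numpy sketch (the data matrix X here is a synthetic stand-in, with one input per row):

```python
import numpy as np

# Synthetic stand-in for the inputs: n = 100 points in d = 10 dimensions, one per row.
X = np.random.randn(100, 10)

# Center the inputs on the origin so that sum_i x_i = 0.
X_centered = X - X.mean(axis=0)
```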

4 Spectral Methods
Top or bottom eigenvectors of specially constructed matrices
Linear methods
Graph-based methods: nearest-neighbor relations, edge weighting
Kernel methods

5 Linear Methods: PCA
Preserves the covariance structure of the inputs
Input patterns are projected onto the m-dimensional subspace that minimizes the reconstruction error
E = Σ_i || x_i − Σ_{j=1..m} (x_i · e_j) e_j ||²
Minimizing this error is equivalent to projecting onto the subspace of maximum variance; the discarded directions carry the least variance

6 PCA
The outputs are y_ij = x_i · e_j, where e_j are the top m eigenvectors of the covariance matrix
The subspace captures most of the data's variance
A prominent gap in the eigenvalue spectrum indicates a natural cut-off for the dimensionality m
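A minimal PCA sketch along these lines (function and variable names are illustrative):

```python
import numpy as np

def pca(X, m):
    """Project the inputs (rows of X) onto the top-m principal subspace."""
    X = X - X.mean(axis=0)                    # center on the origin
    C = X.T @ X / X.shape[0]                  # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]     # indices of the top-m eigenvalues
    E = eigvecs[:, order]                     # top-m eigenvectors e_j as columns
    return X @ E, eigvals[order]              # outputs y_ij = x_i . e_j
```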

7 Metric Multidimensional Scaling
Uses inner products between the inputs
The error Σ_ij (x_i · x_j − y_i · y_j)² is minimized by the spectral decomposition of the Gram matrix G_ij = x_i · x_j
The outputs are y_ij = √λ_j e_ji, built from the top m eigenvalues λ_j and eigenvectors e_j of G

8 Metric Multidimensional Scaling
Motivated by preserving pairwise distances
Assuming the inputs are centered at the origin, the Gram matrix G can be written in terms of the matrix of squared pairwise distances S_ij = ||x_i − x_j||²:
G_ij = −½ ( S_ij − (1/n) Σ_k S_ik − (1/n) Σ_k S_kj + (1/n²) Σ_kl S_kl )

9 Metric Multidimensional Scaling
Yields the same outputs as PCA
The distance metric can be generalized to non-linear metrics
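A minimal sketch of metric MDS from squared distances (names are illustrative; the double centering implements the formula on slide 8):

```python
import numpy as np

def metric_mds(S, m):
    """Embed from squared pairwise distances S_ij = ||x_i - x_j||^2 into R^m."""
    n = S.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    G = -0.5 * H @ S @ H                      # Gram matrix (double centering, slide 8)
    eigvals, eigvecs = np.linalg.eigh(G)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]     # top-m eigenvalues and eigenvectors
    lam = np.clip(eigvals[order], 0.0, None)  # guard against small negative values
    return eigvecs[:, order] * np.sqrt(lam)   # outputs y_ij = sqrt(lambda_j) e_ji
```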

10 Graph-Based Methods
If the data set is highly non-linear, linear methods fail
Construct a sparse graph: nodes are input patterns, edges are neighborhood relations
Construct matrices from this graph to capture the underlying low-dimensional structure

11 Graph-Based Methods
Polynomial-time algorithms
Based on shortest paths, least squares (LSE), or semidefinite programming (SDP)

12 IsoMap
Preserves the pairwise distances between inputs as measured along the sub-manifold from which they are sampled
A variant of MDS that uses geodesic distances instead of Euclidean distances

13 IsoMap
Geodesic distance: shortest path through the neighborhood graph
Algorithm
Connect each input to its k nearest neighbors (kNN)
Compute the pairwise distances P between all nodes
by finding the all-pairs shortest paths through the graph

14 IsoMap
Apply MDS to P and keep the top m eigenvalues
The Euclidean distances of the outputs approximate the geodesic distances of the inputs
Formal guarantee of convergence when the data set has no holes (i.e., the underlying region is convex)
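A minimal IsoMap sketch along these lines, assuming the kNN graph is connected; it reuses the metric_mds function sketched above:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, k, m):
    """kNN graph -> all-pairs shortest (geodesic) paths -> metric MDS."""
    # Sparse graph whose edge weights are Euclidean distances to the k nearest neighbors.
    W = kneighbors_graph(X, n_neighbors=k, mode='distance')
    # All-pairs shortest path distances approximate geodesic distances along the manifold.
    P = shortest_path(W, directed=False)
    # Feed the squared geodesic distances to metric MDS (sketched above).
    return metric_mds(P ** 2, m)
```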

15 Maximum Variance Unfolding
Preserves the distances and angles between nearby inputs
Constructs a Gram matrix of the outputs
Unfolds the data by pulling the input patterns as far apart as possible
The final transformation is locally a rotation plus a translation

16 MVU
Compute kNN
Indicator matrix: η_ij = 1 when inputs i and j are neighbors, or when both are in the kNN set of some other input
Distances (and hence angles) are constrained whenever η_ij = 1: ||y_i − y_j||² = ||x_i − x_j||²
Unfold the input patterns by maximizing the variance of the outputs, Σ_i ||y_i||², subject to these constraints and the centering constraint Σ_i y_i = 0

17 MVU
The above optimization can be solved as a semidefinite program (SDP) over the Gram matrix of the outputs, K_ij = y_i · y_j
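A minimal sketch of this SDP using cvxpy (assumed available with a default SDP solver; names are illustrative, and only direct kNN constraints are enforced here):

```python
import numpy as np
import cvxpy as cp
from sklearn.neighbors import kneighbors_graph

def mvu(X, k, m):
    """Maximize output variance subject to preserving local distances (an SDP)."""
    n = X.shape[0]
    eta = kneighbors_graph(X, n_neighbors=k).toarray()  # kNN indicator matrix
    eta = np.maximum(eta, eta.T)                         # symmetrize the neighborhood
    K = cp.Variable((n, n), PSD=True)                    # Gram matrix of the outputs
    constraints = [cp.sum(K) == 0]                       # outputs centered on the origin
    for i in range(n):
        for j in range(i + 1, n):
            if eta[i, j]:
                d2 = float(np.sum((X[i] - X[j]) ** 2))   # preserve ||x_i - x_j||^2
                constraints.append(K[i, i] - 2 * K[i, j] + K[j, j] == d2)
    cp.Problem(cp.Maximize(cp.trace(K)), constraints).solve()
    eigvals, eigvecs = np.linalg.eigh(K.value)           # spectral decomposition of K
    order = np.argsort(eigvals)[::-1][:m]
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))
```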

18 Locally Linear Embedding
Preserves the local linear structure of nearby inputs
Instead of the top m eigenvectors of a dense Gram matrix, it uses the bottom m eigenvectors of a sparse matrix

19 LLE
Compute kNN
Construct a directed graph whose edges indicate nearest-neighbor relations
Assign weights W_ij to the edges (each input and its kNN are viewed as a small linear patch)
The weights are computed by reconstructing each input from its kNN, minimizing Σ_i || x_i − Σ_j W_ij x_j ||²

20 LLE
W_ij = 0 if inputs i and j are not in a kNN relationship
The weights for every input sum to 1: Σ_j W_ij = 1
The sparse matrix W captures the local properties of the data
The same reconstruction relation should hold for the outputs: minimize Φ(Y) = Σ_i || y_i − Σ_j W_ij y_j ||²
subject to the outputs having unit covariance and being centered on the origin

21 LLE
Minimizing Φ(Y) is equivalent to computing the bottom eigenvectors of the sparse matrix (I − W)ᵀ(I − W), discarding the constant bottom eigenvector and keeping the next m
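A minimal LLE sketch along these lines (names are illustrative; the small regularization of the local Gram matrix is a common numerical safeguard):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def lle(X, k, m, reg=1e-3):
    """Reconstruction weights W, then bottom eigenvectors of (I - W)^T (I - W)."""
    n = X.shape[0]
    nbrs = NearestNeighbors(n_neighbors=k + 1).fit(X)
    idx = nbrs.kneighbors(X, return_distance=False)[:, 1:]  # drop each point itself
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[idx[i]] - X[i]                   # neighbors expressed relative to x_i
        C = Z @ Z.T                            # k x k local Gram matrix
        C += reg * np.trace(C) * np.eye(k)     # regularize for numerical stability
        w = np.linalg.solve(C, np.ones(k))     # reconstruction weights for input i
        W[i, idx[i]] = w / w.sum()             # weights sum to 1 for every input
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigvals, eigvecs = np.linalg.eigh(M)       # eigenvalues in ascending order
    return eigvecs[:, 1:m + 1]                 # skip the constant bottom eigenvector
```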

22 Laplacian Eigenmaps
Preserves proximity relations: nearby inputs are mapped to nearby outputs
Similar to LLE
Compute kNN
Construct an undirected graph
Assign positive weights W_ij (uniform, or decaying with distance)

23 LE
Let D denote the diagonal matrix with elements D_ii = Σ_j W_ij
Obtain the outputs by minimizing Σ_ij W_ij || y_i − y_j ||², subject to constraints that fix the scale and remove translations
Nearness is measured by W: large weights pull outputs together

24 LE
The minimization can be solved by finding the bottom m eigenvectors of the generalized eigenvalue problem (D − W) y = λ D y, discarding the constant bottom eigenvector
The matrices are sparse, so the algorithm scales to large data sets
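A minimal Laplacian eigenmaps sketch with uniform weights on a symmetrized kNN graph (names are illustrative; a connected graph is assumed so that D is invertible):

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def laplacian_eigenmaps(X, k, m):
    """Uniform weights on a symmetrized kNN graph, then the bottom generalized eigenvectors."""
    A = kneighbors_graph(X, n_neighbors=k).toarray()
    W = np.maximum(A, A.T)                     # undirected graph, W_ij in {0, 1}
    D = np.diag(W.sum(axis=1))                 # diagonal matrix D_ii = sum_j W_ij
    L = D - W                                  # graph Laplacian
    eigvals, eigvecs = eigh(L, D)              # solve L y = lambda D y, ascending order
    return eigvecs[:, 1:m + 1]                 # skip the constant bottom eigenvector
```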

25 Kernel Functions
Let Φ denote a mapping from R^d into a dot-product feature space H
PCA can then be carried out in feature space using only the kernel matrix K_ij = Φ(x_i) · Φ(x_j) (kernel PCA)
Kernel PCA often uses non-linear kernels: polynomial kernels, Gaussian kernels
However, these generic kernels are not especially well suited for manifold learning
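A minimal kernel PCA sketch (names are illustrative; the input is any precomputed kernel matrix, and the example uses a Gaussian kernel on synthetic data):

```python
import numpy as np

def kernel_pca(K, m):
    """Center the kernel matrix in feature space and take its top-m eigenvectors."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H                              # centered kernel matrix
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:m]
    lam = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(lam)     # feature-space projections

# Example with a Gaussian kernel K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), sigma = 1.
X = np.random.randn(100, 10)
sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
Y = kernel_pca(np.exp(-sq / 2.0), 2)
```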
