
1 Made by: Maor Levy, Temple University, 2012

2 Many data points, no labels

3 Example: Google Street View

4 Many data points, no labels

5 K-means: Choose a fixed number of clusters. Choose cluster centers and point-to-cluster allocations to minimize the error; this can't be done by exhaustive search, because there are too many possible allocations. Algorithm: (1) fix the cluster centers and allocate each point to the closest cluster; (2) fix the allocation and compute the best cluster centers. Here x can be any set of features for which we can compute a distance (be careful about scaling). A code sketch follows. * From Marc Pollefeys, COMP 256, 2003
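
The two alternating steps above translate almost directly into code. A minimal sketch, assuming Euclidean distance and NumPy arrays (the function name and defaults are illustrative, not from the slides):

```python
import numpy as np

def kmeans(x, k, n_iters=100, seed=0):
    """Plain k-means on data x of shape (n, d): alternate the two steps above."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]   # pick k data points as initial centers
    for _ in range(n_iters):
        # Step 1: fix the centers; allocate each point to the closest center.
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: fix the allocation; the best center of a cluster is its mean.
        new_centers = np.array([x[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):                # stop when nothing moves
            break
        centers = new_centers
    return centers, labels
```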

6

7

8 Results of K-means clustering: using intensity alone and color alone (original image; clusters on intensity; clusters on color). * From Marc Pollefeys, COMP 256, 2003

9 K-means is an approximation to EM. Model (hypothesis space): a mixture of N Gaussians. Latent variables: the correspondence between data points and Gaussians. We notice that, given the mixture model, it is easy to calculate the correspondence, and given the correspondence, it is easy to estimate the mixture model.

10 The data are generated from a mixture of Gaussians. Latent variables: the correspondence between data items and Gaussians.

11

12

13

14 The E-step and M-step (sketched in code below).
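
A rough sketch of the E-step/M-step alternation for a mixture of Gaussians. This is not the slides' own code; the initialization, the small regularization term, and the iteration count are assumptions:

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(x, k, n_iters=50, seed=0):
    """EM for a mixture of k Gaussians on data x of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    means = x[rng.choice(n, size=k, replace=False)]          # init means from the data
    covs = np.array([np.cov(x.T) + 1e-6 * np.eye(d)] * k)    # init covariances
    weights = np.full(k, 1.0 / k)                            # uniform mixing weights
    for _ in range(n_iters):
        # E-step: soft correspondence (responsibility) of each point to each Gaussian.
        resp = np.column_stack([w * multivariate_normal.pdf(x, m, c)
                                for w, m, c in zip(weights, means, covs)])
        resp /= resp.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate weights, means, covariances from the weighted points.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return weights, means, covs, resp
```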

15 EM converges! Proof [Neal/Hinton; McLachlan/Krishnan]: the E and M steps never decrease the data likelihood, and the algorithm converges at a local maximum of the likelihood or at a saddle point. But it is still subject to local optima.

16 http://www.ece.neu.edu/groups/rpl/kmeans/

17 When the number of clusters is unknown: the method suffers (badly) from local minima. Algorithm: start a new cluster center if many points are "unexplained"; kill a cluster center that doesn't contribute. (Use an AIC/BIC criterion for all of this if you want to be formal; see the sketch below.)
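
One way to make the AIC/BIC suggestion concrete is to fit mixtures for a range of K and keep the lowest-BIC model. A sketch using scikit-learn's GaussianMixture; the candidate range and n_init are arbitrary choices, not from the slide:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def pick_k_by_bic(x, k_max=10, seed=0):
    """Fit mixtures with 1..k_max components and keep the model with the lowest BIC."""
    best_k, best_bic, best_model = None, np.inf, None
    for k in range(1, k_max + 1):
        gm = GaussianMixture(n_components=k, n_init=5, random_state=seed).fit(x)
        bic = gm.bic(x)                       # BIC penalizes extra components
        if bic < best_bic:
            best_k, best_bic, best_model = k, bic, gm
    return best_k, best_model
```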

18

19

20 Spectral clustering: Data → Similarities → Block detection. * Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group, Stanford University

21 Block matrices have block eigenvectors, and near-block matrices have near-block eigenvectors [Ng et al., NIPS 02]. For the block similarity matrix
[ 1  1  0  0 ]
[ 1  1  0  0 ]
[ 0  0  1  1 ]
[ 0  0  1  1 ]
an eigensolver gives eigenvalues λ1 = 2, λ2 = 2, λ3 = 0, λ4 = 0, with block eigenvectors e1 = [.71, .71, 0, 0] and e2 = [0, 0, .71, .71]. For the near-block matrix
[ 1    1   .2   0  ]
[ 1    1    0  -.2 ]
[ .2   0    1   1  ]
[ 0   -.2   1   1  ]
the eigenvalues become λ1 = 2.02, λ2 = 2.02, λ3 = -0.02, λ4 = -0.02, with near-block eigenvectors e1 = [.71, .69, .14, 0] and e2 = [0, -.14, .69, .71]. * Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group, Stanford University
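
A quick numerical check of these numbers (a NumPy sketch; the matrices are copied from the slide):

```python
import numpy as np

# Block similarity matrix from the slide (two 2x2 blocks of ones).
A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)

# Near-block matrix: the same blocks plus small off-block entries.
B = np.array([[1,   1,  .2,  0],
              [1,   1,   0, -.2],
              [.2,  0,   1,  1],
              [0,  -.2,  1,  1]])

for name, M in [("block", A), ("near-block", B)]:
    vals, vecs = np.linalg.eigh(M)            # symmetric eigendecomposition
    order = np.argsort(vals)[::-1]            # sort eigenvalues in decreasing order
    print(name, "eigenvalues:", np.round(vals[order], 2))
    print("top two eigenvectors (columns):")
    print(np.round(vecs[:, order[:2]], 2))    # signs may be flipped relative to the slide
```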

22 We can put items into blocks by their eigenvector entries, and the resulting clusters are independent of the row ordering. For the near-block matrix above, the top eigenvectors are e1 = [.71, .69, .14, 0] and e2 = [0, -.14, .69, .71]; if the rows and columns are permuted (for example, rows/columns 2 and 3 swapped, giving
[ 1   .2   1    0  ]
[ .2   1   0    1  ]
[ 1    0   1   -.2 ]
[ 0    1  -.2   1  ]
), the eigenvector entries are permuted in the same way (e1 = [.71, .14, .69, 0], e2 = [0, .69, -.14, .71]), so each item still lands in the same block. * Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group, Stanford University

23 The key advantage of spectral clustering is the spectral-space representation: items are clustered by their coordinates in the top eigenvectors rather than in the original feature space. * Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group, Stanford University
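
A compact sketch of the pipeline from slides 20-23: build a Gaussian similarity matrix, take its top eigenvectors as the spectral-space representation, and run k-means there. The affinity bandwidth `sigma` is an assumption, and using the raw affinity matrix (rather than a normalized Laplacian, as in Ng et al.) follows the simplified recipe on these slides:

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clusters(x, k, sigma=1.0, seed=0):
    """Cluster x (n, d) via the top-k eigenvectors of a Gaussian affinity matrix."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)   # pairwise squared distances
    A = np.exp(-d2 / (2 * sigma ** 2))                         # similarity ("near-block") matrix
    vals, vecs = np.linalg.eigh(A)                             # symmetric eigendecomposition
    embedding = vecs[:, np.argsort(vals)[::-1][:k]]            # spectral-space representation
    # k-means in the spectral space assigns items to blocks.
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(embedding)
```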

24 (Figure: intensity, texture, distance.) * From Marc Pollefeys, COMP 256, 2003

25 * From Marc Pollefeys, COMP 256, 2003

26 * From Marc Pollefeys, COMP 256, 2003

27 M. Brand, MERL

28

29 PCA: fit a multivariate Gaussian to the data, compute the eigenvectors of its covariance matrix, and project the data onto the eigenvectors with the largest eigenvalues. (Sketch below.)
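
These three steps map onto a few lines of linear algebra. A minimal sketch (the function name and NumPy-based implementation are illustrative):

```python
import numpy as np

def pca_project(x, n_components):
    """Fit a Gaussian to x (n, d) and project onto the top eigenvectors of its covariance."""
    mean = x.mean(axis=0)                                  # mean of the fitted Gaussian ("mean face")
    cov = np.cov((x - mean).T)                             # covariance of the fitted Gaussian
    vals, vecs = np.linalg.eigh(cov)                       # eigenvectors of the covariance
    top = vecs[:, np.argsort(vals)[::-1][:n_components]]   # directions with the largest eigenvalues
    return (x - mean) @ top, top, mean                     # low-dimensional coordinates, basis, mean
```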

30 Mean face (after alignment). Slide credit: Santiago Serrano

31 Slide credit: Santiago Serrano

32 Nonlinear dimensionality reduction: Isomap and Locally Linear Embedding. Isomap: M. Balasubramanian and E. Schwartz, Science.
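
Both methods are available off the shelf; a brief usage sketch with scikit-learn (the placeholder data and parameter values are arbitrary):

```python
import numpy as np
from sklearn.manifold import Isomap, LocallyLinearEmbedding

x = np.random.rand(500, 3)   # placeholder data: 500 points in 3 dimensions

x_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(x)                   # geodesic distances
x_lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2).fit_transform(x)   # local linear fits
print(x_iso.shape, x_lle.shape)   # (500, 2) (500, 2)
```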

33

34 References: Sebastian Thrun and Peter Norvig, Artificial Intelligence, Stanford University, http://www.stanford.edu/class/cs221/notes/cs221-lecture6-fall11.pdf

