
1 Advanced Artificial Intelligence Lecture 8: Advanced machine learning

2 Outline  Clustering  K-Means  EM  Spectral Clustering  Dimensionality Reduction

3 The unsupervised learning problem  Many data points, no labels

4 K-Means  Many data points, no labels

5 K-Means  Choose a fixed number of clusters  Choose cluster centers and point-cluster allocations to minimize error  Can't do this by exhaustive search, because there are too many possible allocations  Algorithm: alternate two steps (sketched below)  Fix the cluster centers; allocate each point to its closest center  Fix the allocation; recompute each center as the mean of its points  x can be any set of features for which we can compute a distance (be careful about scaling)
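A minimal sketch of this alternation (Lloyd's algorithm), assuming the data are rows of a NumPy array X; the names kmeans, n_iters, and seed are illustrative:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-Means (Lloyd's algorithm): alternate between allocating points
    to the closest center and recomputing each center as the mean of its points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # initialize from the data
    for _ in range(n_iters):
        # fix cluster centers; allocate each point to its closest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # fix the allocation; recompute each center as the mean of its points
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):  # stop when the centers no longer move
            break
        centers = new_centers
    return centers, labels
```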

6 K-Means

7 * From Marc Pollefeys, COMP 256, 2003

8 K-Means  K-Means is an approximation to EM  Model (hypothesis space): a mixture of N Gaussians  Latent variables: the correspondence between data points and Gaussians  We notice:  Given the mixture model, it is easy to calculate the correspondence  Given the correspondence, it is easy to estimate the mixture model

9 Expectation Maximization: Idea  Data are generated from a mixture of Gaussians  Latent variables: the correspondence between data items and Gaussians

10 Generalized K-Means (EM)

11 Gaussians

12 ML Fitting of Gaussians

13 Learning a Gaussian Mixture (with known covariance)  E-Step  M-Step
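A hedged reconstruction of the E-step and M-step the slide refers to, assuming k Gaussians with a known, shared spherical covariance σ² (the classic known-covariance formulation); z_{ij} is the latent correspondence of point x_i to Gaussian j:

```latex
% E-step: expected responsibility of Gaussian j for point x_i
E[z_{ij}] \;=\; \frac{\exp\!\big(-\|x_i-\mu_j\|^2 / 2\sigma^2\big)}
                     {\sum_{l=1}^{k}\exp\!\big(-\|x_i-\mu_l\|^2 / 2\sigma^2\big)}

% M-step: re-estimate each mean as the responsibility-weighted average of the data
\mu_j \;\leftarrow\; \frac{\sum_{i=1}^{n} E[z_{ij}]\, x_i}{\sum_{i=1}^{n} E[z_{ij}]}
```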

14 Expectation Maximization  Converges!  Proof [Neal/Hinton, McLachlan/Krishnan]:  Each E/M step does not decrease the data likelihood  Converges to a local maximum of the likelihood or a saddle point  But therefore subject to local optima
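A minimal sketch of the corresponding EM loop under the same known-covariance assumption (the names em_gmm_known_cov, sigma, and n_iters are illustrative); each iteration is the soft, "generalized K-Means" of slide 10 and, per the result above, never decreases the data likelihood:

```python
import numpy as np

def em_gmm_known_cov(X, k, sigma=1.0, n_iters=100, seed=0):
    """EM for a mixture of k Gaussians with known, shared covariance sigma^2 * I."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # initialize means from the data
    for _ in range(n_iters):
        # E-step: responsibility of each Gaussian for each point (soft correspondence)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        resp = np.exp(-d2 / (2 * sigma ** 2))
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: each mean becomes the responsibility-weighted average of the data
        mu = (resp.T @ X) / resp.sum(axis=0)[:, None]
    return mu, resp
```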

15 Practical EM  Number of clusters unknown  Suffers (badly) from local minima  Algorithm:  Start a new cluster center if many points are "unexplained"  Kill a cluster center that doesn't contribute  (Use an AIC/BIC criterion for all of this, if you want to be formal; see the sketch below)
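For the AIC/BIC criterion mentioned above, a sketch of the standard BIC score that could be compared across candidate numbers of clusters (log_likelihood, n_params, and n_points are assumed to come from the fitted mixture):

```python
import numpy as np

def bic(log_likelihood, n_params, n_points):
    """Bayesian Information Criterion (lower is better): penalizes each extra
    mixture parameter by log(n_points)."""
    return -2.0 * log_likelihood + n_params * np.log(n_points)
```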

16 Spectral Clustering

17 Spectral Clustering

18 Spectral Clustering: Overview  Data → Similarities → Block-Detection

19 Eigenvectors and Blocks  Block matrices have block eigenvectors: the 4×4 similarity matrix made of two 2×2 blocks of ones has eigenvalues λ1 = λ2 = 2 and λ3 = λ4 = 0, and its leading eigenvectors are each supported on one block (entries ≈ 0.71 inside the block, 0 outside)  Near-block matrices have near-block eigenvectors: perturbing the off-diagonal blocks by ±0.2 gives eigenvalues λ1 = λ2 ≈ 2.02 and λ3 = λ4 ≈ −0.02, with eigenvectors that are small perturbations of the block eigenvectors [Ng et al., NIPS 02]

20 Spectral Space  Can put items into blocks by their coordinates in the leading eigenvectors e1 and e2  The resulting clusters are independent of the row ordering: reordering the rows of the affinity matrix permutes the eigenvector entries in the same way, so the same points end up in the same blocks
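A minimal sketch of the Data → Similarities → Block-Detection pipeline, assuming a Gaussian affinity and reusing the kmeans sketch from slide 5; this is simplified relative to Ng et al., which also normalizes the affinity matrix and the rows of the eigenvector matrix:

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0):
    """Cluster by the block structure of the affinity matrix's top eigenvectors."""
    # affinity: close points get values near 1, distant points near 0 (scale sigma)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    A = np.exp(-d2 / (2 * sigma ** 2))
    # top-k eigenvectors of the symmetric affinity matrix = spectral-space coordinates
    eigvals, eigvecs = np.linalg.eigh(A)          # eigenvalues in ascending order
    spectral_coords = eigvecs[:, -k:]             # columns for the k largest eigenvalues
    # points from the same block land close together in spectral space
    centers, labels = kmeans(spectral_coords, k)  # kmeans as sketched on slide 5
    return labels
```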

21 The Spectral Advantage  The key advantage of spectral clustering is the spectral space representation.

22 Measuring Affinity  Intensity  Texture  Distance
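The affinities on this slide are typically exponentiated squared distances in each cue (intensity difference, texture-filter distance, spatial distance); a hedged reconstruction of the common form, with σ the scale parameter the next slide refers to and f(x) the chosen feature of point x:

```latex
\mathrm{aff}(x, y) \;=\; \exp\!\left(-\frac{\|f(x)-f(y)\|^{2}}{2\sigma^{2}}\right)
```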

23 Scale affects affinity


25 Dimensionality Reduction

26 Dimensionality Reduction with PCA

27 Linear: Principal Components  Fit a multivariate Gaussian  Compute the eigenvectors of its covariance  Project onto the eigenvectors with the largest eigenvalues
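A minimal sketch of these three steps, assuming the data matrix X has one point per row and n_components is the number of principal directions to keep:

```python
import numpy as np

def pca(X, n_components):
    """Fit a multivariate Gaussian (mean and covariance), take the eigenvectors of
    the covariance, and project onto the directions with the largest eigenvalues."""
    mean = X.mean(axis=0)
    Xc = X - mean                                     # center the data
    cov = np.cov(Xc, rowvar=False)                    # covariance of the fitted Gaussian
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    components = eigvecs[:, ::-1][:, :n_components]   # largest-eigenvalue directions
    return Xc @ components, components                # projected data, principal axes
```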

