
1 EM Algorithm and its Applications
slides adapted from Yi Li, WU

2 Outline
Introduction of EM
K-Means → EM
EM Applications
Image Segmentation using EM in Clustering Algorithms

3 Clustering
How can we separate the data into different classes without supervision?

4 Example

5 Clustering
There are K clusters C1, …, CK with means m1, …, mK.
The least-squares error is defined as
D = Σ_{k=1}^{K} Σ_{x_i ∈ C_k} ||x_i − m_k||²
Out of all possible partitions into K clusters, choose the one that minimizes D.
Why don’t we just do this? If we could, would we get meaningful objects?
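As a concrete illustration (not part of the original slides), a minimal NumPy sketch of this objective, assuming the data points, hard cluster assignments, and cluster means are given as arrays:

```python
import numpy as np

def least_squares_error(X, labels, means):
    """Sum of squared distances of each point to its assigned cluster mean.

    X:      (n, d) array of data points
    labels: (n,) array of cluster indices in {0, ..., K-1}
    means:  (K, d) array of cluster means
    """
    diffs = X - means[labels]          # difference of each point from its own cluster mean
    return float(np.sum(diffs ** 2))   # D = sum_k sum_{x_i in C_k} ||x_i - m_k||^2
```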

6 Color Clustering by K-means Algorithm
Form K-means clusters from a set of n-dimensional vectors:
1. Set ic (iteration count) to 1.
2. Choose randomly a set of K means m1(1), …, mK(1).
3. For each vector xi, compute the distance D(xi, mk(ic)) for k = 1, …, K and assign xi to the cluster Cj with the nearest mean.
4. Update the means to get m1(ic+1), …, mK(ic+1); increment ic by 1.
5. Repeat steps 3 and 4 until Ck(ic) = Ck(ic+1) for all k, i.e. until the cluster assignments stop changing.
A minimal code sketch of these steps follows.
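This is a rough NumPy sketch of the loop above (not from the original slides); it initializes the means by sampling K points and stops when the assignments stop changing:

```python
import numpy as np

def kmeans(X, K, rng=None, max_iter=100):
    """Basic K-means: X is an (n, d) array, K the number of clusters."""
    rng = np.random.default_rng(rng)
    means = X[rng.choice(len(X), size=K, replace=False)].astype(float)  # step 2: random initial means
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # step 3: assign each point to the nearest mean
        dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):   # step 5: assignments unchanged -> stop
            break
        labels = new_labels
        # step 4: recompute each mean from its assigned points
        for k in range(K):
            if np.any(labels == k):
                means[k] = X[labels == k].mean(axis=0)
    return means, labels
```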

7 K-Means Classifier
The classifier (K-Means) takes the pixel feature vectors x1={r1, g1, b1}, x2={r2, g2, b2}, …, xi={ri, gi, bi} as input, uses the cluster parameters (means m1 for C1, m2 for C2, …, mk for Ck), and outputs the classification results x1 → C(x1), x2 → C(x2), …, xi → C(xi).

8 K-Means Classifier (Cont.)
Input (known): the feature vectors x1={r1, g1, b1}, x2={r2, g2, b2}, …, xi={ri, gi, bi}.
Output (unknown): the classification results x1 → C(x1), x2 → C(x2), …, xi → C(xi), and the cluster parameters m1 for C1, m2 for C2, …, mk for Ck.

9 K-Means: Alternating Estimation
Starting from an initial guess of the cluster parameters m1, m2, …, mk and the known feature vectors x1={r1, g1, b1}, x2={r2, g2, b2}, …, xi={ri, gi, bi}, the algorithm alternates between the two unknowns:
Classification Results (1): C(x1), C(x2), …, C(xi) → Cluster Parameters (1): m1, m2, …, mk
Classification Results (2): C(x1), C(x2), …, C(xi) → Cluster Parameters (2): m1, m2, …, mk
…
Classification Results (ic): C(x1), C(x2), …, C(xi) → Cluster Parameters (ic): m1, m2, …, mk

10 K-Means (Cont.)
Boot Step: Initialize K clusters C1, …, CK; each cluster Cj is represented by its mean mj.
Iteration Step: Estimate the cluster of each data point (xi → C(xi)), then re-estimate the cluster parameters (the means mj).

11 K-Means Example

12 K-Means Example

13 Image Segmentation by K-Means
1. Select a value of K.
2. Select a feature vector for every pixel (color, texture, position, or a combination of these).
3. Define a similarity measure between feature vectors (usually Euclidean distance).
4. Apply the K-means algorithm.
5. Apply a connected-components algorithm.
6. Merge any component smaller than some threshold into the adjacent component that is most similar to it.
A sketch of this pipeline is given below.
* From Marc Pollefeys COMP
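A rough sketch of steps 1–5 for a color image, assuming scikit-learn and SciPy are available (the small-component merge of step 6 is only indicated by a comment; function and parameter names are illustrative, not from the slides):

```python
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def segment_image(image, K=4):
    """image: (H, W, 3) float array of RGB values. Returns per-pixel cluster and region labels."""
    H, W, _ = image.shape
    features = image.reshape(-1, 3)                                    # step 2: one color feature per pixel
    clusters = KMeans(n_clusters=K, n_init=10).fit_predict(features)   # steps 3-4: Euclidean K-means
    clusters = clusters.reshape(H, W)

    # step 5: connected components within each cluster
    regions = np.zeros((H, W), dtype=int)
    next_label = 1
    for k in range(K):
        labeled, n = ndimage.label(clusters == k)
        regions[labeled > 0] = labeled[labeled > 0] + next_label - 1
        next_label += n

    # step 6 (omitted here): merge regions smaller than a threshold into their most similar neighbor
    return clusters, regions
```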

14 Results of K-Means Clustering:
I gave each pixel the mean intensity or mean color of its cluster --- this is basically just vector quantizing the image intensities/colors. Notice that there is no requirement that clusters be spatially localized, and they're not.
[Figure: original image, clusters on intensity, clusters on color --- K-means clustering using intensity alone and color alone]
* From Marc Pollefeys COMP

15 K-means: Challenges
K-means will converge, but not necessarily to the global minimum of the objective function.
Variations: search for an appropriate number of clusters by applying k-means with different k and comparing the results.

16 K-means Variants
Different ways to initialize the means
Different stopping criteria
Dynamic methods for determining the right number of clusters (K) for a given image
The EM Algorithm: a probabilistic formulation of K-means

17 EM Algorithm
Like K-means, but with soft assignment (vs. hard assignment):
Assign each point partly to all clusters, based on the probability that it belongs to each.
Compute weighted averages (centers and covariances).

18 K-means vs. EM
Cluster representation: K-means uses the mean; EM uses the mean, variance, and weight.
Cluster initialization: K-means randomly selects K means; EM initializes K Gaussian distributions.
Expectation: K-means assigns each point to the closest mean; EM soft-assigns each point to each distribution.
Maximization: K-means computes the means of the current clusters; EM computes new parameters of each distribution.

19 Notation: Normal distribution 1D case
N(μ, σ) is a 1D normal (Gaussian) distribution with mean μ and standard deviation σ (so the variance is σ²).

20 Normal distribution: Multi-dimensional case
N(μ, Σ) is a multivariate Gaussian distribution with mean μ and covariance matrix Σ.
What is a covariance matrix? For RGB features it is the 3×3 matrix

        R       G       B
R     σ_R²    σ_RG    σ_RB
G     σ_GR    σ_G²    σ_GB
B     σ_BR    σ_BG    σ_B²

where
variance(X): σ_X² = (1/N) Σ (x_i − μ)²
cov(X, Y) = (1/N) Σ (x_i − μ_X)(y_i − μ_Y)
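For illustration (not part of the original slides), the covariance matrix of a set of RGB pixels can be computed with NumPy; note that np.cov divides by N−1 by default, while the definition above divides by N (bias=True gives the 1/N version):

```python
import numpy as np

pixels = np.random.rand(1000, 3)                  # 1000 pixels, columns are R, G, B
mu = pixels.mean(axis=0)                          # mean vector (mu_R, mu_G, mu_B)
Sigma = np.cov(pixels, rowvar=False, bias=True)   # 3x3 covariance matrix, 1/N normalization
print(mu, Sigma)
```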

21 K-Means → EM with GMM: The Intuition (1)
Instead of making a "hard" decision about which class a pixel belongs to, we use probability theory and assign pixels to classes probabilistically.
Using Bayes' rule, posterior ∝ likelihood × prior; we want to maximize the posterior.

22 K-Means → EM: The Intuition (2)
We model the likelihood as a Gaussian (normal) distribution.

23 K-Means → EM: The Intuition (3)
Instead of modeling the color distribution with one Gaussian, we model it as a weighted sum of Gaussians.
We want to find the parameters that maximize the probability of the observed data under this mixture model.
We use the general trick of taking the logarithm of the probability function and maximizing that. This is called Maximum Likelihood Estimation (MLE).
We solve this maximization in 2 steps (E and M).
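The formula on this slide is an image that is not preserved in the transcript; for reference, the standard GMM log-likelihood being maximized is:

```latex
\log p(X \mid \Theta) \;=\; \sum_{i=1}^{N} \log \sum_{j=1}^{K} \alpha_j \, N(x_i \mid \mu_j, \Sigma_j),
\qquad \sum_{j=1}^{K} \alpha_j = 1
```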

24 K-Means → EM with Gaussian Mixture Models (GMM)
Boot Step: Initialize K clusters C1, …, CK.
Iteration Step:
Expectation: estimate the (soft) cluster assignment of each data point.
Maximization: re-estimate the cluster parameters (μj, Σj) and P(Cj) for each cluster j.
A compact sketch of this loop is given below.
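A minimal NumPy/SciPy sketch of this boot/iteration loop (not from the original slides; a small regularizer is added to keep the covariances invertible, and the number of iterations is an illustrative choice):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=50, rng=None):
    """EM for a Gaussian mixture. X: (n, d) data. Returns weights, means, covariances, responsibilities."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Boot step: K random points as means, identity covariances, uniform weights
    means = X[rng.choice(n, size=K, replace=False)].astype(float)
    covs = np.array([np.eye(d) for _ in range(K)])
    weights = np.full(K, 1.0 / K)

    for _ in range(n_iter):
        # E-step: responsibilities p(C_j | x_i)
        dens = np.column_stack([
            weights[j] * multivariate_normal.pdf(X, mean=means[j], cov=covs[j])
            for j in range(K)
        ])                                        # shape (n, K)
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means and covariances from the responsibilities
        Nj = resp.sum(axis=0)                     # effective number of points per cluster
        weights = Nj / n
        means = (resp.T @ X) / Nj[:, None]
        for j in range(K):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / Nj[j] + 1e-6 * np.eye(d)
    return weights, means, covs, resp
```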

25 EM Classifier
The classifier (EM) takes the pixel feature vectors x1={r1, g1, b1}, x2={r2, g2, b2}, …, xi={ri, gi, bi} as input, uses the cluster parameters (μ1, Σ1), p(C1) for C1; (μ2, Σ2), p(C2) for C2; …; (μk, Σk), p(Ck) for Ck, and outputs the soft classification results p(Cj|x1), p(Cj|x2), …, p(Cj|xi).

26 EM Classifier (Cont.)
Input (known): the feature vectors x1={r1, g1, b1}, x2={r2, g2, b2}, …, xi={ri, gi, bi}.
Output (unknown): the cluster parameters (μj, Σj), p(Cj) for each cluster Cj, and the classification results p(Cj|xi).

27 Expectation Step
Input (known): the feature vectors x1={r1, g1, b1}, x2={r2, g2, b2}, …, xi={ri, gi, bi}.
Input (current estimate): the cluster parameters (μj, Σj), p(Cj) for each cluster Cj.
Output: the classification results p(Cj|xi), i.e. the responsibility of each cluster for each pixel.

28 Maximization Step
Input (known): the feature vectors x1={r1, g1, b1}, x2={r2, g2, b2}, …, xi={ri, gi, bi}.
Input (current estimate): the classification results p(Cj|xi) from the E-step.
Output: updated cluster parameters (μj, Σj), p(Cj) for each cluster Cj.

29 EM Algorithm
Boot Step: Initialize K clusters C1, …, CK with parameters (μj, Σj) and P(Cj) for each cluster j.
Iteration Step: repeat the Expectation Step and the Maximization Step until convergence.

30 EM Demo Video of Image Classification with EM Online Tutorial

31 EM Applications
Blobworld: Image Segmentation Using Expectation-Maximization and its Application to Image Querying

32 Image Segmentation using EM
Step 1: Feature extraction.
Step 2: Image segmentation using EM.
In the literature, EM is often used to segment the image into multiple regions, where each region usually corresponds to one Gaussian model.
In our project we segment one foreground region from the background: in training, we crop the foreground pixels and train a Gaussian Mixture Model on those pixels; in testing, we evaluate the conditional probability of each pixel under that model and threshold on that value. A sketch of this train-then-threshold approach is given below.
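A minimal sketch of the train-then-threshold idea, assuming scikit-learn is available; the number of components and the threshold value are illustrative choices, not from the slides:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_foreground_model(foreground_pixels, K=5):
    """Fit a GMM to cropped foreground pixel features, shape (n, d)."""
    return GaussianMixture(n_components=K).fit(foreground_pixels)

def segment(image, gmm, log_prob_threshold=-10.0):
    """Label pixels whose log-likelihood under the foreground GMM exceeds a threshold."""
    H, W, d = image.shape
    log_prob = gmm.score_samples(image.reshape(-1, d))   # per-pixel log p(x | foreground model)
    return (log_prob >= log_prob_threshold).reshape(H, W)
```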

33 Symbols
The feature vector for pixel i is called xi.
There are K Gaussian models; K is given.
The j-th model has a Gaussian distribution with parameters θj = (μj, Σj).
The αj's are the weights (which sum to 1) of the Gaussians.
Θ is the collection of parameters: Θ = (α1, …, αK, θ1, …, θK).

34 Initialization
Each of the K Gaussians will have parameters θj = (μj, Σj), where μj is the mean of the j-th Gaussian and Σj is its covariance matrix.
The covariance matrices are initialized to the identity matrix.
The means can be initialized by finding the average feature vector in each of K windows in the image; this is a data-driven initialization.

35 E-Step (referred to as responsibilities)
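The E-step formula on this slide is an image that is not preserved in the transcript; for reference, the standard GMM responsibility computation, written with the symbols defined on slide 33, is:

```latex
p(C_j \mid x_i, \Theta^{(t)}) \;=\;
\frac{\alpha_j^{(t)} \, N\!\left(x_i \mid \mu_j^{(t)}, \Sigma_j^{(t)}\right)}
     {\sum_{l=1}^{K} \alpha_l^{(t)} \, N\!\left(x_i \mid \mu_l^{(t)}, \Sigma_l^{(t)}\right)}
```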

36 M-Step
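The M-step formulas are likewise images in the original; the standard GMM parameter updates, using the same symbols, are:

```latex
\alpha_j^{(t+1)} = \frac{1}{N}\sum_{i=1}^{N} p(C_j \mid x_i, \Theta^{(t)}), \qquad
\mu_j^{(t+1)} = \frac{\sum_{i=1}^{N} p(C_j \mid x_i, \Theta^{(t)})\, x_i}
                     {\sum_{i=1}^{N} p(C_j \mid x_i, \Theta^{(t)})}, \qquad
\Sigma_j^{(t+1)} = \frac{\sum_{i=1}^{N} p(C_j \mid x_i, \Theta^{(t)})\,
                        (x_i - \mu_j^{(t+1)})(x_i - \mu_j^{(t+1)})^{\top}}
                       {\sum_{i=1}^{N} p(C_j \mid x_i, \Theta^{(t)})}
```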

37 Sample Results for Scene Segmentation

38 Finding good feature vectors

39 Fitting Gaussians in RGB space

40 Fitting Gaussians in other spaces
Experiment with other color spaces.
Experiment with dimensionality reduction (PCA).
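For example (an illustrative sketch, not from the slides), pixel features can be projected to a lower-dimensional space with PCA before fitting the mixture, assuming scikit-learn; the component counts are arbitrary placeholders:

```python
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def fit_gmm_in_pca_space(features, n_components_pca=2, K=5):
    """Reduce pixel features with PCA, then fit a GMM in the reduced space."""
    pca = PCA(n_components=n_components_pca)
    reduced = pca.fit_transform(features)             # project onto the top principal components
    gmm = GaussianMixture(n_components=K).fit(reduced)
    return pca, gmm
```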

