Most slides from Expectation Maximization (EM) Northwestern University EECS 395/495 Special Topics in Machine Learning
Most slides from Outline Objective Simple example Complex example
Most slides from Objective Learning with missing/unobservable data J B E A E B A J … Maximum likelihood
Most slides from Objective Learning with missing/unobservable data J B E A E B A J 1 1 ? ? ? 0 … Optimize what?
Most slides from Outline Objective Simple example Complex example
Most slides from Simple example
Most slides from Maximize likelihood
Most slides from Same Problem with Hidden Information Score GradeHidden Observable
Most slides from Same Problem with Hidden Information S G
Most slides from Same Problem with Hidden Information S G
Most slides from EM for our example
Most slides from EM Convergence
Most slides from Generalization X: observable data(score = {h, c, d}) z: missing data(grade = {a, b, c, d}) : model parameters to estimate ( ) E: given, compute the expectation of z M: use z obtained in E step, maximize the likelihood with respect to
Most slides from Outline Objective Simple example Complex example
Most slides from Gaussian Mixtures
Most slides from Gaussian Mixtures Know –Data – - Don’t know –Data label Objective – -
Most slides from The GMM assumption
Most slides from The GMM assumption
Most slides from The GMM assumption
Most slides from The GMM assumption
Most slides from The data generated Coordinates Label
Most slides from Computing the likelihood
Most slides from EM for GMMs
Most slides from EM for GMMs
Most slides from EM for GMMs
Most slides from
Generalization X: observable data z: unobservable data : model parameters to estimate E: given, compute the “expectation” of z M: use z obtained in E step, maximize the likelihood with respect to
Most slides from Exponential family –Yes: normal, exponential, beta, Bernoulli, binomial, multinomial, Poisson… –No: Cauchy and uniform EM using sufficient statistics –S1: computing the expectation of the statistics –S2: set the maximum likelihood For distributions in exponential family
Most slides from What EM really is Maximize expected log likelihood E-step: Determine the expectation M-step: Maximize the expectation above with respect to X: observable data z: missing data
Most slides from Final comments Deal with missing data/latent variables Maximize expected log likelihood Local minima