ECE 7251: Signal Detection and Estimation

ECE 7251: Signal Detection and Estimation, Spring 2002. Prof. Aaron Lanterman, Georgia Institute of Technology. Lecture 14, 2/6/02: “The” Expectation-Maximization Algorithm (Theory)

Convergence of the EM Algorithm. We'd like to prove that the likelihood goes up with each iteration. Recall the key identity from last lecture; take logarithms of both sides and rearrange:
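A sketch of that identity in standard EM notation (observed/incomplete data y, complete data z, parameter \theta; these symbols are assumptions on my part, and the lecture's notation may differ). For any complete-data realization z consistent with y,

  p_Y(y;\theta) = \frac{p_Z(z;\theta)}{p_{Z|Y}(z \mid y;\theta)},

so taking logarithms and rearranging gives

  \ln p_Y(y;\theta) = \ln p_Z(z;\theta) - \ln p_{Z|Y}(z \mid y;\theta).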

Deriving the Smiley Equality. Multiply both sides by the conditional density of z given y at the current parameter estimate and integrate with respect to z; this simplifies to an equality we'll call ☺.
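A sketch of this step under the same assumed notation, with \theta^{(k)} the current estimate (the labels Q and V for the two integrals are mine; the lecture may use other names). Multiplying by p_{Z|Y}(z \mid y;\theta^{(k)}), which integrates to one, and noting the left side does not depend on z:

  \ln p_Y(y;\theta)
    = \underbrace{\int p_{Z|Y}(z \mid y;\theta^{(k)}) \ln p_Z(z;\theta)\, dz}_{Q(\theta;\theta^{(k)})}
    - \underbrace{\int p_{Z|Y}(z \mid y;\theta^{(k)}) \ln p_{Z|Y}(z \mid y;\theta)\, dz}_{V(\theta;\theta^{(k)})}.
    \qquad (☺)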

Using the Smiley Equality. Evaluate ☺ at the new estimate (call this line ♠) and at the current estimate (call this line ♣), then subtract ♣ from ♠.
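Continuing the sketch in the same assumed notation, evaluating ☺ at \theta = \theta^{(k+1)} (♠) and at \theta = \theta^{(k)} (♣) and subtracting gives

  \ln p_Y(y;\theta^{(k+1)}) - \ln p_Y(y;\theta^{(k)})
    = \left[ Q(\theta^{(k+1)};\theta^{(k)}) - Q(\theta^{(k)};\theta^{(k)}) \right]
    - \left[ V(\theta^{(k+1)};\theta^{(k)}) - V(\theta^{(k)};\theta^{(k)}) \right].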

Use A Really Helpful Inequality. Let's focus on this final term for a little bit.
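One standard choice of inequality at this point (the lecture's exact form may differ) is

  \ln u \le u - 1 \quad \text{for } u > 0, \text{ with equality iff } u = 1,

a special case of Jensen's inequality, equivalent to the nonnegativity of the Kullback-Leibler divergence.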

Zeroness of the Last Term
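A sketch of why the last term can only help, under the same assumed notation and using \ln u \le u - 1:

  V(\theta^{(k+1)};\theta^{(k)}) - V(\theta^{(k)};\theta^{(k)})
    = \int p_{Z|Y}(z \mid y;\theta^{(k)}) \ln \frac{p_{Z|Y}(z \mid y;\theta^{(k+1)})}{p_{Z|Y}(z \mid y;\theta^{(k)})}\, dz
    \le \int p_{Z|Y}(z \mid y;\theta^{(k)}) \left[ \frac{p_{Z|Y}(z \mid y;\theta^{(k+1)})}{p_{Z|Y}(z \mid y;\theta^{(k)})} - 1 \right] dz
    = 1 - 1 = 0,

so the V bracket in the ♠ minus ♣ display is at most zero, and subtracting it cannot decrease the log-likelihood gain.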

I Think We Got It! Now we have the log-likelihood difference bounded below by the corresponding difference of Q terms. Recall the definition of the M-step: the new estimate maximizes Q over the parameter. So, by definition, the Q difference is nonnegative; hence the likelihood is nondecreasing.
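In the assumed notation, the finish reads

  \ln p_Y(y;\theta^{(k+1)}) - \ln p_Y(y;\theta^{(k)}) \ge Q(\theta^{(k+1)};\theta^{(k)}) - Q(\theta^{(k)};\theta^{(k)}),

  \theta^{(k+1)} = \arg\max_{\theta} Q(\theta;\theta^{(k)})
  \;\Rightarrow\; Q(\theta^{(k+1)};\theta^{(k)}) \ge Q(\theta^{(k)};\theta^{(k)})
  \;\Rightarrow\; \ln p_Y(y;\theta^{(k+1)}) \ge \ln p_Y(y;\theta^{(k)}).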

Some Words of Warning. Notice we showed that the likelihood was nondecreasing; that doesn't automatically imply that the parameter estimates converge. The parameter estimate could slide along a contour of constant log-likelihood. One can prove some things about parameter convergence in special cases, e.g., the EM algorithm for imaging from Poisson data (i.e., emission tomography).
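As a concrete illustration of exactly what was proved, namely that the log-likelihood never decreases, here is a minimal numerical check using EM for a two-component 1-D Gaussian mixture. The data, initialization, and variable names are illustrative assumptions, not from the lecture; note that the check monitors the log-likelihood, not the parameter iterates.

import numpy as np

rng = np.random.default_rng(0)
# Synthetic observed data y: samples from a mixture of N(-2, 1) and N(3, 1)
y = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 200)])

# Initial parameter estimates: mixing weights, means, variances
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def loglike(y, w, mu, var):
    # l(theta) = sum_i ln sum_j w_j N(y_i; mu_j, var_j)
    dens = w * np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return np.log(dens.sum(axis=1)).sum()

prev = -np.inf
for it in range(50):
    # E-step: responsibilities p(z_i = j | y_i; theta^(k))
    dens = w * np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: closed-form maximizer of Q(theta; theta^(k)) for a Gaussian mixture
    Nj = r.sum(axis=0)
    w = Nj / len(y)
    mu = (r * y[:, None]).sum(axis=0) / Nj
    var = (r * (y[:, None] - mu) ** 2).sum(axis=0) / Nj
    cur = loglike(y, w, mu, var)
    assert cur >= prev - 1e-9, "log-likelihood should be nondecreasing"
    prev = cur

print("final estimates:", w, mu, var, "log-likelihood:", prev)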

Generalized EM Algorithms. Recall the line of the proof that used the definition of the M-step. What if an exact M-step is too hard? Try a “generalized” EM algorithm: instead of maximizing Q, pick any new estimate that does not decrease it. Note the convergence proof still works!
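In the assumed notation, the generalized requirement is only that the chosen \theta^{(k+1)} satisfy

  Q(\theta^{(k+1)};\theta^{(k)}) \ge Q(\theta^{(k)};\theta^{(k)}),

for example by taking a single ascent step on Q rather than maximizing it; the chain of inequalities above then goes through unchanged.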

The SAGE Algorithm. Problem: EM algorithms tend to be slow. Observation: “bigger” complete data spaces result in slower algorithms than “smaller” complete data spaces. SAGE (Space-Alternating Generalized Expectation-Maximization), due to Hero & Fessler, splits the big complete data space into several smaller “hidden” data spaces. It is designed to yield faster convergence and is a generalization of the “ordered subsets” EM algorithm.
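A loose, illustrative sketch of the space-alternating flavor, applied to the same toy mixture as above: cycle through parameter subsets (here, one mixture component at a time, plus the weights), refreshing the conditional expectations between subset updates. This is only a coordinate-wise GEM sketch under my assumptions, not Hero & Fessler's SAGE in full, which also associates a smaller hidden-data space with each parameter subset.

import numpy as np

rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 200)])
w = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def responsibilities(y, w, mu, var):
    # E-step quantities: p(z_i = j | y_i; current parameters)
    dens = w * np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return dens / dens.sum(axis=1, keepdims=True)

for it in range(50):
    for j in (0, 1):
        # Refresh the E-step before each subset update ("space-alternating" spirit)
        r = responsibilities(y, w, mu, var)
        Nj = r[:, j].sum()
        # Update only component j's parameters, holding the rest fixed
        mu[j] = (r[:, j] * y).sum() / Nj
        var[j] = (r[:, j] * (y - mu[j]) ** 2).sum() / Nj
        # Mixing weights form their own small subset; this is the exact
        # conditional maximizer given the current responsibilities
        w = r.sum(axis=0) / len(y)

print("final estimates:", w, mu, var)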