Published by Marissa Lynch. Modified over 2 years ago.

1
Markov models and HMMs

2
Probabilistic Inference
Joint distribution over X (query), H (hidden) and E (evidence): $P(X, H, E)$
$P(X \mid E=e) = P(X, E=e) / P(E=e)$
with $P(X, E=e) = \sum_{h} P(X, H=h, E=e)$
and $P(E=e) = \sum_{h,x} P(X=x, H=h, E=e)$

3
Example…
4 binary random variables $X_1, X_2, X_3, X_4$
Joint distribution: $P(X_1, X_2, X_3, X_4)$
Observe $X_4$: what is the distribution of $X_1$?
Inference: $P(X_1 \mid X_4=x_4) = P(X_1, X_4=x_4) / P(X_4=x_4)$
$P(X_1, X_4=x_4) = \sum_{x_2,x_3} P(X_1, X_2=x_2, X_3=x_3, X_4=x_4)$
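
The inference step above can be sketched as plain enumeration over a joint table. This is only an illustration: the joint distribution below is made up (the `weights` formula is an arbitrary assumption), and `posterior_x1_given_x4` is a hypothetical helper name.

```python
from itertools import product

# Hypothetical joint table over four binary variables: arbitrary positive
# weights chosen only for illustration, then normalized so they sum to 1.
weights = {x: 1 + x[0] + 2 * x[1] * x[3] + x[2] for x in product((0, 1), repeat=4)}
total = sum(weights.values())
joint = {x: w / total for x, w in weights.items()}

def posterior_x1_given_x4(joint, x4):
    """Return P(X1 | X4 = x4) as a list indexed by the value of X1."""
    # Numerator: P(X1 = x1, X4 = x4) = sum over x2, x3 of the joint.
    numer = [
        sum(joint[(x1, x2, x3, x4)] for x2, x3 in product((0, 1), repeat=2))
        for x1 in (0, 1)
    ]
    # Denominator: P(X4 = x4) = sum over x1 of the numerator.
    evidence = sum(numer)
    return [n / evidence for n in numer]

post = posterior_x1_given_x4(joint, x4=1)
print(post)
```

The denominator is obtained by summing the numerator over the values of X1, so it never has to be computed separately.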

4
A problem…
$2^4 - 1 = 15$ probabilities must be specified: one for each possible combination of values of the $x_i$, minus one because they sum to 1
The summation $\sum_{x_2,x_3}$ is still OK, but imagine a larger set of variables…
Exponentially hard in the number of variables
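
To make the growth concrete, here is a small sketch that counts the free parameters of an unrestricted joint table and the number of terms in the enumeration sum. Both function names are hypothetical, and the term count assumes a single query variable and a single observed variable, as in the example.

```python
def full_joint_parameters(n, k=2):
    """Free parameters of an unrestricted joint over n variables with k values each."""
    return k ** n - 1

def enumeration_terms(n, k=2):
    """Terms in the sum when all variables except one query and one evidence variable are summed out."""
    return k ** (n - 2)

for n in (4, 10, 20, 30):
    print(n, full_joint_parameters(n), enumeration_terms(n))
```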

5
Solution
Restrict the joint distribution to a subset of all possible joint distributions
Probabilistic Graphical Models (PGMs)
E.g. people: A = height, B = hair length, C = gender
A and B are independent when conditioned on C
[Graph: C → A, C → B]
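
As a sketch of what the restriction buys, the conditional independence in this example can be written as a factorization of the joint distribution. The parameter count in the last line assumes, purely for illustration, that A, B and C are all binary.

```latex
\begin{align*}
P(A, B \mid C) &= P(A \mid C)\, P(B \mid C) \\
P(A, B, C) &= P(C)\, P(A \mid C)\, P(B \mid C) \\
\text{free parameters (binary case):}\quad & 1 + 2 + 2 = 5 \quad \text{instead of} \quad 2^3 - 1 = 7
\end{align*}
```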

6
Independent random variables
4 independent random variables
Number of probabilities to be specified: 4
Inference: $P(X_1 \mid X_4=x_4) = P(X_1, X_4=x_4) / P(X_4=x_4) = P(X_1)$
$P(X_1, X_4=x_4) = \sum_{x_2,x_3} P(X_1, X_2=x_2, X_3=x_3, X_4=x_4) = P(X_1)\,P(X_4=x_4)\,\big(\sum_{x_2} P(X_2=x_2)\big)\,\big(\sum_{x_3} P(X_3=x_3)\big)$
Sums are simply over one random variable each.
[Graph: four nodes $X_1, X_2, X_3, X_4$ with no edges]

7
Markov chain
$P(X_{i+1} \mid X_i, X_{i-1}, \ldots, X_1) = P(X_{i+1} \mid X_i)$
Each $X_{i+1}$ depends only on its immediate history
$P(X_1, X_2, X_3, X_4) = P(X_1)\,P(X_2 \mid X_1)\,P(X_3 \mid X_2)\,P(X_4 \mid X_3)$
Only conditional dependencies need to be specified: $1 + 2 \cdot 3 = 7$ probabilities
[Graph: $X_1 \to X_2 \to X_3 \to X_4$]
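
A minimal sketch of the chain factorization for four binary variables follows. The probability tables are invented for illustration, and `chain_joint` is a hypothetical helper. It checks that the factored joint sums to 1 and reproduces the count of 7 parameters.

```python
from itertools import product

# Hypothetical parameters for a 4-step binary Markov chain (values are illustrative).
p_x1 = [0.6, 0.4]                      # P(X1)
# Each table is indexed as table[previous_value][current_value].
trans = [
    [[0.7, 0.3], [0.2, 0.8]],          # P(X2 | X1)
    [[0.9, 0.1], [0.5, 0.5]],          # P(X3 | X2)
    [[0.6, 0.4], [0.1, 0.9]],          # P(X4 | X3)
]

def chain_joint(xs):
    """P(X1=xs[0], ..., X4=xs[3]) = P(X1) * product of P(X_{i+1} | X_i)."""
    p = p_x1[xs[0]]
    for table, prev, cur in zip(trans, xs, xs[1:]):
        p *= table[prev][cur]
    return p

# Free parameters: 1 for P(X1) plus 2 per transition table (one per parent value).
n_params = 1 + 2 * len(trans)
total = sum(chain_joint(xs) for xs in product((0, 1), repeat=4))
print(n_params, total)
```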

8
Markov chain
Inference, e.g.: $P(X_1 \mid X_3=x_3) = P(X_1, X_3=x_3) / P(X_3=x_3)$
$P(X_1, X_3=x_3) = \sum_{x_2,x_4} P(X_1, X_2=x_2, X_3=x_3, X_4=x_4)$
$= \sum_{x_2,x_4} P(X_1)\,P(X_2=x_2 \mid X_1)\,P(X_3=x_3 \mid X_2=x_2)\,P(X_4=x_4 \mid X_3=x_3)$
Distributive law:
$= P(X_1) \sum_{x_2} \Big( P(X_2=x_2 \mid X_1)\,P(X_3=x_3 \mid X_2=x_2) \sum_{x_4} P(X_4=x_4 \mid X_3=x_3) \Big)$
$= P(X_1) \sum_{x_2} P(X_2=x_2 \mid X_1)\,P(X_3=x_3 \mid X_2=x_2)$, since $\sum_{x_4} P(X_4=x_4 \mid X_3=x_3) = 1$
Each of the sums is manageable…
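
The derivation can be checked numerically. The sketch below compares a brute-force sum over x2 and x4 with the factored form obtained from the distributive law. All tables are made-up numbers and both function names are hypothetical.

```python
from itertools import product

# Hypothetical binary Markov chain; all numbers are illustrative assumptions.
p_x1 = [0.6, 0.4]                      # P(X1)
p_x2_x1 = [[0.7, 0.3], [0.2, 0.8]]     # p_x2_x1[a][b] = P(X2=b | X1=a)
p_x3_x2 = [[0.9, 0.1], [0.5, 0.5]]     # p_x3_x2[a][b] = P(X3=b | X2=a)
p_x4_x3 = [[0.6, 0.4], [0.1, 0.9]]     # p_x4_x3[a][b] = P(X4=b | X3=a)

def posterior_brute_force(x3):
    """P(X1 | X3=x3) by summing the full joint over x2 and x4."""
    numer = [
        sum(p_x1[x1] * p_x2_x1[x1][x2] * p_x3_x2[x2][x3] * p_x4_x3[x3][x4]
            for x2, x4 in product((0, 1), repeat=2))
        for x1 in (0, 1)
    ]
    z = sum(numer)                      # P(X3 = x3)
    return [n / z for n in numer]

def posterior_factored(x3):
    """Same quantity after the distributive law; the sum over x4 equals 1 and drops out."""
    numer = [
        p_x1[x1] * sum(p_x2_x1[x1][x2] * p_x3_x2[x2][x3] for x2 in (0, 1))
        for x1 in (0, 1)
    ]
    z = sum(numer)                      # P(X3 = x3)
    return [n / z for n in numer]

print(posterior_brute_force(1), posterior_factored(1))
```

The factored version never touches the table for X4, which is why downstream unobserved variables cost nothing in a chain.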

9
Noisy channel – independent variables
Fully specified by $P(X_i)$ and $P(E_i \mid X_i)$
Given $E_i=e_i$ (evidence), we can infer the most likely state of $X_i$ by means of Bayes' rule:
$P(X_i \mid E_i=e_i) = P(E_i=e_i \mid X_i)\,P(X_i) / P(E_i=e_i)$
This is independent of the other evidence $E_j=e_j$, $j \neq i$
[Graph: $X_i \to E_i$ for $i = 1, \ldots, 4$; no edges between different $i$]
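
A minimal sketch of this Bayes' rule update for a single channel use is shown below. The prior and the 10% flip probability are assumptions chosen for illustration, and `posterior` is a hypothetical helper.

```python
def posterior(prior, likelihood, e):
    """P(X | E=e) from P(X) and P(E | X) via Bayes' rule.

    prior[x] = P(X=x); likelihood[x][e] = P(E=e | X=x).
    """
    unnorm = [prior[x] * likelihood[x][e] for x in range(len(prior))]
    evidence = sum(unnorm)              # P(E=e)
    return [u / evidence for u in unnorm]

# Hypothetical binary channel that flips the symbol with probability 0.1.
prior = [0.8, 0.2]
channel = [[0.9, 0.1], [0.1, 0.9]]

post = posterior(prior, channel, e=1)
most_likely = max(range(len(post)), key=post.__getitem__)
print(post, most_likely)
```

With these numbers, observing 1 outweighs the prior preference for state 0, so the most likely state is 1.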

10
Hidden Markov chain
Fully specified by $P(X_1)$, $P(X_{i+1} \mid X_i)$ and $P(E_i \mid X_i)$
$P(X_1, X_2, X_3, X_4, E_1, E_2, E_3, E_4) = P(X_1)\,P(E_1 \mid X_1) \prod_{i=2}^{4} P(X_i \mid X_{i-1})\,P(E_i \mid X_i)$
Hidden state transitions (Markov chain), and emission probabilities
Inference: $P(X_i \mid E_1=e_1, E_2=e_2, E_3=e_3, E_4=e_4)$?
[Graph: $X_1 \to X_2 \to X_3 \to X_4$, with $X_i \to E_i$ for each $i$]

11
Hidden Markov chain
$P(X_1 \mid E_1=e_1, E_2=e_2, E_3=e_3, E_4=e_4) = P(X_1, E_1=e_1, E_2=e_2, E_3=e_3, E_4=e_4) / P(E_1=e_1, E_2=e_2, E_3=e_3, E_4=e_4)$
Numerator:
$P(X_1, E_1=e_1, E_2=e_2, E_3=e_3, E_4=e_4)$
$= \sum_{x_2,x_3,x_4} P(X_1, X_2=x_2, X_3=x_3, X_4=x_4, E_1=e_1, E_2=e_2, E_3=e_3, E_4=e_4)$
$= P(X_1)\,P(E_1=e_1 \mid X_1) \sum_{x_2} \Big( P(X_2=x_2 \mid X_1)\,P(E_2=e_2 \mid X_2=x_2) \sum_{x_3} \big( P(X_3=x_3 \mid X_2=x_2)\,P(E_3=e_3 \mid X_3=x_3) \sum_{x_4} P(X_4=x_4 \mid X_3=x_3)\,P(E_4=e_4 \mid X_4=x_4) \big) \Big)$
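
The nested sums can be evaluated from the innermost sum outward, which is commonly called a backward recursion. The sketch below does this and compares the result against a brute-force sum over all hidden sequences. It assumes a two-state model with the same transition and emission table at every step, and all numbers and function names are made up for illustration.

```python
from itertools import product

# Hypothetical 2-state hidden Markov chain with time-homogeneous tables
# (an assumption for brevity). All numbers are illustrative.
p_x1 = [0.5, 0.5]                       # P(X1)
trans = [[0.7, 0.3], [0.4, 0.6]]        # trans[a][b] = P(X_{i+1}=b | X_i=a)
emit = [[0.9, 0.1], [0.2, 0.8]]         # emit[x][e] = P(E_i=e | X_i=x)

def posterior_x1_nested(evidence):
    """P(X1 | E_1..E_n = evidence) by evaluating the nested sums inside-out."""
    n_states = len(p_x1)
    # msg[x] holds the value of all nested sums to the right of X_i = x.
    msg = [1.0] * n_states
    for e in reversed(evidence[1:]):
        msg = [
            sum(trans[x][nxt] * emit[nxt][e] * msg[nxt] for nxt in range(n_states))
            for x in range(n_states)
        ]
    numer = [p_x1[x] * emit[x][evidence[0]] * msg[x] for x in range(n_states)]
    z = sum(numer)                      # P(E_1..E_n = evidence)
    return [v / z for v in numer]

def posterior_x1_brute_force(evidence):
    """Same posterior by summing the full joint over x2..xn, for comparison."""
    n = len(evidence)
    numer = [0.0, 0.0]
    for xs in product((0, 1), repeat=n):
        p = p_x1[xs[0]] * emit[xs[0]][evidence[0]]
        for i in range(1, n):
            p *= trans[xs[i - 1]][xs[i]] * emit[xs[i]][evidence[i]]
        numer[xs[0]] += p
    z = sum(numer)
    return [v / z for v in numer]

obs = [0, 0, 1, 1]
print(posterior_x1_nested(obs), posterior_x1_brute_force(obs))
```

The nested version does work proportional to the sequence length times the square of the number of states, while the brute-force version enumerates every hidden sequence.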
