
1 Hidden Markov Models (HMM): Rabiner's Paper
Markoviana Reading Group, Computer Eng. & Science Dept., Arizona State University

2 Stationary and Non-stationary
- Stationary process: its statistical properties do not vary with time.
- Non-stationary process: its statistical properties vary over time.

3 HMM Example: Casino Coin
- States: Fair (F) and Unfair (U), connected by state transition probabilities.
- Observation symbols: H and T, generated by per-state symbol emission probabilities (two probability tables in total).
- Observation sequence: HTHHTTHHHTHTHTHHTHHHHHHTHTHH
- State sequence:       FFFFFFUUUFFFFFFUUUUUUUFFFFFF
- Motivation: given a sequence of H's and T's, can you tell at what times the casino cheated?
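To make the example concrete, here is a minimal Python sketch of how the casino model's parameters could be written down. The numerical values and variable names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hypothetical parameters for the casino coin model (the values are assumptions).
states = ["F", "U"]            # Fair, Unfair
symbols = ["H", "T"]
A = np.array([[0.9, 0.1],      # state transition probabilities a_ij
              [0.1, 0.9]])     # rows: from-state, columns: to-state
B = np.array([[0.5, 0.5],      # symbol emission probabilities b_i(k): fair coin
              [0.8, 0.2]])     # unfair coin, biased toward heads
pi = np.array([0.5, 0.5])      # initial state distribution

# Encode the slide's observation sequence as symbol indices (H = 0, T = 1).
obs = [symbols.index(c) for c in "HTHHTTHHHTHTHTHHTHHHHHHTHTHH"]
```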

4 Properties of an HMM
- First-order Markov process: $q_t$ depends only on $q_{t-1}$.
- Time is discrete.

5 Elements of an HMM
- $N$, the number of states
- $M$, the number of symbols
- States $S_1, S_2, \dots, S_N$
- Observation symbols $O_1, O_2, \dots, O_M$
- $\lambda = (A, B, \pi)$, the probability distributions: transition probabilities $A = \{a_{ij}\}$, emission probabilities $B = \{b_j(k)\}$, and initial state distribution $\pi = \{\pi_i\}$

6 HMM Basic Problems
1. Given an observation sequence $O = O_1 O_2 \dots O_T$ and $\lambda$, find $P(O \mid \lambda)$. → Forward algorithm / Backward algorithm
2. Given $O = O_1 O_2 \dots O_T$ and $\lambda$, find the most likely state sequence $Q = q_1 q_2 \dots q_T$. → Viterbi algorithm
3. Given $O = O_1 O_2 \dots O_T$, re-estimate $\lambda$ so that $P(O \mid \lambda)$ is higher than it is now. → Baum-Welch re-estimation

7 Forward Algorithm Illustration
$\alpha_t(i)$ is the probability of observing the partial sequence $O_1 O_2 \dots O_t$ and being in state $S_i$ at time $t$.

8 Forward Algorithm Illustration (cont'd)
The trellis has one row per state $S_1 \dots S_N$ and one column per observation $O_1 \dots O_T$; cell $(j, t)$ holds $\alpha_t(j)$. The first column is $\alpha_1(j) = \pi_j\, b_j(O_1)$; each subsequent column is $\alpha_2(j) = \left[\sum_i \alpha_1(i)\, a_{ij}\right] b_j(O_2)$, and so on. The total of the last column gives the solution, $P(O \mid \lambda)$.

9 Forward Algorithm
Definition: $\alpha_t(i) = P(O_1 O_2 \dots O_t,\; q_t = S_i \mid \lambda)$, the probability of observing the partial sequence $O_1 \dots O_t$ and being in state $S_i$ at time $t$.
Initialization: $\alpha_1(i) = \pi_i\, b_i(O_1)$, for $1 \le i \le N$
Induction: $\alpha_{t+1}(j) = \left[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\right] b_j(O_{t+1})$
Problem 1 answer: $P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$
Complexity: $O(N^2 T)$
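As a minimal sketch, the forward algorithm translates directly into a few lines of Python, assuming the N×N transition matrix A, N×M emission matrix B, initial distribution pi, and integer-encoded observations obs from the casino sketch above (the 0-based index t here corresponds to time t+1 in the slides).

```python
import numpy as np

def forward(A, B, pi, obs):
    """Return alpha (T x N) and P(O | lambda)."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # initialization: pi_i * b_i(O_1)
    for t in range(1, T):
        # induction: [sum_i alpha_t(i) a_ij] * b_j(O_{t+1})
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha, alpha[-1].sum()                     # P(O|lambda) = sum_i alpha_T(i)
```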

10 Backward Algorithm Illustration
$\beta_t(i)$ is the probability of observing the partial sequence $O_{t+1} O_{t+2} \dots O_T$ given state $S_i$ at time $t$.

11 Backward Algorithm
Definition: $\beta_t(i) = P(O_{t+1} O_{t+2} \dots O_T \mid q_t = S_i, \lambda)$
Initialization: $\beta_T(i) = 1$
Induction: $\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)$
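The backward pass is a mirror image of the forward sketch above, under the same assumptions about A and B.

```python
import numpy as np

def backward(A, B, obs):
    """Return beta (T x N), computed right to left."""
    T, N = len(obs), A.shape[0]
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                     # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        # induction: beta_t(i) = sum_j a_ij b_j(O_{t+1}) beta_{t+1}(j)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta
```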

12 Q2: Optimality Criterion 1
* Maximize the expected number of correct individual states.
Definition: $\gamma_t(i) = P(q_t = S_i \mid O, \lambda) = \dfrac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)}$, the probability of being in state $S_i$ at time $t$ given the observation sequence $O$ and the model.
Problem 2 answer: $q_t^* = \arg\max_{1 \le i \le N} \gamma_t(i)$, for $1 \le t \le T$.
Problem: if some $a_{ij} = 0$, the optimal state sequence may not even be a valid state sequence.
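Criterion 1 is a sketch away once alpha and beta are available: normalize their product per time step and take the per-step argmax. The function name is ours, not the slides'.

```python
import numpy as np

def posterior_decode(alpha, beta):
    """gamma[t, i] = P(q_t = S_i | O, lambda); return gamma and the per-step argmax."""
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # each row's normalizer is P(O|lambda)
    return gamma, gamma.argmax(axis=1)         # individually most likely state per t
```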

13 Q2: Optimality Criterion 2
* Find the single best state sequence (path), i.e. maximize $P(Q \mid O, \lambda)$.
Definition: $\delta_t(i) = \max_{q_1 \dots q_{t-1}} P(q_1 q_2 \dots q_{t-1},\; q_t = S_i,\; O_1 O_2 \dots O_t \mid \lambda)$, the highest probability of a state path accounting for the partial observation sequence $O_1 \dots O_t$ and ending in state $S_i$.

14 Viterbi Algorithm
The major difference from the forward algorithm is maximization instead of summation: the induction step becomes $\delta_{t+1}(j) = \left[\max_{1 \le i \le N} \delta_t(i)\, a_{ij}\right] b_j(O_{t+1})$, with a pointer $\psi_{t+1}(j) = \arg\max_i \delta_t(i)\, a_{ij}$ recorded for the traceback.

15 Viterbi Algorithm Illustration
Same trellis as for the forward algorithm: one row per state $S_1 \dots S_N$, one column per observation $O_1 \dots O_T$. The first column is $\delta_1(j) = \pi_j\, b_j(O_1)$; each subsequent column is $\delta_2(j) = \max_i \left[\delta_1(i)\, a_{ij}\right] b_j(O_2)$, and so on. $\delta_t(i)$ is the highest probability of a state path for the partial observation sequence $O_1 \dots O_t$ ending in state $S_i$. The max of the last column indicates where the traceback starts.
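A sketch of the full Viterbi recursion with explicit traceback pointers, under the same conventions as the forward sketch (hypothetical variable names, 0-based time index).

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Return the single best state path and its probability P*."""
    T, N = len(obs), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)            # traceback pointers psi_t(j)
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A       # scores[i, j] = delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()                # max of the last column starts the traceback
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, delta[-1].max()
```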

16 Relations with DBN
In the dynamic Bayesian network view, each quantity is a message passed between adjacent time slices:
- Forward function: $\alpha_{t+1}(j) = \left[\sum_i \alpha_t(i)\, a_{ij}\right] b_j(O_{t+1})$
- Backward function: $\beta_t(i) = \sum_j a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)$, with $\beta_T(i) = 1$
- Viterbi algorithm: $\delta_{t+1}(j) = \left[\max_i \delta_t(i)\, a_{ij}\right] b_j(O_{t+1})$

17 Some More Definitions
$\gamma_t(i)$ is the probability of being in state $S_i$ at time $t$: $\gamma_t(i) = P(q_t = S_i \mid O, \lambda)$.
$\xi_t(i,j)$ is the probability of being in state $S_i$ at time $t$ and $S_j$ at time $t+1$: $\xi_t(i,j) = \dfrac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}$.
Note that $\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i,j)$.
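A sketch of computing xi from the quantities defined above; the per-step sum serves as the normalizer $P(O \mid \lambda)$.

```python
import numpy as np

def xi(alpha, beta, A, B, obs):
    """xi[t, i, j] = P(q_t = S_i, q_{t+1} = S_j | O, lambda), for t = 1..T-1."""
    T, N = len(obs), A.shape[0]
    x = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        # alpha_t(i) * a_ij * b_j(O_{t+1}) * beta_{t+1}(j), then normalize
        x[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        x[t] /= x[t].sum()
    return x
```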

18 Baum-Welch Re-estimation
An Expectation-Maximization algorithm.
Expectation: compute $\gamma_t(i)$ and $\xi_t(i,j)$ under the current model $\lambda$, using the forward and backward variables.

19 Baum-Welch Re-estimation (cont'd)
Maximization:
$\bar{\pi}_i = \gamma_1(i)$ (expected frequency of state $S_i$ at time $t = 1$)
$\bar{a}_{ij} = \dfrac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$ (expected number of transitions $S_i \to S_j$ over expected number of transitions out of $S_i$)
$\bar{b}_j(k) = \dfrac{\sum_{t:\, O_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}$ (expected number of times in $S_j$ observing $v_k$ over expected number of times in $S_j$)
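The maximization step then reduces to ratios of expected counts. A sketch, assuming gamma (T×N) and xi (T-1×N×N) from the sketches above and M observation symbols:

```python
import numpy as np

def reestimate(gamma, x, obs, M):
    """One M-step: re-estimated (pi, A, B) from gamma and xi."""
    obs = np.asarray(obs)
    pi_new = gamma[0]                                        # gamma_1(i)
    A_new = x.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]  # expected transitions / exits
    B_new = np.stack([gamma[obs == k].sum(axis=0) for k in range(M)], axis=1)
    B_new /= gamma.sum(axis=0)[:, None]                      # expected emissions / visits
    return pi_new, A_new, B_new
```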

20 Notes on the Re-estimation
- If the model does not change, it has reached a local maximum.
- Depending on the model, many local maxima can exist.
- The re-estimated probabilities sum to 1 by construction.

21 Implementation Issues
- Scaling
- Multiple observation sequences
- Initial parameter estimation
- Missing data
- Choice of model size and type

22 Scaling
$\alpha_t(i)$ calculation: each $\alpha_t(i)$ is a sum of products of $t$ probabilities, so it heads exponentially to zero and underflows machine precision for long sequences. Scale each column by $c_t = 1 / \sum_{i=1}^{N} \alpha_t(i)$, giving $\hat{\alpha}_t(i) = c_t\, \alpha_t(i)$.
Recursion to calculate: $\hat{\alpha}_{t+1}(j) = c_{t+1} \left[\sum_i \hat{\alpha}_t(i)\, a_{ij}\right] b_j(O_{t+1})$

23 Scaling (cont'd)
$\beta_t(i)$ calculation: scale with the same factors, $\hat{\beta}_t(i) = c_t\, \beta_t(i)$.
Desired condition: $\sum_{i=1}^{N} \hat{\alpha}_t(i) = 1$ at every $t$.
* Note that $\sum_{i=1}^{N} \hat{\beta}_t(i) = 1$ is not true!

24 Scaling (cont'd)
Since $\prod_{t=1}^{T} c_t \sum_{i=1}^{N} \alpha_T(i) = 1$, the likelihood can be recovered from the scale factors alone: $P(O \mid \lambda) = 1 / \prod_{t=1}^{T} c_t$, i.e. $\log P(O \mid \lambda) = -\sum_{t=1}^{T} \log c_t$. The scale factors cancel in the Baum-Welch ratios, so the re-estimation formulas are unchanged.
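A sketch of the scaled forward pass; the log-likelihood falls out of the scale factors exactly as above.

```python
import numpy as np

def forward_scaled(A, B, pi, obs):
    """Return normalized alpha-hat (T x N), scale factors c, and log P(O | lambda)."""
    T, N = len(obs), len(pi)
    alpha_hat = np.zeros((T, N))
    c = np.zeros(T)
    alpha_hat[0] = pi * B[:, obs[0]]
    c[0] = 1.0 / alpha_hat[0].sum()                   # c_t = 1 / sum_i alpha_t(i)
    alpha_hat[0] *= c[0]
    for t in range(1, T):
        alpha_hat[t] = (alpha_hat[t - 1] @ A) * B[:, obs[t]]
        c[t] = 1.0 / alpha_hat[t].sum()
        alpha_hat[t] *= c[t]                          # each row now sums to 1
    return alpha_hat, c, -np.log(c).sum()             # log P(O|lambda) = -sum_t log c_t
```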

25 Maximum Log-likelihood
For the Viterbi algorithm, scaling is unnecessary: run the recursion in log space.
Initialization: $\phi_1(i) = \log \pi_i + \log b_i(O_1)$
Recursion: $\phi_t(j) = \max_{1 \le i \le N} \left[\phi_{t-1}(i) + \log a_{ij}\right] + \log b_j(O_t)$
Termination: $\log P^* = \max_{1 \le i \le N} \phi_T(i)$
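A sketch of Viterbi in log space (products become sums, so no scaling is needed); zero-probability entries become -inf under np.log, which the max handles correctly, though numpy emits a divide warning.

```python
import numpy as np

def viterbi_log(A, B, pi, obs):
    """Log-space Viterbi: return the best path and log P*."""
    T, N = len(obs), len(pi)
    logA, logB = np.log(A), np.log(B)
    phi = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    phi[0] = np.log(pi) + logB[:, obs[0]]            # initialization
    for t in range(1, T):
        scores = phi[t - 1][:, None] + logA          # phi_{t-1}(i) + log a_ij
        psi[t] = scores.argmax(axis=0)
        phi[t] = scores.max(axis=0) + logB[:, obs[t]]
    path = np.empty(T, dtype=int)
    path[-1] = phi[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path, phi[-1].max()                       # log P* = max_i phi_T(i)
```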

26 Multiple Observation Sequences
Problem with re-estimation from a single sequence: one sequence often provides too few occurrences of each transition and emission to estimate the model reliably. With $K$ independent sequences, accumulate the expected counts (the numerators and denominators of the re-estimation formulas) over all $K$ sequences before normalizing, as sketched below.
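A sketch of the pooled update: the per-sequence gamma and xi come from the earlier sketches, and normalization happens only once at the end.

```python
import numpy as np

def reestimate_multi(per_seq_stats, N, M):
    """Pool expected counts over K sequences; per_seq_stats yields (gamma, xi, obs)."""
    A_num, A_den = np.zeros((N, N)), np.zeros(N)
    B_num, B_den = np.zeros((N, M)), np.zeros(N)
    for gamma, x, obs in per_seq_stats:
        obs = np.asarray(obs)
        A_num += x.sum(axis=0)                 # expected i -> j transitions
        A_den += gamma[:-1].sum(axis=0)        # expected transitions out of i
        for k in range(M):
            B_num[:, k] += gamma[obs == k].sum(axis=0)
        B_den += gamma.sum(axis=0)
    return A_num / A_den[:, None], B_num / B_den[:, None]
```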

27 Initial Estimates of Parameters
- For $\pi$ and $A$, random or uniform values are sufficient.
- For $B$ (discrete symbol probabilities), a good initial estimate is needed.

28 Insufficient Training Data
Solutions:
- Increase the size of the training data.
- Reduce the size of the model.
- Interpolate the parameters with those of another model.

29 References
- L. Rabiner. 'A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition.' Proceedings of the IEEE, 1989.
- S. Russell, P. Norvig. 'Probabilistic Reasoning over Time.' AI: A Modern Approach, Ch. 15, 2002 (draft).
- V. Borkar, K. Deshmukh, S. Sarawagi. 'Automatic Segmentation of Text into Structured Records.' ACM SIGMOD, 2001.
- T. Scheffer, C. Decomain, S. Wrobel. 'Active Hidden Markov Models for Information Extraction.' Proceedings of the International Symposium on Intelligent Data Analysis, 2001.
- S. Ray, M. Craven. 'Representing Sentence Structure in Hidden Markov Models for Information Extraction.' Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2001.

