
1
Hidden Markov Models (HMM): Rabiner's Paper
Markoviana Reading Group
Computer Eng. & Science Dept., Arizona State University

2
Markoviana Reading Group, Fatih Gelgi – Feb
Stationary and Non-stationary
Stationary process: its statistical properties do not vary with time.
Non-stationary process: the signal properties vary over time.

3
HMM Example – Casino Coin
Two states, Fair (F) and Unfair (U), connected by state-transition probabilities; each state emits the symbols H and T with its own symbol-emission probabilities (two probability tables in total).
Observation sequence: HTHHTTHHHTHTHTHHTHHHHHHTHTHH
State sequence:       FFFFFFUUUFFFFFFUUUUUUUFFFFFF
Motivation: given a sequence of Hs and Ts, can you tell at what times the casino cheated?
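The casino model above can be sampled to produce sequences like the one on the slide. A minimal Python sketch follows; the slide gives no numeric probabilities, so the transition and emission values below are illustrative assumptions only.

```python
import random

# Hypothetical casino-coin HMM; these numbers are NOT from the slides.
STATES = ["F", "U"]                    # Fair, Unfair
TRANS = {"F": {"F": 0.9, "U": 0.1},    # state-transition probabilities
         "U": {"F": 0.1, "U": 0.9}}
EMIT = {"F": {"H": 0.5, "T": 0.5},     # symbol-emission probabilities
        "U": {"H": 0.8, "T": 0.2}}

def pick(dist, rng):
    """Sample a key from a {key: probability} dict."""
    r, acc = rng.random(), 0.0
    for k, p in dist.items():
        acc += p
        if r < acc:
            return k
    return k  # guard against floating-point round-off

def sample(length, rng=None):
    """Generate (observations, hidden states) from the HMM."""
    rng = rng or random.Random(0)
    state = rng.choice(STATES)
    obs, hidden = [], []
    for _ in range(length):
        hidden.append(state)
        obs.append(pick(EMIT[state], rng))
        state = pick(TRANS[state], rng)
    return obs, hidden

obs, hidden = sample(28)
```

Because the self-transition probabilities are high, the hidden sequence tends to form long runs of F and U, just like the slide's example.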

4
Properties of an HMM
First-order Markov process: q_t depends only on q_{t-1}.
Time is discrete.

5
Elements of an HMM
N, the number of states
M, the number of symbols
States S_1, S_2, …, S_N
Observation symbols O_1, O_2, …, O_M
λ = (A, B, π), the probability distributions: state-transition probabilities A = {a_ij}, symbol-emission probabilities B = {b_j(k)}, and initial state distribution π

6
HMM Basic Problems
1. Given an observation sequence O = O_1 O_2 O_3 … O_T and a model λ, find P(O|λ): Forward algorithm / Backward algorithm.
2. Given O = O_1 O_2 O_3 … O_T and λ, find the most likely state sequence Q = q_1 q_2 … q_T: Viterbi algorithm.
3. Given O = O_1 O_2 O_3 … O_T, re-estimate λ so that P(O|λ) is higher than it is now: Baum-Welch re-estimation.

7
Forward Algorithm Illustration
α_t(i) is the probability of observing the partial sequence O_1 O_2 O_3 … O_t and being in state S_i at time t:
α_t(i) = P(O_1 O_2 … O_t, q_t = S_i | λ)

8
Forward Algorithm Illustration (cont'd)
[Trellis figure: states S_1 … S_N vertically, observations O_1 … O_T horizontally. Each cell holds α_t(j) = [Σ_i α_{t-1}(i) a_ij] b_j(O_t). The total of the last column (t = T) gives the solution.]

9
Forward Algorithm
Definition: α_t(i) = P(O_1 O_2 … O_t, q_t = S_i | λ), the probability of observing the partial sequence O_1 … O_t and being in state S_i at time t.
Initialization: α_1(i) = π_i b_i(O_1), 1 ≤ i ≤ N
Induction: α_{t+1}(j) = [Σ_{i=1}^{N} α_t(i) a_ij] b_j(O_{t+1})
Problem 1 answer: P(O|λ) = Σ_{i=1}^{N} α_T(i)
Complexity: O(N²T)
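The initialization and induction steps above translate directly into code. A minimal Python sketch, using illustrative casino-style parameters that are an assumption (the slides give no numbers; state 0 = Fair, state 1 = Unfair, symbol 0 = H, symbol 1 = T):

```python
def forward(obs, A, B, pi):
    """alpha[t][i]: probability of observing obs[0..t] and ending in state i."""
    N = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    # Induction: alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(O_{t+1})
    for t in range(1, len(obs)):
        alpha.append([sum(alpha[-1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    return alpha

# Illustrative parameters (assumed, not from the slides).
A  = [[0.9, 0.1], [0.1, 0.9]]
B  = [[0.5, 0.5], [0.8, 0.2]]
pi = [0.5, 0.5]
alpha = forward([0, 0, 1], A, B, pi)   # observations H, H, T
likelihood = sum(alpha[-1])            # P(O|lambda) = sum_i alpha_T(i)
```

Each time step costs O(N²), giving the O(N²T) total noted on the slide, versus O(N^T) for naive enumeration of all state paths.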

10
Backward Algorithm Illustration
β_t(i) is the probability of observing the partial sequence O_{t+1} O_{t+2} O_{t+3} … O_T given that the state at time t is S_i:
β_t(i) = P(O_{t+1} O_{t+2} … O_T | q_t = S_i, λ)

11
Backward Algorithm
Definition: β_t(i) = P(O_{t+1} O_{t+2} … O_T | q_t = S_i, λ)
Initialization: β_T(i) = 1
Induction: β_t(i) = Σ_{j=1}^{N} a_ij b_j(O_{t+1}) β_{t+1}(j)
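The backward recursion mirrors the forward one but runs from T down to 1. A sketch under the same assumed casino-style parameters as before; a useful sanity check is that β_1 recovers the same P(O|λ) as the forward pass:

```python
def backward(obs, A, B):
    """beta[t][i]: probability of observing obs[t+1..] given state i at time t."""
    N = len(A)
    beta = [[1.0] * N]                        # Initialization: beta_T(i) = 1
    for t in range(len(obs) - 2, -1, -1):     # Induction, backwards in time
        beta.insert(0, [sum(A[i][j] * B[j][obs[t + 1]] * beta[0][j]
                            for j in range(N)) for i in range(N)])
    return beta

# Illustrative parameters (assumed, not from the slides).
A  = [[0.9, 0.1], [0.1, 0.9]]
B  = [[0.5, 0.5], [0.8, 0.2]]
pi = [0.5, 0.5]
obs = [0, 0, 1]                               # H, H, T
beta = backward(obs, A, B)
# P(O|lambda) = sum_i pi_i * b_i(O_1) * beta_1(i)
likelihood = sum(pi[i] * B[i][obs[0]] * beta[0][i] for i in range(2))
```

Agreement between this value and the forward algorithm's Σ_i α_T(i) is a standard correctness check for both implementations.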

12
Q2: Optimality Criterion 1
Maximize the expected number of correct individual states.
Definition: γ_t(i) = P(q_t = S_i | O, λ), the probability of being in state S_i at time t given the observation sequence O and the model.
Computation: γ_t(i) = α_t(i) β_t(i) / P(O|λ)
Problem 2 answer: q_t = argmax_i γ_t(i), for each 1 ≤ t ≤ T
Problem: if some a_ij = 0, the "optimal" state sequence may not even be a valid state sequence.

13
Q2: Optimality Criterion 2
Find the single best state sequence (path), i.e. maximize P(Q|O,λ).
Definition: δ_t(i) = max_{q_1 … q_{t-1}} P(q_1 … q_{t-1}, q_t = S_i, O_1 O_2 … O_t | λ), the highest probability of a state path that accounts for the partial observation sequence O_1 O_2 O_3 … O_t and ends in state S_i.

14
Viterbi Algorithm
The major difference from the forward algorithm: maximization instead of summation.

15
Viterbi Algorithm Illustration
[Trellis figure: states S_1 … S_N vertically, observations O_1 … O_T horizontally. Each cell holds δ_t(j) = [max_i δ_{t-1}(i) a_ij] b_j(O_t). The maximum of the last column indicates where traceback starts.]
δ_t(i) is the highest probability of a state path for the partial observation sequence O_1 O_2 O_3 … O_t ending in state S_i.
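The trellis computation plus traceback can be sketched as follows, again with assumed casino-style parameters (state 0 = Fair, 1 = Unfair; symbol 0 = H, 1 = T; the numbers are not from the slides). Back-pointers record the maximizing predecessor of each cell so the best path can be read off in reverse:

```python
def viterbi(obs, A, B, pi):
    """Return the single best state path for obs (Problem 2)."""
    N = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]   # delta_1(i)
    psi = []                                           # back-pointers
    for t in range(1, len(obs)):
        step, back = [], []
        for j in range(N):
            # Maximization instead of the forward algorithm's summation:
            i_best = max(range(N), key=lambda i: delta[i] * A[i][j])
            back.append(i_best)
            step.append(delta[i_best] * A[i_best][j] * B[j][obs[t]])
        psi.append(back)
        delta = step
    # Termination: traceback starts at the max of the last column.
    path = [max(range(N), key=lambda i: delta[i])]
    for back in reversed(psi):
        path.insert(0, back[path[0]])
    return path

A  = [[0.9, 0.1], [0.1, 0.9]]
B  = [[0.5, 0.5], [0.8, 0.2]]
pi = [0.5, 0.5]
all_heads = viterbi([0] * 6, A, B, pi)   # a long run of heads
all_tails = viterbi([1] * 6, A, B, pi)   # a long run of tails
```

With these parameters the Unfair coin favors heads, so a run of heads decodes to the Unfair state throughout, and a run of tails to Fair, matching the casino intuition from the earlier slide.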

16
Relations with DBN
Forward function: α_{t+1}(j) = [Σ_i α_t(i) a_ij] b_j(O_{t+1})
Backward function: β_t(i) = Σ_j a_ij b_j(O_{t+1}) β_{t+1}(j), with β_T(i) = 1
Viterbi algorithm: δ_{t+1}(j) = [max_i δ_t(i) a_ij] b_j(O_{t+1})

17
Some more definitions
γ_t(i) is the probability of being in state S_i at time t, given O and λ:
γ_t(i) = α_t(i) β_t(i) / P(O|λ)
ξ_t(i,j) is the probability of being in state S_i at time t and in state S_j at time t+1, given O and λ:
ξ_t(i,j) = α_t(i) a_ij b_j(O_{t+1}) β_{t+1}(j) / P(O|λ)
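Both quantities combine the forward and backward variables. A self-contained sketch (repeating small forward/backward helpers, and using the same assumed casino-style parameters as earlier examples); useful identities for checking an implementation are that each γ_t sums to 1 over states, each ξ_t sums to 1 over state pairs, and γ_t(i) = Σ_j ξ_t(i,j):

```python
def forward(obs, A, B, pi):
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, len(obs)):
        alpha.append([sum(alpha[-1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    return alpha

def backward(obs, A, B):
    N = len(A)
    beta = [[1.0] * N]
    for t in range(len(obs) - 2, -1, -1):
        beta.insert(0, [sum(A[i][j] * B[j][obs[t + 1]] * beta[0][j]
                            for j in range(N)) for i in range(N)])
    return beta

def gamma_xi(obs, A, B, pi):
    """gamma[t][i] = P(q_t = S_i | O); xi[t][i][j] for t < T."""
    N, T = len(pi), len(obs)
    alpha, beta = forward(obs, A, B, pi), backward(obs, A, B)
    pO = sum(alpha[-1])                                  # P(O | lambda)
    gamma = [[alpha[t][i] * beta[t][i] / pO for i in range(N)]
             for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / pO
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    return gamma, xi

A  = [[0.9, 0.1], [0.1, 0.9]]
B  = [[0.5, 0.5], [0.8, 0.2]]
pi = [0.5, 0.5]
gamma, xi = gamma_xi([0, 0, 1], A, B, pi)
```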

18
Baum-Welch Re-estimation
An Expectation-Maximization (EM) algorithm.
Expectation: under the current model λ, compute γ_t(i) and ξ_t(i,j) from the forward and backward variables.

19
Baum-Welch Re-estimation (cont'd)
Maximization: re-estimate λ = (A, B, π) from the expected counts:
π̄_i = γ_1(i)
ā_ij = Σ_{t=1}^{T-1} ξ_t(i,j) / Σ_{t=1}^{T-1} γ_t(i)   (expected transitions S_i → S_j over expected transitions out of S_i)
b̄_j(k) = Σ_{t: O_t = v_k} γ_t(j) / Σ_{t=1}^{T} γ_t(j)   (expected times in S_j observing v_k over expected times in S_j)
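One full E+M iteration can be sketched as below, self-contained with small forward/backward helpers and the same assumed casino-style parameters as in earlier examples. The EM guarantee is that P(O|λ) never decreases after a re-estimation step, and each re-estimated distribution still sums to 1:

```python
def forward(obs, A, B, pi):
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for t in range(1, len(obs)):
        alpha.append([sum(alpha[-1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                      for j in range(N)])
    return alpha

def backward(obs, A, B):
    N = len(A)
    beta = [[1.0] * N]
    for t in range(len(obs) - 2, -1, -1):
        beta.insert(0, [sum(A[i][j] * B[j][obs[t + 1]] * beta[0][j]
                            for j in range(N)) for i in range(N)])
    return beta

def baum_welch_step(obs, A, B, pi):
    """One EM re-estimation of (A, B, pi) from a single sequence."""
    N, M, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, A, B, pi), backward(obs, A, B)
    pO = sum(alpha[-1])
    # E-step: expected state and transition counts.
    gamma = [[alpha[t][i] * beta[t][i] / pO for i in range(N)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / pO
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    # M-step: ratios of expected counts.
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_A, new_B, new_pi

A  = [[0.9, 0.1], [0.1, 0.9]]
B  = [[0.5, 0.5], [0.8, 0.2]]
pi = [0.5, 0.5]
obs = [0, 0, 1, 0, 1, 1]
old_lik = sum(forward(obs, A, B, pi)[-1])
A2, B2, pi2 = baum_welch_step(obs, A, B, pi)
new_lik = sum(forward(obs, A2, B2, pi2)[-1])   # EM: never decreases
```

Iterating this step until the likelihood stops improving is the training loop described on the slides; the stopping point is the local maximum discussed next.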

20
Notes on the Re-estimation
If the model does not change, it has reached a local maximum.
Depending on the model, many local maxima can exist.
Re-estimated probabilities will sum to 1.

21
Implementation Issues
Scaling
Multiple observation sequences
Initial parameter estimation
Missing data
Choice of model size and type

22
Scaling
α calculation: α_t(i) shrinks exponentially as t grows and soon underflows machine precision, so each column is rescaled:
c_t = 1 / Σ_{i=1}^{N} α_t(i),  α̂_t(i) = c_t α_t(i)
Recursion to calculate α̂: apply the usual forward induction to the scaled values α̂_{t-1}(i), then normalize the resulting column by c_t.

23
Scaling (cont'd)
β calculation: the β_t(i) are scaled with the same factors, β̂_t(i) = c_t β_t(i).
Desired condition: Σ_{i=1}^{N} α̂_t(i) = 1.
Note that Σ_{i=1}^{N} β̂_t(i) = 1 is not true!

24
Scaling (cont'd)
The scale factors cancel in the re-estimation formulas, so γ, ξ, and the Baum-Welch updates are unchanged; the likelihood itself is recovered from the scale factors as log P(O|λ) = −Σ_{t=1}^{T} log c_t.
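A scaled forward pass can be sketched as below: normalize each column of α, accumulate the log of the normalizer, and the sum of logs recovers log P(O|λ) exactly (same assumed casino-style parameters as earlier; the true likelihood for H, H, T under them is 0.133115):

```python
import math

def scaled_forward(obs, A, B, pi):
    """Forward pass with per-step rescaling; returns log P(O|lambda).

    c_t = 1 / sum_i alpha_t(i);  log P(O|lambda) = -sum_t log c_t.
    """
    N = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    log_lik = 0.0
    for t in range(1, len(obs) + 1):
        s = sum(alpha)                      # = 1 / c_t
        log_lik += math.log(s)              # accumulate -log c_t
        alpha = [a / s for a in alpha]      # scaled column sums to 1
        if t < len(obs):                    # usual induction on scaled values
            alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                     for j in range(N)]
    return log_lik

A  = [[0.9, 0.1], [0.1, 0.9]]
B  = [[0.5, 0.5], [0.8, 0.2]]
pi = [0.5, 0.5]
log_lik = scaled_forward([0, 0, 1], A, B, pi)
```

Unlike the unscaled forward pass, this version works for arbitrarily long sequences because every stored value stays in [0, 1] with the column summing to 1.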

25
Maximum log-likelihood
Viterbi can also be run entirely in the log domain, which avoids underflow without scaling:
Initialization: φ_1(i) = log π_i + log b_i(O_1)
Recursion: φ_t(j) = max_i [φ_{t-1}(i) + log a_ij] + log b_j(O_t)
Termination: log P* = max_i φ_T(i)

26
Multiple observation sequences
Problem with re-estimation: a single sequence is often not enough to estimate all parameters reliably; the numerators and denominators of the re-estimation formulas are therefore accumulated over all training sequences before dividing.

27
Initial estimates of parameters
For π and A, random or uniform initialization is sufficient.
For B (discrete symbol probabilities), a good initial estimate is needed.

28
Insufficient training data
Solutions:
Increase the size of the training data
Reduce the size of the model
Interpolate parameters using another model

29
References
L. Rabiner. "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition." Proceedings of the IEEE, 1989.
S. Russell, P. Norvig. "Probabilistic Reasoning over Time." AI: A Modern Approach, Ch. 15, 2002 (draft).
V. Borkar, K. Deshmukh, S. Sarawagi. "Automatic Segmentation of Text into Structured Records." ACM SIGMOD, 2001.
T. Scheffer, C. Decomain, S. Wrobel. "Active Hidden Markov Models for Information Extraction." Proceedings of the International Symposium on Intelligent Data Analysis.
S. Ray, M. Craven. "Representing Sentence Structure in Hidden Markov Models for Information Extraction." Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2001.
