Published by Cameron Byrd. Modified over 4 years ago.

1
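Hidden Markov Model
Presenter: Tai-Wen Yue (虞台文), Intelligent Multimedia Lab, Graduate Institute of Computer Science and Engineering, Tatung University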

2
Contents
Introduction
– Markov Chain
– Hidden Markov Model (HMM)
Formal Definition of HMM & Problems
Estimate HMMs by EM-Algorithm
HMM with GMM

3
Hidden Markov Model: Introduction

4
Introduction
A signal generation source emits an observation sequence $O = O_1 O_2 O_3 \cdots O_T$.
The goal is to approximate the source with a model.

5
Example (Traffic Light)
Signal generation source: a traffic light.
Observation sequence: red - red/amber - green - amber - red
Deterministic patterns: each state depends solely on the previous state.
Deterministic systems are relatively easy to understand and analyze.

6
Example (Weather)
Signal generation source: the weather.
Observation sequence: [figure: sequence of weather states]
Nondeterministic patterns: this is the so-called Markov model.
The observation is the state sequence.
The state transition probability depends solely on the previous state.
Which probabilities have to be estimated?

7
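Markov Chain
A set of states: $S = \{S_1, S_2, \ldots, S_N\}$
Transition probability matrix: $A = [a_{ij}]$, where $a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$ and $\sum_{j=1}^{N} a_{ij} = 1$
Observation: the state transition sequence $q_1 q_2 \cdots q_T$
Initial state probability vector: $\pi = [\pi_1, \ldots, \pi_N]^T$, where $\pi_i = P(q_1 = S_i)$
Markov model: $\lambda = (A, \pi)$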
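As a minimal sketch of how these elements fit together, the probability of a particular state sequence under a Markov chain is the initial probability of the first state times the product of the transition probabilities along the sequence. The three-state matrix `A`, the vector `pi`, and the function name `sequence_probability` are hypothetical and are not taken from the slides.

```python
# Hypothetical 3-state chain; the numbers are illustrative, not from the slides.
pi = [0.5, 0.3, 0.2]            # pi[i] = P(q_1 = S_i)
A = [[0.8, 0.1, 0.1],           # A[i][j] = P(q_{t+1} = S_j | q_t = S_i)
     [0.2, 0.6, 0.2],
     [0.3, 0.3, 0.4]]

def sequence_probability(states, pi, A):
    """Return pi[q_1] * prod_t A[q_{t-1}][q_t] for 0-based state indices."""
    prob = pi[states[0]]
    for prev, cur in zip(states, states[1:]):
        prob *= A[prev][cur]
    return prob

# Probability of visiting S_1, S_1, S_2, S_3 in that order.
p = sequence_probability([0, 0, 1, 2], pi, A)
print(p)
```

Here the result is the product 0.5 × 0.8 × 0.1 × 0.2 of the hypothetical entries.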

8
Example
$S_1$: Sunny, $S_2$: Rainy, $S_3$: Cloudy
[Figure: state transition diagram over $S_1$, $S_2$, $S_3$ with edge probabilities 0.8, 0.1, 0.4, 0.3, 0.2, 0.3, 0.6, 0.2]

9
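Properties of Markov Model
Define the state probability vector at time $t$: $q(t) = [q_1(t), \ldots, q_N(t)]^T$, where $q_i(t)$ is the probability that the chain is in state $S_i$ at time $t$.
Given $q(1) = \pi$,
then $q(t+1) = A^T q(t)$, or $q(t+1) = (A^T)^{t} \pi$.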

10
Properties of Markov Model
Steady state: the state probability vector $q$ no longer changes, i.e., $q = A^T q$. Thus $q$ is an eigenvector of the matrix $A^T$ with eigenvalue 1.
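One way to approximate the steady state numerically is to apply the update $q \leftarrow A^T q$ repeatedly until the vector stops changing. The sketch below uses this power iteration rather than an explicit eigendecomposition, and it assumes the chain is well behaved enough for the iteration to converge. The two-state matrix and the names `step` and `steady_state` are hypothetical.

```python
def step(q, A):
    """One update q_new = A^T q, i.e. q_new[j] = sum_i q[i] * A[i][j]."""
    n = len(A)
    return [sum(q[i] * A[i][j] for i in range(n)) for j in range(n)]

def steady_state(A, max_iters=10_000, tol=1e-12):
    """Iterate q <- A^T q from the uniform vector until successive vectors agree."""
    n = len(A)
    q = [1.0 / n] * n
    for _ in range(max_iters):
        nxt = step(q, A)
        if max(abs(a - b) for a, b in zip(q, nxt)) < tol:
            return nxt
        q = nxt
    return q

# Hypothetical two-state chain; its stationary vector is (5/6, 1/6).
A = [[0.9, 0.1],
     [0.5, 0.5]]
q = steady_state(A)
print(q)
```

The returned vector satisfies $q \approx A^T q$, which is the eigenvector condition with eigenvalue 1.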

11
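Properties of Markov Model
Consider an observation sequence in which the chain stays in state $S_i$ for a duration $D_i$ (repeated self-transitions $a_{ii}$) before leaving through one of the $a_{ij}$'s.
How can $a_{ii}$ be estimated?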
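When the state sequence is observed, a common answer is to count transitions: the estimate of $a_{ij}$ is the number of transitions from $S_i$ to $S_j$ divided by the number of transitions out of $S_i$. The following is a minimal sketch of that counting estimate. The state sequence `seq` and the function name `estimate_transitions` are hypothetical.

```python
from collections import Counter

def estimate_transitions(states, n_states):
    """Estimate A[i][j] as count(i -> j) / count(i -> anything)."""
    counts = Counter(zip(states, states[1:]))
    A_hat = []
    for i in range(n_states):
        total = sum(counts[(i, j)] for j in range(n_states))
        # A state that is never left gets an all-zero row in this sketch.
        row = [counts[(i, j)] / total if total else 0.0 for j in range(n_states)]
        A_hat.append(row)
    return A_hat

# Hypothetical observed state sequence over two states (0-based indices).
seq = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]
A_hat = estimate_transitions(seq, 2)
print(A_hat)
```

The diagonal entries `A_hat[i][i]` are the estimates of the self-transition probabilities $a_{ii}$.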

12
Example (Coin Tossing)
Signal generation source: coin tossing.
Observation sequence: HHTTTHTTH…H
What will the model be? There may be several different coins, and the coins may be biased.

13
One-Coin Model
Observation sequence: HHTTTHTTH…H
[Figure: state diagram with edges labeled $P(H)$, $1 - P(H)$, $1 - P(H)$, $P(H)$]
The observation sequence is the same as the state sequence.
How can the parameter of the model be estimated?

14
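Two-Coin Model
Observation sequence: HHTTTHTTH…H
[Figure: two states, one per coin, with edges labeled $a_{11}$, $a_{12}$, $1 - a_{11}$, $1 - a_{12}$]
State 1: $P(H) = P_1$, $P(T) = 1 - P_1$
State 2: $P(H) = P_2$, $P(T) = 1 - P_2$
State sequence: 1 2 2 1 1 1 2 2 1 … 2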

15
Two-Coin Model
The observation sequence HHTTTHTTH…H is observable.
The state sequence 1 2 2 1 1 1 2 2 1 … 2 is unobservable.

16
Two-Coin Model
Given the observation sequence, how can the model be estimated?
Because the state sequence is hidden, the model is called a Hidden Markov Model (HMM).
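The generative view of the two-coin model can be sketched as follows: a hidden state picks which coin is tossed, the coin emits H or T, and the state then moves according to the transition probabilities. All numeric values, the seed, and the function name `sample_hmm` are hypothetical choices for illustration only.

```python
import random

# Hypothetical two-coin parameters (not from the slides).
pi = [0.5, 0.5]                 # initial probability of each coin
A = [[0.7, 0.3],                # A[i][j] = P(next coin is j | current coin is i)
     [0.4, 0.6]]
B = [[0.5, 0.5],                # B[i] = [P(H), P(T)] for coin i
     [0.9, 0.1]]
symbols = ["H", "T"]

def sample_hmm(pi, A, B, symbols, T, rng):
    """Draw T steps; return the hidden state sequence and the observations."""
    states, obs = [], []
    state = rng.choices(range(len(pi)), weights=pi)[0]
    for _ in range(T):
        states.append(state)
        obs.append(rng.choices(symbols, weights=B[state])[0])
        state = rng.choices(range(len(pi)), weights=A[state])[0]
    return states, obs

rng = random.Random(0)          # fixed seed so repeated runs agree
states, observations = sample_hmm(pi, A, B, symbols, 10, rng)
print("".join(observations))    # what an observer sees
print(states)                   # what stays hidden
```

Only the first printed line would be available to someone estimating the model; the second line is the hidden state sequence.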

17
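Example: The Urn and Ball
[Figure: three states, urn 1, urn 2, and urn 3, with self-transitions $a_{11}$, $a_{22}$, $a_{33}$ and cross-transitions $a_{12}$, $a_{23}$, $a_{21}$, $a_{32}$, $a_{31}$, $a_{13}$]
Observation sequence: [figure: sequence of drawn balls]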

18
Example: The Urn and Ball
Given the observation sequence, how can the model be estimated?
It is a Hidden Markov Model (HMM).

19
What is HMM? An HMM is a Markov chain, where each state generates observations. You only see the observations, and the goal is to infer the hidden state sequence. HMMs are very useful for time-series modeling, since the discrete state-space can be used to approximate many non-linear, non-Gaussian systems.

20
HMM Applications
Pattern Recognition
– Speech Recognition
– Face Recognition
– Gesture Recognition
– Handwritten Character Recognition
Molecular Biology, Biochemistry, and Genetics
Sequence Data Analysis & Alignment

21
Hidden Markov Model: Formal Definition of HMM & Problems

22
Elements of an HMM
A set of N states
A set of M observation symbols
The state transition probability distribution
The observation symbol distribution for each state
The initial state distribution

23
The Definition
We represent an HMM by a tuple, say, by $\lambda = (A, B, \pi)$.
$A$: state transition probability distribution.
$B$: observation symbol probability distribution.
$\pi$: initial state distribution.

24
The Three Basic Problems for HMMs
Given
1. Observation sequence $O = O_1 O_2 \cdots O_T$
2. Model $\lambda = (A, B, \pi)$
Problem I: How to compute $P(O \mid \lambda)$ efficiently?
Problem II: How to find the state sequence $Q = q_1 q_2 \cdots q_T$ that best explains the observations?
Problem III: How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?

25
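Problem I
A straightforward solution: let $Q = q_1 q_2 \cdots q_T$ be a state sequence that generates the observation sequence $O = O_1 O_2 \cdots O_T$. Then
$P(O \mid \lambda) = \sum_{Q} P(O \mid Q, \lambda) P(Q \mid \lambda) = \sum_{q_1, \ldots, q_T} \pi_{q_1} b_{q_1}(O_1) a_{q_1 q_2} b_{q_2}(O_2) \cdots a_{q_{T-1} q_T} b_{q_T}(O_T)$
The time complexity is huge: there are $N^T$ possible state sequences, so the cost is on the order of $T N^T$ operations.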
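The straightforward solution can be sketched by enumerating every state sequence and summing the joint probabilities. This is only practical for very short sequences, which is the point of the slide. The model values and the name `brute_force_likelihood` are hypothetical; observation symbols are encoded as 0-based integers.

```python
from itertools import product

# Hypothetical 2-state, 2-symbol HMM (not from the slides).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],                # B[j][k] = P(symbol k | state j)
     [0.2, 0.8]]

def brute_force_likelihood(obs, pi, A, B):
    """Sum P(O, Q | model) over all N^T state sequences Q."""
    N = len(pi)
    total = 0.0
    for Q in product(range(N), repeat=len(obs)):
        p = pi[Q[0]] * B[Q[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[Q[t - 1]][Q[t]] * B[Q[t]][obs[t]]
        total += p
    return total

likelihood = brute_force_likelihood([0, 1], pi, A, B)
print(likelihood)
```

The loop body runs $N^T$ times, so doubling the sequence length squares the number of terms.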

26
Problem I
Solution by a forward induction procedure. Define the forward variable
$\alpha_t(i) = P(O_1 O_2 \cdots O_t, q_t = S_i \mid \lambda)$

27
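Problem I
Solution by a forward induction procedure:
[Figure: trellis of states $1, 2, 3, \ldots, N$ against time $1, 2, 3, \ldots, T-1, T$]
Define $\alpha_t(i) = P(O_1 O_2 \cdots O_t, q_t = S_i \mid \lambda)$. Then
Initialization: $\alpha_1(i) = \pi_i b_i(O_1)$, $1 \le i \le N$
Induction: $\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i) a_{ij} \right] b_j(O_{t+1})$, $1 \le t \le T-1$
Termination: $P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$
Time complexity: $O(N^2 T)$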
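A minimal sketch of the forward procedure, assuming the usual forward variable $\alpha_t(i) = P(O_1 \cdots O_t, q_t = S_i \mid \lambda)$, is given below. The model values and the function name `forward` are hypothetical; observation symbols are 0-based integers.

```python
# Hypothetical 2-state, 2-symbol HMM (not from the slides).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],                # B[j][k] = P(symbol k | state j)
     [0.2, 0.8]]

def forward(obs, pi, A, B):
    """Return the table alpha, where alpha[t][i] covers obs[0..t] ending in state i."""
    N = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha

alpha = forward([0, 1], pi, A, B)
# Termination: P(O | model) is the sum of the last row.
likelihood = sum(alpha[-1])
print(likelihood)
```

Each time step costs about $N^2$ multiplications, so the whole table costs on the order of $N^2 T$ instead of $N^T$.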

28
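Problem I
Solution by a backward induction procedure. Define the backward variable
$\beta_t(i) = P(O_{t+1} O_{t+2} \cdots O_T \mid q_t = S_i, \lambda)$
Initialization: $\beta_T(i) = 1$, $1 \le i \le N$
Induction: $\beta_t(i) = \sum_{j=1}^{N} a_{ij} b_j(O_{t+1}) \beta_{t+1}(j)$, $t = T-1, \ldots, 1$
Time complexity: $O(N^2 T)$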
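The backward procedure can be sketched in the same style, assuming the usual backward variable $\beta_t(i) = P(O_{t+1} \cdots O_T \mid q_t = S_i, \lambda)$ with $\beta_T(i) = 1$. The model values and the function name `backward` are hypothetical.

```python
# Hypothetical 2-state, 2-symbol HMM (not from the slides).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],                # B[j][k] = P(symbol k | state j)
     [0.2, 0.8]]

def backward(obs, A, B):
    """Return the table beta, where beta[t][i] covers obs[t+1..] given state i at t."""
    N = len(A)
    # Initialization: beta_T(i) = 1
    beta = [[1.0] * N]
    # Induction, walking from the last observation back to the second one:
    # beta_t(i) = sum_j a_ij * b_j(O_{t+1}) * beta_{t+1}(j)
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta

obs = [0, 1]
beta = backward(obs, A, B)
# P(O | model) can also be recovered from the first row of beta.
likelihood = sum(pi[i] * B[i][obs[0]] * beta[0][i] for i in range(len(pi)))
print(likelihood)
```

The likelihood computed from the backward table matches the one obtained by summing the last row of the forward table for the same model and observations.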

29
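Problem I
Side product of the forward-backward procedure. Define
$\gamma_t(i) = P(q_t = S_i \mid O, \lambda) = \dfrac{\alpha_t(i) \beta_t(i)}{P(O \mid \lambda)} = \dfrac{\alpha_t(i) \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j) \beta_t(j)}$
Fact: $\sum_{i=1}^{N} \gamma_t(i) = 1$
Most likely state at time $t$: $q_t^* = \arg\max_{1 \le i \le N} \gamma_t(i)$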
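A sketch of this side product, assuming the standard posterior $\gamma_t(i) = \alpha_t(i) \beta_t(i) / P(O \mid \lambda)$, combines the forward and backward tables and picks the most likely state at each time step. The model values and function names are hypothetical.

```python
# Hypothetical 2-state, 2-symbol HMM (not from the slides).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],
     [0.2, 0.8]]

def forward(obs, pi, A, B):
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha

def backward(obs, A, B):
    N = len(A)
    beta = [[1.0] * N]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta

def state_posteriors(obs, pi, A, B):
    """Return gamma, where gamma[t][i] = P(state i at time t | all observations)."""
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = sum(alpha[-1])
    return [[a * b / likelihood for a, b in zip(alpha[t], beta[t])]
            for t in range(len(obs))]

gamma = state_posteriors([0, 1], pi, A, B)
# Individually most likely state at each time step.
most_likely = [max(range(len(g)), key=lambda i: g[i]) for g in gamma]
print(gamma)
print(most_likely)
```

Choosing each state independently in this way maximizes the expected number of correct states, but the resulting sequence is not guaranteed to be a valid path when some transitions have zero probability.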

30
Problem II
How to find the state sequence $Q = q_1 q_2 \cdots q_T$ that best explains the observations?
Unlike Problem I, the solution depends on the optimality criterion. For example:
1. Maximize the expected number of correct states: choose $q_t^* = \arg\max_{1 \le i \le N} \gamma_t(i)$ for each $t$.
2. Viterbi algorithm: choose the single best state sequence $Q^* = \arg\max_{Q} P(Q \mid O, \lambda)$.

31
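Viterbi Algorithm
Define
$\delta_t(i) = \max_{q_1, \ldots, q_{t-1}} P(q_1 q_2 \cdots q_{t-1}, q_t = S_i, O_1 O_2 \cdots O_t \mid \lambda)$
To retrieve the backward path, we define
$\psi_t(j) = \arg\max_{1 \le i \le N} \left[ \delta_{t-1}(i) a_{ij} \right]$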

32
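Viterbi Algorithm
Initialization: $\delta_1(i) = \pi_i b_i(O_1)$, $\psi_1(i) = 0$, $1 \le i \le N$
Recursion: $\delta_t(j) = \max_{1 \le i \le N} \left[ \delta_{t-1}(i) a_{ij} \right] b_j(O_t)$, $\psi_t(j) = \arg\max_{1 \le i \le N} \left[ \delta_{t-1}(i) a_{ij} \right]$, $2 \le t \le T$
Termination: $P^* = \max_{1 \le i \le N} \delta_T(i)$, $q_T^* = \arg\max_{1 \le i \le N} \delta_T(i)$
Backtracking: $q_t^* = \psi_{t+1}(q_{t+1}^*)$, $t = T-1, T-2, \ldots, 1$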
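The four phases named on this slide can be sketched as follows, assuming the standard definitions of $\delta_t(i)$ (best path score ending in $S_i$ at time $t$) and $\psi_t(j)$ (best predecessor of $S_j$). The model values and the function name `viterbi` are hypothetical.

```python
# Hypothetical 2-state, 2-symbol HMM (not from the slides).
pi = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.9, 0.1],
     [0.2, 0.8]]

def viterbi(obs, pi, A, B):
    """Return the single best state path and its joint probability with obs."""
    N, T = len(pi), len(obs)
    # Initialization
    delta = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    psi = [[0] * N]
    # Recursion
    for t in range(1, T):
        row_delta, row_psi = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[t - 1][i] * A[i][j])
            row_psi.append(best_i)
            row_delta.append(delta[t - 1][best_i] * A[best_i][j] * B[j][obs[t]])
        delta.append(row_delta)
        psi.append(row_psi)
    # Termination
    last = max(range(N), key=lambda i: delta[-1][i])
    # Backtracking: follow the stored predecessors from the end to the start
    path = [last]
    for t in range(T - 1, 0, -1):
        path.insert(0, psi[t][path[0]])
    return path, delta[-1][last]

path, best_prob = viterbi([0, 1], pi, A, B)
print(path, best_prob)
```

For long sequences the products underflow, so practical implementations usually work with log probabilities and replace the products with sums.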

33
Problem III
How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?
This is the most difficult problem of the three. Approaches:
1. Baum-Welch Method
2. EM Algorithm
These two methods are in fact the same, but with different formulations.

34
Baum-Welch Method
How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?
Which parameters must be estimated, and how?

35
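Baum-Welch Method
How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?
Review: the forward variable $\alpha_t(i) = P(O_1 O_2 \cdots O_t, q_t = S_i \mid \lambda)$ and the backward variable $\beta_t(i) = P(O_{t+1} O_{t+2} \cdots O_T \mid q_t = S_i, \lambda)$.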

36
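Baum-Welch Method
How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?
[Figure: transition from state $S_i$ at time $t$ to state $S_j$ at time $t+1$]
Define
$\xi_t(i, j) = P(q_t = S_i, q_{t+1} = S_j \mid O, \lambda) = \dfrac{\alpha_t(i) a_{ij} b_j(O_{t+1}) \beta_{t+1}(j)}{P(O \mid \lambda)}$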

37
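Baum-Welch Method
How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?
Define $\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j)$ for $1 \le t \le T-1$: the probability that $q_t = S_i$ given $O$ and $\lambda$, as stated before.
Facts:
$\sum_{t=1}^{T-1} \gamma_t(i)$ is the expected number of transitions from $S_i$.
$\sum_{t=1}^{T-1} \xi_t(i, j)$ is the expected number of transitions from $S_i$ to $S_j$.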

38
Baum-Welch Method

39
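Baum-Welch Method
1. Provide an initial model $\lambda = (A, B, \pi)$.
2. Reestimate the model as $\bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi})$, where
$\bar{\pi}_i = \gamma_1(i)$, $\bar{a}_{ij} = \dfrac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$, $\bar{b}_j(k) = \dfrac{\sum_{t : O_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}$
and $v_k$ denotes the $k$th observation symbol.
3. Let $\lambda = \bar{\lambda}$.
4. If not converged, go to step 2.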
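The four steps can be sketched for a single discrete observation sequence as below. This is a minimal illustration that assumes the standard reestimation formulas based on $\gamma_t(i)$ and $\xi_t(i, j)$, uses unscaled probabilities (so it only suits short sequences), and runs a fixed number of iterations instead of a convergence test. The initial model, the observation sequence, and all function names are hypothetical.

```python
def forward(obs, pi, A, B):
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha

def backward(obs, A, B):
    N = len(A)
    beta = [[1.0] * N]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(N))
                        for i in range(N)])
    return beta

def baum_welch_step(obs, pi, A, B):
    """One reestimation; also returns P(O | model) for the model passed in."""
    N, M, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = sum(alpha[-1])
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(N)]
             for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_pi, new_A, new_B, likelihood

# Step 1: hypothetical initial model and observations (0-based symbols).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]

history = []
for _ in range(10):                      # steps 2 to 4, fixed iteration count
    pi, A, B, likelihood = baum_welch_step(obs, pi, A, B)
    history.append(likelihood)
print(history)
```

The recorded likelihoods should never decrease from one iteration to the next, which is the property that makes the loop in steps 2 to 4 meaningful.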

40
Hidden Markov Model: Estimate HMMs by EM-Algorithm

41
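Complete-Data Likelihood for HMM
Observation sequence: $o = o_1 o_2 o_3 \cdots o_T$
Hidden state sequence: $q = q_1 q_2 q_3 \cdots q_T$
Complete data: $(o, q)$
Complete-data likelihood: $P(o, q \mid \lambda) = \pi_{q_1} b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t} b_{q_t}(o_t)$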

42
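Q-function (E-Step)
$Q(\lambda, \lambda') = \sum_{q} P(q \mid o, \lambda') \log P(o, q \mid \lambda) = \dfrac{1}{P(o \mid \lambda')} \sum_{q} P(o, q \mid \lambda') \log P(o, q \mid \lambda)$
where $\lambda'$ denotes the current parameter estimate. The factor $P(o \mid \lambda')$ is independent of $q$.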

43
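Maximization Step
Maximize the Q-function over $\lambda = (A, B, \pi)$
subject to $\sum_{i=1}^{N} \pi_i = 1$, $\sum_{j=1}^{N} a_{ij} = 1$ for each $i$, and $\sum_{k=1}^{M} b_j(k) = 1$ for each $j$.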

44
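Maximization Step
The maximization splits into three separate constrained problems: solve for $\pi_i$, for $a_{ij}$, and for $b_j(k)$, each subject to its own normalization constraint.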

45
Maximization Step
Solve for $\pi_i$

46
Maximization Step
This is the same result as the Baum-Welch method.

47
Maximization Step
Solve for $a_{ij}$

48
Maximization Step
Solve for $a_{ij}$

49
Maximization Step
Again, this gives the same result as the Baum-Welch method.

50
Maximization Step
Solve for $b_j(k)$

51
Maximization Step
Solve for $b_j(k)$

52
Maximization Step
We conclude that learning an HMM with the EM algorithm is equivalent to the Baum-Welch method.

53
Summary

54
Hidden Markov Model: HMM with GMM

55
Continuous Observation Densities in HMMs
In many cases, the observations are continuous-valued, e.g., in speech signal processing.
Approaches:
– Quantize the signal to a discrete one, e.g., by vector quantization.
– Use a tractable, reasonable statistical model, e.g., a GMM.

56
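GMM (Gaussian Mixture Model)
$b_j(o) = \sum_{k=1}^{K} c_{jk} \, \mathcal{N}(o; \mu_{jk}, \Sigma_{jk})$
where $\mathcal{N}$ is the Gaussian density, and $c_{jk}$, $\mu_{jk}$, and $\Sigma_{jk}$ are the frequency (mixture weight), mean, and covariance matrix for the $k$th mixture in the $j$th state.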
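As a simplified sketch of a mixture observation density for one state, the code below evaluates a weighted sum of Gaussian densities. It assumes scalar observations, so each covariance matrix reduces to a variance; the multivariate case on the slide would need a full covariance matrix. The mixture parameters and the function names `gaussian_pdf` and `gmm_density` are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gmm_density(x, weights, means, variances):
    """Weighted sum of Gaussian densities; the weights are assumed to sum to 1."""
    return sum(c * gaussian_pdf(x, m, v)
               for c, m, v in zip(weights, means, variances))

# Hypothetical two-component mixture for a single state.
weights = [0.3, 0.7]
means = [-1.0, 2.0]
variances = [0.5, 1.5]

print(gmm_density(0.0, weights, means, variances))
```

In an HMM with GMM emissions, a function like this would play the role of $b_j(o)$ for state $j$, with one set of weights, means, and variances per state.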

57
GMM (Gaussian Mixture Model) How to estimate the parameters?

58
Baum-Welch Method
[Figure: the initial model is mapped to the reestimated model]

59
Definitions
The probability of reaching state $S_i$ at time $t$.
The probability of reaching state $S_i$ at time $t$ with the $k$th mixture selected.

60
Definitions

61
Method
