
# Hidden Markov Model. Presenter: 虞台文, Intelligent Multimedia Lab, Graduate Institute of Computer Science and Engineering, Tatung University


Hidden Markov Model. Presenter: 虞台文, Intelligent Multimedia Lab, Graduate Institute of Computer Science and Engineering, Tatung University

Contents

- Introduction
  - Markov Chain
  - Hidden Markov Model (HMM)
- Formal Definition of HMM & Problems
- Estimate HMMs by EM-Algorithm
- HMM with GMM

Hidden Markov Model: Introduction

Introduction. A signal generation source emits an observation sequence $O = O_1 O_2 O_3 \cdots O_T$. The goal is a model that approximates the source.

Example (Traffic Light). The signal generation source emits the observation sequence red, red/amber, green, amber, red. This is a deterministic pattern: each state depends solely on the previous state. Deterministic systems are relatively easy to understand and analyze.

Example (Weather). The signal generation source emits an observation sequence of weather states with nondeterministic patterns. This is the so-called Markov model: the observation is the state sequence, and the state transition probability depends solely on the previous state. Which probabilities have to be estimated?

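Markov Chain.

- A set of states: $S = \{S_1, S_2, \ldots, S_N\}$
- Transition probability matrix: $A = [a_{ij}]$, with $a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i)$ and $\sum_{j=1}^{N} a_{ij} = 1$
- Observation: the state transition sequence $q_1 q_2 \ldots q_T$
- Initial state probability vector: $\pi = (\pi_1, \ldots, \pi_N)$, with $\pi_i = P(q_1 = S_i)$
- Markov model: $\lambda = (A, \pi)$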

Example. States: $S_1$: Sunny, $S_2$: Rainy, $S_3$: Cloudy. [Figure: state-transition diagram over $S_1$, $S_2$, $S_3$ with edge probabilities 0.8, 0.1, 0.4, 0.3, 0.2, 0.3, 0.6, 0.2.]

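Properties of Markov Model. Define the state probability vector at time $t$ as $\pi(t) = (\pi_1(t), \ldots, \pi_N(t))^T$ with $\pi_i(t) = P(q_t = S_i)$. Given $\pi(t)$, then

$$\pi_j(t+1) = \sum_{i=1}^{N} \pi_i(t)\, a_{ij}, \quad \text{or} \quad \pi(t+1) = A^T \pi(t).$$

Properties of Markov Model. Repeating the update gives $\pi(t) = (A^T)^{t-1} \pi(1)$. Steady state: a vector $\mathbf{q}$ that satisfies $\mathbf{q} = A^T \mathbf{q}$, that is, $\mathbf{q}$ is an eigenvector of the matrix $A^T$ with eigenvalue 1.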
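The steady-state property can be checked numerically. The sketch below is only an illustration: it applies the update "new vector = $A^T$ times old vector" repeatedly until the vector stops changing. The 3-state matrix `A` is a hypothetical example (its values are not the ones from the weather slide), and the function names `step` and `steady_state` are made up for this sketch.

```python
def step(A, q):
    """One update q_next = A^T q, i.e. q_next[j] = sum_i q[i] * A[i][j]."""
    n = len(A)
    return [sum(q[i] * A[i][j] for i in range(n)) for j in range(n)]


def steady_state(A, tol=1e-12, max_iter=10_000):
    """Power iteration from the uniform distribution.

    Assumes the chain is ergodic so that the iteration converges.
    """
    n = len(A)
    q = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = step(A, q)
        if max(abs(a - b) for a, b in zip(nxt, q)) < tol:
            return nxt
        q = nxt
    return q


# Hypothetical row-stochastic transition matrix (each row sums to 1).
A = [
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
]

q = steady_state(A)
print("steady state:", [round(x, 4) for x in q])
```

Applying `step(A, q)` to the returned vector leaves it unchanged up to rounding, which is the eigenvalue-1 condition in code form.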

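Properties of Markov Model. Consider an observation sequence in which the chain stays in state $S_i$ for a duration $D_i$ before leaving. [Figure: state $S_i$ with self-transition $a_{ii}$ and outgoing transitions $a_{ij}$.] How to estimate $a_{ii}$?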
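One common answer, assumed here rather than taken from the slide, is the counting estimate: when the state sequence is fully observed, the estimate of $a_{ij}$ is the number of observed transitions from $S_i$ to $S_j$ divided by the number of transitions out of $S_i$. The sketch below uses 0-based state indices and a made-up state sequence; `estimate_transitions` is a hypothetical helper name.

```python
def estimate_transitions(states, n_states):
    """Estimate a_ij by counting transitions in an observed state sequence."""
    counts = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1

    A_hat = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row in this sketch.
        A_hat.append([c / total if total else 0.0 for c in row])
    return A_hat


# Made-up observed state sequence over states 0, 1, 2.
seq = [0, 0, 0, 1, 1, 0, 2, 2, 2, 0]
A_hat = estimate_transitions(seq, 3)

for i, row in enumerate(A_hat):
    print(f"from state {i}:", [round(p, 3) for p in row])
```

In this sequence state 0 is left four times, two of them back to itself, so the estimate of the self-transition probability for state 0 is 0.5.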

Example (Coin Tossing). The signal generation source emits the observation sequence HHTTTHTTH…H. What will the model be? Several different coins may be in use, and the coins may be biased.

One-Coin Model. Observation sequence: HHTTTHTTH…H. [Figure: two states, heads and tails, with transition probabilities $P(H)$ and $1 - P(H)$.] The observation sequence is the same as the state sequence. How to estimate the parameter of the model?

Two-Coin Model.

- Observation sequence (observable): HHTTTHTTH…H
- State sequence (unobservable): 1 2 2 1 1 1 2 2 1 … 2
- Coin 1 (state 1): $P(H) = P_1$, $P(T) = 1 - P_1$
- Coin 2 (state 2): $P(H) = P_2$, $P(T) = 1 - P_2$
- The diagram labels the transitions $a_{11}$, $1 - a_{11}$, $a_{12}$, $1 - a_{12}$.

Given the observation sequence, how to estimate the model? Because the state sequence is hidden, it is called the Hidden Markov Model (HMM).
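To make the observable/unobservable split concrete, here is a minimal generative sketch of a two-coin model. All parameter values are hypothetical, and the parameterization is an assumption: `a11` and `a22` are taken to be the probabilities of staying with coin 1 and coin 2, and `p1`, `p2` are the head probabilities of the two coins.

```python
import random


def sample_two_coin(T, a11, a22, p1, p2, seed=0):
    """Generate T tosses from a two-coin HMM.

    Returns (states, observations). Only the observations would be
    visible to an observer; the states are the hidden part.
    """
    rng = random.Random(seed)
    stay = {1: a11, 2: a22}      # probability of keeping the same coin
    p_head = {1: p1, 2: p2}      # head probability of each coin

    state = rng.choice([1, 2])   # uniform initial state (an assumption)
    states, obs = [], []
    for _ in range(T):
        states.append(state)
        obs.append("H" if rng.random() < p_head[state] else "T")
        if rng.random() >= stay[state]:
            state = 3 - state    # switch to the other coin
    return states, obs


states, obs = sample_two_coin(T=20, a11=0.9, a22=0.8, p1=0.5, p2=0.9)
print("hidden states:", "".join(map(str, states)))
print("observations :", "".join(obs))
```

Estimating `a11`, `a22`, `p1`, and `p2` from `obs` alone, without `states`, is the learning problem the slide poses.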

Example: The Urn and Ball. Three urns (urn 1, urn 2, urn 3) act as the states, with transition probabilities $a_{11}, a_{12}, a_{13}, a_{21}, a_{22}, a_{23}, a_{31}, a_{32}, a_{33}$. Observation sequence: the balls drawn from the urns. Given the observation sequence, how to estimate the model? It is a Hidden Markov Model (HMM).

What is HMM? An HMM is a Markov chain, where each state generates observations. You only see the observations, and the goal is to infer the hidden state sequence. HMMs are very useful for time-series modeling, since the discrete state-space can be used to approximate many non-linear, non-Gaussian systems.

HMM Applications

- Pattern Recognition
  - Speech Recognition
  - Face Recognition
  - Gesture Recognition
  - Handwriting Character Recognition
- Molecular Biology, Biochemistry and Genetics
- Sequence Data Analysis & Alignment

Hidden Markov Model: Formal Definition of HMM & Problems

Elements of an HMM

- A set of $N$ states
- A set of $M$ observation symbols
- The state transition probability distribution
- The observation symbol distribution for each state
- The initial state distribution

The Definition. We represent an HMM by a tuple, say, $\lambda = (A, B, \pi)$, where

- $A$: state transition probability distribution,
- $B$: observation symbol probability distribution,
- $\pi$: initial state distribution.

The Three Basic Problems for HMMs. Given:

1. an observation sequence $O = O_1 O_2 \ldots O_T$, and
2. a model $\lambda = (A, B, \pi)$.

Problem I: How to compute $P(O \mid \lambda)$ efficiently?

Problem II: How to find the state sequence $Q = q_1 q_2 \ldots q_T$ that best explains the observations?

Problem III: How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$?

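Problem I. A straightforward solution: let $Q = q_1 q_2 \ldots q_T$ be the state sequence that generates the observation sequence $O = O_1 O_2 \ldots O_T$. Summing over every possible state sequence gives

$$P(O \mid \lambda) = \sum_{Q} P(O \mid Q, \lambda)\, P(Q \mid \lambda) = \sum_{q_1, \ldots, q_T} \pi_{q_1} b_{q_1}(O_1)\, a_{q_1 q_2} b_{q_2}(O_2) \cdots a_{q_{T-1} q_T} b_{q_T}(O_T).$$

There are $N^T$ state sequences, so the time complexity, on the order of $T \cdot N^T$, is really huge.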
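The straightforward solution can be written down directly for a toy model. The sketch below enumerates every state sequence and sums the joint probabilities, so it is only usable for very small $N$ and $T$. The model values `A`, `B`, `pi` are hypothetical, states and symbols are 0-based integers, and `B[i][k]` stands for the probability that state `i` emits symbol `k`.

```python
from itertools import product


def brute_force_likelihood(A, B, pi, obs):
    """P(O | model) by summing over all N**T state sequences."""
    N, T = len(pi), len(obs)
    total = 0.0
    for path in product(range(N), repeat=T):
        # Joint probability of this state path and the observations.
        p = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, T):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total


# Hypothetical 2-state, 2-symbol model.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]

print(brute_force_likelihood(A, B, pi, [0, 1, 0]))
```

The loop over `product(range(N), repeat=T)` is exactly the $N^T$ term in the complexity: adding one more time step multiplies the work by $N$.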

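Problem I. Solution by a forward induction procedure. Define the forward variable

$$\alpha_t(i) = P(O_1 O_2 \ldots O_t,\, q_t = S_i \mid \lambda).$$

- Initialization: $\alpha_1(i) = \pi_i\, b_i(O_1)$, $1 \le i \le N$
- Induction: $\alpha_{t+1}(j) = \Big[\sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\Big] b_j(O_{t+1})$, $1 \le t \le T-1$
- Termination: $P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$

[Figure: trellis of states $1, 2, 3, \ldots, N$ against time steps $1, 2, 3, \ldots, T-1, T$.]

Time complexity: $O(N^2 T)$.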
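The forward induction can be sketched in a few lines, assuming the usual forward variable $\alpha_t(i) = P(O_1 \ldots O_t, q_t = S_i \mid \lambda)$, which starts at $\pi_i b_i(O_1)$ and is extended one time step at a time. The toy model is hypothetical, indices are 0-based, and no scaling is applied, so long sequences would underflow.

```python
def forward(A, B, pi, obs):
    """Return the list of forward vectors alpha[t][i] for t = 0..T-1."""
    N = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
            for j in range(N)
        ])
    return alpha


def likelihood(A, B, pi, obs):
    """Termination: P(O | model) = sum_i alpha_T(i)."""
    return sum(forward(A, B, pi, obs)[-1])


# Hypothetical 2-state, 2-symbol model.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]

print(likelihood(A, B, pi, [0, 1, 0]))
```

Each time step costs $N^2$ multiplications (the double loop over `i` and `j`), which is where the $N^2 T$ total comes from.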

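Problem I. Solution by a backward induction procedure. Define the backward variable

$$\beta_t(i) = P(O_{t+1} O_{t+2} \ldots O_T \mid q_t = S_i, \lambda).$$

- Initialization: $\beta_T(i) = 1$, $1 \le i \le N$
- Induction: $\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)$, $t = T-1, \ldots, 1$
- Termination: $P(O \mid \lambda) = \sum_{i=1}^{N} \pi_i\, b_i(O_1)\, \beta_1(i)$

Time complexity: $O(N^2 T)$.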
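A matching sketch of the backward induction, assuming the usual backward variable $\beta_t(i) = P(O_{t+1} \ldots O_T \mid q_t = S_i, \lambda)$ with $\beta_T(i) = 1$. The toy model is hypothetical and indices are 0-based.

```python
def backward(A, B, obs):
    """Return the list of backward vectors beta[t][i] for t = 0..T-1."""
    N, T = len(A), len(obs)
    # Initialization: beta_T(i) = 1
    beta = [[1.0] * N for _ in range(T)]
    # Induction, from the second-to-last time step back to the first:
    # beta_t(i) = sum_j a_ij * b_j(O_{t+1}) * beta_{t+1}(j)
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N)
            )
    return beta


def likelihood_backward(A, B, pi, obs):
    """P(O | model) = sum_i pi_i * b_i(O_1) * beta_1(i)."""
    beta = backward(A, B, obs)
    return sum(pi[i] * B[i][obs[0]] * beta[0][i] for i in range(len(pi)))


# Hypothetical 2-state, 2-symbol model.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]

print(likelihood_backward(A, B, pi, [0, 1, 0]))
```

The value agrees with the forward computation on the same model, which is a convenient sanity check when implementing both.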

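Problem I. Side product of the forward-backward procedure. Define

$$\gamma_t(i) = P(q_t = S_i \mid O, \lambda) = \frac{\alpha_t(i)\, \beta_t(i)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\, \beta_t(j)}.$$

Fact: $\sum_{i=1}^{N} \gamma_t(i) = 1$.

Most likely state at time $t$: $q_t^* = \arg\max_{1 \le i \le N} \gamma_t(i)$.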
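The side product can be sketched by combining the two passes. This assumes the standard definitions: the posterior probability of being in state $S_i$ at time $t$ is the product of the forward and backward variables for that state and time, divided by $P(O \mid \lambda)$. The toy model is hypothetical and indices are 0-based.

```python
def forward(A, B, pi, obs):
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha


def backward(A, B, obs):
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(N))
    return beta


def state_posteriors(A, B, pi, obs):
    """gamma[t][i] = P(state i at time t | observations, model)."""
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
    p_obs = sum(alpha[-1])
    return [[a * b / p_obs for a, b in zip(alpha[t], beta[t])]
            for t in range(len(obs))]


# Hypothetical 2-state, 2-symbol model.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]

gamma = state_posteriors(A, B, pi, [0, 1])
# Individually most likely state at each time step.
best = [max(range(len(pi)), key=lambda i: g[i]) for g in gamma]
print(gamma, best)
```

Note that picking the best state separately at each time step maximizes the expected number of correct states, but the resulting sequence is not guaranteed to be a valid path if some transition probabilities are zero.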

Problem II. How to find the state sequence $Q = q_1 q_2 \ldots q_T$ that best explains the observations? Unlike Problem I, the solution depends on the optimality criterion. For example:

1. Maximize the expected number of correct states: choose $q_t^* = \arg\max_i \gamma_t(i)$ at each time $t$ individually.
2. Viterbi Algorithm: choose the single best state sequence, $Q^* = \arg\max_Q P(Q \mid O, \lambda) = \arg\max_Q P(Q, O \mid \lambda)$.

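Viterbi Algorithm. Define

$$\delta_t(i) = \max_{q_1, \ldots, q_{t-1}} P(q_1 q_2 \ldots q_{t-1},\, q_t = S_i,\, O_1 O_2 \ldots O_t \mid \lambda),$$

so that $\delta_{t+1}(j) = \big[\max_i \delta_t(i)\, a_{ij}\big]\, b_j(O_{t+1})$. To retrieve the backward path, we define

$$\psi_{t+1}(j) = \arg\max_{1 \le i \le N} \delta_t(i)\, a_{ij}.$$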

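Viterbi Algorithm.

- Initialization: $\delta_1(i) = \pi_i\, b_i(O_1)$, $\psi_1(i) = 0$, $1 \le i \le N$
- Recursion: $\delta_t(j) = \big[\max_i \delta_{t-1}(i)\, a_{ij}\big]\, b_j(O_t)$ and $\psi_t(j) = \arg\max_i \delta_{t-1}(i)\, a_{ij}$, $2 \le t \le T$
- Termination: $P^* = \max_i \delta_T(i)$, $q_T^* = \arg\max_i \delta_T(i)$
- Backtracking: $q_t^* = \psi_{t+1}(q_{t+1}^*)$, $t = T-1, T-2, \ldots, 1$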
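The four steps (initialization, recursion, termination, backtracking) can be sketched as follows. This is a minimal version that works with plain probabilities; a practical implementation would usually use log-probabilities to avoid underflow. The toy model is hypothetical and indices are 0-based.

```python
def viterbi(A, B, pi, obs):
    """Return (best state path, probability of that path with obs)."""
    N = len(pi)
    # Initialization: delta_1(i) = pi_i * b_i(O_1)
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]
    psi = []  # psi[t-1][j] = best predecessor of state j at time t

    # Recursion: keep the best predecessor for every state at every step.
    for o in obs[1:]:
        back, new_delta = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[i] * A[i][j])
            back.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        psi.append(back)
        delta = new_delta

    # Termination: best final state.
    last = max(range(N), key=lambda i: delta[i])

    # Backtracking: follow the stored predecessors from the end.
    path = [last]
    for back in reversed(psi):
        path.append(back[path[-1]])
    path.reverse()
    return path, delta[last]


# Hypothetical 2-state, 2-symbol model.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]

path, prob = viterbi(A, B, pi, [0, 1])
print(path, prob)
```

The structure mirrors the forward procedure, with the sum over predecessors replaced by a max and an extra table to remember which predecessor won.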

Problem III. How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$? This is the most difficult of the three problems. Approaches:

1. Baum-Welch Method
2. EM Algorithm

These two methods are, in fact, the same but with different formulations.

Baum-Welch Method. How to adjust the model parameters of $\lambda = (A, B, \pi)$ to maximize $P(O \mid \lambda)$? Which parameters must be estimated, and how?

Baum-Welch Method. Review: the forward variable $\alpha_t(i) = P(O_1 \ldots O_t, q_t = S_i \mid \lambda)$ and the backward variable $\beta_t(i) = P(O_{t+1} \ldots O_T \mid q_t = S_i, \lambda)$ from Problem I.

Baum-Welch Method. Define the probability of being in state $S_i$ at time $t$ and in state $S_j$ at time $t+1$, given the observations:

$$\xi_t(i, j) = P(q_t = S_i,\, q_{t+1} = S_j \mid O, \lambda) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}.$$

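Baum-Welch Method. Define $\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j)$, the probability that $q_t = S_i$ given the observations, as stated before. Facts:

- $\sum_{t=1}^{T-1} \gamma_t(i)$ is the expected number of transitions from $S_i$.
- $\sum_{t=1}^{T-1} \xi_t(i, j)$ is the expected number of transitions from $S_i$ to $S_j$.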

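Baum-Welch Method. Reestimation formulas:

$$\bar{\pi}_i = \gamma_1(i), \qquad \bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \qquad \bar{b}_j(k) = \frac{\sum_{t:\, O_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}.$$

Baum-Welch Method.

1. Provide an initial model $\lambda = (A, B, \pi)$.
2. Reestimate the model as $\bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi})$.
3. Let $\lambda = \bar{\lambda}$.
4. If not converged, go to step 2.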
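The loop above can be sketched for a single discrete observation sequence. This is a minimal illustration under stated assumptions, not a production implementation: it uses the standard expected-count reestimates (initial distribution from the state posterior at the first time step, transitions from expected transition counts, emissions from expected symbol counts), applies no scaling, runs a fixed number of iterations instead of a convergence test, and uses hypothetical starting values.

```python
def forward(A, B, pi, obs):
    N = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(N)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(N)) * B[j][o]
                      for j in range(N)])
    return alpha


def backward(A, B, obs):
    N, T = len(A), len(obs)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(N))
    return beta


def baum_welch_step(A, B, pi, obs):
    """One reestimation; also returns P(O | current model)."""
    N, M, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
    p_obs = sum(alpha[-1])
    # gamma[t][i]: posterior of state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] / p_obs for i in range(N)]
             for t in range(T)]
    # xi[t][i][j]: posterior of the transition i -> j between t and t+1.
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / p_obs
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(N)] for i in range(N)]
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_A, new_B, new_pi, p_obs


# Hypothetical initial model and observation sequence.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.6, 0.4]
obs = [0, 1, 0, 0, 1, 1, 0, 1]

history = []  # likelihood of the model before each reestimation
for _ in range(5):
    A, B, pi, p = baum_welch_step(A, B, pi, obs)
    history.append(p)
print(history)
```

The recorded likelihoods should never decrease from one iteration to the next, which is the property that makes the "if not converged, go to step 2" loop well behaved. The method only finds a local maximum, so the result depends on the initial model.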

Hidden Markov Model: Estimate HMMs by EM-Algorithm

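Complete-Data Likelihood for HMM. Observation sequence: $o = o_1 o_2 o_3 \cdots o_T$. Hidden state sequence: $q = q_1 q_2 q_3 \cdots q_T$. Complete data: $(o, q)$, with likelihood

$$P(o, q \mid \lambda) = \pi_{q_1}\, b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t).$$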

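Q-function (E-Step). With $\lambda'$ denoting the current model,

$$Q(\lambda, \lambda') = \sum_{q} P(q \mid o, \lambda') \log P(o, q \mid \lambda) = \frac{1}{P(o \mid \lambda')} \sum_{q} P(o, q \mid \lambda') \log P(o, q \mid \lambda).$$

The factor $1 / P(o \mid \lambda')$ is independent of $q$ (and of $\lambda$), so it is enough to maximize $\sum_{q} P(o, q \mid \lambda') \log P(o, q \mid \lambda)$.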

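Maximization Step. Maximize

$$\sum_{q} P(o, q \mid \lambda') \log \pi_{q_1} + \sum_{q} P(o, q \mid \lambda') \sum_{t=2}^{T} \log a_{q_{t-1} q_t} + \sum_{q} P(o, q \mid \lambda') \sum_{t=1}^{T} \log b_{q_t}(o_t)$$

subject to $\sum_{i=1}^{N} \pi_i = 1$, $\sum_{j=1}^{N} a_{ij} = 1$ for each $i$, and $\sum_{k=1}^{M} b_j(k) = 1$ for each $j$.

Maximization Step. The three terms involve disjoint sets of parameters, so each is maximized separately under its own constraint: solve the first for $\pi_i$, the second for $a_{ij}$, and the third for $b_j(k)$.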

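Maximization Step. Solve for $\pi_i$: maximize $\sum_{i=1}^{N} P(o, q_1 = S_i \mid \lambda') \log \pi_i$ subject to $\sum_{i=1}^{N} \pi_i = 1$.

Maximization Step. The solution is

$$\pi_i = \frac{P(o, q_1 = S_i \mid \lambda')}{P(o \mid \lambda')} = \gamma_1(i),$$

the same result as the Baum-Welch Method.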
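The constrained maximization for the initial distribution can be worked out with a Lagrange multiplier. The derivation below is a sketch under the assumption that the relevant term of the Q-function is $\sum_i P(o, q_1 = S_i \mid \lambda') \log \pi_i$, where $\lambda'$ is the current model; the multiplier $\mu$ is a symbol introduced here for illustration.

$$
\begin{aligned}
\mathcal{L}(\pi, \mu) &= \sum_{i=1}^{N} P(o, q_1 = S_i \mid \lambda') \log \pi_i + \mu \Big( \sum_{i=1}^{N} \pi_i - 1 \Big), \\
\frac{\partial \mathcal{L}}{\partial \pi_i} &= \frac{P(o, q_1 = S_i \mid \lambda')}{\pi_i} + \mu = 0
\;\;\Longrightarrow\;\; \pi_i = -\frac{P(o, q_1 = S_i \mid \lambda')}{\mu}, \\
\sum_{i=1}^{N} \pi_i = 1 &\;\;\Longrightarrow\;\; \mu = -\sum_{i=1}^{N} P(o, q_1 = S_i \mid \lambda') = -P(o \mid \lambda'), \\
\pi_i &= \frac{P(o, q_1 = S_i \mid \lambda')}{P(o \mid \lambda')} = P(q_1 = S_i \mid o, \lambda').
\end{aligned}
$$

The last expression is the posterior probability of starting in $S_i$, which is the quantity the Baum-Welch reestimate uses for the initial distribution. The same pattern (one multiplier per normalization constraint) applies to the rows of $A$ and $B$.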

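Maximization Step. Solve for $a_{ij}$: maximize $\sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{t=2}^{T} P(o, q_{t-1} = S_i, q_t = S_j \mid \lambda') \log a_{ij}$ subject to $\sum_{j=1}^{N} a_{ij} = 1$ for each $i$.

Maximization Step. The solution is

$$a_{ij} = \frac{\sum_{t=2}^{T} P(o, q_{t-1} = S_i, q_t = S_j \mid \lambda')}{\sum_{t=2}^{T} P(o, q_{t-1} = S_i \mid \lambda')} = \frac{\sum_{t=1}^{T-1} \xi_t(i, j)}{\sum_{t=1}^{T-1} \gamma_t(i)}.$$

Again, it has the same result as the Baum-Welch Method.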

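Maximization Step. Solve for $b_j(k)$: maximize $\sum_{j=1}^{N} \sum_{t=1}^{T} P(o, q_t = S_j \mid \lambda') \log b_j(o_t)$ subject to $\sum_{k=1}^{M} b_j(k) = 1$ for each $j$.

Maximization Step. The solution is

$$b_j(k) = \frac{\sum_{t:\, o_t = v_k} P(o, q_t = S_j \mid \lambda')}{\sum_{t=1}^{T} P(o, q_t = S_j \mid \lambda')} = \frac{\sum_{t:\, o_t = v_k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}.$$

Maximization Step. We conclude that learning an HMM with the EM algorithm is equivalent to the Baum-Welch Method.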

Summary

Hidden Markov Model: HMM with GMM

Continuous Observation Densities in HMMs. In many cases the observations are continuous, e.g., in speech signal processing. Approaches:

- Quantize the signal to a discrete one, e.g., by vector quantization.
- Use a tractable and reasonable statistical model, e.g., a GMM.

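GMM (Gaussian Mixture Model). The observation density of state $j$ is a mixture of Gaussian densities:

$$b_j(o) = \sum_{k=1}^{M} c_{jk}\, \mathcal{N}(o;\, \mu_{jk}, \Sigma_{jk}), \qquad \sum_{k=1}^{M} c_{jk} = 1,$$

where $\mathcal{N}$ is the Gaussian density and $c_{jk}$, $\mu_{jk}$, $\Sigma_{jk}$ are the frequency (mixture weight), mean, and covariance matrix for the $k$th mixture in the $j$th state.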
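As a concrete illustration, the sketch below evaluates a mixture density for one state in the simplest setting: scalar observations, so each covariance matrix reduces to a variance. The function names and the parameter values are hypothetical, chosen only to show the weighted-sum structure.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gmm_density(x, weights, means, variances):
    """Mixture density: sum over components of weight * Gaussian density."""
    return sum(c * gaussian_pdf(x, m, v)
               for c, m, v in zip(weights, means, variances))


# Hypothetical parameters for one state: two mixture components.
weights = [0.3, 0.7]      # mixture weights, must sum to 1
means = [-1.0, 2.0]
variances = [0.5, 1.5]

print(gmm_density(0.0, weights, means, variances))
```

In an HMM with GMM emissions, a function like `gmm_density` would play the role that the discrete emission table plays in the forward, backward, and Viterbi recursions, with one set of mixture parameters per state.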

GMM (Gaussian Mixture Model) How to estimate the parameters?

Baum-Welch Method. Initial model $\lambda$ → reestimated model $\bar{\lambda}$.

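Definitions.

- $\gamma_t(i)$: the probability of reaching $S_i$ at time $t$, given the observations.
- $\gamma_t(i, k)$: the probability of reaching $S_i$ at time $t$ with the $k$th mixture selected, given the observations.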

Definitions

Method
