Lecture Slides for Introduction to Machine Learning 2e
ETHEM ALPAYDIN © The MIT Press, 2010
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml2e

Introduction
Modeling dependencies in the input; the data are no longer iid.
Sequences:
Temporal: in speech, phonemes in a word (dictionary) and words in a sentence (syntax, semantics of the language); in handwriting, pen movements.
Spatial: in a DNA sequence, base pairs.

Discrete Markov Process
N states: $S_1, S_2, \ldots, S_N$
State at "time" $t$: $q_t = S_i$
First-order Markov: $P(q_{t+1}=S_j \mid q_t=S_i, q_{t-1}=S_k, \ldots) = P(q_{t+1}=S_j \mid q_t=S_i)$
Transition probabilities: $a_{ij} \equiv P(q_{t+1}=S_j \mid q_t=S_i)$, with $a_{ij} \ge 0$ and $\sum_{j=1}^{N} a_{ij} = 1$
Initial probabilities: $\pi_i \equiv P(q_1 = S_i)$, with $\sum_{i=1}^{N} \pi_i = 1$
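
As a concrete sketch of these definitions, the snippet below samples a state sequence from a first-order Markov model. The three states and the values of Π and A are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

PI = np.array([0.5, 0.2, 0.3])            # initial probabilities, sum to 1
A = np.array([[0.4, 0.3, 0.3],            # A[i, j] = P(q_{t+1}=S_j | q_t=S_i)
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])           # each row sums to 1

def sample_chain(T):
    """Draw a state sequence q_1..q_T from the first-order Markov model."""
    q = [rng.choice(3, p=PI)]
    for _ in range(T - 1):
        q.append(rng.choice(3, p=A[q[-1]]))   # next state depends only on q_t
    return q

print(sample_chain(10))
```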

Stochastic Automaton
[Figure: state-transition diagram of the stochastic automaton]
The probability of an observed state sequence $Q = q_1 q_2 \cdots q_T$ is
$P(Q \mid A, \Pi) = P(q_1) \prod_{t=2}^{T} P(q_t \mid q_{t-1}) = \pi_{q_1} a_{q_1 q_2} \cdots a_{q_{T-1} q_T}$

Example: Balls and Urns
Three urns, each full of balls of one color:
$S_1$: red, $S_2$: blue, $S_3$: green
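
Because each urn holds balls of a single color, the state is directly observed and the sequence probability formula above applies. A minimal sketch, with illustrative parameter values (assumptions, not the slides' numbers):

```python
import numpy as np

PI = np.array([0.5, 0.2, 0.3])        # S1=red, S2=blue, S3=green
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

O = [0, 0, 2, 2]                      # observed colors: red, red, green, green
# P(O | A, Pi) = pi_{q1} * a_{q1 q2} * a_{q2 q3} * a_{q3 q4}
p = PI[O[0]] * np.prod([A[i, j] for i, j in zip(O, O[1:])])
print(p)                              # 0.5 * 0.4 * 0.3 * 0.8 = 0.048
```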

Balls and Urns: Learning
Given K example sequences of length T, the maximum-likelihood estimates count and normalize:
$\hat{\pi}_i = \dfrac{\#\{\text{sequences starting with } S_i\}}{\#\{\text{sequences}\}}$
$\hat{a}_{ij} = \dfrac{\#\{\text{transitions from } S_i \text{ to } S_j\}}{\#\{\text{transitions from } S_i\}}$
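
A sketch of these counting estimators; the sequences are made-up example data:

```python
import numpy as np

sequences = [[0, 0, 2, 2], [1, 1, 1, 2], [0, 2, 2, 2]]   # K observed sequences
N = 3

pi_hat = np.zeros(N)
counts = np.zeros((N, N))
for seq in sequences:
    pi_hat[seq[0]] += 1                   # count first states
    for i, j in zip(seq, seq[1:]):
        counts[i, j] += 1                 # count i -> j transitions

pi_hat /= len(sequences)                  # fraction of sequences starting in S_i
a_hat = counts / counts.sum(axis=1, keepdims=True)   # normalize each row
print(pi_hat)
print(a_hat)
```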

Hidden Markov Models
States are not observable.
Discrete observations $\{v_1, v_2, \ldots, v_M\}$ are recorded; each is a probabilistic function of the state.
Emission probabilities: $b_j(m) \equiv P(O_t = v_m \mid q_t = S_j)$
Example: in each urn there are balls of different colors, with different probabilities per urn.
For each observation sequence, there are multiple possible state sequences.
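
A generative sketch of the hidden version: only the drawn ball colors are recorded, while the urn sequence stays hidden. B and the other parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

PI = np.array([0.5, 0.2, 0.3])
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])
B = np.array([[0.8, 0.1, 0.1],     # urn 1: mostly red balls
              [0.1, 0.8, 0.1],     # urn 2: mostly blue
              [0.1, 0.1, 0.8]])    # urn 3: mostly green

def sample_hmm(T):
    q = rng.choice(3, p=PI)
    obs = []
    for _ in range(T):
        obs.append(rng.choice(3, p=B[q]))   # draw a ball from the current urn
        q = rng.choice(3, p=A[q])           # move to the next urn (hidden)
    return obs

print(sample_hmm(10))                       # colors only; states are not returned
```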

HMM Unfolded in Time
[Figure: the HMM drawn as a lattice of hidden states $q_t$ emitting observations $O_t$ over time]

Elements of an HMM
N: number of states
M: number of observation symbols
$A = [a_{ij}]$: N x N state transition probability matrix
$B = [b_j(m)]$: N x M observation probability matrix
$\Pi = [\pi_i]$: N x 1 initial state probability vector
$\lambda = (A, B, \Pi)$: the parameter set of the HMM

Three Basic Problems of HMMs (Rabiner, 1989)
1. Evaluation: given $\lambda$ and $O$, calculate $P(O \mid \lambda)$
2. State sequence: given $\lambda$ and $O$, find $Q^*$ such that $P(Q^* \mid O, \lambda) = \max_Q P(Q \mid O, \lambda)$
3. Learning: given $\mathcal{X} = \{O^k\}_k$, find $\lambda^*$ such that $P(\mathcal{X} \mid \lambda^*) = \max_\lambda P(\mathcal{X} \mid \lambda)$

Evaluation
Forward variable: $\alpha_t(i) \equiv P(O_1 \cdots O_t, q_t = S_i \mid \lambda)$
Initialization: $\alpha_1(i) = \pi_i b_i(O_1)$
Recursion: $\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i) a_{ij} \right] b_j(O_{t+1})$
$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$
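
A sketch of the forward pass under the definitions above, with PI, A, B shaped as in the elements slide and O a list of symbol indices:

```python
import numpy as np

def forward(PI, A, B, O):
    """alpha[t, i] = P(O_1..O_t, q_t = S_i | lambda); returns alpha and P(O | lambda)."""
    N, T = len(PI), len(O)
    alpha = np.zeros((T, N))
    alpha[0] = PI * B[:, O[0]]                       # initialization
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]   # recursion
    return alpha, alpha[-1].sum()                    # evaluation: P(O | lambda)
```

The recursion costs O(N²T), versus exponential cost for enumerating all N^T state sequences; a practical implementation also scales the alphas or works in log space to avoid underflow on long sequences.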

Backward variable: $\beta_t(i) \equiv P(O_{t+1} \cdots O_T \mid q_t = S_i, \lambda)$
Initialization: $\beta_T(i) = 1$
Recursion: $\beta_t(i) = \sum_{j=1}^{N} a_{ij} b_j(O_{t+1}) \beta_{t+1}(j)$
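
The matching backward-pass sketch:

```python
import numpy as np

def backward(A, B, O):
    """beta[t, i] = P(O_{t+1}..O_T | q_t = S_i, lambda)."""
    N, T = A.shape[0], len(O)
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                    # initialization
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])  # recursion
    return beta
```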

Finding the State Sequence
$\gamma_t(i) \equiv P(q_t = S_i \mid O, \lambda) = \dfrac{\alpha_t(i)\,\beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\,\beta_t(j)}$
Choose the state that has the highest probability, for each time step: $q_t^* = \arg\max_i \gamma_t(i)$
No! Chosen independently per step, the states may not form a feasible sequence, e.g., adjacent choices connected by $a_{ij} = 0$.
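
A sketch of the posterior state marginals, reusing forward() and backward() from the sketches above:

```python
import numpy as np

def state_posteriors(PI, A, B, O):
    """gamma[t, i] = P(q_t = S_i | O, lambda)."""
    alpha, _ = forward(PI, A, B, O)
    beta = backward(A, B, O)
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)   # normalize over states
    return gamma

# Per-step argmax (the flawed choice criticized above):
# q_star = state_posteriors(PI, A, B, O).argmax(axis=1)
```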

Viterbi's Algorithm
$\delta_t(i) \equiv \max_{q_1 q_2 \cdots q_{t-1}} p(q_1 q_2 \cdots q_{t-1}, q_t = S_i, O_1 \cdots O_t \mid \lambda)$
Initialization: $\delta_1(i) = \pi_i b_i(O_1)$, $\psi_1(i) = 0$
Recursion: $\delta_t(j) = \left[ \max_i \delta_{t-1}(i) a_{ij} \right] b_j(O_t)$, $\psi_t(j) = \arg\max_i \delta_{t-1}(i) a_{ij}$
Termination: $p^* = \max_i \delta_T(i)$, $q_T^* = \arg\max_i \delta_T(i)$
Path backtracking: $q_t^* = \psi_{t+1}(q_{t+1}^*)$, for $t = T-1, T-2, \ldots, 1$
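
A sketch of Viterbi, which, unlike the per-step argmax, maximizes over whole paths:

```python
import numpy as np

def viterbi(PI, A, B, O):
    """Return the most likely state path and its probability p*."""
    N, T = len(PI), len(O)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = PI * B[:, O[0]]                     # initialization
    for t in range(1, T):
        cand = delta[t - 1][:, None] * A           # cand[i, j] = delta_{t-1}(i) a_ij
        psi[t] = cand.argmax(axis=0)               # best predecessor of each j
        delta[t] = cand.max(axis=0) * B[:, O[t]]   # recursion
    q = [int(delta[-1].argmax())]                  # termination: q_T*
    for t in range(T - 1, 0, -1):
        q.append(int(psi[t][q[-1]]))               # path backtracking
    return q[::-1], delta[-1].max()
```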

Learning
$\xi_t(i, j) \equiv P(q_t = S_i, q_{t+1} = S_j \mid O, \lambda) = \dfrac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{\sum_k \sum_l \alpha_t(k)\, a_{kl}\, b_l(O_{t+1})\, \beta_{t+1}(l)}$
$\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j)$

Baum-Welch (EM)
The E-step computes $\gamma_t^k(i)$ and $\xi_t^k(i, j)$ for each sequence $O^k$; the M-step re-estimates the parameters from these expected counts:
$\hat{\pi}_i = \dfrac{\sum_{k=1}^{K} \gamma_1^k(i)}{K}$
$\hat{a}_{ij} = \dfrac{\sum_k \sum_{t=1}^{T-1} \xi_t^k(i, j)}{\sum_k \sum_{t=1}^{T-1} \gamma_t^k(i)}$
$\hat{b}_j(m) = \dfrac{\sum_k \sum_{t=1}^{T} \gamma_t^k(j)\, \mathbf{1}(O_t^k = v_m)}{\sum_k \sum_{t=1}^{T} \gamma_t^k(j)}$
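
A compact single-sequence (K = 1) sketch of one EM iteration, reusing forward() and backward() from above; a full implementation would pool the expected counts over all K sequences and use scaling or log space:

```python
import numpy as np

def baum_welch_step(PI, A, B, O):
    """One EM update of (PI, A, B) from a single observation sequence O."""
    O = np.asarray(O)
    M = B.shape[1]
    alpha, evidence = forward(PI, A, B, O)
    beta = backward(A, B, O)
    gamma = alpha * beta / evidence                 # gamma[t, i] = P(q_t=S_i | O)

    # E-step: xi[t, i, j] = alpha_t(i) a_ij b_j(O_{t+1}) beta_{t+1}(j) / P(O)
    xi = (alpha[:-1, :, None] * A[None] *
          (B[:, O[1:]].T * beta[1:])[:, None, :]) / evidence

    # M-step: re-estimate parameters from expected counts
    PI_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for m in range(M):
        B_new[:, m] = gamma[O == m].sum(axis=0)     # expected emissions of v_m
    B_new /= gamma.sum(axis=0)[:, None]
    return PI_new, A_new, B_new
```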

Continuous Observations
Discrete observations: use $b_j(m)$ as above; a continuous input can be discretized, e.g., with k-means, or modeled with a Gaussian mixture per state.
Fully continuous: model the density directly, e.g., $p(O_t \mid q_t = S_j, \lambda) \sim \mathcal{N}(\mu_j, \sigma_j^2)$, and use EM to learn its parameters.
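
A sketch of the Gaussian case: the discrete lookup $B[:, O_t]$ is replaced by a density evaluated at $o_t$, one Gaussian per state. The values of mu and sigma are illustrative assumptions; in a full system they are re-estimated inside EM.

```python
import numpy as np

mu = np.array([-1.0, 0.0, 2.0])      # per-state means (assumed)
sigma = np.array([0.5, 1.0, 0.8])    # per-state standard deviations (assumed)

def emission_probs(o_t):
    """Return p(o_t | q_t = S_j) for every state j; plugs in where B[:, O_t] was."""
    return np.exp(-0.5 * ((o_t - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

print(emission_probs(0.3))
```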

HMM with Input
Input-dependent observations: condition the emission density on an input $x_t$, i.e., $p(O_t \mid q_t = S_j, x_t)$.
Input-dependent transitions: $P(q_{t+1} = S_j \mid q_t = S_i, x_t)$ (Meila and Jordan, 1996; Bengio and Frasconi, 1996).
Time-delay input: take $x_t$ to be a function of previous observations, e.g., $x_t = f(O_{t-\tau}, \ldots, O_{t-1})$.
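
A hypothetical sketch of input-dependent transitions, in the spirit of the cited models but not their exact parameterization: each row of the transition matrix becomes a softmax of a linear function of $x_t$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 3, 2                        # states, input dimension (assumed)
W = rng.normal(size=(N, N, D))     # W[i] parameterizes row i of A(x)

def transition_matrix(x):
    """A(x)[i, j] = P(q_{t+1}=S_j | q_t=S_i, x_t=x); each row sums to 1."""
    scores = W @ x                                       # (N, N) score per (i, j)
    e = np.exp(scores - scores.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

print(transition_matrix(np.array([1.0, -0.5])))
```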

Model Selection in HMM
Left-to-right HMMs: constrain $A$ so that states are visited in order ($a_{ij} = 0$ for $j < i$), as in speech, where the signal proceeds left to right.
In classification, for each class $C_i$, estimate $P(O \mid \lambda_i)$ with a separate HMM and use Bayes' rule:
$P(\lambda_i \mid O) = \dfrac{P(O \mid \lambda_i)\, P(\lambda_i)}{\sum_j P(O \mid \lambda_j)\, P(\lambda_j)}$
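
A sketch of this classification rule, reusing forward() from the evaluation sketch; `models` holds one hypothetical (PI, A, B) triple per class and `priors` holds $P(\lambda_i)$, both assumed given.

```python
import numpy as np

def classify(O, models, priors):
    """Return P(lambda_i | O) for each class, by Bayes' rule."""
    likelihoods = np.array([forward(PI, A, B, O)[1] for PI, A, B in models])
    posteriors = likelihoods * np.asarray(priors)   # unnormalized posteriors
    return posteriors / posteriors.sum()
```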
