CHAPTER 15: Hidden Markov Models


CHAPTER 15: Hidden Markov Models. Alpaydın slides with several modifications and additions by Christoph Eick. (Lecture Notes for E. Alpaydın, Introduction to Machine Learning 2e, © The MIT Press.)

Introduction
Modeling dependencies in the input; frequently the order of observations in a dataset matters:
Temporal sequences: in speech, phonemes in a word (dictionary) and words in a sentence (syntax, semantics of the language); the stock market (stock values over time).
Spatial sequences: base pairs in DNA.

Discrete Markov Process
N states: S1, S2, ..., SN
State at "time" t: qt = Si
First-order Markov property: P(qt+1=Sj | qt=Si, qt-1=Sk, ...) = P(qt+1=Sj | qt=Si)
Transition probabilities: aij ≡ P(qt+1=Sj | qt=Si), with aij ≥ 0 and Σj=1..N aij = 1
Initial probabilities: πi ≡ P(q1=Si), with Σi=1..N πi = 1
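To make this concrete, here is a minimal sketch of sampling from such a chain. The transition and initial probabilities are illustrative assumptions, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

pi = np.array([1/3, 1/3, 1/3])          # initial probabilities P(q1 = Si)
A = np.array([[0.4, 0.3, 0.3],          # A[i, j] = P(q_{t+1} = Sj | q_t = Si)
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

assert np.allclose(A.sum(axis=1), 1.0)  # each row must sum to 1
assert np.isclose(pi.sum(), 1.0)

def sample_chain(pi, A, T):
    """Draw a state sequence q1..qT from the Markov chain (pi, A)."""
    states = [rng.choice(len(pi), p=pi)]
    for _ in range(T - 1):
        states.append(rng.choice(len(pi), p=A[states[-1]]))
    return states

print(sample_chain(pi, A, 10))          # e.g. [0, 0, 2, ...] (0-indexed states)
```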

Stochastic Automaton / Markov Chain
The probability of an observed state sequence O = q1 q2 ... qT is P(O | A, Π) = πq1 · aq1q2 · aq2q3 ··· aqT-1qT.

Example: Balls and Urns
Three urns (states S1, S2, S3), each full of balls of one color: S1 blue, S2 red, S3 green.

Balls and Urns: Learning
Given K example sequences of length T, extract the probabilities from the observed sequences by counting: πi = (#sequences starting in Si)/K and aij = (#transitions from Si to Sj)/(#transitions out of Si). For example, from the sequences
s1-s2-s1-s3
s2-s1-s1-s2
s2-s3-s2-s1
we get π1=1/3, π2=2/3, π3=0, a11=1/4, a12=1/2, a13=1/4, a21=3/4, ...
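The counting above is easy to implement; a sketch over the slide's three example sequences (states 0-indexed):

```python
import numpy as np

sequences = [[0, 1, 0, 2],   # s1-s2-s1-s3
             [1, 0, 0, 1],   # s2-s1-s1-s2
             [1, 2, 1, 0]]   # s2-s3-s2-s1
N = 3

pi = np.zeros(N)
counts = np.zeros((N, N))
for seq in sequences:
    pi[seq[0]] += 1                      # count starting states
    for i, j in zip(seq, seq[1:]):
        counts[i, j] += 1                # count transitions i -> j

pi /= len(sequences)                              # pi_i = #starts in Si / K
A = counts / counts.sum(axis=1, keepdims=True)    # a_ij = #(i->j) / #(i->*)

print(pi)   # [1/3, 2/3, 0]
print(A)    # row for s2: [3/4, 0, 1/4]
```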

Hidden Markov Models
States are not observable. Discrete observations {v1, v2, ..., vM} are recorded; each is a probabilistic function of the (hidden) state.
Emission probabilities: bj(m) ≡ P(Ot=vm | qt=Sj)
Example: in each urn, there are balls of different colors, but with different probabilities.
For each observation sequence, there are multiple possible state sequences.
See also: http://en.wikipedia.org/wiki/Hidden_Markov_model

HMM Unfolded in Time
See: http://a-little-book-of-r-for-bioinformatics.readthedocs.org/en/latest/src/chapter10.html

Now a more complicated problem
We observe a sequence of ball colors; what urn sequence created it?
Markov chain: 1-1-2-2 (somewhat trivial, as states are observable!)
Hidden Markov model: (1 or 2)-(1 or 2)-(2 or 3)-(2 or 3), and the potential state sequences have different probabilities; e.g., drawing a blue ball from urn 1 is more likely than from urn 2! (A brute-force scoring sketch follows below.)
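A sketch that scores every candidate urn sequence for an observed color sequence. All probability values here are illustrative assumptions, since the slide's figure values are not reproduced in the transcript:

```python
import itertools
import numpy as np

A  = np.array([[0.5, 0.3, 0.2],   # assumed transition probabilities a_ij
               [0.3, 0.4, 0.3],
               [0.2, 0.3, 0.5]])
B  = np.array([[0.7, 0.2, 0.1],   # assumed emission probabilities b_j(m);
               [0.4, 0.5, 0.1],   # urn 1 favors blue more than urn 2 does
               [0.1, 0.2, 0.7]])  # columns: blue, red, green
pi = np.array([1/3, 1/3, 1/3])

O = [0, 0, 1, 1]                  # observed colors: blue, blue, red, red

def joint(Q):
    """P(Q, O | lambda) for one candidate state sequence Q."""
    p = pi[Q[0]] * B[Q[0], O[0]]
    for t in range(1, len(O)):
        p = p * A[Q[t - 1], Q[t]] * B[Q[t], O[t]]
    return p

best = max(itertools.product(range(3), repeat=len(O)), key=joint)
print(best, joint(best))          # the single most likely urn sequence
```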

Another Motivating Example

Elements of an HMM
N: number of states
M: number of observation symbols
A = [aij]: N×N state transition probability matrix
B = [bj(m)]: N×M observation (emission) probability matrix
Π = [πi]: N×1 initial state probability vector
λ = (A, B, Π): the parameter set of the HMM
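A minimal sketch of λ = (A, B, Π) as arrays; the specific numbers are assumptions for illustration:

```python
import numpy as np

N, M = 3, 3                          # states, observation symbols
A  = np.array([[0.4, 0.3, 0.3],      # N x N transitions a_ij
               [0.2, 0.6, 0.2],
               [0.1, 0.1, 0.8]])
B  = np.array([[0.7, 0.2, 0.1],      # N x M emissions b_j(m)
               [0.1, 0.8, 0.1],      # e.g. state 2 mostly emits symbol 2
               [0.2, 0.2, 0.6]])
pi = np.array([1/3, 1/3, 1/3])       # N x 1 initial state probabilities

lam = (A, B, pi)                     # the parameter set lambda
```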

Three Basic Problems of HMMs
1. Evaluation: Given λ and a sequence O, calculate P(O | λ).
2. Most likely state sequence: Given λ and a sequence O, find the state sequence Q* such that P(Q* | O, λ) = maxQ P(Q | O, λ).
3. Learning: Given a set of sequences O = {O1, ..., OK}, find λ* that best explains the sequences in O: P(O | λ*) = maxλ Πk P(Ok | λ).
(Rabiner, 1989)

Evaluation (Forward Algorithm)
Forward variable: αt(i) ≡ P(O1···Ot, qt=Si | λ), the probability of observing O1-…-Ot and additionally being in state Si at step t.
Initialization: α1(i) = πi bi(O1)
Recursion: αt+1(j) = [Σi αt(i) aij] bj(Ot+1)
Using αT, the probability of the observed sequence can be computed: P(O | λ) = Σi αT(i)
Complexity: O(N2·T)
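A sketch of the forward pass, with observations given as 0-indexed symbol indices and A, B, pi shaped as in the elements sketch above:

```python
import numpy as np

def forward(O, A, B, pi):
    """alpha[t, i] = P(O_1..O_t, q_t = S_i | lambda)."""
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                      # initialization
    for t in range(1, T):                           # O(N^2) work per step
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]  # recursion
    return alpha

# P(O | lambda) is the sum of the last row:
# p_obs = forward(O, A, B, pi)[-1].sum()
```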

Backward variable: βt(i) ≡ P(Ot+1···OT | qt=Si, λ), the probability of observing Ot+1-…-OT given being in state Si at step t.
Initialization: βT(i) = 1
Recursion: βt(i) = Σj aij bj(Ot+1) βt+1(j)
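A matching sketch of the backward pass:

```python
import numpy as np

def backward(O, A, B, pi):
    """beta[t, i] = P(O_{t+1}..O_T | q_t = S_i, lambda)."""
    T, N = len(O), len(pi)
    beta = np.ones((T, N))                            # beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])  # recursion
    return beta

# Sanity check: sum_i pi_i * b_i(O_1) * beta[0, i] should equal P(O | lambda)
```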

Finding the Most Likely State Sequence
γt(i) ≡ P(qt=Si | O, λ): the probability of being in state Si at step t, given the entire observed sequence O1…OtOt+1…OT. It is computed from the forward and backward variables: γt(i) = αt(i)βt(i) / Σj αt(j)βt(j).
Choose the state that has the highest probability, for each time step: qt* = argmaxi γt(i). Note that choosing states individually this way need not yield the single most likely state sequence; Viterbi's algorithm (next) finds that.
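Combining the two passes gives this per-step (γ) decoding; the sketch reuses forward() and backward() from the sketches above:

```python
import numpy as np

def posterior_states(O, A, B, pi):
    alpha = forward(O, A, B, pi)                 # from the earlier sketch
    beta = backward(O, A, B, pi)                 # from the earlier sketch
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)    # gamma[t, i] = P(q_t=S_i | O)
    return gamma.argmax(axis=1)                  # q_t* = argmax_i gamma_t(i)
```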

Viterbi's Algorithm (only briefly discussed in 2014!)
Idea: combines path probability computations with backtracking over competing paths.
δt(i) ≡ maxq1q2∙∙∙qt-1 P(q1q2∙∙∙qt-1, qt=Si, O1∙∙∙Ot | λ)
Initialization: δ1(i) = πi bi(O1), ψ1(i) = 0
Recursion: δt(j) = [maxi δt-1(i) aij] bj(Ot), ψt(j) = argmaxi δt-1(i) aij
Termination: p* = maxi δT(i), qT* = argmaxi δT(i)
Path backtracking: qt* = ψt+1(qt+1*), t = T-1, T-2, ..., 1
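A sketch of the full recursion with backtracking:

```python
import numpy as np

def viterbi(O, A, B, pi):
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, O[0]]                         # initialization
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A             # scores[i, j] = delta_{t-1}(i) a_ij
        psi[t] = scores.argmax(axis=0)                 # best predecessor of each j
        delta[t] = scores.max(axis=0) * B[:, O[t]]     # recursion
    q = [int(delta[-1].argmax())]                      # termination: q_T*
    for t in range(T - 1, 0, -1):                      # path backtracking
        q.append(int(psi[t][q[-1]]))
    return q[::-1], delta[-1].max()                    # best path, p*
```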

Baum-Welch Algorithm (overview)
Input: observed symbol sequences O = {O1, ..., OK}. Output: model λ = (A, B, Π). The hidden state sequence is never observed.

Learning a Model λ from Sequences O
An EM-style algorithm is used, based on two hidden (latent) variables:
ξt(i,j) ≡ P(qt=Si, qt+1=Sj | O, λ): the probability of going from state i at step t to state j at step t+1, observing Ot+1, given a model λ and an observed sequence Ok ∈ O. It is computed as ξt(i,j) = αt(i) aij bj(Ot+1) βt+1(j) / P(O | λ).
γt(i) ≡ P(qt=Si | O, λ): the probability of being in state i at step t, given a model λ and an observed sequence Ok ∈ O; γt(i) = Σj ξt(i,j) for t < T.

Baum-Welch Algorithm: M-Step
πi = (Σk γ1k(i)) / K (expected proportion of sequences starting in Si)
aij = (Σk Σt<T ξtk(i,j)) / (Σk Σt<T γtk(i)) (expected #transitions from i to j / expected #times in i)
bj(m) = (Σk Σt: Otk=vm γtk(j)) / (Σk Σt γtk(j)) (expected #emissions of vm from j / expected #times in j)
Remark: k iterates over the observed sequences O1,…,OK; for each individual sequence Ok, ξk and γk are computed in the E-step; then the actual model λ is computed in the M-step by averaging the estimates of πi, aij, bj(m) (based on ξk and γk) over the K observed sequences.
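A sketch of one full EM iteration for a single sequence (K = 1; with K sequences, the expected counts below would be summed over all of them before normalizing, as in the remark above). It reuses forward() and backward() from the earlier sketches:

```python
import numpy as np

def baum_welch_step(O, A, B, pi):
    O = np.asarray(O)
    T, (N, M) = len(O), B.shape
    alpha = forward(O, A, B, pi)                 # from the earlier sketch
    beta = backward(O, A, B, pi)                 # from the earlier sketch
    p_obs = alpha[-1].sum()                      # P(O | lambda)

    # E-step: gamma[t, i] = P(q_t=S_i | O), xi[t, i, j] = P(q_t=S_i, q_{t+1}=S_j | O)
    gamma = alpha * beta / p_obs                                # (T, N)
    xi = (alpha[:-1, :, None] * A[None, :, :]
          * (B[:, O[1:]].T * beta[1:])[:, None, :]) / p_obs    # (T-1, N, N)

    # M-step: re-estimate the parameters from expected counts
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.stack([gamma[O == m].sum(axis=0) for m in range(M)], axis=1)
    B_new /= gamma.sum(axis=0)[:, None]
    return A_new, B_new, pi_new
```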

Baum-Welch Algorithm: Summary
Input: O = {O1, ..., OK}. Output: model λ = (A, B, Π).
For more discussion see: http://www.robots.ox.ac.uk/~vgg/rg/slides/hmm.pdf
See also: http://www.digplanet.com/wiki/Baum%E2%80%93Welch_algorithm

Generalization of HMM: Continuous Observations
The observations generated at each time step are vectors of k numbers; a multivariate Gaussian with k dimensions is associated with each state j, defining the probability of a k-dimensional vector v being generated when in state j: bj(v) = N(v | μj, Σj). The discrete emission matrix B is replaced by the Gaussian parameters, so λ = (A, {(μj, Σj)}j=1,…,N, Π); the hidden state sequence now emits an observed vector sequence.
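A sketch of such state-conditional Gaussian emissions; the means and covariances here are assumed values for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal

k = 2                                         # observation dimension
mus    = [np.zeros(k), np.ones(k) * 3.0]      # one (mu_j, Sigma_j) per state
sigmas = [np.eye(k), np.eye(k) * 0.5]

def b(j, v):
    """Emission density b_j(v) = N(v | mu_j, Sigma_j)."""
    return multivariate_normal(mean=mus[j], cov=sigmas[j]).pdf(v)

print(b(0, np.array([0.1, -0.2])))            # high density under state 0
print(b(1, np.array([0.1, -0.2])))            # low density under state 1
```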

Generalization: HMM with Inputs
Input-dependent observations: P(Ot | qt=Sj, xt)
Input-dependent transitions (Meila and Jordan, 1996; Bengio and Frasconi, 1996): P(qt+1=Sj | qt=Si, xt)
Time-delay input: the input is a function of previous observations, xt = f(Ot-τ, ..., Ot-1)