
1 Hidden Markov Models Wassnaa AL-mawee Western Michigan University Department of Computer Science CS6800 Adv. Theory of Computation Prof. Elise De Doncker 02/17/2016

2 Outline
- Introduction
  - Motivation
  - Markov Models
- Hidden Markov Models (HMMs)
  - HMMs Definition and Components
  - HMMs Problems
  - HMMs Basic Algorithms
- Applications
  - HMMs in Speech Recognition
  - HMMs for Gene Finding in DNA
- Summary
- References

3 Introduction
Motivation:
- Predictions are often based on models of observed data that is assumed to be independent and identically distributed (i.i.d.).
- This assumption is not always appropriate, for example for:
  - Measurements of weather patterns.
  - Daily stock prices.
  - The composition of DNA.
  - The composition of texts.
  - Time frames used for speech recognition.

4 Introduction (Continued)
Markov Models:
- If the n-th observation in a chain of observations is influenced only by the (n-1)-th observation, the chain is called a 1st-order Markov chain:
  P(X_n | X_1, ..., X_{n-1}) = P(X_n | X_{n-1})
- Example: weather prediction (what the weather will be tomorrow depends on observations of the past).

5 Introduction (Continued)
- Weather of the day (day n): X_n ∈ {sunny, rainy, cloudy}.
- If the weather yesterday was cloudy and today is rainy, what is the probability that tomorrow will be sunny? Because the chain is 1st-order, only today's weather matters:
  P(X_3 = sunny | X_2 = rainy) = 0.6
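A minimal sketch of how such a 1st-order weather chain could be queried in Python. Only the rainy-to-sunny probability (0.6) comes from the slide; the remaining transition entries are illustrative assumptions.

```python
# Illustrative 1st-order Markov chain for the weather example.
# Only the rainy -> sunny probability (0.6) is from the slide;
# the other entries are made-up values that sum to 1 per row.
transition = {
    "sunny":  {"sunny": 0.7, "rainy": 0.1, "cloudy": 0.2},
    "rainy":  {"sunny": 0.6, "rainy": 0.2, "cloudy": 0.2},
    "cloudy": {"sunny": 0.3, "rainy": 0.4, "cloudy": 0.3},
}

def next_day_distribution(today: str) -> dict:
    """P(X_{n+1} | X_n = today): by the Markov property, only today matters."""
    return transition[today]

# Yesterday was cloudy, today is rainy -> tomorrow depends only on 'rainy'.
print(next_day_distribution("rainy")["sunny"])  # 0.6
```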

6 Introduction (Continued)
- What if the n-th observation in a chain of observations is influenced by a corresponding HIDDEN variable?
Source: http://www.bbc.com/news/health-29708755

7 Outline
- Introduction
- Hidden Markov Models (HMMs)
  - HMMs Definition and Components
  - HMMs Problems
  - HMMs Basic Algorithms
- Applications
- Summary
- References

8 HMMs Definition and Components
HMMs are powerful statistical models for modeling sequential or time-series data, and have been successfully used in many tasks such as [1]:
- Robot control,
- Speech recognition,
- Protein/DNA sequence analysis,
- and information extraction from text data.

9 HMMs Definition and Components
An HMM is a 5-tuple (S, O, Π, A, B), where:
- S = {s_1, ..., s_N} is a finite set of N states,
- O = {o_1, ..., o_M} is a set of M possible outputs (observations),
- Π = {π_i} are the initial state probabilities,
- A = {a_ij} are the state transition probabilities,
- B = {b_i(o_k)} are the output or emission probabilities.
We use λ = (Π, A, B) to denote all the parameters of the HMM.

10 HMMs Definition and Components
HMMs Example: Assistive Technology
- Assume you have a little robot that tries to estimate the probability that you are happy (h) or sad (s), given that the robot has observed whether you are watching your favorite TV show (w), sleeping (s), crying (c), or using Facebook (f) [4].
- Let the hidden state be X, which can be h or s.
- Let the observation be Y, which can be w, s, c, or f.
- We want to find probabilities such as:
  - P(X=h | Y=w)?
  - P(X=s | Y=c)?
Source: http://7-themes.com/6912015-little-robot.html

11 HMMs Definition and Components
- Bayes' rule describes the probability of an event based on conditions that might be related to the event [5].
- For a single hidden value i:
  P(X=i | Y) = P(Y | X=i) P(X=i) / P(Y)
- Expanding P(Y) as a sum over all hidden values n:
  P(X=i | Y) = P(Y | X=i) P(X=i) / Σ_n P(Y | X=n) P(X=n)

12 HMMs Definition and Components (Continued)
Solve them with a prior and a likelihood model [4]:
- Prior P(X):
    X       s     h
    P(X)    0.2   0.8
- Likelihood P(Y|X):
    X \ Y   w     s     c     f
    s       0.1   0.3   0.5   0.1
    h       0.4   0.4   0.2   0
- P(X=h | Y=w) = P(Y=w|X=h) P(X=h) / [P(Y=w|X=h) P(X=h) + P(Y=w|X=s) P(X=s)] = (0.4 · 0.8) / (0.4 · 0.8 + 0.1 · 0.2) ≈ 0.94
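A small sketch in Python (not from the original slides) that reproduces the posterior computation above from the prior and likelihood tables:

```python
# Prior P(X) and likelihood P(Y|X) from the assistive-technology example.
prior = {"h": 0.8, "s": 0.2}
likelihood = {  # P(Y=y | X=x)
    "h": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
    "s": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1},
}

def posterior(x: str, y: str) -> float:
    """Bayes' rule: P(X=x | Y=y) = P(y|x) P(x) / sum_n P(y|n) P(n)."""
    evidence = sum(likelihood[n][y] * prior[n] for n in prior)
    return likelihood[x][y] * prior[x] / evidence

print(round(posterior("h", "w"), 2))  # 0.94, as on the slide
print(round(posterior("s", "c"), 2))  # answers Q4 on the Questions slide
```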

13 HMMs Definition and Components (Continued)
What if we have a transition prior rather than an absolute prior? Assume we observe sequences such as "ccc" or "wcw". The full HMM is then (written out in code after this list):
- S = {H, S},
- O = {w, s, c, f},
- Π = {H: 0.8, S: 0.2},
- A = {H-H: 0.9, H-S: 0.1, S-H: 0.1, S-S: 0.9},
- B = {H-w: 0.4, H-s: 0.4, H-c: 0.2, H-f: 0; S-w: 0.1, S-s: 0.3, S-c: 0.5, S-f: 0.1}
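A minimal sketch (not from the slides) of how this λ = (Π, A, B) could be written down as Python dictionaries; the same dictionaries are reused in the algorithm sketches later in the deck:

```python
# HMM parameters lambda = (Pi, A, B) for the happy/sad example (slide 13).
states = ["H", "S"]                  # hidden states: Happy, Sad
observations = ["w", "s", "c", "f"]  # watch TV, sleep, cry, Facebook

Pi = {"H": 0.8, "S": 0.2}            # initial state probabilities
A = {                                # transition probabilities a_ij
    "H": {"H": 0.9, "S": 0.1},
    "S": {"H": 0.1, "S": 0.9},
}
B = {                                # emission probabilities b_i(o_k)
    "H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
    "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1},
}

# Sanity check: each distribution sums to 1.
assert all(abs(sum(A[i].values()) - 1) < 1e-9 for i in states)
assert all(abs(sum(B[i].values()) - 1) < 1e-9 for i in states)
```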

14 HMMs Problems
Three basic problems are associated with an HMM:
1) Evaluation: evaluating the probability of an observed sequence of symbols O, summed over all possible state sequences, given a particular HMM λ, i.e., p(O|λ).
2) Decoding: finding the best state sequence q, given an observation sequence O and an HMM λ, i.e., q* = argmax_q p(q|O).
3) Training: finding the parameters of the HMM λ that maximize the probability of generating an observed sequence, i.e., λ* = argmax_λ p(O|λ).

15 HMMs Basic Algorithms
Problem                               Algorithm
Evaluation:  p(O|λ)                   Forward-Backward
Decoding:    q* = argmax_q p(q|O)     Viterbi Decoding
Training:    λ* = argmax_λ p(O|λ)     Baum-Welch (EM)

16 HMMs Basic Algorithms (Continued)
- The Forward Algorithm solves the Evaluation problem:
  - Sum over all possible state-sequence paths that generate the given observation sequence, using the following recursive procedure:
    α_1(i) = π_i b_i(o_1), 1 ≤ i ≤ N
    α_{t+1}(i) = [Σ_{j=1..N} α_t(j) a_ji] b_i(o_{t+1}), 1 ≤ t < T
  - Since we may end at any of the N states, p(O|λ) = Σ_{i=1..N} α_T(i).
- The Backward Algorithm, used together with the Forward Algorithm, solves the third (Training) problem; it computes the analogous quantities β_t(i) working backwards from the end of the observation sequence.
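A short sketch (not from the slides) of the forward recursion in Python, using the λ from the slide-13 example:

```python
# Forward algorithm: p(O | lambda) by summing over all state paths.
Pi = {"H": 0.8, "S": 0.2}
A = {"H": {"H": 0.9, "S": 0.1}, "S": {"H": 0.1, "S": 0.9}}
B = {"H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
     "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1}}

def forward(obs):
    """Return p(O | lambda) for an observation sequence such as 'ccc'."""
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = {i: Pi[i] * B[i][obs[0]] for i in Pi}
    # Recursion: alpha_{t+1}(i) = (sum_j alpha_t(j) * a_ji) * b_i(o_{t+1})
    for o in obs[1:]:
        alpha = {i: sum(alpha[j] * A[j][i] for j in alpha) * B[i][o] for i in Pi}
    # Termination: we may end in any of the N states.
    return sum(alpha.values())

print(forward("ccc"))  # evaluation of the sequence c, c, c
print(forward("wcw"))
```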

17 HMMs Basic Algorithms (Continued)
- The Viterbi Algorithm solves the Decoding problem:
  - The Viterbi algorithm is a dynamic programming algorithm.
  - It computes the most likely state transition path given an observed sequence of symbols.
  - It is similar to the forward algorithm, except that it takes a "max" rather than a summation.
- Viterbi Algorithm:
  - Initialization: VP_1(i) = π_i b_i(o_1) and q_1*(i) = (i)
  - Recursion, for 1 ≤ t < T: VP_{t+1}(i) = max_{1≤j≤N} VP_t(j) a_ji b_i(o_{t+1}) and q_{t+1}*(i) = q_t*(k).(i), where k = argmax_{1≤j≤N} VP_t(j) a_ji b_i(o_{t+1})
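A corresponding sketch (again not from the slides) of Viterbi decoding in Python; only the max/argmax bookkeeping differs from the forward sketch:

```python
# Viterbi decoding: most likely hidden state path for an observation sequence.
Pi = {"H": 0.8, "S": 0.2}
A = {"H": {"H": 0.9, "S": 0.1}, "S": {"H": 0.1, "S": 0.9}}
B = {"H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
     "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1}}

def viterbi(obs):
    """Return (best_path, its probability), taking max instead of sum."""
    # Initialisation: VP_1(i) = pi_i * b_i(o_1); the path so far is just (i).
    vp = {i: (Pi[i] * B[i][obs[0]], [i]) for i in Pi}
    for o in obs[1:]:
        new_vp = {}
        for i in Pi:
            # k = argmax_j VP_t(j) * a_ji; extend k's best path with state i.
            prob, k = max((vp[j][0] * A[j][i], j) for j in vp)
            new_vp[i] = (prob * B[i][o], vp[k][1] + [i])
        vp = new_vp
    prob, path = max(vp.values())
    return path, prob

print(viterbi("ccc"))  # most likely mood sequence for c, c, c
print(viterbi("wcw"))
```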

18 HMMs Basic Algorithms (Continued)
- The Baum-Welch Algorithm (also called the Forward-Backward Algorithm) solves the Training problem:
  - So far we assumed the HMM parameters are known, but in practice they must be estimated, often from annotated training data, which has two drawbacks:
    1) Annotation is difficult/expensive.
    2) Training data may differ from the current data.
  - The goal of the Baum-Welch algorithm is to tune the HMM parameters with EM (Expectation Maximization) so that they maximize the probability of the current data.
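A compact sketch (not from the slides) of a single Baum-Welch re-estimation step on one observation sequence; the intermediate quantities `gamma` and `xi` follow the standard EM-for-HMM derivation rather than anything shown in the deck:

```python
# One Baum-Welch (EM) re-estimation step for a discrete HMM (a sketch).
Pi = {"H": 0.8, "S": 0.2}
A = {"H": {"H": 0.9, "S": 0.1}, "S": {"H": 0.1, "S": 0.9}}
B = {"H": {"w": 0.4, "s": 0.4, "c": 0.2, "f": 0.0},
     "S": {"w": 0.1, "s": 0.3, "c": 0.5, "f": 0.1}}

def baum_welch_step(obs):
    T, states = len(obs), list(Pi)
    # E-step: forward (alpha) and backward (beta) passes.
    alpha = [{i: Pi[i] * B[i][obs[0]] for i in states}]
    for t in range(1, T):
        alpha.append({i: sum(alpha[-1][j] * A[j][i] for j in states) * B[i][obs[t]]
                      for i in states})
    beta = [{i: 1.0 for i in states}]
    for t in range(T - 2, -1, -1):
        beta.insert(0, {i: sum(A[i][j] * B[j][obs[t + 1]] * beta[0][j] for j in states)
                        for i in states})
    p_obs = sum(alpha[-1][i] for i in states)          # p(O | lambda)
    gamma = [{i: alpha[t][i] * beta[t][i] / p_obs for i in states} for t in range(T)]
    xi = [{(i, j): alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / p_obs
           for i in states for j in states} for t in range(T - 1)]
    # M-step: re-estimate lambda = (Pi, A, B) from the expected counts.
    new_Pi = {i: gamma[0][i] for i in states}
    new_A = {i: {j: sum(x[(i, j)] for x in xi) / sum(g[i] for g in gamma[:-1])
                 for j in states} for i in states}
    new_B = {i: {o: sum(g[i] for t, g in enumerate(gamma) if obs[t] == o) /
                    sum(g[i] for g in gamma)
                 for o in "wscf"} for i in states}
    return new_Pi, new_A, new_B, p_obs

print(baum_welch_step("ccwcs"))  # re-estimated parameters after one EM step
```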

19 Outline
- Introduction
- Hidden Markov Models (HMMs)
- Applications
  - HMMs in Speech Recognition
  - HMMs for Gene Finding in DNA
- Summary
- References

20 HMMs In Speech Recognition
Figure: the word "ball" spoken by two different speakers: (a) female and (b) male [3].
Figure: the phoneme /e/ in three different contexts: (a) let's, (b) phonetic and (c) sentence [3].

21 HMMs In Speech Recognition (Continued)
- HMMs are used to model units of speech, building word- and sentence-level models from phoneme-level units. For example, the word "one" with pronunciation "W AX N" [2] (a sketch of such a word model follows below):
  - Speech is represented as a sequence of symbols.
  - Output probabilities: probabilities of observing a symbol in a state.
  - Transition probabilities: probabilities of staying in or skipping a state.
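A small illustrative sketch (not from the slides) of a left-to-right word HMM for "one" built from the phone units W, AX, N; all probability values and the toy observation alphabet are assumptions made for illustration:

```python
# Left-to-right HMM for the word "one" = phones W, AX, N.
# Each phone state can stay (self-loop), move to the next phone, or skip one.
# All numbers are illustrative, not taken from the slides.
states = ["W", "AX", "N"]

transition = {
    "W":  {"W": 0.6, "AX": 0.3, "N": 0.1},   # stay / next / skip AX
    "AX": {"AX": 0.7, "N": 0.3},             # stay / next
    "N":  {"N": 0.6, "exit": 0.4},           # stay / leave the word model
}

# Output probabilities: chance of observing an acoustic symbol in each state
# (here a toy 3-symbol alphabet o1, o2, o3).
emission = {
    "W":  {"o1": 0.7, "o2": 0.2, "o3": 0.1},
    "AX": {"o1": 0.1, "o2": 0.8, "o3": 0.1},
    "N":  {"o1": 0.1, "o2": 0.2, "o3": 0.7},
}

# Word models like this are concatenated into sentence models (next slide),
# and the whole sentence model is trained on the spoken sentence.
for s in states:
    assert abs(sum(transition[s].values()) - 1) < 1e-9
    assert abs(sum(emission[s].values()) - 1) < 1e-9
```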

22 HMMs In Speech Recognition (Continued)
Training HMMs for Continuous Speech:
- Concatenate phone models to give a word model.
- Concatenate word models to give a sentence model.
- Train the entire sentence model on the entire spoken sentence.

23 HMMs In Speech Recognition (Continued)
Figure: Recognition Search. Source: http://www.cs.cmu.edu/~roni/10601-slides/hmm-for-asr-whw.pdf

24 HMMs In Speech Recognition (Continued)
Figure: Forward-Backward Training for Continuous Speech. Source: http://www.cs.cmu.edu/~roni/10601-slides/hmm-for-asr-whw.pdf

25 HMMs for Gene Finding in DNA
Figure: chromosomes. Source: http://www.wignallandwales.co.nz/Revision-samples/7F1_Chromosomes.htm

26 HMMs for Gene Finding in DNA (Continued)
Figure: basic structure of a gene [6].

27 HMMs for Gene Finding in DNA (Continued)
- Input: a DNA sequence X = (x_1, ..., x_n), where each x_i ∈ {A, C, G, T}.
- Output: a correct labeling of each element in X as belonging to a coding, non-coding, or intergenic region.
- The goal of gene finding is then to annotate genomic data with the locations of genes and of the coding regions within these genes [6]. (A toy labeling sketch follows below.)
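A toy sketch (not from the slides) of posing gene finding as HMM decoding: three hidden states (coding, non-coding, intergenic) emit nucleotides, and the most likely labeling of a sequence is recovered with Viterbi decoding from slide 17. All probabilities are invented for illustration, e.g. coding regions are made slightly GC-rich.

```python
# Toy gene-finding HMM: label each nucleotide as coding (C), non-coding (N)
# or intergenic (I).  All probabilities are invented for illustration.
Pi = {"C": 0.2, "N": 0.3, "I": 0.5}
A = {  # long runs of each region type -> high self-transition probability
    "C": {"C": 0.90, "N": 0.08, "I": 0.02},
    "N": {"C": 0.10, "N": 0.85, "I": 0.05},
    "I": {"C": 0.03, "N": 0.07, "I": 0.90},
}
B = {  # coding regions made slightly GC-rich in this toy model
    "C": {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "N": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "I": {"A": 0.30, "C": 0.20, "G": 0.20, "T": 0.30},
}

def viterbi_label(dna):
    """Most likely region label for each nucleotide (Viterbi decoding)."""
    vp = {i: (Pi[i] * B[i][dna[0]], [i]) for i in Pi}
    for x in dna[1:]:
        vp = {i: max((vp[j][0] * A[j][i] * B[i][x], vp[j][1] + [i]) for j in vp)
              for i in Pi}
    return max(vp.values())[1]

print(viterbi_label("ATGCGCGGCATAATTA"))  # one region label per nucleotide
```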

28 HMMs for Gene Finding in DNA (Continued)
- Enter: start codon or intron (3' splice site).
- Exit: 5' splice site or one of the three stop codons (taa, tag, tga) [7].

29 Outline
- Introduction
- Hidden Markov Models (HMMs)
- Applications
- Summary
- References

30 Summary
- Introduced Hidden Markov Models (HMMs).
- Defined the basic HMM problems and the corresponding algorithms that solve them.
- Presented some of the best-known applications of HMMs.

31 References
[1] http://sifaka.cs.uiuc.edu/course/498cxz06s/hmm.pdf
[2] http://www.cs.tut.fi/~puhtunn/lecture_model_search.pdf
[3] http://www.kbs.twi.tudelft.nl/docs/syllabi/speech.pdf
[4] https://www.youtube.com/watch?v=jY2E6ExLxaw
[5] https://en.wikipedia.org/wiki/Bayes%27_theorem
[6] K. Smith, "Hidden Markov Models in Bioinformatics with Application to Gene Finding in Human DNA," available at: https://www.rose-hulman.edu/~shibberu/MA490/Smith.pdf
[7] Nagiza F. Samatova, "Computational Gene Finding using HMMs," presentation slides.

32 Questions
Q1) Define an HMM and give its components.
Q2) What are the three problems associated with an HMM, and how are they solved?
Q3) What is the difference between the Forward and Viterbi algorithms?
Q4) Given the Assistive Technology example, find P(X=s|Y=c). Hint: use the absolute prior.
Q5) How can a gene be found in human DNA using HMMs? What are the inputs and outputs of the HMM?

