Hidden Markov Models Chapter 11

CG “islands”
The dinucleotide “CG” is rare: the C in a “CG” often gets methylated, and the methylated C then mutates to T.
Methylation is suppressed in some regions of the genome, called “CG islands”.
Such CG islands are often found around genes.
Problem: find the CG islands in a whole genome.

CG island vs. normal
[Figure: two-state model, “CG island” (high CG) and “normal” (rare CG)]
Two states; each state emits sequence. The sequence emitted by the “CG island” state is high in CG frequency.
The concatenation of the emitted sequences = the genome.

Biased coin vs. fair coin
[Figure: two-state model, “Biased coin” (more “H”) and “Fair coin” (equal “H” & “T”)]
Two states; each state emits a “coin toss” result. The sequence emitted by the “Biased coin” state is high in “Heads” frequency.

CG Islands and the “Fair Bet Casino”
The CG islands problem can be modeled after a problem named “The Fair Bet Casino”.
The game is to flip coins, which results in only two possible outcomes: Head or Tail.
The Fair coin gives Heads and Tails with the same probability ½.
The Biased coin gives Heads with probability ¾.

The “Fair Bet Casino” (cont’d)
Thus, we define the probabilities:
– P(H|F) = P(T|F) = ½
– P(H|B) = ¾, P(T|B) = ¼
– The crooked dealer changes between the Fair and Biased coins with probability 10%.
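To see why these probabilities matter, here is a small illustrative sketch (mine, not from the slides) comparing how likely a run of tosses is under the fair coin versus the biased coin; the toss string is an arbitrary example.

```python
from math import prod

p_fair = {"H": 0.5, "T": 0.5}      # P(H|F) = P(T|F) = 1/2
p_biased = {"H": 0.75, "T": 0.25}  # P(H|B) = 3/4, P(T|B) = 1/4

tosses = "HHHTHHHH"                # hypothetical run of tosses
likelihood_fair = prod(p_fair[t] for t in tosses)
likelihood_biased = prod(p_biased[t] for t in tosses)

print(likelihood_fair)    # (1/2)^8 ≈ 0.0039
print(likelihood_biased)  # (3/4)^7 * (1/4) ≈ 0.0334 -- a heads-heavy run favors the biased coin
```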

The Fair Bet Casino Problem
Input: a sequence x = x_1 x_2 x_3 … x_n of coin tosses made by two possible coins (F or B).
Output: a sequence π = π_1 π_2 π_3 … π_n, with each π_i being either F or B, indicating that x_i is the result of tossing the Fair or Biased coin, respectively.

Problem with the Fair Bet Casino Problem
Any observed outcome of coin tosses could have been generated by any sequence of states!
We need to incorporate a way to grade different state sequences differently.
This leads to the Decoding Problem.

Hidden Markov Model (HMM)
An HMM can be viewed as an abstract machine with k hidden states that emits symbols from an alphabet Σ.
Each state has its own probability distributions, and the machine switches between states and emits symbols according to them.
While in a certain state, the machine makes two decisions:
– What state should I move to next?
– What symbol, from the alphabet Σ, should I emit?

Why “Hidden”?
Observers can see the emitted symbols of an HMM but cannot know which state the HMM is currently in.
Thus, the goal is to infer the most likely sequence of hidden states of an HMM from the given sequence of emitted symbols.

HMM Parameters
Σ: set of emission characters. Example: Σ = {H, T} for coin tossing.
Q: set of hidden states, each emitting symbols from Σ. Q = {F, B} for coin tossing.

HMM Parameters (cont’d)
A = (a_kl): a |Q| × |Q| matrix of the probabilities of changing from state k to state l.
a_FF = 0.9, a_FB = 0.1, a_BF = 0.1, a_BB = 0.9
E = (e_k(b)): a |Q| × |Σ| matrix of the probabilities of emitting symbol b while in state k.
e_F(0) = ½, e_F(1) = ½, e_B(0) = ¼, e_B(1) = ¾
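As a minimal sketch (my own encoding, not the textbook's), these parameters can be written down directly, with the emission symbols 0 = Tails and 1 = Heads as in the matrix E above:

```python
# Fair Bet Casino HMM parameters as plain dictionaries.
# States Q = {F, B}; emission alphabet = {0, 1}, with 1 = Heads and 0 = Tails.
states = ["F", "B"]
alphabet = [0, 1]

# A[k][l] = a_kl = probability of changing from state k to state l
A = {"F": {"F": 0.9, "B": 0.1},
     "B": {"F": 0.1, "B": 0.9}}

# E[k][b] = e_k(b) = probability of emitting symbol b while in state k
E = {"F": {0: 0.5,  1: 0.5},
     "B": {0: 0.25, 1: 0.75}}

# Sanity check: each row of A and of E sums to 1.
assert all(abs(sum(A[k].values()) - 1.0) < 1e-9 for k in states)
assert all(abs(sum(E[k].values()) - 1.0) < 1e-9 for k in states)
```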

HMM for the Fair Bet Casino (cont’d)
[Figure: the HMM model for the Fair Bet Casino Problem]

Hidden Paths
A path π = π_1 … π_n in the HMM is defined as a sequence of states.
Consider the path π = FFFBBBBBFFF and a sequence x = x_1 … x_11 emitted along it:

π:                 F    F    F    B    B    B    B    B    F    F    F
P(x_i | π_i):      ½    ½    ½    ¾    ¾    ¾    ¼    ¾    ½    ½    ½
P(π_{i-1} → π_i):  ½   9/10 9/10 1/10 9/10 9/10 9/10 9/10 1/10 9/10 9/10

P(π_{i-1} → π_i): transition probability from state π_{i-1} to state π_i.
P(x_i | π_i): probability that x_i was emitted from state π_i.

P(x|π) Calculation
P(x, π): the probability that the sequence x was generated by the path π:
P(x, π) = Π_{i=1..n} e_{π_i}(x_i) · a_{π_{i-1}, π_i}
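A short sketch of this computation (my code, assuming the A and E matrices above and taking the initial factor to be ½, as in the hidden-path table; the observation sequence is an arbitrary example, not the one on the slide):

```python
# P(x, π) = Π_{i=1..n} e_{π_i}(x_i) · a_{π_{i-1}, π_i},
# with the first factor using an initial probability of 1/2,
# as in the hidden-path table above.
A = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
E = {"F": {0: 0.5, 1: 0.5}, "B": {0: 0.25, 1: 0.75}}

def joint_probability(x, path, start_prob=0.5):
    p = start_prob * E[path[0]][x[0]]
    for i in range(1, len(x)):
        p *= A[path[i - 1]][path[i]] * E[path[i]][x[i]]
    return p

x = [0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1]   # illustrative observations (not the slide's sequence)
path = list("FFFBBBBBFFF")              # hidden path from the "Hidden Paths" slide
print(joint_probability(x, path))
```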

Decoding Problem
Goal: find an optimal hidden path of states, given the observations.
Input: a sequence of observations x = x_1 … x_n generated by an HMM M(Σ, Q, A, E).
Output: a path π that maximizes P(x, π) over all possible paths π.

Dynamic programming for the Decoding Problem
Andrew Viterbi used dynamic programming to solve the Decoding Problem.
Build a graph with one node for every (i, j), representing the possibility that the i-th position of π was emitted from state j.
Every choice of π = π_1 … π_n corresponds to a path in this graph.
Edges are weighted so that the product of the edge weights on a path π corresponds to P(x, π).

Graph for Decoding Problem

Decoding Problem
Every path in the graph corresponds to a probability P(x, π).
The Viterbi algorithm finds the path that maximizes P(x, π) among all possible paths.
The Viterbi algorithm runs in O(n|Q|²) time.

Decoding Problem: weights of edges
The weight w of the edge from node (k, i) to node (l, i+1) is given by: ???

Decoding Problem: weights of edges
The weight w of the edge from node (k, i) to node (l, i+1) is given by: ??
Recall: P(x, π) = Π_{i=1..n} e_{π_i}(x_i) · a_{π_{i-1}, π_i}

Decoding Problem: weights of edges
The weight w of the edge from node (k, i) to node (l, i+1) is given by: ?
The (i+1)-th term of this product is e_{π_{i+1}}(x_{i+1}) · a_{π_i, π_{i+1}}

Decoding Problem: weights of edges
The weight of the edge from node (k, i) to node (l, i+1) is w = e_l(x_{i+1}) · a_kl
The (i+1)-th term of the product is e_{π_{i+1}}(x_{i+1}) · a_{π_i, π_{i+1}} = e_l(x_{i+1}) · a_kl

Dynamic programming recurrence
s_{k,i} is the probability of the most likely path for the prefix x_1 … x_i that ends in state k.
s_{l,i+1} = max_k { s_{k,i} × (weight of the edge from (k, i) to (l, i+1)) }
          = max_k { s_{k,i} · a_kl · e_l(x_{i+1}) }
Let π* be the optimal path. Then
P(x, π*) = max_k { s_{k,n} · a_{k,end} }
Time complexity: O(n |Q|²)
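A compact sketch of this recurrence in code (mine, not the textbook's pseudocode; it assumes a uniform initial distribution over states and omits the a_{k,end} term, i.e. treats the end transition as uniform):

```python
# Viterbi: s[k][i] = probability of the most likely path for x_1..x_i ending in state k.
A = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
E = {"F": {0: 0.5, 1: 0.5}, "B": {0: 0.25, 1: 0.75}}

def viterbi(x, states):
    n = len(x)
    s = {k: [0.0] * n for k in states}      # DP table of probabilities
    back = {k: [None] * n for k in states}  # back-pointers for recovering the optimal path
    for k in states:                        # uniform initial distribution (assumed)
        s[k][0] = (1.0 / len(states)) * E[k][x[0]]
    for i in range(n - 1):
        for l in states:
            best = max(states, key=lambda k: s[k][i] * A[k][l])
            s[l][i + 1] = s[best][i] * A[best][l] * E[l][x[i + 1]]
            back[l][i + 1] = best
    last = max(states, key=lambda k: s[k][n - 1])  # best final state
    path = [last]
    for i in range(n - 1, 0, -1):                  # trace the optimal path backwards
        path.append(back[path[-1]][i])
    return s[last][n - 1], "".join(reversed(path))

x = [0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1]
print(viterbi(x, ["F", "B"]))
```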

Viterbi Algorithm
The value of the product can become extremely small, which leads to underflow.
To avoid underflow, use log values instead: S_{k,i} = log s_{k,i}.
New recurrence:
S_{l,i+1} = log e_l(x_{i+1}) + max_k { S_{k,i} + log(a_kl) }
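The same idea in log space, again as a hedged sketch (safe here because no a_kl or e_l(x) is zero in this model; a general implementation would have to guard against log(0)):

```python
from math import log

A = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
E = {"F": {0: 0.5, 1: 0.5}, "B": {0: 0.25, 1: 0.75}}

def viterbi_logspace(x, states):
    n = len(x)
    S = {k: [float("-inf")] * n for k in states}    # S_{k,i} = log s_{k,i}
    for k in states:                                # uniform start (assumed)
        S[k][0] = log(1.0 / len(states)) + log(E[k][x[0]])
    for i in range(n - 1):
        for l in states:
            # S_{l,i+1} = log e_l(x_{i+1}) + max_k { S_{k,i} + log(a_kl) }
            S[l][i + 1] = log(E[l][x[i + 1]]) + max(S[k][i] + log(A[k][l]) for k in states)
    return max(S[k][n - 1] for k in states)         # log P(x, π*)

x = [0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1]
print(viterbi_logspace(x, ["F", "B"]))
```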

Forward-Backward Problem
Given: a sequence x = x_1 x_2 … x_n generated by an HMM.
Goal: find the probability that x_i was generated by state k. That is, given x, what is P(π_i = k | x)?

Forward Algorithm
Define f_{k,i} (the forward probability) as the probability of emitting the prefix x_1 … x_i and reaching the state π_i = k.
The recurrence for the forward algorithm:
f_{k,i} = e_k(x_i) · Σ_{l ∈ Q} f_{l,i-1} · a_lk
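A sketch of the forward pass (my code; it assumes a uniform initial distribution over the states, which the slides leave implicit):

```python
A = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
E = {"F": {0: 0.5, 1: 0.5}, "B": {0: 0.25, 1: 0.75}}

def forward(x, states):
    """f[k][i] = probability of emitting x_1..x_i and ending in state k."""
    n = len(x)
    f = {k: [0.0] * n for k in states}
    for k in states:                                  # uniform initial distribution (assumed)
        f[k][0] = (1.0 / len(states)) * E[k][x[0]]
    for i in range(1, n):
        for k in states:
            # f_{k,i} = e_k(x_i) * sum over l of f_{l,i-1} * a_lk
            f[k][i] = E[k][x[i]] * sum(f[l][i - 1] * A[l][k] for l in states)
    return f

x = [0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1]
f = forward(x, ["F", "B"])
print(sum(f[k][len(x) - 1] for k in ["F", "B"]))      # P(x), summed over the final state
```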

Backward Algorithm
However, the forward probability is not the only factor affecting P(π_i = k | x).
The sequence of transitions and emissions that the HMM undergoes between π_{i+1} and π_n also affects P(π_i = k | x).
[Diagram: forward … x_i … backward]

Backward Algorithm (cont’d)
Define the backward probability b_{k,i} as the probability of being in state π_i = k and emitting the suffix x_{i+1} … x_n.
The recurrence for the backward algorithm:
b_{k,i} = Σ_{l ∈ Q} e_l(x_{i+1}) · b_{l,i+1} · a_kl
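A sketch of the backward pass (my code; b_{k,n} is initialized to 1 since nothing remains to be emitted after position n):

```python
A = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
E = {"F": {0: 0.5, 1: 0.5}, "B": {0: 0.25, 1: 0.75}}

def backward(x, states):
    """b[k][i] = probability of emitting the suffix x_{i+1}..x_n given state k at position i."""
    n = len(x)
    b = {k: [0.0] * n for k in states}
    for k in states:
        b[k][n - 1] = 1.0                             # nothing left to emit after position n
    for i in range(n - 2, -1, -1):
        for k in states:
            # b_{k,i} = sum over l of e_l(x_{i+1}) * b_{l,i+1} * a_kl
            b[k][i] = sum(E[l][x[i + 1]] * b[l][i + 1] * A[k][l] for l in states)
    return b

x = [0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1]
print(backward(x, ["F", "B"])["B"][0])
```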

Forward-Backward Algorithm
The probability that the dealer used a biased coin at any moment i:

P(π_i = k | x) = P(x, π_i = k) / P(x) = f_{k,i} · b_{k,i} / P(x)

where P(x) is the sum of P(x, π_i = k) over all k.
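Putting the two passes together, a self-contained sketch (mine) of posterior decoding for the Fair Bet Casino; it prints, for each position, the posterior probability that the dealer was using the biased coin:

```python
# Posterior decoding: P(π_i = k | x) = f_{k,i} * b_{k,i} / P(x).
A = {"F": {"F": 0.9, "B": 0.1}, "B": {"F": 0.1, "B": 0.9}}
E = {"F": {0: 0.5, 1: 0.5}, "B": {0: 0.25, 1: 0.75}}
states = ["F", "B"]

def posteriors(x):
    n = len(x)
    f = {k: [0.0] * n for k in states}
    b = {k: [1.0] * n for k in states}
    for k in states:                                   # forward pass (uniform start assumed)
        f[k][0] = (1.0 / len(states)) * E[k][x[0]]
    for i in range(1, n):
        for k in states:
            f[k][i] = E[k][x[i]] * sum(f[l][i - 1] * A[l][k] for l in states)
    for i in range(n - 2, -1, -1):                     # backward pass; b[k][n-1] stays 1
        for k in states:
            b[k][i] = sum(E[l][x[i + 1]] * b[l][i + 1] * A[k][l] for l in states)
    px = sum(f[k][n - 1] for k in states)              # P(x), summed over the final state
    return [{k: f[k][i] * b[k][i] / px for k in states} for i in range(n)]

x = [0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1]
for i, post in enumerate(posteriors(x), start=1):
    print(i, round(post["B"], 3))   # posterior probability of the biased coin at position i
```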