# Thomas Jellema & Wouter Van Gool 1 Question. 2Answer.

Thomas Jellema & Wouter Van Gool 1 Question

3 Pairwise alignment using HMMs Wouter van Gool and Thomas Jellema

Thomas Jellema & Wouter Van Gool 4 Pairwise alignment using HMMs: Contents

- Most probable path (Thomas)
- Probability of an alignment (Thomas)
- Sub-optimal alignments (Thomas)
- Pause
- Posterior probability that $x_i$ is aligned to $y_j$ (Wouter)
- Pair HMMs versus FSAs for searching (Wouter)
- Conclusion and summary (Wouter)
- Questions

Thomas Jellema & Wouter Van Gool 5 4.1 Most probable path Model that emits a single sequence

Thomas Jellema & Wouter Van Gool 6 4.1 Most probable path Begin and end state

Thomas Jellema & Wouter Van Gool 7 4.1 Most probable path Model that emits a pairwise alignment

Thomas Jellema & Wouter Van Gool 8 4.1 Most probable path Example of a sequence

| Column | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Seq1 | A | C | T | _ | C |
| Seq2 | T | _ | G | G | C |
| State | M | X | M | Y | M |

Thomas Jellema & Wouter Van Gool 9 4.1 Most probable path Begin and end state

Thomas Jellema & Wouter Van Gool 10 4.1 Most probable path Finding the most probable path

- The path we choose is the one with the highest probability of being the correct alignment.
- Each state we add to the alignment has to be the state with the highest probability of being correct.
- We calculate the probability of the next state being M, X or Y and choose the one with the highest probability.
- If the probability of ending the alignment is higher than that of the next state being M, X or Y, we end the alignment.

Thomas Jellema & Wouter Van Gool 11 4.1 Most probable path The probability of emitting an M is the highest of the probabilities of these three transitions:

1. previous state X, new state M
2. previous state Y, new state M
3. previous state M, new state M

Thomas Jellema & Wouter Van Gool 12 4.1 Most probable path Probability of going to the M state

Thomas Jellema & Wouter Van Gool 13 4.1 Most probable path Viterbi algorithm for pair HMMs
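The recurrences shown on this slide are not part of the transcript. As a minimal sketch, the Viterbi algorithm for a three-state pair HMM (M, X, Y) could look like the Python below. It assumes the transition parameters δ (gap open), ε (gap extend) and τ (end) from the model diagram, no direct transitions between X and Y, and a begin state that behaves like M. The function name `viterbi_pair_hmm`, the dictionaries `p` (match emissions) and `q` (gap emissions), and all numeric values are illustrative assumptions, not taken from the slides.

```python
import math

NEG_INF = float("-inf")

def viterbi_pair_hmm(x, y, p, q, delta, eps, tau):
    """Return (log probability, state path) of the most probable alignment."""
    n, m = len(x), len(y)
    log = math.log
    # Log transition probabilities, keyed by (previous state, next state).
    t = {("M", "M"): log(1 - 2 * delta - tau),
         ("X", "M"): log(1 - eps - tau), ("Y", "M"): log(1 - eps - tau),
         ("M", "X"): log(delta), ("X", "X"): log(eps),
         ("M", "Y"): log(delta), ("Y", "Y"): log(eps)}
    # v[s][i][j]: best log probability of aligning x[:i] and y[:j], ending in s.
    v = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    ptr = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in "MXY"}
    v["M"][0][0] = 0.0  # assumption: the begin state is treated like M
    # How far each state advances in x and in y when it emits.
    moves = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}
    for i in range(n + 1):
        for j in range(m + 1):
            for s, (di, dj) in moves.items():
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                if s == "M":
                    emit = log(p[(x[i - 1], y[j - 1])])
                elif s == "X":
                    emit = log(q[x[i - 1]])
                else:
                    emit = log(q[y[j - 1]])
                best, arg = NEG_INF, None
                for prev in "MXY":
                    if (prev, s) in t and v[prev][pi][pj] + t[(prev, s)] > best:
                        best, arg = v[prev][pi][pj] + t[(prev, s)], prev
                if arg is not None:
                    v[s][i][j] = best + emit
                    ptr[s][i][j] = arg
    # Termination: best final state, then the transition to the end state.
    state = max("MXY", key=lambda s: v[s][n][m])
    score = v[state][n][m] + log(tau)
    # Traceback from (n, m) to (0, 0).
    path, i, j = [], n, m
    while i > 0 or j > 0:
        path.append(state)
        prev = ptr[state][i][j]
        di, dj = moves[state]
        i, j, state = i - di, j - dj, prev
    return score, "".join(reversed(path))

# Hypothetical parameters: uniform background, identical residues favoured.
alphabet = "ACGT"
q = {a: 0.25 for a in alphabet}
p = {(a, b): 0.2 if a == b else 0.2 / 12 for a in alphabet for b in alphabet}
score, path = viterbi_pair_hmm("ACGT", "ACT", p, q, delta=0.1, eps=0.3, tau=0.1)
print(path, score)
```

Working in log space avoids numerical underflow on longer sequences; the returned path uses the same M/X/Y labels as the example alignment earlier in the talk.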

Thomas Jellema & Wouter Van Gool 14 4.1 Most probable path Finding the most probable path using FSAs -The most probable path is also the optimal FSA alignment

Thomas Jellema & Wouter Van Gool 15 4.1 Most probable path Finding the most probable path using FSAs

Thomas Jellema & Wouter Van Gool 16 4.1 Most probable path Recurrence relations

Thomas Jellema & Wouter Van Gool 17 4.1 Most probable path The log-odds scoring function

We wish to know whether the alignment score is above or below the score of a random alignment. The log-odds ratio is $s(a,b) = \log\frac{p_{ab}}{q_a q_b}$. We have $s(a,b) > 0$ iff the probability that a and b are related under our model is larger than the probability that they were picked at random.
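To make the sign of the log-odds score concrete, here is a small sketch in Python. The function name `log_odds` and the probabilities are hypothetical values chosen for illustration (a uniform background of 0.25 per residue), not numbers from the slides.

```python
import math

def log_odds(p_ab, q_a, q_b):
    """s(a, b) = log(p_ab / (q_a * q_b))."""
    return math.log(p_ab / (q_a * q_b))

# Hypothetical numbers: the model sees identical pairs more often than chance
# (0.2 > 0.25 * 0.25) and mismatched pairs less often than chance.
match_score = log_odds(0.2, 0.25, 0.25)
mismatch_score = log_odds(0.2 / 12, 0.25, 0.25)
print(match_score > 0, mismatch_score < 0)
```

A pair whose joint probability equals the product of the background frequencies scores exactly zero, which is the boundary between "related" and "random".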

Thomas Jellema & Wouter Van Gool 18 4.1 Most probable path Random model

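Thomas Jellema & Wouter Van Gool 19 4.1 Most probable path [Figure: state diagrams of the two models. "Model": states M, X, Y and END, with transition probabilities 1−2δ−τ, δ, ε, 1−ε−τ and τ. "Random": states X, Y and END, with transition probabilities 1−η and η.]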

Thomas Jellema & Wouter Van Gool 20 4.1 Most probable path Transitions

Thomas Jellema & Wouter Van Gool 21 4.1 Most probable path Transitions

Thomas Jellema & Wouter Van Gool 22 4.1 Most probable path Optimal log-odds alignment
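The formulas on this slide are missing from the transcript. One way to obtain an additive log-odds alignment score, sketched here under the assumption that the model uses δ, ε, τ and the random model uses η as in the diagram on slide 19, is to fold the transition ratios into a substitution score `s`, a gap-open penalty `d` and a gap-extend penalty `e`. Each emitted residue contributes a factor (1−η) in the random model, and the gap-open penalty absorbs the difference between returning to M from a gap and staying in M. The function name and numeric values are illustrative.

```python
import math

def log_odds_parameters(p_ab, q_a, q_b, delta, eps, tau, eta):
    """Return (s, d, e): substitution score, gap-open and gap-extend penalties."""
    s = (math.log(p_ab / (q_a * q_b))
         + math.log((1 - 2 * delta - tau) / (1 - eta) ** 2))
    d = -math.log(delta * (1 - eps - tau) / ((1 - eta) * (1 - 2 * delta - tau)))
    e = -math.log(eps / (1 - eta))
    return s, d, e

# Hypothetical parameter values.
s, d, e = log_odds_parameters(0.2, 0.25, 0.25,
                              delta=0.1, eps=0.3, tau=0.1, eta=0.05)
print(s, d, e)
```

With these definitions, an alignment scored as a sum of `s` terms minus `d` per gap opening and `e` per gap extension differs from the full log-odds ratio only by a constant for the end transitions, so the standard affine-gap dynamic programme finds the same optimum.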

Thomas Jellema & Wouter Van Gool 23 4.1 Most probable path A pair HMM for local alignment

Thomas Jellema & Wouter Van Gool 25 4.2 Probability of an alignment Probability that a given pair of sequences is related.

Thomas Jellema & Wouter Van Gool 26 4.2 Probability of an alignment Summing the probabilities
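Summing over all alignments can be done with the forward algorithm, which replaces the maximisation in Viterbi by a sum. The sketch below assumes the same three-state model (parameters δ, ε, τ, no direct transitions between X and Y, begin state treated like M); the function name `forward_pair_hmm` and the emission tables `p` and `q` are hypothetical. It works with plain probabilities, which is fine for short sequences but would underflow on long ones.

```python
def forward_pair_hmm(x, y, p, q, delta, eps, tau):
    """Return P(x, y): the sum of the probabilities of all alignments."""
    n, m = len(x), len(y)
    fM = [[0.0] * (m + 1) for _ in range(n + 1)]
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]
    fM[0][0] = 1.0  # assumption: the begin state is treated like M
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                fM[i][j] = p[(x[i - 1], y[j - 1])] * (
                    (1 - 2 * delta - tau) * fM[i - 1][j - 1]
                    + (1 - eps - tau) * (fX[i - 1][j - 1] + fY[i - 1][j - 1]))
            if i > 0:
                fX[i][j] = q[x[i - 1]] * (delta * fM[i - 1][j]
                                          + eps * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q[y[j - 1]] * (delta * fM[i][j - 1]
                                          + eps * fY[i][j - 1])
    # Termination: any state may move to the end state with probability tau.
    return tau * (fM[n][m] + fX[n][m] + fY[n][m])

# Hypothetical parameters: uniform background, identical residues favoured.
alphabet = "ACGT"
q = {a: 0.25 for a in alphabet}
p = {(a, b): 0.2 if a == b else 0.2 / 12 for a in alphabet for b in alphabet}
total = forward_pair_hmm("ACGT", "ACT", p, q, delta=0.1, eps=0.3, tau=0.1)
print(total)
```

Dividing the probability of a single alignment (for example the Viterbi path) by this total gives the posterior probability of that alignment given the two sequences.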

Thomas Jellema & Wouter Van Gool 27 4.2 Probability of an alignment

Thomas Jellema & Wouter Van Gool 29 4.3 Suboptimal alignment Finding suboptimal alignments How to make sample alignments?

Thomas Jellema & Wouter Van Gool 30 4.3 Suboptimal alignment Finding distinct suboptimal alignments

Thomas Jellema & Wouter Van Gool 33 Posterior probability that $x_i$ is aligned to $y_j$

- Local accuracy of an alignment?
- Reliability measure for each part of an alignment
- HMM as a local alignment measure
- Idea: P(all alignments through $(x_i, y_j)$) / P(all alignments of $(x, y)$)

Thomas Jellema & Wouter Van Gool 34 Posterior probability that $x_i$ is aligned to $y_j$ Notation: $x_i \diamond y_j$ means $x_i$ is aligned to $y_j$

Thomas Jellema & Wouter Van Gool 35 Posterior probability that $x_i$ is aligned to $y_j$

Thomas Jellema & Wouter Van Gool 36 Posterior probability that $x_i$ is aligned to $y_j$
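The equations on these two slides did not survive in the transcript. The idea stated earlier (all alignments through the pair, divided by all alignments) is usually written with forward and backward quantities. In the sketch below, $f^M(i,j)$ and $b^M(i,j)$ are assumed names for the forward and backward probabilities of the match state at position $(i,j)$; they are notation introduced here, not taken from the slides.

```latex
\begin{align}
P(x_i \diamond y_j \mid x, y)
  &= \frac{P(x, y,\; x_i \diamond y_j)}{P(x, y)} \\
P(x, y,\; x_i \diamond y_j)
  &= \underbrace{P(x_{1 \ldots i},\, y_{1 \ldots j},\; x_i \diamond y_j)}_{f^M(i,j)}
     \;\underbrace{P(x_{i+1 \ldots n},\, y_{j+1 \ldots m} \mid x_i \diamond y_j)}_{b^M(i,j)}
\end{align}
```

The numerator sums over every alignment that passes through the match state at $(i,j)$, and the denominator $P(x, y)$ is the total over all alignments, so the ratio lies between 0 and 1.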

Thomas Jellema & Wouter Van Gool 37 Probability alignment

- Miyazawa: it seems attractive to find an alignment by maximising $P(x_i \diamond y_j)$.
- This may lead to inconsistencies, e.g. pairs $(i_1, j_1)$ and $(i_2, j_2)$ with $i_2 > i_1$ and $j_2 < j_1$.
- Restriction to pairs $(i, j)$ for which $P(x_i \diamond y_j) > 0.5$.

Thomas Jellema & Wouter Van Gool 38 Posterior probability that $x_i$ is aligned to $y_j$ The expected accuracy of an alignment

- Expected overlap between π and paths sampled from the posterior distribution
- Dynamic programming
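The dynamic programme mentioned here can be sketched as a standard alignment recursion in which the only score is the posterior match probability: A(i,j) = max{A(i−1,j−1) + P(x_i ◊ y_j), A(i−1,j), A(i,j−1)}. The Python below assumes the posteriors have already been computed and are passed in as a matrix `post`; the function name and the example numbers are hypothetical.

```python
def max_expected_accuracy(post):
    """post[i][j] is the posterior that residue i+1 of x pairs with j+1 of y.

    Returns (expected accuracy, list of 1-based aligned pairs).
    """
    n = len(post)
    m = len(post[0]) if n else 0
    A = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            A[i][j] = max(A[i - 1][j - 1] + post[i - 1][j - 1],
                          A[i - 1][j],
                          A[i][j - 1])
    # Traceback; gaps are preferred on ties so zero-probability pairs are skipped.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if A[i][j] == A[i - 1][j]:
            i -= 1
        elif A[i][j] == A[i][j - 1]:
            j -= 1
        else:
            pairs.append((i, j))
            i -= 1
            j -= 1
    return A[n][m], pairs[::-1]

# Hypothetical posteriors in which the two best pairs, (1,2) and (2,1), cross.
crossing = [[0.1, 0.6],
            [0.7, 0.1]]
accuracy, pairs = max_expected_accuracy(crossing)
print(accuracy, pairs)
```

Because the recursion only ever moves forward in both sequences, the resulting set of pairs is always a consistent alignment, unlike picking the highest posterior for each residue independently.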

Thomas Jellema & Wouter Van Gool 41 Pair HMMs versus FSAs for searching

- P(D | M) > P(M | D)
- HMM: the data likelihood is maximised when the model is given the same parameters (i.e. transition and emission probabilities) as the data
- Bayesian model comparison with random model R

Thomas Jellema & Wouter Van Gool 42 Pair HMMs versus FSAs for searching Problems:

1. Most algorithms do not compute the full probability P(x,y | M) but only the best match or Viterbi path
2. FSA parameters may not be readily translated into probabilities

Thomas Jellema & Wouter Van Gool 43 Pair HMMs vs FSAs for searching Example: a model whose parameters match the data need not be the best model

[Figure: HMM with states S and B, transition probabilities α and 1−α; the B states emit a, b, a, c with transition probabilities 1.]

- $P_S(abac) = \alpha^4 q_a q_b q_a q_c$
- $P_B(abac) = 1 - \alpha$
- Model comparison using the best match rather than the total probability

Thomas Jellema & Wouter Van Gool 44 Pair HMMs vs FSAs for searching Problem: no fixed scaling procedure can make the scores of this model into the log probabilities of an HMM

Thomas Jellema & Wouter Van Gool 45 Pair HMMs vs FSAs for searching Bayesian model comparison: both HMMs have the same log-odds ratio as the previous FSA

Thomas Jellema & Wouter Van Gool 46 Pair HMMs vs FSAs for searching Converting an FSA into a probabilistic model

- Probabilistic models may underperform standard alignment methods if Viterbi is used for database searching.
- But if the forward algorithm is used, they would perform better than standard methods.

Thomas Jellema & Wouter Van Gool 48 Why try to use HMMs?

Many complicated alignment algorithms can be described as simple finite state machines. HMMs have many advantages:

- Parameters can be trained to fit the data: no need for PAM/BLOSUM matrices
- HMMs can keep track of all alignments, not just the best one

Thomas Jellema & Wouter Van Gool 49 New things we can do with pair HMMs

- Compute the probability over all alignments.
- Compute the relative probability of the Viterbi alignment (or any other alignment).
- Sample over all alignments in proportion to their probability.
- Find distinct sub-optimal alignments.
- Compute the reliability of each part of the best alignment.
- Compute the maximally reliable alignment.

Thomas Jellema & Wouter Van Gool 50 Conclusion Pair HMMs work better for sequence alignment and database search than alignment algorithms based on penalty scores. Unfortunately, both approaches are O(mn) and hence too slow for large database searches!
