
1 Chap. 4 Pairwise alignment using HMMs
Biointelligence Laboratory, School of Computer Sci. & Eng., Seoul National University, Seoul 151-742, Korea
Copyright (c) 2002 by SNU CSE Biointelligence Lab
This slide file is available online at http://bi.snu.ac.kr/

2 Contents
• FSA → HMM
• Pair HMMs
• The full probability of x and y
• Suboptimal alignment
• Posterior that $x_i$ is aligned to $y_j$
• Pair HMMs vs FSAs for searching

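3 Figure 4.1: A finite state machine diagram for affine gap alignment (left) and the corresponding probabilistic model (right).
[Figure, left: states M (+1,+1), X (+1,+0), Y (+0,+1); match score $s(x_i, y_j)$, gap-open score $-d$, gap-extension score $-e$.]
[Figure, right: state M emits $p_{x_i y_j}$, X emits $q_{x_i}$, Y emits $q_{y_j}$; transition probabilities $\delta$ (M to X or Y), $\varepsilon$ (stay in X or Y), $1-\varepsilon$ (X or Y back to M), $1-2\delta$ (stay in M).]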

4 Recurrence Relation
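The equations for this slide are not part of the transcript. Assuming the slide shows the standard affine-gap recurrences for the finite state machine of Figure 4.1 (match score $s$, gap-open penalty $d$, gap-extension penalty $e$), they would read roughly as follows, where $V^M$, $V^X$, $V^Y$ are assumed names for the best score of an alignment of $x_{1..i}$ and $y_{1..j}$ ending in each state:

```latex
\begin{aligned}
V^M(i,j) &= s(x_i, y_j) + \max\{\, V^M(i-1,j-1),\; V^X(i-1,j-1),\; V^Y(i-1,j-1) \,\} \\
V^X(i,j) &= \max\{\, V^M(i-1,j) - d,\; V^X(i-1,j) - e \,\} \\
V^Y(i,j) &= \max\{\, V^M(i,j-1) - d,\; V^Y(i,j-1) - e \,\}
\end{aligned}
```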

5 Pair HMMs (1)
FSA → HMM: how is the conversion done?
• Specify emission and transition probabilities.
[Figure 4.1 repeated: the affine-gap FSA and its probabilistic counterpart.]

6 Pair HMMs (2)
Definition of a begin state and an end state
• Provides a probability distribution over all possible sequences
Pair HMM
• Identical to an ordinary HMM
• Emits a pairwise alignment

7 Figure 4.2: The full probabilistic version of Figure 4.1.
[Figure: states Begin, M (emits $p_{x_i y_j}$), X (emits $q_{x_i}$), Y (emits $q_{y_j}$), and End; transition probabilities $\delta$, $\varepsilon$, $\tau$, $1-\varepsilon-\tau$, and $1-2\delta-\tau$.]

8 Pair HMMs (3)
Algorithm: Viterbi algorithm for pair HMMs
• Initialization: $v^M(0,0) = 1$, $v^X(0,0) = v^Y(0,0) = 0$; all $v^\bullet(-1,j)$ and $v^\bullet(i,-1)$ are set to 0.
• Recurrence: for $i = 0,\dots,n$, $j = 0,\dots,m$, except $(0,0)$;
• Termination:
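The recurrence and termination formulas are missing from the transcript. For the model of Figure 4.2, and under the assumption that the slide follows the usual pair-HMM Viterbi scheme, they would take this form ($v^E$ is an assumed name for the probability of the best complete path):

```latex
\begin{aligned}
v^M(i,j) &= p_{x_i y_j} \max\{\, (1-2\delta-\tau)\, v^M(i-1,j-1),\;
            (1-\varepsilon-\tau)\, v^X(i-1,j-1),\;
            (1-\varepsilon-\tau)\, v^Y(i-1,j-1) \,\} \\
v^X(i,j) &= q_{x_i} \max\{\, \delta\, v^M(i-1,j),\; \varepsilon\, v^X(i-1,j) \,\} \\
v^Y(i,j) &= q_{y_j} \max\{\, \delta\, v^M(i,j-1),\; \varepsilon\, v^Y(i,j-1) \,\} \\
v^E &= \tau \max\{\, v^M(n,m),\; v^X(n,m),\; v^Y(n,m) \,\}
\end{aligned}
```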

9 Pair HMMs (4)
Random model
Probability of a pair of sequences x and y
[Figure: random model with states Begin, X (emits $q_{x_i}$), Y (emits $q_{y_j}$), and End; transition probabilities $\eta$ and $1-\eta$.]
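The probability formula itself is not in the transcript. Assuming the random model emits all of $x$ from state X and then all of $y$ from state Y, staying in each state with probability $1-\eta$ and leaving it with probability $\eta$, the probability of the pair would factor as:

```latex
P(x, y \mid R)
  = \eta (1-\eta)^n \prod_{i=1}^{n} q_{x_i} \;\cdot\; \eta (1-\eta)^m \prod_{j=1}^{m} q_{y_j}
  = \eta^2 (1-\eta)^{n+m} \prod_{i=1}^{n} q_{x_i} \prod_{j=1}^{m} q_{y_j}
```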

10 Pair HMMs (5)
Correspondence with FSA
• Probability terms become log-odds terms
• Odds ratio: Viterbi match probability / random match probability
Tricks
Compensating term
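As a sketch of what this correspondence could look like, assume each match is charged the M to M transition and each gap opening is charged, in advance, the later return from the gap state to M. Under that assumption the FSA scores would be the log-odds terms below, and $c$ (an assumed name) is the compensating term added when the alignment ends in a gap state, where the advance charge was not actually due:

```latex
\begin{aligned}
s(a,b) &= \log \frac{p_{ab}}{q_a q_b} + \log \frac{1-2\delta-\tau}{(1-\eta)^2} \\
d &= -\log \frac{\delta\,(1-\varepsilon-\tau)}{(1-\eta)(1-2\delta-\tau)} \\
e &= -\log \frac{\varepsilon}{1-\eta} \\
c &= \log(1-2\delta-\tau) - \log(1-\varepsilon-\tau)
\end{aligned}
```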

11 Pair HMMs (6)
Algorithm: Optimal log-odds alignment
• Initialization: $V^M(0,0) = -2\log\eta$, $V^X(0,0) = V^Y(0,0) = -\infty$. All $V^\bullet(i,-1)$, $V^\bullet(-1,j)$ are set to $-\infty$.
• Recurrence: for $i = 0,\dots,n$, $j = 0,\dots,m$, except $(0,0)$;
• Termination: includes the last compensating term
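A minimal Python sketch of this algorithm, assuming the recurrence has the standard affine-gap form and that the termination adds a compensating term `c` when the alignment ends in a gap state. The function name, the dictionary `s` of match scores, and the parameters `d`, `e`, `c` are illustrative assumptions, not taken from the slides.

```python
import math

NEG_INF = float("-inf")

def log_odds_alignment_score(x, y, s, d, e, c, eta):
    """Best log-odds score for aligning x and y (score only, no traceback).

    s: dict mapping (a, b) to a match score; d, e: gap-open and
    gap-extension penalties; c: compensating term (assumed) added when
    the alignment ends in X or Y; eta: random-model parameter.
    """
    n, m = len(x), len(y)
    VM = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    VX = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    VY = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    # Constant offset: log(1 / eta^2), from the random model (assumed sign).
    VM[0][0] = -2.0 * math.log(eta)
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            # Cells with index -1 count as -inf, so they are simply skipped.
            if i > 0 and j > 0:
                VM[i][j] = s[(x[i - 1], y[j - 1])] + max(
                    VM[i - 1][j - 1], VX[i - 1][j - 1], VY[i - 1][j - 1])
            if i > 0:
                VX[i][j] = max(VM[i - 1][j] - d, VX[i - 1][j] - e)
            if j > 0:
                VY[i][j] = max(VM[i][j - 1] - d, VY[i][j - 1] - e)
    return max(VM[n][m], VX[n][m] + c, VY[n][m] + c)

# Toy scores, chosen only for illustration.
toy_s = {(a, b): (1.0 if a == b else -1.0) for a in "AC" for b in "AC"}
score = log_odds_alignment_score("ACA", "AA", toy_s, d=2.0, e=1.0, c=0.5, eta=0.5)
```

Keeping three separate tables mirrors the three states M, X, and Y; a traceback would need extra pointer tables, which are left out here for brevity.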

12 Pair HMMs (7)
Pair HMM for local alignment
[Figure 4.3]

13 The full probability of x and y (1)
Summation over all alignments
• The forward algorithm computes this sum.
• $P(x, y) = f^E(n, m)$
The posterior distribution $P(\pi \mid x, y)$ can then be obtained.
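Written out, with $\pi$ ranging over all alignments (paths) of $x$ and $y$, the two quantities on this slide are related as follows. This is the standard definition of a marginal and a conditional probability rather than anything specific to the slides:

```latex
P(x, y) = \sum_{\pi} P(x, y, \pi),
\qquad
P(\pi \mid x, y) = \frac{P(x, y, \pi)}{P(x, y)}
```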

14 The full probability of x and y (2)
Algorithm: Forward calculation for pair HMMs
• Initialization: $f^M(0,0) = 1$, $f^X(0,0) = f^Y(0,0) = 0$. All $f^\bullet(i,-1)$, $f^\bullet(-1,j)$ are set to 0.
• Recurrence: for $i = 0,\dots,n$, $j = 0,\dots,m$, except $(0,0)$;
• Termination:
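The recurrence and termination are not in the transcript. The Python sketch below assumes they are the Viterbi recurrences for Figure 4.2 with each maximum replaced by a sum, and a termination $f^E(n,m) = \tau\,(f^M(n,m) + f^X(n,m) + f^Y(n,m))$. The function name and the dictionary-based emission tables `p` and `q` are illustrative assumptions.

```python
def forward_pair_hmm(x, y, p, q, delta, eps, tau):
    """Full probability P(x, y) under the pair HMM of Figure 4.2 (assumed form).

    p: dict mapping (a, b) to the match emission probability;
    q: dict mapping a to the gap-state emission probability.
    """
    n, m = len(x), len(y)
    fM = [[0.0] * (m + 1) for _ in range(n + 1)]
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]
    fM[0][0] = 1.0  # initialization; fX(0,0) = fY(0,0) = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            # Cells with index -1 count as 0, so they are simply skipped.
            if i > 0 and j > 0:
                fM[i][j] = p[(x[i - 1], y[j - 1])] * (
                    (1 - 2 * delta - tau) * fM[i - 1][j - 1]
                    + (1 - eps - tau) * (fX[i - 1][j - 1] + fY[i - 1][j - 1]))
            if i > 0:
                fX[i][j] = q[x[i - 1]] * (delta * fM[i - 1][j] + eps * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q[y[j - 1]] * (delta * fM[i][j - 1] + eps * fY[i][j - 1])
    # Termination: every state moves to End with probability tau.
    return tau * (fM[n][m] + fX[n][m] + fY[n][m])

# Toy parameters, chosen only for illustration.
toy_q = {"A": 0.5, "C": 0.5}
toy_p = {("A", "A"): 0.4, ("C", "C"): 0.4, ("A", "C"): 0.1, ("C", "A"): 0.1}
prob = forward_pair_hmm("ACA", "AA", toy_p, toy_q, delta=0.2, eps=0.1, tau=0.1)
```

For realistic sequence lengths the products underflow, so a practical version would work in log space or rescale the tables; that is omitted to keep the sketch close to the slide's notation.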

15 Suboptimal alignment (1)
Types of suboptimal alignment
• Slightly different from the optimal alignment in a few positions
• Substantially or completely different
Repeats in one or both of the sequences

16 Suboptimal alignment (2)
Probabilistic sampling of alignments
• Sampling from the posterior distribution
• Trace back through $f^k(i, j)$
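A sketch of how such a traceback could choose its steps, assuming the forward values $f^k(i,j)$ for the model of Figure 4.2 are available. At each cell the predecessor is drawn with probability proportional to its contribution to the current forward value. From state M at $(i,j)$:

```latex
\begin{aligned}
\text{M at } (i-1,j-1) &: \quad \frac{p_{x_i y_j}\,(1-2\delta-\tau)\, f^M(i-1,j-1)}{f^M(i,j)} \\
\text{X at } (i-1,j-1) &: \quad \frac{p_{x_i y_j}\,(1-\varepsilon-\tau)\, f^X(i-1,j-1)}{f^M(i,j)} \\
\text{Y at } (i-1,j-1) &: \quad \frac{p_{x_i y_j}\,(1-\varepsilon-\tau)\, f^Y(i-1,j-1)}{f^M(i,j)}
\end{aligned}
```

and from state X at $(i,j)$ (state Y is symmetric, stepping back in $j$ instead of $i$):

```latex
\begin{aligned}
\text{M at } (i-1,j) &: \quad \frac{q_{x_i}\,\delta\, f^M(i-1,j)}{f^X(i,j)} \\
\text{X at } (i-1,j) &: \quad \frac{q_{x_i}\,\varepsilon\, f^X(i-1,j)}{f^X(i,j)}
\end{aligned}
```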

17 Suboptimal alignment (3)
Finding distinct suboptimal alignments
• Waterman & Eggert [1987]
• Find the next best alignment
• It has no aligned residue pairs in common with any previously determined alignment

18 [Figure 4.5]

19 Posterior that $x_i$ is aligned to $y_j$ (1)
Reliability measure for each part of an alignment
Quantity of interest: obtained by combining the forward algorithm and the backward algorithm
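The formula on this slide is not in the transcript. Writing $x_i \diamond y_j$ (an assumed notation) for the event that $x_i$ is aligned to $y_j$, the usual forward-backward argument would give the following, where $f^M$ accounts for everything up to and including the match at $(i,j)$ and $b^M$ for everything after it:

```latex
P(x_i \diamond y_j \mid x, y)
  = \frac{P(x, y,\, x_i \diamond y_j)}{P(x, y)}
  = \frac{f^M(i,j)\; b^M(i,j)}{f^E(n,m)}
```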

20 Posterior that $x_i$ is aligned to $y_j$ (2)
Algorithm: Backward calculation for pair HMMs
• Initialization: $b^M(n,m) = b^X(n,m) = b^Y(n,m) = \tau$. All $b^\bullet(i,m+1)$, $b^\bullet(n+1,j)$ are set to 0.
• Recurrence: for $i = n,\dots,1$, $j = m,\dots,1$, except $(n,m)$;
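The recurrence equations are missing from the transcript. Assuming the model of Figure 4.2, each backward value sums over the possible next states, multiplying the transition probability, the emission of the next symbols, and the backward value of the cell reached:

```latex
\begin{aligned}
b^M(i,j) &= (1-2\delta-\tau)\, p_{x_{i+1} y_{j+1}}\, b^M(i+1,j+1)
          + \delta \left[\, q_{x_{i+1}}\, b^X(i+1,j) + q_{y_{j+1}}\, b^Y(i,j+1) \,\right] \\
b^X(i,j) &= (1-\varepsilon-\tau)\, p_{x_{i+1} y_{j+1}}\, b^M(i+1,j+1)
          + \varepsilon\, q_{x_{i+1}}\, b^X(i+1,j) \\
b^Y(i,j) &= (1-\varepsilon-\tau)\, p_{x_{i+1} y_{j+1}}\, b^M(i+1,j+1)
          + \varepsilon\, q_{y_{j+1}}\, b^Y(i,j+1)
\end{aligned}
```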

21 Posterior that $x_i$ is aligned to $y_j$ (3)
The expected accuracy of an alignment
• Expected overlap between $\pi$ and paths sampled from the posterior distribution
• Computed by dynamic programming
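As a sketch, assuming the expected accuracy of an alignment $\pi$ is defined as the sum of the posterior match probabilities of its aligned pairs, the definition and a dynamic programming recurrence for the maximizing alignment would be as follows ($A(\pi)$ and $A(i,j)$ are assumed names):

```latex
\begin{aligned}
A(\pi) &= \sum_{(i,j) \in \pi} P(x_i \diamond y_j \mid x, y) \\
A(i,j) &= \max\{\, A(i-1,j-1) + P(x_i \diamond y_j \mid x, y),\;
                  A(i-1,j),\; A(i,j-1) \,\}
\end{aligned}
```

No gap penalties appear in this recurrence because the posterior probabilities already reflect the gap structure of the model.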

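22 Pair HMMs vs FSAs for searching (1)
Two difficulties of conventional methods in searching
• They are not probabilistic models.
• They cannot compute the full probability $P(x, y \mid M)$.
[Figure: example model for the sequence abac. State S emits with probabilities $q_a$ and has transition probability $\alpha$; the chain B is entered with probability $1-\alpha$, and all its remaining transition and emission probabilities are 1.]
$P_S(abac) = \alpha^4 q_a q_b q_a q_c$
$P_B(abac) = 1 - \alpha$
Model comparison using the best match rather than the total probability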
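To make the two path probabilities concrete, here is a small Python sketch that evaluates the slide's formulas for the sequence abac. The numeric values of `alpha` and `q` are arbitrary assumptions for illustration, and the total assumes these are the only two paths that can produce abac.

```python
def abac_path_probabilities(alpha, q):
    """Return (P_S, P_B) for the sequence 'abac', following the slide's formulas.

    alpha: transition probability of state S (assumed value supplied by caller);
    q: dict of emission probabilities for state S.
    """
    p_s = alpha ** 4 * q["a"] * q["b"] * q["a"] * q["c"]  # path through S
    p_b = 1.0 - alpha                                      # path through B
    return p_s, p_b

# Assumed toy values, not from the slides.
p_s, p_b = abac_path_probabilities(0.9, {"a": 0.5, "b": 0.25, "c": 0.25})
best_path = max(p_s, p_b)  # what a Viterbi-style comparison looks at
total = p_s + p_b          # what a forward-style comparison looks at
```

The gap between `best_path` and `total` is the quantity at stake when a search compares models by their best match instead of their total probability.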

23 Pair HMMs vs FSAs for searching (2)
Conversion of an FSA into a probabilistic model
• Probabilistic models may underperform standard alignment methods if the Viterbi algorithm is used for database searching.
• But if the forward algorithm is used, they would perform better than standard methods.

