Published by Augustus Cobb. Modified over 9 years ago.
Chap. 4 Pairwise alignment using HMMs
Biointelligence Laboratory, School of Computer Sci. & Eng., Seoul National University, Seoul 151-742, Korea
Copyright (c) 2002 by SNU CSE Biointelligence Lab
This slide file is available online at http://bi.snu.ac.kr/
Contents
- FSA → HMM
- Pair HMMs
- The full probability of x and y
- Suboptimal alignment
- Posterior that x_i is aligned to y_j
- Pair HMMs vs FSAs for searching
Figure 4.1 A finite state machine diagram for affine gap alignment on the left, and the corresponding probabilistic model on the right.
- FSA (left): states M (+1,+1), X (+1,+0), Y (+0,+1); match score s(x_i, y_j); transition scores -d and -e.
- Probabilistic model (right): M emits p_{x_i y_j}, X emits q_{x_i}, Y emits q_{y_j}; transition labels δ, ε, 1-ε, 1-δ.
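Recurrence Relation (affine gap alignment with the three-state FSA of Figure 4.1)
  V^M(i, j) = s(x_i, y_j) + max{ V^M(i-1, j-1), V^X(i-1, j-1), V^Y(i-1, j-1) }
  V^X(i, j) = max{ V^M(i-1, j) - d, V^X(i-1, j) - e }
  V^Y(i, j) = max{ V^M(i, j-1) - d, V^Y(i, j-1) - e }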
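The affine gap recurrence for the three-state FSA of Figure 4.1 can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the function name `affine_gap_score`, the scoring callback `s`, and the example scores (+1 match, -1 mismatch, d = 2, e = 1) are assumptions.

```python
NEG_INF = float("-inf")

def affine_gap_score(x, y, s, d, e):
    """Best global alignment score with gap-open penalty d and gap-extend penalty e."""
    n, m = len(x), len(y)
    # VM/VX/VY[i][j]: best score of aligning x[:i] with y[:j], ending in state M/X/Y.
    VM = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    VX = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    VY = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    VM[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # M consumes one residue from each sequence
                VM[i][j] = s(x[i - 1], y[j - 1]) + max(
                    VM[i - 1][j - 1], VX[i - 1][j - 1], VY[i - 1][j - 1])
            if i > 0:            # X consumes a residue of x only
                VX[i][j] = max(VM[i - 1][j] - d, VX[i - 1][j] - e)
            if j > 0:            # Y consumes a residue of y only
                VY[i][j] = max(VM[i][j - 1] - d, VY[i][j - 1] - e)
    return max(VM[n][m], VX[n][m], VY[n][m])

def simple_score(a, b):
    """Hypothetical substitution score: +1 for a match, -1 for a mismatch."""
    return 1.0 if a == b else -1.0

print(affine_gap_score("ACGT", "AT", simple_score, d=2.0, e=1.0))
```

In this example the best alignment pairs A with A and T with T and opens a single two-residue gap, which costs d + e.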
Pair HMMs (1)
FSA → HMMs: how to?
- Specify emission and transition probabilities.
(Figure 4.1 is repeated on this slide.)
Pair HMMs (2)
- Define a begin state and an end state.
- This provides a probability distribution over all possible sequences.
- Pair HMM
  - Identical to an ordinary HMM
  - Emits a pairwise alignment
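Figure 4.2 The full probabilistic version of Figure 4.1.
- States: Begin, M (emits p_{x_i y_j}), X (emits q_{x_i}), Y (emits q_{y_j}), End.
- Transitions: M → X and M → Y with δ; X → X and Y → Y with ε; M → M with 1-2δ-τ; X → M and Y → M with 1-ε-τ; M, X, Y → End with τ. Begin has the same outgoing transitions as M.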
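Pair HMMs (3)
Algorithm: Viterbi algorithm for pair HMMs
Initialization: v^M(0, 0) = 1, v^X(0, 0) = v^Y(0, 0) = 0; all v^•(-1, j) and v^•(i, -1) are set to 0.
Recurrence: for i = 0,…,n, j = 0,…,m, except (0, 0):
  v^M(i, j) = p_{x_i y_j} max{ (1-2δ-τ) v^M(i-1, j-1), (1-ε-τ) v^X(i-1, j-1), (1-ε-τ) v^Y(i-1, j-1) }
  v^X(i, j) = q_{x_i} max{ δ v^M(i-1, j), ε v^X(i-1, j) }
  v^Y(i, j) = q_{y_j} max{ δ v^M(i, j-1), ε v^Y(i, j-1) }
Termination: v^E = τ max{ v^M(n, m), v^X(n, m), v^Y(n, m) }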
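A minimal Python sketch of the Viterbi algorithm for the pair HMM of Figure 4.2 is given here for illustration. The function name `pair_hmm_viterbi`, the dictionary layout of `p` and `q`, and all numeric parameter values are assumptions, not part of the slides. Begin is treated like state M, as in the initialization v^M(0, 0) = 1.

```python
def pair_hmm_viterbi(x, y, p, q, delta, eps, tau):
    """Return (v^E, state path) for the most probable alignment of x and y."""
    n, m = len(x), len(y)
    states = "MXY"
    v = {k: [[0.0] * (m + 1) for _ in range(n + 1)] for k in states}
    ptr = {k: [[None] * (m + 1) for _ in range(n + 1)] for k in states}
    v["M"][0][0] = 1.0
    # Transition probability into M from each state.
    to_m = {"M": 1 - 2 * delta - tau, "X": 1 - eps - tau, "Y": 1 - eps - tau}
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                best = max(states, key=lambda k: to_m[k] * v[k][i - 1][j - 1])
                v["M"][i][j] = p[x[i - 1], y[j - 1]] * to_m[best] * v[best][i - 1][j - 1]
                ptr["M"][i][j] = best
            if i > 0:
                cand = {"M": delta * v["M"][i - 1][j], "X": eps * v["X"][i - 1][j]}
                best = max(cand, key=cand.get)
                v["X"][i][j] = q[x[i - 1]] * cand[best]
                ptr["X"][i][j] = best
            if j > 0:
                cand = {"M": delta * v["M"][i][j - 1], "Y": eps * v["Y"][i][j - 1]}
                best = max(cand, key=cand.get)
                v["Y"][i][j] = q[y[j - 1]] * cand[best]
                ptr["Y"][i][j] = best
    last = max(states, key=lambda k: v[k][n][m])
    v_end = tau * v[last][n][m]
    if v_end == 0.0:
        return 0.0, ""  # no alignment has non-zero probability
    # Trace back from (n, m) to (0, 0) along the stored pointers.
    path, k, i, j = [], last, n, m
    while (i, j) != (0, 0):
        path.append(k)
        prev = ptr[k][i][j]
        if k == "M":
            i, j = i - 1, j - 1
        elif k == "X":
            i -= 1
        else:
            j -= 1
        k = prev
    return v_end, "".join(reversed(path))

# Hypothetical two-letter alphabet and parameters, for illustration only.
p = {("a", "a"): 0.4, ("b", "b"): 0.4, ("a", "b"): 0.1, ("b", "a"): 0.1}
q = {"a": 0.5, "b": 0.5}
print(pair_hmm_viterbi("ab", "ab", p, q, delta=0.2, eps=0.3, tau=0.1))
```

For long sequences the products underflow, so a practical implementation would work in log space; the sketch keeps plain probabilities to mirror the slide.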
Pair HMMs (4)
Random model
- States X (emits q_{x_i}) and Y (emits q_{y_j}) between Begin and End; transition labels η and 1-η.
- Probability of a pair of sequences x and y:
  P(x, y | R) = η^2 (1-η)^(n+m) ∏_{i=1..n} q_{x_i} ∏_{j=1..m} q_{y_j}
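The random model probability can be evaluated directly. The sketch below assumes the standard form of this model, in which each sequence is emitted independently, every residue contributes a factor (1-η), and leaving each of the two emitting states contributes a factor η. The function name `random_model_log_prob` and the example values are hypothetical.

```python
import math

def random_model_log_prob(x, y, q, eta):
    """log P(x, y | R) = 2 log(eta) + (n + m) log(1 - eta) + sum of log q terms."""
    n, m = len(x), len(y)
    return (2 * math.log(eta)
            + (n + m) * math.log(1 - eta)
            + sum(math.log(q[a]) for a in x)
            + sum(math.log(q[b]) for b in y))

# Hypothetical background frequencies and eta, for illustration only.
q = {"a": 0.5, "b": 0.5}
print(math.exp(random_model_log_prob("ab", "a", q, eta=0.1)))
```

Working with logarithms avoids underflow and gives the denominator of the log-odds score directly.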
Pair HMMs (5)
Correspondence with FSA
- Probability terms → log-odds terms (Viterbi match model relative to the random model):
  s(a, b) = log( p_ab / (q_a q_b) ) + log( (1-2δ-τ) / (1-η)^2 )
  d = -log( δ (1-ε-τ) / ((1-η)(1-2δ-τ)) )
  e = -log( ε / (1-η) )
- Tricks
  - Compensating term
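To make the correspondence concrete, the following sketch converts pair HMM probabilities into FSA-style scores s, d, e and the end compensating term c. It assumes the standard log-odds mapping for the models of Figure 4.2 and the random model; the function name `log_odds_parameters` and all numeric values are hypothetical.

```python
import math

def log_odds_parameters(p, q, delta, eps, tau, eta):
    """Return (s, d, e, c): substitution scores, gap-open, gap-extend, end correction."""
    # Every match step carries the M -> M transition relative to two random steps.
    shift = math.log((1 - 2 * delta - tau) / (1 - eta) ** 2)
    s = {(a, b): math.log(pab / (q[a] * q[b])) + shift for (a, b), pab in p.items()}
    # The gap-open cost also absorbs the X -> M (or Y -> M) return transition.
    d = -math.log(delta * (1 - eps - tau) / ((1 - eta) * (1 - 2 * delta - tau)))
    e = -math.log(eps / (1 - eta))
    # Compensating term for paths that end in X or Y and never return to M.
    c = math.log(1 - 2 * delta - tau) - math.log(1 - eps - tau)
    return s, d, e, c

# Hypothetical parameters, for illustration only.
p = {("a", "a"): 0.4, ("b", "b"): 0.4, ("a", "b"): 0.1, ("b", "a"): 0.1}
q = {"a": 0.5, "b": 0.5}
s, d, e, c = log_odds_parameters(p, q, delta=0.2, eps=0.3, tau=0.1, eta=0.1)
print(s["a", "a"], d, e, c)
```

A quick consistency check: for the state path M, X, X, M the sum 2 s(a, a) - d - e equals the log ratio of the match-model and random-model probabilities of that path, with the end transitions left out on both sides.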
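Pair HMMs (6)
Algorithm: Optimal log-odds alignment
Initialization: V^M(0, 0) = -2 log η, V^X(0, 0) = V^Y(0, 0) = -∞; all V^•(i, -1) and V^•(-1, j) are set to -∞.
Recurrence: for i = 0,…,n, j = 0,…,m, except (0, 0):
  V^M(i, j) = s(x_i, y_j) + max{ V^M(i-1, j-1), V^X(i-1, j-1), V^Y(i-1, j-1) }
  V^X(i, j) = max{ V^M(i-1, j) - d, V^X(i-1, j) - e }
  V^Y(i, j) = max{ V^M(i, j-1) - d, V^Y(i, j-1) - e }
Termination: V = max{ V^M(n, m), V^X(n, m) + c, V^Y(n, m) + c }, where c = log(1-2δ-τ) - log(1-ε-τ) is the last compensating term.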
Pair HMMs (7)
Pair HMM for local alignment (Figure 4.3)
The full probability of x and y (1)
- Summation over all alignments: P(x, y) = Σ_π P(x, y, π)
- The forward algorithm computes this sum: P(x, y) = f^E(n, m)
- The posterior distribution over alignments then follows: P(π | x, y) = P(x, y, π) / P(x, y)
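The full probability of x and y (2)
Algorithm: Forward calculation for pair HMMs
Initialization: f^M(0, 0) = 1, f^X(0, 0) = f^Y(0, 0) = 0; all f^•(i, -1) and f^•(-1, j) are set to 0.
Recurrence: for i = 0,…,n, j = 0,…,m, except (0, 0):
  f^M(i, j) = p_{x_i y_j} [ (1-2δ-τ) f^M(i-1, j-1) + (1-ε-τ) ( f^X(i-1, j-1) + f^Y(i-1, j-1) ) ]
  f^X(i, j) = q_{x_i} [ δ f^M(i-1, j) + ε f^X(i-1, j) ]
  f^Y(i, j) = q_{y_j} [ δ f^M(i, j-1) + ε f^Y(i, j-1) ]
Termination: f^E(n, m) = τ [ f^M(n, m) + f^X(n, m) + f^Y(n, m) ]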
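The forward calculation differs from Viterbi only in replacing each max with a sum. A minimal Python sketch, assuming the transition structure of Figure 4.2; the function name `pair_hmm_forward` and the example parameters are hypothetical:

```python
def pair_hmm_forward(x, y, p, q, delta, eps, tau):
    """Return P(x, y) = f^E(n, m), summed over all alignments of x and y."""
    n, m = len(x), len(y)
    fM = [[0.0] * (m + 1) for _ in range(n + 1)]
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]
    fM[0][0] = 1.0  # Begin is treated like state M
    for i in range(n + 1):
        for j in range(m + 1):
            # At (0, 0) none of the branches below fires, so the initialization is kept.
            if i > 0 and j > 0:
                fM[i][j] = p[x[i - 1], y[j - 1]] * (
                    (1 - 2 * delta - tau) * fM[i - 1][j - 1]
                    + (1 - eps - tau) * (fX[i - 1][j - 1] + fY[i - 1][j - 1]))
            if i > 0:
                fX[i][j] = q[x[i - 1]] * (delta * fM[i - 1][j] + eps * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q[y[j - 1]] * (delta * fM[i][j - 1] + eps * fY[i][j - 1])
    return tau * (fM[n][m] + fX[n][m] + fY[n][m])

# Hypothetical parameters, for illustration only.
p = {("a", "a"): 0.4, ("b", "b"): 0.4, ("a", "b"): 0.1, ("b", "a"): 0.1}
q = {"a": 0.5, "b": 0.5}
print(pair_hmm_forward("ab", "ab", p, q, delta=0.2, eps=0.3, tau=0.1))
```

With these values, three state paths contribute for x = y = "ab": MM, XMY and YMX. The full probability is therefore slightly larger than the Viterbi probability of MM alone.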
Suboptimal alignment (1)
Types of suboptimal alignment
- Slightly different from the optimal alignment in a few positions
- Substantially or completely different
  - Repeats in one or both of the sequences
Suboptimal alignment (2)
Probabilistic sampling of alignments
- Sample alignments from the posterior distribution P(π | x, y)
- Trace back through the forward values f^k(i, j), choosing each previous state at random in proportion to its contribution
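The sampling traceback can be sketched as follows, assuming the pair HMM of Figure 4.2. At each cell the previous state is drawn with probability proportional to (transition probability) × (forward value); the emission factor is common to all candidates and cancels. The function names and parameter values are hypothetical.

```python
import random

def forward_tables(x, y, p, q, delta, eps, tau):
    """Fill f^M, f^X, f^Y with the pair HMM forward recurrence."""
    n, m = len(x), len(y)
    f = {k: [[0.0] * (m + 1) for _ in range(n + 1)] for k in "MXY"}
    f["M"][0][0] = 1.0  # Begin is treated like state M
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                f["M"][i][j] = p[x[i - 1], y[j - 1]] * (
                    (1 - 2 * delta - tau) * f["M"][i - 1][j - 1]
                    + (1 - eps - tau) * (f["X"][i - 1][j - 1] + f["Y"][i - 1][j - 1]))
            if i > 0:
                f["X"][i][j] = q[x[i - 1]] * (delta * f["M"][i - 1][j] + eps * f["X"][i - 1][j])
            if j > 0:
                f["Y"][i][j] = q[y[j - 1]] * (delta * f["M"][i][j - 1] + eps * f["Y"][i][j - 1])
    return f

def sample_state_path(x, y, p, q, delta, eps, tau, rng):
    """Draw one state path from the posterior P(path | x, y)."""
    f = forward_tables(x, y, p, q, delta, eps, tau)
    i, j = len(x), len(y)
    # Each state moves to End with the same probability tau, so it cancels here.
    k = rng.choices("MXY", weights=[f[s][i][j] for s in "MXY"])[0]
    path = []
    while (i, j) != (0, 0):
        path.append(k)
        if k == "M":
            i, j = i - 1, j - 1
            w = [(1 - 2 * delta - tau) * f["M"][i][j],
                 (1 - eps - tau) * f["X"][i][j],
                 (1 - eps - tau) * f["Y"][i][j]]
            k = rng.choices("MXY", weights=w)[0]
        elif k == "X":
            i -= 1
            k = rng.choices("MX", weights=[delta * f["M"][i][j], eps * f["X"][i][j]])[0]
        else:
            j -= 1
            k = rng.choices("MY", weights=[delta * f["M"][i][j], eps * f["Y"][i][j]])[0]
    return "".join(reversed(path))

# Hypothetical parameters, for illustration only.
p = {("a", "a"): 0.4, ("b", "b"): 0.4, ("a", "b"): 0.1, ("b", "a"): 0.1}
q = {"a": 0.5, "b": 0.5}
rng = random.Random(0)
print([sample_state_path("ab", "ab", p, q, 0.2, 0.3, 0.1, rng) for _ in range(5)])
```

Repeated calls give a set of alignments whose frequencies approximate their posterior probabilities, so well-supported alternatives appear alongside the optimal path.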
Suboptimal alignment (3)
Finding distinct suboptimal alignments
- Waterman & Eggert [1987]
- Find the next best alignment
- It shares no aligned residue pairs with any previously determined alignment
Figure 4.5
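Posterior that x_i is aligned to y_j (1)
- A reliability measure for each part of an alignment
- Quantity of interest: P(x_i ◇ y_j | x, y) = P(x, y, x_i ◇ y_j) / P(x, y)
- P(x, y, x_i ◇ y_j) = P(x_{1…i}, y_{1…j}, x_i ◇ y_j) · P(x_{i+1…n}, y_{j+1…m} | x_i ◇ y_j)
  - First factor: forward algorithm, f^M(i, j)
  - Second factor: backward algorithm, b^M(i, j)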
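Posterior that x_i is aligned to y_j (2)
Algorithm: Backward calculation for pair HMMs
Initialization: b^M(n, m) = b^X(n, m) = b^Y(n, m) = τ; all b^•(i, m+1) and b^•(n+1, j) are set to 0.
Recurrence: for i = n,…,1, j = m,…,1, except (n, m):
  b^M(i, j) = (1-2δ-τ) p_{x_{i+1} y_{j+1}} b^M(i+1, j+1) + δ [ q_{x_{i+1}} b^X(i+1, j) + q_{y_{j+1}} b^Y(i, j+1) ]
  b^X(i, j) = (1-ε-τ) p_{x_{i+1} y_{j+1}} b^M(i+1, j+1) + ε q_{x_{i+1}} b^X(i+1, j)
  b^Y(i, j) = (1-ε-τ) p_{x_{i+1} y_{j+1}} b^M(i+1, j+1) + ε q_{y_{j+1}} b^Y(i, j+1)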
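Combining forward and backward values gives the posterior match probabilities P(x_i ◇ y_j | x, y) = f^M(i, j) b^M(i, j) / P(x, y). The sketch below assumes the pair HMM of Figure 4.2; the function name `match_posteriors`, the 1-based dictionary keys, and the parameter values are hypothetical.

```python
def match_posteriors(x, y, p, q, delta, eps, tau):
    """Return ({(i, j): P(x_i aligned to y_j | x, y)}, P(x, y)) with 1-based i, j."""
    n, m = len(x), len(y)
    zeros = lambda: [[0.0] * (m + 1) for _ in range(n + 1)]
    mm, gm = 1 - 2 * delta - tau, 1 - eps - tau  # M -> M and X/Y -> M transitions
    # Forward pass (only f^M is needed for the posterior, but f^X, f^Y feed into it).
    fM, fX, fY = zeros(), zeros(), zeros()
    fM[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                fM[i][j] = p[x[i - 1], y[j - 1]] * (
                    mm * fM[i - 1][j - 1] + gm * (fX[i - 1][j - 1] + fY[i - 1][j - 1]))
            if i > 0:
                fX[i][j] = q[x[i - 1]] * (delta * fM[i - 1][j] + eps * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q[y[j - 1]] * (delta * fM[i][j - 1] + eps * fY[i][j - 1])
    pxy = tau * (fM[n][m] + fX[n][m] + fY[n][m])
    # Backward pass; x[i] is x_{i+1} in the slide's 1-based notation.
    bM, bX, bY = zeros(), zeros(), zeros()
    bM[n][m] = bX[n][m] = bY[n][m] = tau
    for i in range(n, 0, -1):
        for j in range(m, 0, -1):
            if (i, j) == (n, m):
                continue
            match = p[x[i], y[j]] * bM[i + 1][j + 1] if i < n and j < m else 0.0
            gap_x = q[x[i]] * bX[i + 1][j] if i < n else 0.0
            gap_y = q[y[j]] * bY[i][j + 1] if j < m else 0.0
            bM[i][j] = mm * match + delta * (gap_x + gap_y)
            bX[i][j] = gm * match + eps * gap_x
            bY[i][j] = gm * match + eps * gap_y
    post = {(i, j): fM[i][j] * bM[i][j] / pxy
            for i in range(1, n + 1) for j in range(1, m + 1)}
    return post, pxy

# Hypothetical parameters, for illustration only.
p = {("a", "a"): 0.4, ("b", "b"): 0.4, ("a", "b"): 0.1, ("b", "a"): 0.1}
q = {"a": 0.5, "b": 0.5}
post, pxy = match_posteriors("ab", "ab", p, q, delta=0.2, eps=0.3, tau=0.1)
print(post)
```

The explicit bounds checks play the role of the zero boundary values b^•(i, m+1) and b^•(n+1, j) in the initialization.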
Posterior that x_i is aligned to y_j (3)
The expected accuracy of an alignment
- Expected overlap between π and paths sampled from the posterior distribution
- The alignment with the highest expected accuracy is found by dynamic programming
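One common way to carry out this dynamic programming is to score an alignment by the sum of the posterior match probabilities of its aligned pairs and to maximize that sum with a gap-cost-free recursion A(i, j) = max{ A(i-1, j-1) + P(x_i ◇ y_j), A(i-1, j), A(i, j-1) }. The sketch below assumes that formulation; the function name `max_expected_accuracy` and the posterior values in the example are hypothetical.

```python
def max_expected_accuracy(post):
    """post[i][j]: posterior that x_(i+1) is aligned to y_(j+1) (0-based n x m table).

    Returns (best expected accuracy, list of aligned pairs with 1-based indices).
    """
    n = len(post)
    m = len(post[0]) if post else 0
    A = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            A[i][j] = max(A[i - 1][j - 1] + post[i - 1][j - 1],  # align x_i with y_j
                          A[i - 1][j],                           # leave x_i unaligned
                          A[i][j - 1])                           # leave y_j unaligned
    # Trace back to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if A[i][j] == A[i - 1][j - 1] + post[i - 1][j - 1]:
            pairs.append((i, j))
            i, j = i - 1, j - 1
        elif A[i][j] == A[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return A[n][m], pairs[::-1]

# Hypothetical posterior table, for illustration only.
score, pairs = max_expected_accuracy([[0.9, 0.05], [0.02, 0.8]])
print(score, pairs)
```

No gap penalties appear because the posterior probabilities already reflect the cost of gaps in the underlying pair HMM.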
Pair HMMs vs FSAs for searching (1)
Two difficulties of conventional methods in searching
- They are not probabilistic models
- They cannot compute the full probability P(x, y | M)
Example model: a state S with emission probabilities q_a and a chain of B states emitting the sequence "abac" with probability 1; transition labels α, 1-α and 1.
- P_S(abac) = α^4 q_a q_b q_a q_c
- P_B(abac) = 1-α
- Model comparison uses the best match rather than the total probability
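A small numeric illustration of the two path probabilities on this slide, using the formulas P_S(abac) = α^4 q_a q_b q_a q_c and P_B(abac) = 1-α as given. The values of α and q, and the function name `abac_path_probabilities`, are hypothetical.

```python
import math

def abac_path_probabilities(q, alpha):
    """Return (P_S, P_B) for the sequence 'abac' in the example model."""
    p_s = alpha ** 4 * math.prod(q[a] for a in "abac")  # path through state S
    p_b = 1 - alpha                                     # path through the B states
    return p_s, p_b

# Hypothetical background frequencies and alpha, for illustration only.
q = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
p_s, p_b = abac_path_probabilities(q, alpha=0.9)
best_path = max(p_s, p_b)  # what a best-match (Viterbi-style) comparison uses
total = p_s + p_b          # what the full probability (forward-style) uses
print(p_s, p_b, best_path, total)
```

A best-match score keeps only the larger of the two terms, while the full probability adds them, so the two criteria can rank sequences differently.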
Pair HMMs vs FSAs for searching (2)
Converting an FSA into a probabilistic model
- Probabilistic models may underperform standard alignment methods if the Viterbi algorithm is used for database searching.
- But if the forward algorithm is used, they would perform better than standard methods.