# Bayesian Evolutionary Distance

P. Agarwal and D.J. States. Bayesian evolutionary distance. Journal of Computational Biology 3(1):1-17, 1996.


Determining time of divergence

Goal: Determine when two aligned sequences X and Y diverged from a common ancestor

    AGTTGAC
    ACTTGCC

Model:
- Mutation only
- Independence
- Markov process

Divergence points have different probabilities

[Figure: sequences X and Y descending from a common ancestor; axes: probability, time.]

DNA PAM matrices
- Similar to Dayhoff PAM matrices
- PAM 1 corresponds to 1% mutation
  - 1% change ≈ 10 million years
- Simplification: uniform mutation rates among nucleotides:
  - m_ij = α if i = j
  - m_ij = β if i ≠ j
- Can modify to handle different transition/transversion rates
  - Transitions (A ↔ G or C ↔ T) have higher probability than transversions
- PAM x = (PAM 1)^x
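The relation PAM x = (PAM 1)^x can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' code: the function names `pam1`, `mat_mul`, and `pam` are hypothetical, and splitting the 1% mutation probability evenly over the three possible substitutions (0.01/3 each) is an assumption based on the uniform-rate simplification.

```python
# Hypothetical helper names; uniform-rate model assumed (not the paper's code).
N = 4  # nucleotides A, C, G, T


def pam1(rate=0.01):
    """PAM 1 under uniform rates: 1 - rate on the diagonal, rate / 3 elsewhere."""
    return [[1 - rate if i == j else rate / 3 for j in range(N)] for i in range(N)]


def mat_mul(a, b):
    """Plain 4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def pam(x, rate=0.01):
    """PAM x = (PAM 1)^x for a positive integer x, by repeated multiplication."""
    base = pam1(rate)
    result = base
    for _ in range(x - 1):
        result = mat_mul(result, base)
    return result


m = pam(10)
# Each row is still a probability distribution after taking powers.
print("row sum:", sum(m[0]))
print("match probability:", m[0][0], "mismatch probability:", m[0][1])
```

Repeated multiplication is used here only for clarity; for large x, exponentiation by squaring would need fewer products.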

DNA PAM 1

[Figure: transition diagram over the nucleotides A, C, G, T; self-transitions labelled 0.99, substitutions labelled 0.33.]

DNA PAM x

[Figure: the same transition diagram for PAM x; self-transitions labelled α(x), substitutions labelled β(x).]

DNA PAM x
- As x → ∞, α(x) and β(x) → 1/4
- Assume p_i = 1/4 for i ∈ {A, C, T, G}
- Leads to simple match/mismatch scoring scheme
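One way to see the limit, writing α(x) for the match (diagonal) entry and β(x) for the mismatch (off-diagonal) entry of PAM x, and assuming the uniform-rate PAM 1 has off-diagonal entry β(1) = 0.01/3: every power of such a matrix keeps the same two-value pattern, and its entries have a closed form.

```latex
\alpha(x) = \tfrac14 + \tfrac34\,\bigl(1 - 4\beta(1)\bigr)^{x},
\qquad
\beta(x) = \tfrac14 - \tfrac14\,\bigl(1 - 4\beta(1)\bigr)^{x}
```

Since 0 < 1 − 4β(1) < 1, the power term vanishes as x grows, so both entries tend to 1/4.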

DNA PAM x: Scoring 

DNA PAM

| PAM Dist (x) | Match score (bits) | Mismatch score (bits) |
|---|---|---|
| 1 | 2 | -6 |
| 10 | 1.86 | -3.00 |
| 25 | 1.66 | -1.82 |
| 50 | 1.34 | -1.04 |
| 100 | 0.84 | -0.44 |
| 125 | 0.65 | -0.3 |
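The table entries can be approximately reproduced with a short calculation. The sketch below assumes the uniform-rate model (PAM 1 off-diagonal entries of 0.01/3), a closed form for the PAM x entries that follows from it, and a background frequency of 1/4 per nucleotide; the names `match_prob`, `mismatch_prob`, and `scores_bits` are illustrative only.

```python
import math

RATE = 0.01  # PAM 1: 1% total mutation probability (from the slides)


def match_prob(x):
    """Diagonal entry of PAM x, assuming off-diagonal entries of RATE / 3 in PAM 1."""
    return 0.25 + 0.75 * (1 - 4 * RATE / 3) ** x


def mismatch_prob(x):
    """Off-diagonal entry of PAM x; each row of the matrix sums to 1."""
    return (1 - match_prob(x)) / 3


def scores_bits(x):
    """Log-odds (base 2) match and mismatch scores with background p_i = 1/4."""
    return math.log2(4 * match_prob(x)), math.log2(4 * mismatch_prob(x))


for x in (1, 10, 25, 50, 100, 125):
    match, mismatch = scores_bits(x)
    print(f"PAM {x:>3}: match {match:5.2f}  mismatch {mismatch:6.2f}")
```

The printed values should land close to the table; differences in the last digit are likely down to rounding.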

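DNA PAM x: Scoring

Log-odds score of an alignment of length n with k mismatches, with α(x) and β(x) the match and mismatch entries of PAM x:

$$(n - k)\,\log_2 4\alpha(x) + k\,\log_2 4\beta(x)$$

Odds score of the same alignment:

$$\bigl(4\alpha(x)\bigr)^{n-k}\,\bigl(4\beta(x)\bigr)^{k}$$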
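As a small illustration of these two quantities, the sketch below scores an ungapped alignment from its length n and mismatch count k. The function names are hypothetical, and the PAM x entries come from a closed form that assumes the uniform-rate model with PAM 1 off-diagonal entries of 0.01/3.

```python
import math


def match_prob(x, rate=0.01):
    """Diagonal entry of PAM x under the assumed uniform-rate model."""
    return 0.25 + 0.75 * (1 - 4 * rate / 3) ** x


def log_odds_bits(n, k, x):
    """Log-odds score (bits) of a length-n alignment with k mismatches at PAM x."""
    a = match_prob(x)
    b = (1 - a) / 3  # mismatch probability
    return (n - k) * math.log2(4 * a) + k * math.log2(4 * b)


def odds(n, k, x):
    """Odds score of the same alignment: 2 raised to the log-odds score."""
    a = match_prob(x)
    b = (1 - a) / 3
    return (4 * a) ** (n - k) * (4 * b) ** k


print(log_odds_bits(25, 2, 10), odds(25, 2, 10))
```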

Probability of k mismatches at distance x

Note: Need odds score here, not log-odds!
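A brief sketch of why the odds score is the relevant quantity, assuming independent positions, background frequencies of 1/4, and writing α(x) and β(x) for the match and mismatch entries of PAM x. The probability of one particular aligned pair of length n with k mismatches is then:

```latex
\Pr(\text{alignment} \mid x)
  = \left(\frac{\alpha(x)}{4}\right)^{n-k} \left(\frac{\beta(x)}{4}\right)^{k}
  = \frac{\bigl(4\alpha(x)\bigr)^{n-k}\,\bigl(4\beta(x)\bigr)^{k}}{16^{\,n}}
```

The factor 16^(-n) does not depend on x, so as a function of x this probability is proportional to the odds score. A log-odds score would have to be exponentiated before it could be used this way.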

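Expected evolutionary distance given k mismatches

The expected distance is a conditional expectation of x given k, taken over all distances. By Bayes' Thm, Pr(x | k) is obtained from Pr(k | x), which comes from the odds scores. The remaining terms, Pr(x) and Pr(k), are still open (??).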
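Written out, the reasoning on this slide plausibly takes the following form (a reconstruction from the slide's annotations, not a quotation from the paper):

```latex
E[x \mid k] = \sum_{x} x \,\Pr(x \mid k),
\qquad
\Pr(x \mid k) = \frac{\Pr(k \mid x)\,\Pr(x)}{\Pr(k)}
```

Here Pr(k | x) is supplied by the odds scores, which leaves the prior Pr(x) and the normalizer Pr(k) to be specified.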

Assumptions
- Consider only a finite number of values of x; e.g., 1, 10, 25, 50, etc.
  - In theory, could consider any number of values
- "Flat prior": All values of x are equally likely
  - If M values are considered, Pr(x) = 1/M

Calculating Pr(k) and Pr(x|k)
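With a flat prior over M candidate distances, these two quantities can be worked out as follows (a sketch using the law of total probability; the primed index x' is only a summation variable):

```latex
\Pr(k) = \sum_{x'} \Pr(k \mid x')\,\Pr(x') = \frac{1}{M} \sum_{x'} \Pr(k \mid x'),
\qquad
\Pr(x \mid k) = \frac{\Pr(k \mid x)}{\sum_{x'} \Pr(k \mid x')}
```

The 1/M cancels, so only ratios of Pr(k | x) matter, and any factor that does not depend on x can be dropped.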

Calculating the distance

Pr(x | k): fraction of the probability of k mismatches that comes from assuming the distance is x
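The whole calculation can be sketched end to end under the stated assumptions: a finite grid of distances, a flat prior, and odds scores standing in for Pr(k | x) up to a constant. The function names and the example grid are illustrative, and the PAM x entries assume the uniform-rate model with PAM 1 off-diagonal entries of 0.01/3.

```python
def match_prob(x, rate=0.01):
    """Diagonal entry of PAM x under the assumed uniform-rate model."""
    return 0.25 + 0.75 * (1 - 4 * rate / 3) ** x


def odds(n, k, x):
    """Odds score of a length-n ungapped alignment with k mismatches at PAM x."""
    a = match_prob(x)
    b = (1 - a) / 3  # mismatch probability
    return (4 * a) ** (n - k) * (4 * b) ** k


def posterior(n, k, distances):
    """Pr(x | k) over a finite grid of distances with a flat prior Pr(x) = 1/M."""
    weights = {x: odds(n, k, x) for x in distances}
    total = sum(weights.values())  # proportional to Pr(k); the 1/M cancels
    return {x: w / total for x, w in weights.items()}


def expected_distance(n, k, distances):
    """Conditional expectation E[x | k] = sum over x of x * Pr(x | k)."""
    return sum(x * p for x, p in posterior(n, k, distances).items())


GRID = (1, 10, 25, 50, 100, 125)  # example grid; any finite set of distances works
print(posterior(25, 2, GRID))
print(expected_distance(25, 2, GRID))
```

Because the result is a weighted average over the grid, it depends on which distances are included; a finer grid gives a finer-grained estimate.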

Ungapped local alignments
- Only matches and mismatches, no gaps
- An ungapped local alignment of sequences X and Y is a pair of equal-length substrings of X and Y

Ungapped local alignments

[Figure: two example ungapped local alignments. A: 23 matches, 2 mismatches. B: 34 matches, 11 mismatches.]

Which alignment is better?

Answer depends on evolutionary distance
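The dependence on distance can be checked numerically. The sketch below scores a short, clean alignment (23 matches, 2 mismatches) against a longer, noisier one (34 matches, 11 mismatches) at several PAM distances. The function name is hypothetical, and the scores assume the uniform-rate model with PAM 1 off-diagonal entries of 0.01/3 and background frequencies of 1/4.

```python
import math


def log_odds_bits(matches, mismatches, x, rate=0.01):
    """Log-odds score (bits) at PAM x under the assumed uniform-rate model."""
    a = 0.25 + 0.75 * (1 - 4 * rate / 3) ** x  # match probability
    b = (1 - a) / 3                            # mismatch probability
    return matches * math.log2(4 * a) + mismatches * math.log2(4 * b)


# Match/mismatch counts of the two example alignments.
short_clean = (23, 2)
long_noisy = (34, 11)

for x in (1, 10, 25, 50, 100):
    s1 = log_odds_bits(*short_clean, x)
    s2 = log_odds_bits(*long_noisy, x)
    winner = "short/clean" if s1 > s2 else "long/noisy"
    print(f"PAM {x:>3}: {s1:6.1f} vs {s2:6.1f} -> {winner}")
```

Under these assumptions the shorter, cleaner alignment scores higher at small distances, while the longer, noisier one overtakes it at larger distances.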
