Published by Garrison Zachry. Modified about 1 year ago.

1
Computational Molecular Biology
Biochem 218 – BioMedical Informatics
Doug Brutlag, Professor Emeritus, Biochemistry & Medicine (by courtesy)
Phylogenies

2
Cladogram Representation of Phylogenies
[Figure: cladogram with leaves labeled A to J and L to Q; time axis in years]

3
Dendrogram Representation of Phylogenies
[Figure: GrowTree phylogram, February 1, 2010; axis: substitutions per 100 residues]

4
Cladogram

5
Phenogram

6
Curve-O-Gram

7
Eurogram

8
RadialGram

9
Methods for Determining Phylogenies
Parsimony (character based)
  o Assigns mutations to branches
  o Minimizes the number of edits
  o Topology maximizes similarity of neighboring leaves
Distance methods
  o Branch lengths = D(i,j)/2 for sequences i, j
  o Distances must be at least metric
  o Distances can reflect time or edits
  o Distance must be relatively constant per unit branch length

10
Methods for Determining Phylogenies
Parsimony
  o Minimum mutation (Fitch, PAUP)
  o Minimal length encoding
Probabilistic
  o Branch and Bound
  o Maximum likelihood
Distance methods
  o Ultrametric Trees
  o Additive Trees
  o UPGMA
  o Neighbor Joining
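To make the "minimum mutation" idea concrete, here is a minimal sketch of Fitch's counting pass for one character on a fixed rooted binary tree. The nested-tuple tree encoding and the function name are assumptions made for illustration; they are not taken from the slides or from PAUP.

```python
def fitch_score(tree, leaf_states):
    """Return (candidate_states, mutation_count) for one character.

    tree is either a leaf name (str) or a 2-tuple (left, right);
    leaf_states maps each leaf name to its observed state.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], leaf_states)
    right_set, right_cost = fitch_score(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra mutation is needed.
        return common, left_cost + right_cost
    # The children disagree: keep both options and count one mutation.
    return left_set | right_set, left_cost + right_cost + 1


states = {"A": "G", "B": "G", "C": "T", "D": "T"}
grouped = (("A", "B"), ("C", "D"))   # similar leaves are neighbors
mixed = (("A", "C"), ("B", "D"))     # similar leaves are split apart
print(fitch_score(grouped, states)[1], fitch_score(mixed, states)[1])
```

The topology that keeps similar leaves together needs fewer mutations, which is the quantity parsimony methods minimize over candidate topologies.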

11
Properties of Trees
Rooted or Unrooted
Nodes and Branches
  o Internal Nodes
  o External Nodes (leaves)
Operational Taxonomic Units
Outgroups
Topology
One path per pair of nodes
Distances
[Figure: example rooted and unrooted trees over nodes X, Y, Z and root R, with labeled branch lengths]

12
Orthologous Evolution

13
Paralogous Evolution: hemoglobins

14
Challenges Making Trees: Gene Duplication versus Speciation

15
Orthology and Paralogy
[Figure: gene tree with leaves HB Human, WB Worm, HA1 Human, HA2 Human, Yeast, WA Worm]
Orthologs: Derived by speciation
Paralogs: Gene Duplications
Thanks to Serafim Batzoglou

16
Challenges Making Trees: Gene Conversion
[Figure: sequence diagram; residues as extracted: A A T C G C G A T A G C A T C A A T T C C C T C]
Thanks to Maryellen Ruvolo

17
Challenges Making Trees: Gene Conversion
[Figure: sequence diagram after conversion; residues as extracted: A A T C G C G A T A G C A T C C G C G A T C T C A T C A A T T C C C T C]
Thanks to Maryellen Ruvolo

18
Challenges Making Trees: Gene Conversion
[Figure: trees for Gene N and Gene M across species A, B, C, D]
Orthologs: Derived by speciation
Paralogs: Gene Duplications
Thanks to Maryellen Ruvolo

19
Challenges Making Trees: C M Has Been Converted from C N
[Figure: tree with leaves in the order A N, B N, C M, C N, D N, A M, B M, D M; clades labeled Gene N and Gene M]
Orthologs: Derived by speciation
Paralogs: Gene Duplications
Thanks to Maryellen Ruvolo

20
Consensus CG/LH Tree Thanks to Maryellen Ruvolo

21
Gene conversion between 1st & 2nd exons of LH, CG2 Genes
[Figure: LH Gene and CG2 Gene; segment lengths 168 nt and 15 nt; regions labeled "No Conversion" and "Conversion"]
Thanks to Maryellen Ruvolo

22
Challenges Making Trees: Varying Rates of Mutation

23
Challenges Making Trees: Horizontal Gene Transfer

24
Maximum Ultrametric Distance Trees
Matrix D is ultrametric for tree T if:
  o D is a symmetric n by n matrix of distances
  o T contains n leaves, one for each row or column of D
  o Each node of T is labeled by one entry from D
  o Numbers from root to leaves strictly decrease
  o For any two leaves i, j, D(i,j) labels the nearest common ancestor of i and j in T
[Figure: Matrix D and Tree T]

25
Maximum Ultrametric Distance Trees
A symmetric matrix D is ultrametric if and only if, for every three leaves i, j, and k, there is a tie for the maximum among D(i,j), D(i,k) and D(j,k).
[Figure: tree with node labels U, V, I, J, K]
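The three-leaf condition lends itself to a direct check. The sketch below assumes D is a symmetric list-of-lists matrix; the function name and the tolerance parameter are illustrative choices, not part of the slides.

```python
from itertools import combinations


def is_ultrametric(D, tol=1e-9):
    """Check that every leaf triple has a tie for its largest distance."""
    n = len(D)
    for i, j, k in combinations(range(n), 3):
        # Sort the three pairwise distances; the two largest must be equal.
        _, middle, largest = sorted((D[i][j], D[i][k], D[j][k]))
        if abs(largest - middle) > tol:
            return False
    return True


clock_like = [[0, 2, 6],
              [2, 0, 6],
              [6, 6, 0]]
not_clock_like = [[0, 2, 5],
                  [2, 0, 6],
                  [5, 6, 0]]
print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```

The check visits every triple, so it costs on the order of n cubed comparisons, which is fine for the small matrices used in teaching examples.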

26
Additive Distance Trees
[Figure: Matrix D and Tree T with leaves A, B, C, D]

27
Distance Metrics Obey the Triangle Inequality
D(i,j) ≤ D(i,k) + D(j,k) for all i, j, k
(Max Score - Smith-Waterman Score) is a metric if:
  o Gap-penalty ≥ 1 + Gap-size/(n-1)
  o Assuming match = 1 and mismatch = -1
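As a sketch, the triangle inequality can be tested exhaustively on a small distance matrix. The function name and tolerance are assumptions for illustration; the check does not cover the other metric axioms (symmetry, zero diagonal).

```python
from itertools import permutations


def satisfies_triangle_inequality(D, tol=1e-9):
    """Return True if D(i,j) <= D(i,k) + D(j,k) for all distinct i, j, k."""
    n = len(D)
    return all(
        D[i][j] <= D[i][k] + D[j][k] + tol
        for i, j, k in permutations(range(n), 3)
    )


metric_like = [[0, 1, 2],
               [1, 0, 1],
               [2, 1, 0]]
shortcut = [[0, 1, 5],
            [1, 0, 1],
            [5, 1, 0]]   # going 0 -> 1 -> 2 costs 2, less than the direct 5
print(satisfies_triangle_inequality(metric_like),
      satisfies_triangle_inequality(shortcut))
```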

28
Three Leaf Tree
Observe: D1,2  D1,3  D2,3
Calculate: L1,A  L2,A  L3,A
[Figure: leaves 1, 2, 3 joined at internal node A by branches L1,A, L2,A, L3,A]

29
Three Leaf Tree
Observe: D1,2  D1,3  D2,3
Calculate: L1,A  L2,A  L3,A
D1,2 = L1,A + L2,A
D1,3 = L1,A + L3,A
D2,3 = L2,A + L3,A
[Figure: leaves 1, 2, 3 joined at internal node A by branches L1,A, L2,A, L3,A]

30
Solution to Three Species Tree
[Figure: leaves 1, 2, 3 joined at internal node A by branches L1,A, L2,A, L3,A]
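The slide's own equations are not preserved in this transcript. Solving the three-leaf system D1,2 = L1,A + L2,A, D1,3 = L1,A + L3,A, D2,3 = L2,A + L3,A by adding two equations and subtracting the third gives the closed form sketched below; the function name is an illustrative choice.

```python
def three_leaf_branches(d12, d13, d23):
    """Solve the three-leaf system for the branch lengths to node A."""
    l1a = (d12 + d13 - d23) / 2   # (D1,2 + D1,3 - D2,3) / 2
    l2a = (d12 + d23 - d13) / 2   # (D1,2 + D2,3 - D1,3) / 2
    l3a = (d13 + d23 - d12) / 2   # (D1,3 + D2,3 - D1,2) / 2
    return l1a, l2a, l3a


# Distances generated from branch lengths 1, 2 and 3.
print(three_leaf_branches(d12=3, d13=4, d23=5))
```

Each branch length is non-negative exactly when the three distances satisfy the triangle inequality.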

31
Four Species Tree
Observe: D1,2  D1,3  D1,4  D2,3  D2,4  D3,4
Calculate: L1,A  L2,A  L3,B  L4,B  LA,B
[Figure: leaves 1 and 2 attached to node A, leaves 3 and 4 attached to node B, with A and B joined by branch LA,B]

32
Four Species Topology
Label species 1, 2, 3, and 4 so that:
D(1,2) + D(3,4) ≤ D(1,3) + D(2,4) = D(1,4) + D(2,3)
[Figure: leaves 1 and 2 attached to node A, leaves 3 and 4 attached to node B]
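One way to apply this labeling rule is to compute the three pairwise sums and pick the smallest. The sketch below assumes a symmetric 4 by 4 matrix with 0-based indices; the function name, return shape, and tolerance are illustrative.

```python
def four_point_pairing(D, tol=1e-9):
    """Return (pairing, is_additive) for four species indexed 0..3.

    pairing is the split ((a, b), (c, d)) with the smallest D[a][b] + D[c][d];
    is_additive reports whether the two larger sums tie, as the rule requires.
    """
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    scored = sorted(
        (D[a][b] + D[c][d], ((a, b), (c, d))) for (a, b), (c, d) in pairings
    )
    smallest, middle, largest = (score for score, _ in scored)
    is_additive = abs(largest - middle) <= tol and smallest <= middle + tol
    return scored[0][1], is_additive


# Distances generated from a tree with branch lengths
# L1,A = 1, L2,A = 2, LA,B = 3, L3,B = 4, L4,B = 5.
tree_like = [[0, 3, 8, 9],
             [3, 0, 9, 10],
             [8, 9, 0, 9],
             [9, 10, 9, 0]]
print(four_point_pairing(tree_like))
```

Species in the same pair of the returned split are the ones that share an internal node.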

33
Solution for Four Species
L1,A = 1/4*(D1,3 + D1,4 - D2,3 - D2,4) + 1/2*D1,2
L2,A = 1/4*(D2,3 + D2,4 - D1,3 - D1,4) + 1/2*D1,2
L3,B = 1/4*(D1,3 + D2,3 - D1,4 - D2,4) + 1/2*D3,4
L4,B = 1/4*(D1,4 + D2,4 - D1,3 - D2,3) + 1/2*D3,4
LA,B = 1/4*(D1,3 + D1,4 + D2,3 + D2,4) - 1/2*(D1,2 + D3,4)
[Figure: leaves 1 and 2 attached to node A, leaves 3 and 4 attached to node B]
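The five formulas above can be transcribed almost literally. This is a minimal sketch that assumes the species are already labeled so that 1 and 2 share node A and 3 and 4 share node B; the function name and the dictionary keys are illustrative.

```python
def four_species_branches(D):
    """Branch lengths for the topology ((1,2),(3,4)) from a 4x4 matrix D."""
    def d(i, j):
        # Translate the slides' 1-based species labels to 0-based indices.
        return D[i - 1][j - 1]

    return {
        "L1,A": (d(1, 3) + d(1, 4) - d(2, 3) - d(2, 4)) / 4 + d(1, 2) / 2,
        "L2,A": (d(2, 3) + d(2, 4) - d(1, 3) - d(1, 4)) / 4 + d(1, 2) / 2,
        "L3,B": (d(1, 3) + d(2, 3) - d(1, 4) - d(2, 4)) / 4 + d(3, 4) / 2,
        "L4,B": (d(1, 4) + d(2, 4) - d(1, 3) - d(2, 3)) / 4 + d(3, 4) / 2,
        "LA,B": (d(1, 3) + d(1, 4) + d(2, 3) + d(2, 4)) / 4
                - (d(1, 2) + d(3, 4)) / 2,
    }


# Distances generated from branch lengths 1, 2, 4, 5 and an internal branch of 3.
example = [[0, 3, 8, 9],
           [3, 0, 9, 10],
           [8, 9, 0, 9],
           [9, 10, 9, 0]]
print(four_species_branches(example))
```

With distances that fit a tree exactly, the formulas return the generating branch lengths; with noisy distances they act as averages over the redundant measurements.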

34
Four Species => Three Topologies
[Figure: the three possible unrooted topologies for species 1, 2, 3, 4 with internal nodes A and B]

35
Species, Distances, Branches & Topologies

36

37
Number of Topologies for n Species
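The slide's table is not preserved in this transcript. As a sketch, the standard counts for strictly bifurcating trees with n labeled leaves are (2n-5)!! unrooted topologies and (2n-3)!! rooted ones, which agrees with the three topologies shown for four species. The function names below are illustrative.

```python
def odd_double_factorial(m):
    """Product m * (m - 2) * ... * 3 * 1 for odd m >= 1."""
    result = 1
    for factor in range(m, 1, -2):
        result *= factor
    return result


def unrooted_topologies(n):
    """Number of unrooted bifurcating topologies for n >= 3 labeled leaves."""
    if n < 3:
        raise ValueError("need at least 3 species")
    return odd_double_factorial(2 * n - 5)


def rooted_topologies(n):
    """Number of rooted bifurcating topologies for n >= 2 labeled leaves."""
    if n < 2:
        raise ValueError("need at least 2 species")
    return odd_double_factorial(2 * n - 3)


for n in (3, 4, 5, 10):
    print(n, unrooted_topologies(n), rooted_topologies(n))
```

The counts grow faster than exponentially, which is why exhaustive topology search is only practical for a handful of species.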

38
UPGMA: Unweighted Pair Group Method with Arithmetic Average
Where D1,(34) = (D1,3 + D1,4)/2 and D2,(34) = (D2,3 + D2,4)/2
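A compact sketch of UPGMA clustering follows. It assumes a symmetric list-of-lists matrix and returns nested tuples (left, right, height); those representation choices and the function name are illustrative. When merged clusters hold more than one sequence, the average is weighted by cluster size; the slide's formula is the case where both merged clusters hold a single sequence.

```python
def upgma(D, names):
    """Cluster by UPGMA; return a nested tuple (left, right, node_height)."""
    n = len(names)
    # Active clusters: id -> (subtree, number of leaves in the subtree).
    clusters = {i: (names[i], 1) for i in range(n)}
    dist = {frozenset((i, j)): D[i][j]
            for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Merge the closest pair; the new node sits at half their distance.
        pair = min(dist, key=dist.get)
        a, b = sorted(pair)
        height = dist[pair] / 2
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        # Size-weighted average distance to every remaining cluster.
        merged = {
            c: (size_a * dist[frozenset((a, c))]
                + size_b * dist[frozenset((b, c))]) / (size_a + size_b)
            for c in clusters
        }
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        for c, value in merged.items():
            dist[frozenset((next_id, c))] = value
        clusters[next_id] = ((tree_a, tree_b, height), size_a + size_b)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree


D = [[0, 4, 8, 8],
     [4, 0, 8, 8],
     [8, 8, 0, 2],
     [8, 8, 2, 0]]
print(upgma(D, ["1", "2", "3", "4"]))
```

In this example species 3 and 4 merge first, so the first update is exactly D1,(34) = (D1,3 + D1,4)/2 and D2,(34) = (D2,3 + D2,4)/2.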

39
UPGMA Dendrogram

40
UPGMA Clustering

41
Neighbor Joining Method
[Figure: joining of leaves i and j through new nodes X and Y; diagram labels n, n-1, n-2, n-3, n-4]
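The slide's diagram is not recoverable from the transcript, so the following is a sketch of the commonly published neighbor joining procedure (Saitou and Nei, using the Q criterion), not necessarily the exact formulation on the slide. The edge-list output, the function name, and the generated internal node names N0, N1, ... are assumptions; leaf names should not collide with them.

```python
def neighbor_joining(D, names):
    """Return edges (node, new_internal_node, length) of an unrooted tree."""
    nodes = list(names)
    dist = {a: {b: D[i][j] for j, b in enumerate(names)}
            for i, a in enumerate(names)}
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        # r[a] is the total distance from a to all other active nodes.
        r = {a: sum(dist[a][b] for b in nodes) for a in nodes}
        # Choose the pair minimizing Q(i,j) = (n-2)*d(i,j) - r[i] - r[j].
        i, j = min(
            ((a, b) for idx, a in enumerate(nodes) for b in nodes[idx + 1:]),
            key=lambda p: (n - 2) * dist[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        # Branch lengths from i and j to the new internal node.
        li = dist[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
        lj = dist[i][j] - li
        new = f"N{counter}"
        counter += 1
        edges += [(i, new, li), (j, new, lj)]
        # Distances from the new node to every remaining node.
        dist[new] = {new: 0.0}
        for k in nodes:
            if k not in (i, j):
                d_new = (dist[i][k] + dist[j][k] - dist[i][j]) / 2
                dist[new][k] = d_new
                dist[k][new] = d_new
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    a, b = nodes
    edges.append((a, b, dist[a][b]))   # final edge joining the last two nodes
    return edges


D = [[0, 3, 8, 9],
     [3, 0, 9, 10],
     [8, 9, 0, 9],
     [9, 10, 9, 0]]
for edge in neighbor_joining(D, ["1", "2", "3", "4"]):
    print(edge)
```

Unlike UPGMA, this procedure does not assume a constant rate along all lineages, so it can recover unequal branch lengths such as 1 and 2 for the first pair in this example.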

42
Nearest Neighbor Dendrogram
[Figure: dendrogram with node labels A, B, C, D, E, R, W, X, Y, Z]

43
New Hampshire Standard Tree
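The New Hampshire format is more commonly known as Newick: nested parentheses, optional branch lengths after a colon, and a closing semicolon. The sketch below serializes a small tree; the (label, length, children) tuple layout and the function name are assumptions made for illustration.

```python
def to_newick(node):
    """Serialize a (label, branch_length, children) tuple without the ';'."""
    label, length, children = node
    inner = ""
    if children:
        inner = "(" + ",".join(to_newick(child) for child in children) + ")"
    suffix = "" if length is None else f":{length}"
    return f"{inner}{label}{suffix}"


# Unrooted tree written with a trifurcation at the top level.
tree = ("", None, [
    ("A", 0.1, []),
    ("B", 0.2, []),
    ("", 0.5, [("C", 0.3, []), ("D", 0.4, [])]),
])
print(to_newick(tree) + ";")
```

Internal nodes are left unlabeled here; the format also allows a label after each closing parenthesis.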

44
SeqWeb GrowTree Program

45
GrowTree Parameters

46
GrowTree Distances

47
GrowTree Phylogram (UPGMA)

48
GrowTree Alignment

49
GrowTree Neighbor Joining Tree

50
GrowTree VegF Input

51
GrowTree VegF Neighbor Joining Tree

52
VegF Growth Factors

53
GrowTree VegF UPGMA Tree

54
GrowTree VegF Alignment
