Phylogenies Preliminaries Distance-based methods Parsimony Methods.


1 Phylogenies: Preliminaries, Distance-based methods, Parsimony Methods

2 CS/BIO 271 - Introduction to Bioinformatics
Phylogenetic Trees
- Hypothesis about the relationship between organisms
- Can be rooted or unrooted
[Figure: rooted and unrooted trees over species A, B, C, D, E; the rooted tree is drawn with a time axis and a labeled root]

3 Tree proliferation
Species  Number of Rooted Trees         Number of Unrooted Trees
2        1                              1
3        3                              1
4        15                             3
5        105                            15
10       34,459,425                     2,027,025
15       213,458,046,676,875            7,905,853,580,625
20       8,200,794,532,637,891,559,375  221,643,095,476,699,771,875
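The counts in this table can be reproduced from the standard formulas for bifurcating trees on n labeled species: (2n - 3)!! rooted trees and (2n - 5)!! unrooted trees, where !! is the product of odd integers. A minimal sketch follows; the function names are illustrative and not from the slides.

```python
def odd_double_factorial(k: int) -> int:
    """Product of the odd integers 1 * 3 * 5 * ... * k (1 when k < 3)."""
    result = 1
    for factor in range(3, k + 1, 2):
        result *= factor
    return result


def rooted_tree_count(n_species: int) -> int:
    # (2n - 3)!! rooted bifurcating trees for n >= 2 labeled species.
    return odd_double_factorial(2 * n_species - 3)


def unrooted_tree_count(n_species: int) -> int:
    # (2n - 5)!! unrooted bifurcating trees; a single tree when n is 2 or 3.
    return odd_double_factorial(2 * n_species - 5)


for n in (2, 3, 4, 5, 10, 15, 20):
    print(f"{n:>2}  {rooted_tree_count(n):>30,}  {unrooted_tree_count(n):>28,}")
```

Each added species multiplies the count by the next odd number, which is why exhaustive enumeration becomes impractical well before 20 species.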

4 Molecular phylogenetics
- Specific genomic sequence variations (alleles) are much more reliable than phenotypic characteristics
- More than one gene should be considered

5 An ongoing debate
- Pheneticists tend to prefer distance-based metrics, as they emphasize relationships among data sets rather than the paths the data sets have taken to arrive at their current states.
- Cladists are generally more interested in evolutionary pathways, and tend to prefer more evolutionarily based approaches such as maximum parsimony.

6 Distance matrix methods
Species  A     B     C     D
B        9     –     –     –
C        8     11    –     –
D        12    15    10    –
E        15    18    13    5

7 UPGMA
- Similar to average-link clustering
- Merge the closest two groups
- Replace the distances for the new, merged group with the average of the distances for the previous two groups
- Repeat until all species are joined

8 UPGMA Step 1
Merge D & E
Before merging:
Species  A     B     C     D
B        9     –     –     –
C        8     11    –     –
D        12    15    10    –
E        15    18    13    5
After merging:
Species  A     B     C
B        9     –     –
C        8     11    –
DE       13.5  16.5  11.5

9 UPGMA Step 2
Merge A & C
Before merging:
Species  A     B     C
B        9     –     –
C        8     11    –
DE       13.5  16.5  11.5
After merging:
Species  B     AC
AC       10    –
DE       16.5  12.5

10 UPGMA Steps 3 & 4
Merge B & AC
Species  B     AC
AC       10    –
DE       16.5  12.5
Merge ABC & DE
Resulting tree: (((A,C),B),(D,E))
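The four merge steps can be replayed in a few lines. This is a sketch of the procedure exactly as the slides describe it (the new distance is the plain average of the two merged groups' distances), not a reference implementation; a size-weighted average would give different values once the merged groups differ in size. The names `cluster` and `distances` are illustrative.

```python
# Pairwise distances from the distance-matrix slide.
distances = {
    frozenset(("A", "B")): 9, frozenset(("A", "C")): 8, frozenset(("B", "C")): 11,
    frozenset(("A", "D")): 12, frozenset(("B", "D")): 15, frozenset(("C", "D")): 10,
    frozenset(("A", "E")): 15, frozenset(("B", "E")): 18, frozenset(("C", "E")): 13,
    frozenset(("D", "E")): 5,
}


def cluster(dist):
    """Merge the closest pair until one group remains; return (tree, merge_log)."""
    dist = dict(dist)                 # work on a copy
    labels = set().union(*dist)       # every group label that appears in a pair
    log = []
    while len(labels) > 1:
        pair = min(dist, key=dist.get)          # closest two groups
        a, b = sorted(pair)
        merged = f"({a},{b})"
        log.append((a, b, dist[pair]))
        labels -= {a, b}
        for other in labels:
            # Slide rule: average the two old distances to every other group.
            old_a = dist.pop(frozenset((a, other)))
            old_b = dist.pop(frozenset((b, other)))
            dist[frozenset((merged, other))] = (old_a + old_b) / 2
        del dist[pair]
        labels.add(merged)
    return labels.pop(), log


tree, merge_log = cluster(distances)
for a, b, d in merge_log:
    print(f"merge {a} & {b} at distance {d}")
print(tree)
```

Group labels are built as nested parenthesized strings, so the final label doubles as the tree written in parenthesis notation.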

11 Parsimony approaches
- Belong to the broader class of character-based methods of phylogenetics
- Emphasize simpler, and thus more likely, evolutionary pathways
I: GCGGACG
II: GTGGACG
[Figure: tree diagrams in which sequence I (C) and sequence II (T) descend from an ancestor III that carries C or T at the differing site]

12 Informative and uninformative sites
     Position
Seq  1 2 3 4 5 6
1    G G G G G G
2    G G G A G T
3    G G A T A G
4    G A T C A T
For positions 5 & 6, it is possible to select more parsimonious trees – those that invoke fewer substitutions.
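A site is usually called parsimony-informative when at least two different character states each occur in at least two sequences. A minimal sketch that applies this rule to the four sequences above; the names `sequences` and `is_informative` are illustrative.

```python
from collections import Counter

# The four aligned sequences from the table, one string per sequence.
sequences = ["GGGGGG", "GGGAGT", "GGATAG", "GATCAT"]


def is_informative(column) -> bool:
    """True if at least two states each appear in at least two sequences."""
    counts = Counter(column)
    return sum(1 for count in counts.values() if count >= 2) >= 2


# zip(*sequences) walks the alignment column by column.
informative_positions = [
    index + 1
    for index, column in enumerate(zip(*sequences))
    if is_informative(column)
]
print(informative_positions)  # [5, 6]
```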

13 Parsimony methods
- Enumerate all possible trees
- Note the number of substitution events invoked by each possible tree
  - Can be weighted by transition/transversion probabilities, etc.
- Select the most parsimonious
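For four sequences there are only three unrooted trees, so the enumerate-and-count procedure can be written out directly. The sketch below counts substitutions with Fitch's set-intersection rule, which is an assumption here since the slides do not name a counting method, and it uses unweighted substitutions. The sequences are the four from the informative-sites slide; trees are nested tuples of sequence numbers.

```python
# Sequences 1-4 from the informative-sites slide.
sequences = {1: "GGGGGG", 2: "GGGAGT", 3: "GGATAG", 4: "GATCAT"}


def fitch(tree, site):
    """Return (possible ancestral states, substitution count) for one site."""
    if not isinstance(tree, tuple):              # leaf: a sequence number
        return {sequences[tree][site]}, 0
    left_states, left_cost = fitch(tree[0], site)
    right_states, right_cost = fitch(tree[1], site)
    shared = left_states & right_states
    if shared:                                   # children agree: no new event
        return shared, left_cost + right_cost
    # Children disagree: one substitution is needed on this branch pair.
    return left_states | right_states, left_cost + right_cost + 1


def tree_cost(tree):
    """Total substitutions over all sites for one tree."""
    n_sites = len(sequences[1])
    return sum(fitch(tree, site)[1] for site in range(n_sites))


# The three possible unrooted groupings of four sequences.
trees = [((1, 2), (3, 4)), ((1, 3), (2, 4)), ((1, 4), (2, 3))]
costs = {tree: tree_cost(tree) for tree in trees}
for tree, cost in costs.items():
    print(tree, cost)
```

On this data two trees tie at 9 substitutions and the third needs 10: position 5 favors grouping 1 with 2, while position 6 favors grouping 1 with 3.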

14 Branch and Bound methods
- Key problem – the number of possible trees grows enormously as the number of species gets large
- Branch and bound – a technique that allows large numbers of candidate trees to be rapidly disregarded
- Requires a “good guess” at the cost of the best tree

15 Branch and Bound for TSP
- Find a minimum cost round-trip path that visits each intermediate city exactly once
- NP-hard (the decision version is NP-complete)
- Greedy approach: A,G,E,F,B,D,C,A = 251
[Figure: weighted graph on cities A, B, C, D, E, F, G; edge weights shown: 93, 46, 20, 35, 68, 12, 57, 31, 15, 82, 17, 82, 59]

16 Search all possible paths
All paths
  A → G (20)
    A → G → F (88)
      A → G → F → B
      A → G → F → E
      A → G → F → C
    A → G → E (55)
  A → B (46)
  A → C (93)
    A → C → B (175)
      A → C → B → E (257)
    A → C → D
    A → C → F
- Best estimate: 251
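The search above can be sketched in code. The full edge list of the slide's graph is not recoverable from the transcript, so the five-city graph below is hypothetical, as are all names; the greedy rule is assumed to be nearest-neighbor. The greedy tour supplies the initial best estimate, and any partial path whose cost already reaches that estimate is abandoned.

```python
# Hypothetical symmetric distances (NOT the graph from the slide).
CITIES = "ABCDE"
WEIGHTS = {
    ("A", "B"): 12, ("A", "C"): 10, ("A", "D"): 19, ("A", "E"): 8,
    ("B", "C"): 3, ("B", "D"): 7, ("B", "E"): 2,
    ("C", "D"): 6, ("C", "E"): 20,
    ("D", "E"): 4,
}


def dist(a, b):
    return WEIGHTS[(a, b)] if (a, b) in WEIGHTS else WEIGHTS[(b, a)]


def tour_cost(tour):
    return sum(dist(a, b) for a, b in zip(tour, tour[1:]))


def greedy_tour(start):
    """Nearest-neighbor tour: always move to the closest unvisited city."""
    tour = [start]
    while len(tour) < len(CITIES):
        here = tour[-1]
        unvisited = [c for c in CITIES if c not in tour]
        tour.append(min(unvisited, key=lambda c: dist(here, c)))
    return tour + [start]


def branch_and_bound(start):
    best_tour = greedy_tour(start)           # initial "good guess"
    best_cost = tour_cost(best_tour)

    def extend(path, cost):
        nonlocal best_tour, best_cost
        if cost >= best_cost:                # bound: this branch cannot win
            return
        if len(path) == len(CITIES):         # all cities visited: close the loop
            total = cost + dist(path[-1], start)
            if total < best_cost:
                best_tour, best_cost = path + [start], total
            return
        for city in CITIES:                  # branch on each unvisited city
            if city not in path:
                extend(path + [city], cost + dist(path[-1], city))

    extend([start], 0)
    return best_tour, best_cost


greedy_cost = tour_cost(greedy_tour("A"))
best_tour, best_cost = branch_and_bound("A")
print("greedy:", greedy_cost, "best:", best_cost, "tour:", "".join(best_tour))
```

The pruning test is safe only because all edge weights are non-negative: extending a path can never lower its cost.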

17 Parsimony – Branch and Bound
- Use the UPGMA tree for an initial best estimate of the minimum cost (most parsimonious) tree
- Use branch and bound to explore all feasible trees
- Replace the best estimate as better trees are found
- Choose the most parsimonious

18 Parsimony example
     Position
Seq  1 2 3 4 5 6
1    G G G G G G
2    G G G A G T
3    G G A T A G
4    G A T C A T
All trees: (1,2) [0]; (1,3) [1]; (1,4) [1]
Position 5: etc.

