Published by Rhett Carnal. Modified over 8 years ago.

1
Class 9: Phylogenetic Trees

2
The Tree of Life

3
Evolution
- Many theories of evolution
- Basic idea:
  - Speciation events lead to the creation of different species
  - Speciation is caused by physical separation into groups in which different genetic variants become dominant
- Any two species share a (possibly distant) common ancestor

4
Phylogenies
- A phylogeny is a tree that describes the sequence of speciation events that led to the formation of a set of current-day species
- Leaves: current-day species
- Nodes: hypothetical most recent common ancestors
- Edge lengths: "time" from one speciation to the next
[Figure: example tree with leaves Aardvark, Bison, Chimp, Dog, Elephant]

5
- Until the mid-1950s, phylogenies were constructed by experts based on their opinion (subjective criteria)
- The Linnaean classification scheme implicitly assumes a tree structure
- Since then, the focus has been on objective criteria for constructing phylogenetic trees
  - Thousands of articles in the last decades
- Important for many aspects of biology
  - Classification (systematics)
  - Understanding biological mechanisms

6
Morphological vs. Molecular
- Classical phylogenetic analysis: morphological features
  - Number of legs, lengths of legs, etc.
- Modern biological methods allow the use of molecular features
  - Gene sequences
  - Protein sequences
- Analysis is based on homologous sequences (e.g., globins) in different species

7
Dangers in Molecular Phylogenies
- We have to remember that gene/protein sequences can be homologous for different reasons:
  - Orthologs: sequences diverged after a speciation event
  - Paralogs: sequences diverged after a duplication event
  - Xenologs: sequences diverged after a horizontal transfer (e.g., by a virus)

8
Dangers of Paralogs
[Figure: gene tree showing speciation events and a gene duplication; leaves 1A, 2A, 3A, 3B, 2B, 1B]

9
Dangers of Paralogs
[Figure: gene tree showing speciation events and a gene duplication; leaves 1A, 2A, 3A, 3B, 2B, 1B]
- If we only consider 1A, 2B, and 3A...

10
Types of Trees
- A natural model to consider is that of rooted trees
[Figure: rooted tree with the root labeled "Common Ancestor"]

11
Types of Trees
- Depending on the model, data from current-day species does not distinguish between different placements of the root
[Figure: two alternative root placements shown side by side]

12
Types of Trees
- An unrooted tree represents the same phylogeny without the root node

13
Positioning Roots in Unrooted Trees
- We can estimate the position of the root by introducing an outgroup:
  - A set of species that are definitely distant from all the species of interest
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant plus Falcon, with the proposed root marked]

14
Type of Data
- Distance-based
  - Input is a matrix of distances between species
  - A distance can be the fraction of residues on which two sequences disagree, or an alignment score between them, or …
- Character-based
  - Examine each character (e.g., residue) separately
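As an illustration of the first kind of distance mentioned above, here is a minimal sketch that computes the fraction of positions at which two aligned sequences disagree. The function names `p_distance` and `distance_matrix` are illustrative, and the sketch assumes the sequences are already aligned to equal length with no gaps. The sequences are the ones used in the parsimony example later in the slides.

```python
def p_distance(s, t):
    """Fraction of aligned positions at which s and t differ."""
    if len(s) != len(t):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s, t)) / len(s)


def distance_matrix(seqs):
    """Map each unordered pair of names to its distance.

    Keys are frozensets so that the order of the two names does not matter.
    """
    names = sorted(seqs)
    return {
        frozenset((x, y)): p_distance(seqs[x], seqs[y])
        for idx, x in enumerate(names)
        for y in names[idx + 1:]
    }


seqs = {
    "A": "CAGGTA",
    "B": "CAGACA",
    "C": "CGGGTA",
    "D": "TGCACT",
    "E": "TGCGTA",
}
matrix = distance_matrix(seqs)
print(matrix[frozenset(("A", "C"))])
```

A matrix built this way is the input to the distance-based methods; a character-based method would instead look at each column of the alignment separately.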

15
Simple Distance-Based Method
Input: distance matrix between species
Outline:
- Cluster species together
- Initially, clusters are singletons
- At each iteration, combine the two "closest" clusters to get a new one

16
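UPGMA Clustering
Let $C_i$ and $C_j$ be clusters. Define the distance between them to be the average distance between their members:
$$d(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{p \in C_i} \sum_{q \in C_j} d(p, q)$$
When we combine two clusters, $C_i$ and $C_j$, to form a new cluster $C_k$, then for any other cluster $C_l$:
$$d(C_k, C_l) = \frac{|C_i|\, d(C_i, C_l) + |C_j|\, d(C_j, C_l)}{|C_i| + |C_j|}$$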
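The clustering loop can be sketched as follows, assuming the usual average-linkage definition of cluster distance and the size-weighted update when two clusters merge. The function name `upgma`, the frozenset-keyed distance dictionary, and the nested-tuple tree representation are choices made for this sketch. Placing each new node at height equal to half the merge distance is the common UPGMA convention and is also an assumption here.

```python
from itertools import combinations


def upgma(names, d):
    """Cluster leaves by repeatedly merging the two closest clusters.

    names: list of leaf labels (strings).
    d: dict mapping frozenset({x, y}) to the distance between leaves x and y.
    Returns (root, height): root is a nested tuple of leaf labels, and
    height maps every cluster to the height of its node.
    """
    size = {n: 1 for n in names}
    height = {n: 0.0 for n in names}
    dist = {frozenset(p): d[frozenset(p)] for p in combinations(names, 2)}
    while len(size) > 1:
        pair = min(dist, key=dist.get)          # two closest clusters
        i, j = sorted(pair, key=str)            # fixed order for reproducibility
        k = (i, j)                              # the merged cluster
        merge_height = dist[pair] / 2
        for m in size:
            if m not in (i, j):
                # size-weighted average of the distances from i and j to m
                dist[frozenset((k, m))] = (
                    size[i] * dist[frozenset((i, m))]
                    + size[j] * dist[frozenset((j, m))]
                ) / (size[i] + size[j])
        # drop every distance that still mentions i or j
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        size[k] = size.pop(i) + size.pop(j)
        height[k] = merge_height
    (root,) = size
    return root, height


# Hypothetical clock-like distances on four leaves.
example = {
    frozenset(("a", "b")): 2.0,
    frozenset(("c", "d")): 4.0,
    frozenset(("a", "c")): 6.0,
    frozenset(("a", "d")): 6.0,
    frozenset(("b", "c")): 6.0,
    frozenset(("b", "d")): 6.0,
}
root, height = upgma(["a", "b", "c", "d"], example)
print(root)
```

The sketch rebuilds the distance dictionary at every merge, which keeps the code short at the price of efficiency.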

17
Molecular Clock
- UPGMA implicitly assumes that all distances measure time in the same way
[Figure: two trees on leaves 1, 2, 3, 4]

18
Additivity
- A weaker requirement is additivity
  - In the "real" tree, the distance between two species is the sum of the edge lengths along the path that connects them
[Figure: tree with leaves i, j, k and edges labeled a, b, c]

19
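Consequences of Additivity
- Suppose the input distances are additive
- For any three leaves $i$, $j$, $m$ whose paths meet at an internal node $k$:
$$d(i,j) = d(i,k) + d(j,k), \quad d(i,m) = d(i,k) + d(k,m), \quad d(j,m) = d(j,k) + d(k,m)$$
- Thus
$$d(k,m) = \tfrac{1}{2}\bigl(d(i,m) + d(j,m) - d(i,j)\bigr)$$
[Figure: leaves i, j, m joined at node k; edges labeled a, b, c]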
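A quick numeric check of this consequence, as a sketch: take three leaves i, j, m attached to an internal node k by edges of hypothetical lengths a, b, c. The distance from k to m should then be recoverable from the three leaf-to-leaf distances alone. The function name `distance_to_internal_node` and the edge lengths are illustrative.

```python
def distance_to_internal_node(d_im, d_jm, d_ij):
    """Distance from the node joining i and j to a third leaf m,
    assuming the distances are additive."""
    return 0.5 * (d_im + d_jm - d_ij)


# Hypothetical edge lengths: k-i, k-j, k-m.
a, b, c = 2.0, 3.0, 4.0

# Additivity: each leaf-to-leaf distance is the sum of the edges on the path.
d_ij = a + b
d_im = a + c
d_jm = b + c

print(distance_to_internal_node(d_im, d_jm, d_ij))  # recovers the k-m edge
```

By symmetry, the same expression with the roles of the leaves swapped recovers the other two edge lengths.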

20
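Neighbor Joining
- Can we use this fact to construct trees?
- Let
$$D(i,j) = d(i,j) - (r_i + r_j)$$
where $L$ is the set of leaves and
$$r_i = \frac{1}{|L| - 2} \sum_{m \in L} d(i,m)$$
Theorem: if $D(i,j)$ is minimal (among all pairs of leaves), then $i$ and $j$ are neighbors in the tree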

21
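Neighbor Joining
Initialization: set $L$ to contain all leaves
Iteration:
- Choose $i, j$ such that $D(i,j)$ is minimal
- Create a new node $k$, and for every other $m \in L$ set
$$d(k,m) = \tfrac{1}{2}\bigl(d(i,m) + d(j,m) - d(i,j)\bigr)$$
- Remove $i, j$ from $L$, and add $k$
Termination: when $|L| = 2$, connect the two remaining nodes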
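The iteration can be sketched as follows. This sketch assumes the standard textbook form of the method: the corrected distance is D(i,j) = d(i,j) - (r_i + r_j), with r_i the sum of distances from i to the other nodes in L divided by |L| - 2, and the new node k gets d(k,m) = (d(i,m) + d(j,m) - d(i,j)) / 2. The edge-length formula for the two joined nodes is also the standard one and does not appear in the slide text. The name `neighbor_joining` and the edge-list output are choices made for this sketch.

```python
from itertools import combinations


def neighbor_joining(names, d):
    """Build an unrooted tree from a distance dictionary.

    names: list of leaf labels (strings).
    d: dict mapping frozenset({x, y}) to the distance between x and y.
    Returns a list of edges (node, node, length); internal nodes are
    named by the tuple of the two nodes they join.
    """
    d = dict(d)                                  # do not modify the caller's dict
    L = list(names)
    edges = []
    while len(L) > 2:
        # r[i]: rescaled total distance from i to every other node in L
        r = {
            i: sum(d[frozenset((i, m))] for m in L if m != i) / (len(L) - 2)
            for i in L
        }
        # pick the pair that minimizes D(i, j) = d(i, j) - (r_i + r_j)
        i, j = min(
            combinations(L, 2),
            key=lambda p: d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        k = (i, j)
        d_ij = d[frozenset((i, j))]
        len_i = 0.5 * (d_ij + r[i] - r[j])       # assumed standard edge length
        edges.append((i, k, len_i))
        edges.append((j, k, d_ij - len_i))
        for m in L:
            if m != i and m != j:
                d[frozenset((k, m))] = 0.5 * (
                    d[frozenset((i, m))] + d[frozenset((j, m))] - d_ij
                )
        L = [m for m in L if m != i and m != j] + [k]
    i, j = L
    edges.append((i, j, d[frozenset((i, j))]))   # connect the last two nodes
    return edges


# Hypothetical additive distances that do not follow a molecular clock:
# a and b are neighbors, c and d are neighbors, yet a-d is a closest pair.
example = {
    frozenset(("a", "b")): 5.0,
    frozenset(("a", "c")): 6.0,
    frozenset(("a", "d")): 4.0,
    frozenset(("b", "c")): 9.0,
    frozenset(("b", "d")): 7.0,
    frozenset(("c", "d")): 4.0,
}
for edge in neighbor_joining(["a", "b", "c", "d"], example):
    print(edge)
```

In this example the raw distance between a and d is among the smallest, yet they are not neighbors in the underlying tree, which is why the method minimizes the corrected D rather than d itself.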

22
Distance-Based Methods
- If we make strong assumptions about the distances, we can reconstruct trees
- In real life, distances are not additive
- Sometimes they are close to additive

23
Parsimony
- Character-based method
Assumptions:
- Independence of characters (no interactions)
- The best tree is the one in which the fewest changes take place

24
Simple Example
- Suppose we have five species, such that three have 'C' and two have 'T' at a specified position
- The minimal tree has one evolutionary change:
[Figure: tree whose leaves and ancestors are labeled C or T, with a single change between C and T]

25
Another Example
- What is the parsimony score of the following tree?
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant]
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA

26
Evaluating Parsimony Scores
- How do we compute the parsimony score for a given tree?
- Weighted parsimony: each change is weighted by the score $c(a,b)$

27
Evaluating Parsimony Scores
Dynamic programming on the tree
Initialization: for each leaf $i$, set $S(i,a) = 0$ if $i$ is labeled by $a$; otherwise $S(i,a) = \infty$
Iteration: if $k$ is a node with children $i$ and $j$, then
$$S(k,a) = \min_b \bigl(S(i,b) + c(a,b)\bigr) + \min_b \bigl(S(j,b) + c(a,b)\bigr)$$
Termination: the cost of the tree is $\min_a S(r,a)$, where $r$ is the root
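The recurrence can be sketched directly as a recursion over a rooted binary tree, taking the cost of a leaf state that disagrees with its label to be infinite. The names `site_cost` and `parsimony_score`, the nested-tuple tree representation, and the unit-cost default for c(a,b) are assumptions of this sketch. The tree used in the usage lines is hypothetical, since the tree from the slides' figure is not available in the text.

```python
import math


def unit_cost(a, b):
    """Default change cost: 0 for no change, 1 for any substitution."""
    return 0 if a == b else 1


def site_cost(tree, leaf_state, alphabet, cost):
    """Minimal weighted number of changes for one character.

    tree: a leaf name (str) or a 2-tuple of subtrees.
    leaf_state: dict mapping leaf name to its character at this site.
    """
    def S(node):
        if isinstance(node, str):                # leaf: 0 for its own label
            return {
                a: 0.0 if leaf_state[node] == a else math.inf
                for a in alphabet
            }
        left, right = (S(child) for child in node)
        return {
            a: min(left[b] + cost(a, b) for b in alphabet)
            + min(right[b] + cost(a, b) for b in alphabet)
            for a in alphabet
        }

    return min(S(tree).values())                 # best state at the root


def parsimony_score(tree, seqs, alphabet="ACGT", cost=unit_cost):
    """Sum the per-site costs, treating characters as independent."""
    length = len(next(iter(seqs.values())))
    return sum(
        site_cost(tree, {n: s[pos] for n, s in seqs.items()}, alphabet, cost)
        for pos in range(length)
    )


seqs = {
    "A": "CAGGTA",
    "B": "CAGACA",
    "C": "CGGGTA",
    "D": "TGCACT",
    "E": "TGCGTA",
}
hypothetical_tree = ((("A", "B"), "C"), ("D", "E"))
print(parsimony_score(hypothetical_tree, seqs))
```

Summing over sites relies on the independence assumption stated for parsimony; a different tree shape or a different cost function c(a,b) can be passed in without changing the recursion.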

28
Example
[Figure: tree with leaves Aardvark, Bison, Chimp, Dog, Elephant]
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA

29
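Cost of Evaluating Parsimony
If there are $n$ nodes, $m$ characters, and $k$ possible values for each character, then the complexity is $O(nmk^2)$ for general costs $c(a,b)$, since each of the $k$ entries $S(\cdot,a)$ at a node takes a minimum over $k$ values. It drops to $O(nmk)$ when every change has the same cost.
- Using this procedure, we can reconstruct the most parsimonious values at each ancestor node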
