PHYLOGENETIC TREES Dwyane George February 24, 2015 18.434.


1 PHYLOGENETIC TREES Dwyane George February 24, 2015 18.434

2 Outline: Introduction & Motivation; Definition; Algorithm & Proof of Correctness; Unweighted Pair Group Method with Arithmetic Mean (UPGMA); Algorithm Runtime

3 Key Ideas: Phylogenetic trees represent inferred evolutionary relationships. They are constructed by various methods, including clustering and maximum likelihood estimation.

4 Definitions
Phylogeny: the relationships among species, populations, individuals, or genes (taxa).
Phylogenetic tree: a tree (a collection of nodes and edges) showing inferred evolutionary relationships among biological species or other entities. Closely related taxa sit close together in the tree; evolutionarily distant taxa sit far apart. Trees come in rooted and unrooted variants.

5 Number of Trees
Theorem (Cavalli-Sforza & Edwards): The number of rooted binary phylogenetic trees on $n \ge 2$ labeled leaves is $(2n-3)!! = 1 \cdot 3 \cdot 5 \cdots (2n-3) = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$.
Proof: by induction on $n$.
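The induction step can be made concrete with a short sketch. It assumes that n counts labeled leaves (taxa) and that the count is the standard double factorial (2n-3)!!. A rooted binary tree with n-1 leaves has 2n-4 edges, and a new leaf can be attached on any of them or above the root, which gives 2n-3 choices. The function name `count_rooted_binary_trees` is made up for this illustration.

```python
def count_rooted_binary_trees(n):
    """Count rooted binary trees on n labeled leaves, assuming the (2n-3)!! formula."""
    if n < 2:
        raise ValueError("need at least two leaves")
    count = 1  # exactly one rooted binary tree on two leaves
    for leaves in range(3, n + 1):
        # Adding the next leaf: 2*leaves - 4 existing edges, plus one spot above the root.
        count *= 2 * leaves - 3
    return count


for n in (2, 3, 4, 5, 10):
    print(n, count_rooted_binary_trees(n))
```

The count grows very quickly with n, which is one reason heuristic methods such as clustering are used instead of enumerating every tree.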

6 Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
Let $d_{ij}$ denote the distance between the $i$th and $j$th taxa.

Species  A          B          C          D
A        0          -          -          -
B        $d_{ab}$   0          -          -
C        $d_{ac}$   $d_{bc}$   0          -
D        $d_{ad}$   $d_{bd}$   $d_{cd}$   0
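Because the distance is symmetric, only the lower triangle of the table needs to be stored. The sketch below shows one possible representation; the numeric values are made up for illustration and do not come from the slides, and the names `taxa`, `lower`, and `distance` are hypothetical.

```python
taxa = ["A", "B", "C", "D"]

# Row i holds the distances from taxon i to taxa 0..i-1 (made-up values).
lower = [
    [],               # A
    [2.0],            # B: d_ab
    [4.0, 4.0],       # C: d_ac, d_bc
    [6.0, 6.0, 6.0],  # D: d_ad, d_bd, d_cd
]


def distance(i, j):
    """Look up d_ij by index, using symmetry and a zero diagonal."""
    if i == j:
        return 0.0
    hi, lo = max(i, j), min(i, j)
    return lower[hi][lo]


print(distance(taxa.index("A"), taxa.index("C")))
```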

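7 UPGMA Algorithm
1. Initialize each taxon as its own cluster of size 1.
2. Cluster the two closest clusters: let $d_{ij}$ be the smallest off-diagonal entry of $D$ and set $C_k = C_i \cup C_j$.
3. Update the distance matrix with the distance from the new cluster to every other cluster $C_m$: $d_{(ij)m} = \frac{|C_i|\, d_{im} + |C_j|\, d_{jm}}{|C_i| + |C_j|}$, which reduces to $\frac{1}{2}(d_{im} + d_{jm})$ when $C_i$ and $C_j$ have equal size.
4. Repeat steps 2 and 3 until all species have been grouped, which takes $n-1$ merges.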
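The steps above can be sketched in Python as follows. This is a minimal illustration, not the presenter's implementation. It assumes distances arrive as a dict keyed by index pairs `(i, j)` with `i < j`, and it uses the cluster-size-weighted average for the update, which coincides with the plain ½ average whenever the two merged clusters have equal size. The function name `upgma` and the example distances are made up.

```python
def upgma(labels, d):
    """Cluster taxa with UPGMA.

    labels: list of taxon names.
    d: dict mapping index pairs (i, j), i < j, to distances.
    Returns (tree, root_height), where tree is a nested tuple of labels.
    """
    # Step 1: every taxon starts as its own cluster of size 1 at height 0.
    tree = {i: labels[i] for i in range(len(labels))}
    size = {i: 1 for i in tree}
    height = {i: 0.0 for i in tree}
    dist = dict(d)  # work on a copy; keys stay ordered as (smaller id, larger id)
    next_id = len(labels)

    while len(tree) > 1:
        # Step 2: find the closest pair of clusters.
        i, j = min(dist, key=dist.get)
        k = next_id
        next_id += 1
        # Step 3: size-weighted average distance from the new cluster to the others.
        for m in tree:
            if m in (i, j):
                continue
            d_im = dist.pop((min(i, m), max(i, m)))
            d_jm = dist.pop((min(j, m), max(j, m)))
            dist[(m, k)] = (size[i] * d_im + size[j] * d_jm) / (size[i] + size[j])
        # The new internal node sits halfway between the two merged clusters.
        height[k] = dist.pop((i, j)) / 2
        tree[k] = (tree.pop(i), tree.pop(j))
        size[k] = size.pop(i) + size.pop(j)

    (root,) = tree
    return tree[root], height[root]


# Made-up ultrametric distances for four taxa A, B, C, D.
labels = ["A", "B", "C", "D"]
d = {(0, 1): 2, (0, 2): 4, (1, 2): 4, (0, 3): 6, (1, 3): 6, (2, 3): 6}
print(upgma(labels, d))
```

Scanning the whole dict for the minimum in every round is the simplest choice and matches the cubic running time of the naive algorithm; faster variants replace that scan with smarter bookkeeping.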

8 UPGMA Implementation

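9 UPGMA Correctness
Definition (ultrametric tree): all pendant vertices (leaves) are equidistant from the root. This corresponds to a “constant molecular clock”.
At each merge, UPGMA places the new internal node at the same positive height above every leaf beneath it, so its output is always an ultrametric tree.
UPGMA is a greedy algorithm: it picks locally optimal groupings, working from the leaves to the root.
The resulting tree is guaranteed to be topologically correct when the input data is ultrametric.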
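Whether a distance matrix is ultrametric can be tested with the three-point condition: for every triple of taxa, the two largest of the three pairwise distances must be equal. The sketch below assumes the same `(i, j)` with `i < j` dict layout as a convenient input format; the function name `is_ultrametric` and the tolerance parameter are illustrative choices, not part of the slides.

```python
from itertools import combinations


def is_ultrametric(n, dist, tol=1e-9):
    """Check the three-point condition on n taxa.

    dist maps index pairs (i, j), i < j, to distances.
    """
    def get(a, b):
        return dist[(min(a, b), max(a, b))]

    for i, j, k in combinations(range(n), 3):
        # Sort the three pairwise distances; the two largest must coincide.
        _, middle, largest = sorted((get(i, j), get(i, k), get(j, k)))
        if largest - middle > tol:
            return False
    return True


# Made-up examples: the first satisfies the condition, the second does not.
clock_like = {(0, 1): 2, (0, 2): 4, (1, 2): 4, (0, 3): 6, (1, 3): 6, (2, 3): 6}
not_clock_like = dict(clock_like)
not_clock_like[(0, 2)] = 5
print(is_ultrametric(4, clock_like), is_ultrametric(4, not_clock_like))
```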

10 UPGMA Algorithm Runtime
Total runtime: $O(n^3)$.
Potential speedup to $O(n^2)$ by clustering in linear time: Gronau & Moran (2006); Quad Trees data structure.

Operation                Time       Number of Calls  Total Time
Hierarchical Clustering  $O(n^2)$   $O(n)$           $O(n^3)$
Update D Matrix          $O(n)$     $O(n)$           $O(n^2)$
UPGMA                    -          -                $O(n^3)$
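The cubic total can be checked by counting work directly. The sketch assumes the naive approach, in which every round scans all remaining cluster pairs to find the minimum. With m clusters left there are m(m-1)/2 pairs, and summing over the n-1 rounds gives (n+1)n(n-1)/6, which is O(n^3). The function name `entries_scanned` is made up for this illustration.

```python
def entries_scanned(n):
    """Total pairwise distances inspected by naive UPGMA on n taxa."""
    total = 0
    for m in range(n, 1, -1):  # m clusters remain before each of the n-1 merges
        total += m * (m - 1) // 2
    return total


for n in (4, 10, 100):
    # Compare the loop against the closed form (n+1)n(n-1)/6.
    print(n, entries_scanned(n), (n + 1) * n * (n - 1) // 6)
```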

