PHYLOGENETIC TREES Dwyane George February 24, 2015 18.434.

Outline
  Introduction & Motivation
  Definition
  Algorithm & Proof of Correctness
    Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
  Algorithm Runtime

Key Ideas
  Phylogenetic trees represent inferred evolutionary relationships
  Constructed by various methods
    Clustering
    Maximum likelihood estimators

Definitions
  Phylogeny
    The relationships among species, populations, individuals, or genes (taxa)
  Phylogenetic trees
    Results presented as a collection of nodes and edges, i.e. a tree
    A tree showing inferred evolutionary relationships among biological species or other entities
    Closely related taxa are placed near each other; evolutionarily distant taxa are far apart
    Rooted and unrooted variations

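Number of Trees
  Theorem (Cavalli-Sforza & Edwards): The number of rooted binary phylogenetic trees with n labeled leaves (n ≥ 2) is given by:
    $$T(n) = (2n-3)!! = \frac{(2n-3)!}{2^{n-2}\,(n-2)!}$$
  Proof: by induction on n. A rooted binary tree on n-1 leaves offers 2n-3 places to attach a new leaf (its 2n-4 edges, plus a new root above the old one), so $T(n) = (2n-3)\,T(n-1)$ with $T(2) = 1$.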
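To get a feel for how quickly the tree count grows, the double factorial (2n-3)!! for rooted binary trees on n labeled leaves can be evaluated directly. This is a minimal sketch; the function name `count_rooted_binary_trees` is a hypothetical helper, not part of the original slides.

```python
def count_rooted_binary_trees(n: int) -> int:
    """Count rooted binary trees on n labeled leaves as (2n-3)!!.

    Hypothetical helper for illustration only.
    """
    if n < 2:
        raise ValueError("need at least two leaves")
    count = 1  # exactly one tree joins two leaves
    for k in range(3, n + 1):
        # The k-th leaf can attach at any of 2k-3 positions of a
        # tree on k-1 leaves (2k-4 edges plus a new root).
        count *= 2 * k - 3
    return count


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 10):
        print(n, count_rooted_binary_trees(n))
```

Even for a few dozen taxa the count is far too large to enumerate, which is one motivation for heuristic methods such as clustering.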

Unweighted Pair Group Method with Arithmetic Mean (UPGMA)
  Let $d_{ij}$ denote the distance between the i-th and j-th taxa

  Species   A          B          C          D
  A         0          -          -          -
  B         $d_{ab}$   0          -          -
  C         $d_{ac}$   $d_{bc}$   0          -
  D         $d_{ad}$   $d_{bd}$   $d_{cd}$   0

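UPGMA Algorithm
  1. Initialize each vertex (species) as its own cluster of size 1
  2. Cluster the two species (or clusters) with the smallest distance
       Let $d_{ij} = \min(D)$
       $C_{(ij)} = C_i \cup C_j$
  3. Update the distance matrix with the distance from the new group to every other cluster $C_k$
       $d_{(ij)k} = \frac{|C_i|\,d_{ik} + |C_j|\,d_{jk}}{|C_i| + |C_j|}$
       This reduces to $d_{(ij)k} = \tfrac{1}{2}(d_{ik} + d_{jk})$ when $C_i$ and $C_j$ have the same size
  4. Repeat steps 2 and 3 until all species have been grouped into a single cluster (n-1 merges in total)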

UPGMA Implementation
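The transcript does not preserve the code from this slide, so the following is only a minimal sketch of UPGMA under stated assumptions: the input is a symmetric distance matrix given as a list of lists, the tree is returned as nested tuples, and the distance to a merged cluster is the size-weighted average of the two old distances (which equals the plain average ½(d_ik + d_jk) whenever the two merged clusters have equal size). The names `upgma`, `names`, and `matrix` are hypothetical.

```python
def upgma(names, matrix):
    """Cluster taxa by UPGMA; return (tree, root_height).

    Illustrative sketch: `tree` is a nested tuple of taxon names and
    `root_height` is half the distance of the final merge.
    """
    n = len(names)
    # Each cluster id maps to (subtree, number of leaves).
    clusters = {i: (names[i], 1) for i in range(n)}
    # Pairwise distances keyed by (larger id, smaller id).
    dist = {(i, j): float(matrix[i][j]) for i in range(n) for j in range(i)}
    next_id = n
    height = 0.0

    while len(clusters) > 1:
        # Step 2: pick the closest pair of clusters.
        (i, j), d_ij = min(dist.items(), key=lambda item: item[1])
        tree_i, size_i = clusters.pop(i)
        tree_j, size_j = clusters.pop(j)

        # Step 3: size-weighted average distance to every other cluster.
        merged = {}
        for k in clusters:
            d_ik = dist[(max(i, k), min(i, k))]
            d_jk = dist[(max(j, k), min(j, k))]
            merged[(next_id, k)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)

        # Drop every entry that involves the two old clusters.
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        dist.update(merged)

        clusters[next_id] = ((tree_j, tree_i), size_i + size_j)
        height = d_ij / 2  # the merged node sits halfway between the clusters
        next_id += 1

    (tree, _size), = clusters.values()
    return tree, height


if __name__ == "__main__":
    taxa = ["A", "B", "C", "D"]
    D = [[0, 2, 6, 6],
         [2, 0, 6, 6],
         [6, 6, 0, 4],
         [6, 6, 4, 0]]
    print(upgma(taxa, D))
```

Scanning all pairs for the minimum on every merge keeps the sketch short, at the cost of the cubic running time discussed in the runtime slide.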

UPGMA Correctness
  Definition: Ultrametric tree
    All pendant vertices (leaves) are equidistant from the root
    "Constant molecular clock"
  UPGMA assigns the same positive height to all subtrees
  Greedy algorithm
    Picks locally optimal groupings from the leaves to the root
  Guaranteed to be topologically correct when the input data is ultrametric
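Whether a distance matrix is ultrametric can be tested with the three-point condition: for every triple of taxa, the two largest of the three pairwise distances must be equal. A minimal sketch, assuming a symmetric matrix given as a list of lists; the function name `is_ultrametric` and the tolerance parameter `tol` are hypothetical.

```python
from itertools import combinations


def is_ultrametric(matrix, tol=1e-9):
    """Check the three-point condition on a symmetric distance matrix.

    Illustrative helper: returns True when, for every triple of taxa,
    the two largest pairwise distances agree within `tol`.
    """
    n = len(matrix)
    for i, j, k in combinations(range(n), 3):
        _smallest, middle, largest = sorted(
            (matrix[i][j], matrix[i][k], matrix[j][k])
        )
        if abs(largest - middle) > tol:
            return False
    return True


if __name__ == "__main__":
    clock_like = [[0, 2, 6, 6],
                  [2, 0, 6, 6],
                  [6, 6, 0, 4],
                  [6, 6, 4, 0]]
    print(is_ultrametric(clock_like))
```

Running such a check before clustering indicates whether the correctness guarantee applies to a given data set.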

UPGMA Algorithm Runtime
  Total runtime: O(n^3)
  Potential speedup to O(n^2) by clustering in linear time
    Gronau & Moran (2006)
    Quad-tree data structure

  Operation                 Time     Number of Calls   Total Time
  Hierarchical clustering   O(n^2)   O(n)              O(n^3)
  Update D matrix           O(n)     O(n)              O(n^2)
  UPGMA                     -        -                 O(n^3)