

Ch.6 Phylogenetic Trees

2 Contents
- Phylogenetic Trees
- Character State Matrix
- Perfect Phylogeny
- Binary Character States
- Two Characters
- Distance Matrix
- Additive Trees
- Ultrametric Trees
- Agreement (Isomorphism) between Phylogenies

3 Phylogenetic Trees (Phylogenies)
- Explain the evolutionary history of today's species (Figure 6.1)
- A phylogeny is a hypothesis: we do not have enough data about the distant ancestors of present-day species
- Characteristics
  - Leaf: an object or a set of objects
  - Interior node: a hypothetical ancestor object
  - Unrooted tree
- Input data for phylogeny reconstruction fall into two main categories
  - Character state matrix
  - Distance matrix

4 Character State Matrix
- Characters have the following features
  - Independent inheritance
  - Homologous
- Character state matrix
  - A matrix M with n rows (objects) and m columns (characters)
  - M_ij denotes the state that object i has for character j
  - Each row is the state vector of an object

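5 Difficulties in creating a phylogeny from a character state matrix
- Convergence or parallel evolution
  - Basic assumption: objects that share the same state are genetically closer than objects that do not
  - Convergence breaks this assumption: two objects share a state without being genetically close
- Reversal
  - Gains and losses of a character: a derived state reverts to the ancestral one
- ☞ Assume that convergence and reversal do not happen, or that their number should be minimized
- Characters may be ordered or unordered; ordered characters may also be directed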

6 Perfect Phylogeny Problem
- A tree T is a perfect phylogeny if, for each state s of each character c, the set of all nodes u (leaves and interior nodes) whose state for c is s forms a subtree of T
- The characters are compatible if the set of objects defined by the character state matrix admits a perfect phylogeny

7 Example

8 Perfect Phylogeny Problem
- How many different trees can we build for n objects?
- Consider only unrooted binary trees
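
The transcript does not preserve the answer to this question. The standard closed form for n ≥ 3 labeled leaves is (2n − 5)!! = 1 · 3 · 5 ⋯ (2n − 5) unrooted binary trees. The Python sketch below (the function name is illustrative, not from the slides) shows how quickly that count grows, which is why exhaustive enumeration is not an option.

```python
def count_unrooted_binary_trees(n: int) -> int:
    """Number of unrooted binary trees with n labeled leaves, (2n - 5)!!."""
    if n < 3:
        return 1  # one leaf, or two leaves joined by one edge: a single shape
    count = 1
    for i in range(3, n + 1):
        # A tree on i - 1 leaves has 2i - 5 edges; the i-th leaf can be
        # attached by splitting any one of them.
        count *= 2 * i - 5
    return count


for n in (3, 4, 5, 10, 20):
    print(n, count_unrooted_binary_trees(n))
```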

9 Binary Character States
- Two-phase algorithm (runs in time O(nm))
  - Decide whether the input matrix M admits a perfect phylogeny
  - Construct one possible phylogeny
- Assume that state 0 is ancestral and state 1 is derived

10 Deciding perfect phylogeny
- A rooted tree T is a perfect phylogeny for input matrix M if
  - to every character in M there corresponds an edge in T, and this edge marks the transition from state 0 to state 1 for that character
  - edges are labeled by their respective characters, and the root has character state vector (0, 0, …, 0)

11 Deciding perfect phylogeny
- Definition 6.1: For each column j of M, let O_j be the set of objects whose state is 1 for j, and let Ō_j be the set of objects whose state is 0 for j
- Lemma 6.1: A binary matrix M admits a perfect phylogeny if and only if, for each pair of characters i and j, the sets O_i and O_j are disjoint or one of them contains the other
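
Lemma 6.1 translates directly into a pairwise test on the sets O_j. The sketch below is one straightforward way to write it in Python. The function name is hypothetical, and the matrix is rebuilt from the sets O_1 to O_6 listed for Table 6.2 in this deck. Each of the roughly m² pairs costs O(n) set work, which matches the O(nm²) bound quoted for this direct approach.

```python
from itertools import combinations


def admits_perfect_phylogeny_pairwise(M):
    """Lemma 6.1 check: every pair of sets O_i, O_j is disjoint or nested."""
    n_chars = len(M[0]) if M else 0
    # O[j] = set of row indices (objects) whose state is 1 for character j
    O = [{i for i, row in enumerate(M) if row[j] == 1} for j in range(n_chars)]
    for a, b in combinations(O, 2):
        if a & b and not (a <= b or b <= a):
            return False  # the two sets overlap without being nested
    return True


# Rows A..E, columns c1..c6, rebuilt from O_1..O_6 of Table 6.2
TABLE_6_2 = [
    [0, 0, 0, 1, 1, 0],  # A
    [1, 1, 0, 0, 0, 0],  # B
    [0, 0, 0, 1, 1, 1],  # C
    [1, 0, 1, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 0],  # E
]
print(admits_perfect_phylogeny_pairwise(TABLE_6_2))  # True
```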

12 Deciding perfect phylogeny
- Example: Table 6.2
  - O_1 = {B, D}, O_2 = {B}, O_3 = {D}
  - O_4 = {A, C, E}, O_5 = {A, C}, O_6 = {C}
- Checking Lemma 6.1 directly in the decision phase takes O(nm^2)
- Figure 6.5, Algorithm Perfect Binary Phylogeny Decision -> O(nm)

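13 Deciding perfect phylogeny
- Final check of the decision algorithm: if L_ij ≠ L_lj for some i, l and both L_ij and L_lj are nonzero then return FALSE
- M with its columns sorted by decreasing number of 1s, and the auxiliary matrix L (where M_ij = 1, L_ij is the position of the nearest 1 to the left in row i, or -1 if there is none; L_ij = 0 where M_ij = 0)

M    c4  c1  c5  c2  c3  c6
A     1   0   1   0   0   0
B     0   1   0   1   0   0
C     1   0   1   0   0   1
D     0   1   0   0   1   0
E     1   0   0   0   0   0

L    c4  c1  c5  c2  c3  c6
A    -1   0   1   0   0   0
B     0  -1   0   2   0   0
C    -1   0   1   0   0   3
D     0  -1   0   0   2   0
E    -1   0   0   0   0   0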
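
The check quoted on this slide can be sketched in Python as follows. Figure 6.5 is not reproduced in the transcript, so the way L is filled in here is an assumption: each 1 in a row records the 1-based position of the previous 1 in that row, or -1 if there is none. The function name and the returned tuple are likewise illustrative. The matrix is the one rebuilt from the sets O_1 to O_6 of Table 6.2.

```python
def admits_perfect_phylogeny(M):
    """Decision check with the auxiliary matrix L (sketch, assumptions above)."""
    n, m = len(M), len(M[0])
    # Sort columns by decreasing number of 1s; sorted() is stable for ties.
    order = sorted(range(m), key=lambda j: -sum(row[j] for row in M))
    S = [[row[j] for j in order] for row in M]
    # L[i][j] = 1-based position of the previous 1 in row i, -1 if none,
    # and 0 wherever S[i][j] == 0.
    L = [[0] * m for _ in range(n)]
    for i in range(n):
        k = -1
        for j in range(m):
            if S[i][j] == 1:
                L[i][j] = k
                k = j + 1
    # Each column of L may hold at most one distinct nonzero value.
    for j in range(m):
        nonzero = {L[i][j] for i in range(n) if L[i][j] != 0}
        if len(nonzero) > 1:
            return False, order, L
    return True, order, L


# Rows A..E, columns c1..c6, rebuilt from O_1..O_6 of Table 6.2
TABLE_6_2 = [
    [0, 0, 0, 1, 1, 0],  # A
    [1, 1, 0, 0, 0, 0],  # B
    [0, 0, 0, 1, 1, 1],  # C
    [1, 0, 1, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 0],  # E
]
ok, order, L = admits_perfect_phylogeny(TABLE_6_2)
print(ok)                              # True
print([f"c{j + 1}" for j in order])    # ['c4', 'c1', 'c5', 'c2', 'c3', 'c6']
```

The sort here uses Python's comparison sort for brevity; a counting sort on the column sums would be needed to keep the whole procedure within O(nm).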

14 Constructing a perfect phylogeny
- Figure 6.6, Algorithm Perfect Binary Phylogeny Construction
- Running time O(nm)
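
Figure 6.6 is not reproduced in the transcript. One common way to realize the construction phase, shown here only as a sketch, is to insert each object into a trie: starting at the root, follow (or create) the edge for each character the object has, taking characters in decreasing order of their number of 1s. The sketch assumes the matrix has already passed the decision phase. The dictionary layout and the names `build_perfect_phylogeny` and `show` are hypothetical.

```python
def build_perfect_phylogeny(M, names, chars):
    """Insert each object along the path of its characters (trie-style)."""
    m = len(M[0])
    order = sorted(range(m), key=lambda j: -sum(row[j] for row in M))
    root = {"children": {}, "objects": []}
    for row, name in zip(M, names):
        node = root
        for j in order:
            if row[j] == 1:
                # Follow the edge labeled with character j, creating it if needed.
                node = node["children"].setdefault(
                    chars[j], {"children": {}, "objects": []}
                )
        node["objects"].append(name)
    return root


def show(node, indent=0):
    """Return indented text lines: edge label, then objects at the node below."""
    lines = []
    for label, child in node["children"].items():
        lines.append(" " * indent + f"{label}: {child['objects']}")
        lines.extend(show(child, indent + 2))
    return lines


# Rows A..E, columns c1..c6, rebuilt from O_1..O_6 of Table 6.2
TABLE_6_2 = [
    [0, 0, 0, 1, 1, 0],  # A
    [1, 1, 0, 0, 0, 0],  # B
    [0, 0, 0, 1, 1, 1],  # C
    [1, 0, 1, 0, 0, 0],  # D
    [0, 0, 0, 1, 0, 0],  # E
]
tree = build_perfect_phylogeny(
    TABLE_6_2, list("ABCDE"), ["c1", "c2", "c3", "c4", "c5", "c6"]
)
print("\n".join(show(tree)))
```

Objects stored at an interior node (E at the node below c4 in this example) would be hung off that node as extra leaves to obtain a tree whose objects all sit in leaves.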

15 Unordered binary characters
- For each character, the majority state becomes 0 and the other state becomes 1
- If both states are equally frequent, choose either one to be 0 and the other to be 1
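
The recoding rule on this slide is small enough to sketch directly. The function name is hypothetical, and ties are resolved by taking whichever state appears first in the column, which is one arbitrary choice among the two the slide allows.

```python
from collections import Counter


def recode_majority_as_zero(column):
    """Recode a two-state column so that the majority state becomes 0."""
    counts = Counter(column)
    if len(counts) > 2:
        raise ValueError("expected at most two distinct states")
    # most_common orders by frequency; equal counts keep first-seen order.
    majority = counts.most_common(1)[0][0]
    return [0 if state == majority else 1 for state in column]


print(recode_majority_as_zero(["a", "b", "a", "a", "b"]))  # [0, 1, 0, 0, 1]
```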

16 Two characters
- Characters may be unordered and have an arbitrary number of states, but the number of characters is restricted to at most two
- Definition 6.2: A triangulated graph is an undirected graph in which any cycle with four or more vertices has a chord, that is, an edge joining two nonconsecutive vertices of the cycle
- Theorem 6.1: To every collection of subtrees {T_1, T_2, …, T_l} of a tree T there corresponds a triangulated graph, and vice versa

17 Two characters
- Definition 6.3: An intersection graph for a collection C of sets is the graph G that we get by mapping each set in C to a vertex of G, and linking two vertices in G by an edge if the corresponding sets have a nonempty intersection
- Definition 6.4: Given a graph G = (V, E) with a coloring c on V, we say that G can be c-triangulated if there exists a triangulated graph H = (V, E'), such that E ⊆ E' and c is a valid coloring for H. In other words, any edge present in E' but not in E must link two vertices with different colors

18 Two characters
- Theorem 6.2: A character state matrix M, with a character set defining a coloring c, admits a perfect phylogeny if and only if its corresponding state intersection graph (SIG) can be c-triangulated
- Theorem 6.3: A character state matrix M with only two characters admits a perfect phylogeny if and only if its corresponding SIG is acyclic
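
Theorem 6.3 reduces the two-character case to a cycle test. In the sketch below the SIG is taken to be the intersection graph of Definition 6.3 applied to the object sets of each state, so two states are joined exactly when some object has both. With two characters every edge joins a state of the first character to a state of the second. The input format and function name are assumptions made for illustration, and union-find is one of several ways to detect a cycle.

```python
def two_character_perfect_phylogeny(rows):
    """rows: list of (state_of_c1, state_of_c2) pairs, one per object.

    Returns True when the state intersection graph is acyclic.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # One edge per distinct pair of states; identical objects add nothing new.
    edges = {((1, s1), (2, s2)) for s1, s2 in rows}
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge closes a cycle
        parent[ru] = rv
    return True


print(two_character_perfect_phylogeny([("x", "p"), ("x", "q"), ("y", "q")]))
print(two_character_perfect_phylogeny(
    [("x", "p"), ("x", "q"), ("y", "p"), ("y", "q")]
))
```

The first call describes a path in the SIG and the second a four-cycle, so only the first matrix admits a perfect phylogeny under Theorem 6.3.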

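19 Example
[Figure: graph whose vertices are the character states x1, y1, x2, z2, x3, y3, z3, y2, labeled with the object sets {B}, {A, B}, {A}, {B, C}, {C}, {C, D}, {D}, {A, D}]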

20 Reconstruction algorithm for two characters
- Running time O(n)
  - Test for acyclicity -> O(n)
  - Reconstruction of the perfect phylogeny -> O(n)

21 Parsimony and Compatibility
- Real character state matrices are unlikely to admit perfect phylogenies
  - Experimental data always carry errors
  - The assumptions (no reversals and no convergence) are sometimes violated
- Two approaches
  - Parsimony criterion: allow reversal and convergence events, but try to minimize their occurrence
  - Compatibility criterion: find a maximum set of characters that are compatible -> exclude the characters that cause such "problems"

22 Algorithms for Distance Matrices
- Problem: reconstruct trees from comparative numerical data between n objects, given as a distance matrix M
- Two problems are considered
  - Reconstructing Additive Trees
  - Reconstructing Ultrametric Trees

23 Reconstructing Additive Trees
- Metric space: a set of objects O such that to every pair i, j ∈ O is associated a nonnegative real number d_ij with the following properties
  - d_ij > 0 for i ≠ j
  - d_ij = 0 for i = j
  - d_ij = d_ji for all i and j
  - d_ij ≤ d_ik + d_kj for all i, j, and k (the triangle inequality)
- M and T are additive if
  - the tree has n leaves
  - leaves are the nodes with degree one; all other nodes have degree three
  - all edges in the tree have nonnegative weight
  - the weight of the path between any two leaves i and j equals M_ij
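
The four metric-space properties can be checked mechanically before attempting any reconstruction. This is a minimal Python sketch; the function name and the floating-point tolerance `tol` are assumptions added for illustration.

```python
from itertools import permutations


def is_metric(d, tol=1e-9):
    """Check the four metric-space properties on a square distance matrix."""
    n = len(d)
    for i in range(n):
        if abs(d[i][i]) > tol:
            return False  # d_ii must be 0
        for j in range(i + 1, n):
            if d[i][j] <= 0 or abs(d[i][j] - d[j][i]) > tol:
                return False  # positivity for i != j, and symmetry
    for i, j, k in permutations(range(n), 3):
        if d[i][j] > d[i][k] + d[k][j] + tol:
            return False  # triangle inequality
    return True


print(is_metric([[0, 3, 4], [3, 0, 5], [4, 5, 0]]))  # True
print(is_metric([[0, 1, 5], [1, 0, 1], [5, 1, 0]]))  # False
```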

24 Reconstructing Additive Trees
- Lemma 6.2: A metric space O is additive if and only if any four objects of O can be labeled i, j, k, and l such that d_ij + d_kl = d_ik + d_jl ≥ d_il + d_jk
- If M is additive, T is unique (the algorithm runs in time O(n^2))
- Real-life distance matrices are rarely additive because of errors in the distance measurements
  - Goal: obtain a tree that is as close as possible to an additive tree
  - Consider a version of the problem that is tractable
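
Lemma 6.2 says that, for every four objects, the two largest of the three pairwise sums must be equal. A brute-force Python sketch of that test follows. It runs in O(n⁴), so it only serves to validate a matrix and is not the O(n²) reconstruction algorithm mentioned on the slide. The function name, tolerance, and example matrix (distances read off a small hand-built tree) are illustrative assumptions.

```python
from itertools import combinations


def is_additive(d, tol=1e-9):
    """Four-point condition: of the three pairwise sums, the two largest are equal."""
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted(
            (d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k])
        )
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Leaves 0 and 1 hang off one interior node (edge weights 1 and 2), leaves 2
# and 3 off another (weights 4 and 5); the interior edge has weight 3.
D = [
    [0, 3, 8, 9],
    [3, 0, 9, 10],
    [8, 9, 0, 9],
    [9, 10, 9, 0],
]
print(is_additive(D))  # True
```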

25 Reconstructing Ultrametric Trees
- Given two distance matrices, M^l and M^h, reconstruct an evolutionary tree such that the distances measured on the tree fit "between" these two input matrices (sandwich constraints, M^l_ij ≤ d_ij ≤ M^h_ij)
- A tree is ultrametric when it is additive and can be rooted in such a way that the lengths of all leaf-root paths are equal -> the objects being studied have evolved at an equal rate from a common ancestor

26 Reconstructing Ultrametric Trees
- Link of a and b in the MST T, written (a, b)_max: the largest-weight edge on the unique path from a to b in T
- Definition 6.5: The cut-weight of an edge e of the minimum spanning tree of G^h is given by CW(e) = max{M^l_ab | e = (a, b)_max}

27 Reconstructing Ultrametric Trees
- Reconstruction algorithm -> runs in time O(n^2)
  1. Compute an MST T of G^h
  2. Construct R
  3. Compute CW(e)
  4. Build the ultrametric tree U
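
The first and third steps can be sketched in Python under some assumptions. G^h is taken to be the complete graph weighted by M^h. The link (a, b)_max is found by walking the MST from each object. The cut-weight of an edge is taken to be the largest M^l_ab over the pairs whose link is that edge. The construction of R and of the final tree U is not described in the transcript and is left out. All function names are hypothetical, and ties between equal edge weights are broken arbitrarily.

```python
def prim_mst(Mh):
    """Prim's algorithm on a dense matrix, O(n^2). Returns a list of edges (u, v)."""
    n = len(Mh)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known connection to the tree
    link = [-1] * n             # tree endpoint of that connection
    best[0] = 0
    edges = []
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if link[u] >= 0:
            edges.append((link[u], u))
        for v in range(n):
            if not in_tree[v] and Mh[u][v] < best[v]:
                best[v], link[v] = Mh[u][v], u
    return edges


def cut_weights(Mh, Ml):
    """Map each MST edge e to max M^l_ab over pairs a, b whose link is e."""
    n = len(Mh)
    edges = prim_mst(Mh)
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    cw = {e: 0 for e in edges}

    def walk(a, node, prev, heaviest):
        # heaviest = largest-weight edge on the tree path from a to node
        if heaviest is not None and a < node:
            cw[heaviest] = max(cw[heaviest], Ml[a][node])
        for nxt in adj[node]:
            if nxt == prev:
                continue
            e = (node, nxt) if (node, nxt) in cw else (nxt, node)
            if heaviest is None or Mh[e[0]][e[1]] > Mh[heaviest[0]][heaviest[1]]:
                walk(a, nxt, node, e)
            else:
                walk(a, nxt, node, heaviest)

    for a in range(n):
        walk(a, a, -1, None)
    return cw


Mh = [[0, 2, 6], [2, 0, 4], [6, 4, 0]]
Ml = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
print(prim_mst(Mh))         # [(0, 1), (1, 2)]
print(cut_weights(Mh, Ml))  # {(0, 1): 1, (1, 2): 3}
```

Each of the n walks visits every tree node once, so the cut-weight pass stays within O(n²). The recursion is only suitable for small inputs and would need an explicit stack for long path-like trees.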

28 Agreement between Phylogenies
- In practice, two different methods applied to the same data quite often yield different trees (in the topological sense)
- Definition 6.6: We say that a tree T_r refines another tree T_s whenever T_r can be transformed into T_s by contracting selected edges of T_r. Two trees T_1 and T_2 agree when there exists a tree T_3 that refines both

29 Isomorphism
- Two trees T_1 and T_2 are isomorphic when there is a one-to-one correspondence between their nodes such that, for every pair u, v of corresponding nodes, u ∈ T_1 and v ∈ T_2, the objects contained in the leaves below u are the same as the objects contained in the leaves below v
- Binary Tree Isomorphism: Figure 6.21, runs in time O(n)
- General case (leaves contain several objects): Figure 6.22, runs in time O(n)
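
The definition can be tested directly by comparing, for both trees, the sets of objects found below every node. The Python sketch below does that for rooted trees written as nested tuples. It is not the linear-time algorithm of Figures 6.21 and 6.22, which the transcript does not reproduce. It assumes that every interior node has at least two children, that a leaf is either a string (one object) or a set of strings (several objects), and the function names are hypothetical.

```python
def leaf_sets(tree, out):
    """Collect, for every node, the set of objects in the leaves below it."""
    if isinstance(tree, tuple):           # interior node: tuple of children
        below = frozenset().union(*(leaf_sets(c, out) for c in tree))
    elif isinstance(tree, str):           # leaf holding a single object
        below = frozenset([tree])
    else:                                 # leaf holding several objects
        below = frozenset(tree)
    out.append(below)
    return below


def isomorphic(t1, t2):
    """Compare the collections of leaf sets found below the nodes of each tree."""
    s1, s2 = [], []
    leaf_sets(t1, s1)
    leaf_sets(t2, s2)
    return sorted(map(sorted, s1)) == sorted(map(sorted, s2))


t1 = (("A", "B"), ("C", ("D", "E")))
t2 = ((("E", "D"), "C"), ("B", "A"))   # same tree, children reordered
t3 = (("A", "C"), ("B", ("D", "E")))   # A and C grouped instead of A and B
print(isomorphic(t1, t2))  # True
print(isomorphic(t1, t3))  # False
```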