Comput. Genomics, Lecture 5b Character Based Methods for Reconstructing Phylogenetic Trees: Maximum Parsimony Based on presentations by Dan Geiger, Shlomo.


Comput. Genomics, Lecture 5b: Character Based Methods for Reconstructing Phylogenetic Trees: Maximum Parsimony. Based on presentations by Dan Geiger, Shlomo Moran, and Ido Wexler. Modified by Benny Chor. References: Durbin et al. 7.4, Gusfield, Setubal & Meidanis 6.1.

2 Phylogenetic Trees - Reminder: Leaves represent the objects (genes, species) being compared. Internal nodes are hypothetical ancestral objects. In a rooted tree, the path from the root to a node corresponds to a path in evolutionary time. An unrooted tree specifies relationships among objects, but not evolutionary time.

3 Parsimony Based Approach: Input: character data (aligned sequences). Goal/Output: a labeled tree (with labeled internal nodes) that “explains” the data with a minimal number of changes across edges.

4 Parsimony: An Example. Various trees could explain the phylogeny of the following four sequences: AAG, AAA, GGA, AGA. [Figure: two candidate labeled trees for these sequences.] Parsimony prefers the second tree to the first because it requires fewer substitution events (three changes vs. four).
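The "three vs. four" comparison can be reproduced with a minimal sketch that counts changes across the edges of a fully labeled tree. The two trees below are hypothetical stand-ins (the slide's figure is not recoverable), and `hamming`, `tree_cost`, `first_tree`, and `second_tree` are illustrative names, not from the lecture. Each tree is given as a list of (parent label, child label) edges.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def tree_cost(edges):
    """Total number of changes across all edges of a labeled tree."""
    return sum(hamming(parent, child) for parent, child in edges)


# Assumed tree 1: internal node AAA joins leaves AAG and GGA,
# a second internal node AAA joins leaves AAA and AGA.
first_tree = [
    ("AAA", "AAG"), ("AAA", "GGA"),   # first internal node to its leaves
    ("AAA", "AAA"),                   # edge between the two internal nodes
    ("AAA", "AAA"), ("AAA", "AGA"),   # second internal node to its leaves
]

# Assumed tree 2: internal node AAA joins leaves AAG and AAA,
# internal node AGA joins leaves GGA and AGA.
second_tree = [
    ("AAA", "AAG"), ("AAA", "AAA"),
    ("AAA", "AGA"),                   # edge between the two internal nodes
    ("AGA", "GGA"), ("AGA", "AGA"),
]

print(tree_cost(first_tree), tree_cost(second_tree))
```

Under these assumed labelings the first tree needs four changes and the second needs three, so parsimony prefers the second.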

5 Big and Small Parsimony: Approaches to finding a maximum parsimony tree usually have two separate components. (1) A search through the space of trees (BIG parsimony). (2) Given a specific tree topology, find an assignment of “ancestral labels” to internal nodes so as to minimize the total number of changes across tree edges (small parsimony).

6 Formally: Big Parsimony. Input: character data (aligned sequences). Goal/Output: a labeled tree (with labeled internal nodes) that minimizes the number of changes across edges (over all trees and all internal labelings).

7 Formally: Small Parsimony. Input: character data (aligned sequences) and a tree with the sequences at its leaves. Goal/Output: a labeling of the internal nodes that minimizes the number of changes across edges (over all internal labelings).

8 Big, Small, and Weighted Parsimony: Small parsimony has a linear time solution (Fitch’s algorithm). BIG parsimony is NP-hard, by an easy reduction from vertex cover that will be shown soon (on the board). Weighted small parsimony also has a linear time solution (Sankoff’s algorithm, dynamic programming).

9 Small Parsimony: Fitch’s Algorithm. First, traverse the tree “up”, from leaves to root, finding the set of possible ancestral states (labels) for each internal node. Then traverse the tree “down”, from root to leaves, determining the ancestral state (label) of each internal node. Key observation: different sites are independent, so we can solve one site at a time.

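10 Fitch’s Algorithm – Step 1: Do a post-order (from leaves to root) traversal of the tree. Find the set of possible states $R_i$ of internal node $i$ with children $j$ and $k$: $R_i = R_j \cap R_k$ if $R_j \cap R_k \neq \emptyset$, and $R_i = R_j \cup R_k$ otherwise.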

11 Fitch’s Algorithm – Step 1: # of changes = # of union operations. [Figure: example tree with leaf states and the candidate state sets computed at its internal nodes.]
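The bottom-up pass for a single site can be sketched as follows. This is a minimal illustration under assumptions: the tree is a nested pair of tuples with one-character strings at the leaves, the function name `fitch_up` is hypothetical, and `example_tree` is a made-up tree rather than a reconstruction of the slide's figure.

```python
def fitch_up(tree):
    """Post-order pass for one site.

    A leaf is a one-character string; an internal node is a pair
    (left_subtree, right_subtree). Returns (state_set, changes), where
    changes counts the union operations performed in the subtree.
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    r_j, changes_j = fitch_up(left)
    r_k, changes_k = fitch_up(right)
    common = r_j & r_k
    if common:
        # Non-empty intersection: no extra change is needed at this node.
        return common, changes_j + changes_k
    # Empty intersection: take the union and count one change.
    return r_j | r_k, changes_j + changes_k + 1


# Hypothetical example tree over leaf states T, T, C, A, G, T.
example_tree = ((("T", "T"), "C"), ("A", ("G", "T")))
root_set, changes = fitch_up(example_tree)
print(root_set, changes)
```

On this assumed tree the internal sets are {T}, {C, T}, {G, T}, {A, G, T}, and {T} at the root, with three union operations and therefore three changes.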

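12 Fitch’s Algorithm – Step 2: Do a pre-order (from root to leaves) traversal of the tree. Select the state $r_j$ of internal node $j$ with parent $i$: $r_j = r_i$ if $r_i \in R_j$; otherwise $r_j$ is an arbitrary state in $R_j$. At the root, any state in its set may be selected.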

13 Fitch’s Algorithm – Step 2 [Figure: the example tree shown at successive stages of the top-down pass, as a state is selected from each internal node's candidate set.]
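Both passes together can be sketched for one site as below. The representation (nested pairs with one-character leaves), the function names `fitch_sets`, `fitch_assign`, and `count_changes`, the tie-breaking rule (`min` of the set), and the example tree are all assumptions made for illustration.

```python
def fitch_sets(tree):
    """Up-pass: annotate every node as (state_set, children)."""
    if isinstance(tree, str):
        return ({tree}, None)
    left, right = (fitch_sets(child) for child in tree)
    common = left[0] & right[0]
    return (common or (left[0] | right[0]), (left, right))


def fitch_assign(node, parent_state=None):
    """Down-pass: keep the parent's state when allowed, else pick one.

    Returns a leaf state, or (state, left_labeled, right_labeled).
    """
    states, children = node
    if parent_state in states:
        state = parent_state
    else:
        state = min(states)  # arbitrary but deterministic choice
    if children is None:
        return state
    return (state,
            fitch_assign(children[0], state),
            fitch_assign(children[1], state))


def count_changes(labeled):
    """Count edges whose two endpoints carry different states."""
    if isinstance(labeled, str):
        return 0
    state, left, right = labeled
    total = 0
    for child in (left, right):
        child_state = child if isinstance(child, str) else child[0]
        total += (child_state != state) + count_changes(child)
    return total


# Hypothetical example tree over leaf states T, T, C, A, G, T.
example_tree = ((("T", "T"), "C"), ("A", ("G", "T")))
labeled = fitch_assign(fitch_sets(example_tree))
print(labeled, count_changes(labeled))
```

For this assumed tree every internal node ends up labeled T, and the three changes fall on the edges leading to leaves C, A, and G.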

14 Weighted Version: Instead of assuming all state changes have unit cost (that is, are equally likely), use different costs $S(a,b)$ for different changes. The first step of the algorithm is to propagate costs up through the tree.

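15 Weighted Version of Fitch’s Algorithm: We want to determine the minimum cost $R_i(a)$ of assigning character $a$ to node $i$. For leaves: $R_i(a) = 0$ if $a$ is the character at leaf $i$, and $R_i(a) = \infty$ otherwise.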

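16 Weighted Version of Fitch’s Algorithm: We want to determine the minimum cost $R_i(a)$ of assigning character $a$ to node $i$. For an internal node $i$ with children $j$ and $k$: $R_i(a) = \min_b \left( R_j(b) + S(a,b) \right) + \min_b \left( R_k(b) + S(a,b) \right)$.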

17 Weighted Version of Fitch’s Algorithm – Step 2: Do a pre-order (from root to leaves) traversal of the tree. Select a minimal cost character for the root. For each internal node $j$, select the character that produced the minimal cost at its parent $i$.
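The weighted up-pass and the traceback can be sketched for one site as below. Everything named here is an assumption for illustration: the nested-pair tree representation, the functions `sankoff_up` and `traceback`, the cost function `ts_tv_cost` (transitions cost 1, transversions cost 2), and the example tree.

```python
import math

TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def ts_tv_cost(a, b):
    """Assumed cost S(a, b): 0 if equal, 1 for transitions, 2 otherwise."""
    if a == b:
        return 0
    return 1 if frozenset((a, b)) in TRANSITIONS else 2


def sankoff_up(tree, alphabet, cost):
    """Up-pass: return (table, children) with table[a] = R_i(a)."""
    if isinstance(tree, str):
        return ({a: 0 if a == tree else math.inf for a in alphabet}, None)
    kids = tuple(sankoff_up(child, alphabet, cost) for child in tree)
    table = {
        a: sum(min(kid[0][b] + cost(a, b) for b in alphabet) for kid in kids)
        for a in alphabet
    }
    return (table, kids)


def traceback(node, alphabet, cost, parent_state=None):
    """Down-pass: choose the state that achieved the parent's minimum."""
    table, kids = node
    if parent_state is None:
        state = min(alphabet, key=lambda a: table[a])
    else:
        state = min(alphabet, key=lambda b: table[b] + cost(parent_state, b))
    if kids is None:
        return state
    return (state,) + tuple(
        traceback(kid, alphabet, cost, state) for kid in kids
    )


# Hypothetical example tree over leaf states T, T, C, A, G, T.
example_tree = ((("T", "T"), "C"), ("A", ("G", "T")))
root = sankoff_up(example_tree, "ACGT", ts_tv_cost)
best_cost = min(root[0].values())
labeled = traceback(root, "ACGT", ts_tv_cost)
print(best_cost, labeled[0])
```

With the assumed transition/transversion costs the minimum total cost on this tree is 5, attained with T at the root. With unit costs (1 for any change) the same code reproduces the unweighted count of three changes.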

18 Big Parsimony: Exploring the Space of Trees. We have considered small parsimony: how to find the minimum number of changes for a given tree topology. To solve big parsimony, we need some search procedure for exploring the space of tree topologies. There are $(2n-5)!! = 3 \cdot 5 \cdots (2n-5)$ unrooted binary trees on $n \geq 3$ labeled leaves.
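The simplest search procedure is exhaustive enumeration, which is feasible only for a handful of taxa. The sketch below is one way to do it, under assumptions: topologies are generated by inserting leaves one at a time onto every edge, each unrooted tree is represented by rooting it at the first sequence (the Fitch count does not depend on where the root is placed), and all function names are hypothetical.

```python
def insert_leaf(tree, leaf):
    """Yield every tree obtained by attaching leaf to one edge of tree."""
    yield (tree, leaf)  # attach above the current (sub)tree root
    if not isinstance(tree, str):
        left, right = tree
        for new_left in insert_leaf(left, leaf):
            yield (new_left, right)
        for new_right in insert_leaf(right, leaf):
            yield (left, new_right)


def all_rooted_trees(leaves):
    """Yield all rooted binary trees (nested pairs) on the given leaves."""
    if len(leaves) == 1:
        yield leaves[0]
        return
    for tree in all_rooted_trees(leaves[:-1]):
        yield from insert_leaf(tree, leaves[-1])


def fitch_score(tree, site):
    """Return (state_set, changes) for one site; leaves are sequences."""
    if isinstance(tree, str):
        return {tree[site]}, 0
    (r_j, c_j), (r_k, c_k) = (fitch_score(child, site) for child in tree)
    common = r_j & r_k
    if common:
        return common, c_j + c_k
    return r_j | r_k, c_j + c_k + 1


def parsimony(tree, length):
    """Sum the per-site Fitch counts over all sites."""
    return sum(fitch_score(tree, site)[1] for site in range(length))


def best_tree(seqs):
    """Exhaustively search all unrooted topologies on seqs."""
    first, rest = seqs[0], seqs[1:]
    candidates = [(first, tree) for tree in all_rooted_trees(rest)]
    return min(candidates, key=lambda tree: parsimony(tree, len(first)))


best = best_tree(["AAG", "AAA", "GGA", "AGA"])
print(best, parsimony(best, 3))
```

For the four sequences of the earlier example, the search visits the three possible topologies and returns the one that groups AAG with AAA and GGA with AGA, at a cost of three changes.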

19 Exploring the Space of Trees [Table: taxa (n) vs. # trees; surviving fragment: ,405,375]
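The growth of the search space can be tabulated with a short sketch. It assumes the standard count of unrooted binary trees on n labeled leaves, the product of the odd numbers from 3 to 2n-5; the function name `num_unrooted_trees` and the chosen values of n are illustrative and are not taken from the slide's table.

```python
def num_unrooted_trees(n):
    """Number of unrooted binary trees on n >= 3 labeled leaves."""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd numbers 3, 5, ..., 2n - 5
        count *= k
    return count


for n in (4, 5, 6, 8, 10):
    print(f"{n:>2} taxa: {num_unrooted_trees(n):,} trees")
```

Under this formula there are 3 trees on 4 taxa, 15 on 5, 105 on 6, 10,395 on 8, and 2,027,025 on 10.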

20 Does This Imply Big MP Is Hard? Not necessarily: there could be some smarter way to zoom directly to the best topology. But we will show hardness of Big MP by a (simple) reduction from vertex cover (VC).

21 Big MP Is NP-Hard! First, define VC and VC for triangle-free graphs. Then: 1. You will show a poly-time reduction from VC to VC for triangle-free graphs as part of the home assignment (easy). 2. In class, I will show a poly-time reduction from VC for triangle-free graphs to Big MP (old style, white board proof). This establishes NP-hardness of Big MP.
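The two definitions used in the reduction can be made concrete with a small sketch. It is a brute-force illustration only, not part of either reduction; the function names and the two example graphs are assumptions. A vertex cover is a set of vertices touching every edge, and a graph is triangle-free if no three vertices are pairwise adjacent.

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in cover."""
    return all(u in cover or v in cover for u, v in edges)


def min_vertex_cover(vertices, edges):
    """Smallest vertex cover, found by trying subsets in size order."""
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if is_vertex_cover(edges, set(subset)):
                return set(subset)


def is_triangle_free(vertices, edges):
    """True if no three vertices are pairwise adjacent."""
    adjacent = {frozenset(edge) for edge in edges}
    return not any(
        all(frozenset(pair) in adjacent for pair in combinations(trio, 2))
        for trio in combinations(vertices, 3)
    )


square = [(1, 2), (2, 3), (3, 4), (4, 1)]    # 4-cycle, triangle-free
triangle = [(1, 2), (2, 3), (1, 3)]

print(min_vertex_cover([1, 2, 3, 4], square), is_triangle_free([1, 2, 3, 4], square))
print(min_vertex_cover([1, 2, 3], triangle), is_triangle_free([1, 2, 3], triangle))
```

The subset enumeration takes exponential time in the number of vertices, which is consistent with VC being NP-hard in general.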