Maximum Parsimony.

Character-state Approaches
- Maximum parsimony (MP) and maximum likelihood (ML) are the two major character-state (discrete) approaches to constructing phylogenetic trees.

MP Method
- MP chooses the tree(s) that require the fewest evolutionary changes.
- In MP analysis, synapomorphic (shared derived) characters provide the basis for identifying clades.
- Autapomorphic characters (derived states unique to a single taxon) do not contribute to the topology of the MP tree(s).
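As a rough illustration of which characters can and cannot shape the topology, the sketch below sorts alignment columns into constant, informative and uninformative sites. It uses the four-taxon alignment from these slides and the common working rule that a site is parsimony-informative when at least two states each occur in at least two taxa. That rule is only a proxy for synapomorphy, and the function names are hypothetical.

```python
from collections import Counter

# Example alignment copied from the slides (Taxon-1 .. Taxon-4).
ALIGNMENT = {
    "Taxon-1": "ATATT",
    "Taxon-2": "ATCGT",
    "Taxon-3": "GCAGT",
    "Taxon-4": "GCCGT",
}


def classify_site(column):
    """Classify one alignment column (a string with one state per taxon)."""
    counts = Counter(column)
    if len(counts) == 1:
        return "constant"
    # Only states carried by at least two taxa can group taxa together.
    shared_states = [state for state, n in counts.items() if n >= 2]
    if len(shared_states) >= 2:
        return "informative"
    # Variable, but every minority state is a singleton (autapomorphy-like).
    return "uninformative"


def classify_alignment(alignment):
    """Return the class of every site, in column order."""
    columns = zip(*alignment.values())
    return [classify_site("".join(column)) for column in columns]


print(classify_alignment(ALIGNMENT))
```

Under this rule sites 1 to 3 can discriminate among trees, while site 4 (a change unique to Taxon-1) and the constant site 5 cannot.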

MP Method
For each site, the goal is to reconstruct the evolution of that site on a tree subject to the constraint of invoking the fewest possible evolutionary changes.

Taxon-1  ATATT
Taxon-2  ATCGT
Taxon-3  GCAGT
Taxon-4  GCCGT

[Figure: two reconstructions of site 1 (A, A, G, G for taxa 1 to 4) on a four-taxon tree, one requiring 1 change and one requiring 5 changes.]

MP Method
[Figure: reconstructions of sites 2 to 5 of the example alignment on a four-taxon tree.]
- Site 2 (T, T, C, C): 1 step
- Site 3 (A, C, A, C): 2 steps
- Site 4 (T, G, G, G): 1 step
- Site 5 (T, T, T, T): no step

MP Method
Changes required at each site to fit the three possible trees for 4 sequences:

Tree            Site 1  Site 2  Site 3  Site 4  Site 5  Total
((1,2),(3,4))      1       1       2       1       0      5
((1,3),(2,4))      2       2       1       1       0      6
((1,4),(2,3))      2       2       2       1       0      7

The tree ((1,2),(3,4)) requires the fewest changes and is therefore the most parsimonious.
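The per-site counts for the three four-taxon trees can be reproduced with Fitch's set-based counting procedure. The sketch below is a minimal version that assumes equal-cost, unordered changes; each unrooted tree is written as a nested tuple rooted on its internal edge, which does not affect the count under this criterion. The function names are hypothetical.

```python
# Example alignment from the slides, keyed by taxon number.
ALIGNMENT = {1: "ATATT", 2: "ATCGT", 3: "GCAGT", 4: "GCCGT"}

# The three possible topologies for four taxa, as nested tuples.
TREES = {
    "((1,2),(3,4))": ((1, 2), (3, 4)),
    "((1,3),(2,4))": ((1, 3), (2, 4)),
    "((1,4),(2,3))": ((1, 4), (2, 3)),
}


def fitch(node, site):
    """Return (state_set, changes) for the subtree at `node` at column `site`."""
    if not isinstance(node, tuple):           # leaf: a taxon number
        return {ALIGNMENT[node][site]}, 0
    left_set, left_cost = fitch(node[0], site)
    right_set, right_cost = fitch(node[1], site)
    common = left_set & right_set
    if common:                                # children agree: no extra change
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def site_scores(tree):
    """Minimum number of changes at every site for one tree."""
    n_sites = len(next(iter(ALIGNMENT.values())))
    return [fitch(tree, site)[1] for site in range(n_sites)]


for name, tree in TREES.items():
    scores = site_scores(tree)
    print(name, scores, "total =", sum(scores))
```

Summing the per-site counts gives the tree length, and the tree with the smallest length is kept.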

Search for Most Parsimonious Trees
- The most parsimonious trees can be found by exhaustive or heuristic searches.
- An exhaustive search is guaranteed to find the most parsimonious tree(s) but is time-consuming.
- A heuristic search reduces the search time, but the trees it returns may be suboptimal.
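To give a feel for why exhaustive searches become impractical, the sketch below counts the unrooted, fully bifurcating trees for n labelled taxa using the standard formula (2n-5)!! = 3 x 5 x ... x (2n-5). The function name is hypothetical.

```python
def count_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees on n labelled taxa: (2n-5)!!."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):   # 3 * 5 * ... * (2n - 5)
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

Four taxa give the 3 trees scored in the example, but ten taxa already give more than two million, which is why heuristic searches are used for larger data sets.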

Parsimony Algorithms
Parsimony approaches comprise a family of related methods with varying assumptions about how character-state transformation occurs:
- Fitch parsimony
- Wagner parsimony
- Dollo parsimony
- Camin-Sokal parsimony

Fitch Parsimony
- Allows free reversibility of character states in the tree, with changes in any direction equally likely.
- Characters may be binary or unordered multistate.
[Figure: transformations among State-One, State-Two and State-Three.]

Wagner Parsimony
- Allows free reversibility of character states in the tree, with changes in either direction equally likely.
- Characters may be binary or ordered multistate; transformations between the states of a multistate character must pass through the intervening states.
[Figure: transformations among State-One, State-Two and State-Three.]
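The difference between unordered (Fitch) and ordered (Wagner) characters comes down to how many steps a single transformation costs. The sketch below assumes the three states are coded 0, 1, 2 in their natural order; the coding and the function names are illustrative only.

```python
# Hypothetical coding of the three states shown on the slides.
STATES = {"State-One": 0, "State-Two": 1, "State-Three": 2}


def fitch_cost(i, j):
    """Unordered states: any change between different states costs one step."""
    return 0 if i == j else 1


def wagner_cost(i, j):
    """Ordered states: a change passes through every intervening state."""
    return abs(i - j)


one, three = STATES["State-One"], STATES["State-Three"]
print("Fitch  State-One -> State-Three:", fitch_cost(one, three))
print("Wagner State-One -> State-Three:", wagner_cost(one, three))
```

Both costs are symmetric, which is what free reversibility means here: going from State-Three back to State-One costs the same as the forward change.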

Dollo Parsimony
- Dollo optimization accommodates evolutionary scenarios in which it is considered most plausible a priori that each apomorphic state arose only once, so that all homoplasy must be accounted for by secondary loss (reversion).
- Multiple reversions to the ancestral condition are allowed.
- Appropriate when the probabilities of change among character states are highly asymmetric, e.g. when loss of a particular restriction site is more likely than its gain.

Comparison Dollo - Fitch
[Figure: a binary character (states 0 and 1) mapped onto a tree under Fitch optimization and under Dollo optimization, with the apomorphic state and the reversal marked.]

Camin-Sokal Parsimony
- Assumes that all evolutionary changes are irreversible.
- Goes beyond Dollo by disallowing reversions to the ancestral condition.
- Not widely used with genetic data because most molecular characters probably violate this assumption.
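The effect of forbidding reversals can be seen by scoring one binary character under two cost schemes with a generic step-cost (Sankoff-style) dynamic programme. This is a sketch only: the tree, the character states and the function names are invented for illustration, the root is assumed to carry the ancestral state 0, and Dollo's single-origin constraint is not modelled.

```python
import math

# Hypothetical rooted tree: outgroup "o" plus a nested ingroup.
TREE = ("o", ("e", ("d", ("c", ("a", "b")))))
# Hypothetical binary character: 0 = ancestral, 1 = derived.
STATES = {"o": 0, "e": 1, "d": 1, "c": 1, "a": 1, "b": 0}


def reversible_cost(parent, child):
    """Fitch-style cost for a binary character: any change costs 1."""
    return 0 if parent == child else 1


def irreversible_cost(parent, child):
    """Camin-Sokal-style cost: 0 -> 1 costs 1, 1 -> 0 is forbidden."""
    if parent == child:
        return 0
    return 1 if (parent, child) == (0, 1) else math.inf


def sankoff(node, cost):
    """Return [best cost if node has state 0, best cost if it has state 1]."""
    if isinstance(node, str):   # leaf: only the observed state is possible
        return [0 if STATES[node] == s else math.inf for s in (0, 1)]
    child_tables = [sankoff(child, cost) for child in node]
    return [
        sum(min(cost(s, t) + table[t] for t in (0, 1)) for table in child_tables)
        for s in (0, 1)
    ]


def tree_length(cost, root_state=0):
    """Length of the tree with the root fixed to the ancestral state."""
    return sankoff(TREE, cost)[root_state]


print("reversible changes:  ", tree_length(reversible_cost))
print("irreversible changes:", tree_length(irreversible_cost))
```

With reversals allowed, one gain on the ingroup stem and one loss in taxon b explain the data. Without reversals, every ancestor of b must stay at state 0, so the derived state has to arise separately in e, d, c and a.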

Long Branch Attraction
[Figure: four-taxon tree of A, B, C and D with long edges leading to A and C.]
- The edges leading to sequences/taxa A and C are long relative to the other branches in the tree, reflecting the greater number of substitutions that have occurred along those two edges.
- Long branch attraction occurs when rates of evolution vary considerably among sequences, or when the sequences being analysed are highly divergent.

Long Branch Attraction
If the internal edge is short relative to the long terminal edges, then by chance alone A and C may acquire the same nucleotide independently. These convergences may outweigh the sites changing along the internal edge, and hence by the parsimony criterion the tree ((A,C),(B,D)) would be favoured. How to overcome this?

Long Branch Attraction
How to overcome long branch attraction? One way to reduce the effect of long edges is to add sequences/taxa that join onto those edges, thus breaking them up.
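The attraction between long edges described in these slides can be illustrated with a small simulation. The sketch assumes that the true tree is ((A,B),(C,D)), that A and C sit on long edges, and that every edge has a fixed probability of replacing the nucleotide with a different one chosen uniformly. The probabilities, the seed and the function names are illustrative assumptions, not values from the slides.

```python
import random

NUCLEOTIDES = "ACGT"


def evolve(state, p_change, rng):
    """With probability p_change, replace state by a different nucleotide."""
    if rng.random() < p_change:
        return rng.choice([n for n in NUCLEOTIDES if n != state])
    return state


def simulate_site(p_long, p_short, rng):
    """One site on the assumed true tree ((A,B),(C,D)); A and C are long."""
    x = rng.choice(NUCLEOTIDES)       # internal node joining A and B
    y = evolve(x, p_short, rng)       # internal node joining C and D
    a = evolve(x, p_long, rng)
    b = evolve(x, p_short, rng)
    c = evolve(y, p_long, rng)
    d = evolve(y, p_short, rng)
    return a, b, c, d


def count_support(n_sites, p_long, p_short, seed=0):
    """Count the informative site patterns that favour each topology."""
    rng = random.Random(seed)
    support = {"((A,B),(C,D))": 0, "((A,C),(B,D))": 0, "((A,D),(B,C))": 0}
    for _ in range(n_sites):
        a, b, c, d = simulate_site(p_long, p_short, rng)
        if a == b and c == d and a != c:
            support["((A,B),(C,D))"] += 1
        elif a == c and b == d and a != b:
            support["((A,C),(B,D))"] += 1
        elif a == d and b == c and a != b:
            support["((A,D),(B,C))"] += 1
    return support


print(count_support(n_sites=5000, p_long=0.6, p_short=0.05, seed=1))
```

With edge probabilities this unequal, sites where A and C share a state by convergence are expected to outnumber sites that changed on the short internal edge, so parsimony tends to group A with C even though the data were generated on ((A,B),(C,D)).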