Probabilistic Approaches to Phylogeny Wouter Van Gool & Thomas Jellema.

Contents
Introduction/Overview (Wouter)
Probabilistic Models of Evolution (Wouter)
Calculating the Likelihood (Wouter)
Pause
Evolution Demo (Thomas)
Using the likelihood for inference (Thomas)
Phylogeny Demo (Thomas)
Summary/Conclusion (Thomas)
Questions

8.1 Introduction
Goal: formulate probabilistic models for phylogeny and infer trees from sets of sequences.
Aim of probability-based phylogeny: rank trees according to
- likelihood P(data | tree)
- posterior probability P(tree | data)

8.1 Introduction
Compute the probability of a set of data given a tree: $P(x^{*} \mid T, t^{*})$
- $x^{*}$: set of n sequences $x^j$ (j = 1…n)
- $T$: tree with n leaves, with sequence j at leaf j
- $t^{*}$: edge lengths of the tree

8.1 Introduction Example

8.2 Probabilistic Models of Evolution
Given the sequences at the leaves $x^1 \dots x^n$:
1. Pick a model of evolution: $P(x \mid y, t)$, $P(x)$
2. Enumerate all possible tree topologies with n leaves
3. For each T, maximize the likelihood $P(x^{*} \mid T, t^{*})$ over all possible edge lengths $t^{*}$
4. Pick the T and $t^{*}$ that have the largest probability

8.2 Probabilistic Models of Evolution
Simplifying assumptions:
1. Single-base substitutions only, so ungapped alignments only
2. Each base evolves independently, with the same model of evolution based on a substitution matrix

8.2 Probabilistic Models of Evolution
Substitution matrices for phylogeny
Many important families of substitution matrices are multiplicative: $S(t)S(s) = S(t+s)$
Substitution matrices used in phylogeny:
- Jukes & Cantor model [1969]
- Kimura DNA model [1980]
- PAM matrix [1978]

8.2 Probabilistic Models of Evolution Jukes-Cantor Model
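The slide's own formula is not in the transcript. As a minimal sketch, assuming the usual Jukes-Cantor form (every base changes to each other base at the same rate), the matrix can be built and the multiplicative property S(t)S(s) = S(t+s) checked numerically. The names `jukes_cantor` and `matmul` and the rate parameter `alpha` are illustrative assumptions, not taken from the presentation.

```python
import math

BASES = "ACGT"

def jukes_cantor(t, alpha=1.0):
    """Return the 4x4 Jukes-Cantor matrix S(t) as nested lists (rows: from, columns: to)."""
    e = math.exp(-4.0 * alpha * t)
    same = 0.25 * (1.0 + 3.0 * e)   # probability that a base is unchanged after time t
    diff = 0.25 * (1.0 - e)         # probability of ending at one specific other base
    return [[same if a == b else diff for b in BASES] for a in BASES]

def matmul(m1, m2):
    """Plain matrix product of two square nested-list matrices."""
    n = len(m1)
    return [[sum(m1[i][k] * m2[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Check S(t) S(s) = S(t + s) for one arbitrary pair of times.
t, s = 0.25, 0.1
product = matmul(jukes_cantor(t), jukes_cantor(s))
direct = jukes_cantor(t + s)
max_gap = max(abs(product[i][j] - direct[i][j]) for i in range(4) for j in range(4))
print(max_gap)
```

The printed gap should be at the level of floating-point rounding, which is what multiplicativity predicts.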

8.2 Probabilistic Models of Evolution Kimura DNA model
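The Kimura formula itself did not survive in the transcript. The sketch below assumes the common two-parameter form, with a transition rate `alpha` (A<->G, C<->T) and a transversion rate `beta`; the function name `kimura` and the default rates are hypothetical choices for illustration.

```python
import math

BASES = "ACGT"
PURINES = {"A", "G"}

def kimura(t, alpha=1.0, beta=0.5):
    """Return the 4x4 Kimura two-parameter matrix S(t) as nested lists."""
    e_tv = math.exp(-4.0 * beta * t)
    e_mix = math.exp(-2.0 * (alpha + beta) * t)
    unchanged = 0.25 * (1.0 + e_tv + 2.0 * e_mix)
    transition = 0.25 * (1.0 + e_tv - 2.0 * e_mix)   # purine<->purine or pyrimidine<->pyrimidine
    transversion = 0.25 * (1.0 - e_tv)               # purine<->pyrimidine

    def entry(a, b):
        if a == b:
            return unchanged
        if (a in PURINES) == (b in PURINES):
            return transition
        return transversion

    return [[entry(a, b) for b in BASES] for a in BASES]

k = kimura(0.1, alpha=2.0, beta=0.5)
print(k[0][2], k[0][1])  # A->G (transition) versus A->C (transversion)
```

With `alpha` equal to `beta` the three entries collapse to the two Jukes-Cantor values, which is a quick sanity check on the parameterisation.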

8.2 Probabilistic Models of Evolution PAM matrix model


8.3 Calculating the likelihood for ungapped alignments Example: The likelihood of two nucleotide sequences
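As a rough illustration of the two-sequence case, the sketch below assumes a tree with one root and two leaves, sums over the unknown root base at every site, and multiplies the per-site results. It uses a Jukes-Cantor model with a uniform root distribution `q`; `jc_prob` and `pair_likelihood` are hypothetical helper names.

```python
import math

BASES = "ACGT"

def jc_prob(child, parent, t, alpha=1.0):
    """Jukes-Cantor probability P(child | parent, t) for a single base."""
    e = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * e) if child == parent else 0.25 * (1.0 - e)

def pair_likelihood(x1, x2, t1, t2, q=0.25):
    """Likelihood of two aligned, ungapped sequences hanging off a common root."""
    if len(x1) != len(x2):
        raise ValueError("sequences must have equal length (ungapped alignment)")
    likelihood = 1.0
    for b1, b2 in zip(x1, x2):
        # Sum over the unobserved root base a at this site.
        likelihood *= sum(q * jc_prob(b1, a, t1) * jc_prob(b2, a, t2) for a in BASES)
    return likelihood

print(pair_likelihood("ACGT", "ACGA", 0.1, 0.2))
```

Because this model is reversible and multiplicative, the result depends only on `t1 + t2`, so moving the root along the path between the two leaves does not change the value.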

8.3 Calculating the likelihood for ungapped alignments
Likelihood for the general case
Here node $\alpha(i)$ is the ancestor of node i. A fixed set of values $t_1 \dots t_{2n-1}$ and a topology T are required.
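The formula on this slide is missing from the transcript. A plausible reconstruction for a single site, assuming leaves are numbered 1…n, internal nodes n+1…2n-1, and $P(x^{\mathrm{root}})$ is the base distribution at the root, sums over all unobserved internal characters:

```latex
P(x^1, \dots, x^n \mid T, t^{*})
  = \sum_{x^{n+1}, \dots, x^{2n-1}}
    P\bigl(x^{\mathrm{root}}\bigr)
    \prod_{i \neq \mathrm{root}} P\bigl(x^i \mid x^{\alpha(i)}, t_i\bigr)
```

Each factor in the product is the probability of the character at node i given the character at its ancestor $\alpha(i)$ and the length $t_i$ of the edge between them.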

8.3 Calculating the likelihood for ungapped alignments
Felsenstein's recursive algorithm
Define a table of probabilities $F_{k,a}$ for each site u, all tree nodes k and all input characters a:
$F_{k,a}$ = probability, at site u, of the leaves in the subtree below node k, assuming the character at node k for site u is a

8.3 calculating the likelihood for ungapped alignments Felsenstein’s recursive algorithm
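The recursion can be sketched as follows, assuming a Jukes-Cantor model, a uniform root distribution `q`, and a hypothetical tree encoding in which a leaf is a sequence name and an internal node is a tuple `(left, t_left, right, t_right)`. None of these names come from the presentation.

```python
import math

BASES = "ACGT"

def jc_prob(child, parent, t, alpha=1.0):
    """Jukes-Cantor probability P(child | parent, t) for a single base."""
    e = math.exp(-4.0 * alpha * t)
    return 0.25 * (1.0 + 3.0 * e) if child == parent else 0.25 * (1.0 - e)

def subtree_table(node, seqs, u):
    """Return {a: F[k, a]} for the subtree rooted at `node`, at site u."""
    if isinstance(node, str):
        # Leaf: probability 1 for the observed character, 0 otherwise.
        observed = seqs[node][u]
        return {a: 1.0 if a == observed else 0.0 for a in BASES}
    left, t_left, right, t_right = node
    f_left = subtree_table(left, seqs, u)
    f_right = subtree_table(right, seqs, u)
    table = {}
    for a in BASES:
        # Sum over the child characters independently for each daughter edge.
        from_left = sum(jc_prob(b, a, t_left) * f_left[b] for b in BASES)
        from_right = sum(jc_prob(c, a, t_right) * f_right[c] for c in BASES)
        table[a] = from_left * from_right
    return table

def tree_likelihood(tree, seqs, q=0.25):
    """Multiply the per-site likelihoods, each summed over the root character."""
    n_sites = len(next(iter(seqs.values())))
    likelihood = 1.0
    for u in range(n_sites):
        root_table = subtree_table(tree, seqs, u)
        likelihood *= sum(q * root_table[a] for a in BASES)
    return likelihood

# Toy data: three leaves, ((x1, x2), x3) with illustrative edge lengths.
seqs = {"x1": "ACGT", "x2": "ACGA", "x3": "ACTA"}
tree = (("x1", 0.1, "x2", 0.1), 0.05, "x3", 0.2)
print(tree_likelihood(tree, seqs))
```

For long alignments a practical implementation would usually accumulate log-likelihoods instead of a raw product, to avoid numerical underflow.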

8.3 Calculating the likelihood for ungapped alignments
Likelihood for the general case: overall algorithm
- Enumerate each tree topology T
- Enumerate sets of edge lengths $t^{*}$ (using some n-dimensional optimisation technique)
- Run Felsenstein's recursive algorithm for each site u and multiply the likelihoods
- Return the best T and $t^{*}$

8.3 Calculating the likelihood for ungapped alignments
Reversibility and independence of root position
- The score of the optimal tree is independent of the root position if and only if:
  - the substitution matrix is multiplicative
  - the substitution matrix is reversible
- A substitution matrix is reversible if for all a, b and t: $P(b \mid a, t)\, P(a) = P(a \mid b, t)\, P(b)$

Demo

8.4 Using the likelihood for inference
Maximum likelihood: the best tree "could be" the tree that maximises the likelihood. Finding it is computationally demanding.

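8.4 Using the likelihood for inference
Sampling from the posterior distribution: we use Bayes' rule to compute the posterior probability, $P(\text{tree} \mid \text{data}) = P(\text{data} \mid \text{tree})\, P(\text{tree}) / P(\text{data})$. This is the probability of a model given the data.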
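To make the use of Bayes' rule concrete, here is a small sketch over a handful of candidate models. The model names, priors and likelihood values are invented for illustration and are not the numbers from the presentation's example.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: posterior is proportional to prior times likelihood."""
    joint = {m: priors[m] * likelihoods[m] for m in priors}
    evidence = sum(joint.values())          # P(data), the normalising constant
    if evidence == 0.0:
        raise ValueError("data has zero probability under every model")
    return {m: joint[m] / evidence for m in joint}

# Hypothetical priors P(tree) and likelihoods P(data | tree).
priors = {"tree_1": 0.5, "tree_2": 0.3, "tree_3": 0.2}
likelihoods = {"tree_1": 1e-6, "tree_2": 4e-6, "tree_3": 1e-6}

post = posterior(priors, likelihoods)
for name, p in post.items():
    print(name, round(p, 3))
```

Only the ratios of the likelihoods matter here, which is why sampling methods can work with unnormalised posterior values.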

8.4 Using the likelihood for inference Example Model name prior chance of model data Model % A Model % A 50% B Model % B

8.4 Using the likelihood for inference Metropolis algorithm It samples from the trees with probabilities given by their posterior distribution. It is a sampling procedure that generates a sequence of trees, each from the previous one.

8.4 Using the likelihood for inference Metropolis algorithm
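A minimal sketch of the Metropolis idea, assuming a tiny discrete set of candidate trees with known unnormalised posterior weights and a symmetric proposal that picks any candidate uniformly. Real tree samplers propose local changes to the current tree instead; the names and weights below are hypothetical.

```python
import random
from collections import Counter

def metropolis(weights, n_steps, seed=0):
    """Generate a chain of states whose long-run frequencies follow `weights`."""
    rng = random.Random(seed)
    states = list(weights)
    current = states[0]
    samples = []
    for _ in range(n_steps):
        proposal = rng.choice(states)               # symmetric proposal distribution
        ratio = weights[proposal] / weights[current]
        if rng.random() < min(1.0, ratio):          # accept with probability min(1, ratio)
            current = proposal
        samples.append(current)                     # a rejected move repeats the current state
    return samples

# Unnormalised posterior weights (prior times likelihood) for three candidate trees.
weights = {"tree_1": 0.5, "tree_2": 1.2, "tree_3": 0.2}
samples = metropolis(weights, 20000)
counts = Counter(samples)
freqs = {s: counts[s] / len(samples) for s in weights}
print(freqs)
```

Each new state is generated from the previous one, and the acceptance rule needs only the ratio of posterior values, so the normalising constant P(data) never has to be computed.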

8.4 Using the likelihood for inference
Metropolis algorithm: a proposal distribution
[Figure sequence: tree drawn against the axes "Time from root" and "Order of traversal"]

8.4 Using the likelihood for inference
Other phylogenetic uses of sampling
[Figure sequence; sequence labels shown: AATCAATT, AATC, AATTTTAA, TCAA, AAAA, TTAATCAA]

8.4 Using the likelihood for inference
Other phylogenetic uses of sampling: inferring the history of populations
- Probability density of a coalescence in time
- Probability of a coalescence between any pair

8.4 Using the likelihood for inference
Inferring the history of populations
When n is large and p is close to 0, the binomial distribution with parameters n and p can be approximated by a Poisson distribution with mean n*p (here evaluated at x = 1).
- The probability of a coalescence at the end of the period $t_k$
- The total probability of the tree
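The quality of the Poisson approximation at x = 1 can be checked numerically. The values of `n` and `p` below are arbitrary illustrations, since the slide's own numbers are not in the transcript.

```python
import math

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)

def poisson_pmf(x, mean):
    """Poisson probability of x events with the given mean."""
    return math.exp(-mean) * mean**x / math.factorial(x)

n, p = 1000, 0.002            # large n, p close to 0
exact = binomial_pmf(1, n, p)
approx = poisson_pmf(1, n * p)
print(exact, approx)
```

The two values agree closely for these inputs, and the gap widens as p grows or n shrinks.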

8.4 Using the likelihood for inference
The bootstrap
The bootstrap can give an approximation to the posterior. It takes too much labour, so it is an unattractive alternative to sampling. The bootstrap is probably more useful for non-probabilistic tree-building methods.
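As a sketch of the resampling step behind the bootstrap, the code below draws alignment columns with replacement to build replicate data sets. Tree building on each replicate is left out, and the toy sequences and the name `bootstrap_alignment` are assumptions for illustration.

```python
import random

def bootstrap_alignment(seqs, rng):
    """Resample alignment columns with replacement; every sequence uses the same columns."""
    n_sites = len(next(iter(seqs.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(s[u] for u in columns) for name, s in seqs.items()}

rng = random.Random(0)
seqs = {"x1": "AATCAATT", "x2": "AATTTTAA", "x3": "TTAATCAA"}   # toy ungapped alignment
replicates = [bootstrap_alignment(seqs, rng) for _ in range(100)]
print(replicates[0])
```

A tree would then be inferred from each replicate, and the fraction of replicates supporting a given feature of the tree serves as a measure of confidence in it. Repeating the tree search once per replicate is where the extra labour comes from.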

Demo

Conclusion
- The methods presented today can be used to find the most probable tree.
- Most of the methods are computationally demanding.
- More realistic evolutionary models are explained on Thursday.

Questions?