
7. Bayesian phylogenetic analysis using MrBayes
UST, Jeong Dageum, 2010.05.24
Thomas Bayes (1702-1761)
The Phylogenetic Handbook, Section III: Phylogenetic inference
Posterior = (Prior * Likelihood) / Normalizing constant

7.1 Introduction
7.2 Bayesian phylogenetic inference
7.3 Markov chain Monte Carlo sampling
7.4 Burn-in, mixing and convergence
7.5 Metropolis coupling
7.6 Summarizing the results
7.7 An introduction to phylogenetic models
7.8 Bayesian model choice and model averaging
7.9 Prior probability distributions

7.1 Introduction
Who will win next year's world championship in ice hockey? Sweden?!
Seven countries regularly contend for the title: Russia, Canada, Finland, the Czech Republic, Sweden, Slovakia, and the United States. Treated as equally likely, that gives Sweden a prior chance of 1:7 (about 0.14) of winning gold. Over the last 15 years, Sweden has in fact won twice: 2:15, or about 0.13. Similar reasoning gives the chances of reaching the final or the semifinal.

7.1 Introduction
Bayesian approach: Bayesian inference is just a mathematical formalization of a decision process that most of us use without reflecting on it.
Posterior = (Prior * Likelihood) / Normalizing constant

7.1 Introduction
Forward probability: an urn contains white balls in proportion p and black balls in proportion 1 - p (for example, 50:50). Knowing p, we can compute the probability of drawing a white balls and b black balls.
The converse problem: given the observed counts a and b, what can we say about p?

7.1 Introduction
f(p | a, b) = ? This is the reverse probability problem: we know a and b; what is the probability of a particular value of p? Answering it requires prior beliefs about the value of p.

7.1 Introduction
Box 7.1 Probability distributions (choosing a prior)
Probability mass function: a function describing the probability of a discrete random variable (e.g., a die).
Probability density function: the equivalent function for a continuous variable; the value of this function is not itself a probability.
Exponential distribution: a better choice for a vague prior on branch lengths.

7.1 Introduction
Box 7.1 Probability distributions (choosing a prior), continued
Gamma distribution: two parameters (shape parameter α, scale parameter β). For small values of α the distribution is L-shaped and the variance is large; for high values of α it resembles a normal distribution.
Beta distribution: denoted Beta(α1, α2), it describes the probability of two proportions, with the parameters acting as weights.

7.1 Introduction
Posterior probability distribution. By Bayes' theorem,
f(p | a, b) = f(p) f(a, b | p) / f(a, b).
We can calculate f(a, b | p), and we can specify f(p). How do we calculate f(a, b)? By integrating f(p) f(a, b | p) over all possible values of p; the denominator is a normalizing constant.
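As a concrete illustration of this calculation (my own sketch, not from the chapter): with a binomial likelihood for the counts a and b and a flat Beta(1, 1) prior on p, the beta prior is conjugate, so the normalizing integral has a closed form and the posterior is simply Beta(a + 1, b + 1). In Python, with hypothetical counts:

    from scipy.stats import beta

    a, b = 7, 3                      # hypothetical counts: 7 white, 3 black draws
    posterior = beta(a + 1, b + 1)   # f(p | a, b) under a flat Beta(1, 1) prior

    print(posterior.mean())          # posterior mean of p
    print(posterior.interval(0.95))  # 95% credibility interval for p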

7.2 Bayesian phylogenetic inference
P(Tree | Data) = P(Data | Tree) P(Tree) / P(Data)
Posterior = (Likelihood * Prior) / Normalizing constant

7.2 Bayesian phylogenetic inference
X: the matrix of aligned sequences (the data). Θ: the parameters, Θ = (τ, υ, substitution model parameters), where τ is the topology and υ the branch lengths on the tree. X (data) is fixed; Θ (parameters) is random. (Here the Jukes-Cantor substitution model, which has no free substitution parameters, is assumed.)

7.2 Bayesian phylogenetic inference
Parameter space: each cell of the table corresponds to a particular topology and a particular set of branch lengths on that topology. Summing the joint probabilities along one axis of the table yields the marginal probabilities for the corresponding parameter. In Bayesian inference there is no need to decide on the parameters of interest before performing the analysis.

7.3 Markov chain Monte Carlo sampling (for parameter sampling)
Markov chain Monte Carlo steps:
1. Start at an arbitrary point (θ).
2. Make a small random move (to θ*).
3. Calculate the height ratio (r) of the new state (θ*) to the old state (θ):
   r = (prior ratio) * (likelihood ratio) * (proposal ratio), where the proposal ratio is f(θ | θ*) / f(θ* | θ).
   (a) r >= 1: the new state is accepted.
   (b) r < 1: the new state is accepted with probability r; if it is rejected, stay in the old state.
4. Go to step 2.
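A minimal Python sketch of these four steps (the toy target density, window width, and chain length are my assumptions; a real phylogenetic MCMC moves through trees and model parameters rather than a single scalar):

    import numpy as np

    rng = np.random.default_rng(1)

    def log_posterior(theta):
        return -0.5 * theta ** 2     # toy unnormalized target (standard normal)

    def metropolis(n_steps, window=1.0):
        theta = 0.0                  # step 1: start at an arbitrary point
        samples = []
        for _ in range(n_steps):
            theta_star = theta + rng.uniform(-window / 2, window / 2)  # step 2
            # step 3: the proposal is symmetric, so the proposal ratio is 1
            # and r reduces to the (prior * likelihood) ratio of new to old
            r = np.exp(log_posterior(theta_star) - log_posterior(theta))
            if rng.uniform() < min(1.0, r):
                theta = theta_star   # accepted
            samples.append(theta)    # if rejected, we stay in the old state
        return np.array(samples)     # step 4: repeat from step 2

    chain = metropolis(10_000)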

7.3 Markov chain Monte Carlo sampling
Box 7.2 Proposal mechanisms (for changing continuous variables). Each proposal involves 1) a proposal step and 2) an acceptance/rejection step.
Sliding window proposal: tuning parameter ω. Large ω: more radical proposals and lower acceptance rates. Small ω: more modest changes and higher acceptance rates.
Normal proposal (similar to the sliding window): σ² determines how drastic the new proposals are and how often they will be accepted.
Multiplier proposal: its tuning parameter likewise determines how drastic the new proposals are and how often they will be accepted.
Beta and Dirichlet proposals: used for parameters that are proportions.
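Two of these proposal mechanisms sketched in Python (the tuning values are illustrative assumptions; each function returns the proposed value together with the log proposal ratio that enters the acceptance ratio r above):

    import numpy as np

    rng = np.random.default_rng(2)

    def sliding_window(x, w=0.5):
        # Symmetric window of width w around x: proposal ratio 1, log ratio 0.
        return x + rng.uniform(-w / 2, w / 2), 0.0

    def multiplier(x, lam=1.0):
        # x* = m * x with m = exp(lam * (u - 0.5)); suitable for positive
        # parameters such as branch lengths. The proposal (Hastings) ratio is m.
        m = np.exp(lam * (rng.uniform() - 0.5))
        return x * m, np.log(m)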

7.4 Burn-in, mixing and convergence (assessing the performance of an MCMC run)
[Figure: trace plot of sampled values against generation. The initial samples (the burn-in) are discarded; the plot is used to confirm convergence and to inspect mixing behavior.]

7.4 Burn-in, mixing and convergence
The mixing behavior of a Metropolis sampler can be adjusted using its tuning parameter.
If ω is too small, most proposals are accepted, but the chain takes a long time to cover the whole region of interest: poor mixing.
If ω is too large, most proposals are rejected, and the chain again takes a long time to cover the region: poor mixing.
An intermediate ω gives moderate acceptance rates: good mixing.

7.4 Burn-in, mixing and convergence
Convergence diagnostics help determine the quality of a sample from the posterior. Three different types of diagnostics:
(1) Examining autocorrelation times, effective sample sizes, and other measures of the behavior of single chains.
(2) Comparing samples from successive time segments of a single chain.
(3) Comparing samples from different runs.
In Bayesian MCMC sampling of phylogenetic problems, the tree topology is typically the most difficult parameter to sample. One approach is to focus on split frequencies instead: a split is a partition of the tips of the tree into two non-overlapping sets, and one calculates the average standard deviation of the split frequencies across runs.
Potential scale reduction factor (PSRF): compares the variance among runs with the variance within runs. As the chains converge, the variances become more similar and the PSRF approaches 1.0.
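A simplified sketch of the PSRF idea for one scalar parameter sampled by several independent runs (my rough version of the Gelman-Rubin diagnostic; MrBayes computes its own variant):

    import numpy as np

    def psrf(chains):
        chains = np.asarray(chains)                    # shape: (runs, samples per run)
        n = chains.shape[1]
        within = chains.var(axis=1, ddof=1).mean()     # W: mean within-run variance
        between = n * chains.mean(axis=1).var(ddof=1)  # B: between-run variance
        var_hat = (n - 1) / n * within + between / n   # pooled variance estimate
        return np.sqrt(var_hat / within)               # approaches 1.0 at convergence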

7.5 Metropolis coupling (to improve mixing)
When convergence is difficult or impossible to achieve, Metropolis coupling is a general technique to improve mixing: one cold chain is run alongside one or more heated chains.
Incremental heating scheme: T = 1 / (1 + λi), where i ∈ {0, 1, ..., k} for k heated chains, with i = 0 for the cold chain, and λ is the temperature factor (an intermediate value of λ works best).
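A small numerical sketch of this heating scheme (the λ value is an assumed example). Chain i samples the posterior raised to the power T_i, which flattens the landscape for the heated chains so they cross valleys between peaks more easily; occasional swaps between chains then let the cold chain jump:

    lam, k = 0.1, 3                              # temperature factor, heated chains
    temps = [1.0 / (1.0 + lam * i) for i in range(k + 1)]
    # temps[0] == 1.0 is the cold chain; here temps ~ [1.0, 0.909, 0.833, 0.769]

    def heated_log_posterior(log_post, i):
        return temps[i] * log_post               # chain i targets f(theta | X)^T_i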

7.6 Summarizing the results
Given the stationary phase of the chain and an adequate sample, we can compute an estimate of the marginal posterior distribution and summarize it using statistics. Bayesian statisticians typically report a 95% credibility interval.
The posterior distribution on topology and branch lengths is more difficult to summarize efficiently:
To illustrate the topological variance in the posterior, report the estimated number of topologies in various credible sets.
To give the frequencies of the most common splits, build a majority rule consensus tree.
The sampled branch lengths are even more difficult to summarize adequately: either display the distribution of branch length values separately, or pool the branch length samples that correspond to the same split.
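For a scalar parameter, summarizing the marginal posterior from the sampled chain is straightforward; a sketch (the burn-in fraction is an assumption, not a recommendation from the chapter):

    import numpy as np

    def summarize(samples, burnin_frac=0.25):
        kept = samples[int(len(samples) * burnin_frac):]  # discard the burn-in
        lo, hi = np.percentile(kept, [2.5, 97.5])         # 95% credibility interval
        return kept.mean(), (lo, hi)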

7.7 An introduction to phylogenetic models
A phylogenetic model has two parts:
1) A tree model: unrooted or rooted; strict or relaxed clock.
2) A substitution model, specified by its Q matrix, e.g. the general time-reversible (GTR) model.
The factor π_j corresponds to the stationary frequency of the receiving state; the factor r_ij determines the intensity of the exchange between pairs of states, controlling for the stationary state frequencies.
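A sketch of how a GTR rate matrix is assembled from these two factors, with q_ij = r_ij * π_j for i ≠ j and each row summing to zero (the exchangeabilities and frequencies below are made-up numbers for illustration):

    import numpy as np

    pi = np.array([0.25, 0.25, 0.25, 0.25])   # stationary frequencies (A, C, G, T)
    r = np.array([[0.0, 1.0, 4.0, 1.0],       # symmetric exchangeabilities r_ij
                  [1.0, 0.0, 1.0, 4.0],
                  [4.0, 1.0, 0.0, 1.0],
                  [1.0, 4.0, 1.0, 0.0]])

    Q = r * pi                                # q_ij = r_ij * pi_j (pi broadcast over columns)
    np.fill_diagonal(Q, -Q.sum(axis=1))       # diagonal set so each row sums to zero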

7.8 Bayesian model choice and model averaging
The model likelihood is the probability of the data given the chosen model after we have integrated out all parameters; it is the normalizing constant in Bayes' theorem. The Bayes factor is the ratio of the model likelihoods of two competing models.
Bayes factor comparisons are truly flexible:
Unlike likelihood ratio tests, there is no requirement for the models to be nested.
Unlike the Akaike Information Criterion and the (confusingly named) Bayesian Information Criterion, there is no need to correct for the number of parameters in the model.
To estimate the model likelihood, use the harmonic mean of the likelihoods sampled in the MCMC run.
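A sketch of the harmonic mean estimator from sampled log-likelihoods, using a log-sum-exp for numerical stability (this estimator is known to have high variance in practice; it is shown only because the text mentions it):

    import numpy as np

    def harmonic_mean_log_ml(log_likes):
        # log of n / sum_i exp(-logL_i), computed stably
        neg = -np.asarray(log_likes)
        m = neg.max()
        return np.log(len(neg)) - (m + np.log(np.exp(neg - m).sum()))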

7.9 Prior probability distributions: cautionary notes
The priors usually have a negligible influence on the posterior distribution, and the Bayesian approach typically handles weak data quite well. But when the data are weak, the chain can be attracted to regions of parameter space with extremely low likelihoods.

[Figure: schematic overview of the models implemented in MrBayes 3. Each box gives the available settings in normal font and, in italics, the program commands and command options needed to invoke those settings.]

PRACTICE
7.10 Introduction to MrBayes
Acquiring and installing the program
Getting started
Changing the size of the MrBayes window
Getting help

PRACTICE
7.11 A simple analysis
Quick start version
Getting data into MrBayes
Specifying a model
Setting the priors
Checking the model
Setting up the analysis
Running the analysis
When to stop the analysis
Summarizing samples of substitution model parameters
Summarizing samples of trees and branch lengths
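A bare-bones session of this kind, typed at the MrBayes prompt (the data file name and generation counts are placeholders; execute, lset, mcmc, sump, and sumt are standard MrBayes commands, and the bracketed notes are Nexus-style comments):

    execute primates.nex                  [load the aligned data, Nexus format]
    lset nst=6 rates=invgamma             [GTR model with gamma + invariant sites]
    mcmc ngen=1000000 samplefreq=100      [run the chains, sampling every 100 generations]
    sump                                  [summarize substitution model parameter samples]
    sumt                                  [summarize tree and branch length samples]

A common rule of thumb is to keep running until the average standard deviation of split frequencies reported during the run falls well below 0.01, indicating that the independent runs have converged on similar tree samples (see Section 7.4).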

PRACTICE
7.12 Analyzing a partitioned data set
Getting mixed data into MrBayes
Dividing the data into partitions
Specifying a partitioned model
Running the analysis
Some practical advice
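A sketch of the partition-related commands for a hypothetical alignment with two genes (the file name, character ranges, and model choices are placeholders; charset, partition, set, lset applyto, and unlink are the actual MrBayes commands involved):

    execute twogenes.nex                        [load the combined alignment]
    charset gene1 = 1-500                       [define character sets]
    charset gene2 = 501-1200
    partition bygene = 2: gene1, gene2          [a scheme with two partitions]
    set partition = bygene                      [activate the partition scheme]
    lset applyto=(all) nst=6 rates=gamma        [GTR + gamma for both partitions]
    unlink statefreq=(all) revmat=(all) shape=(all)   [give each partition its own parameters]
    mcmc ngen=1000000

Unlinking is the key step: without it, the partitions share one set of substitution model parameters, and the partitioned model collapses back to a single model for the whole alignment.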