TOWARDS TESTING THE EPIDEMIC CLONE MODEL OF BACTERIAL PATHOGENS
Daniel J. Wilson, Gilean A.T. McVean and Martin C.J. Maiden

Presentation transcript:

TOWARDS TESTING THE EPIDEMIC CLONE MODEL OF BACTERIAL PATHOGENS
Daniel J. Wilson, Gilean A.T. McVean and Martin C.J. Maiden
Peter Medawar Building for Pathogen Research and Departments of Statistics and Zoology, Oxford University

Overview
Neisseria meningitidis is the causal agent of meningococcal meningitis and septicaemia, yet it is found in up to 10% of healthy individuals as an asymptomatic commensal organism of the nasopharynx. Sporadic epidemics of virulent or hypervirulent strains are thought to contribute little to the long-term persistence of the pathogen. Population structure is found in the form of significant association between loci, despite relatively high rates of recombination. The epidemic clone hypothesis posits that this structure is due to recent, explosive increases in groups of closely related individuals. However, in a finite population some degree of structuring is expected because of the stochastic nature of the evolutionary process. To test this simpler explanation, we perform coalescent simulations of seven housekeeping genes in N. meningitidis, modelling functional constraint as a form of mutational bias. Using the number of unique sequences (haplotypes) as a test statistic, we reject the null hypothesis (p < [...]), showing that genetic diversity is too clustered: a finding consistent with the epidemic clone hypothesis.

Introduction
Jolley et al (2000) sampled 218 isolates of Neisseria meningitidis from asymptomatic carriers in the Czech Republic during [...]. They characterised seven housekeeping genes in each of the isolates using multi locus sequence typing (MLST) (Maiden et al 1998), yielding complete nucleotide sequences of gene fragments some [...] base pairs in length. The first step in constructing models of the epidemiological process is to determine whether the signature of evolutionary processes can be detected in the data. In other words, is it possible to reject outright a null hypothesis in which nothing interesting is happening? Simple summary statistics such as Tajima’s D (Tajima 1989) were unable to reject this type of null hypothesis (Jolley et al in press), so we turned to coalescent simulations. Figure 1 shows a caricature of what the topology of a gene tree might look like under (a) a neutral and (b) an epidemic clonal model of meningococcal evolution. The red branches indicate a recent expansion of a particular complex of closely related clones.

Methods
The steps involved in testing the null hypothesis of meningococcal evolution are summarised in Figure 2. First the starting sequence is chosen from a distribution based on observed codon usage. A coalescent tree is then simulated (Hudson 1990), and the sequence is mutated down the tree according to a model whose parameters are estimated from the observed data. Finally the test statistic is computed for the simulated data. When all 30,000 runs are complete, the distribution of values of the test statistic is compared to the observed value to determine whether the model plausibly describes the observed data.

Figure 2 (schematic): a starting sequence is passed through the mutational model to give evolved sequences, and the simulation is compared with the real data.
1. Choose codons at random from the observed distribution of codon usage.
2. Estimate evolutionary parameters from the observed data.
3. Statistically test for differences between simulated and observed patterns of variation.

Codon frequencies
The distribution of the starting codon frequencies was estimated in a Bayesian manner from the observed codon usage patterns in the MLST data. The mean marginal codon usage from the posterior distribution is shown in Figure 3.

Model of mutational bias
Under-representation of, for example, non-synonymous changes in the sequence data can be modelled as mutational bias rather than purifying selection.
Confounding functional constraint with mutational bias in this way allows coalescent simulations of neutral evolution to be performed. The model was parameterised as follows:

synonymous transversion: μ
synonymous transition: μκ
non-synonymous transversion: μω
non-synonymous transition: μκω

Estimates of μ, κ and ω were obtained by maximum likelihood on the assumptions that codons were independently and identically distributed, that the number of mutations in the genealogy was Poisson distributed, and that the probability of more than one mutation at a nucleotide in the genealogy was negligible.

Recombination
Jolley et al (in press) estimated the rate of recombination to be 0.94 times the rate of mutation, and the mean tract length of a recombination fragment to be 1.1 kilobases.

Results and Conclusions
The rates of synonymous transversion, synonymous transition, non-synonymous transversion and non-synonymous transition were estimated (in units of 10^3 N_e generations) at 3.32, 19.4, 0.86 and 5.06 respectively (μ=3.32, κ=5.85 and ω=0.26). Figure 4 shows the distribution of the test statistic (number of haplotypes) simulated under 30,000 runs of the null model. The median is 126, with range [...]. The observed number of haplotypes in the Czech MLST data was 89, outside the range of the simulated values. Thus the null hypothesis can be overwhelmingly rejected (p < [...]).

Using coalescent simulations it has been possible to reject the null hypothesis of neutral evolution with functional constraint. Our method has detected a strong signal of evolutionary forces consistent with the epidemic clone model, something that Tajima’s D did not have sufficient power to achieve. The next step will be to incorporate more sophisticated hypotheses, such as the clonal epidemic model, into the coalescent framework. Parameterisation of such models in terms of epidemiological and evolutionary forces, and estimation of those parameters from empirical data, will exploit these efficient methods of inference to address important problems pertaining to bacterial population biology.
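The simulate-then-compare logic described under Methods and used in the Results can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the poster's model: it assumes no recombination and infinite-sites mutation instead of the codon model, and the function names, sample size, theta value and observed count are all hypothetical. It shows only the shape of the test: simulate a coalescent genealogy, drop mutations on it, count distinct haplotypes, and compare the simulated distribution with an observed count.

```python
import random

def poisson(rng, mean):
    """Draw a Poisson variate by counting unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def simulate_haplotype_count(n, theta, rng):
    """Number of distinct haplotypes in one neutral coalescent sample of size n.

    Assumptions (not the poster's model): no recombination, infinite-sites
    mutation, time scaled so each pair of lineages coalesces at rate 1 and
    each lineage mutates at rate theta / 2.
    """
    lineages = [[i] for i in range(n)]      # each lineage lists the samples below it
    mutations = [set() for _ in range(n)]   # mutation ids carried by each sample
    next_id = 0
    while len(lineages) > 1:
        k = len(lineages)
        wait = rng.expovariate(k * (k - 1) / 2)   # time to the next coalescence
        for leaves in lineages:
            # Mutations on this branch are inherited by every sample below it.
            for _ in range(poisson(rng, theta / 2 * wait)):
                for leaf in leaves:
                    mutations[leaf].add(next_id)
                next_id += 1
        a, b = rng.sample(range(k), 2)      # choose two lineages to merge
        merged = lineages[a] + lineages[b]
        lineages = [l for i, l in enumerate(lineages) if i not in (a, b)]
        lineages.append(merged)
    return len({frozenset(m) for m in mutations})

def lower_tail_p_value(simulated, observed):
    """Monte Carlo p-value for 'too few haplotypes', with the usual +1 correction."""
    as_extreme = sum(1 for s in simulated if s <= observed)
    return (as_extreme + 1) / (len(simulated) + 1)

rng = random.Random(1)
null = [simulate_haplotype_count(n=20, theta=5.0, rng=rng) for _ in range(500)]
observed = 3   # hypothetical observed haplotype count, for illustration only
print(min(null), max(null), lower_tail_p_value(null, observed))
```

The lower tail is used because the hypothesis of interest predicts fewer haplotypes than neutrality (diversity that is "too clustered"). When every one of N simulated values exceeds the observed one, this estimator returns 1/(N+1); how the poster computed its own p-value is not stated, so the estimator here is an assumption.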
Acknowledgments
Thanks go to Chris Spencer, Graham Coop, Jonathan Marchini and the BBSRC for funding. St. John’s College, Oxford kindly provided travel expenses.

Figures
Scanning electron micrograph of Neisseria meningitidis taken from [...]
Figure 1 Caricatures of gene trees under the neutral (a) and epidemic clonal (b) hypotheses.
Figure 2 Summary of testing the null hypothesis of meningococcal evolution.
Figure 3 Codon frequencies estimated from the data.
Figure 4 Simulated distribution of the test statistic. Arrow indicates observed value.

Cited References
Hudson, R.R. (1990) Oxf. Surv. Evol. Biol. 7: 1-44
Jolley, K.A. et al (2000) J. Clin. Microbiol. 38: [...]
Jolley, K.A. et al (in press)
Maiden, M.C.J. et al (1998) Proc. Natl. Acad. Sci. USA 95: [...]
Tajima, F. (1989) Genetics 123: [...]
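As a worked check on the numbers in Results and Conclusions, the four reported rates can be compared with the reported values of μ, κ and ω. The sketch below assumes a multiplicative parameterisation (transitions scaled by κ, non-synonymous changes scaled by ω); that form is inferred from the reported values and is not spelled out in the surviving text, and the function name is hypothetical.

```python
def rate_classes(mu, kappa, omega):
    """Four mutation rate classes under an assumed multiplicative parameterisation."""
    return {
        "synonymous transversion": mu,
        "synonymous transition": mu * kappa,
        "non-synonymous transversion": mu * omega,
        "non-synonymous transition": mu * kappa * omega,
    }

# Values as reported in Results and Conclusions.
reported = {
    "synonymous transversion": 3.32,
    "synonymous transition": 19.4,
    "non-synonymous transversion": 0.86,
    "non-synonymous transition": 5.06,
}

rates = rate_classes(mu=3.32, kappa=5.85, omega=0.26)
for name, value in rates.items():
    print(f"{name}: computed {value:.2f}, reported {reported[name]}")
```

The computed and reported values agree to within the rounding of the three parameters, which is what one would expect if the rates were derived from unrounded estimates.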