TUM Weihenstephan. Freising


Mind the gaps: Theoretical and empirical aspects of plant ecology and population genetics
Aurélien Tellier, Populationsgenetik, TUM Weihenstephan, Freising
University of Uppsala, 26.02.2013
[Photos: S. chilense, S. peruvianum]

Seed banks and seed dormancy (1)
Many plant species show variation in seed dormancy and long-term seed banks.
If conditions are predictable within a year:
[Figure: timing of germination over one year (time = 1 year), without dormancy and with dormancy, relative to the period of favorable conditions.]

Seed banks and seed dormancy (2)
In an environment that is variable and unpredictable over many years:
No seed bank = germination triggered only by environmental cues, or cyclic.
Seed bank = dampens the effect of bad years = bet-hedging.

Evolution of seed banks
Seed banks may evolve as bet-hedging strategies if the environment is stochastically variable or if competition between species generates temporal variability (Cohen 1966 J Theor Biol; Snyder and Adler 2011 Am Nat).
Bet-hedging strategies such as germ banking are widespread in many species of:
Bacteria (Jones and Lennon 2011 Nat Rev Microbiol)
Invertebrates: diapause in insects, egg banks in crustaceans (Daphnia)
[Figure panels: a) arbuscular mycorrhizal fungus, b) cyanobacteria, c) bacterial biofilm (P. aeruginosa), d) bacteria Viridibacillus arvi. From Jones and Lennon 2011 Nat Rev Microbiol. © J. Wolinska, Nature 2007.]

Importance of seed banks (1)
Seed banks are important for conservation biology: they promote the temporal rescue effect (Brown and Kodric-Brown 1977 Ecology), decreasing the likelihood of population extinction.
Seed banks promote storage of diversity in the soil => time lag between the above-ground population and the seed bank:
Selection is slower (Hairston and De Stasio 1988 Nature)
Balancing selection is favored (Turelli et al. 2001 Evolution)
Coevolutionary dynamics are stabilized (Tellier and Brown 2009 Am Nat)
[Figure: seeds germinating over Years 1 to 4 according to the germination rate. Photo: Linanthus parryae © Bierzychudek lab.]

Hypothetical phylogeny of Solanum section Lycopersicon
[Figure: phylogeny with tips S. lycopersicum, S. pimpinellifolium, S. cheesmanii, S. neorickii, S. chmielewskii, S. arcanum, S. peruvianum*, S. chilense*, S. habrochaites, S. pennellii and Solanum sp., annotated as self-compatible or self-incompatible. Asterisks mark the two species studied here.]
From Peralta, Spooner, Knapp (2008)

Background: Two wild tomato species
Solanum peruvianum: generalist species found in a wide range of habitats.
Solanum chilense: specialist species found in dry to very dry habitats.
Städler et al. 2008 Genetics. Photos © TGRC, T. Städler

Plant populations are like icebergs
Plants above ground = census size = the tip of the iceberg.
From the Tomato Genetics Resource Center database (Davis, USA):

                                             S. chilense   S. peruvianum
Total number of populations in database      147           304
Mean number of plants above ground (Ncs)     33 - 154      44 - 185
Maximum number of plants observed            400           600

Total number of populations (from Nakazato et al. 2010 Am J Bot): 526 (S. peruvianum) and 428 (S. chilense)

The hidden part: Effective population size
Effective population size (Ne) reflects genetic diversity.
Key question: why is Ne so different from Ncs?

Species         Population    Ne            Ncs (census size)
S. peruvianum   Tarapaca      1.23 × 10^6   ≈150
                Arequipa                    9
                Nazca                       10
                Canta                       ≈300
S. chilense     Antofagasta   1.06 × 10^6   (missing)
                Tacna                       42
                Moquegua                    ≈200
                Quicacha                    ≈40

Ne is given once per species.

Our objectives
Objective 1: Reveal the existence of seed banks over evolutionary time scales without extensive sampling of seeds and above-ground populations, e.g. for species where sampling is complicated.
Objective 2: What are the consequences of seed banks for statistical inference of past demographic events? This is of interest in conservation biology, where DNA sequences are increasingly used to detect past or recent population crashes.
Objective 3: What are the consequences of seed banks for statistical inference in speciation models? Can seed banks affect the detection of introgression between closely related species?

Part 1: Evidence for the existence of seed banks in wild tomato species Work with Stefan Laurent, Wolfgang Stephan

Metapopulation, seed bank, or both?
Spatial structuring of populations: $N_e = N_{cs}\,n_d + n_d/(4\,mig)$, where $N_{cs}$ is the number of individuals per population, $n_d$ the number of demes and $mig$ the migration rate. Demes are linked by limited gene flow (migration of seeds or pollen) (Pannell and Charlesworth 2000; Wakeley and Aliacar 2001).
Seed banks in plants: $N_e = N_{cs}\,(1/b)^2$, where $N_{cs}$ is the number of individuals per population and $b$ the germination rate. Diversity is stored in the soil for several years (Nunney 2002; Kaj et al. 2001; Vitalis et al. 2004).
[Figure: seeds germinating over Years 1 to 4 with germination rate b.]
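The two expressions for Ne on this slide can be evaluated directly. The sketch below is only an illustration of those formulas: the function names are made up here, and the numbers in the usage lines are taken from the earlier slides as example inputs, not as a reproduction of the published analysis.

```python
import math


def ne_metapopulation(n_cs, n_demes, mig):
    """Ne = Ncs * nd + nd / (4 * mig), the metapopulation formula on the slide."""
    return n_cs * n_demes + n_demes / (4.0 * mig)


def ne_seed_bank(n_cs, b):
    """Ne = Ncs * (1 / b)^2, the seed bank formula on the slide."""
    return n_cs * (1.0 / b) ** 2


def germination_rate_for(ne, n_cs):
    """Invert the seed bank formula: the b that yields a given Ne from Ncs."""
    return math.sqrt(n_cs / ne)


if __name__ == "__main__":
    # Example inputs only: Ncs of about 150 plants and Ne = 1.23e6.
    print(ne_seed_bank(150, 0.03))
    print(germination_rate_for(1.23e6, 150))
    print(ne_metapopulation(150, 526, 0.001))
```

Under the seed bank formula alone, small germination rates inflate Ne quadratically, which is why even modest dormancy can account for a large gap between Ne and Ncs.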

Overview of our approach
Goal: explain the discrepancy between Ncs and Ne. Are seed banks needed to generate the high observed diversity? Do these two wild tomato species have different seed banks?
Method and data:
Ecological data: population sizes (35 < Ncs < 180) and number of demes (526 or 428) => priors.
Sequences at reference loci (> 300 silent SNPs in total), from population samples and a species-wide sample.
Population genetics statistics: nucleotide diversity, site-frequency spectrum, index of fixation (FST).
Approximate Bayesian Computation for statistical inference.

Our sampling scheme

Our model of coalescence
Based on Kaj, Krone and Lascoux J Appl Proba 2001.
The germination process is memoryless => the probability of germination decreases geometrically with the age of seeds:
$b$ = probability for a seed to germinate after one generation (= germination rate)
$b(1-b)$ = probability for a seed to germinate after two generations
$b(1-b)^2$ = probability for a seed to germinate after three generations
$\beta$ is a composite seed bank parameter, a function of $b$ and $m$.
The rate of coalescence is rescaled by $\beta^2$ (the size of the genealogy is affected).
Mutation does not increase with the age of seeds. The scaled mutation rate is scaled by $\beta$ along a given ancestral line.
The recombination rate and the migration rate between demes are also scaled by $\beta$.
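The geometric germination probabilities and the rescaling rules can be written out as a short sketch. Two points are assumptions made for this illustration and are not stated on the slide: the distribution is truncated at a maximum seed age m and renormalised, and β is taken as the inverse of the mean seed age at germination. All function names are hypothetical.

```python
def germination_probs(b, m):
    """Probability that a seed germinates at age i = 1..m.

    Geometric weights b * (1 - b)**(i - 1), truncated at m and renormalised
    (the truncation and renormalisation are assumptions of this sketch).
    """
    raw = [b * (1.0 - b) ** (i - 1) for i in range(1, m + 1)]
    total = sum(raw)
    return [p / total for p in raw]


def beta(b, m):
    """Composite seed bank parameter, assumed here to be 1 / mean seed age."""
    probs = germination_probs(b, m)
    mean_age = sum(age * p for age, p in enumerate(probs, start=1))
    return 1.0 / mean_age


def rescaled_rates(beta_value, coalescence, mutation, recombination, migration):
    """Apply the slide's rules: coalescence by beta^2, the other rates by beta."""
    return {
        "coalescence": coalescence * beta_value ** 2,
        "mutation": mutation * beta_value,
        "recombination": recombination * beta_value,
        "migration": migration * beta_value,
    }


if __name__ == "__main__":
    for b in (1.0, 0.5, 0.1):
        print(b, beta(b, m=20))
```

With b = 1 every seed germinates after one generation, β equals 1 and the model reduces to the case without a seed bank.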

ABC: island model and demography
Three demographic models, each with a split into demes at time T:
Ancestral unique population (at time of speciation?) with constant population size.
Expansion from an ancestral population AND fragmentation.
Crash from an ancestral population AND fragmentation.

Results 1: Seed banks are necessary for high diversity
[Figure: posterior probability for each model, one panel per species.]
S. peruvianum: are seed banks necessary? Yes. Past demography = expansion.
S. chilense: are seed banks necessary? Yes. Past demography ≈ expansion.
Tellier et al. PNAS 2011

Results 2: Estimates of germination rate
We estimate the germination parameter for both species.
[Figure: posterior density for the germination rate b.]
S. peruvianum: b = 0.03 [0.011 – 0.103]
S. chilense: b = 0.093 [0.016 – 0.2]
Kolmogorov-Smirnov test: P < 0.001
Tellier et al. PNAS 2011

Conclusions Part 1
We estimate the parameters of the seed bank and the metapopulation by combining sampling schemes and chosen statistics in a flexible Bayesian framework.
Population genetics methods and theory give fundamental insights into the ecology and adaptation of wild plant species:
S. peruvianum = generalist species
S. chilense = specialist species
The two species show different adaptations for seed dormancy, as well as differences in:
rates of purifying selection (Tellier et al. 2011 Heredity)
rates of local adaptation (Xia et al. 2010 Mol Ecol; Fischer et al. 2011 New Phytol)
inter-population differentiation (Arunyawat et al. 2007 MBE; Städler et al. 2008 Genetics)

Part 2: Seed banks and statistical inference of simple demographic events Work with Daniel Živković


Rationale
Application: in conservation biology, polymorphism analyses are increasingly used to determine whether populations show a recent decline.
In our model from Kaj et al. (2001), coalescence occurs at a slower rate: the rate is multiplied by $\beta^2$, so trees are longer by a factor $1/\beta^2$.
[Figure: coalescent trees against time t in the past, without and with a seed bank; the scale of the tree is a function of $\beta^2$.]
Our hypothesis: seed banks affect the scale of the coalescent tree => they affect the inference of past events.

Results 1: allele frequency spectrum (SFS)
We compute the SFS for a model with seed bank and varying population size.
[Figure: SFS for β = 1, β = 0.6 and β = 0.2.]
Different past demographies can result in the same (relative) allele frequency spectrum.
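For readers unfamiliar with the statistic, the following sketch shows how an unfolded SFS and its relative form can be computed from a small sample. The input format (strings of '0' for ancestral and '1' for derived alleles) and the function names are assumptions for illustration. The neutral expectation used for comparison is the standard constant-size result, in which class i is proportional to 1/i; it is not the seed bank model from the talk.

```python
from collections import Counter


def unfolded_sfs(haplotypes):
    """Count SNPs by number of derived alleles (classes 1..n-1).

    haplotypes: list of equal-length strings, '0' = ancestral, '1' = derived.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    derived = Counter(
        sum(h[j] == "1" for h in haplotypes) for j in range(n_sites)
    )
    return [derived.get(i, 0) for i in range(1, n)]


def relative_sfs(sfs):
    """Normalise an SFS so that its classes sum to one."""
    total = sum(sfs)
    if total == 0:
        return [0.0 for _ in sfs]
    return [x / total for x in sfs]


def neutral_relative_sfs(n):
    """Expected relative SFS for constant size and neutrality: class i ~ 1/i."""
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    sample = ["1000", "1100", "0010", "0100"]
    print(relative_sfs(unfolded_sfs(sample)))
    print(neutral_relative_sfs(len(sample)))
```

Because the relative SFS is normalised, any parameter that only stretches the whole tree cancels out, which is one way to see why different histories can produce the same relative spectrum.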

Results 1: expectations
For a simple demographic scenario of population expansion:
[Figure: without a seed bank, expansion at time t1 in the past with growth rate R1; with a seed bank, expansion at time t2 in the past with growth rate R2.]
An expansion that occurred at time t2 in the past with growth rate R2 in a population with a seed bank has the same SFS as a more recent expansion at time t1 (t2 >> t1) with a larger growth rate R1 (R2 << R1) in a population without a seed bank.
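A rough way to see the direction of this equivalence is to treat the seed bank as a pure rescaling of time by β². The sketch below does exactly that and nothing more: the mapping t1 = β² · t2 and R1 = R2 / β² is an assumption made for illustration (it presumes a growth rate expressed per generation), and the function name is hypothetical.

```python
def no_seed_bank_equivalent(t2, r2, beta):
    """Map an expansion (t2, R2) with a seed bank onto (t1, R1) without one.

    Assumes the seed bank acts only as a time rescaling by beta**2, so that
    times shrink by beta**2 and per-generation rates grow by 1 / beta**2.
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    t1 = t2 * beta ** 2
    r1 = r2 / beta ** 2
    return t1, r1


if __name__ == "__main__":
    for beta in (1.0, 0.6, 0.2):
        print(beta, no_seed_bank_equivalent(10000.0, 0.0001, beta))
```

Under this assumption the product R · t is unchanged, while t1 is smaller than t2 and R1 larger than R2, matching the inequalities on the slide.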

Our approach: test of expectation
[Figure: without a seed bank, time t1 in the past and growth rate R1; with a seed bank, time t2 in the past and growth rate R2.]
We simulate data with seed banks for various germination rates β.
We then estimate the parameters of the model assuming a population without a seed bank, using Approximate Bayesian Computation (ABC) and SFS statistics.
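The general shape of an ABC rejection step is sketched below on a deliberately simpler problem than the one in the talk: estimating θ from the number of segregating sites S under the standard neutral coalescent with constant size. The simulator, the uniform prior, the single summary statistic and all names are stand-ins chosen for illustration; the analysis described above uses SFS statistics and a model with population growth.

```python
import numpy as np


def simulate_segregating_sites(theta, n, rng):
    """Simulate S for a sample of size n under the standard neutral coalescent.

    Time is measured in units of 2N generations, so the waiting time while k
    lineages remain is exponential with rate k(k-1)/2, and mutations fall on
    the tree as a Poisson process with rate theta / 2 per unit branch length.
    """
    k = np.arange(2, n + 1)
    waiting_times = rng.exponential(scale=2.0 / (k * (k - 1)))
    total_branch_length = float(np.sum(k * waiting_times))
    return int(rng.poisson(theta / 2.0 * total_branch_length))


def abc_rejection(observed_s, n, prior_low, prior_high, n_sims, keep_fraction, rng):
    """Keep the prior draws whose simulated S is closest to the observed S."""
    thetas = rng.uniform(prior_low, prior_high, size=n_sims)
    simulated = np.array([simulate_segregating_sites(t, n, rng) for t in thetas])
    distances = np.abs(simulated - observed_s)
    n_keep = max(1, int(keep_fraction * n_sims))
    nearest = np.argsort(distances)[:n_keep]
    return thetas[nearest]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    posterior = abc_rejection(20, 10, 0.1, 30.0, 5000, 0.02, rng)
    print(posterior.mean(), np.quantile(posterior, [0.025, 0.975]))
```

The retained draws approximate the posterior of θ given S. Replacing the simulator with one that includes a seed bank, and the summary with SFS classes, would give the kind of comparison described on the slide.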

Results 2: for past expansion
Over 500 datasets, we calculate the error in estimating the growth rate R.
[Figure: estimation error for long and short seed banks, when the seed bank rate is known and when the seed bank is ignored.]
Large error and bias in estimating the time and growth rate of a past expansion!


Results 3: more complex demographic scenarios
For more complex demographic models, it may be even worse.
Various expected SFS can be obtained for one set of model parameters but different germination rates (β).
[Figure: expected SFS for β = 1, β = 0.6 and β = 0.2.]
Good news: seed banks give us access to more ancient demographic events.
Bad news: we need to know the germination rate.
Živković and Tellier 2012 Mol Ecol

Part 3: Seed banks and statistical inference in speciation models Work with Anja Hörger and Katharina Böndel

Coevolution in sister species?
Coevolution may generate balancing selection at resistance genes in plants (Stahl et al. 1999 Nature; Tellier and Brown 2011 Annu Rev Phytopathol).
Under balancing selection with multiple alleles (Castric et al. 2008 PLoS Genetics), two models are possible:
Adaptive introgression model: repeated adaptive introgression due to frequency-dependent selection?
Ancestral polymorphism model: maintenance of ancestral polymorphism without gene flow?
Predictions:
1) Higher amount of shared polymorphism at resistance genes than in the genomic background, due to balancing selection.
2) Higher gene flow at resistance genes than in the genomic background.

Methods
We study three resistance genes: Pto, Pfi1, Rin4 (Rose et al. 2005 Genetics; Rose et al. 2011 Mol Plant Pathol).
In two sister species, S. chilense and S. peruvianum, with a history of gene flow (Städler et al. 2008 Genetics).
10 reference loci are sequenced (= 980 SNPs).
Species-wide sampling: one individual per population (Städler et al. 2009 Genetics).

Result 1: occurrence of balancing selection
Several tests for comparison with the 10 reference loci (980 SNPs).
Balancing selection may occur at Pto and Pfi1 in both species.
Purifying selection at Rin4, as in the reference loci (Tellier et al. 2011 Heredity).

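Coalescence and IM speciation model
[Figure: isolation-with-migration model. An ancestral population with $\theta_A = 4N_A\mu$ splits at time since speciation $\tau$ (past) into derived population 1 ($\theta_1$) and derived population 2 ($\theta_2$), which exchange migrants at rates $M_{12}$ and $M_{21}$ up to the present.]
The number of migrants per generation is $M_{12} = 4N_1 m_{12}$.
We integrate seed banks into the coalescent model (using a modification of the ms software).
We use the joint site frequency spectrum (JSFS) to summarize the frequency of SNPs in both species (Wakeley and Hey 1997 Genetics; Tellier et al. 2011 PLoS One).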

The JSFS
Joint site frequency spectrum: frequency of SNPs in both species.
[Figure: matrix of SNP frequency in species 1 (S. peruvianum; 1, 2, 3, ..., n1-1, n1) against SNP frequency in species 2 (S. chilense; 1, 2, 3, ..., n2-1, n2), partitioned into 15 classes A1 to A15.]
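A JSFS can be built from per-SNP derived allele counts in the two samples, as in the sketch below. The input format and function names are assumptions for illustration. The coarse classification shown (private, shared, fixed) follows the usual Wakeley and Hey style categories and is not the 15-class partition A1 to A15 of the slide, whose exact boundaries are not recoverable from the transcript.

```python
import numpy as np


def joint_sfs(counts_sp1, counts_sp2, n1, n2):
    """Build the (n1 + 1) x (n2 + 1) matrix of SNP counts.

    counts_spX[j] is the number of derived alleles at SNP j in the sample of
    size nX from species X; entry [i, k] counts SNPs seen i times in species 1
    and k times in species 2.
    """
    jsfs = np.zeros((n1 + 1, n2 + 1), dtype=int)
    for c1, c2 in zip(counts_sp1, counts_sp2):
        jsfs[c1, c2] += 1
    return jsfs


def classify_snp(c1, c2, n1, n2):
    """Coarse class of one SNP: private, shared, fixed or monomorphic."""
    poly1 = 0 < c1 < n1
    poly2 = 0 < c2 < n2
    if poly1 and poly2:
        return "shared"
    if poly1:
        return "private_1"
    if poly2:
        return "private_2"
    if (c1 == 0 and c2 == n2) or (c1 == n1 and c2 == 0):
        return "fixed"
    return "monomorphic"


if __name__ == "__main__":
    sp1 = [1, 2, 0, 4]
    sp2 = [0, 1, 3, 0]
    print(joint_sfs(sp1, sp2, n1=4, n2=3))
    print([classify_snp(a, b, 4, 3) for a, b in zip(sp1, sp2)])
```

Grouping cells of this matrix into a small number of classes, as the slide does, gives a set of summary statistics compact enough for ABC.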

Result 1: Relative JSFS for reference vs candidate loci
[Figure: boxplot of JSFS classes at reference loci.]
A1, A2, A4, A7 = private singletons and low-frequency SNPs (low at Pto)
A3, A11 = private intermediate- and high-frequency SNPs (excess at Pto)
A6, A13 = shared intermediate- and high-frequency SNPs (excess at Pto)

Power analysis of ABC on model choice
There is an excess of shared polymorphism at the Pto locus compared to other loci.
Question: is the shared polymorphism due to gene flow or to ancestral polymorphism?
Let's look first at a model for the reference loci (neutral model for the genomic background).
Using our JSFS classes, can we distinguish between models with and without gene flow (Model 1 vs Model 2)?
Our approach: power analysis with the ABC method over a wide range of parameter values; for each parameter combination, 500 model choices.

Result 2: Power analysis of ABC
Confusion matrix: how often do we choose the wrong model (Bayes factor = 3)?
[Table: columns are Divergence (τ), Migration (M), WITHOUT seed banks, WITH seed banks (β = 0.1). The row layout was lost in extraction; the cell values in extraction order are: 0.005 0.02 0.95 0.99 0.2 2 0.96 20 0.98 0.05 0.09 0.5 5 0.82 0.84 0.87]
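In ABC model choice, a common approach is to simulate equally often from each model, keep the simulations closest to the data, and read the posterior model probabilities from the proportions of each model among the retained simulations. The sketch below illustrates that idea with a Bayes factor threshold of 3, as on the slide. It assumes equal prior weight on the two models, and the function names and the "undecided" outcome are choices made for this illustration.

```python
def model_posteriors(accepted_models, models=(1, 2)):
    """Posterior model probabilities as proportions among accepted simulations."""
    total = len(accepted_models)
    return {m: sum(1 for a in accepted_models if a == m) / total for m in models}


def choose_model(accepted_models, threshold=3.0):
    """Return 1 or 2 if the Bayes factor reaches the threshold, else None."""
    post = model_posteriors(accepted_models)
    for winner, loser in ((1, 2), (2, 1)):
        if post[loser] == 0.0:
            bayes_factor = float("inf") if post[winner] > 0.0 else 0.0
        else:
            bayes_factor = post[winner] / post[loser]
        if bayes_factor >= threshold:
            return winner
    return None


def wrong_choice_rate(choices, true_model):
    """Fraction of replicate datasets for which the wrong model was chosen."""
    wrong = sum(1 for c in choices if c is not None and c != true_model)
    return wrong / len(choices)


if __name__ == "__main__":
    accepted = [1] * 80 + [2] * 20
    print(model_posteriors(accepted), choose_model(accepted))
```

Repeating the choice over many pseudo-observed datasets simulated under a known model gives the entries of a confusion matrix like the one summarised above.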

Conclusion: results of ABC IM model
We cannot distinguish models of speciation with and without gene flow based on the genomic background (independent of the number of loci).
It is very difficult to estimate divergence time and migration rate.
WHY? Because seed banks retain genetic diversity for long periods of time and enhance the amount of shared polymorphism between species (at the whole-genome level).
Following our previous results, this is because coalescent trees are lengthened by seed banks.

Conclusion: Coevolution in sister species?
We find an increase of shared polymorphism at the Pto locus, which is under balancing selection.
BUT, based on the JSFS and ABC analysis, we cannot distinguish between two origins of the shared polymorphisms:
Repeated adaptive introgression due to frequency-dependent selection?
Maintenance of ancestral polymorphism without gene flow?
This is because we cannot estimate the occurrence of gene flow for the neutral background.

Unknown unknowns of germ/seed banking
“There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.” Donald Rumsfeld
How important are germ banks in evolution?
Consequences for statistical inference of past events?
Affecting selection and drift in populations, and genomic signatures of selection?
For dating speciation times and rates of diversification in phylogenies?

Thanks to hazards of migration Wolfgang Stephan Stefan Laurent Daniel Živković Katharina Böndel Pavlos Pavlidis, Hilde Lainer, Laura Rose Anja Hörger Mamadou Mboup Thomas Städler James Brown

Research questions
Do long-term seed banks evolve as a bet-hedging strategy? => Ecological studies (Evans and Dennehy 2005 Q. Rev. Biol.; Evans et al. 2007 Am Nat)
What is the genetic basis for short-term seed dormancy?
Genetic basis, physiological adaptations and ecological conditions for long-term seed banks? Similar bases as dormancy? Correlation between these traits?
Are seed banks important in plant evolution? Do many plants have seed banks? Influence of seed banks on adaptation and genetic variability?
[Figure: transcription network in seeds © SeedNet.]


Importance of seed banks (2)
Consequence of seed banks for mutation: increase of the mutation rate in seed banks? (suggested by Levin 1990 Am Nat)
Comparison between seed banks and above-ground populations (microsatellite data) in 47 studies of 42 different plant species (Honnay et al. 2008 Oikos).
[Figure: studies plotted by whether values are higher above ground or higher in the seed bank.]
The above-ground population is not differentiated from the seed bank.
No evidence for an increase of the mutation rate in seed banks.
Purifying selection at the seedling stage (Vitalis et al. 2002 Am Nat)

The hidden part: Effective Population size No correlation between census size and genetic diversity

Theory 1: Coalescence in metapopulation
Two phases: collecting (long) and scattering (short) (Wakeley and Aliacar 2001 Genetics).
The genealogy depends on the number of demes (n) and the migration rate (M).
[Figure: genealogy across Demes 1 to 4, from the scattering phase at present back to the collecting phase in the past.]

Theory 2: Species-wide sample
[Figure: genealogy across Demes 1 to 4, from the scattering phase at present back to the collecting phase in the past.]
One individual per deme, over the species range = reflects the species-wide evolution (Wakeley and Aliacar 2001 Genetics; Pannell 2003 Evolution; Städler et al. 2009 Genetics).

Theory 3: Population sample
[Figure: genealogy across Demes 1 to 4, from the scattering phase at present back to the collecting phase in the past.]
Several individuals per deme, few populations = reflects the local evolution.
A combination of sampling schemes is necessary.

Theory 3: Effect of sampling scheme
Model with 100 demes (Städler et al. 2009).
[Figure: Tajima’s D for pooled, species-wide and local samples in two panels, stepping stone with constant population size and stepping stone with expansion; axes: expansion factor, migration rate.]
Local samples are weakly affected by species-wide demography.
Expansion at the species level is seen in pooled and species-wide samples.
Use various sampling schemes to infer demography in a metapopulation?
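Tajima’s D, the statistic plotted on this slide, compares two estimators of diversity: the mean number of pairwise differences π and the number of segregating sites S scaled by a harmonic number. A minimal sketch of the standard formula (Tajima 1989) follows; the function name and the choice to return NaN when there are no segregating sites are conventions of this example.

```python
import math


def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S and pairwise diversity pi."""
    if seg_sites == 0:
        return float("nan")  # D is undefined without segregating sites
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1.0) / (3.0 * (n - 1.0))
    b2 = 2.0 * (n * n + n + 3.0) / (9.0 * n * (n - 1.0))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2.0) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    print(tajimas_d(n=10, seg_sites=16, pi=3.888889))
```

Negative values indicate an excess of low-frequency variants, as expected after a population expansion, which is why the statistic responds differently to local and species-wide samples.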

Model for seed banks (1)
If we have no information about what lies in the seed bank; here m = 5.
[Figure: Step 0. Cells 0, 1, 2, ..., m-1, m, each of size Ncs compartments, with the label "Our sample".]
Kaj, Krone, Lascoux. J. Appl. Proba. 2001

Model for seed banks (2)
If we have no information about what lies in the seed bank; here m = 5.
[Figure: Step 0 as on the previous slide (cells 0 to m, each of size Ncs compartments, with the label "Our sample"). Step 1: for all plants in cell 0, arrows labelled b1, b2 and b4 lead into cells within the m-window; cells 0 to m+1 are shown.]
Kaj, Krone, Lascoux. J. Appl. Proba. 2001

Model for seed banks (3)
m = 5. If two seeds fall on the same compartment within a cell, there can be a coalescent event.
At any moment, there are r lineages in the m-window.
Then move every cell to the left, and start again at step 0.
[Figure: Step 2 and the return to Step 0, cells 0 to m+1, with the m-window and a coalescent event marked.]
Kaj, Krone, Lascoux. J. Appl. Proba. 2001
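The cell and compartment picture can be imitated with a small backward-in-time simulation for two lineages. This is a simplified sketch and not the construction of Kaj, Krone and Lascoux: it assumes a truncated, renormalised geometric distribution for seed age, a fixed number of plants (compartments) per generation (cell), and that two lineages coalesce with probability 1/N whenever they land in the same generation. All names are hypothetical.

```python
import random


def germination_probs(b, m):
    """Truncated, renormalised geometric probabilities for seed ages 1..m."""
    raw = [b * (1.0 - b) ** (i - 1) for i in range(1, m + 1)]
    total = sum(raw)
    return [p / total for p in raw]


def draw_age(probs, rng):
    """Draw how many generations back the parent plant of a lineage lived."""
    return rng.choices(range(1, len(probs) + 1), weights=probs, k=1)[0]


def pairwise_coalescence_time(n_plants, b, m, rng):
    """Generations back to the common ancestor of two sampled plants."""
    probs = germination_probs(b, m)
    t_a = t_b = 0
    while True:
        # Move the lineage that is less far back in time; on a tie move both.
        if t_a == t_b:
            t_a += draw_age(probs, rng)
            t_b += draw_age(probs, rng)
        elif t_a < t_b:
            t_a += draw_age(probs, rng)
        else:
            t_b += draw_age(probs, rng)
        # Same generation (cell): same parent (compartment) with probability 1/N.
        if t_a == t_b and rng.random() < 1.0 / n_plants:
            return t_a


if __name__ == "__main__":
    rng = random.Random(42)
    for b in (1.0, 0.5, 0.2):
        times = [pairwise_coalescence_time(50, b, 10, rng) for _ in range(2000)]
        print(b, sum(times) / len(times))
```

Running the usage lines for decreasing b should show the mean coalescence time growing, which is the lengthening of the genealogy that the seed bank model describes.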

Results 3: Estimates of germination rate
[Figure: joint posterior density of the germination rate (b) and the number of demes. Panel A: S. peruvianum, b = 0.058. Panel B: S. chilense, b = 0.135.]
Varying the number of demes also affects the estimates.
Tellier et al. 2011 PNAS

Rationale
The shape of the tree reflects past demographic events: expansion, crash, ...
The allele frequency spectrum (SFS) summarizes the shape of the tree, e.g. an excess of low-frequency variants reflects a possible past population expansion.
[Figures: coalescent trees from Wiuf et al. 2005, “Primer in Coalescent”; two orang-utan populations, Locke et al. 2011 Nature.]

Results 2: for past expansion
[Figure: estimation error for long and short seed banks, when ignoring the seed bank and after correcting by $\beta^2$ ($\times\beta^2$).]
Large error and bias in estimating the time and growth rate of a past expansion!