Detecting selection using genome scans Roger Butlin University of Sheffield.

1 Detecting selection using genome scans Roger Butlin University of Sheffield

2 Nielsen R (2005) Molecular signatures of natural selection. Annu. Rev. Genet. 39, 197–218.
What signatures does selection leave in the genome?
1. Population differentiation – today’s focus!
2. Frequency spectrum, e.g. Tajima’s D
3. Selective sweeps
4. Haplotype structure (linkage disequilibrium)
5. McDonald-Kreitman tests (or PAML over long time-scales)

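3 Frequency distribution: from Nielsen (2005), frequency of derived allele in a sample of 20 alleles.
Tajima’s $D = (\pi - S/a_1)/\mathrm{sd}$, where $\pi$ is the mean number of pairwise differences, $S$ is the number of segregating sites and $a_1 = \sum_{i=1}^{n-1} 1/i$ for a sample of $n$ sequences. A negative D summarises an excess of rare variants.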
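
As a hedged illustration of the frequency-spectrum statistic on this slide, the sketch below computes Tajima's D from an unfolded site frequency spectrum, using the standard variance constants from Tajima (1989). The function name `tajimas_d` and the input layout (`sfs[k-1]` is the number of sites where the derived allele appears in k of the n sampled alleles) are assumptions made for this example and do not come from the presentation.

```python
import math


def tajimas_d(sfs, n):
    """Tajima's D from an unfolded site frequency spectrum.

    sfs[k-1] = number of segregating sites whose derived allele is seen
    k times in a sample of n alleles (k = 1 .. n-1). Hypothetical layout.
    """
    s = sum(sfs)  # number of segregating sites
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")

    # pi: mean number of pairwise differences between sampled alleles
    n_pairs = n * (n - 1) / 2
    pi = sum(count * k * (n - k) for k, count in enumerate(sfs, start=1)) / n_pairs

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Numerator compares two estimators of theta; denominator is their sd
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Ten singletons in a sample of 20 alleles: an excess of rare variants
print(tajimas_d([10] + [0] * 18, 20))
```

With only singletons the statistic is negative, while a spectrum dominated by intermediate-frequency alleles gives a positive value.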

4 Selective sweep:

5 Extended haplotype homozygosity (Sabeti et al. 2002)

6 McDonald-Kreitman and related tests
dN = replacement changes per replacement site
dS = silent changes per silent site
dN/dS = 1: neutral
dN/dS < 1: conserved (purifying selection)
dN/dS > 1: adaptive evolution (positive selection)
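
A minimal sketch of a McDonald-Kreitman style comparison, assuming counts of replacement and silent changes are available both for fixed differences between species (Dn, Ds) and for polymorphisms within a species (Pn, Ps). The function names and the example counts are illustrative assumptions. The p-value comes from a hand-written two-sided Fisher exact test on the 2x2 table, which is one common choice rather than the only one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme (as improbable) as the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Return (neutrality index, alpha, p-value) for an MK table.

    dn, ds: fixed replacement / silent differences between species
    pn, ps: replacement / silent polymorphisms within species
    """
    ni = (pn / ps) / (dn / ds)  # NI < 1 suggests an excess of replacement fixations
    alpha = 1 - ni              # rough estimate of the adaptive fraction of fixations
    return ni, alpha, fisher_exact_two_sided(dn, ds, pn, ps)


# Illustrative counts only
print(mk_test(dn=7, ds=17, pn=2, ps=42))
```

A neutrality index near 1 is the neutral expectation, mirroring dN/dS = 1 on the slide.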

7 Selection on phenotypic traits: QTL, association analysis, candidate genes

8 Genome scans (aka ‘Outlier analysis’)

9 Littorina saxatilis – locally adapted morphs ‘H’ and ‘M’, Thornwick Bay. What signatures of selection might we look for?

10 Signatures of selection: departure from HWE; low diversity (selective sweep); frequency spectrum tests; high divergence; elevated proportion of non-synonymous substitutions; LD

11 Neutral loci

12 Stabilizing selection

13 Local adaptation

14 Charlesworth et al. 1997 (from Nosil et al. 2009)

15 A concrete example: adaptation to altitude in Rana temporaria (Bonin et al. 2006). Altitude categories: High – 2000 m, Intermediate – 1000 m, Low – 400 m. 190 individuals, 392 AFLP bands.

16 Generating the expected distribution
DetSel – Vitalis et al. 2001 [diagram: model with parameters Ne, N0, N1, N2, t, t0, μ]. F1, F2 – measure of divergence of population 1, 2 from population 2, 1
Dfdist – Beaumont & Nichols 1996 [diagram: populations of size N connected by migration m]. FST – symmetrical population differentiation, as a function of heterozygosity
Does the structure/history matter?
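
To make the Dfdist-style comparison concrete, here is a hedged sketch that places each locus on the heterozygosity vs FST plane and flags loci above an upper envelope. It assumes biallelic loci with allele frequencies already estimated per population (for dominant AFLP markers that estimation is a separate step), uses a simple (H_T - H_S)/H_T estimator with no sample-size correction, and treats the envelope as a user-supplied function. In practice the envelope would come from neutral simulations, which are not reproduced here. All names are hypothetical.

```python
def locus_het_fst(freqs):
    """Return (H_T, F_ST) for one biallelic locus.

    freqs: allele frequency of one allele in each population.
    Uses F_ST = (H_T - H_S) / H_T without sample-size correction.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)                             # total heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)    # mean within-population
    if h_t == 0:
        return 0.0, 0.0  # monomorphic everywhere: no information
    return h_t, (h_t - h_s) / h_t


def flag_outliers(loci, upper_envelope):
    """Names of loci whose F_ST exceeds the envelope at their heterozygosity.

    loci: dict mapping locus name -> list of per-population frequencies
    upper_envelope: callable H_T -> F_ST threshold (e.g. a simulated 95% quantile)
    """
    flagged = []
    for name, freqs in loci.items():
        h_t, fst = locus_het_fst(freqs)
        if fst > upper_envelope(h_t):
            flagged.append(name)
    return flagged


# Made-up frequencies and a flat placeholder envelope, for illustration only
example = {"locus_a": [0.5, 0.5], "locus_b": [0.05, 0.95]}
print(flag_outliers(example, lambda h_t: 0.3))
```

Conditioning the threshold on heterozygosity matters because the attainable range of FST depends on how variable a locus is.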

17 [Figure: DetSel and Dfdist plots for ‘Low 1’ vs ‘High 1’; labels: 95% CI; 95%, 50%, 5%]

18 392 AFLPs, 12 pairwise comparisons across altitude or 3 altitude categories, 95% cut off
Category | DetSel, Dfdist, Both | Interpretation
Monomorphic in one population | 35N/A | Unreliable outliers
Significant in one comparison | 1429 | False positives
Significant in comparisons involving one population | 311 | Local effects
Significant in at least 2 comparisons | 231 | Adaptation to altitude
Significant in global comparison across altitudes | 6 (2 at 99%) | Adaptation to altitude

19 [Figure labels: 8 loci; 343 loci]

20 Outliers and selected traits
Coregonus clupeaformis (lake whitefish), Rogers and Bernatchez (2007):
Dwarf x Normal cross, then both backcrosses
Measure ‘adaptive’ traits (9)
QTL map (>400 AFLP plus microsatellites)
Homologous AFLP in 4 natural sympatric population pairs
Outlier analysis (forward simulation based on Winkle)
Cross | Homologous AFLP | Outlier AFLP in homologous set* | Outlier within QTL (based on 1.5 LOD support)
Hybrid x Dwarf | 180 | 19 | 9 (3.6 expected, P=0.0015)
Hybrid x Normal | 131 | 8 | 4 (0.5 expected, P=0.0002)
*Only 3 outliers shared between lakes

22 Nosil et al. 2009 review of 14 studies:
1. 0.5–26% outliers, most studies 5–10%
2. 1–5% outliers replicated in pair-wise comparisons
3. 25–100% of outliers specific to habitat comparisons
4. No consistent pattern for EST-associated loci
5. LD among outliers typically low
But many methodological differences between studies: population sampling, marker type, analysis type and options, statistical cut-offs

23 Environmental correlations
SAM – Joost et al. 2007
IBA – Nosil et al. 2007
FST for each locus correlated with ‘adaptive distance’, controlling for geographic distance (partial Mantel test)
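
The partial Mantel test mentioned on this slide can be sketched as follows, assuming three square, symmetric distance matrices over the same populations: per-locus FST, ‘adaptive distance’ and geographic distance. This is a generic permutation implementation written for illustration, not the code used by the cited studies. The function names, the one-sided p-value and the default of 999 permutations are assumptions.

```python
import math
import random


def upper(matrix, order):
    """Upper-triangle entries of a square matrix after reordering rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))


def partial_mantel(fst, adaptive, geographic, n_perm=999, seed=0):
    """Return (partial r, one-sided permutation p-value)."""
    ident = list(range(len(fst)))
    y, z = upper(adaptive, ident), upper(geographic, ident)
    observed = partial_r(upper(fst, ident), y, z)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = ident[:]
        rng.shuffle(order)  # permute populations in the FST matrix only
        if partial_r(upper(fst, order), y, z) >= observed - 1e-12:
            hits += 1
    return observed, hits / (n_perm + 1)


# Made-up one-dimensional positions, for illustration only
env = [0, 1, 3, 6, 10, 15]
geo = [0, 2, 3, 7, 8, 12]
adaptive = [[abs(a - b) for b in env] for a in env]
geographic = [[abs(a - b) for b in geo] for a in geo]
fst = [[0.1 * abs(a - b) for b in env] for a in env]
print(partial_mantel(fst, adaptive, geographic))
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations.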

24 Methodological improvements – Bayesian approaches
BayesFst – Beaumont & Balding 2004
Bayescan – Foll & Gaggiotti 2008
For each locus i and population j we have an FST measure, relative to the ‘ancestral’ population, $F_{ij}$. Then decompose into locus and population components:
$\log\left(\frac{F_{ij}}{1 - F_{ij}}\right) = \alpha_i + \beta_j$
$\alpha_i$ is the locus effect: 0 neutral, +ve divergent selection, -ve balancing selection
$\beta_j$ is the population effect
Assuming a Dirichlet distribution of allele frequencies among subpopulations, can estimate $\alpha_i + \beta_j$ by MCMC
In Bayescan, also explicitly test $\alpha_i = 0$
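
A small numerical sketch of the logit decomposition on this slide: given a locus effect and a population effect, the implied FST follows from inverting the logit. The helper names are hypothetical, and the sketch only illustrates the formula. It does not perform the MCMC estimation done by BayesFst or Bayescan.

```python
import math


def logit(f):
    """log(F / (1 - F)) for 0 < F < 1."""
    return math.log(f / (1 - f))


def fst_from_effects(alpha_i, beta_j):
    """Invert the logit: F_ij implied by locus effect alpha_i and population effect beta_j."""
    return 1 / (1 + math.exp(-(alpha_i + beta_j)))


# Population effect chosen so that a neutral locus (alpha = 0) has F_ij = 0.1
beta = logit(0.1)
for alpha in (-1.5, 0.0, 1.5):
    print(alpha, round(fst_from_effects(alpha, beta), 3))
```

A positive locus effect pushes FST above the population baseline and a negative one pulls it below, matching the interpretation of the sign of the locus effect given on the slide.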

25 Apparently much greater power to detect balancing selection than FDIST; lower false positive rate; wider applicability

26 Methodological improvements – hierarchical structure Arlequin – Excoffier et al. 2009

27 Circles – simulated STR data, grey – null distribution


29 Bayenv – Coop et al. 2010. Estimates variance-covariance matrix of allele frequencies, then tests for correlations with environmental variables (or categories). Software available at:
Multiple analyses? Candidate vs control? E.g. Shimada et al. 2010


31 Hohenlohe et al. 2010

32 Mäkinen et al. 2008
7 populations: 3 marine, 4 freshwater
103 STR loci
Analysed by BayesFst (and LnRH): 5 under directional selection (3 in Eda locus), 15 under balancing selection
Used as a test case by Excoffier et al.: 2 directional, 3 balancing

33 Can we replicate these results?
Bayescan: Stickleback_allele.txt – input file; Output_fst.txt – view with R routine plot_Bayescan
Arlequin: Stickleback_data_standard.arp – IAM; Stickleback_data_repeat.arp – SMM. Run using Arlequin3.5
Try hierarchical and island models, maybe different hierarchies


35 Sympatric speciation? FST distribution as evidence of speciation with gene flow
Savolainen et al. (2006): Howea palms
Cf. Gavrilets and Vose (2007): few loci underlying key traits, intermediate selection, initial environmental effect on phenology
