Inferences on human demographic history using computational Population Genetic models Gabor T. Marth Department of Biology Boston College Chestnut Hill,


1 Inferences on human demographic history using computational Population Genetic models Gabor T. Marth Department of Biology Boston College Chestnut Hill, MA 02467

2 The current variation resource
The current public resource (dbSNP) contains over 10 million SNPs.
1. How are these SNPs structured within the genome?
2. What can we learn about the processes that shape human variability?
3. What is the utility of these data for medical applications?

3 Nucleotide diversity is heterogeneous
at the scale of the chromosomes
in different regions of given lengths

4 Compositional and functional features
G+C nucleotide content
CpG di-nucleotide content
recombination rate
functional constraints:
  3’ UTR          5.00 x 10^-4
  5’ UTR          4.95 x 10^-4
  Exon, overall   4.20 x 10^-4
  Exon, coding    3.77 x 10^-4
  synonymous      366 / 653
  non-synonymous  287 / 653
Variance is so high that these quantities are poor predictors of nucleotide diversity in local regions; hence random processes, as described by neutral theory, are likely to govern the basic shape of the genome variation landscape.

5 Strategy – study observable distributions
marker density (MD): distribution of the number of SNPs observed in pairs of sequences
allele frequency spectrum (AFS): distribution of SNPs according to allele frequency in a set of samples (from “rare” to “common”)
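To make the two observables concrete, here is a minimal sketch that tabulates both from a small haplotype matrix. The 0/1 coding (1 = derived allele), the function names, and the toy data are illustrative assumptions, not taken from the talk, and the marker density here is computed over whole sequences rather than over windows of a fixed length.

```python
from collections import Counter
from itertools import combinations

def allele_frequency_spectrum(haplotypes):
    """Number of SNPs at which exactly k of n samples carry allele 1, for k = 1..n-1."""
    n = len(haplotypes)
    counts = Counter()
    for site in zip(*haplotypes):      # iterate over columns (sites)
        k = sum(site)
        if 0 < k < n:                  # skip sites that are not polymorphic
            counts[k] += 1
    return [counts[k] for k in range(1, n)]

def marker_density(haplotypes):
    """Distribution of the number of differing sites over all pairs of sequences."""
    dist = Counter()
    for a, b in combinations(haplotypes, 2):
        dist[sum(x != y for x, y in zip(a, b))] += 1
    return dict(sorted(dist.items()))

# Toy data (hypothetical): 4 sampled sequences, 5 sites.
toy = [
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
afs = allele_frequency_spectrum(toy)   # one entry per allele count 1..3
md = marker_density(toy)               # {differences: number of pairs}
```

In this toy set one pair of sequences is identical and the other five pairs differ at two sites each, which is the kind of pairwise count distribution the MD describes.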

6 Strategy – modeling approach
Build models of fundamental forces (drift, mutation process, demography, recombination, selection) that accurately describe these distributions.
Use these same models to improve our expectations of allelic association (linkage disequilibrium, LD) and human haplotype structure, properties less amenable to measurement but fundamental for medical association.
[Figure labels: region of strong allelic association; region of reduced haplotype diversity]

7 Tool – the Coalescent process
Trace the genealogy of samples at hand, through significant events (e.g. coalescent, recombination) back into the past, until the Most Recent Common Ancestor of all samples is found. The shape of the genealogy is modulated by the underlying model structure and parameters.
Add mutations according to a neutral mutation model.
Tabulate the statistical properties of the resultant polymorphic structure.
[Figure: simple, but dynamic model of demography, with population sizes N1, N2, N3 and epoch durations T1, T2 between past and present]
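The procedure on this slide can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not the author's software: it ignores recombination, treats N as a diploid population size (so k lineages coalesce at rate k(k-1)/(4N) per generation), assumes the epochs are listed from the present backward, and replaces explicit mutation placement with the infinite-sites shortcut that the expected number of SNPs at allele count i is proportional to the total branch length ancestral to i samples. The function names and the example sizes are hypothetical.

```python
import random

def simulate_branch_spectrum(n, epochs, rng):
    """One genealogy for n samples under piecewise-constant population size.

    epochs: list of (N, T) pairs going back from the present; the T of the
    last (oldest) epoch is ignored because that epoch extends indefinitely.
    Returns the branch length (in generations) ancestral to exactly i of the
    n samples, for i = 1..n-1.
    """
    lineages = [1] * n                   # sampled descendants per lineage
    spectrum = [0.0] * (n + 1)
    epoch = 0
    time_left = epochs[0][1] if len(epochs) > 1 else float("inf")
    while len(lineages) > 1:
        k = len(lineages)
        rate = k * (k - 1) / (4.0 * epochs[epoch][0])
        wait = rng.expovariate(rate)
        if wait > time_left:
            # No coalescence before the epoch boundary: move to the boundary
            # and redraw with the next epoch's size (memoryless property).
            for d in lineages:
                spectrum[d] += time_left
            epoch += 1
            time_left = epochs[epoch][1] if epoch < len(epochs) - 1 else float("inf")
            continue
        for d in lineages:
            spectrum[d] += wait
        time_left -= wait
        i, j = rng.sample(range(k), 2)   # pick two lineages to merge
        merged = lineages[i] + lineages[j]
        lineages = [d for idx, d in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
    return spectrum[1:n]

def expected_afs(n, epochs, replicates=2000, seed=1):
    """Normalized expected allele frequency spectrum, averaged over genealogies."""
    rng = random.Random(seed)
    totals = [0.0] * (n - 1)
    for _ in range(replicates):
        for i, length in enumerate(simulate_branch_spectrum(n, epochs, rng)):
            totals[i] += length
    grand = sum(totals)
    return [t / grand for t in totals]

# Illustrative parameter values only.
stationary = expected_afs(5, [(10000, 0)])
bottleneck = expected_afs(5, [(20000, 3000), (2000, 400), (10000, 0)])
```

Comparing `stationary` with `bottleneck` shows how the same sampling scheme yields different spectra when only the demographic parameters change, which is the lever the model fitting relies on.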

8 Model generation and model fitting
computable formulations
simulation procedures
[Figure: model fit plotted over parameter i and parameter j; sample labels 3/5, 1/5, 2/5]

9 Model expectations – Demography
history (past to present): stationary, expansion, collapse, bottleneck
MD (simulation)
AFS (direct form)

10 Model fitting in BAC marker density data (Marth et al., PNAS 2003)
The data fit is very good at each length examined (4-16 kb).
The best model is a bottleneck-shaped population size history: present | N1=6,000, T1=1,200 gen. | N2=5,000, T2=400 gen. | N3=11,000
Our conclusions from the marker density data are confounded by the unknown ethnicity of the public genome sequence, so we looked at allele frequency data from ethnically defined samples.

11 The frequency spectrum in European samples
How general are these observations?
model consensus: bottleneck: present | N1=20,000, T1=3,000 gen. | N2=2,000, T2=400 gen. | N3=10,000

12 African spectra tell a different story (Marth et al., Genetics, in press)
European data: bottleneck
African data: modest but uninterrupted expansion

13 Predictions – Age of polymorphisms
average age of polymorphism
contribution of the past to alleles in various frequency classes
[Figure panels: African data, European data]

14 Predictions – Linkage disequilibrium*
* LD measures the strength of allelic association between two markers
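The slide does not say which LD statistic is plotted. As a hedged illustration of "strength of allelic association between two markers", the sketch below computes three standard pairwise measures (D, D', and r²) from phased two-marker haplotypes. The 0/1 allele coding and the function name are assumptions made for this example.

```python
def ld_stats(haplotypes):
    """Pairwise LD from phased haplotypes given as (a, b) alleles coded 0/1.

    Returns (D, D', r^2).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at marker A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at marker B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # deviation from independence
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0   # D scaled to its attainable maximum
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0         # squared allelic correlation
    return d, d_prime, r2

# Perfect association: the two markers always carry the same allele.
perfect = ld_stats([(1, 1)] * 4 + [(0, 0)] * 4)
# No association: all four haplotypes equally frequent.
independent = ld_stats([(1, 1), (1, 0), (0, 1), (0, 0)])
```

Under perfect association both D' and r² equal 1, and with all four haplotypes equally frequent both are 0.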

15 Severity of a European bottleneck

16 African-American spectra – Admixture?
[Figure panels: African spectrum, European spectrum]

17 Haplotype structure – Haplotype blocks (Daly et al., Nature Genetics, 2001)
These predictions agree with experimental observations from other labs, most notably with the presence of regions of strong allelic association, termed “haplotype blocks”, evident primarily in European samples.
block: a few frequent haplotypes (e.g. 10% min. frequency) make up the majority of all observed haplotypes (e.g. > 80%)
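The block criterion quoted on this slide (frequent haplotypes with at least 10% frequency accounting for more than 80% of observed haplotypes) can be checked directly on a set of phased haplotypes. The sketch below uses those two example thresholds as defaults. The function names and the toy data are hypothetical, and this is not the block-detection method of Daly et al.

```python
from collections import Counter

def common_haplotype_coverage(haplotypes, min_freq=0.10):
    """Return (number of common haplotypes, fraction of samples they account for)."""
    n = len(haplotypes)
    counts = Counter(tuple(h) for h in haplotypes)
    common = {h: c for h, c in counts.items() if c / n >= min_freq}
    return len(common), sum(common.values()) / n

def looks_like_block(haplotypes, min_freq=0.10, min_coverage=0.80):
    """True if the common haplotypes cover more than min_coverage of the sample."""
    _, coverage = common_haplotype_coverage(haplotypes, min_freq)
    return coverage > min_coverage

# Toy region (hypothetical): two frequent haplotypes plus three singletons.
region = (["0000"] * 9) + (["1111"] * 8) + ["0101", "0011", "1100"]
n_common, coverage = common_haplotype_coverage(region)
is_block = looks_like_block(region)

# A region where every sampled haplotype is different has no common haplotypes.
diverse = [format(i, "05b") for i in range(20)]
diverse_is_block = looks_like_block(diverse)
```

Here two haplotypes cover 17 of the 20 samples, so the toy region passes the criterion, while the fully diverse region does not.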

18 The HapMap initiative
HapMap Initiative: map haplotype blocks across the entire human genome
The promise:
1. Frequent haplotypes can be used as markers for functional variants
2. Significant marker reduction possible
Questions of generality within and across human populations: do patterns in reference samples match patterns in clinical samples?

19 Predictions – Haplotype structure
Going back to our own studies, we predict haplotype block size under African demographic history as roughly half the European size (consistent with observations).
To what degree do “blocks” coincide? We have to analyze the spatial relationships between the polymorphic structure of different populations.
We examine this question from the standpoint of demographic history (an obvious candidate to cause population-specific differences).

20 Connecting ethnic demographies
The “true” history of all human populations is interconnected.
The genealogies of samples from different populations are connected through the shared part of our past.
We study these relationships with models of population subdivision (“African history”, “European history”, “migration”).
Polymorphic markers (some shared, some population-specific) and haplotypes are placed into a common frame of reference.

21 Predictions – Joint allele frequency
Joint allele frequency table (axes: European, African; classes: monomorphic, rare, common), values in slide order:
  monomorphic: 0.0 %, 19.9 %, 13.2 %, 2.3 %, 1.0 %
  rare: 43.4 %, 43.7 %, 11.5 %, 11.0 %, 4.6 %, 7.4 %
  common: 10.2 %, 4.2 %, 4.4 %, 6.0 %, 6.6 %, 13.4 %
[Figure labels: observation in UW PGA data; shared SNPs; SNPs private to African samples; SNPs private to European samples; SNPs common in both populations]
alleles often have different frequencies in different populations
our simple model of subdivision captures the qualitative dynamics
we now have the tools to start evaluating and guiding the design for variation resources that are general for all populations
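A joint allele frequency table of this kind can be tabulated by classifying each SNP as monomorphic, rare, or common in each population and counting the class pairs. The sketch below is illustrative only: the 10% boundary between "rare" and "common" is an assumed threshold (the slide does not state one), the counts are invented toy data, and the function names are hypothetical.

```python
from collections import Counter

def frequency_class(count, n, common_threshold=0.10):
    """Classify one SNP in one population by its folded (minor) allele frequency.

    Folding means an allele that is absent and an allele that is fixed in a
    population are both reported as monomorphic.
    """
    freq = min(count, n - count) / n
    if freq == 0:
        return "monomorphic"
    return "common" if freq >= common_threshold else "rare"

def joint_frequency_table(snps, n_a, n_b, common_threshold=0.10):
    """snps: list of (count in population A, count in population B) per SNP.

    Returns {(class in A, class in B): fraction of SNPs}.
    """
    table = Counter()
    for count_a, count_b in snps:
        key = (frequency_class(count_a, n_a, common_threshold),
               frequency_class(count_b, n_b, common_threshold))
        table[key] += 1
    total = sum(table.values())
    return {key: c / total for key, c in table.items()}

# Toy allele counts (hypothetical) among 40 chromosomes sampled per population.
toy_snps = [(0, 3), (1, 0), (10, 12), (2, 0), (8, 1)]
table = joint_frequency_table(toy_snps, n_a=40, n_b=40)
```

Cells with "monomorphic" in one population correspond to SNPs private to the other population; the remaining cells hold shared SNPs.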

