Using genetics to study human history and natural selection. David Reich, Harvard Medical School Department of Genetics, Broad Institute.


1 Using genetics to study human history and natural selection. David Reich, Harvard Medical School Department of Genetics, Broad Institute

2 tttctccatttgtcgtgacacctttgttgacaccttcatttctgcattctcaattctatttcactggtctatgg cagagaacacaaaatatggccagtggcctaaatccagcctactaccttttttttttttttgtaacattttacta acatagccattcccatgtgtttccatgtgtctgggctgcttttgcactctaatggcagagttaagaaattgtag cagagaccacaatgcctcaaatatttactctacagccctttataaaaacagtgtgccaactcctgatttatgaa cttatcattatgtcaataccatactgtctttattactgtagttttataagtcatgacatcagataatgtaaatc ctccaactttgtttttaatcaaaagtgttttggccatcctagatatactttgtattgccacataaatttgaaga tcagcctgtcagtgtctacaaaatagcatgctaggattttgatagggattgtgtagaatctatagattaattag aggagaatgactatcttgacaatactgctgcccctctgtattcgtgggggattggttccacaacaacacccacc ccccactcggcaacccctgaaacccccacatcccccagcttttttcccctgctaccaaaatccatggatgctca agtccatataaaatgccatactatttgcatataacctctgcaatcctcccctatagtttagatcatctctagat tacttataatactaataaaatctaaatgctatgtaaatagttgctatactgtgttgagggttttttgttttgtt ttgttttatttgtttgtttgtttgtattttaagagatggtgtcttgctttgttgcccaggctggagtgcagtgg tgagatcatagcttactgcagcctcaaactcctggactcaaacagtcctcccacctcagcctcccaaagtgctg ggatacaggtgtgacccactgtgcccagttattattttttatttgtattattttactgttgtattatttttaat tattttttctgaatattttccatctatagttggttgaatcatggatgtggaacaggcaaatatggagggctaac tgtattgcatcttccagttcatgagtatgcagtctctctgtttatttaaagttttagtttttctcaaccatgtt tacttttcagtatacaagactttgacgttttttgttaaatgtatttgtaagtattttattatttgtgatgttat ttaaaaagaaattgttgactgggcacagtggctcacgcctgtaatcccagcactttgggaggctgaggcgggca gatcacgaggtcaggagatcaagaccatcctggctaacatggtaaaaccccgtctctactaaaaatagaaaaaa attagccaggcgtggtggcgagtgcctgtagtcccagctactcgggaggctgaggcaggagaatggtgtgaacc tgggaggcggagcttgcagtgagctgagatcgtgccactgcattccagcctgcgtgacagagcgagactctgtc aaaaaaataaataaaatttaaaaaaagaagaagaaattattttcttaatttcattttcaggttttttatttatt tctactatatggatacatgattgatttttgtatattgatcatgtatcctgcaaactagctaacatagtttatta tttctctttttttgtggattttaaaggattttctacatagataaataaacacacataaacagttttacttcttt cttttcaacctagactggatgcattttttgtttttgtttgtttgtttgctttttaacttgctgcagtgactaga gaatgtattgaagaatatattgttgaacaaaagcagtgagagtggacatccctgctttccccctgattttaggg 
ggaatgttttcagtctttcactatttaatatgattttagctataggtttatcctagatccctgttatcatgttg aggaaattcccttctatttctagtttgttgagattttttaattcatgtgattgcgctatctggctttgctctca tctc gaga gaga gaga gaga gaga gcgc gcgc gcgc tctc gaga gaga gaga gaga gaga tctc tctc tctc tctc gaga gaga gaga tctc gcgc tctc tctc tctc

3 A 2-part talk. Section 1: How human history affects human genetic variation. Section 2: Detecting selection by the pattern of genetic variation and finding disease genes.

4 How does human history affect genetic variation? A genome-wide survey of linkage disequilibrium (LD). Linkage disequilibrium is a phenomenon whereby genetic variants are associated: people who have one variant tend to have a second as well. Section 1
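The association described on this slide can be quantified. The following is a minimal sketch, not taken from the talk, that computes two standard LD measures (the coefficient D and the squared correlation r²) from phased two-site haplotypes. The function name `ld_stats` and the toy haplotypes are assumptions made for illustration.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for a list of phased two-site haplotypes.

    Each haplotype is a 2-character string such as "AG": the allele at
    site 1 followed by the allele at site 2. The alleles of the first
    haplotype are used as the reference alleles (an arbitrary choice).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    ref_a, ref_b = haplotypes[0][0], haplotypes[0][1]

    # Allele frequencies at each site and the joint haplotype frequency.
    p_a = sum(h[0] == ref_a for h in haplotypes) / n
    p_b = sum(h[1] == ref_b for h in haplotypes) / n
    p_ab = sum(h[0] == ref_a and h[1] == ref_b for h in haplotypes) / n

    # D is the excess of the joint frequency over the independence expectation.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Toy data (hypothetical): the two sites are perfectly associated.
linked = ["AG"] * 5 + ["CT"] * 5
# Toy data (hypothetical): every allele combination is equally common.
unlinked = ["AG", "AT", "CG", "CT"]

print(ld_stats(linked))
print(ld_stats(unlinked))
```

In the first toy sample, knowing the allele at one site fully predicts the other, so r² is 1. In the second, the sites are independent and both measures are 0.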

5 Linkage Disequilibrium Explained. Figure: variations in chromosomes within a population emerge over time, from a common ancestor to the present; a disease mutation is marked. Section 1

6 What Determines Extent of LD? Figure: panels at 2,000 generations ago, 1,000 generations ago, and the present, with a disease-causing mutation marked. Section 1

7 How Far Does Association (LD) Extend Between Neighboring Common Sites? Figure: distance scale marked at 0, 5, 10, 20, 40, 80, and 160 kb, with a range of uncertainty. Theoretical: 3-8 kb. Section 1

8 Strategy for Assessing Extent of LD: 19 regions; 44 Caucasian samples from Utah; a great deal of DNA sequencing per sample. Figure: distance from the core single nucleotide polymorphism (SNP), marked at 0, 5, 10, 20, 40, 80, and 160 kb. Section 1

9 Section 1

10 A Genome-Wide Assessment of Linkage Disequilibrium Disease Gene Mapping Human history Section 1

11 MYSTERY: What explains the long-range LD? Important event in population history? Section 1

12 Positive Control: 48 Swedes Identical pattern to Utah Section 1

13 96 Nigerians (Yoruba): much less LD. Associations in Africans are a SUBSET of those in Caucasians. LD MUST be influenced by population history. Section 1

14 Confirmation of less LD in Africans from direct DNA sequencing. Anna Di Rienzo also shows this pattern. Section 1

15 More evidence from genotyping ~5,000 SNPs (Gabriel et al. 2002). K. Kidd, J. Kidd, and Sarah Tishkoff also show this. Section 1

16 Explanation: Bottleneck or ‘Founder Effect’ in History of North Europeans. What was this event? (1) Out of Africa? (2) Founding of Europe? Figure labels: Ancestral Population; Yoruba Ancestors; North Europeans; likely <10 founding chromosomes; ~100,000 years ago. Section 1

17 Open Mysteries: What caused the bottleneck event? The “Out of Africa” migration? How many people were involved? When did it occur? Can we better understand when the founder event occurred, and how many people were involved? Section 1

18 Acknowledgements for Section 1. Collaborators: Michele Cargill, Stacey Bolk, James Ireland, Pardis C. Sabeti, Daniel J. Richter, Thomas Lavery, Rose Kouyoumjian, Shelli F. Farhadian, Ryk Ward, Eric S. Lander. Samples: Leif Groop, Richard Cooper, Charles Rotimi.

19 Using Long-Range Linkage Disequilibrium to Detect Positive Selection in the Genome Section 2

20 Overview: 1. The difficulty of detecting genomic regions affected by natural selection. 2. The long-range haplotype test. 3. Results for two genes: G6PD and CD40 ligand. Section 2

21 Existing formal tests for selection. DNA sequence analysis (Tajima’s D, HKA test, McDonald and Kreitman, Fu and Li’s D, Ka/Ks ratio): weak. Genotyping-based tests: not general at present. Section 2

22 Our test is based on the relationship between allele frequency and extent of linkage disequilibrium. No selection: old alleles have low or high frequency and short-range LD; young alleles have low frequency and long-range LD. Positive selection: young alleles have high frequency and long-range LD. Section 2

23 The signal of selection. Figure: linkage disequilibrium (homozygosity) plotted against frequency, contrasting neutrality with positive selection. Section 2

24 Paradigm of the Core Region. Figure: a gene with core haplotypes numbered 1 to 5. Section 2

25 Long-range multi-SNP haplotypes. Figure: a gene with core markers and long-range markers (SNPs C/T, A/G, A/G, C/T, C/T, C/T), core haplotypes numbered 1 to 5, and the decay of LD with distance. Section 2

26 Long-range multi-SNP haplotypes. Decay of homozygosity: the probability, at any distance, that any two haplotypes that start out the same have all the same SNP genotypes. Figure (haplotype labeled 3): homozygosity values of 100%, 75%, 35%, and 18% across the core markers and long-range markers of a gene. Section 2
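The definition of homozygosity on this slide can be sketched in code. The following is an illustration under stated assumptions, not the authors' implementation: each chromosome carrying a given core haplotype is written as a string of alleles at its long-range markers, ordered by distance from the core, and homozygosity is the fraction of chromosome pairs that are identical at every marker out to a chosen distance. The function name `ehh` and the toy haplotypes are hypothetical.

```python
from collections import Counter
from math import comb


def ehh(extended_haplotypes, n_markers):
    """Homozygosity of chromosomes sharing one core haplotype.

    extended_haplotypes: list of strings, one per chromosome, giving the
        alleles at the long-range markers ordered outward from the core.
    n_markers: how many long-range markers (from the core outward) to include.

    Returns the probability that two randomly chosen chromosomes are
    identical at all of the first n_markers markers.
    """
    n = len(extended_haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    # Group chromosomes that are identical out to the chosen marker.
    groups = Counter(h[:n_markers] for h in extended_haplotypes)
    identical_pairs = sum(comb(count, 2) for count in groups.values())
    return identical_pairs / comb(n, 2)


# Toy data (hypothetical): four chromosomes with the same core haplotype,
# typed at three long-range markers.
haps = ["CAT", "CAT", "CAC", "CGT"]

for k in range(4):
    print(k, ehh(haps, k))
```

At the core itself (zero long-range markers) every pair matches, so the value is 1. It can only stay the same or fall as more distant markers are included, which is the decay the slide describes.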

27 Two genes associated with malaria resistance. G6PD (1960’s): well-established association to malaria resistance; selection demonstrated in 2001 by Tishkoff et al. CD40 ligand (2002): recent association by Sabeti et al.; involved in immune regulation. Section 2

28 Experimental Design. CD40 ligand (TNFSF5): 7 SNPs in core, 14 at long distances, spanning -180 kb to +520 kb around the gene. G6PD: 11 SNPs in core, 14 at long distances, spanning -480 kb to +220 kb around the gene. The telomere is marked on each map. Section 2

29 Experimental Design. DNA samples from 231 African men: Yoruba (Nigeria), Beni (Nigeria), Shona (Zimbabwe). Perfect phase (X chromosome). Section 2

30 Core haplotypes G6PD 5 3 2 1 4 Africans (230) 6 7 8 9 38 72 4 28 14 41 5 4 61 13 17 non-Africans (95) CD40 ligand 5 91 9 78 30 1 5 3 2 1 4 6 Africans (231) 77 21 7 non-Africans (91) “A-” protective haplotype Section 2

31 G6PD: long-range haplotype diversity. Figure panels: G6PD core haplotypes 1 and 3 to 8; core haplotype 8 is the “A-” protective haplotype. Section 2

32 G6PD: homozygosity vs. distance. Figure: EHH (extended haplotype homozygosity) plotted against distance from the core region (kb). Section 2

33 G6PD: computer simulation vs. data. Figure: relative EHH plotted against core haplotype frequency. Core haplotype 8: P << 0.0008. Section 2

34 G6PD: P-values from simulation. Figure: P-value plotted against distance from the core region (kb). Section 2

35 G6PD also stands out in comparison to 7 control regions. Figure: relative EHH plotted against core haplotype frequency. Section 2

36 CD40 ligand: long-range haplotype diversity. Figure panels: core haplotypes 1 to 5, with core haplotype 4 highlighted. Section 2

37 CD40 ligand: homozygosity vs. distance. Figure: EHH plotted against distance from the core region (kb). Section 2

38 CD40 ligand: computer simulation vs. data. Figure: relative EHH plotted against core haplotype frequency. Core haplotype 4: P << 0.0011. Section 2

39 CD40 ligand: P-values from simulation. Figure: P-value plotted against distance from the core region (kb). Section 2

40 CD40 ligand also stands out in comparison to 7 control regions. Figure: relative EHH plotted against core haplotype frequency. Section 2

41 Long-range linkage disequilibrium also gives a direct estimate of the date. Malaria resistance arose in the last 10,000 years in Africa: ~2,500 years ago for G6PD and ~6,500 years ago for CD40 ligand. Section 2
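The slide does not say how the dates were computed. One common approximation, assumed here rather than taken from the slides, is that homozygosity at recombination distance r (in Morgans) from the core decays as exp(-2rg) after g generations, so g can be solved for directly. The sketch below uses that relationship; the function names, the 25-year generation time, and the example inputs are all assumptions for illustration, not the values behind the slide's estimates.

```python
import math


def allele_age_generations(homozygosity, recomb_distance_morgans):
    """Solve homozygosity = exp(-2 * r * g) for g (generations).

    homozygosity: observed homozygosity (0 < value <= 1) at the given distance.
    recomb_distance_morgans: recombination distance r from the core, in Morgans.
    """
    if not 0.0 < homozygosity <= 1.0:
        raise ValueError("homozygosity must be in (0, 1]")
    if recomb_distance_morgans <= 0.0:
        raise ValueError("recombination distance must be positive")
    return -math.log(homozygosity) / (2.0 * recomb_distance_morgans)


def allele_age_years(homozygosity, recomb_distance_morgans,
                     years_per_generation=25.0):
    """Convert the generation estimate to years (generation time is assumed)."""
    generations = allele_age_generations(homozygosity, recomb_distance_morgans)
    return generations * years_per_generation


# Hypothetical inputs: homozygosity of 0.45 observed 0.4 cM (0.004 Morgans)
# from the core region.
print(round(allele_age_years(0.45, 0.004)))
```

Under this approximation, higher homozygosity at a given distance implies a younger allele, which is why long-range LD around a common haplotype points to a recent origin.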

42 Traditional tests fail to detect the effect: Tajima’s D, HKA test, McDonald and Kreitman, Fu and Li’s D, and Ka/Ks ratio are not significant in our data. The long-range haplotype test is a powerful way to detect selection in the last 10,000 years. Section 2

43 Conclusions: Powerful general approach for detecting selection. Figure: core haplotypes numbered 1 to 4. Section 2

44 Conclusions: Powerful general approach for detecting selection. Figure: core haplotypes numbered 1 to 5. Section 2

45 Conclusions: Powerful general approach for detecting selection. Screen the genome for positive selection. Figure: core haplotypes numbered 1 to 4. Section 2

46 Conclusions: Genome-wide screen for natural selection. We can find disease genes without patients! Section 2

47 What’s coming… 1. Generalization of the long-range haplotype test. 2. Application of the approach genome-wide: haplotype map data set; disease gene screen data sets. Section 2

48 Acknowledgements for Section 2: Pardis C. Sabeti, John Higgins, Haninah Z.P. Levine, Daniel J. Richter, Stephen F. Schaffner, Stacey Gabriel, Jill V. Platko, Nicholas J. Patterson, Gavin J. McDonald, Hans C. Ackerman, Sarah J. Campbell, David Altshuler, Richard Cooper, Ryk Ward, Eric S. Lander.

49 Note: The 3rd section of the talk is not included here because it presents data that have not yet been published.

