010101100010010100001010101010011011100110001100101000100101 Human Population Genomics ACGTTTGACTGAGGAGTTTACGGGAGCAAAGCGGCGTCATTGCTATTCGTATCTGTTTAG.


1 010101100010010100001010101010011011100110001100101000100101 Human Population Genomics ACGTTTGACTGAGGAGTTTACGGGAGCAAAGCGGCGTCATTGCTATTCGTATCTGTTTAG

2 How soon will we all be sequenced? Cost, killer apps, roadblocks? (Figure labels: Time, 2013? 2018?, Cost, Applications)

3 The Hominid Lineage

4 Human population migrations Out of Africa, Replacement –Single mother of all humans (Eve) ~190,000 yr ago –Single father of all humans (Adam) ~340,000 yr ago –Humans who left Africa ~50,000 years ago replaced others (e.g., Neanderthals) Multiregional Evolution –Generally debunked; however, ~5% of the genome in Europeans and Asians is Neanderthal or Denisovan

5 Coalescence Y-chromosome coalescence

6 Why humans are so similar Out of Africa Oppenheimer S Phil. Trans. R. Soc. B 2012;367:770-784

7 Some Key Definitions Mary: AGCCCGTACG John: AGCCCGTACG Josh: AGCCCGTACG Kate: AGCCCGTACG Pete: AGCCCGTACG Anne: AGCCCGTACG Mimi: AGCCCGTACG Mike: AGCCCTTACG Olga: AGCCCTTACG Tony: AGCCCTTACG Alleles: G, T Major Allele: G Minor Allele: T Genotypes (Mom/Dad): G/G G/T G/G T/T T/G Recombinations: At least 1/chromosome; on average ~1/100 Mb Linkage Disequilibrium: The degree of correlation between two SNP locations
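The definitions on this slide can be illustrated with a short sketch that finds the variable site in the ten sequences and reports its major allele, minor allele, and minor allele frequency. It assumes the sequences are already aligned and that each person contributes one sequence; the function names `variable_sites` and `major_minor` are hypothetical.

```python
from collections import Counter

# The ten aligned sequences listed on the slide (one per individual).
sequences = {
    "Mary": "AGCCCGTACG", "John": "AGCCCGTACG", "Josh": "AGCCCGTACG",
    "Kate": "AGCCCGTACG", "Pete": "AGCCCGTACG", "Anne": "AGCCCGTACG",
    "Mimi": "AGCCCGTACG", "Mike": "AGCCCTTACG", "Olga": "AGCCCTTACG",
    "Tony": "AGCCCTTACG",
}

def variable_sites(seqs):
    """Return {0-based position: Counter of alleles} for columns with >1 allele."""
    columns = zip(*seqs.values())
    return {i: Counter(col) for i, col in enumerate(columns) if len(set(col)) > 1}

def major_minor(counts):
    """Return (major allele, minor allele, minor allele frequency) at a biallelic site."""
    (major, _), (minor, n_minor) = counts.most_common(2)
    return major, minor, n_minor / sum(counts.values())

sites = variable_sites(sequences)
for pos, counts in sites.items():
    print(pos, major_minor(counts))
```

On the slide's data the only variable column is the sixth base, where G is the major allele and T the minor allele.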

8 Human Genome Variation SNP: TGCTGAGA / TGCCGAGA; Novel Sequence: TGCTCGGAGA / TGC---GAGA; Microdeletion: TGC--AGA / TGCCGAGA; Inversion; Mobile Element or Pseudogene Insertion; Translocation; Tandem Duplication; Transposition; Large Deletion; Novel Sequence at Breakpoint

9 The Fall in Heterozygosity $F_{ST} = \frac{H - H_{POP}}{H}$
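A minimal sketch of the F_ST ratio for a single biallelic site. The slide does not define its symbols, so this assumes the usual reading: H is the expected heterozygosity 2p(1-p) of the pooled population, H_POP is the average expected heterozygosity within subpopulations, and the subpopulations are equally sized. The function names are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic site with allele frequency p."""
    return 2 * p * (1 - p)

def fst(subpop_freqs):
    """F_ST = (H - H_POP) / H, assuming equally sized subpopulations."""
    k = len(subpop_freqs)
    p_total = sum(subpop_freqs) / k                            # pooled allele frequency
    h_total = heterozygosity(p_total)                          # H
    h_pop = sum(heterozygosity(p) for p in subpop_freqs) / k   # H_POP
    if h_total == 0:
        raise ValueError("site is monomorphic in the pooled population")
    return (h_total - h_pop) / h_total

print(fst([0.5, 0.5]))  # identical subpopulations
print(fst([0.2, 0.8]))  # diverged subpopulations
print(fst([0.0, 1.0]))  # fixed for different alleles
```

Under these assumptions the value runs from 0 (subpopulations share the same allele frequency) to 1 (subpopulations fixed for different alleles).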

10 The Neanderthal Genome Genomes of three different Neanderthals, sequenced from bones, were compared with five genomes of modern humans from different areas of the world. Figure 1, R. E. Green et al., Science 328, 710-722 (2010)

11 Neanderthal Genome

12

13 Denisovan – Another human relative

14 Denisovan/Human Comparison

15 Aboriginal Australian

16 Benefits of Admixture

17 Out of Africa Revisited Ann Gibbons Science 28 January 2011: “Human uniqueness?”

18 The HapMap Project
Code  Population                                   Samples
ASW   African ancestry in Southwest USA            90
CEU   Northern and Western Europeans (Utah)        180
CHB   Han Chinese in Beijing, China                90
CHD   Chinese in Metropolitan Denver               100
GIH   Gujarati Indians in Houston, Texas           100
JPT   Japanese in Tokyo, Japan                     91
LWK   Luhya in Webuye, Kenya                       100
MXL   Mexican ancestry in Los Angeles              90
MKK   Maasai in Kinyawa, Kenya                     180
TSI   Toscani in Italia                            100
YRI   Yoruba in Ibadan, Nigeria                    100
Genotyping: Probe a limited number (~1M) of known highly variable positions of the human genome

19 Linkage Disequilibrium & Haplotype Blocks Minor alleles: A (frequency $p_A$) at one SNP, G (frequency $p_G$) at the other. Linkage Disequilibrium (LD): $D = P(A \text{ and } G) - p_A p_G$
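The LD coefficient D can be computed directly from phased haplotypes. The sketch below assumes each haplotype is given as a pair (allele at the first SNP, allele at the second SNP); the example data and the function name `ld_d` are hypothetical and not taken from the slides.

```python
def ld_d(haplotypes, allele1="A", allele2="G"):
    """D = P(allele1 and allele2 on the same haplotype) - p_allele1 * p_allele2."""
    n = len(haplotypes)
    p_a = sum(h[0] == allele1 for h in haplotypes) / n
    p_g = sum(h[1] == allele2 for h in haplotypes) / n
    p_ag = sum(h == (allele1, allele2) for h in haplotypes) / n
    return p_ag - p_a * p_g

# Hypothetical data: A and G always travel together (strong LD).
linked = [("A", "G")] * 3 + [("C", "T")] * 7
# Hypothetical data: every allele combination equally common (no LD).
unlinked = [("A", "G"), ("A", "T"), ("C", "G"), ("C", "T")]

print(ld_d(linked))
print(ld_d(unlinked))
```

D is zero when the two alleles occur together exactly as often as their individual frequencies predict, and positive when they co-occur more often than that.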

20 Population Sequencing – 1000 Genomes Project

21 Population Sequencing – 1000 Genomes Project

22 Association Studies
Example genotypes shown: A/G G/G A/G G/G A/A A/G A/A A/G A/A
Genotype  Control  Disease
AA        0        4
AG        3        3
GG        4        0
p-value
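One way to obtain a p-value from genotype counts like these is an allelic chi-square test, sketched below. This is an illustration, not necessarily the test the slide had in mind. It reads the slide's counts as (control, disease) pairs, AA = (0, 4), AG = (3, 3), GG = (4, 0), which is an assumption about the flattened table. With counts this small an exact test would normally be preferred; the chi-square form is used here only because it needs nothing beyond the standard library.

```python
from math import erfc, sqrt

# Genotype counts as (control, disease); the column order is an assumption.
genotype_counts = {"AA": (0, 4), "AG": (3, 3), "GG": (4, 0)}

def allele_counts(counts, column):
    """Convert genotype counts in one column into (A alleles, G alleles)."""
    a = 2 * counts["AA"][column] + counts["AG"][column]
    g = 2 * counts["GG"][column] + counts["AG"][column]
    return a, g

def allelic_chi2(counts):
    """Chi-square test (1 degree of freedom) on the 2x2 allele-by-status table."""
    a_ctrl, g_ctrl = allele_counts(counts, 0)
    a_case, g_case = allele_counts(counts, 1)
    n = a_ctrl + g_ctrl + a_case + g_case
    num = n * (a_ctrl * g_case - g_ctrl * a_case) ** 2
    den = (a_ctrl + g_ctrl) * (a_case + g_case) * (a_ctrl + a_case) * (g_ctrl + g_case)
    chi2 = num / den
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value

chi2, p_value = allelic_chi2(genotype_counts)
print(chi2, p_value)
```

A genome-wide study repeats such a test at every genotyped SNP, so the significance threshold has to be corrected for the number of tests.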

23 Wellcome Trust Case Control Consortium Nature 447, 661-678 (7 June 2007) Nature 464, 713-720 (1 April 2010) Many associations of small effect size (odds ratio < 1.5)

24 Heritability & Environment Bienvenu OJ, Davydow DS, & Kendler KS (2011). Psychological medicine, 41 (1), 33-40 PMID:

25 Disease Clustering RA vs. ATD RA vs. MS –No recorded co-occurrence of RA and MS
Genetic Variation Score (GVS) per disease cohort:
SNP         Allele  Gene     RA (NARAC)  RA      AS      T1D     ATD    MS (IMSGC)  MS
rs11752919  C       ZSCAN23  -3.48       -3.21   -9.39   1.10    0.70   3.25        2.99
rs151719    G       HLA-DMB  -6.71       -4.77   -1.08   -13.63  0.34   8.58        17.76
rs10484565  T       TAP2     25.52       8.37    1.34    15.74   -1.36  -0.56       -0.30
rs1264303   G       VARS2    11.51       7.36    18.76   0.89    -1.76  -1.85       -1.75
rs1265048   C       CDSN     6.59        2.97    50.13   6.34    -0.85  -2.39       -4.16
rs2071286   A       NOTCH4   5.30        0.78    6.42    4.04    -0.03  -1.89       -2.45
rs2076530   G       BTNL2    67.49       56.46   14.06   13.58   -6.41  -9.50       -18.52
rs757262    T       TRIM40   14.58       9.11    6.27    1.56    -0.79  -2.05       -7.34
rs3130981   A       CDSN     -0.46 -9.47 -4.94 0.33 10.00 13.41 (only six of the seven values survive; column assignment uncertain)

26 Global Ancestry Inference Nature. 2008 November 6; 456(7218): 98–101.

27 Ancestry Painting ALLOY: A factorial HMM for ancestry painting (figure panels: Danish, French, Spanish, Mexican)

28 Modeling population haplotypes – VLMC Browning, 2006

29 Phasing Browning & Browning, 2007

30 Identity By Descent

31 IBD detection IBD = F IBD = T FastIBD: sample haplotypes for each individual, check for IBD (Browning & Browning 2011) Parente (Rodriguez et al. 2013)

32 Fixation, Positive & Negative Selection Neutral Drift Positive Selection Negative Selection How can we detect negative selection? How can we detect positive selection?

33 Ka/Ks ratio: Ratio of nonsynonymous to synonymous substitutions Very old, persistent, strong positive selection for a protein that keeps adapting Examples: immune response, spermatogenesis
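The distinction between nonsynonymous and synonymous substitutions can be sketched by classifying codon differences between two aligned coding sequences. This is a deliberately simplified illustration: it only counts differences, whereas a real Ka/Ks estimate also normalizes by the number of nonsynonymous and synonymous sites and corrects for multiple substitutions at the same site. The example sequences and the function name `count_substitutions` are hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def count_substitutions(seq1, seq2):
    """Count (nonsynonymous, synonymous) single-base codon differences.

    Codons that differ at more than one position are skipped, because
    classifying them requires choosing a mutational path.
    """
    if len(seq1) != len(seq2) or len(seq1) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    nonsyn = syn = 0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        if sum(a != b for a, b in zip(c1, c2)) != 1:
            continue
        if CODON_TABLE[c1] == CODON_TABLE[c2]:
            syn += 1      # same amino acid: synonymous
        else:
            nonsyn += 1   # amino acid changes: nonsynonymous
    return nonsyn, syn

# Hypothetical pair: GAA->GAG keeps Glu, GGT->GTT changes Gly to Val.
print(count_substitutions("ATGGAAGGT", "ATGGAGGTT"))
```

A ratio well above 1 after proper normalization is the signal the slide associates with persistent positive selection.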

34 How can we detect positive selection?

35 Positive Selection in Human Lineage

36

37 Mutations and LD Slide Credits: Marc Schaub

38

39 Extended Haplotype Homozygosity Chr S1 S2 S3 S4 S5 S6 Slide Credits: Marc Schaub

40 Extended Haplotype Homozygosity Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C Slide Credits: Marc Schaub

41 Extended Haplotype Homozygosity Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C 3 core haplotypes: ch 0 = 101, ch 1 = 111, ch 2 = 100 Slide Credits: Marc Schaub

42 Extended Haplotype Homozygosity Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C 3 core haplotypes: ch 0 = 101, ch 1 = 111, ch 2 = 100 Slide Credits: Marc Schaub

43 Extended Haplotype Homozygosity Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C 3 core haplotypes: ch 0 = 101, ch 1 = 111, ch 2 = 100 Slide Credits: Marc Schaub

44 Extended Haplotype Homozygosity Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C Given a core haplotype (101) and a SNP (S6), EHH is the probability that two randomly chosen chromosomes carrying core haplotype 101 are identical (homozygous) at every SNP from the core to S6.

45 Extended Haplotype Homozygosity Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C EHH is the probability that two randomly chosen chromosomes carrying core haplotype 101 are identical (homozygous) at every SNP from the core to S6.

46 Extended Haplotype Homozygosity EHH is the probability that two randomly chosen chromosomes carrying core haplotype 101 are identical (homozygous) at every SNP from the core to S6. Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C

47 Extended Haplotype Homozygosity Haplotypes (Chr: SNPs S1-S6): A 101110, B 111010, C 101110, D 111100, E 100011, F 101110, G 100100, H 111110, I 101111, J 101111 Core C Slide Credits: Marc Schaub
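The EHH definition can be checked on the ten example chromosomes with a short sketch. It assumes the core spans the first three SNPs (S1 to S3), which matches the three core haplotypes 101, 111 and 100 listed on the slides, and that pairs of chromosomes are drawn without replacement. The function name `ehh` and its parameters are hypothetical.

```python
from collections import Counter
from math import comb

# The ten example chromosomes, alleles at SNPs S1..S6.
haplotypes = {
    "A": "101110", "B": "111010", "C": "101110", "D": "111100", "E": "100011",
    "F": "101110", "G": "100100", "H": "111110", "I": "101111", "J": "101111",
}

def ehh(haps, core, end, core_len=3):
    """EHH for one core haplotype, extended from the core out to SNP number `end`.

    Probability that two distinct chromosomes carrying `core` are identical
    at every SNP from S1 through S<end>.
    """
    carriers = [h[:end] for h in haps.values() if h[:core_len] == core]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core haplotype")
    identical_pairs = sum(comb(c, 2) for c in Counter(carriers).values())
    return identical_pairs / comb(n, 2)

for snp in (4, 5, 6):
    print(f"core 101 -> S{snp}:", ehh(haplotypes, "101", snp))
```

Five chromosomes (A, C, F, I, J) carry core 101. They stay identical through S5 and split into groups of three and two at S6, so four of the ten possible pairs remain identical there.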

48 Application: Malaria Study of genes known to be implicated in resistance to malaria. Malaria: infectious disease caused by protozoan parasites of the genus Plasmodium; frequent in tropical and subtropical regions; transmitted by the Anopheles mosquito. Image source: wikipedia.org Slide Credits: Marc Schaub

49 Application: Malaria Image source: NIH - http://history.nih.gov/exhibits/bowman/images/malariacycleBig.jpg Slide Credits: Marc Schaub

50 Application: Malaria Image source: CDC - http://www.dpd.cdc.gov/dpdx/images/ParasiteImages/M-R/Malaria/malaria_risk_2003.gif Slide Credits: Marc Schaub

51 Source: Sabeti et al. Nature 2002. Results: G6PD Slide Credits: Marc Schaub

52 Results: G6PD Source: Sabeti et al. Nature 2002. Slide Credits: Marc Schaub

53 Results: TNFSF5 Source: Sabeti et al. Nature 2002. Slide Credits: Marc Schaub

54 Malaria and Sickle-cell Anemia Allison (1954): Sickle-cell anemia is limited to the region in Africa in which malaria is endemic. Figure panels: Distribution of malaria; Distribution of sickle-cell anemia. Image source: wikipedia.org Slide Credits: Marc Schaub

55 Malaria and Sickle-cell Anemia Hypothesis: the mutation causing sickle-cell anemia was positively selected because it confers resistance to malaria. Currat (2002) and Ohashi (2004) identify the mutations in African and Asian populations, respectively. Slide Credits: Marc Schaub

56 Malaria and Sickle-cell Anemia Single point mutation in the coding region of the Hemoglobin-B gene (glu → val). Heterozygote advantage: resistance to malaria, slight anemia. Image source: wikipedia.org Slide Credits: Marc Schaub

57

58 Lactose Intolerance Source: Ingram and Swallow. Population Genetics of Lactase Persistence and Lactose Intolerance. Encyclopedia of Life Sciences. 2007. Slide Credits: Marc Schaub

59 Lactose Intolerance Figure labels: LCT, 5’; LCT, 3’ Source: Bersaglieri et al. Am. J. Hum. Genet. 2004. Slide Credits: Marc Schaub

60 Source: Catherine Janet Ellen Ingram and Dallas Mary Swallow. Population Genetics of Lactase Persistence and Lactose Intolerance. Encyclopedia of Life Sciences. 2007. Slide Credits: Marc Schaub

61 Finding the Causal Marker -13910*T is associated with persistent lactose tolerance. Is this mutation causal? It does not account for tolerance in sub-Saharan populations (Mulcare 2004). Additional SNPs in an enhancer within 100 bp are associated with lactose tolerance. There are several independent causes of lactose tolerance (reviewed in Ingram 2009). Slide Credits: Marc Schaub

62 Figure panels: Lactase persistence (literature); Predicted lactase persistence; 13910*T distribution. Source: Ingram et al. Lactose digestion and the evolutionary genetics of lactase persistence. Hum Genet. 2009 Jan;124(6):579-91. Slide Credits: Marc Schaub

63 Long Haplotypes – iHS test Less time: fewer mutations, fewer recombinations

64 Positive Selection in Human Lineage

65 Immune System & Archaic Admixture

66

67 Orthology and Paralogy Tree leaves: HB Human, WB Worm, HA1 Human, HA2 Human, Yeast, WA Worm Orthologs: Derived by speciation Paralogs: Everything else

68 Orthology, Paralogy, Inparalogs, Outparalogs

