Topic #3 Linkage Disequilibrium, Haplotypes & Tagging University of Wisconsin Genetic Analysis Workshop June 2011.


1 Topic #3 Linkage Disequilibrium, Haplotypes & Tagging University of Wisconsin Genetic Analysis Workshop June 2011

2 Overview
Fate of a new mutation
Linkage Disequilibrium (LD)
–Measurement
–Indirect association
SNP selection based on LD
–Haplotypes
–SNP selection by tagging
Practical: SNP selection using Haploview

3 Introduction of a Mutation into a Population

4

5 Haplotype Concept
The sequence in this location becomes a signature for the chromosome carrying the mutation.
Haplotype: the alleles inherited together at linked loci on the same chromosome.
The haplotype will not be a perfect marker of disease:
–At the time the mutation arose, there may have been other chromosomes with the same haplotype
–New mutations
–Recombination

6

7 Indirect Association
Each of the alleles in the haplotype is also expected to be indirectly associated with carrying the mutation.
Indirect association is a non-causal association between a marker and a phenotype that arises from linkage disequilibrium (LD).

8 Linkage Disequilibrium (LD)
Mendel’s Second Law: alleles at different loci assort independently.
Linkage Disequilibrium (LD): population-level association of alleles at linked loci.

9 How LD is Measured
LD: population-level association between linked loci
A locus: alleles $A_1$ or $A_2$
B locus: alleles $B_1$ or $B_2$
Let $P(A_1) = p_{A_1}$, $P(B_1) = p_{B_1}$, and $P(A_1B_1) = p_{A_1B_1}$.
$D = p_{A_1B_1} - p_{A_1}\,p_{B_1}$, which equals 0 if the loci are independent.

10 Common LD Measures
$|D|$
–Preferred measure for population geneticists
–Maximum value is bounded by the marginal allele frequencies
$D' = |D| / D_{\max}$
–$D'$ varies between 0 and 1
–Does not have an easy interpretation; a value of 1.0 is reached whenever one off-diagonal haplotype frequency is zero
$r^2 = D^2 / [p(1-p)\,q(1-q)]$ (also written $\Delta^2$), where $p = p_{A_1}$ and $q = p_{B_1}$
–Has several interpretations: it is the squared (phi) correlation between the two loci, so it lies in [0, 1], and it equals $\chi^2/N$
–Directly related to power for indirect association
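The three measures on slides 9 and 10 can be computed directly from one haplotype frequency and two allele frequencies. The following is a minimal sketch, not code from the workshop; the function name `ld_measures` and its argument names are assumptions made for illustration, and $D_{\max}$ uses the standard sign-dependent bound.

```python
def ld_measures(p_a1b1, p_a1, p_b1):
    """Return (D, D', r^2) for two biallelic loci.

    p_a1b1 -- frequency of the A1-B1 haplotype
    p_a1   -- frequency of allele A1
    p_b1   -- frequency of allele B1
    (Names are illustrative, not taken from the slides.)
    """
    # D = p_A1B1 - p_A1 * p_B1 (slide 9); zero under independence.
    d = p_a1b1 - p_a1 * p_b1

    # D_max is the largest |D| the allele frequencies allow,
    # and depends on the sign of D.
    if d >= 0:
        d_max = min(p_a1 * (1 - p_b1), (1 - p_a1) * p_b1)
    else:
        d_max = min(p_a1 * p_b1, (1 - p_a1) * (1 - p_b1))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 = D^2 / [p(1-p) q(1-q)] (slide 10).
    denom = p_a1 * (1 - p_a1) * p_b1 * (1 - p_b1)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # A1 occurs only on B1 chromosomes, so the A1-B2 haplotype is absent.
    d, d_prime, r2 = ld_measures(p_a1b1=0.2, p_a1=0.2, p_b1=0.5)
    print(f"D={d:.3f}  D'={d_prime:.3f}  r^2={r2:.3f}")
```

In this example one off-diagonal haplotype has frequency zero, so $D' = 1$ even though $r^2$ is only 0.25. That gap is why $r^2$, rather than $D'$, is the quantity tied to power for indirect association.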

11 Allelic Association
Direct Association
–Initially it was thought that we could pick the genes, and the (single) genetic variant within each gene, that were relevant for disease
Indirect Association
–The existence of LD opens up the possibility of testing by indirect association: we do not need to test the causal variant itself, only to genotype a marker that is in high LD with it

12 Indirect and Direct Allelic Association
Direct Association: assess the relationship of the D locus to phenotype directly; D is expected to be a functional polymorphism in a candidate gene.
Indirect Association: assess the relationship of the D locus to phenotype indirectly, by determining whether markers ($M_1$, $M_2$, $M_3$) are associated with disease; the $M_i$ do not need to be functional.

13 Martin, E.R. et al. (2000). SNPing away at complex disease … AJHG 67:

14

15 Dawson, E. et al. (2002). A first-generation LD map of human chromosome 22. Nature 418:

16 Population Differences Weiss, K.M & Clark, A.G. (2002). Trends in Genetics, 18(1):19-24.

17 Recombination Hotspots
Hotspots typically span 1-2 kb.
Kauppi, L., Jeffreys, A. J., & Keeney, S. (2004). Where the crossovers are: Recombination distributions in mammals. Nature Reviews Genetics, 5.

18 Haplotype Blocks

19 Two- and Three-locus Haplotypes
APOE locus and haplotypes containing APOE
Martin, E.R. et al. (2000). SNPing away at complex disease … AJHG 67:

20 Two- and Three-locus Haplotypes
The multilocus haplotype shows a stronger signal than the individual markers.
Martin, E.R. et al. (2000). SNPing away at complex disease … AJHG 67:

21 SNP Selection by Tagging
Basic rationale:
–The power to detect a causal SNP in a sample of size $N$ equals the power to detect it through a tagging SNP in a sample of size $N/r^2$, where $r^2$ is the LD between the tag and the causal SNP
Tagging SNP selection:
–Based on some reference sample (HapMap)
–Two overarching strategies: pairwise tagging and multimarker tagging
de Bakker, P. I. W., et al. (2005). Efficiency and power in genetic association studies. Nature Genetics, 37(11).
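The $N/r^2$ rule on slide 21 translates directly into a sample-size inflation factor. A minimal sketch follows; the helper name `tag_sample_size` is hypothetical, and rounding up to a whole individual is a choice made here, not something stated on the slide.

```python
import math


def tag_sample_size(n_causal, r2):
    """Sample size needed when testing a tag SNP instead of the causal SNP.

    n_causal -- sample size that gives the desired power if the causal
                SNP itself were genotyped
    r2       -- r^2 between the tag SNP and the causal SNP
    (Hypothetical helper; applies the N / r^2 rule from slide 21.)
    """
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    # Round up so the tag-based study is never underpowered.
    return math.ceil(n_causal / r2)


if __name__ == "__main__":
    for r2 in (1.0, 0.5, 0.25):
        print(r2, tag_sample_size(1000, r2))
```

At the commonly used threshold of $r^2 = 0.8$, the rule gives $1000 / 0.8 = 1250$, a 25% increase in sample size relative to genotyping the causal variant directly.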

22 Reference Sample: HapMap (www.hapmap.org)
HapMap Phase 1:
–SNP selection strategy (yield ~1 million):
  At least 1 common SNP every 5 kb, total of 1.3 million before QC
  MAF > .05
  Some priority for non-synonymous cSNPs
–Sample: N=270 (269) individuals from 4 populations
  30 trios of European ancestry from Utah (CEU)
  45 unrelated Han Chinese (CHB)
  45 unrelated Japanese (JPT)
  30 Yoruba trios from Nigeria (YRI)

23 Reference Sample: HapMap (www.hapmap.org)
Phase 2:
–2.1 million additional SNPs
  Total now averages ~1 SNP per kb; >98% of common variants within 5 kb
  Focus still on MAF > .05
Average maximum $r^2$ of untyped common SNPs to a typed SNP:
  Population   HapMap I   HapMap II
  YRI          …          …
  CEU          …          …
  CHB+JPT      .83        .95

24 Reference Sample: HapMap (www.hapmap.org)
Phase 3:
–Expand to N=1115 in 11 ancestral groups
–2.1 million additional SNPs
* Sample consists of family trios

25 [Figure: HapMap browser view of COMT. Callouts: "HAPMAP3, Release 2"; "Region in NCBI B36"; "Phase, Release and Build"]

26 HapMap Genotyped SNPs in COMT

27 Using Haploview to Identify Tagging SNPs for COMT
Download Data from HapMap
–Choose HapMap Download, Phase 3, and Release 2
–Choose population
–Choose chromosome (22) and region (NCBI B36/hg18)
  Transcription starts at 18309; I will start at
  Transcription ends at 18337; I will end at
Haploview Analysis
–Get LD plot
–Run Tagger (pairwise)
–Force include/exclude
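To make the idea behind pairwise tagging with forced inclusion and exclusion concrete, here is a minimal greedy sketch. It is not Haploview's Tagger implementation; it is a simple set-cover heuristic written for illustration. The function name, the argument names, and the use of `frozenset` pairs as keys are all assumptions, and pairs missing from `r2` are treated as having $r^2 = 0$.

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8,
                         force_include=(), force_exclude=()):
    """Greedy pairwise tag-SNP selection (illustrative sketch only).

    snps          -- list of SNP identifiers
    r2            -- dict mapping frozenset({snp_a, snp_b}) to pairwise r^2;
                     missing pairs are treated as r^2 = 0
    threshold     -- minimum r^2 for a tag to capture another SNP
    force_include -- SNPs that must be used as tags
    force_exclude -- SNPs that may not be used as tags
    Returns (tags, untagged): chosen tags and any SNPs left uncaptured.
    """
    def captured_by(tag):
        # A tag captures itself plus every SNP at or above the threshold.
        return {s for s in snps
                if s == tag or r2.get(frozenset((s, tag)), 0.0) >= threshold}

    tags = list(force_include)
    untagged = set(snps)
    for tag in tags:
        untagged -= captured_by(tag)

    candidates = [s for s in snps
                  if s not in force_exclude and s not in tags]
    while untagged:
        # Pick the candidate that captures the most still-untagged SNPs.
        best = max(candidates,
                   key=lambda s: len(captured_by(s) & untagged),
                   default=None)
        if best is None or not (captured_by(best) & untagged):
            break  # leftovers can only be captured by excluded SNPs
        tags.append(best)
        candidates.remove(best)
        untagged -= captured_by(best)
    return tags, untagged


if __name__ == "__main__":
    snps = ["s1", "s2", "s3", "s4"]  # made-up identifiers
    r2 = {
        frozenset(("s1", "s2")): 0.95,
        frozenset(("s2", "s3")): 0.90,
        frozenset(("s1", "s3")): 0.50,
    }
    print(greedy_pairwise_tags(snps, r2))
    print(greedy_pairwise_tags(snps, r2, force_exclude=("s2",)))
```

In the toy data, `s2` captures both `s1` and `s3`, so two tags cover all four SNPs. Excluding `s2` (for example, because it cannot be assayed) raises the number of tags needed to three, which is the trade-off the force include/exclude step exposes.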

28 COMT LD Plot (D’)

29 COMT LD Plot ($r^2$)

30 COMT Tagging SNPs (15 tag 24 at avg $r^2$ = .996)
  Tag SNP   bp   Location    MAF   Other SNPs Tagged
  rs…       …    …’          .37   rs…
  rs…       …    …’          .21   rs…
  rs…       …    Intron #1   .32   rs737866, rs…, rs…
  rs…       …    Intron #1   .31   rs…
  rs…       …    Intron #1   .26
  rs…       …    Intron #1   .41
  rs…       …    Intron #2   .45   rs…, rs…
  rs…       …    Exon #4     .47   rs4633
  rs…       …    Intron #5   .25
  rs…       …    Intron #5   .18
  rs…       …    Intron #5   .11
  rs…       …    Intron #5   .17   rs…
  rs…       …    Intron #5   .06
  rs…       …    …’          .12
  rs…       …    …’          .06

31

32 LD Plot Available from SNPInfo (http://manticore.niehs.nih.gov/)

33 Conclusions
Alleles at linked loci tend to be inherited together, a phenomenon known as linkage disequilibrium (LD).
Because recombination is not uniform, the genome has a “block-like” structure (haplotype blocks).
You do not need to have the “causal variant” in your genotyped set if it is adequately tagged.
A major strategy for SNP selection is to ensure adequate coverage ($r^2$ > .8) of common genetic variants in a gene, which can be done with Haploview.

