Topic #3 Linkage Disequilibrium, Haplotypes & Tagging


1 Topic #3 Linkage Disequilibrium, Haplotypes & Tagging
University of Wisconsin Genetic Analysis Workshop June 2011

2 Overview
Fate of a new mutation
Linkage Disequilibrium (LD)
Measurement
Indirect association
SNP selection based on LD
Haplotypes
SNP selection by tagging
Practical: SNP selection using Haploview

3 Introduction of a Mutation into a Population
TIME

4 Introduction of a Mutation into a Population
TIME

5 Haplotype Concept
The sequence in this location becomes a signature for the chromosome carrying the mutation.
Haplotype: alleles inherited together at linked loci on the same chromosome.
The haplotype will not be a perfect marker of disease:
At the time the mutation arose, there may have been other chromosomes with
New mutations
Recombination

6

7 Indirect Association Each of the alleles in the haplotype is also expected to be indirectly associated with carrying the mutation. Indirect association is an association of a marker with phenotype that is non-causal, being based on linkage disequilibrium (LD)

8 Linkage Disequilibrium (LD)
Mendel’s Second Law: alleles at different loci assort independently.
Linkage Disequilibrium (LD): population-level association of alleles at linked loci.

9 How LD is Measured
LD: population-level association between linked loci
A locus: alleles A1 or A2
B locus: alleles B1 or B2
Let $P(A_1) = p_{A_1}$, $P(B_1) = p_{B_1}$, $P(A_1B_1) = p_{A_1B_1}$
$D = p_{A_1B_1} - p_{A_1}\,p_{B_1}$, which is 0 if the two loci are independent
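As a rough illustration of the definition above, the sketch below computes D from counts of the four two-locus haplotypes. The function name, argument names, and the example counts are hypothetical and chosen only for illustration.

```python
def ld_coefficient(n_a1b1, n_a1b2, n_a2b1, n_a2b2):
    """Return D = p(A1B1) - p(A1) * p(B1) from haplotype counts.

    The four arguments are (hypothetical) counts of chromosomes carrying
    each two-locus haplotype.
    """
    total = n_a1b1 + n_a1b2 + n_a2b1 + n_a2b2
    if total == 0:
        raise ValueError("at least one haplotype must be observed")
    p_a1b1 = n_a1b1 / total             # frequency of the A1B1 haplotype
    p_a1 = (n_a1b1 + n_a1b2) / total    # marginal frequency of allele A1
    p_b1 = (n_a1b1 + n_a2b1) / total    # marginal frequency of allele B1
    return p_a1b1 - p_a1 * p_b1


if __name__ == "__main__":
    # Made-up counts: A1 and B1 co-occur more often than expected by chance.
    print(ld_coefficient(40, 10, 10, 40))
    # Made-up counts with no association between the loci.
    print(ld_coefficient(25, 25, 25, 25))
```

With equal counts in all four cells the haplotype frequency equals the product of the allele frequencies, so D is 0, matching the independence case on the slide.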

10 Common LD Measures
(d is the LD coefficient $p_{A_1B_1} - p_{A_1}\,p_{B_1}$; p and q are the allele frequencies at the two loci)
$D = |d|$
Preferred measure for population geneticists
Maximum value is bounded by the marginals
$D' = |d| / d_{max}$
D' varies between 0 and 1
Does not have an easy interpretation; 1.0 is achieved if one off-diagonal is zero
$r^2$ ($\Delta^2$) $= D^2 / [\,p(1-p)\,q(1-q)\,]$
Has several interpretations:
= squared (phi) correlation, so lies in [0,1]
= $\chi^2 / N$
Directly related to power for indirect association
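The measures on this slide can be sketched in a few lines. The slide does not spell out how dmax is obtained; the code below assumes the standard Lewontin normalization (the largest |d| permitted by the allele frequencies, which depends on the sign of d). Function and argument names are hypothetical.

```python
def ld_measures(p_a1b1, p_a1, p_b1):
    """Return (D', r^2) for two biallelic loci.

    p_a1b1 is the A1B1 haplotype frequency; p_a1 and p_b1 are allele
    frequencies. d_max follows the usual Lewontin definition, which is
    an assumption here because the slide does not define it.
    """
    if not (0 < p_a1 < 1 and 0 < p_b1 < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_a1b1 - p_a1 * p_b1
    if d > 0:
        d_max = min(p_a1 * (1 - p_b1), (1 - p_a1) * p_b1)
    else:
        d_max = min(p_a1 * p_b1, (1 - p_a1) * (1 - p_b1))
    d_prime = abs(d) / d_max if d != 0 else 0.0
    r2 = d * d / (p_a1 * (1 - p_a1) * p_b1 * (1 - p_b1))
    return d_prime, r2


if __name__ == "__main__":
    # Moderate LD with equal allele frequencies (made-up numbers).
    print(ld_measures(0.4, 0.5, 0.5))
    # One off-diagonal haplotype (A1B2) is absent: p(A1B1) equals p(A1).
    print(ld_measures(0.3, 0.3, 0.5))
```

In the second call the A1B2 haplotype has frequency zero, so D' reaches 1 while r2 stays well below 1. This is one reason the two measures can rank the same pair of SNPs very differently.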

11 Allelic Association
Direct Association: Initially it was thought that we could pick the genes, and the (single) genetic variant within each gene, that were relevant for disease.
Indirect Association: The existence of LD opens up the possibility of tests by indirect association. We don’t need to test the causal variant itself; we need only genotype a marker that is in high LD with the causal variant.

12 Indirect and Direct Allelic Association
[Diagram labels: D (direct association); M1, M2, D, M3 (indirect association)]
Direct Association: Assess relationship of D locus to phenotype directly; expect D to be a functional polymorphism in a candidate gene.
Indirect Association: Assess relationship of D locus indirectly by determining whether markers (Mi) are associated with disease; the Mi don’t need to be functional.

13 Martin, E.R. et al. (2000). SNPing away at complex disease … AJHG 67: 383-394

14

15 Dawson, E. et al. (2002). A first-generation LD map of 22. Nature 418:

16 Population Differences
Weiss, K.M & Clark, A.G. (2002). Trends in Genetics, 18(1):19-24.

17 Recombination Hotspots
Hotspots typically span 1-2 kb
Kauppi, L., Jeffreys, A. J., & Keeney, S. (2004). Where the crossovers are: Recombination distributions in mammals. Nature Reviews Genetics, 5,

18 Haplotype Blocks

19 Two- and Three-locus Haplotypes
APOE locus and haplotypes containing APOE
Martin, E.R. et al. (2000). SNPing away at complex disease … AJHG 67:

20 Two- and Three-locus Haplotypes
3-locus haplotype gives a stronger signal than individual markers
Martin, E.R. et al. (2000). SNPing away at complex disease … AJHG 67:

21 SNP Selection by Tagging
Basic rationale: the power to detect a causal SNP in a sample of size N is equivalent to the power of a tagging SNP in a sample of size $N/r^2$
Tagging SNP selection:
Based on some reference sample (HapMap)
Two overarching strategies: pairwise tagging and multimarker tagging
de Bakker, P. I. W., et al. (2005). Efficiency and power in genetic association studies. Nature Genetics, 37(11),
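The N/r2 rule of thumb is easy to work through numerically. The short sketch below (function name and example numbers are hypothetical) returns the sample size a tag SNP would need to match the power of genotyping the causal SNP directly.

```python
def tag_sample_size(n_causal, r2):
    """Sample size at which a tag SNP matches the power that the causal
    SNP would have in a sample of size n_causal (the N / r^2 rule)."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return n_causal / r2


if __name__ == "__main__":
    # Made-up example: 1000 subjects would suffice for the causal SNP.
    for r2 in (1.0, 0.8, 0.5):
        print(r2, tag_sample_size(1000, r2))
```

At r2 = 0.8 the sample must grow by a quarter, and at r2 = 0.5 it must double, which gives some intuition for why r2 thresholds around 0.8 are a common choice in tag SNP selection.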

22 Reference Sample: HapMap (www.hapmap.org)
HapMap Phase 1
SNP selection strategy (yield ~1 million):
  >1 common SNP every 5 kb, total of 1.3 million before QC
  MAF > .05
  Some priority for non-synonymous cSNPs
Sample: N=270 (269) individuals from 4 populations
  30 trios of Europeans from Utah (CEU)
  45 unrelated Han Chinese (CHB)
  45 unrelated Japanese (JPT)
  30 Yoruban trios from Nigeria (YRI)

23 Reference Sample: HapMap (www.hapmap.org)
Phase 2: 2.1 million additional SNPs
Total now averages ~1 per kb; >98% of common variants within 5 kb
Focus still on MAF > .05
Average max r2 of untyped common SNPs to a typed SNP:
Population   HapMap I   HapMap II
YRI          .67        .90
CEU          .85        .96
CHB+JPT      .83        .95

24 Reference Sample: HapMap (www.hapmap.org)
Phase 3: Expand to N=1115 in 11 ancestral groups
2.1 million additional SNPs
* Sample consists of family triples

25 [Screenshot annotations: HAPMAP3, Release 2; Region in NCBI B36; COMT; Phase, Release and Build]

26 HapMap Genotyped SNPs in COMT

27 Using Haploview to Identify Tagging SNPs for COMT
Download Data from HapMap
  Choose HapMap Download, Phase 3, and Release 2
  Choose population
  Choose chromosome (22) and region (NCBI B36/hg18)
  Transcription starts at 18309; I will start at 18304
  Transcription ends at 18337; I will end at 18340
Haploview Analysis
  Get LD plot
  Run Tagger (pairwise)
  Force include/exclude
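To give a feel for what pairwise tagging with forced inclusion and exclusion involves, here is a simplified greedy sketch. It is not Haploview's Tagger implementation; the function name, the input format (a dictionary of pairwise r2 values), and the SNP labels are all hypothetical.

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8,
                         force_include=(), force_exclude=()):
    """Greedy pairwise tag selection (illustrative sketch only).

    snps: list of SNP labels to capture.
    r2:   dict mapping (snp_a, snp_b) pairs to pairwise r^2; pairs that
          are missing are treated as r^2 = 0.
    Returns (tags, untagged), where untagged holds SNPs that no allowed
    tag captures at the given threshold.
    """
    def pair_r2(a, b):
        if a == b:
            return 1.0
        return r2.get((a, b), r2.get((b, a), 0.0))

    untagged = set(snps)
    tags = []

    def captured_by(tag):
        # SNPs still untagged that this candidate would capture.
        return {s for s in untagged if pair_r2(tag, s) >= threshold}

    # Forced tags go in first, whatever they capture.
    for tag in force_include:
        tags.append(tag)
        untagged -= captured_by(tag)

    candidates = [s for s in snps
                  if s not in force_exclude and s not in tags]
    while untagged:
        best = max(candidates, key=lambda s: len(captured_by(s)),
                   default=None)
        if best is None or not captured_by(best):
            break  # nothing left that an allowed candidate can capture
        tags.append(best)
        untagged -= captured_by(best)
        candidates.remove(best)
    return tags, untagged


if __name__ == "__main__":
    # Made-up r^2 values among four hypothetical SNPs.
    snps = ["s1", "s2", "s3", "s4"]
    r2 = {("s1", "s2"): 0.9, ("s2", "s3"): 0.85,
          ("s1", "s3"): 0.5, ("s3", "s4"): 0.2}
    print(greedy_pairwise_tags(snps, r2))
    print(greedy_pairwise_tags(snps, r2, force_exclude=["s4"]))
```

In the first call, s2 captures s1 and s3, and s4 has to tag itself. Excluding s4 (for example because it cannot be assayed) leaves it uncaptured, which is the kind of trade-off that force include/exclude options expose.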

28 COMT LD Plot (D’)

29 COMT LD Plot (r2)

30 COMT Tagging SNPs (15 tag 24 at avg r2 = .996)
Columns: Tag SNP | bp | Location | MAF | Other SNPs Tagged
rs 5’ .37 rs rs .21 rs rs737865 Intron #1 .32 rs737866, rs , rs rs174675 .31 rs933271 rs .26 rs .41 rs740601 Intron #2 .45 rs , rs rs4680 Exon #4 .47 rs4633 rs Intron #5 .25 rs174696 .18 rs .11 rs .17 rs165728 rs165824 .06 rs165815 3’ .12 rs

31

32 LD Plot Available from SNPInfo
(http://manticore.niehs.nih.gov/)

33 Conclusions
Alleles at linked loci tend to be inherited together, a phenomenon known as linkage disequilibrium (LD).
Because recombination is not uniform, the genome has a “block-like” structure (haplotype blocks).
You do not need to have the “causal variant” in your genotyped set if it is adequately tagged.
A major strategy for SNP selection is to ensure adequate coverage (r2 > .8) of common genetic variants in a gene, which can be done with Haploview.

