Picking SNPs Application to Association Studies Dana Crawford, PhD SeattleSNPs PGA University of Washington March 20, 2006.

1 Picking SNPs: Application to Association Studies. Dana Crawford, PhD, SeattleSNPs PGA, University of Washington. March 20, 2006

2 Outline of Tutorial: concepts of tagSNPs; LD and haplotype definitions; haplotype blocks and definitions; tools to identify tagSNPs

3 Why Do We Need tagSNPs? Whole genome: 15,000,000 SNPs; 6,000,000 SNPs with > 5% MAF. Too many SNPs to genotype! Ex: E2F2. Average gene: 26.5 kb; 130 SNPs; 44 SNPs with ≥ 5% MAF

4 SNPs Are Correlated (aka linkage disequilibrium): "the nonindependence of alleles at different sites" (Pritchard and Przeworski 2001). The genotype at one site can predict the genotype at another site. A proportion of sites are correlated

5 Measuring Pair-wise SNP Correlations. SNP correlation is described by linkage disequilibrium (LD). Pair-wise measures of LD: D´ and r². $D = p_{AB} - p_A p_B$; $D' = D / D_{\max}$ (related to recombination). $r^2 = D^2 / \left( f(A_1)\, f(A_2)\, f(B_1)\, f(B_2) \right)$ (related to power)
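The pair-wise measures on this slide can be computed directly from haplotype and allele frequencies. The sketch below is illustrative only: the function name `ld_stats` is hypothetical, the D_max formula is the standard textbook definition (the slide does not spell it out), and D´ is returned as an absolute value.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    d = p_ab - p_a * p_b

    # Standard definition of D_max (an assumption here; not stated on the slide):
    # it depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 divides D^2 by the product of all four allele frequencies.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    d, d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.5)
    print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

With these example frequencies D´ is 1 while r² is about 0.43, which illustrates why the two statistics answer different questions: no evidence of recombination, yet only partial predictive correlation.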

6 LD Statistics: Practical Uses. r² is inversely related to power (sample size scales by 1/r²): 1,000 cases and 1,000 controls at r² = 1.0 correspond to 1,250 cases and 1,250 controls at r² = 0.80. D´ is related to recombination history: D´ = 1, no recombination; D´ < 1, historical recombination
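The 1/r² relationship on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `required_sample_size` is hypothetical, and it assumes the only adjustment is dividing the directly-genotyped sample size by r².

```python
import math


def required_sample_size(n_direct, r2):
    """Sample size needed when genotyping a tagSNP with correlation r2 to the
    causal SNP, to match the power of n_direct samples typed at the causal SNP."""
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    # Small tolerance so floating-point noise does not round up by one sample.
    return math.ceil(n_direct / r2 - 1e-9)


if __name__ == "__main__":
    for r2 in (1.0, 0.8):
        print(r2, required_sample_size(1000, r2))
```

For r² = 0.80 this reproduces the slide's figure of 1,250 cases (and likewise 1,250 controls) in place of 1,000.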

7 Where to Find Population LD Statistics. For your gene or region of interest, search: HapMap (www.hapmap.org); Perlegen (genome.perlegen.com); Environmental Genome Project (egp.gs.washington.edu); SeattleSNPs PGA (pga.gs.washington.edu)

9 Visualizing Pair-wise LD

11 Visualizing Pair-wise LD: USF1 [figure]

17 [Figure panels: SeattleSNPs + Perlegen; SeattleSNPs]

18 Visualizing Pair-wise LD: Beyond the Gene

19 Visualizing Pair-wise LD: Beyond the Gene

22 Visualizing Pair-wise LD: Beyond the Gene (SeattleSNPs)

23 Multi-SNP Correlations (aka Haplotypes) “…a unique combination of genetic markers present in a chromosome.” pg 57 in Hartl & Clark, 1997

24 Constructing Haplotypes. Approaches: collect pedigrees; somatic cell hybrids (human-rodent hybrid); allele-specific PCR (SNP 1: C/T, SNP 2: A/G). [Figure: example two-SNP genotypes C/T, A/G; C/C, A/G; T/T, G/G; C/T, A/A; C/C, A/G]

25 Constructing Haplotypes. Examples of haplotype inference software: EM algorithm: Haploview (http://www.broad.mit.edu/mpg/haploview/index.php), Arlequin (http://lgb.unige.ch/arlequin/); PHASE v2.1 (http://www.stat.washington.edu/stephens/software.html); HAPLOTYPER (http://www.people.fas.harvard.edu/~junliu/Haplo/docMain.htm)
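To make the EM idea concrete, here is a minimal two-SNP sketch. It is not the implementation used by any of the packages listed; the function name and the 0/1/2 genotype coding are assumptions chosen for illustration. Only double heterozygotes are ambiguous, so the E-step splits them between the two possible phasings in proportion to the current haplotype frequency estimates.

```python
from collections import Counter


def em_two_snp_haplotypes(genotypes, n_iter=50):
    """Estimate two-SNP haplotype frequencies by EM.

    genotypes: list of (g1, g2), each the count (0, 1, or 2) of allele "1"
    at SNP 1 and SNP 2. Returns {(a, b): frequency} with a, b in {0, 1}.
    """
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}

    fixed = Counter()   # haplotype counts from unambiguous individuals
    n_double_het = 0    # individuals heterozygous at both SNPs
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1
            continue
        a, b = alleles[g1], alleles[g2]
        fixed[(a[0], b[0])] += 1
        fixed[(a[1], b[1])] += 1

    total = 2 * len(genotypes)
    freqs = {h: 0.25 for h in haps}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11 rather than 01/10.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: re-count haplotypes using the expected phasings.
        counts = {h: float(fixed[h]) for h in haps}
        counts[(0, 0)] += w * n_double_het
        counts[(1, 1)] += w * n_double_het
        counts[(0, 1)] += (1 - w) * n_double_het
        counts[(1, 0)] += (1 - w) * n_double_het
        freqs = {h: counts[h] / total for h in haps}
    return freqs


if __name__ == "__main__":
    sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
    print(em_two_snp_haplotypes(sample))
```

In the example sample, the homozygotes only ever show the 0-0 and 1-1 haplotypes, so the EM iterations assign the double heterozygotes to that phasing as well.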

26 Haplotypes in SeattleSNPs: >250 genes re-sequenced in inflammation response; 2 populations (European- and African-descent); PHASEv2.0 results posted on website; interactive tool (VH1) to visualize and sort haplotypes. http://pga.gs.washington.edu

27 Haplotypes in SeattleSNPs

38 Using LD and Haplotypes to Pick tagSNPs. r² is inversely related to power (1/r²): 1,000 cases and 1,000 controls at r² = 1.0; 1,250 cases and 1,250 controls at r² = 0.80. Example: LDSelect in GVS. D´ is related to recombination history: D´ = 1, no recombination; D´ < 1, historical recombination. Example: haplotype "blocks"

39 Using LD and Haplotypes to Pick tagSNPs. r² is inversely related to power (1/r²): 1,000 cases and 1,000 controls at r² = 1.0; 1,250 cases and 1,250 controls at r² = 0.80. Example: LDSelect. Workflow: discovery genotype data -> pair-wise LD -> pick tagSNPs

40 LDSelect: Using LD to Pick tagSNPs. LDSelect uses SNP discovery data (not haplotypes); finds all correlated SNPs to minimize the total number; maintains genetic diversity of locus. Carlson et al. AJHG (2004)
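The idea of grouping correlated SNPs so that fewer need to be genotyped can be sketched as a greedy binning over a pair-wise r² matrix. This is a simplified illustration in the spirit of the approach described on this slide, not the LDSelect program itself; the function name, the 0.8 threshold default, and the input layout are assumptions.

```python
def greedy_tag_bins(r2, threshold=0.8):
    """Greedily group SNPs into bins, each represented by one tagSNP.

    r2: square matrix (list of lists) of pair-wise r^2 values between SNPs.
    Returns a list of (tag_index, sorted_member_indices) tuples.
    """
    remaining = set(range(len(r2)))
    bins = []
    while remaining:
        def covered(i):
            # SNP i covers itself plus every remaining SNP correlated with it.
            return {j for j in remaining if j == i or r2[i][j] >= threshold}

        # Pick the SNP that covers the most still-unbinned SNPs
        # (ties go to the lowest index).
        tag = max(sorted(remaining), key=lambda i: len(covered(i)))
        members = covered(tag)
        bins.append((tag, sorted(members)))
        remaining -= members
    return bins


if __name__ == "__main__":
    example_r2 = [
        [1.00, 0.90, 0.50, 0.05],
        [0.90, 1.00, 0.85, 0.02],
        [0.50, 0.85, 1.00, 0.01],
        [0.05, 0.02, 0.01, 1.00],
    ]
    print(greedy_tag_bins(example_r2))
```

In the example matrix, SNP 1 is correlated above the threshold with both SNP 0 and SNP 2, so it tags all three, while SNP 3 is uncorrelated with the rest and forms its own bin.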

41 TagSNPs Are Population Specific. [Figure panels: CRP in European-Americans; CRP in African-Americans]

42 SNP Selection Using GVS

43 SNPs with >5% MAF: 22; tagSNPs: 7

45 SNP Selection: tagSNP Data

46 Side Note: Categorizing tagSNPs. SNP context: nonrepetitive > repetitive. Location of SNP: coding > noncoding. Function: nonsynonymous > synonymous
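These preferences can be applied as a sort key when several SNPs in the same bin could serve as the tagSNP. The sketch below is hypothetical: the `Snp` record, its field names, and the order of precedence among the three criteria (context first, then location, then function) are assumptions, since the slide lists the criteria without ranking them against each other.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    name: str
    repetitive: bool      # lies in repetitive sequence
    coding: bool          # lies in coding sequence
    nonsynonymous: bool   # changes the encoded amino acid


def rank_tag_candidates(snps):
    """Order candidates so the most preferred SNP comes first.

    False sorts before True, so each key element is phrased as
    "is this the less preferred category?".
    """
    return sorted(
        snps,
        key=lambda s: (s.repetitive, not s.coding, not s.nonsynonymous),
    )


if __name__ == "__main__":
    candidates = [
        Snp("snpA", repetitive=True, coding=False, nonsynonymous=False),
        Snp("snpB", repetitive=False, coding=True, nonsynonymous=False),
        Snp("snpC", repetitive=False, coding=True, nonsynonymous=True),
    ]
    print([s.name for s in rank_tag_candidates(candidates)])
```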

47 Categorizing tagSNPs

48 Haplotypes in Genetic Association Studies. Two main approaches with haplotypes: (1) haplotypes -> pick tagSNPs -> genotype samples; (2) pick tagSNPs -> infer haplotypes -> test for association

49 Haplotypes in Genetic Association Studies. Two main approaches with haplotypes: (1) haplotypes -> pick tagSNPs -> genotype samples; (2) pick tagSNPs -> infer haplotypes -> test for association. Also listed: recombination; natural selection; population history; population demography; haplotype block definition

50 Haplotype "Blocks": strong LD; few haplotypes; represent most chromosomes. Daly et al. Nat. Genet. (2001)

51 Block Definitions: Daly et al. Nat. Genet. (2001); D´ [Gabriel et al. Science (2002)]

52 Block Definitions. Four-gamete test (possible haplotypes for two SNPs: AB, ab, Ab, aB): <4 haplotypes observed, D´ = 1: block; 4 haplotypes observed, D´ < 1: boundary
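The four-gamete test reduces to counting distinct two-SNP haplotypes. A minimal sketch follows; the function name and the input format (one two-character string per chromosome) are assumptions for illustration.

```python
def four_gamete_test(haplotypes):
    """Classify a pair of biallelic SNPs from observed two-SNP haplotypes.

    haplotypes: iterable of two-character strings such as "AB", "ab", "Ab", "aB".
    Returns "boundary" if all four gametes are seen (evidence of historical
    recombination), otherwise "block".
    """
    observed = set(haplotypes)
    if any(len(h) != 2 for h in observed):
        raise ValueError("each haplotype must list exactly two alleles")
    if len({h[0] for h in observed}) > 2 or len({h[1] for h in observed}) > 2:
        raise ValueError("each SNP must have at most two alleles")
    return "boundary" if len(observed) == 4 else "block"


if __name__ == "__main__":
    print(four_gamete_test(["AB", "AB", "ab", "Ab"]))        # three gametes
    print(four_gamete_test(["AB", "ab", "Ab", "aB", "AB"]))  # all four gametes
```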

53 Haplotype Blocks and tagSNPs. Identifying blocks and tagSNPs: manually; algorithms (Haploview)

54 Haplotype Blocks and tagSNPs. IL1B: 19 SNPs (MAF >5%); 4 "common" haplotypes; tagSNPs

55 Haplotype Blocks and tagSNPs. Identifying blocks and tagSNPs: manually; algorithms (Haploview)

56 HapMap Data and Haploview

64 Import HapMap Data into Haploview

69 May not be minimal set

70 Minimal set of tagSNPs based on r²

71 Note: HapMap is not complete variation data

72 Variation data, LD, and tagSNPs for ABCE1 in European-Americans. HapMap: 7 SNPs. SeattleSNPs: 35 SNPs. 4 tagSNPs

73 Where to Find Tagging Software: HaploBlockFinder (http://cgi.uc.edu/cgi-bin/kzhang/haploBlockFinder.cgi); LDSelect (http://pga.gs.washington.edu); SNPtagger (http://www.well.ox.ac.uk/~xiayi/haplotype/index.html); TagIT (http://popgen.biol.ucl.ac.uk/software.html); tagSNPs (http://www-rcf.usc.edu/~stram/tagSNPs.html); Haploview (http://www.broad.mit.edu/personal/jcbarret/haplo/)

74 Haplotypes, TagSNPs, and Caveats: haplotypes are inferred; block-like structure is assumed by some software; block definitions differ; block boundaries are sensitive to marker density; genotype savings may not be great (recombination)

75 Common Errors in Association Studies, Bell and Cardon (2001): small sample size; subgroup analysis and multiple testing; random error; poorly matched control group; failure to attempt study replication; failure to detect LD with adjacent loci; overinterpreting results and positive publication bias; unwarranted 'candidate gene' declaration after identifying association in arbitrary genetic region. E.g., second case/control study; gene expression studies

76 Summary: Picking SNPs, Application to Association Studies. Resources are available for pair-wise LD and haplotypes; software for tagSNP selection is available; be aware of the limitations of the approach you choose; replication is required by several journals

77 SeattleSNPs Genotyping Service: free genotyping (BeadArray); emphasis on young investigators; research related to heart, lung, blood, or sleep disorders; moderate to large population samples. Apply at pga.gs.washington.edu. Due date: TBA

