Genetics for Epidemiologists Lecture 5: Analysis of Genetic Association Studies National Human Genome Research Institute National Institutes of Health.


1 Genetics for Epidemiologists Lecture 5: Analysis of Genetic Association Studies
U.S. Department of Health and Human Services, National Institutes of Health, National Human Genome Research Institute
Teri A. Manolio, M.D., Ph.D., Director, Office of Population Genomics and Senior Advisor to the Director, NHGRI, for Population Genomics

2 Topics to be Covered
- Discrete traits and quantitative traits
- Measures of association
- Detecting/correcting for false positives
- Genotyping quality control
- Quantile-quantile (Q-Q) plots
- Odds ratios: allelic and genotypic
- Models of genetic transmission
- Interactions: gene-gene, gene-environment

3 Larson, G. The Complete Far Side

4-7 Quantitative Genetics
“…concerned with the inheritance of those differences between individuals that are of degree rather than of kind…”
Quantitative | Qualitative
Continuous gradation among individuals from one extreme to other | Sharply demarcated types with little connection by intermediates
Effects of genes are small | Effects of genes are large
Usually many genes | Single genes inherited in Mendelian ratios?
Falconer and Mackay, Quantitative Genetics 1996.

8-13 Inheritance Models in Single Gene Trait
Alleles: A, a
Genotype groups: AA, Aa, aa
Models:
- A is Dominant
- A is Recessive
- A is Co-Dominant
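The genotype groupings on these slides did not survive in the transcript, so the following is only a minimal sketch of the conventional reading of the three models. The function name `genotype_group` and the group labels are hypothetical, and the groupings are the textbook convention rather than something recovered from the slides.

```python
# Hypothetical helper: collapse a genotype into the comparison group implied
# by each single-gene inheritance model for allele A (textbook convention,
# assumed here because the slide's table cells are missing).
def genotype_group(genotype: str, model: str) -> str:
    n_a = genotype.count("A")  # copies of allele A: 0, 1, or 2
    if model == "dominant":    # one copy of A is enough: AA and Aa vs aa
        return "AA/Aa" if n_a >= 1 else "aa"
    if model == "recessive":   # two copies of A are needed: AA vs Aa and aa
        return "AA" if n_a == 2 else "Aa/aa"
    if model == "codominant":  # every genotype forms its own group
        return {2: "AA", 1: "Aa", 0: "aa"}[n_a]
    raise ValueError(f"unknown model: {model}")


for model in ("dominant", "recessive", "codominant"):
    print(model, [genotype_group(g, model) for g in ("AA", "Aa", "aa")])
```

Under the dominant and recessive models the heterozygote is merged with one of the homozygotes, which is why those models lead to two-group comparisons while the co-dominant model keeps three groups.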

14-19 Inheritance Models in Quantitative Trait
A: x increase in height; a: x decrease in height
Population mean scale: -x, 0, +x
Model | Genotypes, in the order shown along the scale
A is Completely Dominant | aa, AA, Aa
A is Partially Dominant | aa, Aa, AA
A is Not (Co-) Dominant | aa, Aa, AA
A is Over-Dominant | aa, AA, Aa
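One way to make the four quantitative-trait models concrete is to place the heterozygote at a value d on the same scale as the homozygotes (aa at -x, AA at +x). This is a sketch under that assumption: the symbol `d` and the function names are not from the slides, and the exact positions drawn on the original figures are not recoverable from the transcript.

```python
# Assumed parameterisation: aa = -x, AA = +x, Aa = d (d is a hypothetical
# symbol for the heterozygote's value; it does not appear on the slides).
def genotypic_values(x: float, d: float) -> dict:
    return {"aa": -x, "Aa": d, "AA": x}


def classify_dominance(x: float, d: float) -> str:
    if x <= 0:
        raise ValueError("x must be positive")
    if d == 0:
        return "A is not (co-) dominant"   # heterozygote exactly midway
    if d == x:
        return "A is completely dominant"  # heterozygote equals AA
    if 0 < d < x:
        return "A is partially dominant"   # heterozygote between midpoint and AA
    if d > x:
        return "A is over-dominant"        # heterozygote beyond AA
    return "outside the four models on the slide"


for d in (2.0, 1.0, 0.0, 3.0):
    print(genotypic_values(2.0, d), classify_dominance(2.0, d))
```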

20 Quantitative Traits with Published GWA Studies ( )
- QT interval
- Lipids and lipoproteins
- Memory
- Nicotine dependence
- ORMDL3 expression
- YKL-40 levels
- Obesity, BMI, waist
- Insulin resistance
- Height
- Bone mineral density
- F-cell distribution
- Fetal hemoglobin levels
- C-Reactive protein
- 18 groups of Framingham traits
- Pigmentation
- Uric Acid Levels
- Recombination Rate

21-22 Association of Alleles and Genotypes of rs (‘3049) with Myocardial Infarction
Alleles (χ² with 1 df; the χ² and P-value entries are missing from the transcript):
Group | C, N (%) | G, N (%)
Cases | 2,132 (55.4) | 1,716 (44.6)
Controls | 2,783 (47.4) | 3,089 (52.6)
Allelic Odds Ratio = 1.38
Genotypes (χ² with 2 df; the χ² and P-value entries are missing from the transcript):
Group | CC, N (%) | CG, N (%) | GG, N (%)
Cases | 586 (30.5) | 960 (49.9) | 378 (19.6)
Controls | 676 (23.0) | 1,431 (48.7) | 829 (28.2)
Heterozygote Odds Ratio = 1.47
Homozygote Odds Ratio = 1.90
Samani N et al, N Engl J Med 2007; 357:
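The odds ratios quoted on this slide can be reproduced from the counts shown. The sketch below uses only those counts; the function names are illustrative, and the Pearson chi-squared helper is a generic textbook implementation, not necessarily the exact test used by Samani et al. GG is taken as the reference genotype, which is an assumption that happens to reproduce the slide's values.

```python
import math


def odds_ratio(case_exposed, case_ref, ctrl_exposed, ctrl_ref):
    """Cross-product odds ratio for a 2x2 table."""
    return (case_exposed * ctrl_ref) / (case_ref * ctrl_exposed)


def pearson_chi2(table):
    """Pearson chi-squared statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def chi2_p_value(chi2, df):
    """Upper-tail P value; closed forms exist for 1 and 2 df only."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("only df = 1 or 2 supported in this sketch")


# Counts copied from the slide (cases first, then controls).
allelic_or = odds_ratio(2132, 1716, 2783, 3089)  # C vs G
het_or = odds_ratio(960, 378, 1431, 829)         # CG vs GG
hom_or = odds_ratio(586, 378, 676, 829)          # CC vs GG
print(f"{allelic_or:.2f} {het_or:.2f} {hom_or:.2f}")  # 1.38 1.47 1.90

allelic_chi2 = pearson_chi2([[2132, 1716], [2783, 3089]])
genotypic_chi2 = pearson_chi2([[586, 960, 378], [676, 1431, 829]])
print(allelic_chi2, chi2_p_value(allelic_chi2, 1))
print(genotypic_chi2, chi2_p_value(genotypic_chi2, 2))
```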

23 -Log10 P Values for SNP Associations with Myocardial Infarction Samani N et al, N Engl J Med 2007; 357:

24 Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort

25 GWA Study of Serum Uric Acid Levels
- Linear regression of inverse normalized levels against number of alleles
- Additive model
- Sex, age, age² as covariates
Li S et al, PLoS Genet 2007; 3:e194.

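26 Association of rs and Uric Acid Levels
Li S et al, PLoS Genet 2007; 3:e194.
Genotype Means (mg/dl); the additive effect and AA mean entries are missing from the transcript:
Cohort | Additive Effect | AA | AG | GG
SardiNIA | | (1.51) | 4.48 (1.59) | 4.02 (1.63)
InCHIANTI | | (1.44) | 4.94 (1.31) | 4.33 (1.37)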

27 Association Methods for Quantitative Traits
- Linear regression of multivariable adjusted residual against number of alleles (Kathiresan, Nat Genet 2008; 40:189-97)
- Linear regression of log transformed or centralized BMI against genotype (Frayling, Science 2007; 316:889-94)
- Variance components based Z-score analysis of quantile normalized height (Sanna, Nat Genet 2008; 40: )
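The regression approaches listed above share a common core: transform the trait, then regress it on the number of copies of an allele (additive model). The sketch below illustrates that core with a rank-based inverse normal transform and a one-predictor least-squares slope. The data are made up, the (rank - 0.5)/n offset is one common choice rather than the one used in the cited studies, and covariates such as sex and age are deliberately left out.

```python
from statistics import NormalDist, mean


def inverse_normal_transform(values):
    """Rank-based inverse normal transform (ties are not handled specially)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    standard_normal = NormalDist()
    transformed = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        transformed[idx] = standard_normal.inv_cdf((rank - 0.5) / n)
    return transformed


def additive_slope(allele_counts, trait):
    """Least-squares slope of trait on allele count (0, 1, or 2)."""
    mx, my = mean(allele_counts), mean(trait)
    sxy = sum((x - mx) * (y - my) for x, y in zip(allele_counts, trait))
    sxx = sum((x - mx) ** 2 for x in allele_counts)
    return sxy / sxx


# Hypothetical data: allele counts and a trait measured in mg/dl.
allele_counts = [0, 0, 1, 1, 1, 2, 2, 2]
trait = [5.6, 5.1, 4.9, 4.6, 4.4, 4.2, 3.9, 3.7]
slope = additive_slope(allele_counts, inverse_normal_transform(trait))
print(f"per-allele effect on the transformed scale: {slope:.3f}")
```

A full analysis would add the covariates to the model and report a standard error for the slope; a regression library would normally be used for that.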

28 Ways of Dealing with Multiple Testing
- Control family-wise error rate (FWER): Bonferroni (α’ = α/n) or Šidák (α’ = 1 - [1 - α]^(1/n))
- False discovery rate: proportion of significant associations that are actually false positives
- False positive report probability: probability that the null hypothesis is true, given a statistically significant finding
- Bayes factors analysis: avoids need for assessing genome-wide error rates but must identify reasonable alternative model
Hoggart CJ et al, Genet Epidemiol 2008; 32:
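The two family-wise corrections named above are simple enough to compute directly. This is a minimal sketch; the test count of 500,000 is an illustrative assumption, not a figure from the lecture.

```python
import math


def bonferroni(alpha: float, n: int) -> float:
    """Per-test threshold alpha' = alpha / n."""
    return alpha / n


def sidak(alpha: float, n: int) -> float:
    """Per-test threshold alpha' = 1 - (1 - alpha)**(1/n).

    Written with log1p/expm1 so the result stays accurate when n is large.
    """
    return -math.expm1(math.log1p(-alpha) / n)


n_tests = 500_000  # hypothetical number of SNPs tested
print(bonferroni(0.05, n_tests))
print(sidak(0.05, n_tests))
```

The Šidák threshold is always slightly larger (less strict) than the Bonferroni threshold, and the two are nearly identical at genome-wide numbers of tests.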

29 Larson, G. The Complete Far Side

30 Quality Control of SNP Genotyping: Samples
- Identity with forensic markers (Identifiler)
- Blind duplicates
- Gender checks
- Cryptic relatedness or unsuspected twinning
- Degradation/fragmentation
- Call rate (> 80-90%)
- Heterozygosity: outliers
- Plate/batch calling effects
Chanock et al, Nature 2007; Manolio et al Nat Genet 2007

31 Quality Control of SNP Genotyping: SNPs
- Duplicate concordance (CEPH samples)
- Mendelian errors (typically < 1)
- Hardy-Weinberg errors (often > )
- Heterozygosity (outliers)
- Call rate (typically > 98%)
- Minor allele frequency (often > 1%)
- Validation of most critical results on independent genotyping platform
Chanock et al, Nature 2007; Manolio et al Nat Genet 2007

32 Hardy-Weinberg Equilibrium
Occurrence of two alleles of a SNP in the same individual are two independent events
Ideal conditions:
- random mating
- no selection (equal survival)
- no migration
- no mutation
- no inbreeding
- large population sizes
- gene frequencies equal in males and females…
If alleles A and a of SNP rs1234 have frequencies p and 1-p, expected frequencies of the three genotypes are:
Freq AA = p²
Freq Aa = 2p(1-p)
Freq aa = (1-p)²
After G. Thomas, NCI
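The expected frequencies above translate directly into a goodness-of-fit check for a single SNP. The sketch below is a generic 1-df chi-squared version under the assumption that both alleles are observed; the function name and the genotype counts are hypothetical, and exact tests are often preferred for rare alleles.

```python
import math


def hwe_chi2(n_AA: int, n_Aa: int, n_aa: int):
    """Return (expected counts, chi-squared, P value) for a Hardy-Weinberg check."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele a
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    # Assumes both alleles are present, so no expected count is zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 degree of freedom
    return expected, chi2, p_value


# Hypothetical genotype counts for one SNP.
expected, chi2, p_value = hwe_chi2(280, 500, 220)
print(expected, chi2, p_value)
```

In quality control this P value is compared with a study-specific threshold, and SNPs falling below it are flagged for inspection.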

33 Coverage, Call Rates, and Concordance of Perlegen and Affymetrix Platforms on HapMap Phase II
Metric | Perlegen | Affymetrix/Broad
Number of SNPs | 480,744 | 439,249
Coverage (Single Marker and Multi-Marker for each platform; values for CEU, CHB + JPT, and YRI are missing from the transcript)
Average call rate | 98.9% | 99.3%
Concordance, homozygous genotypes | 99.8% | 99.9%
Concordance, heterozygous genotypes | 99.8%
GAIN Collaborative Group, Nat Genet 2007; 39:

34-35 Sample and SNP QC Metrics for Affymetrix 5.0 and 6.0 Platforms in GAIN
Columns: Metric | 5.0 | % fail | 6.0 | % fail (most numeric entries are truncated in the transcript and are shown as they survive)
Total Samples | 1,829 | -- | 2,289 | --
Passing QC | 1, ,
> 98% call rate | 1, ,
Total SNPs | 457, ,660 | --
Passing QC | 429, ,
MAF > 1% | 457, ,
> 98% call rate | 419, ,
> 95% call rate | 439, ,
HWE < | , ,
< 1 Mendel error | 417, ,
< 1 Duplicate error | 454, ,
Courtesy, J Paschall, NCBI

36 Sample Heterozygosity in GAIN Courtesy, J Paschall, NCBI

37 Sample Heterozygosity in GAIN Courtesy, J Paschall, NCBI

38 Signal Intensity Plots for rs in AREDS

39 Signal Intensity Plots for rs in AREDS

40 Signal Intensity Plots for rs in AREDS

41 Signal Intensity Plots for rs in AREDS

42 Signal Intensity Plots for CD44 SNP rs Clayton DG et al, Nat Genet 2005; 37:

43 Principal Component Analysis of Structured Population: First to Third Components. Courtesy, G. Thomas, NCI

44 Principal Component Analysis of Structured Population: Fourth and Fifth Components. Courtesy, G. Thomas, NCI

45 Influence of Relatedness on Principal Component Analysis. Courtesy, G. Thomas, NCI

46 Principal Component Analysis of Structured Population: Fourth and Fifth Components. Courtesy, G. Thomas, NCI

47 Principal Component Analysis of Structured Population: Fourth and Fifth Components. Courtesy, G. Thomas, NCI

48 Summary Points: Genotyping Quality Control
- Sample checks for identity, gender error, cryptic relatedness
- Sample handling differences can introduce artifacts but probably can be adjusted for
- Association analysis is often quickest way to find genotyping errors
- Low MAF SNPs are most difficult to call
- Inspection of genotyping cluster plots is crucial!

49 Quantile-Quantile Plot for Test Statistics, 390 Breast Cancer Cases, 364 Controls
205,586 SNPs; λ = 1.03
Easton D et al, Nature 2007; 447:
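The slide reports λ without defining it. A common definition of the genomic inflation factor is the median observed 1-df chi-squared statistic divided by the median expected under the null (about 0.455). Whether Easton et al. computed λ exactly this way is an assumption here. The sketch also generates the expected quantiles that form the x-axis of a Q-Q plot; the function names are illustrative.

```python
from statistics import NormalDist, median

# Median of a chi-squared distribution with 1 df, obtained from the normal
# quantile: if Z is standard normal, Z**2 is chi-squared with 1 df.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # about 0.455


def genomic_inflation(chi2_stats):
    """Median observed statistic over the null median (assumed definition of lambda)."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def expected_chi2_quantiles(n: int):
    """Expected 1-df chi-squared quantiles for n tests, in ascending order."""
    standard_normal = NormalDist()
    return [
        standard_normal.inv_cdf(0.5 + 0.5 * (i - 0.5) / n) ** 2
        for i in range(1, n + 1)
    ]


# Illustration: statistics uniformly inflated by 3% give lambda close to 1.03.
null_stats = expected_chi2_quantiles(1001)
inflated = [1.03 * s for s in null_stats]
print(genomic_inflation(null_stats), genomic_inflation(inflated))
```

To draw the Q-Q plot, the sorted observed statistics are plotted against `expected_chi2_quantiles(len(observed))`; points along the diagonal indicate agreement with the null.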

50 Observed and Expected Associations after Stage 2 of Breast Cancer GWA
Significance | Observed | Observed Adjusted | Expected | Ratio
(rows for the individual significance levels are not recoverable from the transcript)
All p < 0.05 | 1,956 | 1,792 | 1,
Easton D et al, Nature 2007; 447:

51 Q-Q Plot for Multiple Sclerosis; Effect of MHC Hafler D et al, N Engl J Med 2007; 357:

52 Q-Q Plot for Prostate Cancer, all SNPs Gudmundsson J et al, Nat Genet 2007; 39:

53 Q-Q Plot for Prostate Cancer, excluding Chromosome 8 Gudmundsson J et al, Nat Genet 2007; 39:

54 Q-Q Plot for Myocardial Infarction (axes: expected chi-squared statistic, observed chi-squared statistic) Samani N et al, N Engl J Med 2007; 357:

55 -Log10 P Values for SNP Associations with Myocardial Infarction Samani N et al, N Engl J Med 2007; 357:

56 -Log10 P Values for SNP Associations with Myocardial Infarction Samani N et al, N Engl J Med 2007; 357:

57 SNP Associations with 1,928 MI Cases and 2,938 Controls from UK Samani N et al, N Engl J Med 2007; 357:

58 Association Signal for Coronary Artery Disease on Chromosome 9 ’3049 Samani N et al, N Engl J Med 2007; 357:

59 Winner’s Curse: Odds Ratios for CHD Associated with LTA Genotypes in Multiple Studies Clarke et al, PLoS Genet 2006; 2:e107.

60 Genome-Wide Scan for Alzheimer’s Disease in 861 Cases and 550 Controls Reiman E et al, Neuron 2007; 54:

61 Genome-Wide Scan for Alzheimer’s Disease in ApoE*e4 Carriers Reiman E et al, Neuron 2007; 54:

62 LOAD Odds Ratios Associated with rs GG by APOE*e4 Status
APOE*e4 Group | APOE*e4 OR [95% CI] | rs OR [95% CI]
APOE*e | | [0.82,1.53]
APOE*e | | [1.90,4.36]
All | 6.07 [ ] | 1.34 [1.06,1.70]
Reiman et al, Neuron 2007; 54:

63 P Values of GWA Scan for Age-Related Macular Degeneration Klein et al, Science 2005; 308:

64 Odds Ratios and Population Attributable Risks for AMD
Attribute (SNP) | rs (C/G) | rs (C/T)
Risk allele | C | C
Allelic association χ² P value | 4.1 x 10^-8 | 1.4 x 10^-6
Odds ratio (dominant) | 4.6 [2.0-11] | 4.7 [1.0-22]
Frequency in HapMap CEU | |
Population Attributable Risk | 70% [42-84%] | 80% [0-96%]
Odds ratio (recessive) | 7.4 [2.9-19] | 6.2 [2.9-13]
Frequency in HapMap CEU | |
Population Attributable Risk | 46% [31-57%] | 61% [43-73%]
Klein et al, Science 2005; 308:

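65 Risk of Developing AMD by CFH Y402H and Modifiable Risk Factors
Columns: Risk Factor | CFH Y402H Genotype: YY | YH | HH (most estimates and all intervals are missing from the transcript; surviving values are shown in their original order)
BMI < 30 kg/m² | [ ] 3.96 [ ]
BMI > 30 kg/m² | [ ] 2.19 [ ] [ ]
Non-smoker | [ ] 4.23 [ ]
Current smoker | 2.34 [ ] | 3.20 [ ] | 8.69 [ ]
Schaumberg DA et al, Arch Ophthalmol 2007; 125: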

66 Interaction: Is LIPC Genotype Related to HDL-C? (plot series: TT, CC, CT) Ordovas et al, Circulation 2002; 106:

67 Inverse Relation between Endotoxin Exposure and Allergic Sensitization by CD14 Genotype Simpson A et al, Am J Respir Crit Care Med 2006;174:

68 Challenges in Studying Gene-Environment Interactions
Challenge | Genes | Environment
Ease of measure | Pretty easy | Often hard
Variability over time | Low/none | High
Recall bias | None | Possible
Temporal relation to disease | Easy | Hard

69 Larson, G. The Complete Far Side


