Association analysis Shaun Purcell Boulder Twin Workshop 2004.


1 Association analysis Shaun Purcell Boulder Twin Workshop 2004

2 Overview
–Candidate gene association
–Haplotypes and linkage disequilibrium
–Linkage and association
–Family-based association

3 What is association?
Categorical traits
–disease susceptibility genes
Continuous traits
–quantitative trait loci, QTL

4 Disease traits
      Case   Control
AA    n1     n2
Aa    n3     n4
aa    n5     n6
Is there a difference in allele/genotype frequency between cases and controls?

5 Disease traits
      Case   Control   Hardy-Weinberg frequency
AA    30     25        p²
Aa    50     50        2p(1-p)
aa    20     25        (1-p)²
Is there a difference in allele/genotype frequency between cases and controls?
Test for independence, p-value
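The test for independence on this slide can be sketched as a Pearson chi-square on the 3 x 2 genotype table. This is a minimal illustration, not the workshop's own software; the function names are hypothetical, and the p-value uses the closed form that holds only for 2 degrees of freedom.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and df for a rows x columns count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_2df(stat):
    """Upper-tail chi-square probability; exact only for 2 df."""
    return math.exp(-stat / 2.0)

# Rows are AA, Aa, aa; columns are cases, controls (counts from the slide).
counts = [[30, 25], [50, 50], [20, 25]]
stat, df = chi_square_independence(counts)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value_2df(stat):.3f}")
```

With these counts the statistic is small, so the genotype frequencies in this toy table do not differ significantly between cases and controls.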

6 Disease traits
General model (2 df)
      Case        Control
AA    n1          n2
Aa    n3          n4
aa    n5          n6
Additive model (1 df)
      Case        Control
A     2n1 + n3    2n2 + n4
a     2n5 + n3    2n6 + n4
Dominant model for A (1 df)
      Case        Control
A*    n1 + n3     n2 + n4
aa    n5          n6
Effect sizes calculated as odds ratios
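A short sketch of how the general genotype table collapses into the additive (allele count) and dominant tables, with an odds ratio for each 2 x 2 result. The function names are hypothetical, and the counts reuse the illustrative 30/50/20 versus 25/50/25 table from the previous slide.

```python
def allele_table(genotype_table):
    """Collapse AA/Aa/aa rows into allele counts for A and a."""
    (n1, n2), (n3, n4), (n5, n6) = genotype_table
    return [[2 * n1 + n3, 2 * n2 + n4],
            [2 * n5 + n3, 2 * n6 + n4]]

def dominant_table(genotype_table):
    """Collapse AA and Aa into one row (A*), keeping aa separate."""
    (n1, n2), (n3, n4), (n5, n6) = genotype_table
    return [[n1 + n3, n2 + n4],
            [n5, n6]]

def odds_ratio(table_2x2):
    """Odds ratio for the first row versus the second row."""
    (a, b), (c, d) = table_2x2
    return (a * d) / (b * c)

# Rows are AA, Aa, aa; columns are cases, controls.
genotypes = [[30, 25], [50, 50], [20, 25]]
print(allele_table(genotypes), odds_ratio(allele_table(genotypes)))
print(dominant_table(genotypes), odds_ratio(dominant_table(genotypes)))
```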

7 Relative risk
      D+   D-
E+    a    b
E-    c    d
Risk in E+ = a / (a + b)
Risk in E- = c / (c + d)
Relative risk of exposure = (a / (a + b)) / (c / (c + d))

8 Odds ratio
      D+   D-
E+    a    b
E-    c    d
Odds in D+ = a / c
Odds in D- = b / d
Odds ratio = (a / c) / (b / d) = ad / bc
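The relative risk and odds ratio formulas from these two slides, written out for a 2 x 2 table. The counts below are made up purely for illustration; the second call shows the familiar property that the odds ratio approaches the relative risk when the disease is rare.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed (a, b) divided by risk in the unexposed (c, d)."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Odds of exposure in D+ (a/c) divided by odds of exposure in D- (b/d)."""
    return (a / c) / (b / d)

# Hypothetical counts: a common outcome, then a rare outcome.
for a, b, c, d in [(20, 80, 10, 90), (2, 998, 1, 999)]:
    print(a, b, c, d,
          f"RR = {relative_risk(a, b, c, d):.3f}",
          f"OR = {odds_ratio(a, b, c, d):.3f}")
```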

9 Quantitative traits
[Figure labels: AA, Aa, aa]
ID     Y      G    A    D
001    0.34   aa   -1   0
002    1.23   Aa   0    1
003    1.66   Aa   0    1
004    2.74   AA   1    0
005    1.33   AA   1    0
…
Y = aA + dD + e
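A minimal sketch of the A and D coding and of the additive (a) and dominance (d) effects for the five records shown. It assumes a model with an intercept m (the midpoint between the two homozygote means), which the slide's equation leaves implicit; with three genotype groups this fit simply reproduces the genotype means.

```python
from statistics import mean

# Additive (A) and dominance (D) codes per genotype, as in the slide's table.
CODING = {"AA": (1, 0), "Aa": (0, 1), "aa": (-1, 0)}

records = [("001", 0.34, "aa"), ("002", 1.23, "Aa"), ("003", 1.66, "Aa"),
           ("004", 2.74, "AA"), ("005", 1.33, "AA")]

def estimate_a_d(rows):
    """Return (m, a, d) from genotype means: Y = m + a*A + d*D + e."""
    by_genotype = {"AA": [], "Aa": [], "aa": []}
    for _id, y, genotype in rows:
        by_genotype[genotype].append(y)
    m_AA, m_Aa, m_aa = (mean(by_genotype[g]) for g in ("AA", "Aa", "aa"))
    midpoint = (m_AA + m_aa) / 2   # assumed intercept
    a = (m_AA - m_aa) / 2          # half the difference between homozygotes
    d = m_Aa - midpoint            # heterozygote deviation from the midpoint
    return midpoint, a, d

m, a, d = estimate_a_d(records)
for _id, y, genotype in records:
    A, D = CODING[genotype]
    print(_id, genotype, A, D, f"predicted = {m + a * A + d * D:.4f}")
```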

10 Some web resources
BGIM http://statgen.iop.kcl.ac.uk/bgim/
–Introductory tutorials on twin analysis, primer on maximum likelihood, Mx language.
GxE moderator models http://statgen.iop.kcl.ac.uk/gxe/
Power calculation http://statgen.iop.kcl.ac.uk/gpc/
Case/control association tools http://statgen.iop.kcl.ac.uk/gpc/model/

12 Relative risk
Genotype   P(D|G)     RR
AA         P(D|AA)    P(D|AA) / P(D|aa)
Aa         P(D|Aa)    P(D|Aa) / P(D|aa)
aa         P(D|aa)    1
P(D|AA) / P(D|aa) labelled RR(AA)
P(D|Aa) / P(D|aa) labelled RR(Aa)

13 Genetic models
Model            RR(Aa)   RR(AA)
General          x        y
Multiplicative   x        x²
Dominant         x        x
Recessive        1.000    x
No effect        1.000    1.000
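The constraints in this table can be sketched as a small function that maps a model name to the pair (RR(Aa), RR(AA)). The second function is an added illustration: it assumes Hardy-Weinberg genotype frequencies in the population with p the frequency of allele A, and weights them by relative risk to get expected genotype frequencies among cases. All names here are hypothetical.

```python
def relative_risks(model, x=1.0, y=1.0):
    """Return (RR(Aa), RR(AA)) under the named genetic model."""
    if model == "general":
        return x, y
    if model == "multiplicative":
        return x, x * x
    if model == "dominant":
        return x, x
    if model == "recessive":
        return 1.0, x
    if model == "none":
        return 1.0, 1.0
    raise ValueError(f"unknown model: {model}")

def case_genotype_freqs(p, rr_Aa, rr_AA):
    """Genotype frequencies in cases: P(G|D) is proportional to P(G) * RR(G)."""
    weights = {"AA": p * p * rr_AA,
               "Aa": 2 * p * (1 - p) * rr_Aa,
               "aa": (1 - p) ** 2}
    total = sum(weights.values())
    return {genotype: w / total for genotype, w in weights.items()}

rr_Aa, rr_AA = relative_risks("multiplicative", x=2.0)
print(case_genotype_freqs(0.4, rr_Aa, rr_AA))
```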

14 Tests
Test                                           Alternate        Null
Any effect?                                    General          No effect
Any effect assuming a multiplicative gene?     Multiplicative   No effect
Any effect assuming a dominant gene?           Dominant         No effect
Any effect assuming a recessive gene?          Recessive        No effect
Can we assume a multiplicative effect?         General          Multiplicative
Can we assume a dominant effect?               General          Dominant
Can we assume a recessive effect?              General          Recessive

15 Multiple samples
Constrain frequencies across samples
Constrain effects across samples
–Can test genetic models with effects and/or frequencies constrained to be equal
–Can perform tests of homogeneity of effects and/or frequencies across samples

16 An example
2 case/control samples
Population frequency 5%
Sample 1
      Case   Control
AA    17     11
Aa    35     59
aa    24     40
Sample 2
      Case   Control
AA    37     10
Aa    67     43
aa    20     37

18 Homogeneous effects across samples
Homogeneous allele frequencies across samples
Model   Sample   p       RR(Aa)   RR(AA)   -2LL
Gen     1        0.367   1.979    3.663    793.143
        2        0.367   1.979    3.663
Mult    1        0.367   1.911    3.651    793.199
        2        0.367   1.911    3.651
Dom     1        0.401   1.990    1.990    802.927
        2        0.401   1.990    1.990
Rec     1        0.405   1.000    1.921    805.064
        2        0.405   1.000    1.921
None    1        0.442   1.000    1.000    815.628
        2        0.442   1.000    1.000

19 Heterogeneous effects across samples
Homogeneous allele frequencies across samples
Model   Sample   p       RR(Aa)   RR(AA)   -2LL
Gen     1        0.367   1.235    2.136    786.498
        2        0.367   2.890    5.547
Mult    1        0.367   1.440    2.073    788.262
        2        0.367   2.282    5.208
Dom     1        0.401   1.216    1.216    796.422
        2        0.401   2.936    2.936
Rec     1        0.405   1.000    1.519    803.849
        2        0.405   1.000    2.195
None    1        0.443   1.000    1.000    815.628
        2        0.443   1.000    1.000

20 TESTS OF GENETIC MODELS -- ASSUMING EQ EFFECTS & EQ FREQS
=========================================================
Gen vs None (2 df)  : 22.485   p = 0.000
Mult vs None (1 df) : 22.429   p = 0.000
Dom vs None (1 df)  : 12.701   p = 0.000
Rec vs None (1 df)  : 10.564   p = 0.001
Gen vs Mult (1 df)  : 0.056    p = 0.813
Gen vs Dom (1 df)   : 9.784    p = 0.002
Gen vs Rec (1 df)   : 11.921   p = 0.001
TESTS OF GENETIC MODELS -- ASSUMING UNEQ EFFECTS & EQ FREQS
===========================================================
Gen vs None (4 df)  : 29.130   p = 0.000
Mult vs None (2 df) : 27.366   p = 0.000
Dom vs None (2 df)  : 19.205   p = 0.000
Rec vs None (2 df)  : 11.779   p = 0.003
Gen vs Mult (2 df)  : 1.764    p = 0.414
Gen vs Dom (2 df)   : 9.925    p = 0.007
Gen vs Rec (2 df)   : 17.351   p = 0.000
TESTS OF EQUAL EFFECTS -- ASSUMING EQ FREQS
===========================================
w/ Gen model (2 df)  : 6.645   p = 0.036
w/ Mult model (1 df) : 4.938   p = 0.026
w/ Dom model (1 df)  : 6.505   p = 0.011
w/ Rec model (1 df)  : 1.215   p = 0.270
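Each statistic in this output is the difference in -2LL between a null and an alternate model, referred to a chi-square distribution. The sketch below recomputes a few of them from the -2LL values on the two preceding slides. It is not the original program; the helper `chi2_sf` is a hypothetical stdlib-only function that covers only 1 df and even df, which is all these tests need.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for df = 1 or any even df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("only df = 1 or even df are handled in this sketch")

# -2LL values copied from the slides (equal and unequal effects, equal freqs).
EQUAL = {"Gen": 793.143, "Mult": 793.199, "Dom": 802.927,
         "Rec": 805.064, "None": 815.628}
UNEQUAL = {"Gen": 786.498, "Mult": 788.262, "Dom": 796.422,
           "Rec": 803.849, "None": 815.628}

def lrt(null_minus2ll, alt_minus2ll, df):
    """Likelihood ratio statistic and p-value for nested models."""
    stat = null_minus2ll - alt_minus2ll
    return stat, chi2_sf(stat, df)

print("Gen vs None :", lrt(EQUAL["None"], EQUAL["Gen"], 2))
print("Gen vs Mult :", lrt(EQUAL["Mult"], EQUAL["Gen"], 1))
print("Equal effects, Gen :", lrt(EQUAL["Gen"], UNEQUAL["Gen"], 2))
```

The same function applies to any pair of nested models, provided the df equals the number of parameters constrained under the null.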

21 Indirect association
[Figure labels: QTL, genotyped markers, ungenotyped markers]

22 Recombination
[Figure labels: paternal chromosome, maternal chromosome]
Homologous chromosomes in one parent
Recombination event during meiosis
Recombinant gamete transmitted, harboring mutation

23 Recombination
[Figure labels: paternal chromosome, maternal chromosome]
Homologous chromosomes in one parent
No recombination event during meiosis
Nonrecombinant gamete transmitted, not harboring mutation

24 Linkage: affected sib pairs
[Figure labels: paternal chromosome, maternal chromosome]
First affected offspring, no recombination
Second affected offspring, recombinant gamete
IBD sharing from this one parent (0 or 1): 1, 0

25 Association analysis Mutation occurs on a ‘red’ chromosome

26 Association analysis Mutation occurs on a ‘red’ chromosome

27 Association analysis Association due to ‘linkage disequilibrium’

28 Haplotypes
      A     a
M     AM    aM
m     Am    am
This individual has aa and Mm genotypes and am and aM haplotypes

29 Haplotypes
      A     a
M     AM    aM
m     Am    am
This individual has Aa and Mm genotypes and AM and am haplotypes
… but given only genotype data, consistent with Am/aM as well as AM/am

30 Haplotypes
      A     a
M     AM    aM
m     Am    am
This individual has AA and Mm genotypes and AM and Am haplotypes
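The phase ambiguity described on these slides can be made concrete by listing every haplotype pair consistent with a pair of unphased genotypes. This is an illustrative sketch with a hypothetical function name: only the double heterozygote (Aa with Mm) yields more than one possibility.

```python
def consistent_haplotype_pairs(genotype_1, genotype_2):
    """Return the set of unordered haplotype pairs for two unphased genotypes.

    genotype_1 is a two-letter string such as "Aa"; genotype_2 such as "Mm".
    """
    pairs = set()
    # Try both ways of pairing the locus-1 alleles with the locus-2 alleles.
    for first, second in ((genotype_1[0], genotype_1[1]),
                          (genotype_1[1], genotype_1[0])):
        haplotype_a = first + genotype_2[0]
        haplotype_b = second + genotype_2[1]
        pairs.add(tuple(sorted((haplotype_a, haplotype_b))))
    return pairs

for g1, g2 in [("aa", "Mm"), ("Aa", "Mm"), ("AA", "Mm")]:
    print(g1, g2, consistent_haplotype_pairs(g1, g2))
```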

31 Equilibrium haplotype frequencies
      A     a
M     pr    ps    p
m     qr    qs    q
      r     s

32 Linkage disequilibrium
      A         a
M     pr + D    ps - D    p
m     qr - D    qs + D    q
      r         s
D MAX = Min(ps, qr) if D > 0, Min(pr, qs) if D < 0
D’ = D / D MAX
r² = D² / pqrs
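A small sketch computing D, D’ and r² from four haplotype frequencies, using the slide's notation (p, q for alleles M, m and r, s for alleles A, a). It uses the standard definitions: the bound on D depends on its sign (Min(ps, qr) for positive D, Min(pr, qs) for negative D) and r² = D² / pqrs. The function name and the example frequencies are hypothetical.

```python
def ld_measures(f_AM, f_Am, f_aM, f_am):
    """Return (D, D', r^2) for two biallelic loci from haplotype frequencies."""
    p = f_AM + f_aM          # frequency of allele M
    q = 1.0 - p              # frequency of allele m
    r = f_AM + f_Am          # frequency of allele A
    s = 1.0 - r              # frequency of allele a
    D = f_AM - p * r         # deviation from the equilibrium frequency pr
    if D >= 0:
        d_max = min(p * s, q * r)
    else:
        d_max = min(p * r, q * s)
    d_prime = D / d_max if d_max > 0 else 0.0   # keeps the sign of D
    r_squared = D * D / (p * q * r * s)         # undefined if a locus is monomorphic
    return D, d_prime, r_squared

print(ld_measures(0.4, 0.1, 0.1, 0.4))
print(ld_measures(0.25, 0.25, 0.25, 0.25))   # equilibrium: no LD
```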

33 Haplotype analysis
1. Estimate haplotypes from genotypes
2. Associate haplotypes with trait
Haplotype   Freq.   Odds Ratio
AAGG        40%     1.00*
AAGT        30%     2.21
CGCG        25%     1.07
AGCT        5%      0.92
* baseline, fixed to 1.00
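Step 1 is commonly done with an EM algorithm; the slide does not say which method was used, so the following is only one possible sketch, restricted to two biallelic loci (alleles A/a and M/m) rather than the four-site haplotypes in the table. Only double heterozygotes are ambiguous, and the E-step splits them between AM/am and Am/aM in proportion to the current frequency estimates. The function name and example counts are hypothetical.

```python
def em_haplotype_freqs(genotype_counts, n_iter=200):
    """EM estimate of AM, Am, aM, am frequencies from unphased genotypes.

    genotype_counts maps (copies of A, copies of M) to a number of people.
    """
    freqs = {"AM": 0.25, "Am": 0.25, "aM": 0.25, "am": 0.25}
    total_haplotypes = 2 * sum(genotype_counts.values())
    for _ in range(n_iter):
        counts = dict.fromkeys(freqs, 0.0)
        for (n_A, n_M), n in genotype_counts.items():
            if n_A == 1 and n_M == 1:
                # E-step: weight the two possible phases of a double het.
                cis = freqs["AM"] * freqs["am"]
                trans = freqs["Am"] * freqs["aM"]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts["AM"] += n * w
                counts["am"] += n * w
                counts["Am"] += n * (1 - w)
                counts["aM"] += n * (1 - w)
            else:
                # Phase is known when at least one locus is homozygous.
                first = "A" * n_A + "a" * (2 - n_A)
                second = "M" * n_M + "m" * (2 - n_M)
                counts[first[0] + second[0]] += n
                counts[first[1] + second[1]] += n
        # M-step: update frequencies from the expected haplotype counts.
        freqs = {h: c / total_haplotypes for h, c in counts.items()}
    return freqs

# Hypothetical sample: 4 AA/MM, 4 aa/mm and 2 double heterozygotes.
print(em_haplotype_freqs({(2, 2): 4, (0, 0): 4, (1, 1): 2}))
```

The estimated frequencies would then feed step 2, for example as weights in a case/control comparison of haplotype counts.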

35 Linkage / Association
[Figure: linkage panels plot sib correlation against IBD (0, 1, 2) at the QTL and at the marker, connected by RF; association panels plot trait against QTL genotype and marker genotype (aa, Aa, AA), connected by LD]

36 Variance Components
Means (ASSOCIATION)
  M1    M2
Variance-covariance matrix (LINKAGE)
  V1    C21
  C12   V2

37 Variance Components
Means
  M1 + bG1    M2 + bG2
Variance-covariance matrix
  V1               C21 + q(π - ½)
  C12 + q(π - ½)   V2
LINKAGE
  q = regression coef.
  π = IBD sharing: 0, ½, 1
ASSOCIATION
  b = regression coef.
  G = individual’s genotype
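The combined model on this slide can be sketched as a function returning the expected means and covariance matrix for one sib pair. Here π is written `pi_hat` for the proportion of alleles the pair shares IBD (0, ½ or 1), and a single covariance C stands for both C12 and C21; the function name and the numeric values are hypothetical.

```python
def sib_pair_moments(M1, M2, V1, V2, C, b, q, g1, g2, pi_hat):
    """Expected means and covariance matrix for one sib pair.

    Association enters the means through b * genotype;
    linkage enters the covariance through q * (pi_hat - 0.5).
    """
    means = (M1 + b * g1, M2 + b * g2)
    covariance = C + q * (pi_hat - 0.5)
    return means, [[V1, covariance], [covariance, V2]]

# Illustrative values only: genotype scores 1 and 0, pair sharing all alleles IBD.
means, sigma = sib_pair_moments(M1=0.0, M2=0.0, V1=1.0, V2=1.0, C=0.3,
                                b=0.4, q=0.2, g1=1, g2=0, pi_hat=1.0)
print(means, sigma)
```

Pairs with π = ½ keep the baseline covariance C, so q captures how much sib resemblance rises or falls with IBD sharing at the locus.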

38 Components of a Genetic Theory
POPULATION MODEL
–Allele & genotype frequencies
–Demographics & population history
–Linkage disequilibrium, haplotype structure
TRANSMISSION MODEL
–Mendelian segregation
–Identity by descent & genetic relatedness
PHENOTYPE MODEL
–Biometrical model of quantitative traits
–Additive & dominance components
[Figure: genotypes (G) across generations over time, with phenotypes (P)]

39 Linkage without association
[Pedigree figure genotypes: 3/5 2/6 3/2 5/2 and 3/5 2/6 3/6 5/6]
Both families are ‘linked’ with the marker…
…but a different allele is involved.

40 Linkage and association
[Pedigree figure genotypes: 3/6 2/4 3/2 6/2 3/5 2/6 3/6 5/6 4/6 2/6 6/6 6/6 6/6 6/6]
All families are ‘linked’ with the marker…
… and allele 6 is ‘associated’ with disease
Linkage is just association within families

41 Association without linkage
[Figure genotypes, controls and cases: 3/6 2/4 3/2 6/2 3/5 2/5 3/6 5/6 4/6 2/6 6/6 6/6 2/2 3/4 5/2]
Allele 6 is more common in the GREEN population
The disease is more common in the GREEN population
… a ‘spurious association’

42 TDT
Transmission disequilibrium test
–test for linkage and association
[Pedigree figure genotypes: AA Aa AA Aa aa AA Aa]

43 TDT
“A” disease allele
[Figure: matings AA x Aa and aa x Aa with offspring genotypes AA, Aa, Aa, aa; transmissions marked + or -, each with probability 0.5; panels labelled Additive, Dominant, Recessive]
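The slides do not write out the test statistic. In its standard form the TDT counts, among heterozygous parents of affected offspring, how often allele A was transmitted (b) versus not transmitted (c), and refers (b - c)² / (b + c) to a chi-square distribution with 1 df. A minimal sketch with made-up counts and a hypothetical function name:

```python
import math

def tdt(transmitted, untransmitted):
    """McNemar-style TDT statistic and 1-df chi-square p-value.

    transmitted / untransmitted count how often heterozygous parents
    passed allele A, or did not, to an affected child.
    """
    b, c = transmitted, untransmitted
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2.0))   # upper tail for 1 df
    return stat, p_value

# Hypothetical counts: A transmitted 60 times, not transmitted 40 times.
print(tdt(60, 40))
```

Under the null hypothesis each heterozygous parent transmits A with probability 0.5, so b and c are expected to be equal.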

44 Between and within components
[Figure labels: Sib1, Sib2]
Sib1 = B - W
Sib2 = B + W

45 Between and within components
Fulker et al (1999)
Genotype      Score
S1    S2      S1    S2     B      W      S1      S2
AA    AA      1     1      1      0      B+W     B-W
AA    Aa      1     0      0.5    0.5    B+W     B-W
AA    aa      1     -1     0      1      B+W     B-W
Note: W = S1 – B

46 Parental genotypes
Use parental genotypes to generate B
Examples
–AA from AAxAA W = 0
–Aa from AAxAa W = -0.5
–Aa from AaxAa W = 0
Pat   Mat   B
1     1     1
1     0     0.5
1     -1    0
0     1     0.5
0     0     0
0     -1    -0.5
-1    1     0
-1    0     -0.5
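The between (B) and within (W) components can be sketched as follows, coding genotypes as AA = 1, Aa = 0, aa = -1. B is the mean of the parental scores when parents are genotyped and otherwise the mean of the sibs' scores; W is each sib's score minus B. The function name is hypothetical.

```python
SCORE = {"AA": 1, "Aa": 0, "aa": -1}

def between_within(sib_genotypes, parent_genotypes=None):
    """Return (B, [W for each sib]) for one family."""
    sib_scores = [SCORE[g] for g in sib_genotypes]
    if parent_genotypes is not None:
        # B from parents: the expected offspring score given the mating type.
        between = sum(SCORE[g] for g in parent_genotypes) / 2
    else:
        # No parents: B is the mean score of the sibship.
        between = sum(sib_scores) / len(sib_scores)
    return between, [score - between for score in sib_scores]

print(between_within(["AA", "Aa"]))                 # sib pair, no parents
print(between_within(["Aa"], ("AA", "Aa")))         # Aa from AA x Aa
print(between_within(["Aa"], ("Aa", "Aa")))         # Aa from Aa x Aa
```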

47 assoc.mx
–Sibling pair sample
–B and W components precalculated in input file
–Single SNP genotype
–Quantitative trait

48 assoc.dat
Columns: s1 s2 g1 g2 b w1 w2
-0.007 -0.972 -1 0 -0.5 -0.5  0.5
-0.829 -0.196  1 1  1    0    0
 0.369  0.645  1 1  1    0    0
 0.318  1.55   0 1  0.5 -0.5  0.5
 1.52   0.910  0 0  0    0    0
-0.948 -1.55   1 1  1    0    0
 0.596 -0.394  1 0  0.5  0.5 -0.5
-1.91  -0.905  0 1  0.5 -0.5  0.5
 0.499  0.940  1 0  0.5  0.5 -0.5
-1.17  -1.29   1 0  0.5  0.5 -0.5
-0.16  -1.81   1 1  1    0    0

49 ! Mx script for QTL association: sib pairs, univariate
Group 1 :
Calc NG=2
Begin Matrices;
! ** Parameters
B Full 1 1 free   ! association : between component
W Full 1 1 free   ! association : within component
M Full 1 1 free   ! mean
S Full 1 1 free   ! Shared residual variance
N Full 1 1 free   ! Nonshared residual variance
! ** Definition variables **
C Full 1 1        ! association : between
X Full 1 1        ! association : within, sib 1
Y Full 1 1        ! association : within, sib 2
End Matrices;
! ** Uncomment for B=W model
! Equate W 1 1 1 B 1 1 1
! Starting values
Matrix B 0
Matrix W 0
Matrix M 0
Matrix S 0.5
Matrix N 0.5
End

50 Group2 : Data Group
Data NI=7 NO=0
RE file=assoc.dat
Labels Sib1 Sib2 g1 g2 b w1 w2
Select Sib1 Sib2 b w1 w2 /
Definition b w1 w2 /
Matrices = Group 1
Means M + B*C + W*X | M + B*C + W*Y /
Covariance S + N | S _ S | S + N /
Specify C b /
Specify X w1 /
Specify Y w2 /
End
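To make the model in the script concrete, here is a Python sketch (not Mx) of the quantity being minimised: the bivariate normal -2 log-likelihood for sib pairs with means M + B*b + W*w and covariance matrix [[S+N, S], [S, S+N]]. Including the 2π constant is an assumption about how the -2LL is reported, the function name is hypothetical, and the three rows are the first three records of assoc.dat with the g1 and g2 columns dropped.

```python
import math

def minus2ll(rows, B, W, M, S, N):
    """-2 log-likelihood for sib pairs given (s1, s2, b, w1, w2) rows."""
    variance = S + N
    det = variance * variance - S * S      # determinant of the 2 x 2 covariance
    total = 0.0
    for s1, s2, b, w1, w2 in rows:
        r1 = s1 - (M + B * b + W * w1)     # residual for sib 1
        r2 = s2 - (M + B * b + W * w2)     # residual for sib 2
        # Quadratic form r' * inverse(Sigma) * r for the 2 x 2 case.
        quad = (variance * r1 * r1 - 2 * S * r1 * r2 + variance * r2 * r2) / det
        total += 2 * math.log(2 * math.pi) + math.log(det) + quad
    return total

rows = [(-0.007, -0.972, -0.5, -0.5, 0.5),
        (-0.829, -0.196, 1.0, 0.0, 0.0),
        (0.369, 0.645, 1.0, 0.0, 0.0)]

# Evaluate at the script's starting values (B = W = M = 0, S = N = 0.5).
print(minus2ll(rows, B=0.0, W=0.0, M=0.0, S=0.5, N=0.5))
```

Fitting the model amounts to minimising this function over B, W, M, S and N, with W equated to B or fixed at zero for the submodels.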

51 Models
B & W
  B Full 1 1 free
  W Full 1 1 free
  !Equate W 1 1 1 B 1 1 1
B = W
  B Full 1 1 free
  W Full 1 1 free
  Equate W 1 1 1 B 1 1 1
B
  B Full 1 1 free
  W Full 1 1
  !Equate W 1 1 1 B 1 1 1
B=W=0
  B Full 1 1
  W Full 1 1
  !Equate W 1 1 1 B 1 1 1

52 Tests
Test                        H A      H 0
Standard association test   B = W    B=W=0
Test of stratification      B & W    B = W
Robust association test     B & W    B

53 assoc.mx
Model    B         W        -2LL      df
B & W    -0.478    -0.365   2103.96   795
B = W    -0.420    -0.420   2105.05   796
B        -0.4778            2127.01   796
B=W=0                       2163.34   797
Test of total association
H A   B=W     2105.05
H 0   B=W=0   2163.34
Δ-2LL = 58.29, df = 1, p ≈ 2e-14

54 assoc.mx
Model    B         W        -2LL      df
B & W    -0.478    -0.365   2103.96   795
B = W    -0.420    -0.420   2105.05   796
B        -0.4778            2127.01   796
B=W=0                       2163.34   797
Test of stratification
H A   B & W   2103.96
H 0   B = W   2105.05
Δ-2LL = 1.09, df = 1, p = 0.29

55 assoc.mx
Model    B         W        -2LL      df
B & W    -0.478    -0.365   2103.96   795
B = W    -0.420    -0.420   2105.05   796
B        -0.4778            2127.01   796
B=W=0                       2163.34   797
Test of within association
H A   B & W   2103.96
H 0   B       2127.01
Δ-2LL = 23.06, df = 1, p ≈ 2e-6

56 Implementation
QTDT
–Abecasis et al (2001) AJHG
–extends between/within model to general pedigrees
–multiple alleles
–covariates
–combined test of linkage and association
–discrete as well as quantitative traits

57 Linkage / Association
Linkage
–families
–detectable over large distances, >10 cM
–large effects: OR >3, variance >10%
Association
–unrelateds or families
–detectable over small distances, <1 cM
–small effects: OR <2, variance <1%

