Genomics Workshop Demography of Aging Centers Biomarker Network Meeting in Conjunction with the Annual Meeting of the PAA April 14, 9:00 AM to 3:30 PM.

1 Genomics Workshop
  Demography of Aging Centers Biomarker Network Meeting in Conjunction with the Annual Meeting of the PAA
  April 14, 9:00 AM to 3:30 PM – Hyatt Regency, Dallas, Texas
  Sponsored by USC/UCLA Center of Biodemography and Population Health
  Organized by Teresa Seeman, Steven Cole, Eileen Crimmins

2 Tactical aspects of study administration and sample capture/storage
  Biological overview of genetics & functional genomics
  Strategic aspects of study design and data analysis
  Lunch
  Technical aspects of study design and data analysis
  Perspectives on the State of the Field
  Application clinic

3 Tactical aspects of study administration and sample capture/storage
  DNA
    1. New sample capture
       Methods: e.g., Oragene, leukocytes
       Consent & administrative issues
    2. Retrospective analyses
       Sources: blood spots, cheek swabs, etc.
       Consent & administrative issues
    3. Epigenetics
       DNA methylation
       Histone acetylation & chromatin dynamics
       Tissue specificity (vs DNA)
    4. Tactical issues – Reports from the Field
       I wish I’d known then…
  RNA
    1. Identifying appropriate target tissues
       Whole blood, PBMC, saliva, hair, pathology specimens
    2. Sample capture/storage
    3. Consent & administrative issues

11-14 Diagram (built up across slides): Gene, Health, IL6, RNA, DNA

16 Biological overview of genetics & functional genomics
  Theoretical framework: Genes, Environments, transcription, and health
    1. “Genetic” influences (missing h², penetrance R-square, etc.)
    2. Functional genomics
       Transcription factors
       Epigenetics
    3. Gene-Environment interactions
       Regulatory polymorphism
       Coding polymorphism
  System dynamics
    1. Feedback, network pleiotropy
    2. Recursive developmental trajectories

17-27 Diagram (built up across slides): Social Environment, Gene, Health, IL6, RNA, DNA

28-33 IL6 promoter diagram (built up across slides): sequence TCT TGCGATGCTA AAG; labels NE, PKA, GATA1 (P), IL6 gene transcription
  Plot: IL6 promoter activity (fold-change, axis 0 to 10) by norepinephrine (µM): 0 and 10

34 Socio-environmental regulation of IL6
  Plot: Non-depressed vs. Depressed, p = .008

42-45 Diagram (built up across slides): Social Environment, Gene, Health, IL6, RNA, DNA, … [G/C] …

46-51 Gene x Environment Interaction: in silico and in vitro
  IL6 promoter sequence TCT TGCGATGCTA AAG, with variant C
  In silico: V$GATA1_01 = .943 and V$GATA1_01 = .619
  In vitro: transcriptional activity (fold-change, axis 0 to 10) for IL6 promoter WT and -174C at norepinephrine (µM) 0 and 10; difference: p < .0001

52-53 Gene x Environment Interaction
  Panels IL6 -174 GG and IL6 -174 CC/GC, each comparing Non-depressed with Depressed; p = .008 and p = .439

55-57 Diagram (built up across slides): Social Environment, Gene, Health, IL6, RNA, DNA, … [G/C] …; slide 57 adds the labels Health 2 and RNA 2

61-67 Diagram (built up across slides): Social Environment, Gene, IL6, RNA, DNA, Behavior; Gene-Environment Correlation; Recursive Molecular Remodeling

68-76 Recursive developmental remodeling
  Diagram (built up across slides): Environment, Body, RNA, and Behavior at each of Time 1, Time 2, and Time 3
  RNA = intra-organismic adaptation
  Cole (2009) Current Directions in Psychological Science

78 Strategic aspects of study design and data analysis
  Basic substantive objectives & study designs
    1. “Gene discovery” (e.g., genetic epidemiology)
    2. Environmental regulation of health (via transcription)
    3. Gene-Environment interaction

79-86 Diagram (built up across slides): Gene, Health, IL6, RNA, DNA, … [G/C] …

87 Strategic aspects of study design and data analysis
  Basic substantive objectives & study designs
    1. “Gene discovery” (e.g., genetic epidemiology)
    2. Environmental regulation of health (via transcription)
    3. Gene-Environment interaction
       Antagonistic pleiotropy

88-90 Antagonistic pleiotropy
  Plot: CRP mg/L / Adversity SD (axis -3.0 to 3.0) by IL6 -174 genotype (CC, GC, GG), for Older Adult and Adolescent; p = .007 and p = .032
  Evolution deletes disadvantage, particularly to the young

91-98 Fisher’s regression
  Plots: Outcome by genotype (GG, GC, CC), overall and separately for Environment A and Environment B
  Genotype-only model: y = a + b(#G) + e
  Model with environment: y = a + b(#G) + c(Env) + d(#G x Env) + e
  Omitting the environment terms moves them into the error term: y = a + b(#G) + e’, where e’ ← c(Env) + d(#G x Env) + e
  Consequences: ↓ power, ↑ parameter estimate bias; Marginal: 0
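The omitted-environment point on these slides can be illustrated with a minimal Python sketch. The data are hypothetical (not from the workshop): the outcome rises with the number of G alleles in one environment and falls by the same amount in the other. The helper name `fit_line` is illustrative, and the two stratified fits stand in for the interaction model, which gives the same least-squares estimates when the environment is binary.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical observations: (number of G alleles, environment, outcome).
# Environment 0 ("A"): y = 1 + g; environment 1 ("B"): y = 3 - g.
data = [(g, 0, 1 + g) for g in (0, 1, 2)] + [(g, 1, 3 - g) for g in (0, 1, 2)]

# Fisher's regression ignoring the environment: y = a + b(#G) + e'
a_marg, b_marg = fit_line([g for g, _, _ in data], [y for _, _, y in data])

# Interaction model: y = a + b(#G) + c(Env) + d(#G x Env) + e.
# With a binary environment, its fit equals one line per environment.
a0, b0 = fit_line([g for g, e, _ in data if e == 0],
                  [y for _, e, y in data if e == 0])
a1, b1 = fit_line([g for g, e, _ in data if e == 1],
                  [y for _, e, y in data if e == 1])
a, b, c, d = a0, b0, a1 - a0, b1 - b0

print(f"genotype-only slope b = {b_marg}")
print(f"interaction model: a = {a}, b = {b}, c = {c}, d = {d}")
```

In this constructed case the genotype-only slope is zero even though the genotype effect is nonzero within each environment, which is the situation the "Marginal: 0" label appears to describe.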

99-100 Strategic aspects of study design and data analysis
  Valid statistical models are one major reason that substantive interests (environments) matter.
  OK, then, let’s have lunch.

101 Technical aspects of study design and data analysis
  Study designs, assay technologies, and statistical methods
    1. “Gene discovery” (e.g., genetic epidemiology)
       Candidate gene studies
       Genome-wide association studies
       The bioinformatic middle road
    2. Environmental regulation of health (via transcription)
       Candidate transcript studies
       Genome-wide approaches
    3. Gene-Environment interaction
       Statistical issues
       Revisiting the bioinformatic middle road

102 Technical aspects of study design and data analysis
  Study designs, assay technologies, and statistical methods
    1. “Gene discovery” (e.g., genetic epidemiology)
       Candidate gene studies
         - Candidate identification
         - Targeted genotyping
             a. PCR
             b. High-throughput approaches
         - Statistical models
             a. Fisher’s basic regression model
             b. Multivariate mapping / association / recombination
                  i. Recombination
                  ii. Haplotype blocks
             c. Confounding
                  i. Linkage disequilibrium & haplotype analyses
                  ii. Ethnic stratification
                      Phenotypic ascertainment
                      Genetic ancestry
                  iii. Mendelian randomization

113
  Well  ID1  ID2     RFU1     RFU2    Ct1    Ct2  Call
  A01   053  053  1094.39   956.90  42.53  41.36  Heterozygote
  A02   065  065   -43.33  1519.25  60.00  40.39  Allele2
  A03   075  075  1126.77   890.96  42.82  42.02  Heterozygote
  A04   079  079  2095.09    25.36  42.84  60.00  Allele1
  A05   087  087  2187.80    18.09  41.27  60.00  Allele1
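One way to read the calls on slide 113 is sketched below in Python. The rule is an assumption inferred only from the five wells listed there, not a documented algorithm of any genotyping instrument: a Ct of 60.00 is treated as "no amplification" for that allele's probe. The names `call_genotype` and `NO_AMPLIFICATION_CT` are hypothetical.

```python
# Assumed cycle cap: a Ct equal to this value is read as "probe never amplified".
# Inferred from the rows on the slide; not taken from instrument documentation.
NO_AMPLIFICATION_CT = 60.0

def call_genotype(ct1, ct2, cap=NO_AMPLIFICATION_CT):
    """Hypothetical calling rule based on which allele probes amplified."""
    amplified1 = ct1 < cap
    amplified2 = ct2 < cap
    if amplified1 and amplified2:
        return "Heterozygote"
    if amplified1:
        return "Allele1"
    if amplified2:
        return "Allele2"
    return "Undetermined"

# (well, Ct1, Ct2, call reported on the slide)
rows = [
    ("A01", 42.53, 41.36, "Heterozygote"),
    ("A02", 60.00, 40.39, "Allele2"),
    ("A03", 42.82, 42.02, "Heterozygote"),
    ("A04", 42.84, 60.00, "Allele1"),
    ("A05", 41.27, 60.00, "Allele1"),
]

for well, ct1, ct2, reported in rows:
    print(well, call_genotype(ct1, ct2), "| slide:", reported)
```

A production pipeline would normally also use the fluorescence values (RFU1, RFU2) and quality thresholds; this sketch ignores them.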

119-125 Fisher’s regression
  Plot (built up across slides): Outcome by genotype (GG, GC, CC)
  y = a + b(#G)
  y = a + b(GG) + c(GC) + d(CC)
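The two codings on these slides can be compared with a short Python sketch. The outcome values are hypothetical and chosen so that a single G allele is enough to raise the outcome. One detail is an assumption about how the second equation would be fitted in practice: with an intercept, one genotype has to serve as the reference category (CC here), so only two genotype effects are estimated.

```python
from statistics import mean

# Hypothetical outcomes by genotype (a dominant pattern); not workshop data.
outcomes = {"CC": [1.0, 1.2, 0.8], "GC": [3.1, 2.9, 3.0], "GG": [3.0, 3.2, 2.8]}
g_count = {"CC": 0, "GC": 1, "GG": 2}

x = [g_count[g] for g, ys in outcomes.items() for _ in ys]
y = [v for ys in outcomes.values() for v in ys]

# Allele-count coding: y = a + b(#G), fitted by ordinary least squares.
mx, my = mean(x), mean(y)
b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
     / sum((xi - mx) ** 2 for xi in x))
a = my - b * mx
additive_pred = {g: a + b * n for g, n in g_count.items()}

# Genotype coding: one fitted mean per genotype, expressed relative to CC.
genotype_mean = {g: mean(ys) for g, ys in outcomes.items()}
effects_vs_cc = {g: genotype_mean[g] - genotype_mean["CC"] for g in ("GC", "GG")}

def sse(pred):
    """Sum of squared errors for per-genotype predictions."""
    return sum((v - pred[g]) ** 2 for g, ys in outcomes.items() for v in ys)

print("allele-count coding SSE:", round(sse(additive_pred), 3))
print("genotype coding SSE:", round(sse(genotype_mean), 3))
print("effects relative to CC:", effects_vs_cc)
```

The genotype coding spends one more parameter and can follow non-additive patterns; the allele-count coding is more parsimonious when effects are roughly additive.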

131-135 Fisher’s regression
  Plot: Outcome by genotype (GG, GC, CC)
  y = a + b(#G rs1800795)
  y = a + b(#G rs1800795) + c(#T rs20937) + ….
  y = a + b(Haplotype containing rs1800795)
  y = a + b(ATTCGTAC)
  HapMap; Tag SNP

137 Linkage-driven indirect association gradients

141 Culture/behavior/exposure “Environment”

144 Ancestry classification via mitochondrial haplogroups (also Y haplogroups for paternal lineage)

148-151 Diagram (built up across slides): CRP, CVD, IL-6

153-155 Technical aspects of study design and data analysis
  Study designs, assay technologies, and statistical methods
    1. “Gene discovery” (e.g., genetic epidemiology)
       Candidate gene studies
       Genome-wide association studies
         - Marker selection for blind search: tag SNPs
         - Massively parallel genotyping
             a. Array-based strategies
             b. Deep resequencing
         - Statistical models
             a. Main effect models
             b. Interaction models
             c. Managing Type I error
                  - Bonferroni & FDR
                  - Internal cross-validation
                  - External replication

162-164 Fisher’s regression
  Plots: Outcome by genotype (GG, GC, CC), overall and separately for Environment A and Environment B
  y = a + b(#G)
  y = a + b(GG) + c(GC) + d(CC)
  y = a + b(#G) + c(Env) + d(#G x Env)
  y = a + b(GG) + c(GC) + d(CC) + e(Env) + f(Env x GG) + g(Env x GC) + h(Env x CC)

166-169 Type 1 / false positive error:
  Confirmatory hypothesis testing (candidate genes)
    1 hypothesis = 1 t-test = 1 p-value = no problem: p < .05 = p < .05
  Gene mapping (exploratory association testing)
    Gene expression: 22,000 p-values = 1,100 false positives (p < .05)
      p(false discovery > 0) = .999999999999999999999999+
    Gene polymorphism: 10,000,000 p-values = 500,000 false positives (p < .05)
      p(false discovery > 0) = .999999999999999999999999+
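The arithmetic behind these figures can be checked with a few lines of Python. This is a sketch under two stated assumptions that the slide does not spell out: every null hypothesis is true, and the tests are independent. The function names are illustrative.

```python
import math

def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of p-values below alpha when every null hypothesis is true."""
    return n_tests * alpha

def log10_prob_no_false_positive(n_tests, alpha=0.05):
    """log10 of (1 - alpha) ** n_tests, assuming independent tests.

    Working in logs avoids floating-point underflow for large n_tests.
    """
    return n_tests * math.log10(1 - alpha)

for label, n in [("gene expression", 22_000), ("gene polymorphism", 10_000_000)]:
    print(label,
          "| expected false positives:", expected_false_positives(n),
          "| log10 P(no false positive):", log10_prob_no_false_positive(n))
```

The probability of at least one false positive is one minus a number far smaller than 10^-24, consistent with the run of nines on the slide.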

170-181 What to do?
  1. Increase stringency (intra-study)
     Bonferroni correct (p = .05/22,000 = .00000227)
       Choice: huge samples or massive Type 2 “false negative” error
     Model/simulate error
       Randomization test or FDR modeling = less conservative bias
       Unimpressive yield: p = .00000300 if you’re lucky.
       Still too conservative, and biased (omitted true effects in error term)
     Use a better sampling design
       Population prevalence design
       Outcome-stratified design
  2. Replicate (inter-study or intra-study cross-validation)
     .05 x .05 x .05 = .000125 x 22,000 = 2.75 false positives (vs. 1,100)
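The options listed here can be sketched in Python. The Bonferroni and replication functions reproduce the slide's arithmetic, assuming independent tests and independent studies. The slides do not say which FDR method is meant, so the Benjamini-Hochberg step-up procedure is used as one common choice. The p-values in the last line are made up for illustration.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def expected_false_positives_after_replication(n_tests, n_studies, alpha=0.05):
    """Expected null results reaching p < alpha in every one of n_studies studies."""
    return n_tests * alpha ** n_studies

def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure; returns one boolean per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under its rank-scaled threshold.
        if p_values[i] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected

print(bonferroni_threshold(22_000))
print(expected_false_positives_after_replication(22_000, 3))
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042,
                          0.06, 0.074, 0.205, 0.212, 0.216]))
```

Requiring the same result in three independent studies cuts the expected false positives from 1,100 to under 3 without tightening the per-study threshold, which is the trade-off the slide contrasts with Bonferroni correction.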

183-185 Technical aspects of study design and data analysis
  Study designs, assay technologies, and statistical methods
    1. “Gene discovery” (e.g., genetic epidemiology)
       Candidate gene studies
       Genome-wide association studies
       The bioinformatic “middle road” – biological hypotheses buy power
         - Candidate set selection
             a. Regulatory polymorphism
             b. Coding polymorphism
         - Statistical considerations
             a. Power
             b. Differential enrichment

186-189 In silico prediction of Gene x Environment Interaction
  IL6 promoter sequence TCT TGCGATGCTA AAG, with variant C
  In silico: V$GATA1_01 = .943 and V$GATA1_01 = .619
  In vitro: transcriptional activity (fold-change, axis 0 to 10) for IL6 promoter WT and -174C at norepinephrine (µM) 0 and 10; difference: p < .0001
  In vivo: panels IL6 -174 GG and IL6 -174 CC/GC, each comparing Non-depressed with Depressed; p = .008 and p = .439

190 FLJ20719 -734 LOC148490 -929 AKR7A2 -678 RHCE -292 LOC440576 -934 SOC -39 SOC -49 SOC -26 UNQ6122 -877 LAPTM5 -728 PHC2 -168 PHC2 -16 ITGB3BP -311 FLJ20331 -994 ZNF265 -663 FUBP1 -778 LOC388650 -392 LOC388654 -957 PDE4DIP -175 COAS2 -435 LOC199882 -474 LOC440689 -692 LOC440689 -16 LOC441906 -496 FLG -17 LEP3 -631 RAB13 -310 LOC91181 -956 LOC126669 -407 LOC440693 -399 PKLR -118 PKLR -597 FCRH1 -580 SPTA1 -163 SLAMF9 -256 KCNJ10 -383 ITLN1 -760 F11R -798 LMX1A -85 SELP -144 LOC400796 -263 F13B -881 MYOG -951 LOC440712 -956 LGTN -331 FLJ10874 -676 GPATC2 -556 LOC440721 -625 AGT 1 FLJ10359 -367 LOC441927 -406 LOC440741 -564 MGC12466 -863 KIAA1720 -894 LOC388578 -522 LOC391205 -430 MIG-6 -618 MIG-6 -638 MIG-6 -678 LOC441870 -731 LOC440561 -255 LOC401940 -500 LOC401940 -564 LOC401940 -606 LOC339553 -400 LOC440753 -695 LOC388789 -593 FLJ38374 -686 LOC391241 -81 LOC388794 -28 C20orf70 -431 STK4 -122 PIGT -910 DNTTIP1 -479 C20orf67 -1 MMP9 -875 CEBPB -978 RNPC1 -370 TH1L -26 LOC400849 -714 LOC400849 -382 CGI-09 -309 FKHL18 -608 C20orf172 -118 TGM2 -220 LOC388798 -828 Kua-UEV -465 Kua-UEV -561 Kua -465 BTBD4 -590 C21orf99 -772 C21orf99 -13 KRTAP15-1 -566 B3GALT5 -889 LOC441955 -824 LOC400858 -624 CLDN8 -17 KRTAP19-7 -127 DSCR1 -620 C21orf84 -232 KRTAP12-4 -899 FTCD -410 LOC440842 -24 PEX26 -129 PEX26 -89 ZNF74 -726 LOC440804 -940 SMARCB1 -290 CABIN1 -797 KIAA1671 -26 ARP10 -612 ADSL -602 ARHGAP8 -260 NUP50 -629 PPARA -184 BID -126 DGCR14 -951 TXNRD2 -882 LOC391303 -881 LOC150221 -939 LOC91219 -352 LOC150236 -666 GSTT1 -141 SEC14L4 -746 SSTR3 -705 FLJ22582 -372 DIA1 -749 ATP5L2 -328 A4GALT -825 SULT4A1 -729 C2orf15 -882 LOC129521 -477 LOC440892 -918 IL1RL1 -332 MRPS9 -970 LOC442037 -839 IL1F7 -978 MGC52000 -273 MGC52000 -466 MGC52057 -404 MAP1D -120 COL3A1 -310 SLC39A10 -921 LOC200726 -220 IL8RB -447 TUBA4 -643 FLJ25955 -24 ALPPL2 -296 UGT1A9 -651 UGT1A7 -351 UGT1A6 -224 UGT1A6 -402 TRPM8 -170 ASB1 -723 GCKR -204 LOC388938 -212 FLJ38348 -606 MSH2 -376 MSH2 -976 MSH2 -376 
SBLF -59 LOC151443 -85 LOC391387 -134 SEMA4F -751 RBM29 -1 LOC339562 -621 LOC339562 -641 LOC200493 -245 TXNDC9 -714 FLJ40629 -946 LOC401005 -12 LOC389050 -170 ORC4L -16 ARL5 -895 NR4A2 -527 ATP5G3 -55 ZNF533 -598 ZSWIM2 -772 PGAP1 -821 PGAP1 -827 SF3B1 -138 ORC2L -786 LOC391475 -413 CRYGC -765 PECR -942 SLC23A3 -412 LOC442070 -877 LOC129607 -488 LOC339789 -268 LOC130502 -558 ALK -710 BCL11A -615 PAP -438 PAP -531 CNTN4 -809 PPARG -584 PPARG -914 LOC401054 -926 GALNTL2 -427 FBXL2 -107 APRG1 -269 APRG1 -347 LOC440951 -20 LOC389123 -140 LOC285194 -808 NR1I2 -769 STXBP5L -480 LOC442092 -880 MRPS22 -897 KCNAB1 -793 LOC402146 -134 LOC90133 -2 NLGN1 -541 FLJ20522 -803 ATP2B2 -593 LOC440946 -917 ANKRD28 -437 LOC152024 -365 FLJ32685 -953 SLC4A7 -509 MST1 -895 LOC377064 -623 LOC200959 -572 CPOX -150 LOC401079 -259 CBLB -250 LOC344807 -514 GPR156 -497 IQCB1 -412 MGC34728 -553 LOC256374 -248 KIAA0861 -435 MGC15397 -397 LOC254808 -484 LRRC15 -288 KIAA0226 -776 LOC255324 -380 IBSP -319 MGC48628 -101 NDST3 -902 LOC401149 -733 LOC441038 -837 FLJ35630 -291 CYP4V2 -117 LOC401164 -978 LOC391727 -934 LOC399917 -840 ZAR1 -106 LOC401132 -18 PF4 -819 EIF4E -716 ADH7 -557 TACR3 -957 AGXT2L1 -631 PLA2G12A -795 PITX2 -411 LOC401155 -72 CDHJ -652 FGA -110 PPID -384 LOC441049 -368 GPM6A -203 LOC389833 -878 LOC389833 -288 LOC389833 -878 LOC442102 -418 FGFBP1 -290 LOC441013 -188 FLJ00310 -289 FLJ00310 -881 FLJ00310 -289 LOC442127 -287 SRD5A1 -631 LOC345711 -877 LOC389281 -225 MGC42105 -669 PELO -938 BDP1 -918 DKFZp564C0469 -378 LOC134505 -63 TSLP -331 LOC340069 -755 SNCAIP -671 LOC441106 -646 SLC27A6 -484 CDC42SE2 -384 PHF15 -52 LOC389331 -27 PCDHA4 -26 PCDHB3 -623 PCDHB6 -212 PCDHB16 -609 ABLIM3 -474 LARP -716 LOC134541 -868 FGFR4 -472 FGFR4 -745 LOC442145 -7 LOC442146 -856 LOC345462 -604 LOC345462 -609 LOC442148 -595 OR2V2 -340 OR2V2 -901 TPPP -454 MYO10 -583 LOC441066 -463 GDNF -36 LOC345643 -568 FOXD1 -990 ARSB -493 DHFR -473 SPATA9 -748 CHD1 -581 STK22D -863 LOC389316 -227 CDO1 -360 
FLJ33977 -166 LOC391824 -129 ALDH7A1 -920 CAMK2A -429 C5orf4 -657 LOC345430 -332 DUSP1 -361 LOC285770 -132 NQO2 -705 MRS2L -22 HIST1H2BA -960 HIST1H2BD -597 HIST1H2BH -618 HIST1H4I -283 HLA-H -477 MRPS18B -207 LOC401250 -26 LOC401250 -497 NFKBIL1 -305 LY6G5B -359 C6orf25 -413 HSPA1B -942 C2 -687 HLA-DRA -774 HLA-DQA1 -265 ZBTB9 -229 LOC389386 -725 TLT4 -607 C6orf139 -788 KIAA1411 -549 C6orf57 -986 C6orf165 -728 POU3F2 -997 LOC340148 -581 C6orf55 -149 LOC345829 -202 LOC442278 -732 LOC442279 -858 LOC401289 -82 LOC285766 -472 SERPINB6 -657 OFCC1 -367 LOC441129 -714 SMA3 -762 LOC222699 -719 LOC441138 -870 OR12D3 -872 LOC346171 -389 HCG4P6 -80 HCG4P6 -501 PSORS1C2 -78 HLA-C -512 HLA-B -594 HLA-DRB1 -469 HLA-DRB1 -821 HLA-DQB2 0 HLA-DQB2 -333 HLA-DQB2 0 HLA-DOB -500 MLN -740 LRFN2 -452 C6orf108 -907 PLA2G7 -227 CRISP1 -236 IL17F -733 HMGCLL1 -759 LOC442226 -67 C6orf66 -832 DJ467N11.1 -34 RTN4IP1 -207 SLC22A16 -869 LOC442254 -307 DEADC1 -509 FLJ44955 -391 SYNE1 -484 SYNE1 -126 LOC389435 -451 LOC389435 -565 PIP3-E -457 T -9 T -3 LOC442280 -112 DKFZP434J154 -615 LOC401303 -632 LOC441198 -739 GHRHR -646 ADCYAP1R1 -60 C7orf16 -842 LOC441209 -41 GPR154 -435 C7orf36 -707 BLVRA -400 LOC51619 -311 WBSCR19 -38 LOC136288 -523 LOC392030 -632 FZD9 -485 LOC85865 -255 LOC442341 -390 AKR1D1 -159 LOC93432 -126 OR2F1 -160 OR2A5 -927 LOC441184 -336 LOC441186 -584 LOC441187 -654 LOC389831 -914 LOC222967 -338 LOC340267 -244 ICA1 -699 AGR2 -65 LOC389472 -184 LOC401316 -837 CRHR2 -610 PDE1C -20 LOC441210 -361 LOC222052 -77 LOC441224 -287 LOC441230 -143 LOC441245 -127 LOC441259 -954 CCL26 -441 SEMA3C -385 C7orf23 -761 PON1 -785 GATS -36 ACHE -715 ACHE -224 ACHE -715 ACHE -224 ORC5L -990 CHCHD3 -793 MGC5242 -861 LOC392997 -596 FLJ44186 -168 HIPK2 -70 ZC3HDC1 -407 LOC402301 -14 BAGE4 -100 BAGE4 -648 MCPH1 -520 MCPH1 -203 AMAC -766 NEIL2 -956 NEF3 -789 PNOC -756 LOC441344 -308 FKSG2 -72 DKFZp586M1819 -469 SNTG1 -463 LOC389657 -899 ADHFE1 -54 SULF1 -522 WWP1 -695 LOC401471 -326 FLJ45248 -290 
LOC441309 -343 LOC392169 -927 ANGPT2 -895 SPAG11 -971 SPAG11 -622 SPAG11 -971 DEFB104 -132 LOC389633 -370 ASAH1 -702 ASAH1 -882 FLJ22494 -242 FLJ22494 -781 SNAI2 -728 CPA6 -613 FSBP -393 MFTC -905 MRPL13 -525 LOC442399 -126 TOP1MT -477 LOC286126 -887 LOC340393 -922 DOCK8 -109 LOC441386 -327 C9orf93 -708 SH3GL2 -702 C9orf94 -376 LOC340501 -32 LOC441417 -394 DKFZP434M131 -944 SECISBP2 -404 LOC441453 -821 PHF2 -646 PHF2 -648 LOC441457 -742 LOC441457 -802 PRG-3 -971 RAD23B -998 SLC31A2 -380 OR1N2 -646 C9orf54 -2 LAMC3 -895 LOC441473 -825 DBH -768 OBP2A -732 EGFL7 -330 EGFL7 -335 TRAF2 -32 LOC441408 -394 LOC389702 -288 C9orf46 -353 SLC24A2 -265 IFNA10 -138 IFNA14 -85 C9orf11 -311 C9orf24 -905 UNQ470 -31 STOML2 -420 LOC392334 -904 LOC286327 -215 HNRPK -86 LOC441452 -955 DIRAS2 -896 LOC286359 -774 TXNDC4 -690 TXN -239 OR1L8 -459 DYT1 -561 ABO -790 ABO -789 ABO -790 XPMC2H -374 LOC441474 -921 LOC389734 -489 LOC389734 -223 FCN1 -673 FCN1 -709 LOC441410 -990 GAGE1 -21 RRAGB -788 LOC340527 -194 SH3BGRL -944 DIAPH2 -921 HSU24186 -145 NXF2 -89 PLP1 -918 LOC286436 -713 SLC6A14 -962 LOC392529 -73 FLJ25735 -992 MAGEB4 -834 LOC389844 -822 LOC389844 -814 UBE1 -964 LOC203604 -16 LOC441481 -796 DMD -923 RPGR 3 ZNF21 -828 PRKY -308 LOC441537 -223 LOC441539 -222 LOC441535 -225 LOC441536 -223 LOC338588 -51 UCN3 -368 NET1 -14 MAPK8 -856 LOC399768 -100 CDC2 -415 SLC29A3 -596 LOC143244 -131 LOC439994 -962 LIPL3 -68 LIPL3 -704 LOC439996 -302 LOC387701 -717 LOC387701 -817 FRAT1 -287 ABCC2 -3 HPS6 -237 NFKB2 -790 PNLIPRP2 -442 DMBT1 -462 FANK1 3 TAF3 -544 LOC441547 9 LOC220998 -941 TPRT -277 C10orf68 -817 C10orf9 -269 ZNF33A -477 LOC399744 -202 PPYR1 -81 LOC439946 -71 AKR1C2 -641 LOC441560 -504 LOC439975 -618 NEUROG3 6 AMID -452 PPP3CB -854 LOC439983 -240 LOC389988 -68 MMS19L -221 C10orf69 -121 GPR10 -555 C10orf93 -42 ASB13 -506 IL15RA -222 IL15RA -827 USP6NL -573 C10orf45 -181 NMT2 -912 SIAT8F -676 NEBL -727 C10orf52 -163 LOC439953 -879 LOC399737 -608 CTGLF1 -504 LOC439963 -500 KCNQ1 -40 
LOC387746 -61 OR51F2 -640 TRIM34 -105 OR10A2 -851 SAA1 -721 SAA1 -722 LOC441593 -126 PDHX -845 TRIM44 -24 LOC90139 -660 NDUFS3 -929 LOC196346 -885 OR5T3 -97 CTNND1 -133 CTNND1 -116 CNTF -149 ROM1 -515 MARK2 -375 RAB1B -75 GSTP1 -841 LOC440056 -824 USP35 -148 LOC390231 -471 OR4D5 -465 OR8G5 -809 MGC39545 -867 LOC399969 -328 LOC219797 -216 NUP98 -651 KIAA0409 -533 LOC283299 -427 LOC440026 -69 LOC440030 -675 LOC387754 -159 LOC144100 -631 HPS5 -917 LOC387764 -149 LOC440041 -221 FLJ31393 -362 OR8H1 -161 AGTRL1 -809 PRG2 -899 TCN1 -716 RAB3IL1 -976 KIAA0404 -771 CHRDL2 -754 KCTD14 -94 MRE11A -879 MRE11A -982 MMP7 -853 CRYAB -175 ZNF202 -527 LOC387820 -553 LOC387823 -178 CCND2 -350 NDUFA9 -485 KCNA5 -805 FLJ10665 -245 FLJ10665 -576 LOC285407 -743 LOC390299 -771 FLJ10652 -491 LOC144245 -455 PFKM -838 DKFZp686O1689 -733 C12orf10 -110 DGKA -806 DGKA -800 SUOX -384 ZNFN1A4 -874 LYZ -944 GAS41 -166 VEZATIN -34 LOC387876 -110 C12orf8 -840 COX6A1 -124 LOC390364 -971 LOC144678 -418 LOC338797 -31 SLC6A13 -445 NRIP2 -107 NOL1 -122LOC387701 -817 FRAT1 -287 ABCC2 -3 HPS6 -237 NFKB2 -790 PNLIPRP2 -442 CLECSF12 -885 KLRK1 -349 PRB1 -589 ADAMTS20 -965 SLC38A2 -638 K-ALPHA-1 -27 KIAA1602 -262 RACGAP1 -620 K6IRS3 -708 KRT4 -83 NPFF -777 STAT2 -94 FLJ32949 -500 IFNG -795 MGC26598 -498 HAL -358 DKFZp434M0331 -920 LOC400070 -223 TSC -785 GPR109B -392 EPIM -568 GALNT9 -798 LOC440122 -169 LOC221140 -342 LOC440128 -877 LOC387912 -279 LOC341784 -327 NURIT -947 RB1 -525 DKFZP434K1172 -595 LOC144983 -906 LOC144983 -892 LOC144983 -896 LOC400144 -807 PROZ -865 CRYL1 -768 POSTN -32 LOC440134 -367 EBPL -973 GUCY1B2 -832 LOC338862 -918 LOC404785 -818 OR11H6 -269 C14orf92 -234 PSMA6 -219 KTN1 -222 C14orf166B -786 EVL -28 CCNB1IP1 -868 NEDD8 -143 BAZ1A -508 NFKBIA -963 LOC283551 -302 CDKL1 -902 LOC400214 -138 RTN1 -974 LOC390488 -457 PLEK2 -465 PIGH -153 RDH11 -251 FLJ39779 -161 KIAA1509 -179 SERPINA2 -559 SERPINA9 -856 LOC390529 -204 LOC388073 -112 LOC400307 -332 LOC283694 -71 LOC400320 -443 FLJ35785 
-414 LOC440249 -92 HH114 -991 PLA2G4B -483 CAPN3 -318 LOC400368 -320 SLC28A2 -275 DUT -32 SCG3 -739 LIPC -853 OSTbeta -781 LOC440289 -446 COMMD4 -790 LOC400433 -496 LOC390637 -55 FLJ11175 -113 LOC440224 -815 LOC283804 -112 CHSY1 -876 LOC440315 -303 LOC400470 -62 LOC388076 -715 LOC440250 -206 LOC440255 -981 FLJ20313 -236 AVEN -767 KIAA0377 -896 FBN1 -191 SPPL2A -4 BCL2L10 -653 LOC145780 -610 BNIP2 -842 BNIP2 -421 RASL12 -878 SNAPC5 -540 BG1 -364 LOC400411 -62 LOC440293 -718 FLJ40113 -951 IP -207 TBL3 0 KIAA1171 -70 TNFRSF12A -968 DNAJA3 -24 ALG1 -464 FLJ12363 -773 LOC92017 -711 TMC7 -412 MGC16824 -271 RBBP6 -795 ITGAX -504 ERAF -510 LOC388248 -649 FLJ38101 -981 CES4 -221 MT1H -280 GAN -839 PLCG2 -534 CDH13 -906 HSBP1 -425 MLYCD -917 FLJ45121 -772 DPEP1 -765 FLJ32252 -288 FLJ32252 -346 MGC35212 -360 FLJ25410 -280 LOC400506 -715 LOC94431 -77 DOC2A -265 LOC441761 -889 LOC57019 -375 ZNF319 -360 DNCLI2 -857 DKFZP434A1319 -236 LOC439920 -70 CHST5 -601 CHST5 -756 LOC390748 -242 DPH2L1 -42 LOC388323 -892 MAP2K4 -128 KRTAP4-12 -78 JJAZ1 -789 CCL2 -912 PSMB3 -889 LOC440440 -1 FLJ25168 -244 SP2 -57 LOC388406 -800 TBX4 -465 DDX42 -212 LOC90799 -734 DKFZP586L0724 -829 SSTR2 -874 MRPS7 -822 MRPS7 -719 LOC388429 -804 NARF -669 GEMIN4 -911 OR1D2 -376 ALOX15 -267 SLC16A11 -346 CLECSF14 -596 CLECSF14 -640 FLJ40217 -393 RCV1 -761 CDRT1 -618 NOS2A -287 KRT25D -828 KRT12 -585 HUMGT198A -797 HUMGT198A -690 FLJ31222 -769 LOC284058 -524 GIP -957 LOC400619 -823 UNC13D -695 LOC339162 -685 LOC388462 -43 SEH1L -801 LOC284232 -988 LOC284232 -845 CABLES1 -281 CABYR -908 DSG3 -367 SLC14A1 -333 DCC -386 RAB27B -713 ZCCHC2 -249 LOC342808 -306 LOC284276 -397 MYOM1 -232 MC2R -113 LOC441817 -600 KIAA1632 -405 FBXO15 -123 FBXO15 -192 LOC390865 -489 TXNL4 -33 CDC34 -270 GZMM -678 C19orf21 -573 ARID3A -913 LOC126295 -456 MGC39581 -37 TRAPPC5 -352 LOC51257 9 OR7C2 -399 OR10H3 -953 OR10H4 -323 LOC284434 -560 HSPC142 -632 PGLS -935 LOC148206 -288 ZNF431 -967CLECSF12 -885 PSMC4 -215 EGLN2 -452 LOC388549 -412 
SYNGR4 -825 RPL13A -816 LOC402665 -925 FLJ46385 -176 LOC91661 -13 LAIR2 -705 KIR2DL1 -763 KIR3DL2 3 ZNF583 -867 ZNF71 -861 MGC4728 -490 ZNF211 -76 LOC401895 -957 APBA3 -13 FUT5 -174 TNFSF7 8 SH2D3A -273 8D6A -950 EIF3S4 -547 RAB3D -852 MGC20983 -338 NDUFB7 -741 LOC339377 -660 IL12RB1 -56 LOC148198 -361 CEBPA -564 UNQ467 -521 FLJ22573 -941 CLC -823 DYRK1B -849 PSG11 -297 PSG4 -299 PSG9 -435 FLJ34222 -415 ERCC2 -123 DMPK -988 PGLYRP1 -212 LIG1 -806 FLJ32926 -288 CGB8 -202 TEAD2 -546 FLJ20643 -895 LOC400712 -236 SIGLEC6 -972 ZNF577 -582 ZNF611 -148 ZNF600 -716 ZNF600 -37 NALP9 -489 PRDM2 -762 LOC400743 -400 PADI1 -598 FLJ44952 -494 DJ462O23.2 -973 PPP1R8 5 ATPIF1 -766 LOC440581 -793 CGI-94 -384 FLJ14351 -753 UROD -715 LOC441885 -810 DKFZp761D221 -478 DKFZp761D221 -221 IL23R -322 CTH -6 AK5 -966 DNAJB4 -987 CDC7 -604 LOC388649 -426 DCLRE1B -406 LOC440610 -739 LOC440610 -584 LOC440610 -652 LOC441903 -538 LOC440673 -482 BNIPL -420 BNIPL -419 SPRR1B -826 IL6R -110 CKS1B -983 SYT11 -785 PMF1 -223 LOC164118 -75 FY -397 NCSTN -809 HSPA6 -839 HSPA6 -611 CGI-01 7 DKFZP564J047 -208 HFL1 -551 HFL3 -563 NEK7 -714 MGC14801 -276 OR2AK2 -528 LOC441873 -501 LOC441873 -565 LOC441873 -607 LOC343068 -256ARID3A -913 LOC126295 -456 MGC39581 -37 TRAPPC5 -352 LOC51257 9 OR7C2 -399 OR10H3 -953 OR10H4 -323 LOC284434 -560 HSPC142 -632 PGLS -935 LOC148206 -288 ZNF431 -967CLECSF12 -885 1205 GRE-modifying SNPs

191 Gene set enrichment analysis

192 Technical aspects of study design and data analysis Study designs, assay technologies, and statistical methods 1.“Gene discovery” (e.g., genetic epidemiology) Candidate gene studies Genome-wide association studies The bioinformatic “middle road” – biological hypotheses buy power - Candidate set selection a.Regulatory polymorphism b.Coding polymorphism - Statistical considerations a.Power b.Differential enrichment

194 Population prevalence design GEscan Outcome-stratified design

199 Technical take-home points: Strengths & weaknesses of alternative approaches
1. Candidate gene studies: focus on 1 candidate
   Advantages
   - Scientifically tractable: incremental & cross-validatable
   - Maximal statistical power (focused hypothesis)
   Disadvantages
   - Can only “discover” what we already know (i.e., biased)
2. Genome-wide association studies: focus on all candidates
   Advantages
   - Unbiased de novo discovery
   Disadvantages
   - Minimal statistical power, particularly for interactions
3. The bioinformatic “middle road”: focus on a small set of causally plausible candidates (unbiased search of regulatory and coding SNPs)
   Advantages
   - Scientifically tractable: “short leap of inference” & cross-validatable
   - Relatively high statistical power (focus on 1-10% of plausible SNPs)
   Disadvantages
   - Likely missing some true causal genetic influences
   - Bioinformatically intensive: thought (and programming) required
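The power trade-off across the three approaches can be illustrated with a rough calculation. The sketch below is not from the workshop: it assumes a two-sided z-test with Bonferroni correction, a hypothetical true effect whose expected z statistic is 4, and made-up SNP counts for each approach.

```python
from statistics import NormalDist

def approx_power(effect_z: float, n_tests: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test after Bonferroni correction.

    effect_z is the expected z statistic for a true effect (an assumption);
    n_tests is the total number of SNPs tested.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - (alpha / n_tests) / 2)
    # Probability that the statistic lands beyond either critical value.
    return (1 - norm.cdf(z_crit - effect_z)) + norm.cdf(-z_crit - effect_z)

# Hypothetical numbers of SNPs tested under each approach (illustration only).
scenarios = {
    "candidate gene (1 SNP)": 1,
    "middle road (10,000 SNPs)": 10_000,
    "genome-wide (1,000,000 SNPs)": 1_000_000,
}
power = {name: approx_power(effect_z=4.0, n_tests=m) for name, m in scenarios.items()}
for name, p in power.items():
    print(f"{name}: power ~ {p:.2f}")
```

Under these assumptions the same effect is detected almost always as a single candidate and rarely in a genome-wide sweep, with the restricted candidate set falling in between.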

208 Take-home points for this group:
1. Gene-Environment interactions are likely far more…
   - ubiquitous
   - large in effect size
   - clinically/socially meaningful
   …than current genetic analyses presume. There is plenty left for you to find.
2. If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve:
   - focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.)
   - modeling biological mechanisms to focus power/impose constraints (e.g., candidate systems, functional themes, regulatory themes)
   - combinatorial data-mining (e.g., machine learning in discovery sample)
   - sequential testing designs (low-stringency discovery, medium-stringency test, high-stringency confirmation)
Your advantage is smart data analysis.
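A sequential testing design of the kind listed above can be sketched as a chain of filters over independent samples. This is an illustration only: the function name `staged_screen`, the three cutoffs, and the p-values are all hypothetical, not taken from the workshop.

```python
def staged_screen(p_by_stage, thresholds=(0.05, 0.01, 0.001)):
    """Return the SNP ids that pass every stage in order.

    p_by_stage: list of dicts (one per independent sample), SNP id -> p-value.
    thresholds: per-stage p-value cutoffs, loosest first (illustrative values).
    """
    survivors = set(p_by_stage[0])
    for stage_p, cutoff in zip(p_by_stage, thresholds):
        # Only SNPs that survived earlier stages are carried forward;
        # a SNP missing from a stage is treated as not significant.
        survivors = {snp for snp in survivors if stage_p.get(snp, 1.0) <= cutoff}
    return survivors

# Made-up p-values for four hypothetical SNPs in three independent samples.
discovery = {"snpA": 0.03, "snpB": 0.20, "snpC": 0.001, "snpD": 0.04}
testing = {"snpA": 0.008, "snpC": 0.002, "snpD": 0.30}
confirmation = {"snpA": 0.02, "snpC": 0.0004}

confirmed = staged_screen([discovery, testing, confirmation])
print(sorted(confirmed))
```

Because each stage tests only the survivors of the previous one, the strict final threshold is applied to a handful of candidates rather than to the whole genome.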

209 Follow-up references
Overview of genetics / biology
- Attia, J., et al. (2009). How to use an article about genetic association: A: Background concepts. JAMA, 301, 74-81.
Genetic association studies
- Hirschhorn, J., & Daly, M. (2005). Genome-wide association studies for common diseases and complex traits. Nature Reviews Genetics, 6, 95-108.
- Attia, J., et al. (2009). How to use an article about genetic association: B: Are the results of the study valid? JAMA, 301, 191-197.
- Cordell, H., & Clayton, D. (2005). Genetic epidemiology 3: Genetic association studies. Lancet, 366, 1121-1131.
Basic statistical modeling for genetics
- Siegmund, D., & Yakir, B. (2007). The statistics of gene mapping. New York: Springer.
Sampling & statistical approaches for GxE discovery
- Thomas, D. (2010). Gene-environment-wide association studies: emerging approaches. Nature Reviews Genetics, 11, 259-272.
Statistical strategies for combinatorial discovery
- Hastie, T., Tibshirani, R., & Friedman, J. (2001). The elements of statistical learning. New York: Springer.

210 Perspectives on the State of the Field How can we best promote the integration of genetic and demographic approaches?

211 Application clinic Open microphone 1.What do you want to accomplish? 2.At what stage are you now? i.Study design? ii.Data collection? iii.Analysis and reporting? 3.How can we be of help?

214 Technical aspects of study design and data analysis Study designs, assay technologies, and statistical methods 1.“Gene discovery” (e.g., genetic epidemiology) Candidate gene studies Genome-wide association studies The bioinformatic “middle road” – biological hypotheses buy power 2.Environmental regulation of health (via transcription) Candidate transcript studies - RT-PCR - Statistical analyses incorporating temporal & spatial heterogeneity Genome-wide approaches - Microarrays - Theme discovery a.Functional (Gene Ontology) b.Regulatory (TELiS) c.Spatial (SpAnGEL)

215 RNA DNA RT

216 Antiviral cytokine mRNA. Two panels: IFN-α (IFN-α consensus mRNA, fold-induction over baseline) and IFN-β (IFN-β mRNA, fold-induction over baseline), each plotted against exposure (0, 6, 12 hrs.) for CpG vs. CpG + NE. Collado-Hidalgo et al. (2006) Brain, Behavior and Immunity

218 SIV RNA (in situ hybridization) SIV replication Social Stress - + SIV replication (sites / spatial quadrat) p <.0001 SNS neurons - + SIV replication (sites / spatial quadrat) p <.0001 Sloan et al. (2006) Journal of Virology Sloan et al. (2007) Journal of Neuroscience

221 Lonely Integrated Social isolation J. Cacioppo Genome Biology, 2007 78 131

222 Palmer et al. BMC Genomics (2006)

224 Social Environment Gene Biological function IL6 RNA DNA

231 Lonely Integrated Social isolation J. Cacioppo Genome Biology, 2007 78 131 Inflammation Cell growth/differentiation Transcription control Immunoglobulin production Type I interferon antiviral response

233 TRIM54 ACSBG2 HIST4H4 KLHL32 FLJ35773 GPC4 TRPV4 LBP C20ORF200 ASB15 OCLM http://www.gostat.wehi.edu.au
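Functional theme discovery asks whether a Gene Ontology category appears among the differentially expressed genes more often than chance would predict. The sketch below shows one common way to test this, a one-sided hypergeometric test. It is not necessarily the exact test GOstat applies, and the gene counts are hypothetical.

```python
from math import comb

def enrichment_p(n_genome, n_annotated, n_selected, n_hits):
    """One-sided hypergeometric p-value, P(X >= n_hits).

    n_genome:    genes measured on the array
    n_annotated: genes on the array carrying the GO term
    n_selected:  differentially expressed genes
    n_hits:      differentially expressed genes carrying the GO term
    """
    upper = min(n_annotated, n_selected)
    total = comb(n_genome, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_genome - n_annotated, n_selected - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total

# Hypothetical counts: 20,000 genes assayed, 500 carry the term,
# 200 are differentially expressed, 15 of those carry the term (5 expected by chance).
p_value = enrichment_p(20_000, 500, 200, 15)
print(f"enrichment p = {p_value:.2e}")
```

Exact integer arithmetic via `math.comb` avoids the rounding problems that factorial-based formulas run into at genome scale.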

238 Sp1 CREB NF-κB

241 Sp1 CREB NF-κB Environment Sequence Expression Promoter Sequence

246 Sp1 CREB NF-κB Environment Sequence Expression Promoter Sequence ?

251 Cole et al (2005) Bioinformatics, 21, 803 http://www.telis.ucla.edu

256 Lonely Integrated Social isolation J. Cacioppo Genome Biology, 2007 78 131 NF-κB GRE

262 NaB de-repression - fibroblast

263-271 [Diagram built up over nine slides: a column of genes 1-41, with regulators added in turn: TF 1, TF 2, TF 3; then miRNA 1, miRNA 2, miRNA 3; then DNMT1, DNMT2, DNMT3]

274 Technical aspects of study design and data analysis Study designs, assay technologies, and statistical methods 1.“Gene discovery” (e.g., genetic epidemiology) Candidate gene studies Genome-wide association studies The bioinformatic “middle road” – biological hypotheses buy power 2.Environmental regulation of health (via transcription) Candidate transcript studies Genome-wide approaches 3.Gene-Environment interaction Statistical considerations - Main effects and antagonistic pleiotropy - Interaction models - Combinatorial discovery Revisiting the “bioinformatic” middle road - Candidate set selection a.Regulatory polymorphism b.Coding polymorphism

275 Fisher’s regression. Outcome by genotype (GG, GC, CC): y = a + b(#G); y = a + b(GG) + c(GC) + d(CC)

276 Fisher’s regression. Outcome by genotype (GG, GC, CC) in Environment A vs. Environment B: y = a + b(#G) + c(Env) + d(#G x Env); y = a + b(GG) + c(GC) + d(CC) + e(Env) + f(Env x GG) + g(Env x GC) + h(Env x CC)
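The allele-count interaction model, y = a + b(#G) + c(Env) + d(#G x Env), can be fit by ordinary least squares. The sketch below is a minimal pure-Python illustration on synthetic, noise-free data; the helper names and coefficient values are assumptions for the example. In practice a statistics package would be used, and the genotype-indicator form of the model would drop one genotype as the reference level so that it stays identifiable alongside the intercept.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_gxe(n_g, env, y):
    """Least-squares fit of y = a + b*(#G) + c*Env + d*(#G x Env)."""
    rows = [[1.0, g, e, g * e] for g, e in zip(n_g, env)]
    k = 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Synthetic data: G-allele count 0/1/2 (CC, GC, GG) in environment 0 (A) or 1 (B).
n_g = [0, 1, 2, 0, 1, 2]
env = [0, 0, 0, 1, 1, 1]
y = [1.0 + 0.5 * g + 2.0 * e + 1.5 * g * e for g, e in zip(n_g, env)]

a, b, c, d = fit_gxe(n_g, env, y)
print(f"a={a:.2f} b={b:.2f} c={c:.2f} d={d:.2f}")
```

The coefficient d is the interaction term: it measures how much the per-allele slope differs between the two environments.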

278 Combinatorial explosion: 10^7 SNPs x 10^1-2 environments = 10^8-9 interaction terms. N = 2,000-20,000 for current main effect studies. Given that power/effect size, need 2 million subjects for interaction sweep.
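The arithmetic behind the explosion is easy to reproduce. The sketch below counts interaction terms for 10 and 100 environments and shows the per-test significance level a Bonferroni correction at 0.05 would demand. The choice of Bonferroni and of 0.05 is an assumption for illustration.

```python
N_SNPS = 10**7

# Number of SNP x environment interaction terms for 10 and 100 environments.
interaction_terms = {n_env: N_SNPS * n_env for n_env in (10, 100)}

# Per-test alpha if a family-wise error rate of 0.05 is spread over all terms.
per_test_alpha = {n_env: 0.05 / terms for n_env, terms in interaction_terms.items()}

for n_env, terms in interaction_terms.items():
    print(f"{n_env} environments: {terms:.0e} terms, per-test alpha {per_test_alpha[n_env]:.0e}")
```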

279 What to do? 1.Increase stringency (intra-study) Bonferroni correct / FDR correct Model/simulate error Use a better sampling design 2.Replicate (inter-study or intra-study crossvalidation) 3.Get a hypothesis -Biological -Empirical
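The two corrections named under "Increase stringency" can be compared on a small example. The sketch below implements Bonferroni and the Benjamini-Hochberg FDR step-up procedure in plain Python; the p-values are made up for illustration.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject when p <= alpha / m (controls the family-wise error rate)."""
    cutoff = alpha / len(pvals)
    return [p <= cutoff for p in pvals]

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the false discovery rate)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k (1-based) with p_(k) <= alpha * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= alpha * rank / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Made-up p-values for ten hypothetical tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bonf = bonferroni(pvals)
bh = benjamini_hochberg(pvals)
print("Bonferroni rejections:", sum(bonf))
print("BH (FDR) rejections:", sum(bh))
```

FDR control is the less conservative of the two, which is why it is often preferred for discovery-stage screens where some false positives are tolerable.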

286 Combinatorial discovery strategies Smart study design + smart statistics + biological constraint Stratified sampling Multi-stage testing Cross-validation Data-mining / Machine learning -CART/forests -MARS -PRIM Functional pathways Regulatory pathways Chromosomal units

287 In silico prediction of Gene x Environment Interaction. In silico: IL6 TCT TGCGATGCTA AAG C; V$GATA1_01 = .943, V$GATA1_01 = .619. In vitro: Transcriptional activity (fold-change, axis 0-10); IL6 promoter: WT vs. -174C; Norepinephrine (µM): 0, 10 for each promoter; Difference: p < .0001
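The in silico step scores each allele of the promoter against a transcription factor binding motif and compares the two scores. The sketch below illustrates the idea with a toy position count matrix. The matrix values are invented and are NOT the TRANSFAC V$GATA1_01 matrix, so the scores will not reproduce .943 and .619. The placement of the G-to-C substitution within the sequence is an assumption based on the slide layout, and only one strand is scanned.

```python
# Toy position count matrix (one dict per motif position). Values are invented
# for illustration and do not come from TRANSFAC.
TOY_MATRIX = [
    {"A": 4, "C": 1, "G": 1, "T": 4},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 6, "C": 1, "G": 2, "T": 1},
    {"A": 4, "C": 2, "G": 3, "T": 1},
]

def similarity(window):
    """Matrix similarity rescaled to 0-1 (worst possible match = 0, best = 1)."""
    raw = sum(col[base] for col, base in zip(TOY_MATRIX, window))
    lo = sum(min(col.values()) for col in TOY_MATRIX)
    hi = sum(max(col.values()) for col in TOY_MATRIX)
    return (raw - lo) / (hi - lo)

def best_score_at(seq, snp_index):
    """Best similarity among motif-width windows that overlap the SNP position."""
    width = len(TOY_MATRIX)
    first = max(0, snp_index - width + 1)
    last = min(snp_index, len(seq) - width)
    return max(similarity(seq[s:s + width]) for s in range(first, last + 1))

# Sequence as read from the slide; the variant differs only at index 6 (assumed).
wild_type = "TCTTGCGATGCTAAAG"
variant = "TCTTGCCATGCTAAAG"
snp_index = 6

wt_score = best_score_at(wild_type, snp_index)
var_score = best_score_at(variant, snp_index)
print(f"wild type: {wt_score:.3f}, variant: {var_score:.3f}")
```

A drop in motif similarity for one allele is what flags a SNP as a candidate regulatory polymorphism, to be checked in vitro as the slide's second panel does.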

288 1205 GRE-modifying SNPs [Figure: genome-wide list of entries, each labeled by gene symbol and a numeric position, e.g., FLJ20719 -734, PPARG -584, IL6R -110.]

289 Population prevalence design GEscan Outcome-stratified design

290 Coding sequence polymorphisms

291 [Figure: diagram of gene 1 through gene 41 and TF 1 through TF 3.]

292 Combinatorial discovery strategies Smart study design + smart statistics + biological constraint Stratified sampling Multi-stage testing Cross-validation Data-mining / Machine learning -CART/forests -MARS -PRIM Functional pathways Regulatory pathways Chromosomal units

293 Combinatorial discovery strategies Smart study design + smart statistics + biological constraint Stratified sampling Multi-stage testing Cross-validation Data-mining / Machine learning -CART/forests -MARS -PRIM Functional pathways Regulatory pathways Chromosomal units Why is this critical?

294 Combinatorial discovery strategies Smart study design + smart statistics + biological constraint Stratified sampling Multi-stage testing Cross-validation Data-mining / Machine learning -CART/forests -MARS -PRIM Functional pathways Regulatory pathways Chromosomal units Why is this critical? Antagonistic pleiotropy is the norm → GxE

295 Combinatorial discovery strategies Smart study design + smart statistics + biological constraint Stratified sampling Multi-stage testing Cross-validation Data-mining / Machine learning -CART/forests -MARS -PRIM Functional pathways Regulatory pathways Chromosomal units Why is this critical? Antagonistic pleiotropy is the norm → GxE Epistatic interaction is the norm → GxG

296 Combinatorial discovery strategies Smart study design + smart statistics + biological constraint Stratified sampling Multi-stage testing Cross-validation Data-mining / Machine learning -CART/forests -MARS -PRIM Functional pathways Regulatory pathways Chromosomal units Why is this critical? Antagonistic pleiotropy is the norm → GxE Epistatic interaction is the norm → GxG High-order interactions are likely normal → GxGxExE

297 Combinatorial discovery strategies Smart study design + smart statistics + biological constraint Stratified sampling Multi-stage testing Cross-validation Data-mining / Machine learning -CART/forests -MARS -PRIM Functional pathways Regulatory pathways Chromosomal units Why is this critical? Antagonistic pleiotropy is the norm → GxE Epistatic interaction is the norm → GxG High-order interactions are likely normal → GxGxExE Low power, “replication failure”, and epistemological slop - the missing “h”, and the missing “E”

298 Technical aspects of study design and data analysis Study designs, assay technologies, and statistical methods 1.“Gene discovery” (e.g., genetic epidemiology) Candidate gene studies Genome-wide association studies The bioinformatic “middle road” – biological hypotheses buy power 2.Environmental regulation of health (via transcription) Candidate transcript studies Genome-wide approaches 3.Gene-Environment interaction Statistical considerations - Main effects and antagonistic pleiotropy - Interaction models - Combinatorial discovery Revisiting the “bioinformatic” middle road - Candidate set selection a.Regulatory polymorphism b.Coding polymorphism

299 Technical aspects of study design and data analysis Study designs, assay technologies, and statistical methods 1.“Gene discovery” (e.g., genetic epidemiology) Candidate gene studies Genome-wide association studies The bioinformatic “middle road” – biological hypotheses buy power 2.Environmental regulation of health (via transcription) Candidate transcript studies Genome-wide approaches 3.Gene-Environment interaction Statistical considerations Revisiting the “bioinformatic” middle road

300 Take-home points for this group:

301 1.Gene-Environment interactions are likely far more… - ubiquitous - large in effect size - clinically/socially meaningful …than current genetic analyses presume.

302 Take-home points for this group: 1.Gene-Environment interactions are likely far more… - ubiquitous - large in effect size - clinically/socially meaningful …than current genetic analyses presume. There is plenty left for you to find.

303 Take-home points for this group: 1.Gene-Environment interactions are likely far more… - ubiquitous - large in effect size - clinically/socially meaningful …than current genetic analyses presume. There is plenty left for you to find. 2.If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve:

304 Take-home points for this group: 1.Gene-Environment interactions are likely far more… - ubiquitous - large in effect size - clinically/socially meaningful …than current genetic analyses presume. There is plenty left for you to find. 2.If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve: - focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.)

305 Take-home points for this group: 1.Gene-Environment interactions are likely far more… - ubiquitous - large in effect size - clinically/socially meaningful …than current genetic analyses presume. There is plenty left for you to find. 2.If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve: - focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.) - modeling biological mechanisms to focus power/impose constraints (e.g., candidate systems, functional themes, regulatory themes)

306 Take-home points for this group: 1.Gene-Environment interactions are likely far more… - ubiquitous - large in effect size - clinically/socially meaningful …than current genetic analyses presume. There is plenty left for you to find. 2.If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve: - focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.) - modeling biological mechanisms to focus power/impose constraints (e.g., candidate systems, functional themes, regulatory themes) - combinatorial data-mining (e.g., machine learning in discovery sample)

307 Take-home points for this group: 1.Gene-Environment interactions are likely far more… - ubiquitous - large in effect size - clinically/socially meaningful …than current genetic analyses presume. There is plenty left for you to find. 2.If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve: - focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.) - modeling biological mechanisms to focus power/impose constraints (e.g., candidate systems, functional themes, regulatory themes) - combinatorial data-mining (e.g., machine learning in discovery sample) - sequential testing designs (low stringency discovery, med stringency test, high stringency confirm)

308 Take-home points for this group:
1. Gene-Environment interactions are likely far more…
   - ubiquitous
   - large in effect size
   - clinically/socially meaningful
   …than current genetic analyses presume. There is plenty left for you to find.
2. If you have the study you have (i.e., can’t alter sampling design), your major opportunities for increasing power/discovery involve:
   - focusing on substantive effects that are true/big (e.g., GxE, not G, given antagonistic pleiotropy; E, ExE, GxG, etc.)
   - modeling biological mechanisms to focus power/impose constraints (e.g., candidate systems, functional themes, regulatory themes)
   - combinatorial data-mining (e.g., machine learning in discovery sample)
   - sequential testing designs (low stringency discovery, med stringency test, high stringency confirm)
Your advantage is smart data analysis.
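The sequential testing design in the last bullet can be sketched as a filter that carries candidate terms through independent samples under progressively stricter thresholds. The thresholds, term names, and p-values below are invented for illustration; the slides give only the qualitative low/medium/high ordering.

```python
# Sketch: three-stage screen (discovery -> test -> confirm).

def sequential_screen(stage_p_values, thresholds):
    """Keep only the terms that pass every stage in order.

    stage_p_values: one dict {term: p-value} per independent sample.
    thresholds: one p-value cutoff per stage, loosest first.
    """
    survivors = set(stage_p_values[0])
    for p_values, alpha in zip(stage_p_values, thresholds):
        survivors = {t for t in survivors
                     if t in p_values and p_values[t] <= alpha}
    return survivors


# Hypothetical p-values from three independent samples.
discovery = {"snp1xE": 0.04, "snp2xE": 0.20, "snp3xE": 0.008, "snp4xE": 0.09}
test = {"snp1xE": 0.03, "snp3xE": 0.004, "snp4xE": 0.002}
confirm = {"snp1xE": 0.02, "snp3xE": 0.0005}

print(sequential_screen([discovery, test, confirm], (0.10, 0.01, 0.001)))
```

Only terms that survive the earlier, cheaper stages are tested later, so the strict final threshold is applied to far fewer hypotheses than a single-stage sweep would face.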

309 Follow-up references
Overview of genetics / biology
- Attia, J., et al. (2009) How to use an article about genetic association: A: Background concepts. JAMA, 301, 74-81.
Genetic association studies
- Hirschhorn, J., & Daly, M. (2005) Genome-wide association studies for common diseases and complex traits. Nature Reviews Genetics, 6, 95-108.
- Attia, J., et al. (2009) How to use an article about genetic association: B: Are the results of the study valid? JAMA, 301, 191-197.
- Cordell, H., & Clayton, D. (2005) Genetic epidemiology 3: Genetic association studies. Lancet, 366, 1121-1131.
Basic statistical modeling for genetics
- Siegmund, D., & Yakir, B. (2007) The statistics of gene mapping. New York, Springer.
Sampling & statistical approaches for GxE discovery
- Thomas, D. (2010) Gene-environment-wide association studies: emerging approaches. Nature Reviews Genetics, 11, 259-272.
Statistical strategies for combinatorial discovery
- Hastie, T., Tibshirani, R., & Friedman, J. (2001) The elements of statistical learning. New York, Springer.

310 Perspectives on the State of the Field How can we best promote the integration of genetic and demographic approaches?

311 Application clinic Open microphone 1.What do you want to accomplish? 2.At what stage are you now? i.Study design? ii.Data collection? iii.Analysis and reporting? 3.How can we be of help?

312 Genomics Workshop Demography of Aging Centers Biomarker Network Meeting in Conjunction with the Annual Meeting of the PAA April 14, 9:00 AM to 3:30 PM – Hyatt Regency, Dallas, Texas Sponsored by USC/UCLA Center of Biodemography and Population Health Organized by Teresa Seeman, Steven Cole, Eileen Crimmins

313

314 Richlin et al. Brain, Behavior & Immunity (2004)

