Selection of Candidate Genes for Population Studies Zuo-Feng Zhang, MD, PhD Epidemiology 243: Molecular Epidemiology.


1 Selection of Candidate Genes for Population Studies Zuo-Feng Zhang, MD, PhD Epidemiology 243: Molecular Epidemiology

2 Gene Selection for Molecular Studies Selection of putative genetic factors is a central issue in molecular epidemiological studies, even though the selection of putative risk factors is equally important, because the focus of molecular epidemiology is the assessment of gene-environment interaction.

3 Two Types of Genes: High Risk Genes and Low Risk Genes

4 Familial Disease Genes (High Risk Genes):
-High penetrance
-High AR/RR
-Gene frequency: low (<1%)
-Study setting: family
-Study type: linkage
-PAR: low
-Role of environment: modest

5 Examples of High Risk Genes: mutations of the TP53 gene; BRCA1 and BRCA2; RB gene mutations

6 Susceptibility Genes (Low Risk Genes):
-Low penetrance
-Low AR/RR
-Gene frequency: high (>1%-90%)
-Study setting: population
-Study type: association
-PAR: high
-Role of environment: critical
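The contrast in PAR between the two gene types can be illustrated with Levin's formula, PAR = p(RR - 1) / (1 + p(RR - 1)), where p is the carrier frequency. The sketch below is only an illustration: the frequencies and relative risks are hypothetical values chosen to fall inside the ranges on the slides, not estimates for any particular gene.

```python
def population_attributable_risk(carrier_freq: float, relative_risk: float) -> float:
    """Levin's formula: PAR = p(RR - 1) / (1 + p(RR - 1))."""
    excess = carrier_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs, chosen only to illustrate the contrast.
high_risk = population_attributable_risk(carrier_freq=0.001, relative_risk=10.0)
low_risk = population_attributable_risk(carrier_freq=0.40, relative_risk=1.5)

print(f"rare, high-RR variant:  PAR = {high_risk:.3f}")
print(f"common, low-RR variant: PAR = {low_risk:.3f}")
```

Under these assumed inputs, the rare high-penetrance variant accounts for a much smaller share of cases in the population than the common low-penetrance variant, which is the pattern the two slides describe.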

7 Approaches for High Risk Genes
-Functional approach (forward genetics): from genotype to phenotype
-Positional approach (reverse genetics): from phenotype to genotype

8 Functional Approach: An Example A cell line is created from patients with DNA repair defects; a certain fragment of a human chromosome is added; a repair-competent phenotype is produced.

9 Positional Approach: linkage analysis; loss of heterozygosity (LOH); chromosome abnormalities

10 Linkage analysis
-It is a method to identify disease loci
-Family based; needs a sufficient sample size
-Germline DNA from affected and unaffected individuals
-A genetic mechanism (autosomal dominant/recessive)
-A set of markers

11 Loss of Heterozygosity (LOH)
-Needs both normal and tumor tissues
-The loss of signal in the targeted tissue (tumor) in comparison with normal tissue
-If LOH is consistently observed in a particular region, this indicates that an important gene lies in that region

12 Chromosome Abnormalities: deletion; insertion; microsatellite instability

13 In-depth Approaches to Identify Candidate Genes When the above three methods indicate a region of a chromosome, further work is needed to identify particular candidate genes:
-Mutation screening
-Restoration of the normal phenotype by transfection of a normal allele
-Mouse model of disease by introducing defective mutations

14 Approaches for Low Risk Genes Linkage analysis may not be feasible because it requires a relatively large sample size (if the OR = 2, 2,500 families would be needed).

15 Approaches for Low Risk Genes New techniques will be needed to identify low risk susceptibility genes:
-Automated microarray gene chips
-SNP identification

16 Selection of Putative Genes (1) Inter-individual variation in the trait exists in the population.
-If there is very small variation of the phenotype in the population, the rationale to examine the genotype is weak.
-If there is a very large variation of the phenotype, other potential factors need to be considered.

17 Selection of Putative Genes (2) The gene is involved in a process related to carcinogenesis:
-DNA repair
-Chromosome stability
-Activities of oncogenes/tumor suppressor genes
-Cell cycle control/signal transduction

18 Selection of Putative Genes (3) The trait exhibits an inheritance pattern consistent with Mendelian transmission. Any phenotype should have a genetic basis.

19 Selection of Putative Genes (3) Certain phenotypes, such as "mutagen sensitivity", have been reported to be associated with many smoking-related cancers. However, the precise nature of this susceptibility factor remains incompletely understood because the genotype associated with mutagen sensitivity is still unclear.

20 Selection of Putative Genes (4) Gene action exists in the relevant organ.
-CYP1A1 is largely absent from liver, but present in lung
-CYP2D6 is expressed in brain
-GSTM1 has some expression in lung
-GSTP1 is expressed in lung

21 Selection of Putative Genes (5) Gene location and characterization.
-Similar gene structure may indicate similar function
-Most mutations occur in the coding sequence, but mutations in intragenic noncoding regions may also occur
-A specific point mutation may indicate a specific exposure

22 Selection of Putative Genes: polymorphisms and mutation; gene-gene interactions; animal models; human studies; genotype-phenotype; relation to disease; ethnic variation

23 Selection of Putative Genes Gene-gene interaction (phase I and phase II).
-CYP1A1 and GSTM1 and lung cancer risk, PAH (carcinogens)
-CYP2A6 and CYP2D6, NNK
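One common way to quantify a gene-gene interaction of this kind is to compare the joint odds ratio with what the two single-gene odds ratios would predict. The sketch below assumes three odds ratios are available, each relative to people carrying neither risk genotype. The function name and the numbers are hypothetical and are not taken from any study cited in the deck.

```python
def interaction_measures(or_gene_a_only: float, or_gene_b_only: float, or_both: float):
    """Return (multiplicative interaction ratio, RERI) from stratified odds ratios.

    A ratio above 1 suggests more than multiplicative joint effects;
    a RERI above 0 suggests more than additive joint effects.
    """
    multiplicative_ratio = or_both / (or_gene_a_only * or_gene_b_only)
    reri = or_both - or_gene_a_only - or_gene_b_only + 1.0
    return multiplicative_ratio, reri


# Hypothetical odds ratios, for illustration only.
ratio, reri = interaction_measures(or_gene_a_only=1.5, or_gene_b_only=2.0, or_both=4.5)
print(f"multiplicative ratio = {ratio:.2f}, RERI = {reri:.2f}")
```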

24 2-1. Background: The summary of characteristics and significance of the genes of interest.

25 2-1. Background: Theoretical model of gene-gene/environmental interaction pathway [Diagram. Elements shown: tobacco consumption, occupational exposures, environmental exposure; environmental carcinogens/procarcinogens exposures (PAHs, Xenobiotics, Arene, Alkine, etc.); "?"; carcinogenesis.]

26 2-1. Background: Theoretical model of gene-gene/environmental interaction pathway [Diagram, next build. Elements added: active carcinogens; detoxified carcinogens; genes CYP1A1, GSTP1, mEH, NQO1, GSTM1.]

27 2-1. Background: Theoretical model of gene-gene/environmental interaction pathway [Diagram, next build. Elements added: DNA damage; normal cell; "DNA damage repaired"; "If DNA damage not repaired"; defective DNA repair gene; gene XRCC1.]

28 2-1. Background: Theoretical model of gene-gene/environmental interaction pathway [Diagram, next build. Elements added: cell cycle phases G, S, G2, M; "If cell cycle control is lost"; carcinogenesis; programmed cell death.]

29 2-1. Background: Theoretical model of gene-gene/environmental interaction pathway [Diagram, final build. Elements added: genes P53, Cyclin D1, P16; polymorphism labels Ile 105 Val, Ala 114 Val, Tyr 113 His, His 139 Arg, Pro 187 Ser, MspI, Ile 462 Val, Arg 194 Trp, Arg 399 Gln, Arg 280 His, Null, Ala 146 Thr, Arg 72 Pro, G 870 A.]

30 2-1. Background: The summary of epidemiological literature for the genes of interest

37 UCLA Prostate Cancer SPORE Development Project Single Nucleotide Polymorphisms (SNPs) of Genes in the DNA Double Strand Break Repair (DSBR) Pathways and Risk of Prostate Cancer, A Preliminary Study Zuo-Feng Zhang, MD, PhD Department of Epidemiology UCLA School of Public Health

38 Epidemiological Observations: Involvement of DSBR Pathway Genes in Prostate Cancer Risk The risk of prostate cancer is known to be elevated in carriers of germline mutations in BRCA2. Increased risk of prostate cancer is also observed in carriers of BRCA1 and CHEK2 mutations, and is also associated with SNPs of the ATM gene. These observations indicate possible involvement of DNA DSBR pathway genes.

39 [Diagram. Genes shown: BRCA2, BRCA1, ATM, CHEK2 (RAD53). Pathway labels: homologous recombination; non-homologous recombination; damage recognition cell cycle delay response (DRCCD).]

40 Hypotheses Single nucleotide polymorphisms (SNPs) of genes in the DNA double strand break repair (DSBR) pathways may be associated with susceptibility to prostate cancer. We further hypothesize that SNPs of the DSBR pathway genes may interact with each other and may modify the effects of environmental factors on the risk of prostate cancer.

41 Specific Aim 1 To assay single nucleotide polymorphisms (SNPs) of genes in the double strand break (DSB) repair pathway, including RAD51, RAD52, RAD54L, NBS1, XRCC2, XRCC3, BRCA1, and BRCA2 in homologous recombinational repair (HRR); LIG4 and XRCC4 in non-homologous end-joining (NHEJ); and ATM, BRCA1, CHEK1, CHEK2 (RAD53), P53, and HUS1 in the damage recognition cell cycle delay response (DRCCD) pathway.

42 Specific Aim 2 To evaluate the independent effects of SNPs of the DSB repair pathway when potential confounding factors, such as age, race, and education, are controlled, and to assess potential combined effects of SNPs. To explore possible effect modification between the SNPs and nutritional factors on the risk of prostate cancer.

43 Proposed Experimental Approach This study is based on a case-control study with a total of 122 cases with prostate cancer and 135 healthy controls. All cases and controls were interviewed by a research nurse using a standard epidemiological questionnaire at MSKCC from 1993 to 1997. Blood samples and tumor tissue specimens were collected. The SNPs will be genotyped in individual DNA samples using the SNPlex platform by ABI. The UCLA Sequencing and Genotyping Core Facility has recently added Applied Biosystems' high-throughput SNP genotyping assay, SNPlex, to the available services. This assay is flexible, robust, and highly reproducible.

44 www.genetics.ucla.edu/genotyping JCCC Genotyping Core: ABI SNPlex, a New High Throughput Approach to Identify SNPs of Susceptibility Genes

45 Zhang Lab SNP Genotyping Pilot Project
-75% passed design process
-48 SNPs chosen for first pool
-Whole genome amplification of DNA for 3080 samples
-122,496 SNPs genotyped since January

46 Preliminary Results
-99.4% reproducibility by automated scoring
-99.7% reproducibility by manual scoring
-6 SNPs never worked
-96% call rate of remaining markers
-Comparable to results reported by other labs

47 Study Population

48 Progress of the Study For Specific Aim 1, we have assayed selected single nucleotide polymorphisms (SNPs) of genes in the double strand break repair (DSBR) pathway, including BRCA1, NBS1, TP53, APEX1, CHEK1, CHEK2, and ATM, in 68 cases with prostate cancer and 90 healthy male controls using the ABI SNPlex platform.

49 Progress of the Study For Specific Aim 2, we explored the independent effects of SNPs of the DSBR pathway genes mentioned above, controlling for potential confounding factors such as age, race, and education.

51 Results of Preliminary Study The adjusted ORs are:
-4.6 (95% CI: 0.6-34.1) for BRCA1 (rs8176109)
-5.0 (95% CI: 1.1-22.3) for NBS1 (rs9995)
-3.1 (95% CI: 0.46-21.2) for TP53 (rs2909430)
-2.0 (95% CI: 0.49-8.02) for APEX1 (rs3136820)
-2.6 (95% CI: 0.58-11.6) for CHEK1 (rs506504)
-0.6 (95% CI: 0.17-2.3) for ATM (rs228591)
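A quick way to read these estimates is to check which confidence intervals exclude the null value of 1, and to back-calculate an approximate Wald z statistic from each interval. The sketch below uses only the ORs and intervals listed on the slide. The z values rest on the assumption that each interval is symmetric on the log scale, which the slide does not state, so they are rough.

```python
import math

# Adjusted ORs and 95% CIs as listed on the slide: (gene, rsID, OR, lower, upper)
RESULTS = [
    ("BRCA1", "rs8176109", 4.6, 0.6, 34.1),
    ("NBS1", "rs9995", 5.0, 1.1, 22.3),
    ("TP53", "rs2909430", 3.1, 0.46, 21.2),
    ("APEX1", "rs3136820", 2.0, 0.49, 8.02),
    ("CHEK1", "rs506504", 2.6, 0.58, 11.6),
    ("ATM", "rs228591", 0.6, 0.17, 2.3),
]


def ci_excludes_null(lower: float, upper: float) -> bool:
    """True when the interval lies entirely above or entirely below OR = 1."""
    return lower > 1.0 or upper < 1.0


def approx_wald_z(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate z = ln(OR) / SE, with SE recovered from the 95% CI width.

    Assumes the interval is symmetric on the log scale (an assumption).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    return math.log(odds_ratio) / se


for gene, rsid, or_value, lower, upper in RESULTS:
    status = "excludes 1" if ci_excludes_null(lower, upper) else "includes 1"
    z = approx_wald_z(or_value, lower, upper)
    print(f"{gene:6s} {rsid:10s} OR={or_value:<4} z~{z:5.2f}  CI {status}")
```

The wide intervals reflect the small sample (68 cases and 90 controls), which is consistent with the deck describing these results as preliminary.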

52 Future Plan We will continue our proposed specific aims by assaying additional SNPs in the DSBR pathway genes as well as other pathways including other DNA repair pathways, metabolic, inflammatory, and cell cycle pathways among prostate cancer cases and controls. We will explore the independent effect of those SNPs on the risk of prostate cancer. We will also add the haplotype tagging SNPs of the DSBR pathways in order to identify haplotypes associated with prostate cancer risk. Those additional studies will have a greater impact on the translational research objectives of the SPORE.

53 BRCA1 Haplotypes and Risk of Prostate Cancer

54 Translational Potential of the Study Our results, based on a relatively small sample size, suggest potential involvement of SNPs of the DSBR pathway genes in the development of prostate cancer. If confirmed by studies with larger sample sizes, SNPs in DSBR pathway genes may be used in individual risk assessment and in the identification of high risk populations for intervention and chemoprevention.

56 The Selection Criteria of SNPs
-Functional SNPs, if possible amino-acid-changing SNPs
-SNPs in the functional region of the gene, or SNPs without amino acid changes that are hypothesized to affect the transcription/translation of the protein
-The minor (rare) allele frequency of the SNP must be equal to or higher than 5% in the general population
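The criteria above can be read as a simple filter: a SNP needs some functional rationale and a minor allele frequency of at least 5%. The sketch below is one possible reading, not the study's actual pipeline. The class, field names, and SNP records are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSnp:
    snp_id: str                    # placeholder identifier, not a real rsID
    minor_allele_freq: float
    changes_amino_acid: bool
    in_functional_region: bool
    may_affect_expression: bool    # hypothesized effect on transcription/translation


def passes_selection(snp: CandidateSnp, min_maf: float = 0.05) -> bool:
    """Keep SNPs with any functional rationale and MAF >= min_maf."""
    functional = (
        snp.changes_amino_acid
        or snp.in_functional_region
        or snp.may_affect_expression
    )
    return functional and snp.minor_allele_freq >= min_maf


# Hypothetical records, for illustration only.
candidates = [
    CandidateSnp("snp_a", 0.12, True, False, False),
    CandidateSnp("snp_b", 0.02, True, False, False),    # too rare
    CandidateSnp("snp_c", 0.30, False, False, False),   # no functional rationale
    CandidateSnp("snp_d", 0.05, False, False, True),    # exactly at the cutoff
]

selected = [snp.snp_id for snp in candidates if passes_selection(snp)]
print(selected)
```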

60 Proposed Study of Lung Cancer among Non-smokers

61 Motives and Conceptual Framework for Study of Genetic Susceptibility to Lung Cancer among Non-smokers About 16% of male smokers and 10% of female smokers will eventually develop lung cancer, which suggests that exposures to other environmental carcinogens and individual genetic susceptibility may play an important role in lung cancer among non-smokers. It is suggested that 26% of lung cancers are associated with genetic susceptibility (Lichtenstein P, et al. NEJM, 2000). We hypothesize that variation in genetic susceptibility, or single nucleotide polymorphisms (SNPs) of genes in inflammation, DNA repair, and cell cycle control pathways, may be important in the development of lung cancer among non-smokers.

62 Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, Pukkala E, Skytthe A, Hemminki K. NEJM, 2000

63 Theoretical model of gene-gene/environmental interaction pathway for lung cancer [Diagram. Elements shown: tobacco consumption, occupational exposures, environmental exposure; environmental carcinogens/procarcinogens exposures (PAHs, Xenobiotics, Arene, Alkine, etc.); active carcinogens; detoxified carcinogens; DNA damage; normal cell; "DNA damage repaired"; "If DNA damage not repaired"; defective DNA repair gene; "If cell cycle control is lost"; cell cycle phases G0, G1, S, G2, M; carcinogenesis; programmed cell death. Genes: CYP1A1, GSTP1, mEH, NQO1, XRCC1, GSTM1, P53, Cyclin D1, P16. Polymorphism labels: Ile 105 Val, Ala 114 Val, Tyr 113 His, His 139 Arg, Pro 187 Ser, MspI, Ile 462 Val, Arg 194 Trp, Arg 399 Gln, Arg 280 His, Null, Ala 146 Thr, Arg 72 Pro, G 870 A.]

64 Issues in genetic association studies
Many genes
-~25,000 genes, many can be candidates
Many SNPs
-~10,000,000 SNPs; ability to predict functional SNPs is limited
Methods to select SNPs:
-Only functional SNPs in a candidate gene
-Systematic screen of SNPs in a candidate gene
-Systematic screen of SNPs in an entire pathway
-Genomewide screen
-Systematic screen for all coding changes

65 Selection of SNPs (Genome-wide association studies)
-Molecular: higher requirements (Affymetrix and Perlegen)
-Analytical: highest requirements (data management, automation)
-Advantages: no biological assumptions, and can identify novel genes/pathways; excellent chance to identify risk alleles; utility in individual risk assessment
-Disadvantages: high costs; concern of multiple tests
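The multiple-testing concern can be made concrete with a little arithmetic. The sketch below assumes roughly 500,000 tests, matching the 500K array discussed elsewhere in the deck. It applies a Bonferroni correction, which is one common and conservative choice and is not necessarily the method the study planned to use.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Expected number of null SNPs passing an uncorrected per-test alpha."""
    return alpha * n_tests


N_SNPS = 500_000  # assumed number of tests, from the 500K array size

print(f"Bonferroni per-SNP threshold: {bonferroni_threshold(0.05, N_SNPS):.1e}")
print(f"Expected false positives at uncorrected 0.05: "
      f"{expected_false_positives(0.05, N_SNPS):,.0f}")
```

The large number of chance findings expected at an uncorrected threshold is one motivation for staged confirmation in independent samples.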

66 500K SNP Coverage
-Median intermarker distance: 3.3 kb
-Mean intermarker distance: 5.4 kb
-Average heterozygosity: 0.30
-Average minor allele frequency: 0.22
-SNPs in genes: 196,384
-80% of genome within 10 kb of a SNP

68 LIG SNP and Passive Smoking

69 Figure 1. The effects of SNPs on the Risk of Lung Cancer among Smokers and Non-smokers [Axis label: OR]

70 Hypothesis The overall hypothesis is that multiple sequence variants in the genome are associated with the risk of lung cancer among non-smokers. Specifically, we hypothesize that a number of common nonsmoking lung cancer risk-modifying SNPs are in strong LD with the SNPs arrayed on the 500K GeneChip®.
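"Strong LD" between an array SNP and an untyped risk SNP is usually summarized with the r² statistic, computed from haplotype and allele frequencies as r² = D² / (pA(1 - pA) pB(1 - pB)), where D = pAB - pA·pB. The sketch below is illustrative: the function name and the frequencies are hypothetical, and the deck does not say which LD measure or cutoff the study would use.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    return (d * d) / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies, for illustration only.
strong = ld_r_squared(p_ab=0.27, p_a=0.30, p_b=0.30)
independent = ld_r_squared(p_ab=0.09, p_a=0.30, p_b=0.30)

print(f"array SNP tagging the risk SNP well: r^2 = {strong:.2f}")
print(f"unlinked SNPs:                       r^2 = {independent:.2f}")
```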

71 Figure 2. Structure and Governance of ILCCO

74 Specific Aims
Aim 1. To perform exploratory tests for association between 500K SNPs across the genome and lung cancer risk among 200 non-smoking lung cancer patients and 200 controls.
Aim 2. To perform the first stage of confirmatory association tests between lung cancer risk and more than 1,000 SNPs implicated in Aim 1 among an independent set of 600 pairs of cases and controls.

75 Specific Aims
Aim 3. To perform the second stage of confirmatory association tests between lung cancer risk and more than 500 SNPs that were replicated in Aim 2 among an additional 600 cases and 600 controls. Additional SNPs will also be added from our ongoing pathway-specific analyses of DNA repair, cell cycle regulation, inflammation, and metabolic pathways based on non-smokers in our lung cancer study.
Aim 4. To perform fine mapping association studies in the flanking regions of each of the 30-100 SNPs confirmed in Aim 3 among the entire 1,400 cases and 1,400 controls. The large number of cases with non-smoking lung cancer in this study population also allows us to identify SNPs that are associated with risk of the disease among non-smokers.

76 Specific Aims
Aim 5. To explore the generalizability of the SNPs identified in Specific Aims 1-4 within a Chinese population of 600 non-smoking lung cancer cases and 600 non-smoking controls. The relatively homogeneous Chinese population not only allows us to further confirm the associations, but also improves our ability to finely map the SNPs associated with lung cancer risk among non-smokers.

77 Discussion: Costs
Affy 500K SNP chip: $1,000/case
-2,000 x $1,000 = $2M
-1,000 x $1,000 = $1M
-500 x $1,000 = $0.5M
-500 x 3,000 (SNPs) x $0.15 = $225,000
-500 x 30 (SNPs) x $0.15 = $2,250
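The cost figures on this slide follow from two unit prices: $1,000 per sample for the 500K chip and $0.15 per SNP per sample for follow-up genotyping. The sketch below reproduces that arithmetic. The function names are illustrative, and prices are handled in cents to avoid floating-point rounding.

```python
CHIP_COST_CENTS = 1000 * 100   # $1,000 per sample, from the slide
SNP_ASSAY_COST_CENTS = 15      # $0.15 per SNP per sample, from the slide


def chip_cost_dollars(n_samples: int) -> float:
    """Total cost of running the 500K chip on n_samples."""
    return n_samples * CHIP_COST_CENTS / 100


def followup_cost_dollars(n_samples: int, n_snps: int) -> float:
    """Total cost of genotyping n_snps individually in n_samples."""
    return n_samples * n_snps * SNP_ASSAY_COST_CENTS / 100


for n in (2000, 1000, 500):
    print(f"{n:>5} samples on the chip: ${chip_cost_dollars(n):,.0f}")

for snps in (3000, 30):
    print(f"500 samples x {snps:>4} SNPs: ${followup_cost_dollars(500, snps):,.0f}")
```

The comparison shows why a staged design is attractive: carrying a reduced SNP set into follow-up samples costs far less per sample than rerunning the full chip.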

78 1040 controls, 601 head and neck cancer cases, and 611 lung cancer cases

80 GSTM1 and Lung Cancer among Non-Smokers [Table. GSTM1: Normal vs. Null; Smoking: No. Null vs. Normal: 1.19 (0.69-2.03)]

81 GSTT1 and Lung Cancer among Non-Smokers [Table. GSTT1: Normal vs. Null; Smoking: No. Null vs. Normal: 1.53 (0.83-2.81)]

82 p53 codon 72 and Lung Cancer among Non-Smokers [Table. p53: A/A or A/P vs. P/P; Smoking: No. P/P vs. A/A or A/P: 0.79 (0.32-1.95)]

83 GSTP1 and Lung Cancer among Non-Smokers [Table. GSTP1: Ile/Ile vs. Any Val; Smoking: No. Any Val vs. Ile/Ile: 0.69 (0.39-1.24)]
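Estimates of this form are typically an odds ratio with a 95% confidence interval. For readers unfamiliar with how such a pair is obtained from a 2x2 table, the sketch below computes a crude odds ratio and a Wald (Woolf) interval. The counts are hypothetical and are not the study's data. The slide estimates may also be adjusted for covariates, which this crude calculation does not do.

```python
import math


def odds_ratio_wald_ci(exposed_cases: int, unexposed_cases: int,
                       exposed_controls: int, unexposed_controls: int,
                       z: float = 1.96):
    """Crude odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: "exposed" means carrying the variant genotype.
or_value, ci_low, ci_high = odds_ratio_wald_ci(30, 20, 40, 40)
print(f"OR = {or_value:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f})")
```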

