Aspects of Genetics and Genomics in Cancer Research Li Hsu Biostatistics and Biomathematics Program Fred Hutchinson Cancer Research Center.


1 Aspects of Genetics and Genomics in Cancer Research Li Hsu Biostatistics and Biomathematics Program Fred Hutchinson Cancer Research Center

2 Outline: Cancer facts; Linkage analysis of family studies; Genome-wide association studies


4 Etiology of Cancer. The etiology of cancer is multifactorial, with genetic, environmental, medical, and lifestyle factors interacting to produce a given malignancy. Breakthroughs in high-throughput genotyping technologies have made it possible to systematically identify genes that are responsible for disease occurrence.

5 BRCA1 and Breast Cancer. BRCA1 (breast cancer 1) is a human gene that belongs to a class of genes known as tumor suppressors, which maintain genomic integrity to prevent uncontrolled proliferation. Variations in the gene have been implicated in a number of hereditary cancers, namely breast, ovarian, and prostate cancer. The BRCA1 gene is located on the long (q) arm of chromosome 17 at 38 Mb.

6 Probability of developing breast cancer by age (Chen et al. 2009). [Figure: probability curves for carriers and non-carriers]

7 Probability of Developing Breast Cancer for BRCA1 Carriers

Age   Average Person        BRCA1 Carrier
50    2.1% (1.7%-2.7%)      18.8% (8.2%-2.3%)
60    4.1% (3.4%-5.0%)      31.3% (14.3%-61.2%)
70    7.2% (6.0%-9.0%)      45.4% (22.7%-74.3%)
80    10.2% (8.4%-12.5%)    54.9% (30.4%-81.4%)

8 How was BRCA1 found?


10 Linkage Analysis. [Pedigree figure; marker genotypes shown: 1/2, 3/4, 1/3, 2/4, 3/4, 3/2, 1/4, 1/2, 3/2]

11 Assume the disease gene (D) is rare with full penetrance. [Pedigree figure; marker genotypes: 1/2, 3/4, 1/3, 2/4, 3/4, 3/2, 1/4, 1/2, 3/2; disease-locus genotypes: d/d, D/d, d/D, d/d, D/d, d/d, D/d, d/d, D/d]

12 Linkage Analysis (continued). The disease allele (D) was originally on the chromosome carrying marker allele 3. How often does D co-segregate with allele 3 (non-recombinant)?

13 Assume the disease gene (D) is rare with full penetrance. [Pedigree figure; marker genotypes: 1/2, 3/4, 1/3, 2/4, 3/4, 3/2, 1/4, 1/2, 3/2; disease-locus genotypes: d/d, D/d, d/D, d/d, D/d, d/d, D/d, d/d, D/d]

14 Linkage Analysis (continued). The disease allele (D) was originally on the chromosome carrying marker allele 3. How often does D co-segregate with allele 3 (non-recombinant)? In 5 meioses. How often is D separated from allele 3 (recombinant)?

15 Assume the disease gene (D) is rare with full penetrance. [Pedigree figure; marker genotypes: 1/2, 3/4, 1/3, 2/4, 3/4, 3/2, 1/4, 1/2, 3/2; disease-locus genotypes: d/d, D/d, d/D, d/d, D/d, d/d, D/d, d/d, D/d]

16 Linkage Analysis (continued). The disease allele (D) was originally on the chromosome carrying marker allele 3. How often does D co-segregate with allele 3 (non-recombinant)? In 5 meioses. How often is D separated from allele 3 (recombinant)? In 1 meiosis.

17 Likelihood function. Set a parameter θ that measures the distance between allele 3 and D by how frequently they recombine. The likelihood function is $L(\theta) = (1-\theta)^5\,\theta$. The maximum likelihood estimate is $\hat{\theta} = 1/6$. $\mathrm{LOD} = \log_{10}\left[L(1/6)/L(1/2)\right] = 0.63$. LOD for 7 families $= 7 \times 0.63 = 4.41$.
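The LOD calculation on this slide can be checked numerically. The following is a minimal sketch, assuming 5 non-recombinant and 1 recombinant meioses as counted in the pedigree; the function names `likelihood` and `lod_score` are illustrative, not from the presentation. The 7-family total assumes, as the slide does, that every family contributes the same LOD.

```python
import math

def likelihood(theta, n_nonrec=5, n_rec=1):
    """Likelihood of recombination fraction theta given meiosis counts."""
    return (1 - theta) ** n_nonrec * theta ** n_rec

def lod_score(n_nonrec=5, n_rec=1):
    """Return (MLE of theta, LOD score versus no linkage at theta = 1/2)."""
    # The MLE is the observed proportion of recombinant meioses.
    theta_hat = n_rec / (n_nonrec + n_rec)
    ratio = likelihood(theta_hat, n_nonrec, n_rec) / likelihood(0.5, n_nonrec, n_rec)
    return theta_hat, math.log10(ratio)

theta_hat, lod = lod_score()
print(f"theta_hat = {theta_hat:.4f}, LOD = {lod:.2f}")

# Slide's simplification: 7 families, each with the same rounded LOD.
total_lod = 7 * round(lod, 2)
print(f"LOD for 7 families = {total_lod:.2f}")
```

By common convention, a total LOD above 3 is treated as evidence of linkage, which is why pooling several families matters when a single family gives only 0.63.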

18 Issues. Linkage analysis narrowed the search down to a region of about 1 Mb. However, it took another four years before the BRCA1 gene itself was identified. Reduced penetrance, phenocopy, and genetic heterogeneity are among the factors that limit the success of linkage analysis. Relevance of the findings to the population at large.

19 Genome-Wide Association Studies (GWAS). The Human Genome Project began in 1990 and was completed in 2003.

20 Part of sequence from Chromosome 7 AGACGGAGTTTCACTCTTGTTGCCAACCTGGAGTGCAGTGGCGTGATCTCAGCTCACTGCACACTCCGCTTTCC/TGG TTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGACTACAGTCACACACCACCACGCCCGGCTAATTTTTG TATTTTTAGTAGAGTTGGGGTTTCACCATGTTGGCCAGACTGGTCTCGAACTCCTGACCTTGTGATCCGCCAGCCTCT GCCTCCCAAAGAGCTGGGATTACAGGCGTGAGCCACCGCGCTCGGCCCTTTGCATCAATTTCTACAGCTTGTTTTCTT TGCCTGGACTTTACAAGTCTTACCTTGTTCTGCCTTCAGATATTTGTGTGGTCTCATTCTGGTGTGCCAGTAGCTAAAA ATCCATGATTTGCTCTCATCCCACTCCTGTTGTTCATCTCCTCTTATCTGGGGTCACA/CTATCTCTTCGTGATTGCATTC TGATCCCCAGTACTTAGCATGTGCGTAACAACTCTGCCTCTGCTTTCCCAGGCTGTTGATGGGGTGCTGTTCATGCCT CAGAAAAATGCATTGTAAGTTAAATTATTAAAGATTTTAAATATAGGAAAAAAGTAAGCAAACATAAGGAACAAAAAG GAAAGAACATGTATTCTAATCCATTATTTATTATACAATTAAGAAATTTGGAAACTTTAGATTACACTGCTTTTAGAGAT GGAGATGTAGTAAGTCTTTTACTCTTTACAAAATACATGTGTTAGCAATTTTGGGAAGAATAGTAACTCACCCGAACA GTGTAATGTGAATATGTCACTTACTAGAGGAAAGAAGGCACTTGAAAAACATCTCTAAACCGTATAAAAACAATTACA TCATAATGATGAAAACCCAAGGAATTTTTTTAGAAAACATTACCAGGGCTAATAACAAAGTAGAGCCACATGTCATTT ATCTTCCCTTTGTGTCTGTGTGAGAATTCTAGAGTTATATTTGTACATAGCATGGAAAAATGAGAGGCTAGTTTATCAA CTAGTTCATTTTTAAAAGTCTAACACATCCTAGGTATAGGTGAACTGTCCTCCTGCCAATGTATTGCACATTTGTGCCC AGATCCAGCATAGGGTATGTTTGCCATTTACAAACGTTTATGTCTTAAGAGAGGAAATATGAAGAGCAAAACAGTGCA TGCTGGAGAGAGAAAGCTGATACAAATATAAATGAAACAATAATTGGAAAAATTGAGAAACTACTCATTTTCTAAATT ACTCATGTATTTTCCTAGAATTTAAGTCTTTTAATTTTTGATAAATCCCAATGTGAGACAAGATAAGTATTAGTGATGGT ATGAGTAATTAATATCTGTTATATAATATTCATTTTCATAGTGGAAGAAATAAAATAAAGGTTGTGATGATTGTTGATTA TTTTTTCTAGAGGGGTTGTCAGGGAAAGAAATTGCTTTTTTTCATTCTCTCTTTCCACTAAGAAAGTTCAACTATTAATT TAGGCACATACAATAATTACTCCATTCTAAAATGCCAAAAAGGTAATTTAAGAGACTTAAAACTGAAAAGTTTAAGATA GTCACACTGAACTATATTAAAAAATCCACAGGGTGGTTGGAACTAGGCCTTATATTAAAGAGGCTAAAAATTGCAATA AGACCACAGGCTTTAAATATGGCTTTAAACTGTGAAAGGTGAAACTAGAATGAATAAAATCCTATAAATTTAAATCAA AAGAAAGAAACAAACTA/GAAATTAAAGTTAATATACAAGAATATGGTGGCCTGGATCTAGTGAACATATAGTAAAGA TAAAACAGAATATTTCTGAAAAATCCTGGAAAATCTTTTGGGCTAACCTGAAAACAGTATATTTGAAACTATTTTTAAA

21 Genome-Wide Association Study. 550,000 SNPs on an array. 2000 diseased individuals (colon cancer cases) and 2000 normal individuals. Genotype all DNA samples for 550,000 SNPs. That is 550,000 x 4000 = 2.2 billion genotypes!

22 GWAS on Type 2 Diabetes (Steinthorsdottir et al., 2007, Nature Genetics)

Observed counts:

        Cases   Controls   Total
AA      751     3107       3858
Aa      539     1887       2426
aa      108     277        385
Total   1398    5271       6669

Expected count for cases if AA is not associated with the disease. First, calculate the frequency of the AA genotype in cases and controls combined: freq = 3858/6669 = 57.85%. For 1398 cases, we expect to see 1398 * 57.85% = 809 individuals having genotype AA.

Expected counts under no association (rounded):

        Cases   Controls   Total
AA      809     3049       3858
Aa      509     1917       2426
aa      81      304        385
Total   1398    5271       6669
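The expected counts follow the same recipe for every cell: pooled genotype frequency times the group size. A minimal sketch using the observed counts from this slide; the names `observed` and `expected_counts` are illustrative assumptions.

```python
# Observed genotype counts from the slide: genotype -> (cases, controls)
observed = {"AA": (751, 3107), "Aa": (539, 1887), "aa": (108, 277)}

def expected_counts(table):
    """Expected (cases, controls) per genotype if genotype is unrelated to disease."""
    n_cases = sum(cases for cases, _ in table.values())
    n_controls = sum(controls for _, controls in table.values())
    total = n_cases + n_controls
    result = {}
    for genotype, (cases, controls) in table.items():
        # Genotype frequency in cases and controls combined
        freq = (cases + controls) / total
        result[genotype] = (n_cases * freq, n_controls * freq)
    return result

expected = expected_counts(observed)
for genotype, (e_cases, e_controls) in expected.items():
    print(f"{genotype}: expected cases = {e_cases:.1f}, expected controls = {e_controls:.1f}")
```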

23 GWAS on Type 2 Diabetes. The chi-square statistic is calculated by taking the difference between the observed and expected counts in each cell, squaring it, dividing by the expected count, and summing the results over all cells: (751-809)^2/809 + (3107-3049)^2/3049 + … Compare the value to a standard chi-square distribution with degrees of freedom (# rows - 1) * (# columns - 1) = (3-1) * (2-1) = 2. The p-value for this SNP is 6.772e-5.
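The full test can be reproduced from the observed counts on the previous slide. This is a sketch under the assumption that unrounded expected counts are used; the function name `chi_square_statistic` is illustrative. It relies on the fact that, for 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

# Observed counts: rows are genotypes AA, Aa, aa; columns are cases, controls
observed = [[751, 3107], [539, 1887], [108, 277]]

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of genotype and disease status
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

stat, df = chi_square_statistic(observed)

# Closed-form upper tail, valid only for df == 2
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.3e}")
```

For tables with other degrees of freedom, the closed form no longer applies and a library routine for the chi-square distribution would be needed instead.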

24 Issues. Too many SNPs! Identifying gene-gene and gene-environment interactions is now possible.

25 Germline mutations account for only a small portion of cancer cases. http://envirocancer.cornell.edu/FactSheet/General/fs48.inheritance.cfm

26 Summary. The amount of data generated has increased exponentially in the last few years. This creates great demand for efficient and valid computational and statistical methods and tools for picking the needles out of the haystack.

