
www.polyomx.org

Analysis of Single Nucleotide Polymorphisms in Candidate Genes and Application of Machine Learning Techniques for Assessing Susceptibility to Breast Cancer in Alberta Women

Introduction

Hereditary predisposition (mutations in the BRCA1 and BRCA2 genes) contributes to familial breast cancers. Eighty percent of breast cancers are sporadic in nature, with contributions from several yet-to-be-identified low-penetrance genes and their interactions with carcinogenic environmental exposures. Identifying the molecular signatures would help target at-risk populations for early screening and preventive efforts.

SNPs (Single Nucleotide Polymorphisms) are commonly occurring genetic variations. A SNP may affect an individual's susceptibility to disease or response to a particular treatment by altering the expression of the gene in which it occurs.

Data

In an earlier study, we validated 98 SNPs from candidate genes in a retrospective cohort of 174 cases and 158 apparently healthy individuals (controls) from the Edmonton region. The present study attempts to reproduce these earlier findings using 169 newly diagnosed breast cancer patients, coupled with the same control population.

Information Gain is a concept from information theory and decision tree learning.
It defines the increase in information caused by adding a new attribute node to a rule or decision tree. An attribute with high information gain is usually preferred over other attributes.

Machine Learning

The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience (training) [1]. The techniques are designed to find patterns in training data and to classify new data.

K-fold Cross Validation is a common method used for model checking. (Example on the poster: K=3.)

Reference

[1] Mitchell, T. Machine Learning. McGraw-Hill, Boston, 1997.

Future Work

We plan to genotype these SNPs in a larger number of cases and controls (750 each) in a prospective study, with emphasis also on environmental factors influencing breast cancer risk.

Dufour J (1), Wang Y (1,2), Cass CE (1,3,4), Greiner R (1,2), Mackey J (1,3) and Damaraju S (1,3,4)
(1) PolyomX Program; (2) Department of Computing Science and (3) Department of Oncology, University of Alberta; (4) Cross Cancer Institute

Method

We used information gain to rank the quality of the SNPs and then considered classifiers based on the top k SNPs, for different values of k. We used 20-fold cross validation to estimate the quality (predictive accuracy) of each classifier with each feature subset, as a way to identify the best classification system.

Results

Cancer Prediction: We report high reproducibility of the genetic markers identified in each of the two independent studies, with a prediction accuracy of 63-65%.

Significant SNPs: We found SNPs in MSH6, MLH1, P53, ADPRT, AGTR, RET, BCL6, CYP19A1, CYP1B1 and CYP11B2 to be informative.
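The information-gain ranking described in the Method can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard definition IG = H(class) - H(class | attribute) over categorical genotypes, and the function names, the SNP names `snp_a` and `snp_b`, and the six toy samples are all made up for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """IG = H(class) - H(class | attribute) for one categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted average entropy of the class within each attribute value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_by_information_gain(columns, labels):
    """Return (name, IG) pairs sorted from most to least informative."""
    scores = {name: information_gain(vals, labels)
              for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up genotypes for six samples (NOT data from the study).
labels = ["case", "case", "case", "control", "control", "control"]
columns = {
    "snp_a": ["TT", "TT", "TC", "CC", "CC", "CC"],  # separates the classes
    "snp_b": ["AG", "AA", "AG", "AA", "AG", "AA"],  # nearly uninformative
}

ranking = rank_by_information_gain(columns, labels)
k = 1
top_k = [name for name, _ in ranking[:k]]  # features passed to a classifier
```

In this toy set `snp_a` ranks first because its genotypes split cases from controls exactly, while `snp_b` carries almost no information about the class. A classifier would then be trained on the `top_k` columns only, for each value of k under consideration.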
Prediction Accuracy

Early Study: 63%
Present Study: 65%

      Early Study                 Present Study
Rank  IG value  SNP Name          SNP Name          IG value
1     0.06158   CYP11B2_4_T201C   MSH6_4_T201C      0.0338
2     0.03803   CYP1B1_1_C201G    MLH1_1_A132G      0.0276
3     0.02879   BCL6_2_C201T      BDKRB2_2_C152T    0.0245
4     0.02103   ARO1_1_T201C      CYP11B2_4_T201C   0.0243
5     0.01825   CYP11B2_9_G201A   Tp53_2_G201A      0.0229
6     0.01817   MLH1_1_A132G      CD38_1_A177C      0.0185
7     0.01781   MSH6_4_T201C      CYP11B2_11_G201A  0.0185
8     0.01609   AGTR1_1_T201C     CYP1B1_2_A201G    0.0169
9     0.01486   RET_1_G201A       CYP11B1_6_G135A   0.0161
10    0.01342   CYP17_1_G201T     CYP1B1_1_C201G    0.0161
11    0.01333   CD38_1_A177C      RET_1_G201A       0.0154
12    0.01299   ADPRT_3_C201T     GRB10_1_A201G     0.0144
13    0.01297   ERCC2_2_C201T     CYP11B2_17_T201C  0.0138
14    0.01294   CD68_1_G201A      BRCA2_13_A201G    0.0137
15    0.01283   CYP11B1_6_G135A   ADPRT_3_C201T     0.0135
16    0.01209   CYP11B2_12_C201T  AGTR1_1_T201C     0.0132
17    0.01192   CYP11B2_17_T201C  CYP11B2_12_C201T  0.0130
18    0.01155   Tp53_6_G201A      BRCA2_3_C201A     0.0129

The information gain of an attribute a is 0 if a is not correlated with the class c, and positive if it is correlated.

The Naïve Bayes classifier assumes that attributes are independent given the class. It computes the probability of each label given the evidence and returns the most likely prediction.

Acknowledgements

This work was funded by the Alberta Cancer Foundation and the Alberta Cancer Board.
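As a rough illustration of the Naïve Bayes classifier and k-fold cross validation described in this transcript, the sketch below trains a categorical Naïve Bayes model on genotype strings and estimates its accuracy fold by fold. It is not the authors' implementation: the add-one (Laplace) smoothing, the stride-based fold assignment, the function names, and the eight toy samples are assumptions made for the example, and the poster's 20 folds are replaced by k=4 to suit the tiny data set.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
            values_seen[i].add(v)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_class, best_score = None, float("-inf")
    for y, n_y in class_counts.items():
        score = log(n_y / total)  # log prior of the class
        for i, v in enumerate(row):
            # Add-one (Laplace) smoothing: an assumption, not from the poster.
            n_values = len(values_seen[i] | {v})
            score += log((value_counts[(i, y)][v] + 1) / (n_y + n_values))
        if score > best_score:
            best_class, best_score = y, score
    return best_class


def cross_validate(rows, labels, k):
    """Estimate accuracy with k-fold cross validation (stride-based folds)."""
    n = len(rows)
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th sample is held out
        train_rows = [rows[i] for i in range(n) if i not in test_idx]
        train_labels = [labels[i] for i in range(n) if i not in test_idx]
        model = train_naive_bayes(train_rows, train_labels)
        correct += sum(predict(model, rows[i]) == labels[i] for i in test_idx)
    return correct / n


# Toy, made-up genotypes (NOT data from the study): two SNPs per sample.
rows = [("TT", "AG"), ("TT", "AA"), ("TT", "AG"), ("TT", "AA"),
        ("CC", "AG"), ("CC", "AA"), ("CC", "AG"), ("CC", "AA")]
labels = ["case"] * 4 + ["control"] * 4

accuracy = cross_validate(rows, labels, k=4)
model = train_naive_bayes(rows, labels)
prediction = predict(model, ("CC", "AG"))
```

Each sample is held out exactly once, so the returned accuracy is the fraction of held-out samples classified correctly across all folds. Repeating this for each top-k feature subset gives the kind of comparison the Method describes.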

