Www.polyomx.org Wang Y 1,2, Damaraju S 1,3,4, Cass CE 1,3,4, Murray D 3,4, Fallone G 3,4, Parliament M 3,4 and Greiner R 1,2 PolyomX Program 1, Department.

Presentation transcript:

Analysis of Single Nucleotide Polymorphisms in Candidate Genes and Application of Machine Learning Techniques to Predict Radiation Toxicity in Prostate Cancer Patients Treated with Conformal Radiotherapy

Wang Y 1,2, Damaraju S 1,3,4, Cass CE 1,3,4, Murray D 3,4, Fallone G 3,4, Parliament M 3,4 and Greiner R 1,2
PolyomX Program 1, Department of Computing Science 2 and Oncology 3, U of A and Cross Cancer Institute 4

Study Design

AIM: To explore the possible relationship between 51 single nucleotide polymorphisms (SNPs) in candidate genes involved in DNA damage recognition/repair/response and clinical radiation toxicity in a retrospective cohort of patients (n=82) treated with conformal radiotherapy (3DCRT) for prostate cancer. In this study, we tested Machine Learning (ML) techniques to build classifiers that predict toxicity in patients treated with radiation.

SNPs (Single Nucleotide Polymorphisms) are commonly occurring genetic variations. A SNP may affect an individual's susceptibility to disease or response to a particular treatment by altering the expression of the gene in which it occurs.

Methods

SNPs served as features (independent variables) and the patient response to treatment as the class label (dependent variable).
Patients (n=28) with adverse reactions (rectal bleeding) to radiation more than 90 days after treatment were considered negatives and the remaining 54 positives in a binary classification. We considered two types of classifiers: the "J48" decision tree and the "KStar" nearest-neighbor classifier. For each classifier, we also used information gain to rank the SNPs and then considered classifiers based on the top k SNPs, for different values of k. We used ten-fold cross-validation to estimate the predictive accuracy of each classifier with each feature subset, as a way to identify the best classification system. We ran a permutation test (4,000 trials) to test the significance of our results.

Decision Trees: A decision tree is a tree-structured decision diagram built from the training data. It can be used to classify new data.

Information Gain is a concept from information theory and decision tree learning. It measures the increase in information caused by adding a new attribute node to a rule or decision tree. An attribute with high information gain is usually preferred to other attributes.

Results

Our initial analysis suggested 70-80% prediction accuracy using the following SNPs, in rank order: XRCC3 (A>G, 5' UTR Nt 4541), CYP2D6*4 (G>A, Splicing defect), BRCA2 (A>G, K 1132 K), MLH1 (C>T, V 219 I), BRCA1 (A>G, R 356 Q), RAD51 (G>T, 5' UTR Nt 172), BRCA2 (A>G, S 455 S), BRCA2 (C>A, N 289 H), and BRCA2 (A>G, D 991 N). The 4,000-trial permutation test demonstrated significance at the p<0.05 level for both the J48 and KStar classifiers.

Radiation toxicity: Patients treated with conformal radiotherapy (3DCRT) were each given an RTOG toxicity score. We assigned a positive or negative label to each patient based on these scores: a score of 2 or higher during the course of the treatment was considered negative (experiencing an adverse reaction to radiation therapy), while all other patients were given a positive label.
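The information-gain ranking described in the Methods can be illustrated with a minimal sketch. It assumes the usual entropy-based definition of information gain for categorical attributes. The function names, the SNP names, and the four-patient toy data are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy of the labels minus the expected entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def top_k_features(features, labels, k):
    """Names of the k features with the highest information gain."""
    ranked = sorted(
        features,
        key=lambda name: information_gain(features[name], labels),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy data: two SNP genotypes for four patients.
labels = ["neg", "neg", "pos", "pos"]
snps = {
    "snp_a": ["AA", "AA", "AG", "AG"],  # genotype separates the two classes
    "snp_b": ["CC", "CT", "CC", "CT"],  # genotype is unrelated to the class
}
gain_a = information_gain(snps["snp_a"], labels)
gain_b = information_gain(snps["snp_b"], labels)
best = top_k_features(snps, labels, k=1)
```

In this toy case the attribute that splits the classes cleanly is ranked first, which is the behaviour the top-k selection relies on.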
Machine Learning: The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience. [1] The techniques are designed to find patterns in training data and to classify new data.

KStar is a nearest-neighbor method with a generalized distance function based on transformations.

Permutation test: Randomly rearrange the labels of the data, then run the relabeled data through the same algorithm.

Information gain is 0 if attribute "a" is not correlated with class "c", and positive if it is correlated.

K-fold Cross-Validation is a common method used for model checking. (Example: K=3.)

Reference

[1] Mitchell, T. Machine Learning. McGraw-Hill, Boston.

Conclusion: Machine Learning techniques can be used for SNP data analyses and clinical treatment outcome prediction. This preliminary analysis demonstrates the utility of Machine Learning in discriminating between populations according to SNP data, towards identifying predictive SNPs for use in radiogenomics in the near future.

Acknowledgements

This work was funded by the Research Initiatives Program of the Alberta Cancer Board.
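The permutation test and K-fold cross-validation described above can be combined as in the following sketch. It is only an illustration under stated assumptions. The classifier is a plain 1-nearest-neighbor rule with Hamming distance, used as a simple stand-in and not the KStar or J48 implementations used in the study. The genotype data are invented, 200 trials are used instead of 4,000, and the (hits + 1) / (trials + 1) p-value estimate is one common convention rather than necessarily the one used on the poster.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and split them into k near-equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nearest_neighbor_predict(train_X, train_y, x):
    """1-nearest-neighbor with Hamming distance (a stand-in, not KStar)."""
    def hamming(a, b):
        return sum(u != v for u, v in zip(a, b))
    best = min(range(len(train_X)), key=lambda i: hamming(train_X[i], x))
    return train_y[best]

def cv_accuracy(X, y, k, seed=0):
    """Fraction of samples classified correctly when each fold is held out once."""
    folds = k_fold_indices(len(X), k, random.Random(seed))
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(
            nearest_neighbor_predict(train_X, train_y, X[i]) == y[i] for i in fold
        )
    return correct / len(X)

def permutation_p_value(X, y, k, trials, seed=0):
    """Compare the observed accuracy with accuracies obtained on shuffled labels."""
    rng = random.Random(seed)
    observed = cv_accuracy(X, y, k, seed)
    hits = 0
    for _ in range(trials):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break any link between features and labels
        if cv_accuracy(X, shuffled, k, seed) >= observed:
            hits += 1
    return observed, (hits + 1) / (trials + 1)

# Hypothetical toy data: 12 patients, 2 SNP genotypes each.
X = [("AA", "CC")] * 6 + [("AG", "CT")] * 6
y = ["neg"] * 6 + ["pos"] * 6
observed, p_value = permutation_p_value(X, y, k=3, trials=200)
```

The same folds are reused for every shuffled labeling so that only the labels differ between the observed run and the permuted runs.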