 Developed Struct-SVM classifier that takes into account domain knowledge to improve identification of protein-RNA interface residues  Results show that.

Slides:



Advertisements
Similar presentations
Molecular Biomedical Informatics Machine Learning and Bioinformatics Machine Learning & Bioinformatics 1.
Advertisements

Welcome Each of You to My Molecular Biology Class.
Iowa State University Department of Computer Science Artificial Intelligence Research Laboratory Research supported in part by grants from the National.
Mike Arnoult 9/30/2010 The role of Artificial Neural Networks in Phage Research.
Computational Molecular Biology (Spring’03) Chitta Baral Professor of Computer Science & Engg.
1 7/27/2008 Center for Computational Intelligence, Learning, and Discovery Bioinformatics and Computational Biology Program ROC 2008 meeting A Computational.
Reduced Support Vector Machine
13.2 Ribosomes and Protein Synthesis
The Social Web: A laboratory for studying s ocial networks, tagging and beyond Kristina Lerman USC Information Sciences Institute.
Protein Synthesis Ordinary Level. Lesson Objectives At the end of this lesson you should be able to 1.Outline the steps in protein synthesis 2.Understand.
Bioinformatics Jan Taylor. A bit about me Biochemistry and Molecular Biology Computer Science, Computational Biology Multivariate statistics Machine learning.
CLOSING DISCUSSION Advances and challenges in Protein-RNA: recognition, regulation, and prediction Discussion facilitator: Manny Ares Banff International.
Bioinformatics and Computational Biology Graduate Program Carla Mann December 11, 2014 Rocky Mountain Bioinformatics Conference Snowmass, CO RNABindRPlus.
Protein Secondary Structure Prediction with inclusion of Hydrophobicity information Tzu-Cheng Chuang, Okan K. Ersoy and Saul B. Gelfand School of Electrical.
Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program.
Transcription of Text by Incremental Support Vector machine Anurag Sahajpal and Terje Kristensen.
Protein Local 3D Structure Prediction by Super Granule Support Vector Machines (Super GSVM) Dr. Bernard Chen Assistant Professor Department of Computer.
Finish up array applications Move on to proteomics Protein microarrays.
From Structure to Function. Given a protein structure can we predict the function of a protein when we do not have a known homolog in the database ?
13.2 Ribosomes and Protein Synthesis
An algorithm to guide selection of specific biomolecules to be studied by wet-lab experiments Jessica Wehner and Madhavi Ganapathiraju Department of Biomedical.
CSCI 6900/4900 Special Topics in Computer Science Automata and Formal Grammars for Bioinformatics Bioinformatics problems sequence comparison pattern/structure.
Function first: a powerful approach to post-genomic drug discovery Stephen F. Betz, Susan M. Baxter and Jacquelyn S. Fetrow GeneFormatics Presented by.
The Genetic Code.
Discovering the Correlation Between Evolutionary Genomics and Protein-Protein Interaction Rezaul Kabir and Brett Thompson
Iowa State University Department of Computer Science Artificial Intelligence Research Laboratory Research supported in part by a grant from the National.
Identification of cell cycle-related regulatory motifs using a kernel canonical correlation analysis Presented by Rhee, Je-Keun Graduate Program in Bioinformatics.
Center for Computational Intelligence, Learning, and Discovery Artificial Intelligence Research Laboratory Department of Computer Science Supported in.
Exploring Alternative Splicing Features using Support Vector Machines Feature for Alternative Splicing Alternative splicing is a mechanism for generating.
12.3 DNA, RNA, and Protein Objective: 6(C) Explain the purpose and process of transcription and translation using models of DNA and RNA.
Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program.
Biological Signal Detection for Protein Function Prediction Investigators: Yang Dai Prime Grant Support: NSF Problem Statement and Motivation Technical.
CSBSI 2007 Bioinformatics and Computational Biology Program Department of Genetics, Development, and Cell Biology Department of Computer Science Generating.
PREDICTION OF CATALYTIC RESIDUES IN PROTEINS USING MACHINE-LEARNING TECHNIQUES Natalia V. Petrova (Ph.D. Student, Georgetown University, Biochemistry Department),
Central dogma: the story of life RNA DNA Protein.
Learning from Positive and Unlabeled Examples Investigator: Bing Liu, Computer Science Prime Grant Support: National Science Foundation Problem Statement.
B IOINFORMATICS AND C OMPUTATIONAL B IOLOGY A Computational Method to Identify RNA Binding Sites in Proteins Jeff Sander Iowa State University Rocky 2006.
Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program.
Jobs, Careers, Internships, Senior Projects and Research Computer Application Development K-12 education Industrial Training Bioinformatics Validation.
LOGO iDNA-Prot|dis: Identifying DNA-Binding Proteins by Incorporating Amino Acid Distance- Pairs and Reduced Alphabet Profile into the General Pseudo Amino.
Bioinformatics and Computational Biology
Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program.
DNA Microarray Data Analysis using Artificial Neural Network Models. by Venkatanand Venkatachalapathy (‘Venkat’) ECE/ CS/ ME 539 Course Project.
Computational Approaches for Biomarker Discovery SubbaLakshmiswetha Patchamatla.
Overview of Bioinformatics Module Denis Manley.. Contact Details Lecturer Name: Denis Manley Room number: KE-1-013a
Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program.
Iowa State University Department of Computer Science Center for Computational Intelligence, Learning, and Discovery Harris T. Lin, Sanghack Lee, Ngot Bui.
يادگيري ماشين Machine Learning Lecturer: A. Rabiee
Objectives: Terminology Components The Design Cycle Resources: DHS Slides – Chapter 1 Glossary Java Applet URL:.../publications/courses/ece_8443/lectures/current/lecture_02.ppt.../publications/courses/ece_8443/lectures/current/lecture_02.ppt.
Feature Extraction Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and.
Prediction of Protein Binding Sites in Protein Structures Using Hidden Markov Support Vector Machine.
Support Vector Machines (SVM): A Tool for Machine Learning Yixin Chen Ph.D Candidate, CSE 1/10/2002.
CZ5225 Methods in Computational Biology Lecture 2-3: Protein Families and Family Prediction Methods Prof. Chen Yu Zong Tel:
Typically, classifiers are trained based on local features of each site in the training set of protein sequences. Thus no global sequence information is.
Final Report (30% final score) Bin Liu, PhD, Associate Professor.
Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program.
Nawanol Theera-Ampornpunt, Seong Gon Kim, Asish Ghoshal, Saurabh Bagchi, Ananth Grama, and Somali Chaterji Fast Training on Large Genomics Data using Distributed.
1 CISC 841 Bioinformatics (Fall 2008) Review Session.
Section 3: DNA, RNA, and Protein
What is Pattern Recognition?
Artificial Intelligence Research Laboratory
Extra Tree Classifier-WS3 Bagging Classifier-WS3
Ontology-Based Information Integration Using INDUS System
محاضرة عامة التقنيات الحيوية (هندسة الجينات .. مبادئ وتطبيقات)
חיזוי ואפיון אתרי קישור של חלבון לדנ"א מתוך הרצף
Combining HMMs with SVMs

9 Future Challenges for Bioinformatics
Protein Structure Prediction by A Data-level Parallel Proceedings of the 1989 ACM/IEEE conference on Supercomputing Speaker : Chuan-Cheng Lin Advisor.
Presentation transcript:

 Developed Struct-SVM classifier that takes into account domain knowledge to improve identification of protein-RNA interface residues  Results show that the ROC curve of Struct-SVM dominates the ROC curve of Support Vector Machine (SVM) classifier Artificial Intelligence Research Laboratory Bioinformatics and Computational Biology Program Computational Intelligence, Learning, and Discovery Program Department of Computer Science Predicting Protein-RNA Binding Sites Using Structural Information Cornelia Caragea, Michael Terribilini, Jivko Sinapov, Jae-Hyung Lee, Fadi Towfic, Drena Dobbs and Vasant Honavar Introduction Dataset Fig. 1. Receiver Operaring Characteristi (ROC) Curves for SVM and Struct-SVM classifiers on the protein-RNA dataset RNA molecules play diverse functional and structural roles in cells: Learning System L X test,j = surface Training Data Collection of Surface Windows Collection of Non-Surface Windows Test Data Final Predictions Resulting Classifier X test,j yes no h(x test,j )=yh(x test,j )=-1 Seq2SeqWins SeqWins2TargetAA Sequence: Label: windowise x i =(x i,1,…,x i,j-k,…,x i,j,…,x i,j+k,…,x i,m ) y i =(y i,1,…,y i,j-k,…,y i,j,…,y i,j+k,…,y i,m ) Seq2SeqWins SeqWins2TargetAA SeqWins2ZeroOne SeqWins2Blast SeqWins2SS SS2ZeroOne TargetAA2Struct Struct2Blast SeqWins2CXValue SeqWins2Roughness x’ i,j-1 =(x i,j-1-k,…,x i,j-1,…,x i,j-1+k ) x’ i,j =(x i,j-k,…,x i,j,…,x i,j+k ) x’ i,j+1 =(x i,j+1-k,…,x i,j+1,…,x i,j+1+k ) … … x’ i,j-1 =(x i,j-1 ) x’ i,j =(x i,j ) x’ i,j+1 =(x i,j+1 ) … … 1T0K_B SINQKLALVIKSGK YTLGYKSTVKSLRQ GKSKLIIIAANTPV LRKSELEYYAMLSK TKVYYFQGGNNELG TAVGKLFRVGVVSI LEAGDSDILTTLA ANTPVLRKS xixi y i {0,1}*  messengers for transferring genetic information from DNA to proteins  primary genetic material in many viruses  enzymes important for protein synthesis and RNA processing  essential and ubiquitous regulators of gene expression in living organisms These functions depend on interactions between RNA molecules and specific proteins in cells. Protein-RNA interface residue identification  RNA-Protein Interface dataset, RB181: consists of RNA-binding protein sequences extracted from structures of known RNA-protein complexes solved by X-ray crystallography in the Protein Data Bank Feature Extraction Struct-SVM Classifier A machine learning classifier that incorporates domain knowledge to improve classification (that is, the structure of the protein) Results Classifier/ PerfMeasure SVMStruct-SVM Accuracy Correlation Coefficient Area Under ROC Curve Table 1. Accuracy, Correlation Coefficient and Area Under the ROC Curves for SVM and Struct- SVM Conclusions References [1] Chen, Y., Varani, G. (2005). Protein families and RNA recognition. Febs J 272: [2] Burges, C. J. C. (1998) A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2:121–167, 1998 [3] Towfic, F., Caragea, C., Dobbs, D., and Honavar, V. (2008). Struct-NB: Predicting protein-RNA binding sites using structural features. International Journal of Data Mining and Bioinformatics, In press. Acknowledgements: This work is supported in part by a grant from the National Institutes of Health (GM ) to Vasant Honavar & Drena Dobbs