Biological sequence analysis and information processing by artificial neural networks.

Presentation transcript:

Objectives: Input => Neural network => Output. Common view of a neural network: it is a black box that no one can understand, and it over-predicts performance.

Pairwise alignment
>carp Cyprinus carpio growth hormone 210 aa vs.
>chicken Gallus gallus growth hormone 216 aa
scoring matrix: BLOSUM50, gap penalties: -12/ % identity; Global alignment score:

carp    MA--RVLVLLSVVLVSLLVNQGRASDN-----QRLFNNAVIRVQHLHQLAAKMINDFEDSLLPEERRQLSKIFPLSFCNSD
        ::. :...:.:. : :.. :: :::.:.:::: :::...::..::..:.:.:: :.
chicken MAPGSWFSPLLIAVVTLGLPQEAAATFPAMPLSNLFANAVLRAQHLHLLAAETYKEFERTYIPEDQRYTNKNSQAAFCYSE

carp    YIEAPAGKDETQKSSMLKLLRISFHLIESWEFPSQSLSGTVSNSLTVGNPNQLTEKLADLKMGISVLIQACLDGQPNMDDN
        : ::.:::..:..:..:::.:. ::.:: : : ::..:.:. :.... ::: ::. ::..:.. :.:.
chicken TIPAPTGKDDAQQKSDMELLRFSLVLIQSWLTPVQYLSKVFTNNLVFGTSDRVFEKLKDLEEGIQALMRELEDRSPR---G

carp    DSLPLP-FEDFYLTM-GENNLRESFRLLACFKKDMHKVETYLRVANCRRSLDSNCTL
        .: :.. :...:. :... ::.:::::.:::::::.:.:::.::::.
chicken PQLLRPTYDKFDIHLRNEDALLKNYGLLSCFKKDLHKVETYLKVMKCRRFGESNCTI

HUNKAT

Biological Neural network

Biological neuron

Diversity of interactions in a network enables complex calculations. Similar in biological and artificial systems. Excitatory (+) and inhibitory (-) relations between compute units.

Biological neuron structure

Transfer of biological principles to artificial neural network algorithms: non-linear relation between input and output; massively parallel information processing; data-driven construction of algorithms; ability to generalize to new data items.

Neural networks: neural networks can learn higher order correlations. XOR function: 0 0 => 0; 1 0 => 1; 0 1 => 1; 1 1 => 0. Plotted as the points (0,0), (1,0), (0,1) and (1,1), the two classes cannot be separated by any linear function.

Error estimates (XOR)
Input  Target  Predict  Error
0 0    0       0        0
1 0    1       1        0
0 1    1       1        0
1 1    0       1        1
Mean error: 1/4
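The mean error of 1/4 above can be reproduced with a small sketch. The linear threshold unit below uses weights 1, 1 and threshold 0.5; these values are an assumption chosen for illustration, not taken from the slides. It behaves like OR, so it gets three of the four XOR cases right and misses (1,1). The coarse grid search that follows is only an illustrative check, not a proof, that no linear threshold unit on this grid does better.

```python
from itertools import product

# XOR truth table: (x1, x2) -> target
XOR = {(0, 0): 0, (1, 0): 1, (0, 1): 1, (1, 1): 0}


def linear_predict(x, w1, w2, threshold):
    """Linear threshold unit: 1 if w1*x1 + w2*x2 exceeds the threshold."""
    return 1 if w1 * x[0] + w2 * x[1] > threshold else 0


def mean_error(w1, w2, threshold):
    """Fraction of the four XOR cases the unit gets wrong."""
    errors = [abs(linear_predict(x, w1, w2, threshold) - t) for x, t in XOR.items()]
    return sum(errors) / len(errors)


# Illustrative weights (an assumption): an OR-like unit.
or_like_error = mean_error(1.0, 1.0, 0.5)

# Coarse grid over weights and threshold in [-2, 2] in steps of 0.25.
grid = [i / 4 for i in range(-8, 9)]
best_error = min(mean_error(w1, w2, t) for w1, w2, t in product(grid, repeat=3))

print("OR-like unit mean error:", or_like_error)
print("best mean error on the grid:", best_error)
```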

Neural networks. Linear function (figure: weights v1 and v2).

Neural networks. Higher order function (figure: weights w11, w12, w21, w22 and v1, v2).

Neural networks. How does it work? (Figure: weights w11, w12, w21, w22, wt1, wt2 and v1, v2, vt, with the extra input 1 (bias).)

Neural networks, input (0 0), with input 1 (bias): o1 = -6, O1 = 0; o2 = -2, O2 = 0; y1 = -4.5, Y1 = 0.

Neural networks, inputs (1 0) and (0 1), with input 1 (bias): o1 = -2, O1 = 0; o2 = 4, O2 = 1; y1 = 4.5, Y1 = 1.

Neural networks, input (1 1), with input 1 (bias): o1 = 2, O1 = 1; o2 = 10, O2 = 1; y1 = -4.5, Y1 = 0.
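The three worked cases above can be traced in code. This is a sketch under stated assumptions: the weight figure is not preserved in the transcript, so the weights below are inferred from the o1, o2 and y1 values on the slides (for example, o1 = -6 at input (0 0) gives the bias weight, and o1 = -2 at (1 0) then gives an input weight of 4). The step activation is likewise inferred from the o => O pairs.

```python
def step(value):
    """Threshold activation: 1 for a positive net input, otherwise 0."""
    return 1 if value > 0 else 0


# Weights inferred from the slide values (an assumption, see the lead-in).
HIDDEN = [
    {"weights": (4.0, 4.0), "bias": -6.0},  # hidden neuron 1 (o1 / O1)
    {"weights": (6.0, 6.0), "bias": -2.0},  # hidden neuron 2 (o2 / O2)
]
OUTPUT = {"weights": (-9.0, 9.0), "bias": -4.5}  # output neuron (y1 / Y1)


def net_input(neuron, inputs):
    """Weighted sum of the inputs plus the bias weight (bias input fixed at 1)."""
    return sum(w * x for w, x in zip(neuron["weights"], inputs)) + neuron["bias"]


def forward(x1, x2):
    """Return (o, O, y1, Y1) for one input pair."""
    o = [net_input(neuron, (x1, x2)) for neuron in HIDDEN]
    big_o = [step(value) for value in o]
    y1 = net_input(OUTPUT, big_o)
    return o, big_o, y1, step(y1)


for x in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    o, big_o, y1, big_y1 = forward(*x)
    print(x, "o =", o, "O =", big_o, "y1 =", y1, "Y1 =", big_y1)
```

The final outputs Y1 reproduce the XOR truth table, which a single linear unit cannot do.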

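What is going on? XOR function: 0 0 => 0; 1 0 => 1; 0 1 => 1; 1 1 => 0. (Figure: network with input 1 (bias) and outputs y1 and y2.)

What is going on? (Figure: the points (0,0), (1,0), (0,1) and (1,1) in the plane x1, x2, and the points (0,0), (1,0) and (2,2) in the plane y1, y2.)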

DEMO

Training and error reduction

Transfer of biological principles to neural network algorithms: non-linear relation between input and output; massively parallel information processing; data-driven construction of algorithms.

Neural network training. A network contains a very large set of parameters: a network with 5 hidden neurons predicting binding for 9-mer peptides has 9x20x5 = 900 weights between the input and hidden layers. Overfitting is a problem. Stop training when test performance is optimal. (Figure: Temperature plotted against years.)
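The weight count can be checked with a short sketch. The 9x20x5 product covers the connections from the 9 x 20 sparse-encoded inputs to the 5 hidden neurons. The single output neuron and the optional bias weights below are assumptions added for illustration; the slide does not state them.

```python
def count_weights(peptide_length, alphabet_size, hidden_neurons, include_bias=False):
    """Count the weights of a one-hidden-layer network with one output neuron.

    The single output neuron and the bias handling are assumptions,
    not details given on the slide.
    """
    inputs = peptide_length * alphabet_size
    input_to_hidden = inputs * hidden_neurons
    hidden_to_output = hidden_neurons
    if include_bias:
        input_to_hidden += hidden_neurons  # one bias weight per hidden neuron
        hidden_to_output += 1              # one bias weight for the output neuron
    return {
        "input_to_hidden": input_to_hidden,
        "hidden_to_output": hidden_to_output,
        "total": input_to_hidden + hidden_to_output,
    }


no_bias = count_weights(9, 20, 5)
with_bias = count_weights(9, 20, 5, include_bias=True)
print(no_bias)
print(with_bias)
```

Even this small network has far more parameters than many peptide binding data sets have measurements, which is why overfitting is a concern.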

Neural network training. Cross validation: train on 4/5 of the data and test on 1/5 => produce 5 different neural networks, each with a different prediction focus.
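A minimal sketch of the 4/5 versus 1/5 split, assuming the data can simply be dealt round-robin into five folds. The function name and the integer placeholders standing in for peptides are hypothetical, and the slides do not say how the partitions were actually made.

```python
def five_fold_partitions(items, n_folds=5):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    folds = [items[i::n_folds] for i in range(n_folds)]
    for test_index in range(n_folds):
        test = folds[test_index]
        train = [item
                 for index, fold in enumerate(folds) if index != test_index
                 for item in fold]
        yield train, test


# Placeholder data: integers standing in for peptide/measurement pairs.
data = list(range(10))
partitions = list(five_fold_partitions(data))

for train, test in partitions:
    # One network would be trained per partition, giving 5 networks in total.
    print("train on", len(train), "items, test on", len(test), "items")
```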

Neural network training curve. Maximum test set performance: most capable of generalizing.
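The stopping rule can be sketched as picking the epoch where the test-set error is lowest, which is the same point as maximum test-set performance. The error values below are made up purely for illustration and are not results from the slides.

```python
def best_epoch(test_errors):
    """Index of the epoch with the lowest test-set error (first one on ties)."""
    return min(range(len(test_errors)), key=test_errors.__getitem__)


# Hypothetical curves: training error keeps falling, test error turns upward.
train_errors = [0.40, 0.28, 0.20, 0.15, 0.11, 0.08]
test_errors = [0.40, 0.30, 0.25, 0.22, 0.24, 0.28]

stop_at = best_epoch(test_errors)
print("keep the network from epoch", stop_at)
```

After that epoch the training error still decreases, but the network is fitting noise in the training data rather than improving on unseen data.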

Network training. Encoding of sequence data: sparse encoding, Blosum encoding, sequence profile encoding.

Sparse encoding of amino acid sequence windows

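Sparse encoding
Inp Neuron  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
AAcid
A           1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
R           0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
N           0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
D           0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
C           0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Q           0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
E           0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0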
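Sparse encoding gives each amino acid its own input neuron, so a residue becomes a vector of 20 values with a single 1. The sketch below assumes the amino acid order A R N D C Q E G H I L K M F P S T W Y V used in these slides; the example peptide is hypothetical.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # assumed ordering of the 20 input neurons


def sparse_encode_residue(amino_acid):
    """Return 20 values with a single 1 at the position of the amino acid."""
    if amino_acid not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return [1 if amino_acid == candidate else 0 for candidate in AMINO_ACIDS]


def sparse_encode_peptide(peptide):
    """Concatenate the residue encodings: a 9-mer gives 9 x 20 = 180 inputs."""
    encoded = []
    for amino_acid in peptide:
        encoded.extend(sparse_encode_residue(amino_acid))
    return encoded


peptide = "ALAKAAAAM"  # hypothetical 9-mer, for illustration only
inputs = sparse_encode_peptide(peptide)
print(len(inputs), "inputs,", sum(inputs), "of them set to 1")
```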

BLOSUM encoding (Blosum50 matrix): each amino acid is encoded by its row of the 20 x 20 matrix, with rows and columns in the order A R N D C Q E G H I L K M F P S T W Y V.

Sequence encoding (continued). Sparse encoding of V and L: V·L = 0 (unrelated). Blosum encoding of V and L: V·L = 0.88 (highly related); V·R = (close to unrelated).
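The comparison rests on a dot product between the encoded vectors of two amino acids. The sketch below shows it for sparse encoding only. Dividing by the vector lengths is an assumption made here so that identical vectors score 1; the slides do not say how the Blosum figure of 0.88 was scaled, and no Blosum50 rows are included because they are not preserved in the transcript.

```python
import math

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # assumed ordering of the 20 input neurons


def sparse_vector(amino_acid):
    """Sparse (one-hot) encoding of a single amino acid."""
    return [1.0 if amino_acid == candidate else 0.0 for candidate in AMINO_ACIDS]


def normalized_dot(u, v):
    """Dot product divided by both vector lengths (normalization is an assumption)."""
    length = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / length


v_val = sparse_vector("V")
v_leu = sparse_vector("L")

print("sparse V.L =", normalized_dot(v_val, v_leu))  # different residues share no 1
print("sparse V.V =", normalized_dot(v_val, v_val))
```

With sparse encoding every pair of different amino acids scores 0, so the network receives no hint that V and L are chemically similar. Passing rows of a substitution matrix to the same function instead would give graded similarities.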

Applications of artificial neural networks: speech recognition; prediction of protein secondary structure; prediction of signal peptides; post-translational modifications (glycosylation, phosphorylation); proteasomal cleavage; MHC:peptide binding.

Higher order sequence correlations. Neural networks can learn higher order correlations! What does this mean? S S => 0; L S => 1; S L => 1; L L => 0 (S = small, L = large amino acid). Say that the peptide needs one and only one large amino acid in the positions P3 and P4 to fill the binding cleft. How would you formulate this to test if a peptide can bind? => XOR function.
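The rule "one and only one large amino acid in P3 and P4" can be written directly as an exclusive-or. In the sketch below, the set of "large" amino acids and the example peptides are hypothetical choices for illustration; the slides do not define which residues count as large.

```python
# Hypothetical set of "large" amino acids (an assumption, not from the slides).
LARGE = set("FWYRKHLIM")


def can_bind(peptide):
    """True when exactly one of positions P3 and P4 holds a large amino acid."""
    if len(peptide) < 4:
        raise ValueError("peptide must have at least 4 residues")
    p3_large = peptide[2] in LARGE  # P3 is the third residue (index 2)
    p4_large = peptide[3] in LARGE  # P4 is the fourth residue (index 3)
    return p3_large != p4_large     # exclusive-or of the two conditions


# Hypothetical peptides covering the four S/L combinations at P3 and P4.
examples = ["AAGSAAAAA", "AAFGAAAAA", "AAGWAAAAA", "AAFWAAAAA"]
results = {peptide: can_bind(peptide) for peptide in examples}
for peptide, binds in results.items():
    print(peptide, "=>", int(binds))
```

No position-specific additive score for P3 and P4 can express this rule, for the same reason that no linear function separates the XOR points.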

What have we learned? Neural networks are not as bad as their reputation. Neural networks can deal with higher order correlations. Be careful when training a neural network: always use cross-validated training.