Lecture 11, CS567: Secondary Structure Prediction


Secondary Structure Prediction

Progressive improvement:
– Chou-Fasman rules
– Qian-Sejnowski
– PHD (Burkhard Rost)
– Riis-Krogh

Chou-Fasman rules:
– Based on statistical analysis of residue frequencies in the different kinds of secondary structure
– Useful, but of limited accuracy
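The propensity idea can be made concrete with a toy sketch in Python. The propensity values and the single-threshold rule below are placeholders for illustration only; the actual Chou-Fasman method uses the published propensity tables together with nucleation and extension rules.

```python
# Toy Chou-Fasman-style rule of thumb. The numbers below are placeholders
# for illustration, NOT the published Chou-Fasman parameters.
HELIX_PROPENSITY = {"A": 1.4, "E": 1.5, "L": 1.2, "M": 1.1, "Q": 1.1,
                    "K": 1.1, "G": 0.6, "P": 0.6, "S": 0.8, "T": 0.8}

def mean_helix_propensity(window, propensity=HELIX_PROPENSITY):
    """Average helix propensity over a window; residues without a listed
    value are treated as neutral (1.0)."""
    values = [propensity.get(aa, 1.0) for aa in window]
    return sum(values) / len(values)

# Rule of thumb: a window whose average propensity exceeds 1 is a helix candidate.
window = "AELMQK"
print(window, "helix candidate?", mean_helix_propensity(window) > 1.0)
```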

Qian-Sejnowski

Pioneering NN approach
Input: 13 contiguous amino acid residues
Output: prediction of the secondary structure of the central residue
Architecture:
– Fully connected MLP
– Orthogonal encoding of the input
– Single hidden layer with 40 units
– 3-neuron output layer
Training:
– Initial weights between -0.3 and 0.3
– Backpropagation with the LMS (steepest descent) algorithm
– Output: helix xor sheet xor coil (winner-take-all)
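A minimal sketch of this setup in Python/NumPy, using the window size, hidden-layer size, output size, and initial weight range given above. The encoding details (a 21-symbol alphabet with a spacer for window positions past the chain ends) and everything else not stated on the slide are assumptions, and training is omitted.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY-"   # 20 residues + a spacer symbol (assumption)
WINDOW = 13                              # residues per input window (slide)
HIDDEN = 40                              # hidden units (slide)
CLASSES = 3                              # helix, sheet, coil

def encode_window(window):
    """Orthogonal (one-hot) encoding: a 13 x 21 sparse matrix, flattened."""
    x = np.zeros((WINDOW, len(AMINO_ACIDS)))
    for i, aa in enumerate(window):
        x[i, AMINO_ACIDS.index(aa)] = 1.0
    return x.ravel()

rng = np.random.default_rng(0)
# Initial weights drawn uniformly from [-0.3, 0.3], as on the slide.
W1 = rng.uniform(-0.3, 0.3, (HIDDEN, WINDOW * len(AMINO_ACIDS)))
W2 = rng.uniform(-0.3, 0.3, (CLASSES, HIDDEN))

def predict(window):
    """Forward pass only; winner-take-all over the 3 output neurons."""
    x = encode_window(window)
    h = 1.0 / (1.0 + np.exp(-(W1 @ x)))      # sigmoid hidden layer
    o = 1.0 / (1.0 + np.exp(-(W2 @ h)))      # sigmoid output layer
    return "HEC"[int(np.argmax(o))]           # helix / sheet (E) / coil

print(predict("RPLQGLVLDTQLY"))   # class of the central (7th) residue
```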

Qian-Sejnowski

Performance
– Dramatic improvement over Chou-Fasman
– Assessment: Q3 = 62.7% (proportion of correct predictions); correlation coefficient (Eq 6.1)
– The correlation coefficient is the better parameter because:
  » It considers all of TP, FP, TN and FN
  » A chi-squared test can be used to assess significance
– Cα = 0.35; Cβ = 0.29; Cc = 0.38
Refinement
– Outputs used as inputs to a second network (13 × 3 inputs, otherwise identical)
– Q3 = 64.3%; Cα = 0.41; Cβ = 0.31; Cc = 0.41
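The correlation coefficient referred to here (Eq 6.1 of the text) is written out on the final slide of this lecture as (TP·TN - FP·FN) over the square root of the product of the marginals; a direct transcription, with purely hypothetical counts:

```python
import math

def correlation_coefficient(tp, fp, tn, fn):
    """Per-class correlation coefficient: uses all of TP, FP, TN and FN,
    unlike the plain proportion correct (Q3)."""
    denom = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts for one class (e.g. helix vs. not-helix).
print(round(correlation_coefficient(tp=320, fp=180, tn=700, fn=200), 2))
```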

PHD (Rost-Sander)

Drawbacks of the QS method:
– Large number of parameters (10^4, versus 2 × 10^4 training examples) leads to overfitting
– Theoretical limit on accuracy using only the sequence per se as input: ~68%
Key aspect of PHD: use evolutionary information
– Go beyond the single sequence by using information from similar sequences, via multiple sequence alignments (enhances the signal-to-noise ratio; "look for more swallows before declaring summer")
– Prediction in the context of conservation (similar residues) within families of proteins
– Prediction in the context of the whole protein

PHD (Rost-Sander)

Find proteins similar to the input protein
Construct a multiple sequence alignment
Use a frequentist approach to assess position-wise conservation
Include the extra information (similarity) in the network input:
– Position-wise conservation weight (real-valued)
– Insertion (Boolean); deletion (Boolean)
Overfitting minimized by:
– Early stopping, and
– A jury of heterogeneous networks for prediction
Performance
– Q3 = 69.7%; Cα = 0.58; Cβ = 0.5; Cc = 0.5
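A rough sketch of how one alignment column might be turned into the extra network inputs listed above (residue frequencies, a conservation weight, and insertion/deletion flags). The max-frequency conservation weight and all of the values are illustrative assumptions, not the actual PHD encoding.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_features(column, has_insertion=False, has_deletion=False):
    """One MSA column -> 20 residue frequencies + a conservation weight
    + two Boolean indicator inputs (insertion, deletion)."""
    residues = [c for c in column if c in AMINO_ACIDS]   # ignore gap characters
    counts = Counter(residues)
    n = max(len(residues), 1)
    freqs = [counts.get(aa, 0) / n for aa in AMINO_ACIDS]
    conservation = max(freqs)   # simple position-wise conservation weight (assumption)
    return freqs + [conservation, float(has_insertion), float(has_deletion)]

# Column of an alignment of 5 similar sequences at one window position.
print(column_features("LLLIV", has_insertion=False, has_deletion=True))
```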

PHD input (Fig 6.2)

PHD architecture (Fig 6.2)

Riis-Krogh NN

Drawbacks of PHD:
– Large input layer
– Network globally optimized for all 3 classes; scope remains for optimizing with respect to each predicted class
Key aspects of RK:
– Use local encoding with weight sharing to minimize the number of parameters
– A different network for prediction of each class

RK architecture (Fig 6.3)

RK architecture (Fig 6.4)

Riis-Krogh NN

Find proteins similar to the input protein
Construct a multiple sequence alignment
Use a frequentist approach to assess position-wise conservation
BUT first predict the structure of each sequence separately, followed by integration based on conservation weights

Riis-Krogh NN

Architecture
– Local encoding (sketched below):
  » Each amino acid represented by an analog, real-valued code (a "real correlation", not an algebraic one)
  » Weight sharing to minimize parameters
  » Extra hidden layer as part of the input
– For the helix-prediction network, sparse connectivity based on the known helix periodicity
– Ensembles of networks differing in architecture (number of hidden units) used for prediction
– A second, integrative network used for the final prediction
Performance
– Q3 = 71.3%; Cα = 0.59; Cβ = 0.5; Cc = 0.5
– Corresponds to the theoretical upper bound for a contiguous-window-based method
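A minimal sketch of the local-encoding-with-weight-sharing idea: one small real-valued code per amino acid, learned in a single embedding matrix that is shared by every window position, so the input-side parameter count no longer grows with window length times alphabet size times hidden units. The dimensions below are illustrative, not Riis-Krogh's.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EMBED_DIM = 3          # small real-valued code per residue (illustrative)
WINDOW = 15            # window length (illustrative)

rng = np.random.default_rng(0)
# One embedding matrix shared by every window position: 20 x 3 parameters,
# instead of a separate one-hot weight block per position.
embedding = rng.normal(0.0, 0.1, (len(AMINO_ACIDS), EMBED_DIM))

def encode_window(window):
    """Local encoding: concatenate the shared per-residue codes."""
    idx = [AMINO_ACIDS.index(aa) for aa in window]
    return embedding[idx].ravel()          # length WINDOW * EMBED_DIM

x = encode_window("ARNDCQEGHILKMFP")       # a 15-residue window
print(x.shape)                             # (45,) instead of (300,) for one-hot
```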

NN tips & tricks

Avoid overfitting (and avoid local minima):
– Use the fewest parameters possible:
  » Transform/filter the input
  » Use weight sharing
  » Consider partial connectivity
– Use a large number of training examples
– Early stopping (sketched below)
– Online learning, as opposed to batch/offline learning ("one of the few situations where noise is beneficial")
– Start with different initial values for the parameters
– Use random descent ("ascent") when needed
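A generic sketch of the early-stopping recipe (no particular network assumed): keep training while validation error improves, and stop once it has stalled for a fixed number of epochs, keeping the best weights seen.

```python
def train_with_early_stopping(train_epoch, validate, max_epochs=500, patience=10):
    """train_epoch() runs one pass of (e.g. online) learning and returns the
    current weights; validate() returns the validation error. Both are
    caller-supplied callables (assumed interface)."""
    best_err, best_state, stale = float("inf"), None, 0
    for _ in range(max_epochs):
        state = train_epoch()
        err = validate()
        if err < best_err:
            best_err, best_state, stale = err, state, 0
        else:
            stale += 1
            if stale >= patience:          # no improvement: stop before overfitting
                break
    return best_state, best_err

# Toy demonstration with a fake validation-error curve.
errors = iter([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66])
state, err = train_with_early_stopping(train_epoch=lambda: "weights",
                                       validate=lambda: next(errors),
                                       patience=3)
print(err)   # 0.6: training stops once validation error stops improving
```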

NN tips & tricks

Improving predictive performance:
– Experiment with different network configurations
– Combine networks (ensembles; see the sketch below)
– Use priors in processing the input (context information, non-contiguous information)
– Use appropriate measures of performance (e.g., the correlation coefficient for binary output)
– Use balanced training
Improving computational performance:
– Optimization methods based on second derivatives
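A small sketch of the ensemble point: average the class scores of several independently trained networks before the winner-take-all step. The stand-in "networks" below are just callables returning (helix, sheet, coil) scores for illustration.

```python
import numpy as np

def ensemble_predict(predictors, x):
    """Average class scores from several networks, then winner-take-all."""
    scores = np.mean([p(x) for p in predictors], axis=0)
    return "HEC"[int(np.argmax(scores))]

# Stand-in "networks" returning (helix, sheet, coil) scores.
nets = [lambda x: np.array([0.6, 0.3, 0.1]),
        lambda x: np.array([0.4, 0.2, 0.4]),
        lambda x: np.array([0.5, 0.1, 0.4])]
print(ensemble_predict(nets, x=None))   # -> 'H'
```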

Measures of accuracy

The vector of TP, FP, TN, FN is best, but it is not a very intuitive measure of the distance between data (target) and model (prediction), and it is restricted to binary output
Alternative: single measures (transformations of the above vector)
Proportions based on TP, FP, TN, FN:
– Sensitivity (minimize false negatives)
– Specificity (minimize false positives)
– Accuracy (minimize wrong predictions)
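The three proportions, written out from their usual confusion-matrix definitions, with hypothetical counts:

```python
def sensitivity(tp, fp, tn, fn):
    """TP / (TP + FN): high when few false negatives are made."""
    return tp / (tp + fn)

def specificity(tp, fp, tn, fn):
    """TN / (TN + FP): high when few false positives are made."""
    return tn / (tn + fp)

def accuracy(tp, fp, tn, fn):
    """(TP + TN) / all predictions: proportion of correct predictions."""
    return (tp + tn) / (tp + fp + tn + fn)

counts = dict(tp=320, fp=180, tn=700, fn=200)   # hypothetical counts
print(round(sensitivity(**counts), 2),
      round(specificity(**counts), 2),
      round(accuracy(**counts), 2))
```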

Measures of error/accuracy

L_p distances (Minkowski distances):
– L_p = (Σ_i |d_i - m_i|^p)^(1/p)
– L_1 distance = Hamming/Manhattan distance = Σ_i |d_i - m_i|
– L_2 distance = Euclidean/quadratic distance = (Σ_i |d_i - m_i|^2)^(1/2)
Pearson correlation coefficient:
– Σ_i (d_i - E[d])(m_i - E[m]) / (σ_d σ_m)
– For binary output: (TP·TN - FP·FN) / √((TP+FN)(TP+FP)(TN+FP)(TN+FN))
Relative entropy (déjà vu)
Mutual information (also déjà vu)
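The distance and correlation measures above, transcribed directly into Python (nothing beyond the formulas themselves; the data and model vectors are illustrative):

```python
import math

def lp_distance(d, m, p):
    """Minkowski L_p distance between data d and model m."""
    return sum(abs(di - mi) ** p for di, mi in zip(d, m)) ** (1.0 / p)

def pearson(d, m):
    """Pearson correlation: covariance of d and m over the product of their
    standard deviations."""
    n = len(d)
    mean_d, mean_m = sum(d) / n, sum(m) / n
    cov = sum((di - mean_d) * (mi - mean_m) for di, mi in zip(d, m))
    sd_d = math.sqrt(sum((di - mean_d) ** 2 for di in d))
    sd_m = math.sqrt(sum((mi - mean_m) ** 2 for mi in m))
    return cov / (sd_d * sd_m)

d = [1, 0, 1, 1, 0, 0]                 # target labels
m = [0.9, 0.2, 0.8, 0.6, 0.1, 0.3]     # model outputs
print(lp_distance(d, m, 1))            # L_1 (Hamming/Manhattan) distance
print(lp_distance(d, m, 2))            # L_2 (Euclidean) distance
print(round(pearson(d, m), 3))
```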