Protein Prediction with Neural Networks! Chris Alvino CS152 Fall ’06 Prof. Keller
Introduction Proteins are made from amino acids. Polar forces interact, producing a crazy combinatorial explosion of possible folds. Just how crazy?
Real Crazy Using crude workload estimates, a petaflop/second-capacity machine would need an estimated THREE YEARS to simulate 100 MICROSECONDS of protein folding.
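The slide's figures are crude estimates, but the scale is easy to sanity-check. Assuming the machine sustains 10^15 flop/s continuously (a sustained-throughput assumption not stated in the original), the implied cost is on the order of 10^27 operations per simulated second of folding:

```python
# Back-of-the-envelope arithmetic for the slide's estimate.
# Assumption (not from the slide): the machine sustains 1e15 flop/s.
SUSTAINED_FLOPS = 1e15                          # petaflop/second machine

wall_seconds = 3 * 365 * 24 * 3600              # three years of wall-clock time
total_flops = SUSTAINED_FLOPS * wall_seconds    # ~9.5e22 operations in total

simulated_seconds = 100e-6                      # 100 microseconds of folding
cost_per_sim_second = total_flops / simulated_seconds

print(f"~{cost_per_sim_second:.1e} flops per simulated second")
```

Roughly 10^27 flops per simulated second, which is why direct physical simulation was (and largely still is) out of reach.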
Why Neural Nets? Not so crazy. Relatively accurate results: 70-80% accuracy. Patterns learned can lead to useful biological data. Used to quickly check existing databases.
Early Methods: Black Box Approach "Protein Folding Analysis by an Artificial Neural Network Approach." Authors: R. Sacile and C. Ruggiero. Published 1993.
Early Methods: Black Box Approach Standard backpropagation algorithm.
Early Methods: Black Box Approach 3 layers: Input = window of 13 amino acids; Hidden layer = 20 neurons; Output layer = 3 classes (alpha, beta, coil).
Early Methods: Black Box Approach 7 training sets, each consisting of around 1500 residues (amino acids). Training took 3-4 hours.
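A minimal NumPy sketch of the window-based network described above. The layer sizes (13-residue input window, 20 hidden neurons, 3 output classes) come from the slides; the one-hot encoding, sigmoid activation, and random weights are illustrative assumptions, not details from the 1993 paper:

```python
import numpy as np

rng = np.random.default_rng(0)

WINDOW = 13    # input window of 13 amino acids (from the slide)
N_AA = 20      # 20 standard amino acids, one-hot encoded (assumed encoding)
HIDDEN = 20    # hidden layer of 20 neurons (from the slide)
CLASSES = 3    # outputs: alpha helix, beta sheet, coil (from the slide)

# Randomly initialized weights stand in for a trained network.
W1 = rng.normal(0.0, 0.1, (WINDOW * N_AA, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (HIDDEN, CLASSES))
b2 = np.zeros(CLASSES)

def predict(window_indices):
    """Map a window of 13 residue indices (0..19) to class probabilities
    for the residue at the window's center."""
    x = np.zeros((WINDOW, N_AA))
    x[np.arange(WINDOW), window_indices] = 1.0        # one-hot each position
    h = 1.0 / (1.0 + np.exp(-(x.ravel() @ W1 + b1)))  # sigmoid hidden layer
    z = h @ W2 + b2
    p = np.exp(z - z.max())
    return p / p.sum()                                # softmax: alpha/beta/coil

probs = predict(rng.integers(0, N_AA, size=WINDOW))
```

Backpropagation training would slide this window along the protein sequence and update W1/W2 against the known secondary-structure label at each center residue.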
Results
Artificial Neural Networks and Hidden Markov Models for Predicting the Protein Structures: The Secondary Structure Prediction in Caspases Thimmappa S. Anekonda (2002)
Current State of the Art Neural Networks and Hidden Markov Models.
Hidden Markov what? Hidden Markov models (HMMs), originally developed for other applications such as speech recognition, are generative, probabilistic models of sequential information. An observed sequence is modeled as being the stochastic result of an underlying unobserved random walk through the hidden states of the model. The parameters of an HMM are the transition probabilities between the hidden states and the symbol emission probabilities from each hidden state.
State transitions in a hidden Markov model (example): x = hidden states, y = observable outputs, a = transition probabilities, b = output probabilities.
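The parameters named above can be made concrete with a toy two-state HMM. The states, symbols, and probability values below are invented for illustration; the forward algorithm then computes P(observed sequence | model) from exactly the pieces the slide names (initial distribution, transitions a, emissions b):

```python
import numpy as np

# Toy HMM: 2 hidden states (say, "helix" and "coil") emitting 2 symbols.
# All probability values here are made up for illustration.
pi = np.array([0.5, 0.5])                 # initial state distribution
A  = np.array([[0.9, 0.1],                # a: transition probabilities
               [0.2, 0.8]])               #    between hidden states x
B  = np.array([[0.7, 0.3],                # b: emission probabilities
               [0.1, 0.9]])               #    of observable outputs y

def forward(obs):
    """Forward algorithm: total probability of an observed symbol sequence,
    summed over all hidden-state paths that could have produced it."""
    alpha = pi * B[:, obs[0]]             # initialize with the first emission
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]     # propagate states, then emit
    return alpha.sum()

p = forward([0, 1, 1])
```

Summing `forward` over every possible symbol sequence of a fixed length gives exactly 1, which is a handy correctness check for the recursion.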
Caspases, the friendly Ghost Caspases are a family of intracellular cysteine endopeptidases. They play a key role in inflammation and mammalian apoptosis or programmed cell death.
Clash of the Titans PHDSec: utilizes evolutionary information. PSIPRED: uses iterated PSI-BLAST profiles as input instead of multiple sequence alignments like PHDSec. SAM-T02: uses ANN and HMM. PROF King: uses seven GOR-based predictions and ANN.