Machine Learning Methods of Protein Secondary Structure Prediction Presented by Chao Wang.



What is secondary structure? How is secondary structure prediction evaluated? How does secondary structure prediction affect the accuracy of tertiary structure prediction? Our perspective: "elite"

What is secondary structure?

Hydrogen bond: a non-covalent bond. A hydrogen bond is identified if its energy E is less than -0.5 kcal/mol.
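The energy criterion can be illustrated with the electrostatic model that DSSP uses (Kabsch and Sander, 1983). The sketch below assumes that this is the equation the slide referred to. The function names and the coordinate layout (3D points in Å for the backbone C, O, N and H atoms) are illustrative choices, not part of the presentation.

```python
import math

# DSSP electrostatic model (Kabsch & Sander, 1983):
#   E = q1 * q2 * f * (1/r_ON + 1/r_CH - 1/r_OH - 1/r_CN)   [kcal/mol]
# with partial charges q1 = 0.42e, q2 = 0.20e and dimensional factor f = 332.
Q1 = 0.42
Q2 = 0.20
F = 332.0
ENERGY_CUTOFF = -0.5  # kcal/mol, the threshold quoted on the slide


def hbond_energy(c, o, n, h):
    """Energy between a C=O group and an N-H group; atoms are (x, y, z) in angstroms."""
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)


def is_hbond(c, o, n, h):
    """True if the pair satisfies the E < -0.5 kcal/mol criterion."""
    return hbond_energy(c, o, n, h) < ENERGY_CUTOFF


# Hypothetical, idealised collinear geometry: C=O ... H-N with an O...H distance of 1.9 A.
c, o, h, n = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0), (3.13, 0.0, 0.0), (4.13, 0.0, 0.0)
print(round(hbond_energy(c, o, n, h), 2), is_hbond(c, o, n, h))
```

Because the energy falls off with distance, the same two groups placed about 10 Å apart give an energy close to zero and are not counted as bonded.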

8-state annotation by DSSP
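For comparison with 3-state predictors, the eight DSSP states are usually collapsed into helix, strand and coil. The sketch below uses one widely used reduction (H, G, I to helix; E, B to strand; everything else to coil). Other reductions exist, and the presentation does not say which one it assumes, so the mapping here is an assumption.

```python
# DSSP 8-state codes: H (alpha helix), G (3-10 helix), I (pi helix),
# E (extended strand), B (isolated bridge), T (turn), S (bend),
# and blank / "-" / "L" / "C" for everything else (loop or irregular).
DSSP8_TO_3 = {
    "H": "H", "G": "H", "I": "H",   # helix-like states
    "E": "E", "B": "E",             # strand-like states
    "T": "C", "S": "C",             # turn and bend count as coil
    "-": "C", " ": "C", "L": "C", "C": "C",
}


def reduce_to_3_state(ss8):
    """Collapse an 8-state DSSP string into a 3-state (H/E/C) string."""
    return "".join(DSSP8_TO_3[state] for state in ss8)


print(reduce_to_3_state("HHGGEEBTS-"))
```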

Prediction: Early methods of secondary-structure prediction were restricted to predicting the three predominant states: helix, sheet, or random coil. These methods were based on the helix- or sheet-forming propensities of individual amino acids, sometimes coupled with rules for estimating the free energy of forming secondary structure elements. Such methods were typically ~60% accurate in predicting which of the three states (helix/sheet/coil) a residue adopts.
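Accuracy figures such as the ~60% above are usually reported as per-residue 3-state accuracy, often called Q3: the fraction of residues whose predicted state matches the observed one. A minimal sketch, assuming both strings use the H/E/C alphabet and are aligned residue by residue (the function name is illustrative):

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 8-residue example with one mismatch at the last position.
print(q3_accuracy("HHHEECCC", "HHHEECCH"))
```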

A significant increase in accuracy (to nearly 80%) was made by exploiting multiple sequence alignment; knowing the full distribution of amino acids that occur at a position (and in its vicinity, typically ~7 residues on either side) throughout evolution provides a much better picture of the structural tendencies near that position. For illustration, a given protein might have a glycine at a given position, which by itself might suggest a random coil there. However, multiple sequence alignment might reveal that helix-favoring amino acids occur at that position (and nearby positions) in 95% of homologous proteins spanning nearly a billion years of evolution. Moreover, by examining the average hydrophobicity at that and nearby positions, the same alignment might also suggest a pattern of residue solvent accessibility consistent with an α-helix. Taken together, these factors would suggest that the glycine of the original protein adopts α-helical structure, rather than random coil. Several types of methods are used to combine all the available data to form a 3-state prediction, including neural networks, hidden Markov models and support vector machines. Modern prediction methods also provide a confidence score for their predictions at every position.
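The idea of using the amino acid distribution at a position and in its vicinity can be sketched as follows: build per-column frequencies from an alignment, then concatenate the columns in a window of 7 residues on either side to form the input for a classifier. This is an illustrative simplification; real predictors typically use weighted profiles such as PSI-BLAST position-specific scoring matrices, and the function names and toy alignment are assumptions.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(msa):
    """Per-column amino acid frequencies for equal-length aligned sequences.

    Gaps ("-") and any non-standard symbols are ignored when counting.
    """
    profile = []
    for j in range(len(msa[0])):
        counts = Counter(seq[j] for seq in msa if seq[j] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append([counts[a] / total if total else 0.0 for a in AMINO_ACIDS])
    return profile


def window_features(profile, i, half_width=7):
    """Concatenate profile columns i-half_width .. i+half_width, zero-padded at the ends."""
    padding = [0.0] * len(AMINO_ACIDS)
    features = []
    for j in range(i - half_width, i + half_width + 1):
        features.extend(profile[j] if 0 <= j < len(profile) else padding)
    return features


# Toy alignment: the query has G in the middle column, most homologs have A.
msa = ["AGA", "AAA", "A-A", "AAC"]
profile = column_frequencies(msa)
print(profile[1][AMINO_ACIDS.index("A")], len(window_features(profile, 1)))
```

With the default window, each residue is described by 15 columns of 20 frequencies, which is the kind of fixed-length vector a neural network or support vector machine can take as input.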

Outline: CNF model by Jinbo. Multi-step learning model by Yaoqi. Iterative deep learning model by Yaoqi. Our perspective: Elite. –A new experiment to detect how elite affects secondary structure prediction.

Methods –How to model the probability –Feature selection
Results –vs. other methods –Improvement

Protein 8-class secondary structure prediction using conditional neural fields. Zhiyong Wang, Feng Zhao, Jian Peng, and Jinbo Xu. Proteomics, 2011.

Model

Training & Prediction

Features

Training/testing set

Results: the CNF model outperforms SSpro8 on each state.

Effect of the regularization factor: accuracy is insensitive to it, and is optimal when the factor is set to 9.

Effect of Neff (the number of effective sequence homologs): for SS prediction, it may not be the best strategy to use evolutionary information from as many homologs as possible. Instead, when many sequence homologs are available, we should use a subset of them to build the sequence profile.
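One common way to quantify how many effective homologs an alignment contains is the exponential of the average per-column Shannon entropy. The sketch below uses that definition. Whether it matches the exact computation behind the Neff values on the slide is an assumption, and the function name and toy alignments are illustrative.

```python
import math
from collections import Counter


def neff(msa):
    """Exponential of the mean per-column Shannon entropy (natural log).

    Ranges from 1.0 for an alignment of identical sequences up to 20.0
    when every column is uniform over the 20 amino acids. Gaps are ignored.
    """
    n_cols = len(msa[0])
    total_entropy = 0.0
    for j in range(n_cols):
        counts = Counter(seq[j] for seq in msa if seq[j] != "-")
        n = sum(counts.values())
        if n == 0:
            continue  # an all-gap column contributes zero entropy
        total_entropy -= sum((c / n) * math.log(c / n) for c in counts.values())
    return math.exp(total_entropy / n_cols)


print(neff(["AAA", "AAA"]), neff(["AC", "CA"]))
```

A score like this could be used to decide when an alignment is already diverse enough that adding further homologs is unlikely to help the profile.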

J Comput Chem. 2012