Hydrogen bonds in Rosetta: a phenomenological study. Jack Snoeyink, Dept. of Computer Science, UNC Chapel Hill.

Presentation transcript:

Hydrogen bonds in Rosetta: a phenomenological study. Jack Snoeyink, Dept. of Computer Science, UNC Chapel Hill

Key points
- My biases
- Hydrogen bonds in Rosetta
  - Structure-derived potential of KMB03
  - Existing definition/scoring
  - Comparing natives & decoys
  - Proposed recategorization
- Bad smells in code
- Open questions

Phenomenology defined:
- Movement originated by E. Husserl in 1905
- A philosophy based on the premise that reality consists of objects and events as they are perceived or understood in human consciousness and not of anything independent of human consciousness.


Structure-derived potential KMB03
- Energy from observed structures: distance dependence for helix

Structure-derived potential KMB03
- Energy from observed structures: statistically derived energies…

Structure-derived potential KMB03
- Energy from observed structures: as implemented in Rosetta…
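A common way to turn observed structures into "statistically derived energies" is the inverse Boltzmann relation, E = -ln(P_obs / P_ref), applied per histogram bin. The sketch below illustrates that general technique only. The slides do not say which reference distribution or smoothing KMB03 or Rosetta actually uses, so the uniform reference, the pseudocount, and the counts are all assumptions made for illustration.

```python
import math


def statistical_energy(observed_counts, reference_counts, pseudocount=1.0):
    """Per-bin energy (in units of kT) from E = -ln(P_obs / P_ref).

    The pseudocount is an assumed smoothing choice that keeps empty bins
    from producing log(0); it is not taken from the slides.
    """
    if len(observed_counts) != len(reference_counts):
        raise ValueError("histograms must have the same number of bins")
    n_bins = len(observed_counts)
    obs_total = sum(observed_counts) + pseudocount * n_bins
    ref_total = sum(reference_counts) + pseudocount * n_bins
    energies = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = (obs + pseudocount) / obs_total
        p_ref = (ref + pseudocount) / ref_total
        energies.append(-math.log(p_obs / p_ref))
    return energies


# Made-up counts for four AH-distance bins (illustrative only).
observed = [5, 120, 60, 15]
reference = [50, 50, 50, 50]   # assumed uniform reference state
energies = statistical_energy(observed, reference)
```

Bins that are populated more often than the reference come out with negative (favorable) energy, and rarely observed bins come out positive.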

Three tasks in Hbond scoring
- Identify pairs of atoms that Hbond
- Classify Hbond types
- Evaluate energies for Hbonds
Rosetta++ mixes these tasks together…

Three tasks in Hbond scoring: as described in KMB03
- Identify pairs of atoms that Hbond
  - Params: AH distance, θ, ψ
- Classify Hbond types
  - BB: helix, strand, other; AH distance
  - SS, BS, SB: acceptor hybridization; AH dist
- Evaluate energies for Hbonds
  - Sum three potentials on AH distance, θ, χ, ψ
  - Amino acid weights
  - Residue neighbors for donor/acceptor

Three tasks in Hbond scoring: as implemented in Rosetta++
- Identify pairs of atoms that Hbond
  - Params: AH distance, θ, ψ
- Classify Hbond types
  - BB: separation short |sep|≤4; long range
  - SS, BS, SB: acceptor hybridization; AH dist
- Evaluate energies for Hbonds
  - Sum three potentials on AH distance, θ, ψ
  - Amino acid weights OR
  - Residue neighbors for donor/acceptor
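The three tasks can be kept apart in code, as in the minimal sketch below. It is not Rosetta++ code: the `Candidate` record, the distance cutoff, and the quadratic `toy_well` potentials are hypothetical placeholders. Only the |sep| ≤ 4 short-range rule, the acceptor-hybridization classes, and the "sum three potentials" form come from the slides.

```python
from dataclasses import dataclass

SHORT_RANGE_MAX_SEP = 4  # from the slide: backbone |sep| <= 4 is short range
MAX_AH_DISTANCE = 3.0    # hypothetical identification cutoff, in angstroms


@dataclass
class Candidate:
    donor_res: int          # residue index of the donor
    acceptor_res: int       # residue index of the acceptor
    backbone: bool          # True for a backbone-backbone (BB) pair
    acceptor_hybrid: str    # "sp2", "sp3", or "ring"; used for SS, BS, SB
    ah_distance: float      # acceptor-hydrogen distance
    theta: float            # degrees
    psi: float              # degrees


def identify(candidates):
    """Task 1: keep only the pairs that count as hydrogen bonds."""
    return [c for c in candidates if c.ah_distance <= MAX_AH_DISTANCE]


def classify(c):
    """Task 2: assign a bond type without computing any energy."""
    if c.backbone:
        sep = abs(c.donor_res - c.acceptor_res)
        return "bb_short" if sep <= SHORT_RANGE_MAX_SEP else "bb_long"
    return c.acceptor_hybrid


def evaluate(c, potentials):
    """Task 3: sum three one-dimensional potentials for the bond's type."""
    e_dist, e_theta, e_psi = potentials[classify(c)]
    return e_dist(c.ah_distance) + e_theta(c.theta) + e_psi(c.psi)


def toy_well(center, width):
    """Placeholder potential: -1 at `center`, rising quadratically."""
    return lambda x: ((x - center) / width) ** 2 - 1.0


# The same made-up terms are reused for every type, purely for illustration.
TOY_TERMS = (toy_well(1.9, 0.3), toy_well(180.0, 40.0), toy_well(120.0, 40.0))
POTENTIALS = {t: TOY_TERMS for t in ("bb_short", "bb_long", "sp2", "sp3", "ring")}

pairs = [
    Candidate(10, 14, True, "sp2", 1.9, 180.0, 120.0),
    Candidate(10, 40, True, "sp2", 5.0, 150.0, 100.0),   # too far: rejected
    Candidate(22, 57, False, "sp3", 2.0, 160.0, 110.0),
]
bonds = identify(pairs)
labels = [classify(b) for b in bonds]
total = sum(evaluate(b, POTENTIALS) for b in bonds)
```

With the tasks separated this way, a change to the classification scheme touches only `classify` and the keys of the potentials table.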


SS bonds: native & decoy
[Figure panels: sp2 (ED, QN, bb); sp3 (TS, Y); ring (H); dist, θ]
- Natives: Dunbrack set of 3157 structures
  - some pdb errors
- Decoys: Best 20 for each of Rhiju’s ab initio runs on 62 structures
  - small proteins
  - few parallel beta strands
- Rosetta places Hs & determines Hbonds
- Filter energies < -0.1
- Visualization: Tufte’s small multiples
- Normalization
  - Express counts as fraction of all Hbonds to support comparison of colors in each plot
  - Plot with common x axis; scale y to max height
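The filtering and normalization steps listed above can be sketched as follows. The record layout (`cls`, `energy`, `ah_distance`), the bin range, and the sample values are assumptions made for illustration. The -0.1 energy filter and the "fraction of all Hbonds" normalization come from the slide. Reading "scale y to max height" as "set each plot's y limit to its tallest bar" is one interpretation.

```python
ENERGY_CUTOFF = -0.1  # from the slide: keep bonds with energy < -0.1


def filter_bonds(bonds, cutoff=ENERGY_CUTOFF):
    """Keep only bonds whose energy is below the cutoff."""
    return [b for b in bonds if b["energy"] < cutoff]


def fraction_histograms(bonds, n_bins, lo, hi, key="ah_distance"):
    """Per-class histograms whose bars are fractions of ALL retained bonds.

    Dividing by the grand total (not the per-class total) keeps the classes
    within one plot directly comparable.
    """
    total = len(bonds)
    width = (hi - lo) / n_bins
    hists = {}
    for b in bonds:
        x = b[key]
        if not (lo <= x < hi):
            continue  # outside the plotted range
        idx = min(n_bins - 1, int((x - lo) / width))
        hists.setdefault(b["cls"], [0.0] * n_bins)[idx] += 1.0 / total
    return hists


def y_limit(hists):
    """Tallest bar in the plot, used to scale the y axis."""
    return max((max(h) for h in hists.values()), default=0.0)


# Made-up records (illustrative only).
raw = [
    {"cls": "sp2", "energy": -0.50, "ah_distance": 1.90},
    {"cls": "sp2", "energy": -0.05, "ah_distance": 2.00},  # fails the filter
    {"cls": "sp3", "energy": -0.30, "ah_distance": 2.10},
    {"cls": "sp2", "energy": -0.20, "ah_distance": 1.95},
]
kept = filter_bonds(raw)
hists = fraction_histograms(kept, n_bins=4, lo=1.5, hi=2.5)
top = y_limit(hists)
```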

Energy distribution of bonds involving a sidechain atom before/after filtering

Number (and percentage) of bonds under the existing classification
Columns: Counts (Native, Decoys); Percentage (Native, Decoys). Values are listed in the order they survive in the transcript.

BB   Helix (+/-4)     185,204   38,
BB   Turn (+/-3)       79,110    8,
BB   Other            150,945   19,
S    sp2 ED QN bb     132,522    6,
S    sp3 TS Y          23,641    2,
S    ring H             2,
     TOTALS           573,747   75,

Observations
- Rosetta does well at optimizing what it is told.
- Decoy distributions are more sharply peaked than natives.
- Relax preserves more non-helix bonds than ab initio, but produces the same shapes for the parameter distributions. To test changes, it suffices to run relax.

SS bonds: native & decoy
[Figure panels: sp2 (ED, QN, bb); sp3 (TS, Y); ring (H); dist, θ]

SS, BS, SB bonds: native & decoy
[Figure panels: sp2 (ED, QN, bb); sp3 (TS, Y); ring (H); dist, θ]

AH Distance NB: donor effects small # omit C bimodal H R & QNacc

Theta A-H-D angle NB: small #s width R N E &N H

Psi AHD angle NB: R E & EDacc

Chi A2-A torsion NB: Polar & charged prefs

__ H-D torsion NB: Polar & charged prefs
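The geometric parameters on the preceding slides (AH distance, the A-H-D angle, and the torsions) can all be computed from atom coordinates with three small functions. The sketch below uses only the standard library. The coordinates are made up, and taking psi as the angle at the acceptor (A2-A-H) is an assumption, since the slide label is ambiguous.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def distance(p, q):
    """Euclidean distance between two points."""
    return norm(sub(p, q))


def angle(p, q, r):
    """Angle p-q-r at the middle atom q, in degrees."""
    u, v = sub(p, q), sub(r, q)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def torsion(p0, p1, p2, p3):
    """Signed dihedral about the p1-p2 axis, in degrees (atan2 form)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n2 = cross(b2, b3)
    y = norm(b2) * dot(b1, n2)
    x = dot(cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates: donor D, hydrogen H, acceptor A, acceptor base A2.
D, H, A, A2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.5, 1.0, 0.0)
ah_distance = distance(A, H)
theta = angle(A, H, D)    # A-H-D angle, measured at the hydrogen
psi = angle(A2, A, H)     # assumed definition: angle at the acceptor
```

The `atan2` form of the torsion avoids the sign ambiguity and numerical trouble of an `acos`-based formula near 0 and 180 degrees.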

Three tasks in Hbond scoring: proposed changes
- Identify pairs of atoms that Hbond
  - Params: AH distance, θ, ψ
- Classify Hbond types
  - BB: finer separation (Beta?)
  - SS, BS, SB: finer don/acc chemical types
- Evaluate energies for Hbonds: options
  1. Sum three potentials on AH distance, θ, ψ
  2. Potential on three variables AH distance, θ, ψ
  3. Add neighbors
  4. Add a torsion as 4th or 5th variable
  - Weights for tuning different terms
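Options 1 and 2 differ in whether the three variables are scored independently or jointly. The sketch below contrasts the two with binned lookup tables. The bin ranges, table values, and weights are hypothetical. The point is only that a joint table can reproduce the sum of three one-dimensional terms, and can also hold couplings between variables that a sum cannot express.

```python
# Hypothetical binning: (low, high, number of bins) for each variable.
AXES = {"dist": (1.4, 3.0, 4), "theta": (90.0, 180.0, 3), "psi": (60.0, 180.0, 3)}
ORDER = ("dist", "theta", "psi")


def bin_index(x, lo, hi, n_bins):
    """Index of the bin containing x, clamped to the table range."""
    idx = int((x - lo) / (hi - lo) * n_bins)
    return max(0, min(n_bins - 1, idx))


def bins(geom):
    return tuple(bin_index(geom[name], *AXES[name]) for name in ORDER)


def energy_sum_1d(geom, tables, weights):
    """Option 1: weighted sum of three independent 1D potentials."""
    i, j, k = bins(geom)
    return (weights["dist"] * tables["dist"][i]
            + weights["theta"] * tables["theta"][j]
            + weights["psi"] * tables["psi"][k])


def energy_joint_3d(geom, table3d, weight=1.0):
    """Option 2: one potential looked up on all three variables at once."""
    i, j, k = bins(geom)
    return weight * table3d[i][j][k]


# Made-up 1D tables (illustrative only).
TABLES = {
    "dist": [0.5, -1.0, -0.4, 0.0],
    "theta": [0.3, -0.2, -0.8],
    "psi": [0.1, -0.5, -0.3],
}
UNIT = {"dist": 1.0, "theta": 1.0, "psi": 1.0}

# A joint table built from the sums reproduces option 1 exactly.
SEPARABLE = [[[d + t + p for p in TABLES["psi"]] for t in TABLES["theta"]]
             for d in TABLES["dist"]]

geom = {"dist": 1.9, "theta": 165.0, "psi": 120.0}
e1 = energy_sum_1d(geom, TABLES, UNIT)
e2 = energy_joint_3d(geom, SEPARABLE)
```

The trade-off is data: a joint table has many more bins to populate from the same set of observed bonds, so each bin is estimated from fewer examples.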

Backbone bonds AH distance

Backbone bonds theta

Parallel vs Anti-parallel beta
The standard figures are misleading; parallel and anti-parallel form similar distance, θ, ψ distributions.

Backbone bonds psi

Backbone bonds chi torsion

Backbone bonds AH-DD2 torsion

Refactoring Hbonds
- Recategorizing should eliminate long-range & short-range Hbonds, which are used outside of hbonds.cc; they shouldn’t need to be.
- Duplicated code in minimizers needs to be brought back into hbonds.cc.

Refactoring
- In code, a function should do one thing well.
- When a function you work with is doing too many things, split it.
- Duplicated code indicates that something is designed wrong.
- Avoid magic numbers.
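A small before/after sketch of these guidelines follows. It is a hypothetical illustration, not code from hbonds.cc: the function names and the `(sep, energy)` tuples are invented, and the literals -0.1 and 4 are borrowed from the earlier slides only to have something concrete to name.

```python
# Before: one function filters, classifies, and tallies, and the
# literals -0.1 and 4 appear with no explanation.
def tally_before(bonds):
    out = {}
    for sep, energy in bonds:
        if energy < -0.1:
            key = "short" if abs(sep) <= 4 else "long"
            out[key] = out.get(key, 0) + 1
    return out


# After: each literal has a name, and each function does one thing.
ENERGY_FILTER = -0.1       # bonds weaker than this are ignored
SHORT_RANGE_MAX_SEP = 4    # largest |sequence separation| counted as short


def passes_filter(energy):
    return energy < ENERGY_FILTER


def range_class(sep):
    return "short" if abs(sep) <= SHORT_RANGE_MAX_SEP else "long"


def tally_after(bonds):
    out = {}
    for sep, energy in bonds:
        if passes_filter(energy):
            key = range_class(sep)
            out[key] = out.get(key, 0) + 1
    return out


# Made-up (sequence separation, energy) pairs.
sample = [(4, -0.5), (-3, -0.2), (12, -0.9), (7, -0.05), (-20, -0.3)]
before = tally_before(sample)
after = tally_after(sample)
```

Because `passes_filter` and `range_class` live in one place, other modules can call them instead of copying the comparison, which is the duplication the slides warn about.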