Bioinformatics 2 -- lecture 9


Bioinformatics 2 -- lecture 9: Ramachandran angles, sidechain chi angles, rotamers, Dead End Elimination Theorem

Backbone angles phi and psi. In 1968, G.N. Ramachandran built a model like this, ala-ala-ala, to explore the relationship between interatomic distances and the two freely rotatable backbone angles, phi and psi. Atom pairs that came too close together were not permissible. Which angles were permissible?

Ramachandran Plot. This plot is for all amino acids except Pro and Gly; its regions are labeled "Best" and "Allowed". To view all residues in your protein plotted phi versus psi, use SEQ:Measure-->Ramachandran Plot. The regions labeled alpha and beta represent valleys of stability, surrounded by a high-energy plateau. Values of phi are limited primarily to the range between -60 degrees and -150 degrees. For psi, the values are limited to regions centered about -60 degrees and +120 degrees.

Backbone angle statistics. Colors represent the frequency (in bins of 10°x10°) of phi/psi angles. E, B and H are most common. L, l and x are found most often in Gly. Allowed regions are islands. Are bonds really "freely rotatable"?
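The 10°x10° binning behind a frequency plot like this can be sketched in a few lines. This is only an illustration: the function names `bin_index` and `phi_psi_histogram` and the sample angles are hypothetical and do not come from the lecture or from any real structure.

```python
from collections import Counter

def bin_index(angle_deg, width=10):
    """Bin number for an angle in degrees; -180 maps to bin 0."""
    return int(((angle_deg + 180.0) % 360.0) // width)

def phi_psi_histogram(pairs, width=10):
    """Count (phi, psi) pairs falling in each width x width bin."""
    return Counter((bin_index(phi, width), bin_index(psi, width))
                   for phi, psi in pairs)

# Hypothetical angles, for illustration only (not from a real structure).
sample = [(-55, -45), (-57, -47), (-120, 130)]
hist = phi_psi_histogram(sample)
for (phi_bin, psi_bin), count in sorted(hist.items()):
    print(phi_bin, psi_bin, count)
```

Because angles are periodic, +180 and -180 are placed in the same bin here; a real plot would usually also normalize the counts.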

Sidechain angle space -- rotamers. A random sampling of phenylalanine sidechains, when superimposed, falls into three classes: rotamers. This simplifies the problem of sidechain modeling: all we have to do is select the right rotamers and we're close to the right answer.

Sidechain modeling. Given a backbone conformation and the sequence, can we predict the sidechain conformations? Energy calculations are sensitive to small changes, so the wrong sidechain conformation will give the wrong energy.

Goal of sidechain modeling. Given the sequence and only the backbone atom coordinates, accurately model the positions of the sidechains. [Figure: fine lines = true structure; thick lines = sidechain predictions using the method of Desmet et al.] Desmet et al., Nature v.356, pp.539-542 (1992)

Steric interactions determine allowed rotamers. 3-bond or 1-4 interactions define the preferred angles, but these may differ greatly in energy depending on the atom groups involved. [Figure: three conformations about the CA-CB bond, with atoms H, O=C, N, CA, CB and CG labeled.] The three staggered rotamers are named "m" (-60°, gauche), "p" (+60°, gauche) and "t" (180°, anti/trans).

Exercise: measure a rotamer. Create a tripeptide TWV using Protein Builder. Now create "meters" for the chi1 and chi2 angles using Dihedral (from the right side menu). With the Trp atoms numbered N=1, CA=2, CB=3, CG=4, CD1=5 as in the figure, select N-CA-CB-CG (1-2-3-4) for chi1 and CA-CB-CG-CD1 (2-3-4-5) for chi2.
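What such a dihedral "meter" reports can be reproduced from four atom positions. The sketch below is a plain-Python illustration, not MOE's implementation; the function name `dihedral` and the test coordinates are made up for the example. The sign follows the usual convention in which a clockwise rotation, viewed along the central bond, is positive.

```python
import math

def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees defined by four points.

    For chi1, pass the coordinates of N, CA, CB, CG in that order.
    """
    def sub(a, b):
        return tuple(x - y for x, y in zip(a, b))

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)    # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Idealized coordinates (hypothetical): central bond along z.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

The same function gives phi and psi when it is fed the corresponding backbone atoms.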

Trp sidechain is hard to rotate. Rotamers of W* (name: chi1, chi2): p-90: +60, -90; p90: +60, +90; t-105: 180, -105; t90: 180, 90; m0: -65, 5; m95: -65, 95. The W sidechain is shown here lying over the Thr backbone. Rendering the molecule as space filling (Render-->Space filling) allows you to better visualize the contacts.
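To decide which of these Trp rotamers a measured (chi1, chi2) pair is closest to, one simple option is the Euclidean distance in angle space, taking periodicity into account. This is a sketch under that assumption; the names `TRP_ROTAMERS`, `angle_diff` and `nearest_rotamer` are hypothetical, and only the six angle pairs come from the slide.

```python
import math

# (chi1, chi2) in degrees, copied from the Trp rotamer list on this slide.
TRP_ROTAMERS = {
    "p-90": (60, -90),
    "p90": (60, 90),
    "t-105": (180, -105),
    "t90": (180, 90),
    "m0": (-65, 5),
    "m95": (-65, 95),
}

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)

def nearest_rotamer(chi1, chi2, library=TRP_ROTAMERS):
    """Return (name, distance in degrees) of the closest library rotamer."""
    def dist(name):
        r1, r2 = library[name]
        return math.hypot(angle_diff(chi1, r1), angle_diff(chi2, r2))
    best = min(library, key=dist)
    return best, dist(best)

name, distance = nearest_rotamer(-70.0, 100.0)
print(name, round(distance, 1))
```

How large a distance still counts as "close" is a judgment call that this sketch does not settle.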

Rotamer Libraries. Rotamer libraries have been compiled by clustering the sidechains of each amino acid over the whole database. Each cluster corresponds to a representative conformation (or rotamer), which is represented in the library by the "centroid" sidechain angles (chi angles) for that cluster. Two commonly used rotamer libraries: Jane & David Richardson*: http://kinemage.biochem.duke.edu/databases/rotamer.php; Roland Dunbrack: http://dunbrack.fccc.edu/bbdep/index.php. *The rotamers of W on the previous slide are from the Richardson library.

Exploring Rotamers using MOE. The environment of a buried leucine in 1A07. The interior of a protein is tightly packed. Bad packing produces voids or collisions.

Exercise: Rotamer explorer. Open 1A07 from the Protein Database. Edit-->Add hydrogens. Compute-->Partial charges. Select an amino acid in the interior. SE: Edit-->Rotamer Explorer (get from MOE). Select the rotamer with the lowest energy. Are the current chi angles close to the angles of a rotamer? How close? Is it the lowest energy rotamer? Select “Mutate”. (The coordinates are permanently changed.)

Exercise: Rotamer explorer. Select an amino acid on the surface. SE: Edit-->Rotamer Explorer (get from MOE). Are the current angles close to a rotamer? Is it the lowest energy rotamer? What interactions does the best rotamer have? Mutate. Then select a nearby sidechain and do the same thing. How many times would you have to mutate before you could be sure of having the lowest energy rotamer set?

Dead end elimination theorem. There is a global minimum energy conformation (GMEC), in which each residue has a unique rotamer. In other words, the GMEC is the set of rotamers that has the lowest energy. Energy is pairwise: the total energy can be broken down into pairwise interactions. Each atom is either fixed (backbone) or movable (sidechain), which gives three kinds of interaction. Fixed-fixed: E is a constant, = Etemplate. Fixed-movable: E depends on the rotamer, but is independent of other rotamers. Movable-movable: E depends on the rotamer and on the surrounding rotamers.

Theoretical complexity of sidechain modeling. The Global Minimum Energy Configuration (GMEC) is one unique set of rotamers. How many possible sets of rotamers are there? $n_1 \times n_2 \times n_3 \times n_4 \times n_5 \times \dots \times n_L$, where $n_1$ is the number of rotamers for residue 1, and so on. Estimated complexity for a protein of 100 residues, with an average of 5 rotamers per position: $5^{100} \approx 8 \times 10^{69}$. DEE reduces the complexity of the problem from $5^L$ to approximately $(5L)^2$.
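The arithmetic on this slide is easy to check directly. The snippet below is only a numerical illustration; `search_space_size` is a made-up helper name, and the quadratic figure is the slide's own estimate for DEE, not something the code derives.

```python
import math

def search_space_size(rotamer_counts):
    """Number of rotamer combinations: n1 * n2 * ... * nL."""
    return math.prod(rotamer_counts)

L = 100   # residues
n = 5     # average number of rotamers per position

exhaustive = search_space_size([n] * L)   # 5**100 combinations
dee_estimate = (n * L) ** 2               # the (5L)^2 estimate quoted for DEE

print(f"exhaustive search: {float(exhaustive):.1e}")
print(f"DEE estimate:      {dee_estimate}")
```

Passing a list of per-residue counts, rather than a single average, gives the exact product for a real protein.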

Dead end elimination theorem. Each residue is numbered ($i$ or $j$) and each residue has a set of rotamers ($r$, $s$ or $t$). So the notation $i_r$ means "choose rotamer $r$ for position $i$". The total energy is the sum of the three components, fixed-fixed, fixed-movable and movable-movable, in that order: $E_{global} = E_{template} + \sum_i E(i_r) + \sum_i \sum_{j>i} E(i_r, j_s)$, where $r$ and $s$ are any choice of rotamers. NOTE: $E_{global} \ge E_{GMEC}$ for any choice of rotamers.
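The energy sum above can be written out and, for a very small problem, minimized by brute force. This is a sketch under assumed data structures: `e_self[i][r]` holds $E(i_r)$ and `e_pair[(i, j)][r][s]` holds $E(i_r, j_s)$ for $i < j$. All names and all numbers are hypothetical.

```python
from itertools import product

def total_energy(choice, e_template, e_self, e_pair):
    """E_global for one rotamer choice; choice[i] is the rotamer at position i."""
    energy = e_template                           # fixed-fixed
    for i, r in enumerate(choice):
        energy += e_self[i][r]                    # fixed-movable
    for (i, j), table in e_pair.items():
        energy += table[choice[i]][choice[j]]     # movable-movable
    return energy

def brute_force_gmec(e_template, e_self, e_pair):
    """Try every combination of rotamers and return the lowest-energy one."""
    ranges = [range(len(row)) for row in e_self]
    return min(product(*ranges),
               key=lambda c: total_energy(c, e_template, e_self, e_pair))

# Hypothetical toy problem: two residues with two rotamers each.
E_TEMPLATE = -10
E_SELF = [[0, 1], [2, 0]]
E_PAIR = {(0, 1): [[5, 0], [0, 0]]}

gmec = brute_force_gmec(E_TEMPLATE, E_SELF, E_PAIR)
print(gmec, total_energy(gmec, E_TEMPLATE, E_SELF, E_PAIR))
```

Enumerating every combination is only feasible for toy cases, which is the motivation for the elimination criterion.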

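Dead end elimination theorem. If $i_g$ is in the GMEC and $i_t$ is not, then we can separate the terms that contain $i_g$ or $i_t$ and re-write the inequality. $E_{GMEC} = E_{template} + E(i_g) + \sum_{j \ne i} E(i_g, j_g) + \sum_{j \ne i} E(j_g) + \sum_{j \ne i} \sum_{k > j, k \ne i} E(j_g, k_g)$ ...is less than... $E_{notGMEC} = E_{template} + E(i_t) + \sum_{j \ne i} E(i_t, j_g) + \sum_{j \ne i} E(j_g) + \sum_{j \ne i} \sum_{k > j, k \ne i} E(j_g, k_g)$. Canceling the terms common to both sides, we get: $E(i_t) + \sum_{j \ne i} E(i_t, j_g) > E(i_g) + \sum_{j \ne i} E(i_g, j_g)$. We do not know the GMEC rotamers $j_g$ in advance, but taking the minimum over $s$ gives a lower bound on each pair term and taking the maximum gives an upper bound. So, if we find two rotamers $i_r$ and $i_t$, and: $E(i_r) + \sum_{j \ne i} \min_s E(i_r, j_s) > E(i_t) + \sum_{j \ne i} \max_s E(i_t, j_s)$, then $i_r$ has a higher energy than $i_t$ whatever the other residues do, so $i_r$ cannot possibly be in the GMEC.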

Dead end elimination theorem E(ir) + j mins E(irjs) > E(it) + j maxs E(it,js) This can be translated into plain English as follows: If the "worst case scenario" for rotamer t is better than the "best case scenario" for rotamer r, then you can eliminate r.

Exercise: Dead End Elimination. Using the DEE worksheet: (1) Find a rotamer that satisfies the DEE theorem. (2) Eliminate it. (3) Repeat until each residue has only one rotamer. What is the final GMEC energy?

DEE exercise. Three sidechains (1, 2 and 3), each with three rotamers (a, b and c). Therefore, there are 3x3x3=27 ways to arrange the sidechains. • Each rotamer has an energy E(r), which is the non-bonded energy between sidechain and template. • Each pair of rotamers has an interaction energy E(r1,r2), which is the non-bonded energy between sidechains.

DEE exercise. [Worksheet: pairwise energies E(r1,r2) between rotamers a, b and c of residues 1, 2 and 3, together with the self energies E(r1) and E(r2).]

DEE exercise: instructions. If the “best case scenario” for rotamer r1 is worse than the “worst case scenario” for another rotamer r2 at the same position, you can eliminate r1. (1) The best (worst) energies are found using the worksheet: add E(r1) to the sum, over the other positions, of the lowest (highest) pair energies E(r1,r2) among the rotamers that have not been previously eliminated. (2) There are 9 possible DEE comparisons to make: 1a versus 1b, 1a versus 1c, 1b versus 1c, 2a versus 2b, etc. For each comparison, find the minimum and maximum energy choices of the other rotamers. If the maximum energy of one rotamer is less than the minimum energy of the other, eliminate the other. (3) Scratch out the eliminated rotamer and repeat until one rotamer per position remains.
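The worksheet procedure can be automated as a loop that keeps eliminating rotamers until nothing changes. This is a minimal sketch, not the worksheet itself: the energies below are hypothetical values chosen so that the loop converges, and the function names are made up. In general DEE may stop with several rotamers left at some positions, in which case the remaining combinations still have to be searched.

```python
def pair_energy(e_pair, i, r, j, s):
    """E(i_r, j_s); each table is stored once under the key (low, high)."""
    return e_pair[(i, j)][r][s] if i < j else e_pair[(j, i)][s][r]

def bounds(i, r, e_self, e_pair, alive):
    """Best-case and worst-case energy of rotamer r at position i."""
    best = worst = e_self[i][r]
    for j, rotamers in enumerate(alive):
        if j == i:
            continue
        terms = [pair_energy(e_pair, i, r, j, s) for s in rotamers]
        best += min(terms)
        worst += max(terms)
    return best, worst

def dee(e_self, e_pair):
    """Repeat the elimination step until no rotamer can be removed."""
    alive = [list(range(len(row))) for row in e_self]
    changed = True
    while changed:
        changed = False
        for i in range(len(alive)):
            for r in list(alive[i]):           # copy: alive[i] shrinks below
                best_r, _ = bounds(i, r, e_self, e_pair, alive)
                for t in alive[i]:
                    if t != r and best_r > bounds(i, t, e_self, e_pair, alive)[1]:
                        alive[i].remove(r)     # r is a dead end
                        changed = True
                        break
    return alive

# Hypothetical energies: three residues, rotamers a, b, c = indices 0, 1, 2.
E_SELF = [[0, 4, 9], [3, 0, 8], [5, 6, 0]]
E_PAIR = {
    (0, 1): [[0, -1, 1], [1, 0, 2], [0, 1, 0]],
    (0, 2): [[1, 0, -1], [0, 2, 1], [2, 0, 0]],
    (1, 2): [[0, 1, 0], [1, 0, -1], [2, 1, 0]],
}

print(dee(E_SELF, E_PAIR))
```

Each pass recomputes the bounds using only the surviving rotamers, which is why a rotamer that could not be eliminated at first may become eliminable later.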

Sequence design using DEE. Did you notice that Rotamer Explorer in MOE allows you to choose a different sidechain? Choosing an amino acid for each position, based on the backbone structure and the energy function, is called Protein Design.