Protein Folding & Biospectroscopy Lecture 6 F14PFB David Robinson.


Protein Folding
1. Introduction
2. Protein Structure
3. Interactions
4. Protein Folding Models
5. Biomolecular Modelling
6. Bioinformatics

Homology Modelling
- Based on the observation that “Similar sequences exhibit similar structures”
- Known structure is used as a template to model an unknown (but likely similar) structure with known sequence
- First applied in the late 1970s using early computer imaging methods (Tom Blundell)

Homology Modelling
- Offers a method to “Predict” the 3D structure of proteins for which it is not possible to obtain X-ray or NMR data
- Can be used in understanding function, activity, specificity, etc.
- Of interest to drug companies wishing to do structure-aided drug design

Terminology
- target: sequence of the protein with unknown 3D structure
- template: sequence of a protein with known 3D structure (usually determined experimentally)
- alignment: rearrangement of subsets of a sequence to find maximum similarities
- rigid bodies: conserved α-helices & β-sheets
- sidechains: decoration on the backbone (very important for interactions); in practice, the residues attached to the rigid bodies
- loops: connect the core regions (α-helices & β-sheets)
- spatial restraints: bond lengths, bond angles, etc.
- PDB: Protein Data Bank

Homology Modelling
1. Identify homologous sequences in PDB
2. Align query sequence with homologues
3. Find Structurally Conserved Regions (SCRs)
4. Identify Structurally Variable Regions (SVRs)
5. Generate coordinates for core region
6. Generate coordinates for loops
7. Add side chains (check rotamer library)
8. Refine structure using energy minimization
9. Validate structure

Step 1: ID Homologues in PDB
[Figure: the query sequence is compared against the sequences stored in the PDB; the matching entries are returned as Hit #1 and Hit #2]

Step 2: Align Sequences
[Figure: dynamic programming alignment of GENETICS against GENESIS, shown as an identity matrix and a sum matrix that is filled in step by step; matrix values were not preserved]

Step 2: Align Sequences
Query   ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEG
Hit #1  ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEA
Hit #2  MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAA

Alignment
- Key step in Homology Modelling
- Global (Needleman-Wunsch) alignment is absolutely required
- Small error in alignment can lead to big error in structural model
- Multiple alignments are usually better than pairwise alignments
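To make the dynamic programming idea behind global (Needleman-Wunsch) alignment concrete, here is a minimal Python sketch. The scoring values are assumptions chosen for illustration (10 per identical pair, echoing the 10s and 20s visible in the slide matrices, 0 for a mismatch, and a linear gap penalty of -5); real modelling tools use substitution matrices and more elaborate gap penalties.

```python
def needleman_wunsch(a, b, match=10, mismatch=0, gap=-5):
    """Globally align strings a and b.

    Returns (score, aligned_a, aligned_b). The scoring parameters are
    illustrative assumptions, not values taken from the lecture.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the sum matrix: each cell keeps the best of three moves.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align the two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("GENETICS", "GENESIS")
print(total)
print(top)
print(bottom)
```

Because the alignment is global, every residue of both sequences appears in the output, which is why a misplaced gap propagates directly into the structural model.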

Alignment Thresholds

Step 3: Find SCR’s
Query   ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEG
Hit #1  ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEA
Hit #2  MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAA
        HHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCBBBBBBBBB
[Figure: structures of Hit #1 and Hit #2, with SCR #1 and SCR #2 marked on the alignment]

Structurally Conserved Regions (SCR’s)
- Correspond to the most stable regions (usually interior) of protein
- Correspond to sequence regions with lowest level of gapping, highest level of sequence conservation
- Usually correspond to secondary structures

Step 4: Find SVR’s
Query   ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEG
Hit #1  ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEA
Hit #2  MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAA
        HHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCBBBBBBBBB
[Figure: structures of Hit #1 and Hit #2, with the SVR (loop) marked on the alignment]

Structurally Variable Regions (SVR’s)
- Correspond to the least stable or most flexible regions (usually exterior) of protein
- Correspond to sequence regions with highest level of gapping, lowest level of sequence conservation
- Usually correspond to loops and turns
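The gapping criterion for separating SCRs from SVRs can be sketched in a few lines of Python. This is only an illustration: it treats any sufficiently long gap-free block of the slide alignment as an SCR candidate and everything else as an SVR. The function names and the minimum block length of 5 columns are assumptions, and sequence conservation and secondary structure, which the slides also mention, are ignored here.

```python
# The three-sequence alignment shown in the slides.
ALIGNMENT = [
    "ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEG",  # Query
    "ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEA",  # Hit #1
    "MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAA",  # Hit #2
]


def gap_free_blocks(rows, min_len=5):
    """Return (start, end) column ranges, end exclusive, without any gap.

    min_len is a hypothetical threshold: shorter gap-free stretches are
    not treated as structurally conserved.
    """
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all alignment rows must have the same length")
    blocks, start = [], None
    for col in range(length + 1):
        # The extra iteration at col == length closes a block at the end.
        gapped = col == length or any(r[col] == "-" for r in rows)
        if not gapped and start is None:
            start = col
        elif gapped and start is not None:
            if col - start >= min_len:
                blocks.append((start, col))
            start = None
    return blocks


def complement(blocks, length):
    """Column ranges not covered by blocks: the SVR candidates."""
    rest, prev = [], 0
    for start, end in blocks:
        if start > prev:
            rest.append((prev, start))
        prev = end
    if prev < length:
        rest.append((prev, length))
    return rest


scr = gap_free_blocks(ALIGNMENT)
svr = complement(scr, len(ALIGNMENT[0]))
print("SCR candidates:", scr)
print("SVR candidates:", svr)
```

With these assumed settings the sketch reports two gap-free blocks, one at each end of the alignment, separated by a single gapped stretch in the middle.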

Step 5: Generate Coordinates
[Figure: PDB ATOM records of the template shown beside the records being generated for the query; coordinate columns were not preserved]
Template, atoms 1-10: N, CA, C, O, CB, OG of SER; N, CA, C, O of ASP
Query, atoms 1-10: N, CA, C, O, CB of ALA (the template SER OG is still listed as atom 6); N, CA, C, O of GLU

Step 5: Generate Core Coordinates
- For identical amino acids, transfer all atom coordinates (XYZ) to query protein
- For similar amino acids, transfer backbone coordinates & replace side chain atoms while respecting χ angles
- For different amino acids, transfer only the backbone coordinates (XYZ) to query sequence
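The three transfer rules can be written as a small decision function. The Python sketch below is hypothetical: the residue similarity groups are an assumption made for illustration (the slides do not define "similar"), and the function only decides which template atoms are copied rather than building any coordinates.

```python
BACKBONE = {"N", "CA", "C", "O"}

# Hypothetical similarity groups (one-letter codes), for illustration only.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("ST"),    # hydroxyl
    set("NQ"),    # amide
]


def transfer_plan(template_res, query_res, template_atoms):
    """Return (atom names whose XYZ are copied, side-chain handling)."""
    backbone = [name for name in template_atoms if name in BACKBONE]
    if template_res == query_res:
        # Identical residue: copy every atom.
        return list(template_atoms), "copied"
    if any(template_res in g and query_res in g for g in SIMILAR_GROUPS):
        # Similar residue: copy the backbone, rebuild the side chain
        # so that it follows the template's chi angles.
        return backbone, "rebuilt using template chi angles"
    # Different residue: copy the backbone only.
    return backbone, "left for side-chain placement"


serine_atoms = ["N", "CA", "C", "O", "CB", "OG"]
for query in ("S", "T", "A"):
    print(query, transfer_plan("S", query, serine_atoms))
```

For the SER to ALA substitution shown in the previous slide, this sketch falls into the third case and keeps only the four backbone atoms.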

Step 6: Replace SVRs (loops)
Query   FGHQWERT
Hit #1  YAYE--KS

Loop Library
- Loops extracted from PDB using high resolution (<2 Å) X-ray structures
- Typically thousands of loops in DB
- Includes loop coordinates, sequence, # residues in loop, Cα-Cα distance, preceding 2° structure and following 2° structure (or their Cα coordinates)

Step 6: Replace SVRs (loops)
- Must match desired # residues
- Must match Cα-Cα distance (<0.5 Å)
- Must not bump into other parts of protein (no Cα-Cα distance <3.0 Å)
- Preceding and following Cα’s (3 residues) from loop should match well with corresponding Cα coordinates in template structure
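The first three selection criteria can be expressed as a simple filter over candidate loops. The Python sketch below works on Cα coordinates only and makes two assumptions: the "<0.5 Å" criterion is read as the allowed difference between the loop's end-to-end Cα-Cα distance and the gap it has to span, and the function and parameter names are invented for illustration. The fourth criterion (fit of the flanking residues) is not covered.

```python
import math


def loop_is_acceptable(loop_ca, n_needed, anchor_distance, protein_ca,
                       dist_tol=0.5, bump_cutoff=3.0):
    """Check one candidate loop against the slide's first three criteria.

    loop_ca         -- list of (x, y, z) C-alpha coordinates of the loop
    n_needed        -- number of residues the loop must contain
    anchor_distance -- C-alpha to C-alpha distance the loop must span (angstrom)
    protein_ca      -- C-alpha coordinates of the rest of the model
    """
    # 1. Must match the desired number of residues.
    if len(loop_ca) != n_needed:
        return False
    # 2. End-to-end distance must match the gap to within dist_tol.
    end_to_end = math.dist(loop_ca[0], loop_ca[-1])
    if abs(end_to_end - anchor_distance) >= dist_tol:
        return False
    # 3. Bump check: no loop C-alpha closer than bump_cutoff to the protein.
    return all(math.dist(p, q) >= bump_cutoff
               for p in loop_ca for q in protein_ca)


candidate = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
print(loop_is_acceptable(candidate, 3, 7.4, [(0.0, 10.0, 0.0)]))
```

In a real loop library search this filter would be applied to every stored loop, and the survivors would then be ranked by how well their flanking residues superpose on the template.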

Step 6: Replace SVRs (loops)
- Loop placement and positioning is done using superposition algorithm
- Loop fits are evaluated using RMSD calculations and standard “bump checking”
- If no “good” loop is found, some algorithms create loops using randomly generated φ/ψ angles
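The RMSD used to score a loop fit is the root-mean-square distance between matched atoms. The short Python sketch below assumes the two coordinate sets have already been superposed and are listed in the same atom order; the superposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Assumes the structures are already superposed and that coords_a[i]
    corresponds to coords_b[i].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


loop_ends = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
template_ends = [(0.2, 0.0, 0.0), (3.8, 0.3, 0.0), (7.5, 0.0, 0.1)]
print(round(rmsd(loop_ends, template_ends), 3))
```

A lower value means the loop's flanking Cα atoms sit closer to their counterparts in the template.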

Step 7: Add Side Chains

Amino Acid Side Chains
[Figure: chemical structure of an amino acid; labels C, COOH, H2N, H, NH3+]

Step 7: Add Side Chains
- Done primarily for SVRs (not SCRs)
- Rotamer placement and positioning is done via a superposition algorithm using rotamers taken from a standardized library (Trial & Error)
- Rotamer fits are evaluated using simple “bump checking” methods

Step 8: Energy Minimization

Energy Minimization
- Removes atomic overlaps and unnatural strains in the structure
- Stabilizes or reinforces strong hydrogen bonds, breaks weak ones
- Brings protein to a nearby energy minimum in about 1-2 minutes of CPU time
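How minimization relieves an atomic overlap can be illustrated with a toy example. The Python sketch below applies steepest descent to the separation of two particles interacting through a Lennard-Jones potential; the potential form, the reduced units (ε = σ = 1), the step size and the tolerance are all assumptions made for illustration. A real protein force field has many more terms and many more coordinates.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy at separation r (reduced units)."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def lj_gradient(r, epsilon=1.0, sigma=1.0):
    """Derivative dE/dr of the Lennard-Jones energy."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * s6 * s6 + 6.0 * s6) / r


def steepest_descent(r_start, step=0.01, tol=1e-8, max_steps=10000):
    """Move r downhill along -dE/dr until the gradient is below tol."""
    r = r_start
    for _ in range(max_steps):
        g = lj_gradient(r)
        if abs(g) < tol:
            break
        r -= step * g  # step against the gradient
    return r


# Start from an overlapping pair (r = 1.0) and from a stretched pair (r = 1.5).
for start in (1.0, 1.5):
    r_min = steepest_descent(start)
    print(start, "->", round(r_min, 4), "energy", round(lj_energy(r_min), 4))
```

Both starting points relax towards the same separation, r = 2^(1/6) σ, where the repulsive and attractive terms balance. In a full protein the same downhill procedure only reaches the minimum nearest to the starting model.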

The Final Result
[Figure: the actual structure shown beside the modelled structure]

Summary
1. Identify homologous sequences in PDB
2. Align query sequence with homologues
3. Find Structurally Conserved Regions (SCRs)
4. Identify Structurally Variable Regions (SVRs)
5. Generate coordinates for core region
6. Generate coordinates for loops
7. Add side chains (check rotamer library)
8. Refine structure using energy minimization
9. Validate structure

(a) Sketch the chemical structure corresponding to the atoms listed in the excerpt from the Protein Data Bank (PDB) file below. Add the missing side chain atoms for Ala. Indicate the position of the peptide bond on your sketch.
ATOM 1 N SER A
ATOM 2 CA SER A
ATOM 3 C SER A
ATOM 4 O SER A
ATOM 5 CB SER A
ATOM 6 OG SER A
ATOM 7 N ALA A
ATOM 8 CA ALA A
ATOM 9 C ALA A
ATOM 10 O ALA A
ATOM 11 CB ALA A
(b) Indicate on your sketch in part (a) the N-terminus and C-terminus, and the main-chain dihedral angles φ, ψ and ω.
(c) Place the following main-chain bond lengths in order of decreasing length (longest first): Cα-C, C-O, C-N and N-Cα. Explain the physical reason for this ordering.
(d) Sketch a graph showing the variation of energy versus separation of two particles for a van der Waals interaction.