Tertiary Structure Prediction Methods

Presentation transcript:

Tertiary Structure Prediction Methods
Start from any given protein sequence and compare it with the sequences of proteins that have solved structures.
– > 35%: Homology Modeling
– < 35%: Fold Recognition or ab initio Folding
Either route then passes through structure selection and structure refinement to give the final structure.
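The branching in this flowchart can be sketched as a small function. This is only an illustration: the name `choose_method` is hypothetical, the percentage is assumed to be the match to the best solved structure, and because the slide does not say what happens at exactly 35%, the sketch sends that case to the lower branch.

```python
def choose_method(percent_match, threshold=35.0):
    """Pick a prediction route from the percent match to the best solved structure.

    The 35% threshold comes from the slide; treating exactly 35% as the
    lower branch is an assumption made for this sketch.
    """
    if not 0.0 <= percent_match <= 100.0:
        raise ValueError("percent_match must be between 0 and 100")
    if percent_match > threshold:
        return "homology modeling"
    return "fold recognition or ab initio folding"


if __name__ == "__main__":
    for value in (62.0, 35.0, 18.5):
        print(value, "->", choose_method(value))
```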

Why homology modelling?
X-ray Diffraction
– Only a small number of proteins can be made to form crystals.
– A crystal is not the protein’s native environment.
– Very time consuming.
NMR Distance Measurement
– Not all proteins are found in solution.
– This method generally looks at isolated proteins rather than protein complexes.
– Very time consuming.

Homology Modeling: Principles, tools and techniques
The development of molecular biology has made the identification, isolation and sequencing of genes rapid. Problem: obtaining the 3D structure of a protein is still a time-consuming task. An alternative strategy in structural biology is to build models of a protein when the constraints from X-ray diffraction or NMR are not yet available. Homology modeling is a method that can be applied to generate reasonable models of protein structure.

Database approach to homology modelling
As of June 2000, 12,500 protein structures had been deposited in the Protein Data Bank (PDB), while the SwissProt protein sequence database contained 86,500 sequence entries. This is roughly a 1:7 ratio: relatively few structures are known. The number of sequences will increase much faster than the number of structures because of advances in sequencing.

Sequence similarity methods
These methods can be very accurate if there is > 50% sequence similarity. They are rarely accurate if the sequence similarity is < 30%. They use methods similar to those used for sequence alignment, such as the dynamic programming algorithm, hidden Markov models, and clustering algorithms.
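As a reminder of what the dynamic programming algorithm does, here is a minimal sketch of global (Needleman-Wunsch style) alignment. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration; real tools use substitution matrices and affine gap penalties.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(global_alignment("ACGT", "AGT"))
```

The table costs O(n·m) time and memory, which is why database searches rely on heuristics such as BLAST and FASTA rather than running full dynamic programming against every entry.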

What is Homology Modeling?
Homology modeling predicts the three-dimensional structure of a given protein sequence (TARGET) based on an alignment to one or more known protein structures (TEMPLATES). If similarity between the TARGET sequence and the TEMPLATE sequence is detected, structural similarity can be assumed. In general, at least 30% sequence identity is required for generating useful models.
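Percent sequence identity can be computed directly from a TARGET-TEMPLATE alignment. The sketch below assumes one common convention (identical residues divided by the number of columns where neither sequence has a gap); other conventions divide by the alignment length or by the shorter sequence, so reported percentages are not always comparable. The function name is hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for t, s in zip(aligned_target, aligned_template):
        if t == gap or s == gap:
            continue  # gapped columns are not counted in this convention
        compared += 1
        if t == s:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    identity = percent_identity("ACDEFG", "ACDQFG")
    print(round(identity, 1), "useful model expected:", identity >= 30.0)
```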

Structural Prediction by Homology Modeling
Stages shown on the diagram:
– Structural Databases, Reference Proteins, Conserved Regions
– Protein Sequence, Predicted Conserved Regions
– Initial Model
– Structure Analysis
– Refined Model
Methods and tools shown on the diagram:
– SeqFold, Profiles-3D, PSI-BLAST, BLAST & FASTA, fold-recognition methods (FUGUE)
– Cα Matrix Matching
– Sequence Alignment
– Coordinate Assignment
– Loop Searching/generation
– WHAT IF, PROCHECK, PROSAII, ...
– Sidechain Rotamers and/or MM/MD
– MODELLER

How good can homology modeling be?
Sequence identity and what the resulting model is good for:
– > 60%: comparable to medium resolution NMR; substrate specificity
– 30-60%: molecular replacement in crystallography; support for site-directed mutagenesis through visualization
– < 30%: serious errors

Significance of Protein Structure
What does a structure offer in the way of biological knowledge?
• Location of mutants and conserved residues
• Ligand and functional sites
• Clefts/Cavities
• Evolutionary Relationships
• Mechanisms

The importance of the sequence alignment
The quality of the sequence alignment is of crucial importance. Misplaced gaps, representing insertions or deletions, will cause residues to be misplaced in space. Careful inspection and adjustment of the automatic alignment may improve the quality of the model.

Programs for Model Protein Construction
MODELLER 4.0
– guitar.rockefeller.edu/modeller/modeller.html
SWISS-MOD Server
SCWRL (SideChain placement With Rotamer Library)

Protein Structural Databases
Templates can be found by using the TARGET sequence as a query in a FASTA or BLAST search of:
– PDB
– MODELLER
– ModBase ( info.html)
– 3DCrunch

Gaining confidence in template searching
Once a suitable template is found, it is a good idea to do a literature search (PubMed) on the relevant fold to determine what biological role(s) it plays. Does this match the biological/biochemical function that you expect?

Other factors to consider in selecting templates
Template environment
– pH
– Ligands present?
Resolution of the templates
Family of proteins
– Phylogenetic tree construction can help find the subfamily closest to the target sequence
Multiple templates?

Target-Template Alignment
No current comparative modeling method can recover from an incorrect alignment.
Use multiple sequence alignments as an initial guide.
Consider slightly different alternative alignments in areas of uncertainty, and build multiple models.
Sequence-structure alignment programs
– Try to put gaps in variable regions/loops
Note: the sequence from a sequence database and the sequence in the actual PDB entry are not always identical.

Target-Multiple Template Alignment
The alignment is prepared by superimposing all template structures.
Add the target sequence to this alignment.
Compare with the multiple sequence alignment and adjust.

Adjusting the alignment
Use tools such as Joy (www-cryst.bioc.cam.ac.uk/~joy/) to view secondary structure along the alignment, and use this information as a criterion for adjustments.
Avoid gaps in secondary structure elements.
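The rule "avoid gaps in secondary structure elements" can be checked automatically before inspecting the alignment by eye. The following is a rough sketch under stated assumptions: the template's secondary structure is supplied as one letter per template residue (H for helix, E for strand, anything else for coil), and the function name and flagging rule are hypothetical, not taken from Joy or any other tool.

```python
def gaps_in_secondary_structure(aligned_target, aligned_template, template_ss, gap="-"):
    """Return alignment columns where a gap falls inside a template helix or strand."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    if sum(1 for c in aligned_template if c != gap) != len(template_ss):
        raise ValueError("template_ss needs one letter per template residue")

    # Secondary-structure letter for each column; None where the template has a gap.
    col_ss = []
    k = 0
    for c in aligned_template:
        if c == gap:
            col_ss.append(None)
        else:
            col_ss.append(template_ss[k])
            k += 1

    flagged = []
    for col, (t, s) in enumerate(zip(aligned_target, aligned_template)):
        if t == gap and s != gap:
            # Deletion in the target: the template residue sits in a helix or strand.
            if col_ss[col] in ("H", "E"):
                flagged.append(col)
        elif s == gap and t != gap:
            # Insertion in the target: nearest template residues on both sides
            # belong to the same kind of element.
            left = next((x for x in reversed(col_ss[:col]) if x is not None), None)
            right = next((x for x in col_ss[col + 1:] if x is not None), None)
            if left == right and left in ("H", "E"):
                flagged.append(col)
    return flagged


if __name__ == "__main__":
    print(gaps_in_secondary_structure("ACD-FGH", "ACDEFGH", "CHHHHCC"))
```

Flagged columns are candidates for shifting the gap into a neighbouring loop; the decision itself still needs manual inspection.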

Secondary Structure Prediction
• The Predict Protein server
• Adding secondary structure prediction algorithms can help decide whether helices should be shortened or extended in areas of poor sequence identity.
• PHD program

Constructing Multi-domain protein models
Building a multi-domain protein using templates corresponding to the individual domains:
proteinA  aaaaaaaaaaaaaa
proteinB               bbbbbbbbbbbbbbb
Target    aaaaaaaaaaaaabbbbbbbbbbbbbbb

Multiple model approach
• Reminder: consider the effects of different substitution matrices, different gap penalties, and different algorithms (Vogt et al. J. Mol. Biol. 1995, 249: ).
• Construct multiple models.
• Use structural analysis programs to determine the best model.
Jaroszewski, Pawlowski and Godzik, J. Molecular Modeling, 1998, 4:
Venclovas, Ginalski and Fidelis. PROTEINS, 1999, 3:73-80 (Suppl)

Model Building
Rigid-Body Assembly
– Assembles a model from a small number of rigid bodies obtained from aligned protein structures
– Implemented in COMPOSER
Segment Matching
Satisfaction of Spatial Restraints
– MODELLER
– guitar.rockefeller.edu/modeller/modeller.html

Initial model and procedures
• Calculate coordinates for atoms that have equivalent atoms in the templates as an average over all templates.
• CHARMM internal coordinates are used for the remaining unknown coordinates.
• Generate stereochemical and homology-derived restraints.
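The first step, averaging the coordinates of equivalent atoms over all templates, can be sketched as follows. The data layout is an assumption made for illustration: each template is a dictionary mapping an atom key (here a hypothetical `(target_residue_number, atom_name)` pair taken from the alignment) to x, y, z coordinates, and the templates are assumed to be already superimposed.

```python
def average_coordinates(templates):
    """Average x, y, z per atom key over every template that contains that atom."""
    sums = {}
    counts = {}
    for coords in templates:
        for atom, (x, y, z) in coords.items():
            sx, sy, sz = sums.get(atom, (0.0, 0.0, 0.0))
            sums[atom] = (sx + x, sy + y, sz + z)
            counts[atom] = counts.get(atom, 0) + 1
    return {atom: tuple(v / counts[atom] for v in total)
            for atom, total in sums.items()}


if __name__ == "__main__":
    template_1 = {(1, "CA"): (0.0, 0.0, 0.0), (2, "CA"): (3.8, 0.0, 0.0)}
    template_2 = {(1, "CA"): (2.0, 2.0, 2.0)}
    print(average_coordinates([template_1, template_2]))
```

Atoms that appear in no template are simply absent from the result; in the procedure described above those are the ones filled in from internal coordinates.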

Modeller
The main input is a set of restraints on the spatial structure of the amino acids (AA) and ligands to be modeled. The output is a 3D structure that satisfies these restraints. Restraints are obtained automatically from related protein structures (homology modeling), and can also come from NMR structures, secondary structure packing and other experimental data.

Spatial restraints?
• The program minimizes the objective function, F, with respect to the Cartesian coordinates of the protein atoms: $F(R) = \sum_i c_i(f_i, p_i)$
• $R$ are the Cartesian coordinates of the atoms.
• $c_i$ is a restraint that depends on $f_i$ and $p_i$.
• $f_i$ is a geometric feature of the molecule, such as a distance, angle or dihedral value.
• $p_i$ are parameters that help describe the restraint.
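To make the objective function concrete, the sketch below evaluates F(R) as a sum of restraint terms, each applied to one geometric feature. It assumes simple harmonic distance restraints purely for illustration; the actual functional forms used by MODELLER are not given on the slide, and all names here are hypothetical.

```python
import math


def distance(coords, i, j):
    """Geometric feature f: distance between atoms i and j."""
    return math.dist(coords[i], coords[j])


def harmonic(value, params):
    """Restraint c(f, p): harmonic penalty with p = (target value, force constant)."""
    target, k = params
    return 0.5 * k * (value - target) ** 2


def objective(coords, restraints):
    """F(R): sum of c_i(f_i, p_i) over all restraints."""
    total = 0.0
    for feature, atoms, restraint, params in restraints:
        total += restraint(feature(coords, *atoms), params)
    return total


if __name__ == "__main__":
    R = {"A": (0.0, 0.0, 0.0), "B": (4.0, 0.0, 0.0)}
    restraints = [(distance, ("A", "B"), harmonic, (3.8, 10.0))]
    print(objective(R, restraints))
```

A model builder would then adjust the coordinates R to lower this value, for example with a gradient-based minimizer.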

What are the restraints?
Distances, angles, dihedral angles, pairs of dihedral angles and some other spatial features defined by atoms or pseudo atoms.

Sidechain Conformation
Protein sidechains play a key role in molecular recognition and in the packing of the hydrophobic cores of globular proteins.
Protein sidechain conformations tend to exist in a limited number of canonical shapes, usually called rotamers.
Rotamer libraries can be constructed in which only 3-50 conformations are taken into account for each side chain.

Sidechains on surface of protein
Exposed sidechains on the surface can be highly flexible, without a single dominant conformation. If these solvent-exposed sidechains do not form binding interactions with other molecules and are not involved in, say, a catalytic reaction, their accuracy may not be crucial. Also look at the B-factors.
Sidechains can be refined with molecular mechanics minimization
– Sampling?
– Scoring?

Errors in Homology Modeling
a) Side chain packing
b) Distortions and shifts
c) No template

Errors in Homology Modeling
d) Misalignments
e) Incorrect template
Marti-Renom et al., Ann. Rev. Biophys. Biomol. Struct., 2000, 29:

Detection of Errors
The first check should be a stereochemical check on the modeled structure (PROCHECK, WHATCHECK, DISTAN), which will show deviations from normal bond lengths, dihedrals, etc.
Visualization: follow the backbone trace and then move out to the Cα-Cβ orientation.
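A very small part of such a check can be sketched by hand: consecutive Cα atoms joined by a trans peptide bond sit about 3.8 Å apart, so a backbone trace with very different spacings points to a chain break or a distorted region. This is not how PROCHECK, WHATCHECK or DISTAN work internally; the function name and the 0.3 Å tolerance are assumptions for illustration, and cis peptide bonds (shorter Cα-Cα distance) would also be flagged.

```python
import math


def flag_ca_distances(ca_coords, expected=3.8, tolerance=0.3):
    """Return (i, i+1, distance) for consecutive C-alpha pairs with unusual spacing."""
    flagged = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(d - expected) > tolerance:
            flagged.append((i, i + 1, d))
    return flagged


if __name__ == "__main__":
    trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 5.0, 0.0)]
    print(flag_ca_distances(trace))
```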

PROCHECK procheck/procheck/procheck.html