COMPARATIVE or HOMOLOGY MODELING


COMPARATIVE or HOMOLOGY MODELING
The aim is to build a 3-D model for a protein of unknown structure (the target) on the basis of sequence similarity to proteins of known structure (the templates).
Accuracy varies from simply identifying the correct fold to generating a high-resolution model.
Homology modeling is the most accurate protein structure prediction method, but that doesn't mean it works perfectly!
1) 3D structures of proteins in a given family are more conserved than their sequences
2) Approximately 1/3 of all sequences are recognizably related to at least one known structure
3) The number of unique protein folds is limited

Homology Modeling Stages: What we hope to learn
1) Sequence Alignment
2) Model Building
   a. Fold selection/generation
   b. Side chain positioning
   c. Loop generation
   d. Energy optimization
3) Evaluating the Model

Required Accuracy versus Application

Steps in Homology Modeling
Figure 5.1.1 from M.A. Marti-Renom and A. Sali, “Modeling Protein Structure from Its Sequence”, Current Protocols in Bioinformatics (2003), 5.1.1-5.1.32.

Steps in Homology Modeling: Crucial Steps
Figure 5.1.1 from M.A. Marti-Renom and A. Sali, “Modeling Protein Structure from Its Sequence”, Current Protocols in Bioinformatics (2003), 5.1.1-5.1.32.

Step 1: Template Selection
PDB: www.rcsb.org/pdb/

Template Search
BLAST: http://www.ncbi.nlm.nih.gov/BLAST/
FastA: http://www.ebi.ac.uk/fasta33/
SSM: http://www.ebi.ac.uk/msd-srv/ssm/
PredictProtein: http://www.predictprotein.org/
123D; SARF2; PDP: http://123d.ncifcrf.gov/
GenTHREADER: http://bioinf.cs.ucl.ac.uk/psipred/
UCLA-DOE: http://fold.doe-mbi.ucla.edu/


Step 2: Sequence Alignment
This is the most crucial step in the process.
Homology modeling cannot recover from a bad initial alignment.
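One quick sanity check on a target-template alignment is its percent sequence identity. The sketch below is a minimal illustration, not part of the original slides. It assumes the two sequences are already aligned, use "-" for gaps, and that identity is counted over columns where both sequences have a residue (other conventions for the denominator exist). The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Assumption: inputs are rows of an existing pairwise alignment,
    with '-' marking gaps.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    aligned_columns = 0
    identical_columns = 0
    for target_res, template_res in zip(aligned_target, aligned_template):
        # Skip columns where either sequence has a gap.
        if target_res == "-" or template_res == "-":
            continue
        aligned_columns += 1
        if target_res == template_res:
            identical_columns += 1

    if aligned_columns == 0:
        return 0.0
    return 100.0 * identical_columns / aligned_columns


if __name__ == "__main__":
    # Toy alignment: 5 aligned columns, 4 of them identical.
    print(percent_identity("ACDE-G", "ACDFHG"))
```

A low value here is a warning sign that the alignment, and therefore any model built from it, should be inspected by hand.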

Sequence Alignment
EMBOSS: http://www.ebi.ac.uk/emboss/align/
Tcoffee: http://www.igs.cnrs-mrs.fr/Tcoffee
ClustalW: http://www.ebi.ac.uk/clustalw/
SwissModel: http://www.expasy.org/spdbv/
BCM: http://searchlauncher.bcm.tmc.edu/multi-align/
POA: http://www.bioinformatics.ucla.edu/poa/
STAMP: http://www.ks.uiuc.edu/Research/vmd/


Homology Modeling: Threading
[Figure labels: “Scaffold” structure or template; target sequence A1, A2, A3, A4, A5, …; alignment between target(s) and scaffold(s); threading energy*]
Energy includes contributions from matches (favorable), gaps (unfavorable), and hydrogen bonds.
*R. Goldstein, Z. Luthey-Schulten, P. Wolynes (1992, PNAS); K. Koretke et al. (1996, Proteins)

Step 3: Model Building
rigid-body assembly (example: COMPOSER)
segment matching, using positions of matching atoms (example: SEGMOD)
satisfaction of spatial restraints
- distance geometry (example: Modeller)
- threading (example: SwissModel)

Step 3: Model Building
Overlap template structures and generate backbone
Generation of loops (database-based or energy-based)
Side chain generation based on known preferences
Overall model optimization (energy minimization)

Gaps in the Template: Why Modeling Loops is Difficult
Differences between the symmetry contacts in the crystals of the template and those of the real structure to be modeled.
Loops are flexible and can be distorted by neighboring residues.
The mutation of a residue to proline within the loop.
It is currently not possible to confidently model loops longer than 8 amino acids.
There are two approaches:
1) Database searches
2) Conformational searches using energy scoring functions (SwissModel)
Solvation can have a large effect on loops.
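The 8-residue limit above suggests a simple check before model building: find the stretches of the target that align to gaps in the template, since these have no template coordinates and must be built as loops. The sketch below is a minimal illustration under stated assumptions: the alignment rows use "-" for gaps, and the names `template_gaps`, `long_loops` and `MAX_CONFIDENT_LOOP` are hypothetical, not taken from any modeling package.

```python
MAX_CONFIDENT_LOOP = 8  # residues; threshold quoted on the slide


def template_gaps(aligned_target: str, aligned_template: str):
    """Return (start, length) for each run of target residues facing template gaps.

    start is a 0-based index into the ungapped target sequence.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    target_pos = 0      # position in the ungapped target sequence
    run_start = None    # start of the current run, or None if not in a run
    run_len = 0
    for target_res, template_res in zip(aligned_target, aligned_template):
        unmodeled = template_res == "-" and target_res != "-"
        if unmodeled:
            if run_start is None:
                run_start = target_pos
                run_len = 0
            run_len += 1
        elif run_start is not None:
            runs.append((run_start, run_len))
            run_start = None
        if target_res != "-":
            target_pos += 1
    if run_start is not None:
        runs.append((run_start, run_len))
    return runs


def long_loops(aligned_target, aligned_template, max_len=MAX_CONFIDENT_LOOP):
    """Runs that exceed the length that can be modeled with confidence."""
    return [(start, length)
            for start, length in template_gaps(aligned_target, aligned_template)
            if length > max_len]


if __name__ == "__main__":
    target = "AAAA" + "G" * 10 + "CC" + "GGG" + "DD"
    template = "AAAA" + "-" * 10 + "CC" + "---" + "DD"
    print(template_gaps(target, template))
    print(long_loops(target, template))
```

Any run reported by `long_loops` marks a region of the model that deserves extra scrutiny or a better template.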

Modeling Servers
SwissModel: http://swissmodel.expasy.org/SWISS-MODEL.html
Modeller: http://salilab.org
Geno3D: http://geno3d-pbil.ibcp.fr
ESyPred: http://www.fundp.ac.be/sciences/biologie/urbm/bioinfo/esypred/
3D-jigsaw: http://www.bmm.icnet.uk/servers/3djigsaw/
CPHmodels: http://www.cbs.dtu.dk/services/CPHmodels/

Side Chain Modeling: Rotamer Libraries
When we study the rotamers of residues that are conserved in different proteins of known 3D structure, we observe similar side chain orientations in more than 90% of all cases. The problem of placing side chains is thus reduced to concentrating on those residues that are not conserved in the sequence.
Two sub-problems: 1) finding potentially good rotamers, and 2) determining the best one among the candidates.
S.C. Lovell et al., “The Penultimate Rotamer Library”, Proteins: Structure, Function, and Genetics 40, 389-408 (2000).
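The two sub-problems can be sketched as a filter followed by a selection. This is a toy illustration only: the `Rotamer` record, the probability cutoff, and the scoring rule (negative log of the library probability plus a caller-supplied clash penalty) are assumptions made for the example, and the numbers in the demo are placeholders, not values from the Penultimate Rotamer Library.

```python
import math
from typing import Callable, List, NamedTuple, Tuple


class Rotamer(NamedTuple):
    name: str
    chi_angles: Tuple[float, ...]  # side chain dihedral angles in degrees
    probability: float             # library frequency, between 0 and 1


def candidate_rotamers(library: List[Rotamer],
                       min_probability: float = 0.05) -> List[Rotamer]:
    """Sub-problem 1: keep rotamers that are common enough to consider.

    min_probability must be positive so that log() below is defined.
    """
    return [r for r in library if r.probability >= min_probability]


def best_rotamer(candidates: List[Rotamer],
                 clash_penalty: Callable[[Rotamer], float]) -> Rotamer:
    """Sub-problem 2: pick the candidate with the lowest combined score.

    Score = -ln(probability) + clash_penalty(rotamer); lower is better.
    clash_penalty is a hypothetical stand-in for a real energy term.
    """
    if not candidates:
        raise ValueError("no candidate rotamers to choose from")
    return min(candidates,
               key=lambda r: -math.log(r.probability) + clash_penalty(r))


if __name__ == "__main__":
    # Placeholder library for one residue (illustrative numbers only).
    library = [
        Rotamer("a", (-60.0,), 0.60),
        Rotamer("b", (180.0,), 0.30),
        Rotamer("c", (60.0,), 0.02),
    ]
    candidates = candidate_rotamers(library)
    # Pretend rotamer "a" collides with a neighboring atom.
    choice = best_rotamer(candidates,
                          lambda r: 5.0 if r.name == "a" else 0.0)
    print(choice.name)
```

The point of the split is that the most common rotamer is not always the best one once the local environment is taken into account.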

Evaluating the Model: Looking for Statistically Unlikely Structures
Errors in side chain packing
Template distortions because of crystal packing forces
Loop generation
Misalignments
Incorrect templates

Evaluation Servers
COLORADO3D: http://genesilico.pl/
PROCHECK: http://www.biochem.ucl.ac.uk/~roman/procheck/procheck.html
VERIFY3D: http://fold.doe-mbi.ucla.edu/
PROSAII: http://www.came.sbg.ac.at/
WHATCHECK: http://swift.cmbi.kun.nl/WIWWWI/modcheck.html

RMSD Accuracy
Probabilities of SWISS-MODEL accuracy for target-template identity classes.
Columns: percent sequence identity | total number of models | percent of models with RMSD lower than 1 Å | lower than 2 Å | lower than 3 Å | lower than 4 Å | lower than 5 Å | percent of models with RMSD higher than 5 Å
25-29 125 10 30 46 67 33
30-39 222 18 45 66 77 23
40-49 156 9 44 63 78 91
50-59 155 55 79 86
60-69 145 38 72 85 92 8
70-79 137 42 71 82 88 12
80-89 173 94 95 5
90-95 59 83
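For reference, the RMSD used to bin models in this table is the root-mean-square deviation between equivalent atoms of the model and the experimental structure. The sketch below is a minimal illustration, assuming the two coordinate lists are already matched atom for atom (for example Cα atoms) and already superposed; it does not perform the superposition itself. The function name `rmsd` is chosen for the example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two matched coordinate lists (in Å).

    Assumption: the structures were superposed beforehand and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    experiment = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    # Every atom is displaced by exactly 1 Å along z.
    print(rmsd(model, experiment))
```

In practice the superposition step matters as much as the formula, so the value depends on which atoms are included and how the structures were fitted.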
