Protein Tertiary Structure Prediction

Structural Bioinformatics: Protein Tertiary Structure Prediction

The Different Levels of Protein Structure. Primary: the linear amino acid sequence. Secondary: α-helices, β-sheets and loops. Tertiary: the 3D shape of the fully folded polypeptide chain.

Predicting 3D Structure: an outstandingly difficult problem. Comparative modeling (homology): based on sequence homology. Fold recognition (threading): based on structural homology.

Comparative Modeling (based on sequence homology). Similar sequences suggest similar structures.

Sequence and structure alignments of two retinol-binding proteins

Structure Alignments. There are many different algorithms for structural alignment. The outputs of a structural alignment are a superposition of the atomic coordinates and the minimal root mean square deviation (RMSD) between the structures. The RMSD of two aligned structures indicates their divergence from one another: low RMSD values mean similar structures.
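As a minimal sketch of the RMSD idea, the Python below computes the value for two structures that are assumed to be already superposed, with point i in one list matched to point i in the other. Finding the optimal superposition is a separate step that is not shown, and the coordinates are hypothetical values chosen for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superposed and that the points
    are listed in corresponding order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between each pair of corresponding atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical C-alpha coordinates in angstroms (not taken from a real structure).
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```

Identical structures give an RMSD of zero, and the value grows as corresponding atoms move apart.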

Comparative Modeling (based on sequence homology). Similar sequence suggests similar structure. The method builds a protein structure model based on its alignment to one or more related protein structures in the database.

Comparative Modeling (based on sequence homology). The accuracy of a comparative model is related to the sequence identity on which it is based:
>50% sequence identity: high accuracy
30%-50% sequence identity: 90% modeled
<30% sequence identity: low accuracy (many errors)
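The identity bands above can be sketched in Python as follows. The definition of percent identity used here (matches divided by the number of gap-free aligned columns) is one of several in use and is an assumption, as are the function names and the example sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared

def expected_accuracy(identity_percent):
    """Map sequence identity to the accuracy bands listed on the slide."""
    if identity_percent > 50:
        return "high accuracy"
    if identity_percent >= 30:
        return "90% modeled"
    return "low accuracy (many errors)"

# Hypothetical pre-aligned target and template sequences.
target = "ACDEFGHIKL"
template = "ACDEYGH-KL"
identity = percent_identity(target, template)
print(round(identity, 1), expected_accuracy(identity))
```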

Homology Threshold for Different Alignment Lengths. [Figure: threshold (t) plotted against alignment length (L).] A sequence alignment between two proteins is considered to imply structural homology if the sequence identity is equal to or above the homology threshold t in a sequence region of a given length L. The threshold values t(L) are derived from the PDB.

Comparative Modeling. Similarity is particularly high in the core: alpha helices and beta sheets are preserved. Even near-identical sequences vary in loops.

Comparative Modeling Methods (based on sequence homology). MODELLER (Sali, Rockefeller/UCSF); SCWRL (Dunbrack, UCSF); SWISS-MODEL http://swissmodel.expasy.org//SWISS-MODEL.html

Comparative Modeling (based on sequence homology). Modeling of a sequence based on known structures consists of four major steps:
1. Finding one or more known structures related to the sequence to be modeled (the template), using sequence comparison methods such as PSI-BLAST
2. Aligning the sequence with the templates
3. Building a model
4. Assessing the model

Fold Recognition (based on structural homology)

Protein Folds (based on secondary structure): the sequential and spatial arrangement of secondary structures. [Figure panels: hemoglobin; TIM.]

Similar folds usually mean similar function. [Figure: transcription factors, homeodomain.]

The same fold can have multiple functions. Rossmann fold: 12 functions. TIM barrel: 31 functions.

Fold Recognition (based on structural homology). Methods of protein fold recognition attempt to detect similarities between protein 3D structures that have no significant sequence similarity. They search for folds that are compatible with a particular sequence. These methods "turn the protein folding problem on its head": rather than predicting how a sequence will fold, they predict how well a fold will fit a sequence.

Basic Steps in Fold Recognition (based on structural homology). Compare the sequence against a library of all known protein folds (a finite number). Query sequence: MTYGFRIPLNCERWGHKLSTVILKRP... Goal: find the fold template that the sequence fits best. There are different ways to evaluate sequence-structure fit.

Evaluating Sequence-Structure Fit (based on secondary structure homology). There are different ways to evaluate sequence-structure fit. [Figure: the query sequence MAHFPGFGQSLLFGYPVYVFGD... is scored against each potential fold: fold 1) -10 ... fold 56) -123 ... fold n) 20.5.]
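The selection step can be sketched in Python as follows, using the three scores shown on the slide. The slide does not say how the scores should be ranked; the sketch assumes an energy-like score where lower means a better fit, and the function and fold names are hypothetical.

```python
def best_fold(scores):
    """Return (fold_id, score) for the fold with the lowest score.

    Assumption: the score behaves like an energy, so lower is better.
    """
    if not scores:
        raise ValueError("no fold scores given")
    fold_id = min(scores, key=scores.get)
    return fold_id, scores[fold_id]

# Scores from the slide; every other fold in the library is omitted.
fold_scores = {"fold 1": -10.0, "fold 56": -123.0, "fold n": 20.5}
print(best_fold(fold_scores))
```

If a program reports a score where higher is better, `min` would be replaced by `max`.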

Programs for Fold Recognition (based on secondary structure homology). TOPITS (Rost 1995); GenTHREADER (Jones 1999); SAMT02 (UCSC HMM); 3D-PSSM http://www.sbg.bio.ic.ac.uk/~3dpssm/

Ab Initio Modeling. Compute molecular structure from the laws of physics and chemistry alone. Theoretically the ideal solution; practically nearly impossible. Why? The calculations are exceptionally complex, and our understanding of the biophysics is incomplete.

Ab Initio Methods: Rosetta (Baker lab, Seattle); Undertaker (Karplus, UCSC)

CASP (Critical Assessment of Structure Prediction): a competition among different groups to predict the 3D structures of proteins that are about to be solved experimentally. Current state: ab initio methods perform worst, but have improved greatly in recent years. Comparative modeling performs very well when homologous sequences with known structures exist. Fold recognition performs well.

What can you do? FOLDIT: Solve Puzzles for Science. A computer game to fold proteins: http://fold.it/portal/puzzles

What's Next: Predicting function from structure

Structural Genomics: a large-scale structure determination project designed to cover all representative protein structures. [Figure: ATP-binding domain of protein MJ0577. Zarembinski et al., Proc. Natl. Acad. Sci. USA, 95:15189 (1998).]

Wanted! As a result of the Structural Genomics initiative, many structures of proteins with unknown function will be solved. Wanted: automated methods to predict function from the protein structures resulting from the structural genomics project.

Approaches for predicting function from structure. ConSurf: mapping evolutionary conservation onto the protein structure. http://consurf.tau.ac.il/

Approaches for predicting function from structure. PFPlus: identifying positive electrostatic patches on the protein structure. http://pfp.technion.ac.il/

A method to distinguish DNA-binding from RNA-binding proteins. [Figure panels: DNA-binding interface; RNA-binding interface.]

RNA- and DNA-binding interfaces tend to have different geometric features. [Figure panels: DNA-binding interface; RNA-binding interface.] To differentiate between RNA- and DNA-binding interfaces, we needed a geometric method able to characterize the different interfaces. Landscapes, like proteins, have different types of surfaces, and the two pictures show two different types of landscape. Looking carefully at the surface of a landscape reveals a composition of shapes, for example peaks and pits, and the composition of those shapes can help characterize different surfaces. We thought that applying a similar approach to the DNA-binding and RNA-binding interfaces would help distinguish the two groups.

Applying Differential Geometry to characterize DNA- and RNA-binding proteins. k1: minimal curvature; k2: maximal curvature. The first step in the new method we developed was to extract the geometric properties of each point on the protein binding interface. A surface is composed of many points. Finding the Mean and Gaussian curvatures at a point helps classify it as part of a local geometric shape, for example a pit, a peak or a valley. The Mean and Gaussian curvatures are calculated from k1 and k2, the principal curvatures. What are the principal curvatures? Curvature equals one over the radius. For example, look at a point on top of a hill. Imagine standing on that point and looking in all directions at all the paths that pass through it. The most moderate path gives k1, the minimal curvature; the steepest path gives k2, the maximal curvature. The principal curvatures have signs: if the curve climbs up from the point, the curvature is positive; if it goes down, the curvature is negative. The curvatures on a plane are zero. The signs of k1 and k2 determine the signs of the Mean and Gaussian curvatures. Mean curvature: H = (k1 + k2)/2. Gaussian curvature: K = k1*k2.
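The two formulas can be written as a short Python sketch. The principal curvature values are hypothetical and chosen only to show how the signs of H and K follow from the signs of k1 and k2; how k1 and k2 are estimated from a protein surface is not covered here.

```python
def mean_curvature(k1, k2):
    """H = (k1 + k2) / 2."""
    return (k1 + k2) / 2.0

def gaussian_curvature(k1, k2):
    """K = k1 * k2."""
    return k1 * k2

# Hypothetical point where both curves climb away from the point,
# with radii of curvature 2.0 and 0.5 (curvature = 1 / radius).
k1, k2 = 1 / 2.0, 1 / 0.5
print(mean_curvature(k1, k2), gaussian_curvature(k1, k2))

# Hypothetical saddle-like point: one curve climbs, the other descends.
print(mean_curvature(-1.0, 1.0), gaussian_curvature(-1.0, 1.0))
```

Two principal curvatures of the same sign give a positive K, while opposite signs give a negative K.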

Applying Differential Geometry to characterize DNA- and RNA-binding proteins. [Figure panels: flat, peak, pit, minimal surface, ridge, saddle ridge, valley, saddle valley.] The signs of the Mean and Gaussian curvatures classify the surface type to which a point belongs. The eight fundamental surface types are shown here. After extracting the Mean and Gaussian curvatures, the next step of the method was to assign each point on the protein binding interface to one of those eight fundamental surface types, based on the point's K and H. A point with positive K and negative H is considered part of a peak, while a point with both K and H positive is part of a pit.
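The classification by sign can be sketched as a lookup table in Python. Only the peak and pit rows are stated in the notes; the other six rows follow the sign table commonly used for mean and Gaussian curvature and are an assumption here, as are the function names and the tolerance used to treat small values as zero.

```python
def sign(value, tol=1e-6):
    """Return 1, -1 or 0, treating values within tol of zero as zero."""
    if value > tol:
        return 1
    if value < -tol:
        return -1
    return 0

# Keys are (sign of K, sign of H). Peak and pit come from the notes;
# the remaining rows are assumed from the usual H/K sign table.
SURFACE_TYPES = {
    (1, -1): "peak",
    (1, 1): "pit",
    (0, -1): "ridge",
    (0, 1): "valley",
    (0, 0): "flat",
    (-1, 0): "minimal surface",
    (-1, -1): "saddle ridge",
    (-1, 1): "saddle valley",
}

def surface_type(gaussian_k, mean_h, tol=1e-6):
    """Classify a surface point from its Gaussian (K) and mean (H) curvature.

    Returns None for K > 0 with H == 0, which cannot occur for exact
    curvatures because H squared is never smaller than K.
    """
    return SURFACE_TYPES.get((sign(gaussian_k, tol), sign(mean_h, tol)))

print(surface_type(1.0, -1.25))
print(surface_type(-1.0, 0.0))
```

Counting how often each surface type occurs over all interface points gives the kind of frequency profile that the following slides compare between RNA- and DNA-binding interfaces.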

Applying Differential Geometry for DNA and RNA function prediction. [Figure: frequency of points.]

RNA-binding surfaces are distinguished from DNA-binding surfaces based on differential geometric features. [Figure: 76% RNA-binding; 78% DNA-binding.]

Differential geometry can correctly determine whether a given binding domain binds RNA or DNA. [Figure: frequency of points, RNA pattern and DNA pattern.] Shazman et al., NAR 2011

How can we view the protein structure? Download the coordinates of the structure from the PDB (http://www.rcsb.org/pdb/). Launch a 3D viewer program; for example, we will use PyMOL, which can be downloaded freely from the PyMOL homepage (http://pymol.org). Load the coordinates into the viewer.
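To give a feel for what a downloaded coordinate file contains, the Python sketch below reads C-alpha positions from PDB-format ATOM records using the fixed column layout of the flat-file format. The function name and the sample records are hypothetical; the records are formatted to the right columns but are not real 1aqb data.

```python
def parse_ca_atoms(pdb_lines):
    """Collect C-alpha coordinates from PDB-format ATOM records.

    Column layout used (1-based): atom name 13-16, residue name 18-20,
    chain 22, residue number 23-26, x 31-38, y 39-46, z 47-54.
    """
    atoms = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK and other record types
        if line[12:16].strip() != "CA":
            continue  # keep only C-alpha atoms
        atoms.append({
            "res_name": line[17:20].strip(),
            "chain": line[21],
            "res_num": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms

# Template that places each field in its fixed-width column.
RECORD = "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f"
sample = [
    RECORD % (1, " N  ", "GLU", "A", 1, 11.104, 13.207, 9.821),
    RECORD % (2, " CA ", "GLU", "A", 1, 12.560, 13.329, 9.567),
    RECORD % (3, " CA ", "ARG", "A", 2, 14.100, 16.800, 9.100),
]
for atom in parse_ca_atoms(sample):
    print(atom["res_name"], atom["res_num"], atom["xyz"])
```

A 3D viewer reads the same records and takes care of all of this internally, so this sketch is only for inspecting a file by hand.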

PyMOL example
Launch PyMOL
Open file "1aqb" (PDB coordinate file)
Display sequence
Hide everything
Show main chain / hide main chain
Show cartoon
Color by ss
Color red
Color green, resi 1:40
Help: http://pymol.org
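The same session can be approximated with typed commands. The Python sketch below only assembles a PyMOL command script as text, which could then be saved and run inside PyMOL. The file name `1aqb.pdb` is an assumption about where the coordinates were saved, and the selections used for "main chain" and "color by ss" are one interpretation of the menu actions, written with the hyphenated range form `resi 1-40`.

```python
def pymol_script(pdb_file="1aqb.pdb"):
    """Build a PyMOL command script (as text) that mirrors the steps above."""
    commands = [
        "load %s" % pdb_file,             # open the coordinate file
        "set seq_view, 1",                # display the sequence
        "hide everything",
        "show lines, name N+CA+C+O",      # show main chain (backbone atoms)
        "hide lines, name N+CA+C+O",      # hide main chain
        "show cartoon",
        "color red, ss h",                # color by secondary structure: helices
        "color yellow, ss s",             # strands
        "color green, ss l+''",           # loops and unassigned residues
        "color red",                      # color the whole structure red
        "color green, resi 1-40",         # color residues 1 to 40 green
    ]
    return "\n".join(commands) + "\n"

script = pymol_script()
print(script)
```

Saving the printed text to a file with a `.pml` extension and opening it in PyMOL is one way to replay the session.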