2 Structural bioinformatics: Predicting protein structure
3 What is Structural Bioinformatics? Structural bioinformatics is the branch of bioinformatics concerned with the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA. It deals with generalizations about macromolecular 3D structure, such as comparisons of overall folds and local motifs, principles of molecular folding, evolution, and binding interactions, and structure/function relationships.
4 Structural bioinformatics vs. bioinformatics: DNA mapping; DNA and protein sequences; development of algorithms for data mining; determination of 3D structures of biomolecules; analysis and comparison of biomolecular structures; prediction of biomolecular structure.
5 Experimental techniques for structure determination: X-ray crystallography; nuclear magnetic resonance (NMR) spectroscopy.
8 Structure Prediction Approaches: Homology (comparative) modeling: based on sequence similarity with a protein for which a structure has been solved. Threading (fold recognition): requires a structure similar to a known structure. Ab initio fold prediction: not based on similarity to a known sequence or structure.
9 Ab initio fold prediction: Given only the sequence, try to predict the structure from physicochemical properties (energy, hydrophobicity, etc.).
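As a small illustration of one physicochemical property mentioned on this slide, the sketch below computes a sliding-window hydrophobicity profile using the Kyte-Doolittle hydropathy scale. The function name `hydropathy_profile` and the window size are assumptions made for this example; a real ab initio method combines many more energy terms and a conformational search.

```python
# Kyte-Doolittle hydropathy scale; positive values are hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=5):
    """Return the mean hydropathy of every full window along the sequence.

    The window size of 5 is an illustrative default, not a recommendation.
    """
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # A hydrophobic stretch followed by a charged stretch.
    print(hydropathy_profile("IIIIIKKKKK"))
```

Stretches with a high mean value are candidates for the buried core, and stretches with a low value are more likely to be exposed to solvent.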
10 Fold Recognition (Threading): Given a sequence and a library of folds, thread the sequence through each fold and score the fit. Take the fold with the highest score.
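The "thread and take the best score" idea can be sketched with a deliberately simplified scoring scheme. Everything here is an assumption for illustration: each fold is reduced to a string of buried (`B`) or exposed (`E`) positions, the score only rewards hydrophobic residues in buried positions and polar residues in exposed ones, and no gaps are allowed. Real threading programs use far richer environment descriptions and gapped alignment.

```python
# Residues treated as hydrophobic in this toy example (an assumption).
HYDROPHOBIC = set("AVILMFWC")


def thread_score(sequence, environments):
    """Score one ungapped threading of a sequence onto a fold.

    environments has one character per position:
    'B' for buried, 'E' for exposed.
    """
    if len(sequence) != len(environments):
        raise ValueError("sequence and fold must have the same length")
    score = 0
    for residue, environment in zip(sequence, environments):
        is_hydrophobic = residue in HYDROPHOBIC
        is_buried = environment == "B"
        # +1 when residue type suits the environment, -1 otherwise.
        score += 1 if is_hydrophobic == is_buried else -1
    return score


def best_fold(sequence, fold_library):
    """Thread the sequence through every fold and return the top scorer."""
    scores = {
        name: thread_score(sequence, environments)
        for name, environments in fold_library.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    library = {"fold_a": "BBEE", "fold_b": "EEBB"}  # hypothetical folds
    print(best_fold("LLKK", library))
```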
11 Homology Modeling – Basic Idea: A protein's structure is determined by its amino acid sequence. Closely related sequences adopt highly similar structures, and distantly related sequences may still fold into similar structures. The three-dimensional structure of proteins from the same family is more conserved than their primary sequences. Example: two proteins with the TIM barrel fold and identical function have only moderate sequence identity, yet their structures are clearly very similar (triosephosphate isomerases: 44.7% sequence identity, 0.95 Å RMSD). The statement can even be strengthened: if similarity between two proteins is detectable at the sequence level, structural similarity can usually be assumed. Moreover, proteins that share low or even nondetectable sequence similarity often have similar structures.
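The two quantities quoted on this slide, percent sequence identity and RMSD, can be computed as in the following minimal sketch. Two assumptions are made: identity is counted over aligned columns where neither sequence has a gap (other conventions for the denominator exist), and the two coordinate sets passed to `rmsd` are already superimposed and paired residue by residue. Both function names are chosen for this example.

```python
import math


def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free aligned columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired (x, y, z) points.

    Assumes the structures were superimposed beforehand.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    print(percent_identity("ACD-F", "ACE-F"))
    print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)]))
```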
12 General Scheme: 1. Searching for structures related to the query sequence. 2. Selecting templates. 3. Aligning the query sequence with the template structures. 4. Building a model for the query using information from the template structures. 5. Evaluating the model. (Fiser A et al. Methods in Enzymology 374: (2004))
15 How to select the right template? Close subfamily (phylogenetic tree); "environment" similarity.
16 More than one template: There are two ways to combine multiple templates. Global model: the templates align with different domains of the target, with little overlap between them. Local model: the templates align with the same part of the target. In general, it is frequently beneficial to include in the modeling process all templates that differ substantially from each other, provided they share approximately the same overall similarity to the target sequence.
17 3. Aligning: All comparative modeling programs depend on a target-template alignment. When the sequence similarity between the template and target proteins is high, simple pairwise alignments are usually fine (e.g. Needleman-Wunsch global alignment). But sometimes BLAST is required.
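For reference, Needleman-Wunsch global alignment can be sketched in a few lines of dynamic programming. The scoring values below (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen to keep the example short; practical protein alignment would use a substitution matrix and affine gap penalties.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    rows, cols = len(seq_a), len(seq_b)
    score = [[0] * (cols + 1) for _ in range(rows + 1)]
    # Leading gaps: aligning a prefix against nothing costs one gap each.
    for i in range(1, rows + 1):
        score[i][0] = i * gap
    for j in range(1, cols + 1):
        score[0][j] = j * gap
    # Fill the table with the best of: substitution, gap in b, gap in a.
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = rows, cols
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[rows][cols], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

When several alignments share the optimal score, this traceback prefers substitutions over gaps, so it returns only one of them.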
18 Sequence alignment algorithms: The two most used in homology modeling are BLAST and FASTA. BLAST: the general strategy is to optimize the maximal segment pair (MSP) score; BLAST computes similarity, not alignment. FASTA (local alignment): searches for both full and partial sequence matches, i.e., local similarity is obtained; more sensitive than BLAST, but slower; many gaps may be a problem.
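To make the maximal segment pair (MSP) concrete: it is the highest-scoring pair of equal-length, ungapped segments, one from each sequence. The sketch below finds it exhaustively by scanning every diagonal. This is not how BLAST works internally (BLAST uses word seeds and heuristic extension to approximate the MSP quickly), and the match/mismatch scores are assumptions standing in for a substitution matrix.

```python
def maximal_segment_pair(seq_a, seq_b, match=1, mismatch=-1):
    """Return (score, start_a, start_b, length) of the best ungapped segment pair."""
    best = (0, 0, 0, 0)
    # Each offset selects one diagonal of the comparison matrix.
    for offset in range(-(len(seq_b) - 1), len(seq_a)):
        i, j = max(offset, 0), max(-offset, 0)
        run_score, run_start_a, run_start_b = 0, i, j
        while i < len(seq_a) and j < len(seq_b):
            # A non-positive running score cannot help: restart the segment here.
            if run_score <= 0:
                run_score, run_start_a, run_start_b = 0, i, j
            run_score += match if seq_a[i] == seq_b[j] else mismatch
            if run_score > best[0]:
                best = (run_score, run_start_a, run_start_b, i - run_start_a + 1)
            i += 1
            j += 1
    return best


if __name__ == "__main__":
    score, start_a, start_b, length = maximal_segment_pair("QQHEAGAWQQ", "PPPHEAGAW")
    print(score, "QQHEAGAWQQ"[start_a:start_a + length])
```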
20 5. Model Evaluation: The accuracy of the model depends on its sequence identity with the template. Internal evaluation: self-consistency checks. External evaluation: relies on information that was not used in the model calculation.
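One simple example of an internal, self-consistency check is to verify that consecutive Cα atoms in the model are about 3.8 Å apart, as expected for residues joined by a trans peptide bond. The sketch below flags pairs that deviate; the function name and the 0.2 Å tolerance are assumptions for illustration, and a real evaluation would examine many more geometric and energetic criteria.

```python
import math


def ca_distance_outliers(ca_coords, expected=3.8, tolerance=0.2):
    """Return (index, distance) for consecutive C-alpha pairs with unusual spacing.

    ca_coords is a list of (x, y, z) tuples in angstroms, in chain order.
    The index refers to the first residue of each flagged pair.
    """
    outliers = []
    for index in range(len(ca_coords) - 1):
        distance = math.dist(ca_coords[index], ca_coords[index + 1])
        if abs(distance - expected) > tolerance:
            outliers.append((index, distance))
    return outliers


if __name__ == "__main__":
    # Hypothetical coordinates: the second pair is stretched to 6.0 angstroms.
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 6.0, 0.0)]
    print(ca_distance_outliers(model))
```

A flagged pair may indicate a chain break or a modeling error, although a cis peptide bond also gives a shorter Cα-Cα distance and would be reported by this check.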