Bioinformatics 2 -- Lecture 8 More TOPS diagrams Comparative modeling tutorial and strategies.

Principles of comparative modeling
Proteins that have common ancestors have the same fold. Changes in structure lead to changes in function: enzyme reaction mechanism, ligand binding specificity, signaling, sub-cellular location, stability, etc. We can infer functional differences from structural differences. We can use energy calculations and simulations to find structural differences.
Note: comparative modeling == homology modeling.

What can we do by molecular modeling?
Structure-based drug design. Examples: trimethoprim, HIV protease inhibitors.
Protein design. Example: TNT binding protein.
Function prediction. Example: structural genomics.

TOPS topology cartoons
A simple way to draw a protein.
[Figure legend: beta strand pointing up; beta strand pointing down; alpha helix; a parallel beta sheet; an antiparallel beta sheet; connections.]

Reminder: TOPS diagrams
Number the strands and helices. Draw connections in front (middle) or back (side).
[Figure, N and C termini labeled: 3-layer sandwich, mixed, up-down-up-up, order 4312.]

Draw barrels on a circle
[Figure, N and C termini labeled: all-antiparallel beta-barrel, closed, n=6.]

TOPS tips
Draw beta strands together only if they are H-bonded.
Draw beta strands in a circle if they form a barrel.
If there are multiple domains, draw them as clearly separated cartoons.
If you can't arrange all secondary structures perpendicular to the screen, find the best approximate solution. (For example, look at helix 3 in the example.)
Ignore short helices.
Sometimes a loop is really a strand. Check H-bonding if in doubt.
Beta strands that are close to each other are not necessarily in the same sheet. Again, check H-bonding.

Practice drawing TOPS cartoons
Draw TOPS cartoons for the following proteins (downloadable in MOE using File-->Protein Database):
easy: 1SH1, 1AB1
hard: 1K77.A, 1FUS
harder: 1IKO.P, 4SBV.A
Draw the TOPS cartoon and name the fold SCOP-style.

Comparative modeling requires good bookkeeping skills. Proteins are big complicated molecules. Modeling them requires a plan. Alignments must be modified and re-modified. Structurally conserved regions (SCRs) must be identified. Quality of loops must be assessed. Good quality regions can be ‘fixed’ (frozen) while others are modified. Residues known to have functional significance need special consideration.

Planning a homology modeling project
Choosing a target: Find out what is known about the sequence you wish to model. Where did it come from? How was it discovered? Is the function known?
Choosing templates: Do a database search to get families of known structures, sometimes called a “basis set”. Repeat the search if necessary using parts of the sequence. Study the alignment and edit it. Merge multiple templates into one if necessary.
Bookkeeping: Make a simple TOPS model for your template and label it. Refer to this when building the model. Keep track of what is homology-based and what is not.

Planning a homology modeling project
Alignment: Start with the automatic alignment from dynamic programming. Inspect the locations of gaps and insertions. Modify the alignment if necessary. Every residue that is aligned is considered to be structurally conserved! If you do not believe it should be structurally conserved, unalign it or re-align it.
Model building: Carry out the loop search, splicing, sidechain rotamer search, and energy minimization.
Automated model building part: (1) Place aligned sidechains based on identities and similarities. (2) Search for loops to model insertions and deletions. (3) Swap loops, choose the lowest energy. (4) Place loop sidechains. (5) Energy minimize.
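To make "the automatic alignment from dynamic programming" concrete, here is a minimal sketch of global pairwise alignment (Needleman-Wunsch style). It is not MOE's aligner: the function name and the match/mismatch/gap scores are illustrative assumptions, and a real protein aligner would use a substitution matrix and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming (illustrative scoring only)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]
```

The gap positions this produces are exactly what the slide asks you to inspect by hand: the algorithm optimizes a score, not structural plausibility, so a gap can land inside a helix or strand.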

Planning a homology modeling project (3)
For a detailed description of the automated MOE-Homology method, read the “promodel.htm” page under MOE: Help-->Tutorials-->Homology Modeling. Click on “Building 3D Protein Models”.

Run the MOE Homology Modeling tutorial
Help-->Tutorials-->Homology Modeling (aka comparative modeling)
Run the tutorial word-for-word, except: stop at the points indicated in the following slides and do the exercises.

1st stopping point: After aligning structures
Draw a TOPS diagram of the structure. Number the beta strands N to C. Number the alpha helices N to C, independent of the beta strand numbering. Your template sequence is now summarized as its sequence of numbered strands and helices.
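The numbering convention above can be sketched in a few lines. This assumes a per-residue secondary structure string in which 'E' marks strand, 'H' marks helix, and any other character is loop/coil; the function name, the string encoding, and the "beta"/"alpha" labels are illustrative assumptions, not MOE output.

```python
from itertools import groupby

def number_elements(ss):
    """Label strands and helices N to C, with independent counters for each."""
    counts = {"E": 0, "H": 0}
    labels = []
    pos = 0
    for kind, run in groupby(ss):
        length = len(list(run))
        if kind in counts:
            counts[kind] += 1
            prefix = "beta" if kind == "E" else "alpha"
            # (label, start, end) with end exclusive, 0-based positions
            labels.append((f"{prefix}{counts[kind]}", pos, pos + length))
        pos += length
    return labels
```

For a hypothetical string such as "--EEEE---HHHHHH--EEE-", this yields beta1, alpha1, beta2 in chain order, which is the kind of one-line summary to keep next to your TOPS diagram.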

Rearrange sequences
Open the Command window after running Homology-->Align. You will see the identity table. Move the sequences around to put the most similar together, keeping the query at the top.
[Table: pro_Align pairwise percentage residue identity. Chains, not in order: 1KTE, 1B4Q.A, 1H75.A, 1FOV.A, 1EGO, 1J0F.A, 1AAZ.A]

Rearrange sequences
Modify gaps if necessary. Constrain using Edit-->constrain residues. Re-run Homology-->Align.
[Table: pro_Align pairwise percentage residue identity. Chains, reordered: 1KTE, 1B4Q.A, 1FOV.A, 1EGO, 1H75.A, 1J0F.A, 1AAZ.A]
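As a rough illustration of what the identity table measures and how it can guide reordering, the sketch below computes pairwise percentage identity from already-aligned sequences and sorts the templates by identity to the query. The denominator used here (columns where both sequences have a residue) is an assumption and may differ from how pro_Align defines it; sorting by identity to the query is also only one simple reading of "put the most similar together". All names are hypothetical.

```python
def percent_identity(x, y):
    """Percent identical residues over columns where neither sequence has a gap."""
    pairs = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(p == q for p, q in pairs) / len(pairs)

def order_by_identity(aligned, query):
    """Return chain names with the query first, then templates by decreasing identity."""
    others = [name for name in aligned if name != query]
    others.sort(
        key=lambda name: percent_identity(aligned[query], aligned[name]),
        reverse=True,
    )
    return [query] + others
```

Usage: pass a dict mapping chain names to equal-length aligned strings (gaps as '-') and the name of the query chain; the returned list is the suggested top-to-bottom order.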

Evaluate structural conservation in the basis set
For each secondary structure and each intervening loop/coil region, look at it carefully. Is it conserved? Are the lengths different? Are any of the loops the same length as the query? Enter a short note for each ss and each loop. Example notes:
low rmsd, sometimes not present
loop: short, not variable
SCR (short for Structurally Conserved Region)
loop: variable. Templates 2, 3, 4, 5, 6 match.
SCR
Bottom line: If you see an SCR, don't put a gap there.
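A quick bookkeeping aid for this step, offered as a sketch only: list the alignment blocks where no sequence has a gap. Gap-free blocks are merely candidates for SCRs. Judging real structural conservation still requires looking at the superposed structures (rmsd, presence or absence of each element), which this code does not do. The function name and the min_len threshold are assumptions.

```python
def gap_free_blocks(aligned_seqs, min_len=3):
    """Return (start, end) column ranges, end exclusive, with no gap in any sequence."""
    ncol = len(aligned_seqs[0])
    blocks = []
    start = None
    # Iterate one column past the end so a block touching the last column is closed.
    for col in range(ncol + 1):
        ok = col < ncol and all(s[col] != "-" for s in aligned_seqs)
        if ok and start is None:
            start = col
        elif not ok and start is not None:
            if col - start >= min_len:
                blocks.append((start, col))
            start = None
    return blocks
```

Columns outside the returned blocks are where gaps already sit; comparing them against your TOPS notes shows quickly whether any gap falls inside a region you marked as an SCR.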

SCRs
Structurally Conserved Regions (SCRs) are assumed to be evolutionarily invariant. SCRs should be ‘fixed’ during energy minimization: initially all atoms, and eventually just the backbone atoms.

Loops
Three types of loops:
Designated (conserved) loop: coordinates derived from homology to the template, not a loop search. May be flexible. If so, don't fix during energy minimization.
Variable loop: variable from model to model in the basis set. Coordinates derived from a loop search or one of the templates, or constructed by hand. Not fixed during energy minimization.
Outgap (extension): a variable loop at the end of a chain, usually constructed by hand from a secondary structure prediction.
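The fixing rules for SCRs and loops can be written down as a simple per-atom mask, which helps with bookkeeping across minimization rounds. This is a sketch under simplifying assumptions: every loop type is treated as free (the slides allow a designated loop to stay fixed when it is not flexible), the stage names "initial" and "final" and the region labels are hypothetical, and nothing here calls MOE.

```python
BACKBONE = {"N", "CA", "C", "O"}

def fixed_atoms(atoms, region_of, stage):
    """Return one bool per atom: True if the atom is held fixed.

    atoms: list of (residue_index, atom_name)
    region_of: dict mapping residue_index to "SCR" or a loop label
    stage: "initial" fixes all SCR atoms; any other value fixes SCR backbone only
    """
    fixed = []
    for res, name in atoms:
        if region_of[res] != "SCR":
            fixed.append(False)             # loops and outgaps are free to move
        elif stage == "initial":
            fixed.append(True)              # early rounds: whole SCR residue fixed
        else:
            fixed.append(name in BACKBONE)  # later rounds: sidechains released
    return fixed
```

Keeping the region labels in one dictionary also records which parts of the model are homology-based and which are not.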

Continue the tutorial
Now choose a template and run SE:Homology-->Homology model. You may choose minimization=none for faster results, or minimization=coarse for better results if you have time. You may set the number of models to 5 for faster results.
Exercise 3: Turn in the MOE file after running the tutorial exactly according to the instructions. Due Feb 14.
Exercise 4: Refine your model by re-aligning and manual intervention (see Lecture 9). Due Feb 19.