MASS and MultiProt methods

Problem Definition
Input: a collection of 3D protein structures
Goal: find substructures common to two or more proteins



The problem is complicated due to:
- Similar substructures instead of identical
- Partial alignments (smaller common substructures)
- Subset alignments

[Figure: three proteins P1, P2, P3 and their common substructures A, B, C]
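The idea of subset alignments can be illustrated with a toy sketch. It assumes each protein has already been reduced to a set of substructure labels, which is a simplification: the real methods match substructures by geometric similarity, not by name. The assignment of A, B, C to P1, P2, P3 below is hypothetical, and `common_substructures` is an illustrative helper, not part of MASS or MultiProt.

```python
from itertools import combinations


def common_substructures(proteins, min_size=2):
    """Map every subset of at least `min_size` proteins to the labels they all share.

    `proteins` maps a protein name to a set of substructure labels.
    Subsets with nothing in common are left out of the result.
    """
    result = {}
    names = sorted(proteins)
    for k in range(min_size, len(names) + 1):
        for subset in combinations(names, k):
            shared = set.intersection(*(proteins[name] for name in subset))
            if shared:
                result[subset] = shared
    return result


# Hypothetical toy ensemble (labels assigned for illustration only).
ensemble = {
    "P1": {"A", "B"},
    "P2": {"A", "B", "C"},
    "P3": {"B", "C"},
}

cores = common_substructures(ensemble)
for subset, shared in cores.items():
    print(subset, sorted(shared))
```

In this toy ensemble only B is common to all three proteins, while A and C each appear only when a pair of proteins is considered. This is why a method that reports alignments for subsets of the input can find larger cores than one that insists on aligning every structure.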

MultiProt and MASS

Algorithm
- Both: consider all structures simultaneously
- MultiProt: based on contiguous fragments
- MASS: based on secondary structures

Applications
- Both: subset structural core detection
- Both: sequential as well as non-sequential alignment
- MultiProt: fine structural core detection
- MASS: fast fold detection

Helix-Bundle Ensemble

Ensemble: 10 proteins from 4 different folds and 6 different superfamilies in SCOP
Runtime: 48 seconds
Core: 4-helical bundle
Non-topological alignment


Subset Alignments

Classification of DNA-Binding Proteins
The ensemble contains 18 DNA-binding proteins that can be classified into 5 structural classes:
- Classic zinc finger (7 molecules)
- Histones (3 molecules)
- Phage repressors (3 molecules)
- Restriction endonuclease-like (3 molecules)
- Winged helix (3 molecules)

[Figure panels]
A. Zinc finger
B. Histones
C. Phage repressors
D. Restriction endonuclease-like
E. Winged helix
Legend: DNA

Subset Alignments (cont.)

Cofilin-like and Gelsolin-like Families
The ensemble contains 12 sequentially non-redundant structures taken from the two families of the Actin depolymerizing proteins fold:
- Cofilin-like (CL) family (4 molecules)
- Gelsolin-like (GL) family (8 molecules)

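[Figure panels]
A. Alignment of all 12 proteins
B. Alignment of all 8 GL proteins
C. Alignment of all 4 CL proteins
D. Alignment of 3 CL proteins
Annotation: PDB:1f7s lacks this helix
Each panel is labeled with the number of aligned residues and the RMSD; the values that survive in the transcript are "28 residues" and "RMSD 1.3".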
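The RMSD quoted with each alignment is the root mean squared deviation between matched atoms. A minimal sketch follows, assuming the two structures are already superimposed and the atom correspondence is known; it does not compute the optimal superposition. The `rmsd` function and the coordinates are illustrative, not taken from MASS or MultiProt.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally long lists of (x, y, z) points.

    Point i of coords_a is assumed to be matched with point i of coords_b,
    and both sets are assumed to be in the same reference frame already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates: the second set is the first shifted by 1 along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))
```

A larger aligned core with a low RMSD is the better result, so residue count and RMSD are reported together for each panel.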

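Detection of Two Common Motifs

Winged-helix proteins
[Figure: panels A and B, each labeled with the number of aligned residues and the RMSD; the only value that survives in the transcript is "46 residues"]
Legend: DNA of PDB:1cgpA, DNA of PDB:1ddnA, DNA of PDB:1fokA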

PTB phosphotyrosine-binding domains [binding diversity]

Recognition of the conserved core of PLP-dependent transferases for focused docking