ProteinShop: A Tool for Protein Structure Prediction and Modeling Silvia Crivelli Computational Research Division Lawrence Berkeley National Laboratory.



The Protein Structure Prediction Problem
To determine how proteins, the building blocks of living cells, fold into the three-dimensional shapes that define the role they play in life.

Importance of Protein Structure Prediction
- The shape of a protein determines its function.
- Knowledge of structure is used in many ways:
  - Drug design
  - Design of synthetic proteins
  - Re-engineering defective proteins
- Genome projects are providing sequences for many proteins whose structure will need to be determined.

Protein Structures
Proteins consist of a long chain of amino acids, the primary structure (e.g., Pro-Gly-Leu-Ser).
[Figure: polypeptide chain with labels for amino acid, backbone, side chain, and H-bond]

Protein Structures
- Proteins consist of a long chain of amino acids, the primary structure (e.g., Pro-Gly-Leu-Ser).
- The constituent amino acids may encourage hydrogen bonding that forms regular structures, called secondary structures (α-helix, β-sheet).
- The secondary structures fold together to form a compact three-dimensional shape, called the tertiary structure.

Ab Initio Approach
The problem can be formulated as a global minimization problem, because the tertiary structure is assumed to occur at the global minimum of the free energy function of the primary sequence.
Our goal: to provide an approach that relies more on physical principles than on information from known proteins.

Ab Initio Method
Tertiary structure is believed to minimize potential energy: $\min_x V_{MM}(x)$, where $x$ = atom coordinates.
Difficulties:
- The proposed energy function may not match nature
- $O(e^{n^2})$ local minima
- Very large parameter space, e.g., a modestly sized protein of 100 amino acids has ~1,600 atoms and ~4,800 variables
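The size estimate on this slide follows from counting three Cartesian coordinates per atom. A minimal sketch of that arithmetic, assuming the average of 16 atoms per residue implied by the slide's own numbers (the function name and the constant are illustrative, not part of ProteinShop):

```python
# Each atom contributes three Cartesian coordinates (x, y, z) to the search space.
COORDS_PER_ATOM = 3
# Assumption derived from the slide's estimate: ~1,600 atoms for 100 residues.
ATOMS_PER_RESIDUE = 16


def num_variables(num_residues, atoms_per_residue=ATOMS_PER_RESIDUE):
    """Return the approximate number of free coordinates for a protein."""
    return num_residues * atoms_per_residue * COORDS_PER_ATOM


print(num_variables(100))  # 4800
```

Real proteins vary in atoms per residue, so this is only an order-of-magnitude guide to the dimension of the minimization problem.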

The Search Algorithm
Given the amino acid sequence of a protein, find the global minimum of the free energy function.
- Phase 1: Generate starting configurations
- Phase 2: Global optimization

Secondary Structure Predictions in Phase 1
Sequence: SKIGIDGFGRIGRLVLRAALSCGAQ
Type:     CBBBBBCCCAAAAAAACCCBBBBBC
Weight:   [per-residue values not preserved in the transcript]
Servers predict the secondary structure likely to be present in a target protein, based on a large database of known proteins.
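A per-residue type string like the one on this slide can be split into contiguous segments before any geometry is built. The sketch below is one way to do that. It assumes the letters mean A = alpha helix, B = beta strand, C = coil, which the slide does not state explicitly, and the function name is hypothetical.

```python
from itertools import groupby


def segments(types):
    """Split a per-residue type string into (type, start, end) runs, end exclusive."""
    runs = []
    pos = 0
    for label, group in groupby(types):
        length = len(list(group))
        runs.append((label, pos, pos + length))
        pos += length
    return runs


sequence = "SKIGIDGFGRIGRLVLRAALSCGAQ"
types = "CBBBBBCCCAAAAAAACCCBBBBBC"  # assumed: A = helix, B = strand, C = coil

# Print each run together with the residues it covers.
for label, start, end in segments(types):
    print(label, start, end, sequence[start:end])
```

For this example the prediction contains two strand segments and one helix segment separated by coil.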

Matching the predicted strands is a combinatorial problem:
- Which strands are paired?
- Which orientation: parallel or anti-parallel?
- Which residues are paired: odd or even?

There are $n! \, 2^{n-2}$ possible $n$-stranded motifs:
- 96 motifs for $n = 4$
- 960 motifs for $n = 5$
It takes weeks to create some of these configurations using constrained local minimizations!
Reference: Ruczinski, Kooperberg, Bonneau, and Baker, "Distributions of Beta Sheets in Proteins with Application to Structure Prediction," Proteins 48, 2002.
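The two counts quoted on this slide can be checked directly from the formula. A short sketch follows; the function name and the restriction to two or more strands are assumptions made for illustration.

```python
from math import factorial


def motif_count(n):
    """Number of possible n-stranded motifs per the slide's formula: n! * 2^(n-2)."""
    if n < 2:
        # Assumption: the formula is only applied to sheets with at least two strands.
        raise ValueError("expected n >= 2 strands")
    return factorial(n) * 2 ** (n - 2)


# Show how quickly the number of candidate motifs grows with the strand count.
for n in range(2, 8):
    print(n, motif_count(n))
```

The factorial growth is why enumerating and minimizing every motif by hand becomes impractical beyond a handful of strands.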

CASP4 Competition
Fourth community-wide experiment on the Critical Assessment of Techniques for Protein Structure Prediction (2000)
- Our group predicted 8 proteins
- Largest protein had 240 aa
- Most complex fold had 2 β-strands

ProteinShop
- Interactive tool for protein manipulation
- Designed to quickly create initial configurations
- It takes weeks to create a number of configurations using constrained minimizations
- It takes a few hours to create the same configurations with ProteinShop

Phase 1 with ProteinShop
[Flowchart: Amino Acid Sequence -> Phase 1 -> Initial Configurations -> Phase 2 -> Final Configuration. Labels inside Phase 1: Secondary Structure Prediction, Geometry Generation, Pre-configuration, Direct Manipulation, Initial Configurations]
ProteinShop takes minutes.

CASP4 Competition (before ProteinShop)
- Our group predicted 8 proteins
- Largest protein had 240 aa
- Most complex fold had 2 β-strands
CASP5 Competition (with ProteinShop)
- Our group predicted 20 proteins
- Largest protein had 417 aa
- Most complex fold had 13 β-strands

Phase 2: Global Optimization
[Flowchart: Amino Acid Sequence -> Phase 1 -> Initial Configurations -> Phase 2 -> Final Configuration. Steps inside Phase 2: Subspace Selection, Subspace Optimization, Candidate Selection]
Takes months to converge using hundreds of processors on Seaborg!
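The select / optimize / keep-the-best loop named on this slide can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it swaps in a made-up multi-minimum function for the energy and a plain random search for the subspace optimizer, and all names and parameters are hypothetical.

```python
import math
import random


def energy(x):
    """Toy multi-minimum function standing in for the real energy (hypothetical)."""
    return sum(xi * xi + 2.0 * math.sin(5.0 * xi) for xi in x)


def optimize_subspace(x, idx, rng, trials=200, step=0.5):
    """Random search over the selected coordinates only; the others stay fixed."""
    best, best_e = list(x), energy(x)
    for _ in range(trials):
        cand = list(best)
        for i in idx:
            cand[i] += rng.uniform(-step, step)
        e = energy(cand)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e


def phase2(configs, rounds=20, subspace_size=2, keep=3, seed=0):
    """Repeat: pick a subspace, optimize within it, keep the lowest-energy candidates."""
    rng = random.Random(seed)
    pool = [(energy(c), list(c)) for c in configs]
    for _ in range(rounds):
        pool.sort(key=lambda p: p[0])
        _, x = pool[0]                                   # current best configuration
        idx = rng.sample(range(len(x)), subspace_size)   # subspace selection
        new_x, new_e = optimize_subspace(x, idx, rng)    # subspace optimization
        pool.append((new_e, new_x))
        pool.sort(key=lambda p: p[0])
        pool = pool[:keep]                               # candidate selection
    return pool[0]


configs = [[2.0] * 6, [-1.5] * 6, [0.5] * 6]
best_e, best_x = phase2(configs)
print(best_e)
```

Restricting each step to a few coordinates keeps every sub-problem small, which is the general motivation for working in subspaces of a very high-dimensional search space.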

Phase 2 with ProteinShop
[Flowchart: Amino Acid Sequence -> Phase 1 -> Initial Configurations -> Phase 2 -> Final Configuration. Steps inside Phase 2: Subspace Selection, Subspace Optimization, Candidate Selection. ProteinShop additions: Monitoring System, Steering System, Direct Manipulation]
Will reduce computation time.

Monitoring System
- Monitor progress of the overall optimization and of each optimization process

Monitoring System
- Monitor progress of the overall optimization and of each optimization process
- Alert the user to important events during optimization:
  - A sudden drop in internal energy
  - A group of processes getting stuck
- Test new heuristics for expanding nodes of the tree
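The two alert conditions listed on this slide can be expressed as simple checks over each process's energy history. This is only an illustrative sketch: the thresholds, the data layout, and the function names are assumptions and do not describe ProteinShop's actual monitoring code.

```python
def sudden_drop(energies, threshold):
    """True if the latest step lowered the energy by more than `threshold`."""
    return len(energies) >= 2 and energies[-2] - energies[-1] > threshold


def is_stuck(energies, window, tolerance):
    """True if the energy moved by less than `tolerance` over the last `window` steps."""
    if len(energies) < window + 1:
        return False
    recent = energies[-(window + 1):]
    return max(recent) - min(recent) < tolerance


def scan(histories, threshold=5.0, window=3, tolerance=0.01):
    """Return (process id, event) pairs for every process that needs attention."""
    events = []
    for pid, energies in sorted(histories.items()):
        if sudden_drop(energies, threshold):
            events.append((pid, "drop"))
        elif is_stuck(energies, window, tolerance):
            events.append((pid, "stuck"))
    return events


# Hypothetical energy histories keyed by process id.
histories = {
    0: [-100.0, -101.0, -120.0],
    1: [-90.0, -90.001, -90.002, -90.002],
    2: [-80.0, -82.0, -84.0],
}
print(scan(histories))  # [(0, 'drop'), (1, 'stuck')]
```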

Steering System
- Change configurations during optimization to account for developments not anticipated during Phase 1
- Manipulate proteins that don't seem to be realistic or that are stuck in a local minimum
- Allow pruning of the optimization tree
- Assign multiple processes to a configuration that just had a drop in internal energy
- Assign stuck processes to other configurations
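The last two steering actions amount to a reassignment rule: move processes that are stuck onto configurations that just showed a drop in energy. A minimal sketch under stated assumptions (events arrive as `(process id, kind)` pairs with kind `"drop"` or `"stuck"`, and the configuration labels are invented for the example):

```python
def reassign(assignment, events):
    """Move processes flagged 'stuck' onto configurations that just had a 'drop'."""
    promising = [assignment[pid] for pid, kind in events if kind == "drop"]
    updated = dict(assignment)
    if not promising:
        return updated  # nothing promising to move to; leave everything as is
    stuck = [pid for pid, kind in events if kind == "stuck"]
    for i, pid in enumerate(stuck):
        # Spread stuck processes round-robin over the promising configurations.
        updated[pid] = promising[i % len(promising)]
    return updated


# Hypothetical mapping of process id to the configuration it is optimizing.
assignment = {0: "cfgA", 1: "cfgB", 2: "cfgC"}
events = [(0, "drop"), (1, "stuck")]
print(reassign(assignment, events))  # {0: 'cfgA', 1: 'cfgA', 2: 'cfgC'}
```

In an interactive setting the user, rather than a fixed rule, would decide which reassignments to accept; the rule here only shows the bookkeeping involved.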

Plans for the Future
- Use the monitoring and steering features to develop and test a new method for protein structure prediction
- Compete in CASP6 (Critical Assessment of Techniques for Protein Structure Prediction)
- Expand and enhance ProteinShop

O. Kreylos, N. Max, B. Hamann, S. Crivelli, and W. Bethel. "Interactive Protein Manipulation." IEEE Visualization 2003, Seattle. Winner of the Best Application Award.
ProteinShop is available to academic and non-profit organizations: proteinshop.lbl.gov