Common File Formats in Rosetta Steven Combs. The Files Flags/Option files Resfiles Params PDB Silent Atom tree diffs.

Slides:



Advertisements
Similar presentations
What is it that makes up an atom?
Advertisements

EBI is an Outstation of the European Molecular Biology Laboratory. PDBeChem The Ligand Database.
Rosetta Energy Function Glenn Butterfoss. Rosetta Energy Function Major Classes: 1. Low resolution: Reduced atom representation Simple energy function.
Lactate dehydrogenase + 38 ATP + 2 ATP. How does lactate dehydrogenase perform its catalytic function ?
An overview of amino acid structure Topic 2. Biomacromolecule A naturally occurring substance of large molecular weight e.g. Protein, DNA, lipids etc.
Chemical Biology 03 BLOOD
Protein-a chemical view A chain of amino acids folded in 3D Picture from on-line biology bookon-line biology book Peptide Protein backbone N / C terminal.
1 Levels of Protein Structure Primary to Quaternary Structure.
Amino Acids and Proteins 1.What is an amino acid / protein 2.Where are they found 3.Properties of the amino acids 4.How are proteins synthesized 1.Transcription.
Introduction to Structural Bioinformatics Dong Xu Computer Science Department 271C Life Sciences Center 1201 East Rollins Road University of Missouri-Columbia.
Protein Primer. Outline n Protein representations n Structure of Proteins Structure of Proteins –Primary: amino acid sequence –Secondary:  -helices &
ProteinStructuralDatabases. Proteins are built from amino-acids. Introduction H | NH2-c-CO2H | R.
Protein: Linear chain of amino acids called residues (4 in this toy protein) Ser Trp Leu O N N N N O O C C C C O O CαCα CαCα CαCα CαCα Lys H H H H H The.
Physics of Protein Folding. Why is the protein folding problem important? Understanding the function Drug design Types of experiments: X-ray crystallography.
Computing for Bioinformatics Lecture 8: protein folding.
High Throughput Processing of the Structural Information of the Protein Data Bank Zoltán Szabadka, Vince Grolmusz Department of Computer Science Eötvös.
1 Computational Biology, Part 13 Retrieving and Displaying Macromolecular Structures Robert F. Murphy Copyright  1996, 1999, All rights reserved.
The Protein Databank Working with protein data-files.
Chapter 10 Organic Chemistry
A Kinematic View of Loop Closure EVANGELOS A. COUTSIAS, CHAOK SEOK, MATTHEW P. JACOBSON, KEN A. DILL Presented by Keren Lasker.
1 Computational Biology, Part 11 Retrieving and Displaying Macromolecular Structures Robert F. Murphy Copyright  1996, 1999, All rights reserved.
Comparing protein structure and sequence similarities Sumi Singh Sp 2015.
Inverse Kinematics for Molecular World Sadia Malik April 18, 2002 CS 395T U.T. Austin.
Python programs How can I run a program? Input and output.
Structure and Function of Proteins Lecturer: Dr. Ora Furman Oct 2009 Winter 2009/10 Teaching Assistants: Miraim Oxsman Sivan Pearl.
Being a binding site: Characterizing Residue-Composition of Binding Sites on Proteins joint work with Zoltán Szabadka and Gábor Iván, Protein Information.
Practical session 2b Introduction to 3D Modelling and threading 9:30am-10:00am 3D modeling and threading 10:00am-10:30am Analysis of mutations in MYH6.
The Chemistry of Carbon
The.pdb file format, and other resources for structural information Topic 5 Chapter 10 & 11, Du and Bourne “Structural Bioinformatics”
What are proteins? Proteins are important; e.g. for catalyzing and regulating biochemical reactions, transporting molecules, … Linear polymer chain composed.
EMBL-EBI Adel Golovin MSDsite The project is funded by the European Commission as the TEMBLOR, contract-no. QLRI-CT under the RTD programme.
Protein Sequences. The Genetic Code The natural extension of the genetic code…
Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding I Prof. Corey O’Hern Department of Mechanical Engineering & Materials.
Uncommon Amino Acids, Amino Acids Forming Proteins & Primary Structure of a Protein ( ) pg By: Emily, Kennedy.
EBI is an Outstation of the European Molecular Biology Laboratory. Annotation Procedures for Structural Data Deposited in the PDBe at EBI.
1.Overall amino acid structure 2.Amino acid stereochemistry 3.Amino acid sidechain structure & classification 4.‘Non-standard’ amino acids 5.Amino acid.
Molecules of Life Chapter 2. Protein Functions  Act as enzymes  Structural- cytoskeleton (actin, tubulin, others)  Mechanical- actin and myosin in.
Department of Mechanical Engineering
Proteins and Amino Acids 1. Biological Functions of Proteins Facilitate biochemical reactions Structural support Storage and Transport Immune protection.
Proteins Enzymes are Proteins. Proteins Proteins: a chain of amino acids 20 different amino acids are found in proteins.
Molecular visualization
CHAPTER 2 CHEMISTRY OF LIFE. 2-1 The Nature of Matter.
09/06/12 CSCE 769 Amino Acids, Polypeptides and Proteins Homayoun Valafar Department of Computer Science and Engineering, USC.
Copyright © 2007, BioXGEM Lab., Institute of Bioinformatics, NCTU All rights reserved. ARPPI: A knowledge-based model for protein-protein interactions.
Doug Raiford Lesson 17.  Framework model  Secondary structure first  Assemble secondary structure segments  Hydrophobic collapse  Molten: compact.
Python + PyMOL Arbitrary Python code possible within python scripts PyMOL functionality available by importing PyMOL's modules cmd, cgo, stored etc. Call.
EMBOSS over a Grid 1. 1st EELA Grid School December 4th of 2006 Eduardo MURRIETA LEON Romualdo ZAYAS-LAGUNAS Pierre-Alain BRANGER Jérôme VERLEYEN Roberto.
Video Questions What do the following prefixes mean? Mono, Poly, Exo, and End What do you need to have “LIFE”? Draw a picture of an atom. Why do atoms.
Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.
Databanks + New tools = New insights THE AXIOM S imple A tom D epth I ndex C alculator protein fold barcoding CATH – ADAPT…
 most interaction with PyMOL is via a scripting language, not all functions are available from menus  keyword followed optionally by one or more comma-separated.
EBI is an Outstation of the European Molecular Biology Laboratory. PDBeChem The Ligand Database.
X-ray detection xray/facilities.html.
Topic 1 Roland Dunbrack. Modeling of Biological Units Model data files of single proteins may require –sequence alignment(s) to templates (entry and chain)
Oliver Thomas. Atoms Unable to be cut Basic unit of matter Made of protons, neutrons, and electrons Protons are positive Neutrons carry no charge Electrons.
©CMBI 2008 Databases Data must be in a certain format for software to recognize Every database can have its own format but some data elements are essential.
Lecture 3 Chemical building blocks: water amino acids, proteins.
Rotamer Trie: Storing & evaluating sets of rotamers Andrew Leaver-Fay, UNC Comp Sci Brian Kuhlman, UNC Biochem Jack Snoeyink, UNC Comp Sci.
Marlou Snelleman 2012 Protein structure. Overview Sequence to structure Hydrogen bonds Helices Sheets Turns Hydrophobicity Helices Sheets Structure and.
EBI is an Outstation of the European Molecular Biology Laboratory. A web based integrated search service to understand ligand binding and secondary structure.
Structural Bioinformatics Elodie Laine Master BIM-BMC Semester 3, Genomics of Microorganisms, UMR 7238, CNRS-UPMC e-documents:
Sequence: PFAM Used example: Database of protein domain families. It is based on manually curated alignments.
Dhanvantari GenOME to hit
PDBemotif A web based integrated search service to understand ligand binding and secondary structure properties in macromolecular structures.
Getting the Most out of the PDBe
Section 1 Powerpoint Assignment for Micbio 565, 2012
Chemistry Goals: Unit 2 (P4.p2A) Distinguish between an element, compound, or mixture based on drawings or formulae. (P4.p2B) Id a pure substance based.
Amino Acids Amine group -NH2 Carboxylic group -COOH
Levels of Protein Structure
Crystal structure description
Presentation transcript:

Common File Formats in Rosetta Steven Combs

The Files Flags/Option files Resfiles Params PDB Silent Atom tree diffs

Flags / Options Paramaters for protocols Formatted in 3 different ways Command Line Fixbb.release –database -s 1thfD.pdb –ex1 –ex2 –packing:repack_only –out:prefix repack_ -resfile 1thfD.resfile

Flags / Options Paramaters for protocols Formatted in 3 different ways space / tab indented –database -in -file -s 1thfD.pdb -out -prefix repack_ -packing -ex1 -ex2 -repack_only

Flags / Options Paramaters for protocols Formatted in 3 different ways Mixed –database -in -file -s 1thfD.pdb -out -prefix repack_ -packing:ex1 -packing:ex2 -packing:repack_only Mixed Fixbb.release –s 1thfD.pdb –ex1 –database -out -prefix repack_ -packing:repack_only

Resfiles Defines which residues to pack/design Great documentation! _user_guide/resfiles.html

Params Files Definition for residues Residues are defined by type (ie residue ala is typed ALA) Residue types found in: database/chemical/residue_types/fa_standard/residue_types/l-caa/ Ligand residue types made through molfile_to_params.py

NAME GLY Residue type IO_STRING GLY G 3 letter name and 1 letter name TYPE POLYMER Defines type as polymer AA GLY Defines as an amino acid gly ATOM N Nbb NH Atom name found in pdb ATOM CA CAbb CT Atom type used by Rosetta ATOM C CObb C 0.51 Molecular mechanics atom type used by Rosetta (mainly used in backrub application) ATOM O OCbb O Atom charge ATOM H HNbb H 0.31 ATOM 1HA Hapo HB 0.09 ATOM 2HA Hapo HB 0.09 LOWER_CONNECT N What atom is the polymer connected to at the lower end of residue UPPER_CONNECT C What atom is the polymer connected to at the upper end of residue BOND N CA BOND N H BOND CA C BOND CA 1HA BOND CA 2HA BOND C O PROPERTIES PROTEIN Different properties, can be POLAR, CHARGED, AROMATIC, etc etc NBR_ATOM CA Closest atom to the center of mass NBR_RADIUS Radius of neighbor atom FIRST_SIDECHAIN_ATOM NONE Where does the sidechain begin ACT_COORD_ATOMS CA END Where the center of charge is. Used by fa_pair for polar amino acids ICOOR_INTERNAL N N CA C ICOOR_INTERNAL CA N CA C ICOOR_INTERNAL C CA N C ICOOR_INTERNAL UPPER C CA N ICOOR_INTERNAL O C CA UPPER ICOOR_INTERNAL 1HA CA N C ICOOR_INTERNAL 2HA CA N 1HA ICOOR_INTERNAL LOWER N CA C ICOOR_INTERNAL H N CA LOWER Defines what atoms are bonded to each other Internal coordinate system. Defines how atoms are connected to each other in 3D space

PDB Files Holds 3D coordinates for proteins ATOM 1953 N LEU D N ATOM 1954 CA LEU D C ATOM 1955 C LEU D C ATOM 1956 O LEU D O ATOM 1957 CB LEU D C ATOM 1958 CG LEU D C ATOM 1959 CD1 LEU D C ATOM 1960 CD2 LEU D C TER 1961 LEU D 253 HETATM 1962 P PO4 D P Ligand lines Atom # Atom name Residue name Chain ID Residue # X-coord Y-coord Z-coord occupancy B-factor Element name Atom lines

Atom Tree Diffs Only used in ligand docking (output that contains models) Extension is *.silent Allows for storing only coordinates that have changed between a reference model Very confusing format Extracted using extract_atomtree_diffs.release

Silent Files Specify by –in:file:silent and –out:file:silent (as opposed to –in:file:s and –out:pdb) These are smaller files compared to PDBs. They store the internal coordinates of the model in terms of dihedral angles. If want to input/output silent files, also specify – in:file:silent_struct_type binary and – out:file:silent_struct_type binary Binary silent files store the same information in ASCII binary and are even more compressed and more robust in terms of extracting PDB coordinates from them.