Methods in 3D Structure Determination

Slides:



Advertisements
Similar presentations
Chemistry 2100 Lecture 10.
Advertisements

Protein x-ray crystallography
Introduction to protein x-ray crystallography. Electromagnetic waves E- electromagnetic field strength A- amplitude  - angular velocity - frequency.
Determination of Protein Structure. Methods for Determining Structures X-ray crystallography – uses an X-ray diffraction pattern and electron density.
Review of Basic Principles of Chemistry, Amino Acids and Proteins Brian Kuhlman: The material presented here is available on the.
Lactate dehydrogenase + 38 ATP + 2 ATP. How does lactate dehydrogenase perform its catalytic function ?
Applications of knowledge discovery to molecular biology: Identifying structural regularities in proteins Shaobing Su Supervisor: Dr. Lawrence B. Holder.
Proteins Function and Structure.
Review: Amino Acid Side Chains Aliphatic- Ala, Val, Leu, Ile, Gly Polar- Ser, Thr, Cys, Met, [Tyr, Trp] Acidic (and conjugate amide)- Asp, Asn, Glu, Gln.
FUNDAMENTALS OF MOLECULAR BIOLOGY Introduction -Molecular Biology, Cell, Molecule, Chemical Bonding Macromolecule -Class -Chemical structure -Forms Important.
Case Studies Class 5. Computational Chemistry Structure of molecules and their reactivities Two major areas –molecular mechanics –electronic structure.
Structure Prediction. Tertiary protein structure: protein folding Three main approaches: [1] experimental determination (X-ray crystallography, NMR) [2]
Protein-a chemical view A chain of amino acids folded in 3D Picture from on-line biology bookon-line biology book Peptide Protein backbone N / C terminal.
1 Levels of Protein Structure Primary to Quaternary Structure.
Amino Acids and Proteins 1.What is an amino acid / protein 2.Where are they found 3.Properties of the amino acids 4.How are proteins synthesized 1.Transcription.
Scoring Matrices June 19, 2008 Learning objectives- Understand how scoring matrices are constructed. Workshop-Use different BLOSUM matrices in the Dotter.
Protein Primer. Outline n Protein representations n Structure of Proteins Structure of Proteins –Primary: amino acid sequence –Secondary:  -helices &
ProteinStructuralDatabases. Proteins are built from amino-acids. Introduction H | NH2-c-CO2H | R.
Proteins Function and Structure. Proteins more than 50% of dry mass of most cells functions include – structural support – storage, transport – cellular.
IV. Protein Structure Prediction and Determination Methods of protein structure determination Critical assessment of structure prediction Homology modelling.
Other structure programs Insight II Swiss PDB-Viewer QUANTA/CHARMm and Quanta 2005 for proteins SYBYL with FUGUE and FlexX for drugs O or FRODO for x-ray.
1 Computational Biology, Part 13 Retrieving and Displaying Macromolecular Structures Robert F. Murphy Copyright  1996, 1999, All rights reserved.
1 Computational Biology, Part 11 Retrieving and Displaying Macromolecular Structures Robert F. Murphy Copyright  1996, 1999, All rights reserved.
The relative orientation observed for  helices packed on ß sheets.
Protein Structure FDSC400. Protein Functions Biological?Food?
Computing and Chemistry 3-41 Athabasca Hall Sept. 16, 2013.
Proteins are polymers of amino acids.
Protein Structural Prediction. Protein Structure is Hierarchical.
Structure and Function of Proteins Lecturer: Dr. Ora Furman Oct 2009 Winter 2009/10 Teaching Assistants: Miraim Oxsman Sivan Pearl.
Protein structure prediction
Chapter 12 Protein Structure Basics. 20 naturally occurring amino acids Free amino group (-NH2) Free carboxyl group (-COOH) Both groups linked to a central.
Number of released entries Year. Growth of Molecular Complexity Number of Chains Year Number of Structures Containing that Number of Chains.
Part II : Introduction To Protein Structure Kong Lesheng Victor Tong Joo Chuan National University of Singapore.
X-ray and NMR Topic 7 Chapter 4 & 5, Du and Bourne “Structural Bioinformatics”
The.pdb file format, and other resources for structural information Topic 5 Chapter 10 & 11, Du and Bourne “Structural Bioinformatics”
SMART Teams: Students Modeling A Research Topic Jmol Training 101!
 Four levels of protein structure  Linear  Sub-Structure  3D Structure  Complex Structure.
Protein Structure: NMR Spectroscopy Microbiology 343 David Wishart
BIOCHEMISTRY REVIEW Overview of Biomolecules Chapter 4 Protein Sequence.
Macromolecular Visualization or… Where to go when ChemDraw just isn’t enough Martin Case Chem
Protein structure and modelling ● Orientation ● Protein structure ● Protein modelling Andreas Heger University of Helsinki Bioinformatics Group Slides.
The Strategy of Atomic Resolution Structural Biology Break down complexity so that the system can be understood at a fundamental level Build up a picture.
EBI is an Outstation of the European Molecular Biology Laboratory. Annotation Procedures for Structural Data Deposited in the PDBe at EBI.
Considerations for Protein Crystallography (BT Chapter 18) 1.Growing crystals Usually require 0.5mm in shortest dimension, except if using Synchrotron.
Department of Mechanical Engineering
Secondary structure prediction
1 10/26/2015 MOLECULES. 2 10/26/2015 H 2 N-CH-C-OH O R Monomer E.g. protein Monomer vs polymer amino acid monomer R is a side group.
Biomolecular Nuclear Magnetic Resonance Spectroscopy FROM ASSIGNMENT TO STRUCTURE Sequential resonance assignment strategies NMR data for structure determination.
Indiana University School of C571/C696 Chemical Information Tech. 2004, Lecture 7. Page 1 C571/C696 Chemical Information Technology David Wild
The digestive system, thermodynamics, enzymes, and transport across membranes May 12, 2003 Learning objectives- Be capable of manipulating protein structures.
1. Diffraction intensity 2. Patterson map Lecture
Protein Secondary Structure Prediction G P S Raghava.
X RAY CRYSTALLOGRAPHY. WHY X-RAY? IN ORDER TO BE OBSERVED THE DIMENTIONS OF AN OBJECT MUST BE HALF OF THE LIGHT WAVELENGHT USED TO OBSERVE IT.
A program of ITEST (Information Technology Experiences for Students and Teachers) funded by the National Science Foundation Background Session #3 DNA &
Protein Folding & Biospectroscopy Lecture 6 F14PFB David Robinson.
Proteins.
Proteins Structure of proteins Proteins are made of C, H, O and nitrogen and may have sulfur. The monomers of proteins are amino acids An amino acid.
Atomic structure model
X-ray crystallography – an overview (based on Bernie Brown’s talk, Dept. of Chemistry, WFU) Protein is crystallized (sometimes low-gravity atmosphere is.
X-ray detection xray/facilities.html.
©CMBI 2008 Databases Data must be in a certain format for software to recognize Every database can have its own format but some data elements are essential.
Structural classification of Proteins SCOP Classification: consists of a database Family Evolutionarily related with a significant sequence identity Superfamily.
Protein structure prediction Haixu Tang School of Informatics.
Lecture 10 CS566 Fall Structural Bioinformatics Motivation Concepts Structure Solving Structure Comparison Structure Prediction Modeling Structural.
Arginine, who are you? Why so important?. Release 2015_01 of 07-Jan-15 of UniProtKB/Swiss-Prot contains sequence entries, comprising
Protein Structure and Properties
Number of released entries
Haixu Tang School of Inforamtics
Nobel Laureates of X Ray Crystallography
Determining protein structures
Presentation transcript:

Methods in 3D Structure Determination David Wishart June 2005 Methods in 3D Structure Determination David Wishart University of Alberta Edmonton, AB david.wishart@ualberta.ca Lecture 3.1 (c) CGDN 2005

3D Structure Review of Polypeptide Structure Protein Structure Determination by X-ray Protein Structure Determination by NMR The Protein Data Bank Lecture 3.1

Much Ado About Structure Structure Function Structure Mechanism Structure Origins/Evolution Structure-based Drug Design Solving the Protein Folding Problem Lecture 3.1

Protein Structures Are Complex Lactate Dehydrogenase: Mixed a / b Immunoglobulin Fold: b Hemoglobin B Chain: a Lecture 3.1

Proteins are Complex Average residue contains 8 “heavy” atoms Average protein contains 300 amino acids Average structure contains 2400 atoms First structure (sperm whale myoglobin) took ~5 years with a team of ~15 key punch operators working around the clock to solve Most structures still take 1 year to solve Lecture 3.1

Solving Protein Structures Only 2 kinds of techniques allow one to get atomic resolution pictures of macromolecules X-ray Crystallography (first applied in 1961 - Kendrew & Perutz) NMR Spectroscopy (first applied in 1983 - Ernst & Wuthrich) Lecture 3.1

X-ray Crystallography Lecture 3.1

X-ray Crystallography Crystallization Diffraction Apparatus Diffraction Principles Conversion of Diffraction Data to Electron Density Resolution Chain Tracing Lecture 3.1

Crystallization Protein Crystal Lecture 3.1

Crystallization Lecture 3.1

Diffraction Apparatus Lecture 3.1

Diffraction Apparatus Lecture 3.1

A Bigger Diffraction Apparatus Synchrotron Light Source Lecture 3.1

Diffraction Principles nl = 2dsinq Lecture 3.1

Diffraction Principles Corresponding Diffraction Pattern A string of atoms Lecture 3.1

Protein Crystal Diffraction Diffraction Pattern Lecture 3.1

Diffraction Apparatus Lecture 3.1

Converting Diffraction Data to Electron Density F T Lecture 3.1

Fourier Transformation i(xyz)(hkl) F(x,y,z) = f(hkl)e d(hkl) Converts from units of inverse space to cartesian coordinates Lecture 3.1

Diffracting a Cat Diffraction data with phase information Real Lecture 3.1

Reconstructing a Cat FT Easy FT Hard Lecture 3.1

The Phase Problem Diffraction data only records intensity, not phase information (half the information is missing) To reconstruct the image properly you need to have the phases (even approx.) Guess the phases (molecular replacement) Search phase space (direct methods) Bootstrap phases (isomorphous replacement) Uses differing wavelengths (anomolous disp.) Lecture 3.1

MAD & X-ray Crystallography MAD (Multiwavelength Anomalous Dispersion Requires synchrotron beam lines Requires protein with multiple scattering centres (selenomethionine labeled) Allows rapid phasing Proteins can now be “solved” in just 1-2 days Lecture 3.1

Resolution 1.2 Å 2 Å 3 Å Lecture 3.1

Chain Tracing Electron Chain Final Density Trace Model Lecture 3.1

Refinement Lecture 3.1

Refinement R R = S(|Fo-Fc|)/S(Fo) Fc = calculated structure factor # iterations R R = S(|Fo-Fc|)/S(Fo) Fc = calculated structure factor Fo = observed structure factor Lecture 3.1

The Final Result http://www-structure.llnl.gov/Xray/101index.html ORIGX2 0.000000 1.000000 0.000000 0.00000 2TRX 147 ORIGX3 0.000000 0.000000 1.000000 0.00000 2TRX 148 SCALE1 0.011173 0.000000 0.004858 0.00000 2TRX 149 SCALE2 0.000000 0.019585 0.000000 0.00000 2TRX 150 SCALE3 0.000000 0.000000 0.018039 0.00000 2TRX 151 ATOM 1 N SER A 1 21.389 25.406 -4.628 1.00 23.22 2TRX 152 ATOM 2 CA SER A 1 21.628 26.691 -3.983 1.00 24.42 2TRX 153 ATOM 3 C SER A 1 20.937 26.944 -2.679 1.00 24.21 2TRX 154 ATOM 4 O SER A 1 21.072 28.079 -2.093 1.00 24.97 2TRX 155 ATOM 5 CB SER A 1 21.117 27.770 -5.002 1.00 28.27 2TRX 156 ATOM 6 OG SER A 1 22.276 27.925 -5.861 1.00 32.61 2TRX 157 ATOM 7 N ASP A 2 20.173 26.028 -2.163 1.00 21.39 2TRX 158 ATOM 8 CA ASP A 2 19.395 26.125 -0.949 1.00 21.57 2TRX 159 ATOM 9 C ASP A 2 20.264 26.214 0.297 1.00 20.89 2TRX 160 ATOM 10 O ASP A 2 19.760 26.575 1.371 1.00 21.49 2TRX 161 ATOM 11 CB ASP A 2 18.439 24.914 -0.856 1.00 22.14 2TRX 162 http://www-structure.llnl.gov/Xray/101index.html Lecture 3.1

Bioinformatics & X-ray Energy minimization and molecular dynamics (for structure refinement) Structure quality assessment (threading, comparative geometry) Structure comparison and classification (VAST, SCOP, CATH, CE) Automated chain tracing (threading) Domain, secondary structure, and supersecondary structure assignment Lecture 3.1

NMR Spectroscopy Radio Wave Transceiver Lecture 3.1

Principles of NMR Measures nuclear magnetism or changes in nuclear magnetism in a molecule NMR spectroscopy measures the absorption of light (radio waves) due to changes in nuclear spin orientation NMR only occurs when a sample is in a strong magnetic field Different nuclei absorb at different energies (frequencies) Lecture 3.1

Principles of NMR Lecture 3.1

Principles of NMR N N hn S S Low Energy High Energy Lecture 3.1

The Bell Analogy CH3 NH CH Lecture 3.1

FT NMR Free Induction Decay FT NMR spectrum Lecture 3.1

Fourier Transformation iwt F(w) = f(t)e dt Converts from units of time to units of frequency Lecture 3.1

1H NMR Spectra Exhibit... Chemical Shifts (peaks at different frequencies or ppm values) Splitting Patterns (from spin coupling) Different Peak Intensities (# 1H) 8.0 7.0 6.0 5.0 4.0 3.0 2.0 1.0 0.0 Lecture 3.1

Protein NMR Spectrum Lecture 3.1

2D Gels & 2D NMR SDS PAGE Lecture 3.1

Multidimensional NMR 1D 2D 3D MW ~ 500 MW ~ 10,000 MW ~ 30,000 Lecture 3.1

The NMR Process Obtain protein sequence Collect TOCSY & NOESY data Use chemical shift tables and known sequence to assign TOCSY spectrum Use TOCSY to assign NOESY spectrum Obtain inter and intra-residue distance information from NOESY data Feed data to computer to solve structure Lecture 3.1

Multidimensional NMR NOESY TOCSY Lecture 3.1

Assigning Chemical Shifts Lecture 3.1

Measuring NOEs Lecture 3.1

NMR Spectroscopy Chemical Shift Assignments NOE Intensities J-Couplings Distance Geometry Simulated Annealing Lecture 3.1

The Final Result ORIGX2 0.000000 1.000000 0.000000 0.00000 2TRX 147 SCALE1 0.011173 0.000000 0.004858 0.00000 2TRX 149 SCALE2 0.000000 0.019585 0.000000 0.00000 2TRX 150 SCALE3 0.000000 0.000000 0.018039 0.00000 2TRX 151 ATOM 1 N SER A 1 21.389 25.406 -4.628 1.00 23.22 2TRX 152 ATOM 2 CA SER A 1 21.628 26.691 -3.983 1.00 24.42 2TRX 153 ATOM 3 C SER A 1 20.937 26.944 -2.679 1.00 24.21 2TRX 154 ATOM 4 O SER A 1 21.072 28.079 -2.093 1.00 24.97 2TRX 155 ATOM 5 CB SER A 1 21.117 27.770 -5.002 1.00 28.27 2TRX 156 ATOM 6 OG SER A 1 22.276 27.925 -5.861 1.00 32.61 2TRX 157 ATOM 7 N ASP A 2 20.173 26.028 -2.163 1.00 21.39 2TRX 158 ATOM 8 CA ASP A 2 19.395 26.125 -0.949 1.00 21.57 2TRX 159 ATOM 9 C ASP A 2 20.264 26.214 0.297 1.00 20.89 2TRX 160 ATOM 10 O ASP A 2 19.760 26.575 1.371 1.00 21.49 2TRX 161 ATOM 11 CB ASP A 2 18.439 24.914 -0.856 1.00 22.14 2TRX 162 Lecture 3.1

Bioinformatics & NMR Energy minimization and molecular dynamics (for structure refinement) Structure quality assessment (threading, comparative geometry) Structure comparison and classification (VAST, SCOP, CATH, CE) Automated assignment, fuzzy logic, chemical shift prediction Lecture 3.1

X-ray Versus NMR X-ray NMR Producing enough protein for trials Crystallization time and effort Crystal quality, stability and size control Finding isomorphous derivatives Chain tracing & checking Producing enough labeled protein for collection Sample “conditioning” Size of protein Assignment process is slow and error prone Measuring NOE’s is slow and error prone Lecture 3.1

The PDB PDB - Protein Data Bank Established in 1971 at Brookhaven National Lab (7 structures) Primary archive for macromolecular structures (proteins, nucleic acids, carbohydrates) Moved from BNL to RCSB (Research Collaboratory for Structural Bioinformatics) in 1998 Lecture 3.1

http://www.rcsb.org/pdb/ The PDB Lecture 3.1

The PDB Contains coordinate data (primarily) from X-ray, NMR and modelling Contains files in 2 formats PDB format mmCIF (macromolecular Crystallographic Information File Format) Contains >21,000 entries Currently growing exponentially Lecture 3.1

PDB Growth Lecture 3.1

Sequences vs. Structures 200000 180000 160000 140000 120000 Sequences 100000 Structures 80000 60000 40000 20000 Lecture 3.1

PDB Composition (2003) Lecture 3.1

PDB Data Entry Lecture 3.1

PDB File Format HEADER ELECTRON TRANSPORT 19-MAR-90 2TRX 2TRXA 1 COMPND THIOREDOXIN 2TRXA 2 SOURCE (ESCHERICHIA $COLI) 2TRX 4 AUTHOR S.K.KATTI,D.M.LE*MASTER,H.EKLUND 2TRX 5 REVDAT 2 15-JAN-93 2TRXA 1 HEADER COMPND 2TRXA 3 REVDAT 1 15-OCT-91 2TRX 0 2TRX 6 JRNL AUTH S.K.KATTI,D.M.LE*MASTER,H.EKLUND 2TRX 7 JRNL TITL CRYSTAL STRUCTURE OF THIOREDOXIN FROM ESCHERICHIA 2TRX 8 JRNL TITL 2 $COLI AT 1.68 ANGSTROMS RESOLUTION 2TRX 9 JRNL REF J.MOL.BIOL. V. 212 167 1990 2TRX 10 JRNL REFN ASTM JMOBAK UK ISSN 0022-2836 070 2TRX 11 REMARK 1 HEADER - PDB accession, date, function CMPND - name of molecule or compound SOURCE - origin or source of molecule (species) REVDAT - revision dates JRNL - primary reference (journal) describing structure REMARK - a comment made by depositor Lecture 3.1

PDB File Format REMARK - a comment made by depositor REMARK 6 CORRECTION. CORRECT CLASSIFICATION ON HEADER RECORD AND 2TRXA 5 REMARK 6 REMOVE E.C. CODE. 15-JAN-93. 2TRXA 6 SEQRES 1 A 108 SER ASP LYS ILE ILE HIS LEU THR ASP ASP SER PHE ASP 2TRX 74 SEQRES 2 A 108 THR ASP VAL LEU LYS ALA ASP GLY ALA ILE LEU VAL ASP 2TRX 75 SEQRES 3 A 108 PHE TRP ALA GLU TRP CYS GLY PRO CYS LYS MET ILE ALA 2TRX 76 SEQRES 4 A 108 PRO ILE LEU ASP GLU ILE ALA ASP GLU TYR GLN GLY LYS 2TRX 77 SEQRES 5 A 108 LEU THR VAL ALA LYS LEU ASN ILE ASP GLN ASN PRO GLY 2TRX 78 SEQRES 6 A 108 THR ALA PRO LYS TYR GLY ILE ARG GLY ILE PRO THR LEU 2TRX 79 SEQRES 7 A 108 LEU LEU PHE LYS ASN GLY GLU VAL ALA ALA THR LYS VAL 2TRX 80 SEQRES 8 A 108 GLY ALA LEU SER LYS GLY GLN LEU LYS GLU PHE LEU ASP 2TRX 81 SEQRES 9 A 108 ALA ASN LEU ALA 2TRX 82 HET CU 109 1 COPPER ++ ION 2TRX 100 HET CU 109 1 COPPER ++ ION 2TRX 101 HET MPD 601 8 2-METHYL-2,4-PENTANEDIOL 2TRX 102 HET MPD 602 8 2-METHYL-2,4-PENTANEDIOL 2TRX 103 REMARK - a comment made by depositor SEQRES - sequence of protein in 3 letter code HET - names of heteroatoms Lecture 3.1

PDB File Format FORMUL - chemical formula of heteroatoms FORMUL 3 CU 2(CU1 ++) 2TRX 110 FORMUL 4 MPD 8(C6 H14 O2) 2TRX 111 FORMUL 5 HOH *140(H2 O1) 2TRX 112 HELIX 1 A1A SER A 11 LEU A 17 1 DISORDERED IN MOLECULE B 2TRX 113 HELIX 2 A2A CYS A 32 TYR A 49 1 BENT BY 30 DEGREES AT RES 39 2TRX 114 HELIX 3 A3A ASN A 59 ASN A 63 1 2TRX 115 HELIX 4 31A THR A 66 TYR A 70 5 DISTORTED H-BONDING C-TERMINS 2TRX 116 HELIX 5 A4A SER A 95 LEU A 107 1 2TRX 117 SHEET 1 B1A 5 LYS A 3 THR A 8 0 2TRX 123 SHEET 2 B1A 5 LEU A 53 ASN A 59 1 O VAL A 55 N ILE A 5 2TRX 124 SHEET 3 B1A 5 GLY A 21 TRP A 28 1 N TRP A 28 O LEU A 58 2TRX 125 SHEET 4 B1A 5 PRO A 76 LYS A 82 -1 O THR A 77 N PHE A 27 2TRX 126 SHEET 5 B1A 5 VAL A 86 GLY A 92 -1 N GLY A 92 O LYS A 82 2TRX 127 SSBOND 1 CYS A 32 CYS A 35 2TRX 143 FORMUL - chemical formula of heteroatoms HELIX - location of helices as identified by depositor SHEET location of beta sheets as identified by depositor SSBOND - location and exisitence of disulfide bond Lecture 3.1

PDB File Format ORIGX1 1.000000 0.000000 0.000000 0.00000 2TRX 146 ORIGX2 0.000000 1.000000 0.000000 0.00000 2TRX 147 ORIGX3 0.000000 0.000000 1.000000 0.00000 2TRX 148 SCALE1 0.011173 0.000000 0.004858 0.00000 2TRX 149 SCALE2 0.000000 0.019585 0.000000 0.00000 2TRX 150 SCALE3 0.000000 0.000000 0.018039 0.00000 2TRX 151 ATOM 1 N SER A 1 21.389 25.406 -4.628 1.00 23.22 2TRX 152 ATOM 2 CA SER A 1 21.628 26.691 -3.983 1.00 24.42 2TRX 153 ATOM 3 C SER A 1 20.937 26.944 -2.679 1.00 24.21 2TRX 154 ATOM 4 O SER A 1 21.072 28.079 -2.093 1.00 24.97 2TRX 155 ATOM 5 CB SER A 1 21.117 27.770 -5.002 1.00 28.27 2TRX 156 ATOM 6 OG SER A 1 22.276 27.925 -5.861 1.00 32.61 2TRX 157 ATOM 7 N ASP A 2 20.173 26.028 -2.163 1.00 21.39 2TRX 158 ATOM 8 CA ASP A 2 19.395 26.125 -0.949 1.00 21.57 2TRX 159 ATOM 9 C ASP A 2 20.264 26.214 0.297 1.00 20.89 2TRX 160 ATOM 10 O ASP A 2 19.760 26.575 1.371 1.00 21.49 2TRX 161 ORIGXn - scaling factors to transform from orthogonal coords. SCALEn - scaling factors to transform to fractional cryst. Coords. ATOM - atomic coordinates of molecule Lecture 3.1

PDB File Format Residue Name Atom # X coord (Å) Z coord (Å) B-factor Atom Name Residue # Y coord (Å) Occupancy ATOM 1 N SER A 1 21.389 25.406 -4.628 1.00 23.22 2TRX 152 ATOM 2 CA SER A 1 21.628 26.691 -3.983 1.00 24.42 2TRX 153 ATOM 3 C SER A 1 20.937 26.944 -2.679 1.00 24.21 2TRX 154 ATOM 4 O SER A 1 21.072 28.079 -2.093 1.00 24.97 2TRX 155 ATOM 5 CB SER A 1 21.117 27.770 -5.002 1.00 28.27 2TRX 156 ATOM 6 OG SER A 1 22.276 27.925 -5.861 1.00 32.61 2TRX 157 ATOM 7 N ASP A 2 20.173 26.028 -2.163 1.00 21.39 2TRX 158 ATOM 8 CA ASP A 2 19.395 26.125 -0.949 1.00 21.57 2TRX 159 ATOM 9 C ASP A 2 20.264 26.214 0.297 1.00 20.89 2TRX 160 ATOM 10 O ASP A 2 19.760 26.575 1.371 1.00 21.49 2TRX 161 Lecture 3.1

PDB File Format Spacing is critical (Fortran compatible) Often inconsistent (30+ years old) Watch for unusual residues (ACE, SME) Some files have 1 structure (X-ray), others have 2 structures (chain A and B in unit cell), others have >20 (NMR) Some have missing atoms, others have hydrogens, others don’t Lecture 3.1

Structure File Conversion Alchemy (t) AMBER PREP (prep) Ball and Stick (bs) Biosym .CAR (car) Boogie (boog) Cacao Cartesian (caccrt) Cambridge CADPAC (cadpac) CHARMm (charmm) Chem3D Cartesian 1 (c3d1) Chem3D Cartesian 2 (c3d2) CSD CSSR (cssr) CSD FDAT (fdat) CSD GSTAT (c) Feature (feat) Free Form Fractional (f) GAMESS Output (famout) Gaussian Z-Matrix (g) Gaussian Output (gauout) Hyperchem (hin) MDL Isis (isis) Mac Molecule(macmol) Macromodel (k) Micro World (micro) MM2 Input (mi) MM2 Ouput (mo) MM3 (mm3) MMADS (mmads) MDL MOLfile (mdl) MOLIN (molen) Mopac Cartesian (ac) Mopac Internal (ai) Mopac Output (ao) PC Model (pc) PDB (p) Quanta (quanta) ShelX (shelx) Spartan (spar) Spartan Semi-Empirical (semi) Spartan Mol. Mechanics (spmm) Sybyl Mol (mol) Sybyl Mol2 (mol2) Conjure (con) Maccs 2d (maccs2) Maccs 3d (maccs3) UniChem XYZ (unixyz) XYZ (x) XED (xed) http://smog.com/chem/babel/ Lecture 3.1

Lecture 3.1

Lecture 3.1

Lecture 3.1

Lecture 3.1

Lecture 3.1

Conclusions X-ray and NMR spectroscopy are both capable of determining 3D structures in a high throughput manner X-ray is more powerful than NMR for coordinate determination, NMR is more powerful for understanding “processes” and motions The Protein Databank (PDB) is the primary resource for protein structure information Lecture 3.1