Introduction to RCSB PDB Data, Tools and Resources


1 Introduction to RCSB PDB Data, Tools and Resources
Maria Dominguez, Shuchismita Dutta, Ph.D.

2 Learning Objectives Introduction to PDB RCSB PDB Query Explore and Learn Visualize and Analyze

3 Protein Data Bank (PDB)
First open access digital resource in biology (est with 7 entries) Single global archive of 3-D macromolecular structures (contains >122,000 entries) US PDB = RCSB PDB Headquartered at Rutgers/UCSD (NSF, NIH, DOE) Part of Worldwide PDB (with EU and Japan) Makes PDB data freely available to all via Some of the first few structures in the PDB

4 Why Use PDB Data?
Visualize: the molecule, its parts or complexes. Analyze: stability and interactions of the molecule; structure function relationships. Compare Structures: under different conditions; bound to various molecules (ligands or partner proteins); in health and disease. Engineer and Design: mutations, additions, deletions to manipulate the function; facilitate tracking; discover drugs. 2hhb/1hho

5 Where Does the Data Come From?
Sample  Structural Data Pipeline Target Selection Isolation, Expression, Purification, Crystallization Data Collection Structure Determination PDB Deposition & Release X-ray NMR Structures in the PDB are experimentally determined – by X-ray crystallography, Nuclear Magnetic Resonance (NMR) or electron microscopy (EM). In order to determine the structures a target is decided on, adequate amount of protein is produces and purified. For X-ray structures the protein has to be crystallized, for NMR a concentrated solution needs to be prepared, while for the EM experiment, the sample is placed on a grid. Data is collected in the respective experiments and used to build a model of the protein(s) in the structure. The structural coordinates and experimental data used to compute the models are all submitted to the PDB along with details about the experiment. The RCSB PDB enables users to freely access and use this material. 3D Models Annotations Publications EM You come here  

6 PDB Data Atomic coordinates and primary experimental data
Experimental details: sample preparation, data collection and structure solution. Sequence(s) of polymers (proteins and nucleic acids) in the structure. Information about ligands in the structure. Links to various resources that describe sequence, function and other properties of the molecule. Classification of structures by sequence, structure, function and other criteria. A million files like this are downloaded every day. The primary data archived in the PDB are the 3D coordinates of all atoms in the structure. In addition, information about the experiment, the experimental data used to determine the structure, and links to various other bioinformatics databases are included.

7 Using the RCSB PDB Website …
Default View View for Students and Educators What can you do at the RCSB PDB website? Query: find relevant structure(s) Structure Summary: what is in the structure Visualize: what does the structure look like Integrate: to explore structure function relationships

8 Educational Resources (pdb101.rcsb.org)
Resources to help understand biology at molecular and atomic levels Paper Models Animations Posters

9 Learning Objectives Introduction to PDB RCSB PDB Query
Search Browse Explore and Learn Visualize and Analyze

10 Search and Refine By … Name, PDB ID, keywords
Entry properties e.g. author, deposition/release date, citation Sequence Annotations Chemical components (ligands, drugs, etc.)

11 RCSB PDB Query Reports

12 Browse By Annotation
For example: Gene Ontology (biological process, cellular component, molecular function); Source Organism; Molecular Structure (SCOP, CATH); EC numbers; Membrane proteins; Anatomical Therapeutic Chemical

13 Search by PDB ID PDB ID: a 4-character identifier for an entry in the Protein Data Bank. It is both unique and immutable. PDB IDs are the most direct method for retrieving structures from the database. These IDs are randomly assigned at the time of deposition and have no particular meaning. One or more PDB IDs can be typed or copied into the search box. To search for multiple IDs, separate them with commas or line breaks.
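The multiple-ID input described on this slide can be sketched in a few lines of Python. This is only an illustration, not the website's own logic: the function name `parse_pdb_ids` is hypothetical, and the check assumes nothing more than what the slide states, namely that an ID has four characters (taken here to be letters or digits).

```python
import re

# Assumption for this sketch: an ID is exactly four alphanumeric characters,
# as described on the slide; case is normalized to upper case.
PDB_ID_PATTERN = re.compile(r"[A-Za-z0-9]{4}")


def parse_pdb_ids(text):
    """Split search-box text on commas or line breaks and check each ID."""
    valid, rejected = [], []
    for token in re.split(r"[,\n]+", text):
        token = token.strip()
        if not token:
            continue  # skip empty pieces such as a trailing comma
        if PDB_ID_PATTERN.fullmatch(token):
            valid.append(token.upper())
        else:
            rejected.append(token)
    return valid, rejected


if __name__ == "__main__":
    ids, bad = parse_pdb_ids("4ins, 2hhb\n1hho, insulin")
    print(ids)  # ['4INS', '2HHB', '1HHO']
    print(bad)  # ['insulin']
```

Anything that is not four characters long, such as a molecule name, ends up in the rejected list and would need a keyword search instead.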

14 Learning Objectives Introduction to PDB RCSB PDB Query
Explore and Learn Structure Summary Page Links to Other Resources Visualize and Analyze

15 Result: Search by PDB ID (4INS)
Seen on all pages - for new search from anywhere Searching for a PDB entry by PDB ID will show this page – called Structure Summary Page Entry specific information, details

16 Tabs on Top of Page
Structure Summary: overall structure information + details about composition of PDB structure
3D View: options to interactively explore the structure
Annotations: information about PDB structure or its components from other bioinformatics resources
Sequence: of all polymers (protein, DNA, RNA) in the structure, with annotations of secondary structure, mutations, etc.
Sequence Similarity: comparison of given structure to entire PDB by sequence
Structure Similarity: comparison of given structure to entire PDB by structure
Experiment: details of how the structure of the protein/complex was determined
Literature: list of primary or other articles that reference the given structure
Not discussed here

17 Structure Summary Page -1
PDB ID Display/Download file Structure/Experiment Description; Validation Summary Visualization of Structure Literature The literature box was introduced on the RCSB PDB website in With many publications made freely available online, the RCSB PDB took the opportunity to directly link structural data with the literature.

18 PDB Coordinates The PDB archives 3D coordinates of molecular structures. By clicking on PDB Format, you can obtain the molecule's 3D coordinates. The coordinates include residue name, residue number, X, Y, and Z coordinates for atoms, occupancy values, and B-factors. Occupancy is usually 1 (indicating that the atom is present at that location) or 0 (indicating that it is not present at that location). The same atom(s) can sometimes be listed with two or more different occupancy values, indicating that the atom may have alternate positions due to conformational flexibility of the molecule. When there are two equally occupied conformations the occupancies are written as 0.5/0.5 (50:50); other ratios are used when the conformations are unequally populated. The B-factor describes temperature-dependent atomic vibrations, as measured in the X-ray crystallography experiment. Lower values indicate that the atom can be more reliably located at that position.
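As a rough illustration of how these fields can be read from a PDB-format file, the sketch below slices one fixed-width ATOM line into the values named on this slide. The column positions follow the PDB format's fixed-width ATOM record layout. The helper `make_atom_line` and all coordinate values are invented for the example and do not come from a real entry.

```python
def parse_atom_record(line):
    """Slice one fixed-width ATOM/HETATM line into named fields."""
    return {
        "atom_name": line[12:16].strip(),
        "alt_loc": line[16].strip(),        # alternate-position label, often blank
        "residue_name": line[17:20].strip(),
        "chain_id": line[21],
        "residue_number": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
    }


def make_atom_line(serial, atom, alt, res, chain, resnum, x, y, z, occ, b):
    """Hypothetical helper: build an illustrative line with the same columns."""
    return (
        f"ATOM  {serial:>5} {atom:<4}{alt:1}{res:>3} {chain:1}{resnum:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}"
    )


if __name__ == "__main__":
    # Made-up records: one fully occupied atom, then one atom modeled in
    # two alternate positions (A and B) with occupancies 0.5/0.5.
    lines = [
        make_atom_line(1, " N", "", "GLY", "A", 1, -8.863, 16.944, 14.289, 1.00, 21.88),
        make_atom_line(2, " CB", "A", "SER", "A", 2, 1.0, 2.0, 3.0, 0.50, 30.10),
        make_atom_line(3, " CB", "B", "SER", "A", 2, 1.2, 2.1, 3.3, 0.50, 32.40),
    ]
    for line in lines:
        atom = parse_atom_record(line)
        print(atom["residue_name"], atom["residue_number"], atom["atom_name"],
              atom["alt_loc"] or "-", atom["occupancy"], atom["b_factor"])
```

Slicing by column rather than splitting on whitespace matters because adjacent fields in this format can run together without a separating space.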

19 Experiment Description
Angstrom is a unit of length used to measure distances between atoms: 1 Å = 10⁻¹⁰ m (one ten-billionth of a meter) = 0.1 nm. A structure determined at 1.5 Å resolution is of high quality, since individual atoms can be distinguished at that scale. Like other hormones, insulin circulates in the blood and regulates glucose uptake by cells/tissues. The insulin used for this experiment was derived from the organism Sus scrofa (pig). The description box presents a simple yet highly valuable overview of the molecule being analyzed. Experimental Data and Validation (refer to the RCSB PDB website for further details): validation allows users to determine structure quality. Does the structure model match the experimental data? Does it agree with prior knowledge about its structure and function? The deposition authors are those responsible for solving this particular structure. The structure was determined using X-ray diffraction.

20 Structure Summary Page - 2
Macromolecule Entities Small Molecules The blue bars in the Macromolecules box indicate the sequence of molecules in the PDB entry, matched against the UniProt sequence and various domain and other annotations. Small molecules, or ligands, may interact covalently or non-covalently with the proteins and DNA/RNA in the structure. Ligands have at least one atom and may have many atoms and complex connectivity and structure. Where available, the ligand's binding properties (when in complex with the protein/DNA/RNA) are also reported. These data are usually experimentally determined and derived from other bioinformatics data resources.

21 Macromolecular Entities
Indicates number of copies of this protein, and their polymer chain identifiers (chain ID). This row indicates secondary structure: yellow areas represent β-strands and red areas stand for α-helices. See the exact residues involved in the helix/strand etc. using mouse-over options. For each kind of protein (with a different polymer sequence) you can see the mapping of the region in the structure to the sequence of the complete gene product as listed in UniProt. Other annotations such as regions of helix and strands etc. are also marked here. Clicking on the + sign in the bottom left corner of the box opens the Protein Feature View (see next slide), which displays the region of the protein present in the structure compared to the complete gene product. Two copies of Insulin protein Chain B. Two copies of Insulin protein Chain A.

22 PDB IDs and regions of protein included in the experiment
Protein Feature View. Links to UniProt with sequence, function and integration of links to various other bioinformatics resources. UniProt, Pfam, etc. annotations. This page maps all PDB entries that match the protein sequence listed under the UniProt ID. For example, in this case the UniProt ID is P01315 (pig insulin). The lower panels list the PDB IDs and the regions (domains) that they map to on the UniProt (protein) sequence. PDB IDs and regions of protein included in the experiment. This page can be used to identify other relevant structures, e.g. with different domains, mutations etc.

23 Structure Summary Page - 3
Experimental Data and Validation Entry History The blue bars in the Macromolecules box indicate the sequence of molecules in the PDB entry, matched against the UniProt sequence and various domain and other annotations.

24 Learning Objectives Introduction to PDB RCSB PDB Query
Explore and Learn Visualize and Analyze 3D Structures Ligands and their neighborhood Analyzing interactions

25 Visualization Metaphors/Conventions
What does a molecule look like? Wireframe Ribbons Before delving into visualization it may be helpful to understand some basics. The coordinates may be represented as atoms and bonds, ribbons, or surfaces. Combinations of these representations may also be used. All atoms Backbone Spacefill

26 Visualize: Biological Assembly
Deposited coordinates (or Asymmetric Unit) Toggle through various biological assemblies (monomer, dimer, trimer and hexamer) Learn more about Biological Assemblies at

27 Visualize from Structure Summary Page
By clicking on the NGL, JSmol, or PV links you will be redirected to the 3D View tab, where you can explore the molecule's structure visually. These visualization tools can also be accessed from the 3D View tab by selecting the specific tool from the drop-down menu next to the image. See next slide

28 3D View (Jmol/JSmol) Display symmetry/ assembly
JSmol (or Jmol) is a popular online visualization tool that can be used directly, without installing a separate visualization software package. Structure display options Explore interactions

29 Explore Ligand Interactions
Small-molecule ligands (ions, cofactors, inhibitors or drugs) found in a structure are usually important for its structure and/or function. Exploring the interactions around ligands can highlight amino acid residues critical to the protein's function. Exploring the kinds of interactions (hydrogen bond, hydrophobic, charge based etc.) can help in understanding the protein's mechanism of action. Analyze interactions around ligand: using online resources, the environment of key ligands can be explored as shown above. The coordinates may also be visualized using other software, e.g. Chimera, PyMOL, etc., for analysis and for making publication-quality images.
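One simple way to think about a ligand's "neighborhood" is as the set of residues with at least one atom within some distance of any ligand atom. The sketch below shows that idea only. It is not how the RCSB PDB viewers compute interactions. The function name, the 4.0 Å default cutoff and all coordinates are assumptions made up for illustration.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return residues that have any atom within `cutoff` Å of a ligand atom.

    protein_atoms: iterable of (chain_id, residue_number, residue_name, (x, y, z))
    ligand_atoms:  iterable of (x, y, z)
    """
    nearby = set()
    for chain, resnum, resname, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            nearby.add((chain, resnum, resname))
    return sorted(nearby)


if __name__ == "__main__":
    # Made-up coordinates: a single-atom ligand (e.g. an ion) at the origin.
    ligand = [(0.0, 0.0, 0.0)]
    protein = [
        ("B", 10, "HIS", (1.0, 0.0, 0.0)),
        ("B", 10, "HIS", (2.5, 0.0, 0.0)),
        ("B", 11, "LEU", (9.0, 0.0, 0.0)),
        ("A", 5, "GLN", (0.0, 3.5, 0.0)),
    ]
    print(residues_near_ligand(protein, ligand))
    # [('A', 5, 'GLN'), ('B', 10, 'HIS')]
```

A distance cutoff only identifies candidate residues. Deciding whether a contact is a hydrogen bond, a hydrophobic contact or a charge-based interaction needs the atom types and geometry, which dedicated tools such as those named on this slide take into account.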

30 Summary Introduction to PDB RCSB PDB Query Explore and Learn
Search Browse Explore and Learn Structure Summary Page Links to Other Resources Visualize and Analyze 3D Structures Ligands and their neighborhood Analyzing interactions

