1 Computational Biology, Part 13
Retrieving and Displaying Macromolecular Structures
Robert F. Murphy
Copyright © 1996, 1999, 2001-2007. All rights reserved.


2 Retrieving 3D structures
Protein Data Bank (PDB): home page = http://www.rcsb.org/pdb/
NCBI: via Structure Database
BLAST: via links following sequence similarity searches

3 Displaying Structures
Most web page displays relating to sequences are two-dimensional and easily interpreted by visual inspection. Appreciating molecular structures requires viewing them from various directions and modifying the display to emphasize different portions of the molecule.
To display 3D structures locally, we can use programs such as Cn3D or RasMol, public domain programs available for a wide range of computers, including MacOS, Windows and Unix.

4 PDB files
In order to optimally display, rotate and color the 3D structure, we need to download a copy of the coordinates for each atom in the molecule to our local computer.
The most common format for storage and exchange of atomic coordinates for biological molecules is the PDB file format.
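As a minimal sketch of the download step, the Python below builds a download URL for a four-character PDB identifier and fetches the file as text. The URL pattern (`files.rcsb.org/download/<ID>.pdb`) is an assumption about the current RCSB file server and does not come from the slides; the function names are hypothetical.

```python
from urllib.request import urlopen

# Assumption: RCSB serves PDB-format entries at this URL pattern.
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_download_url(pdb_id: str) -> str:
    """Return the assumed download URL for a 4-character PDB identifier."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a 4-character PDB identifier: {pdb_id!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=pdb_id)


def fetch_pdb_text(pdb_id: str, timeout: float = 30.0) -> str:
    """Download the entry and return it as text (requires network access)."""
    with urlopen(pdb_download_url(pdb_id), timeout=timeout) as response:
        # PDB files are ASCII text; replace any unexpected bytes.
        return response.read().decode("ascii", errors="replace")
```

Calling `fetch_pdb_text("1AL1")` would then return the text of the entry used as the example in this presentation, provided the server is reachable and the URL pattern still holds.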

5 PDB files
The PDB file format is a text (ASCII) format, with an extensive header that can be read and interpreted either by programs or by people.
We can request either the header only or the entire file.

6 Example PDB file
HEADER    SYNTHETIC PROTEIN MODEL                 02-JUL-90   1AL1      1AL1   2
COMPND    ALPHA - 1 (AMPHIPHILIC ALPHA HELIX)                           1AL1   3
SOURCE    SYNTHETIC                                                     1AL1   4
AUTHOR    C.P.HILL,D.H.ANDERSON,L.WESSON,W.F.DE*GRADO,D.EISENBERG       1AL1   5
REVDAT   2   15-JAN-95 1AL1A   1       HET                              1AL1A  1
REVDAT   1   15-OCT-91 1AL1    0                                        1AL1   6
JRNL        AUTH   C.P.HILL,D.H.ANDERSON,L.WESSON,W.F.DE*GRADO,         1AL1   7
JRNL        AUTH 2 D.EISENBERG                                          1AL1   8
JRNL        TITL   CRYSTAL STRUCTURE OF ALPHA=1=: IMPLICATIONS FOR      1AL1   9
JRNL        TITL 2 PROTEIN DESIGN                                       1AL1  10
JRNL        REF    SCIENCE                       V. 249   543 1990      1AL1  11
JRNL        REFN   ASTM SCIEAS  US ISSN 0036-8075                  038  1AL1  12
REMARK   1                                                              1AL1  13
REMARK   1 REFERENCE 1                                                  1AL1  14
REMARK   1  AUTH   D.EISENBERG,W.WILCOX,S.M.ESHITA,P.M.PRYCIAK,S.P.HO   1AL1  15
REMARK   1  TITL   THE DESIGN, SYNTHESIS, AND CRYSTALLIZATION OF AN     1AL1  16
REMARK   1  TITL 2 ALPHA-*HELICAL PEPTIDE                               1AL1  17
REMARK   1  REF    PROTEINS.STRUCT.,FUNCT.,      V.   1    16 1986      1AL1  18
REMARK   1  REF  2 GENET.                                               1AL1  19
REMARK   1  REFN   ASTM PSFGEY  US ISSN 0887-3585                  867  1AL1  20
REMARK   2                                                              1AL1  21
REMARK   2 RESOLUTION. 2.7  ANGSTROMS.                                  1AL1  22
REMARK   3                                                              1AL1  23
REMARK   3 REFINEMENT. BY THE RESTRAINED LEAST SQUARES PROCEDURE OF J.  1AL1  24
REMARK   3  KONNERT AND W. HENDRICKSON (PROGRAM *PROLSQ*).  THE R       1AL1  25
REMARK   3  VALUE IS 0.255 FOR ALL DATA.  THE R VALUE IS 0.211 FOR ALL  1AL1  26
REMARK   3  REFLECTIONS IN THE RESOLUTION RANGE 10.0 TO 2.7 ANGSTROMS   1AL1  27
REMARK   3  WITH FOBS.GT. 2*SIGMA(FOBS).  THE RMS DEVIATION FROM        1AL1  28
REMARK   3  IDEALITY OF THE BOND LENGTHS IS 0.013 ANGSTROMS.  THE RMS   1AL1  29
REMARK   3  DEVIATION FROM IDEALITY OF THE BOND ANGLE DISTANCES IS      1AL1  30
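Because the header is plain text, a program can read it as easily as a person can. The following is a minimal sketch, not a full PDB parser, that looks for a `REMARK 2 RESOLUTION.` line like the one in the example header and returns the resolution as a number. The function name and the regular expression are illustrative assumptions; entries that word the resolution differently will simply yield `None`.

```python
import re
from typing import Iterable, Optional

# Matches lines such as "REMARK   2 RESOLUTION. 2.7  ANGSTROMS."
# (assumption: the resolution is reported in exactly this style).
RESOLUTION_RE = re.compile(r"^REMARK\s+2\s+RESOLUTION\.\s+([0-9.]+)\s+ANGSTROMS")


def resolution_from_header(lines: Iterable[str]) -> Optional[float]:
    """Return the resolution in angstroms, or None if no matching line is found."""
    for line in lines:
        match = RESOLUTION_RE.match(line)
        if match:
            return float(match.group(1))
    return None


# A few header lines in the style of the 1AL1 example.
example_header = [
    "HEADER    SYNTHETIC PROTEIN MODEL                 02-JUL-90   1AL1",
    "REMARK   2",
    "REMARK   2 RESOLUTION. 2.7  ANGSTROMS.",
    "REMARK   3",
]

print(resolution_from_header(example_header))
```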

7 Example PDB file
SEQRES   1     13  ACE GLU LEU LEU LYS LYS LEU LEU GLU GLU LEU LYS GLY  1AL1  39
HET    SO4     13       5     SULFATE ION                               1AL1A  5
FORMUL   2  SO4    O4 S1                                                1AL1  41
HELIX    1 HL1 ACE      0  LEU     10  1                                1AL1  42
CRYST1   62.350   62.350   62.350  90.00  90.00  90.00 I 41 3 2     48  1AL1  43
ORIGX1      1.000000  0.000000  0.000000        0.00000                 1AL1  44
ORIGX2      0.000000  1.000000  0.000000        0.00000                 1AL1  45
ORIGX3      0.000000  0.000000  1.000000        0.00000                 1AL1  46
SCALE1      0.016038  0.000000  0.000000        0.00000                 1AL1  47
SCALE2      0.000000  0.016038  0.000000        0.00000                 1AL1  48
SCALE3      0.000000  0.000000  0.016038        0.00000                 1AL1  49
ATOM      1  C   ACE     0      31.227  38.585  11.521  1.00 25.00      1AL1  50
ATOM      2  O   ACE     0      30.433  37.878  10.859  1.00 25.00      1AL1  51
ATOM      3  CH3 ACE     0      30.894  39.978  11.951  1.00 25.00      1AL1  52
ATOM      4  N   GLU     1      32.153  37.943  12.252  1.00 25.00      1AL1  53
ATOM      5  CA  GLU     1      32.594  36.639  11.811  1.00 25.00      1AL1  54
ATOM      6  C   GLU     1      32.002  35.428  12.514  1.00 25.00      1AL1  55
ATOM      7  O   GLU     1      32.521  34.279  12.454  1.00 25.00      1AL1  56
ATOM      8  CB  GLU     1      34.093  36.609  11.812  1.00 25.00      1AL1  57
…
ATOM    102  OXT GLY    12      20.888  27.022   1.650  1.00 25.00      1AL1 144
TER     103      GLY    12                                              1AL1 145
HETATM  104  S   SO4    13      31.477  38.950  15.821  0.50 25.00      1AL1 146
HETATM  105  O1  SO4    13      31.243  38.502  17.238  0.50 25.00      1AL1 147
HETATM  106  O2  SO4    13      30.616  40.133  15.527  0.50 25.00      1AL1 148
HETATM  107  O3  SO4    13      31.158  37.816  14.905  0.50 25.00      1AL1 149
HETATM  108  O4  SO4    13      32.916  39.343  15.640  0.50 25.00      1AL1 150
CONECT  104  105  106  107  108                                         1AL1 151
CONECT  105  104                                                        1AL1 152
CONECT  106  104                                                        1AL1 153
CONECT  107  104                                                        1AL1 154
CONECT  108  104                                                        1AL1 155
MASTER       29    0    1    1    0    0    0    6  100    1    5    1  1AL1A  6
END                                                                     1AL1 157
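ATOM and HETATM records are column-based: each field occupies fixed character positions rather than being separated by a delimiter. The sketch below spells out the column layout for the first ATOM record of the example and reads it back with string slices. The function and field names are hypothetical, the sample line is rebuilt by hand from the values on the slide, and the sketch ignores the fields that are blank in this entry (alternate location and insertion code).

```python
# First ATOM record of 1AL1, assembled piece by piece to show the columns.
SAMPLE_ATOM_LINE = (
    "ATOM  "    # cols  1-6   record name
    "    1"     # cols  7-11  atom serial number
    " "         # col  12     unused
    " C  "      # cols 13-16  atom name
    " "         # col  17     alternate location indicator
    "ACE"       # cols 18-20  residue name
    " "         # col  21     unused
    " "         # col  22     chain identifier (blank in this entry)
    "   0"      # cols 23-26  residue sequence number
    "    "      # cols 27-30  insertion code plus unused columns
    "  31.227"  # cols 31-38  x coordinate (angstroms)
    "  38.585"  # cols 39-46  y coordinate
    "  11.521"  # cols 47-54  z coordinate
    "  1.00"    # cols 55-60  occupancy
    " 25.00"    # cols 61-66  temperature factor
)


def parse_atom_line(line: str) -> dict:
    """Parse one ATOM or HETATM record using fixed column positions."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not a coordinate record: {record!r}")
    return {
        "record": record,
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21].strip(),
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "temp_factor": float(line[60:66]),
    }


atom = parse_atom_line(SAMPLE_ATOM_LINE)
print(atom["name"], atom["res_name"], atom["res_seq"], atom["x"], atom["y"], atom["z"])
```

Slicing by column rather than splitting on whitespace matters because adjacent fields may touch, and some fields (such as the chain identifier here) may be blank.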

8 Cn3D format
Cn3D uses a special format that combines the atomic coordinates with sequence information.
It can also show more than one structure superimposed.
It is a binary format, so it is difficult to view directly (e.g., via a text editor).

9 Example structure retrieval session
(Use Entrez Structure database to retrieve and view structure of 1HOC, MHC class I using Cn3D)
(Cross to PDB link)
(Download PDB file and view it with RasMol)

11 Useful RasMol commands
show sequence: lists all amino acids in each chain
select *a: selects all residues in chain A
colour red: displays the selected residues in red

14 3HHB - all alpha
Display: ribbons
Color: group

15 1CD8 - all beta
Display: cartoons

16 1KFJ - alpha/beta
Display: cartoons
Select *a
Colour violet
Select *b
Colour yellow
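When a structure has many chains, typing a select/colour pair for each one becomes tedious. As a rough sketch, the Python below generates the same pairs of `select *<chain>` and `colour <name>` commands from a mapping of chain letters to colour names. The function name is hypothetical, and the sketch assumes the colour names passed in are ones RasMol recognizes.

```python
from typing import Mapping


def chain_colour_script(chain_colours: Mapping[str, str]) -> str:
    """Build RasMol commands that colour each chain, one select/colour pair per chain."""
    lines = []
    for chain, colour in chain_colours.items():
        lines.append(f"select *{chain}")   # select all residues in this chain
        lines.append(f"colour {colour}")   # colour the current selection
    return "\n".join(lines) + "\n"


# The two-chain colouring used for 1KFJ.
script_text = chain_colour_script({"a": "violet", "b": "yellow"})
print(script_text, end="")
```

The resulting text can be saved to a file and, as far as I know, read into RasMol with its `script` command, or the lines can be pasted at the RasMol prompt.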

17 1AL1 - Amphiphilic Alpha Helix
select all
colour white
ribbons
select charged and not backbone
wireframe
colour red
select hydrophobic and not backbone
colour blue
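The red/blue colouring above makes the amphiphilic pattern visible in 3D; the same pattern can be read off the sequence. The sketch below labels each residue from the 1AL1 SEQRES record as charged (`c`), hydrophobic (`H`), or neither (`-`). The two residue sets are simplified assumptions chosen for illustration and are not RasMol's own definitions of `charged` and `hydrophobic`.

```python
# Simplified residue groupings (assumption for illustration only;
# RasMol's predefined sets may differ).
CHARGED = {"ASP", "GLU", "LYS", "ARG"}
HYDROPHOBIC = {"ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO"}

# Residues from the SEQRES record of the 1AL1 example.
SEQRES_1AL1 = "ACE GLU LEU LEU LYS LYS LEU LEU GLU GLU LEU LYS GLY".split()


def residue_class(res_name: str) -> str:
    """Return 'c' for charged, 'H' for hydrophobic, '-' for anything else."""
    if res_name in CHARGED:
        return "c"
    if res_name in HYDROPHOBIC:
        return "H"
    return "-"


def class_pattern(residues) -> str:
    """Return one class character per residue, in sequence order."""
    return "".join(residue_class(name) for name in residues)


print(class_pattern(SEQRES_1AL1))  # prints -cHHccHHccHc-
```

The alternating runs of charged and hydrophobic residues are what place the two kinds of side chain on opposite faces of the helix.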

18 1AL1 - Amphiphilic Alpha Helix
select all
spacefill

19 Structural homology
For a new protein whose 3D structure is not known, it is useful to be able to find proteins of known 3D structure that are expected to have a similar structure.
For a protein whose 3D structure is known, it is also useful to be able to find other proteins with similar structures.

20 Finding proteins with known structures based on sequence homology
If you want to find known 3D structures of proteins that are similar in primary amino acid sequence to a particular sequence, you can use the BLAST web page and choose the PDB database.
This is not the PDB database of structures, but rather a database of amino acid sequences for the proteins in the structure database.
Links are available to retrieve PDB files.

21 Finding proteins with similar structures to a known protein
For literature and sequence databases, Entrez allows neighbors to be found for a selected entry based on “homology” in terms (MEDLINE database) or sequence (protein and nucleic acid sequence databases).
Entrez also allows neighbors to be chosen for entries in the structure database.

22 Finding proteins with similar structures to a known protein
Proteins with similar structures are termed “VAST Neighbors” or “related structures” by Entrez (VAST refers to the method used to evaluate similarity of structure).
VAST or structure neighbors may or may not have sequence homology to each other.

23-26 Finding proteins with structures similar to tubulin

