Practical session 2b: Introduction to 3D modeling and threading
9:30am-10:00am 3D modeling and threading
10:00am-10:30am Analysis of mutations in MYH6
Miguel Andrade
Max Delbrück Center for Molecular Medicine
Introduction Protein tertiary structure Secondary structure elements fold together
Introduction Protein quaternary structure Folded proteins form a complex
Introduction Protein domains are structural units (average 160 aa) that share function, folding, and evolution. Proteins are normally multidomain (average 300 aa).
Determination of protein structure
  X-ray crystallography (70,714 structures in PDB): needs crystals
  Nuclear magnetic resonance (NMR) (9,312): proteins in solution; lower size limit than crystallography (about 600 aa)
  Electron microscopy (422): low resolution (>5 Å)
Determination of protein structure Resolution 2.4 Å
Structural genomics
  Currently: 81K 3D structures from around 27K sequences
  16M sequences in UniProt, so only 0.17% of them have a 3D structure
  50% of sequences covered (25% in 1995)
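The 0.17% figure follows directly from the two counts on this slide. A minimal check, using the rounded values quoted above (27K sequences with a structure, 16M sequences in UniProt); the function name is only illustrative:

```python
def percent_with_structure(n_with_structure: int, n_total: int) -> float:
    """Percentage of known sequences that have an experimental 3D structure."""
    return 100.0 * n_with_structure / n_total


# Rounded counts quoted on the slide.
pct = percent_with_structure(27_000, 16_000_000)
print(f"{pct:.2f}%")
```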
Strategy for analysis
  Query sequence: predict domains and cut
  Similar to PDB sequence?
    Yes: 3D modeling by homology
    No: 2D prediction, 3D ab initio, 3D threading
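The decision flow on this slide can be sketched as a small function. This is only an illustration: the 30% identity cutoff is an assumption borrowed from the comparative modeling range quoted later in the session, and `best_pdb_identity` is a hypothetical callback standing in for a real similarity search against PDB sequences.

```python
from typing import Callable, Dict, List

# Assumed cutoff (percent identity) for "similar to a PDB sequence".
HOMOLOGY_MIN_IDENTITY = 30.0


def plan_analysis(
    domains: Dict[str, str],
    best_pdb_identity: Callable[[str], float],
) -> Dict[str, List[str]]:
    """Choose the prediction methods to try for each predicted domain."""
    plan: Dict[str, List[str]] = {}
    for name, sequence in domains.items():
        if best_pdb_identity(sequence) >= HOMOLOGY_MIN_IDENTITY:
            plan[name] = ["3D modeling by homology"]
        else:
            plan[name] = ["2D prediction", "3D ab initio", "3D threading"]
    return plan


# Made-up domains and identities, standing in for a real search result.
fake_identities = {"AAAA": 45.0, "CCCC": 12.0}
plan = plan_analysis({"domain1": "AAAA", "domain2": "CCCC"}, fake_identities.__getitem__)
print(plan)
```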
3D structure prediction Approaches
  Class 1 (need similarity to a known structure): comparative modeling, threading
  Class 2 (need sequence only): ab initio
3D structure prediction Approaches: Comparative modeling (30% to 50% identity)
  Search for sequences of known structure similar to the target
  Model from template core regions and from loops and side chains of structures that might be unrelated
  Atom coordinates from conserved residues
  Optimize distance constraints derived from the sequence-template alignment
  Extra methods for loops, turns, and side chains
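Whether a template falls in the 30% to 50% range depends on how identity is measured. The sketch below assumes one common convention, counting identical residues over aligned (non-gap) columns of a pairwise alignment; other tools divide by the length of the shorter sequence or of the full alignment, so the numbers are not always comparable. The two aligned strings are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: 8 gap-free columns, 7 of them identical.
identity = percent_identity("MKT-AYIAK", "MKTQAYLAK")
print(identity)
```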
3D structure prediction Approaches: Threading (identity can be lower than 30%)
  Thread the target sequence through a library of known folds
  Select the right fold based on energy considerations
  More computational cost, but detects more distant relationships
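The idea of threading a sequence through a fold library and ranking folds by energy can be illustrated with a toy model. Everything here is an assumption made for the sketch: folds are reduced to a string of buried (`b`) or exposed (`e`) positions, the energy only rewards hydrophobic residues in buried positions and polar residues in exposed ones, and no gaps are allowed. Real threading programs use far richer potentials and alignments; this is not how any particular server works.

```python
from typing import Dict, Tuple

# Assumed set of hydrophobic residues for the toy energy function.
HYDROPHOBIC = set("AVILMFWC")


def position_energy(residue: str, environment: str) -> float:
    """-1 if the residue suits its environment (hydrophobic/buried or polar/exposed), else +1."""
    is_hydrophobic = residue in HYDROPHOBIC
    is_buried = environment == "b"
    return -1.0 if is_hydrophobic == is_buried else 1.0


def thread_energy(sequence: str, fold_environment: str) -> float:
    """Lowest energy over all ungapped placements of the sequence on the fold."""
    n, m = len(sequence), len(fold_environment)
    best = float("inf")
    for offset in range(m - n + 1):
        energy = sum(
            position_energy(res, fold_environment[offset + i])
            for i, res in enumerate(sequence)
        )
        best = min(best, energy)
    return best


def select_fold(sequence: str, library: Dict[str, str]) -> Tuple[str, float]:
    """Return the (fold name, energy) pair with the lowest threading energy."""
    scored = {name: thread_energy(sequence, env) for name, env in library.items()}
    best_name = min(scored, key=scored.get)
    return best_name, scored[best_name]


# Hypothetical two-fold library and a made-up query sequence.
library = {"fold_A": "bbeebbee", "fold_B": "eeeeeeee"}
best_fold_name, best_energy = select_fold("LVKDILSE", library)
print(best_fold_name, best_energy)
```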
3D structure prediction Approaches: Ab initio
  Explore conformational space
  Limit the number of atoms
  Break the problem into fragments of sequence
  Optimize hydrophobic residue burial and pairing of beta-strands
  Limited success
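A classic teaching simplification of these ideas is the 2D HP lattice model, shown here as a sketch and not as the method used by any tool in this session. Each residue is reduced to one bead (limiting the number of atoms), labelled H (hydrophobic) or P (polar); all self-avoiding chain conformations on a square lattice are enumerated (exploring conformational space), and the conformation with the most H-H contacts is kept (a crude stand-in for hydrophobic burial). Exhaustive enumeration only works for very short chains.

```python
from itertools import product
from typing import List, Optional, Tuple

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def walk(moves: Tuple[str, ...]) -> Optional[List[Tuple[int, int]]]:
    """Lattice coordinates for a series of moves, or None if the chain overlaps itself."""
    position = (0, 0)
    coords = [position]
    occupied = {position}
    for move in moves:
        dx, dy = MOVES[move]
        position = (position[0] + dx, position[1] + dy)
        if position in occupied:
            return None
        occupied.add(position)
        coords.append(position)
    return coords


def hh_contacts(hp: str, coords: List[Tuple[int, int]]) -> int:
    """Count H-H pairs that are lattice neighbours but not adjacent in the chain."""
    index = {coord: i for i, coord in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if hp[i] != "H":
            continue
        # Look right and up only, so each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and hp[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts


def best_fold(hp: str) -> Tuple[str, int]:
    """Exhaustively search all conformations and return (moves, H-H contacts) of the best."""
    best_moves, best_contacts = "", -1
    for moves in product("UDLR", repeat=len(hp) - 1):
        coords = walk(moves)
        if coords is None:
            continue
        contacts = hh_contacts(hp, coords)
        if contacts > best_contacts:
            best_moves, best_contacts = "".join(moves), contacts
    return best_moves, best_contacts


moves, contacts = best_fold("HPPH")
print(moves, contacts)
```

The search space grows as 4 to the power of the chain length, which is one way to see why fragment-based strategies and reduced representations are needed for real proteins.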
Relation between sequence identity and accuracy/applications From: Baker and Sali (2001) Science
3D structure prediction Applications: target design [Figure labels: query sequence; known 3D with catalytic center Leu Gly (LG); 3D model by homology with Gly Lys (GK); query similar to known]
3D structure prediction Applications: fit to low res 3D Query sequence 1 low resolution 3D (electron microscopy) Query sequence 2
3D structure prediction GenTHREADER
  David Jones http://bioinf.cs.ucl.ac.uk/psipred/
  Input: sequence
  Relatively quick, 5 minutes
  Jones (1999) J Mol Biol
3D structure prediction GenTHREADER Output
3D structure prediction Phyre Kelley et al (2000) J Mol Biol Kelley and Sternberg (2009) Nature Protocols
3D structure prediction I-Tasser
  Tasser: Jeffrey Skolnick. Lee and Skolnick (2008) Biophysical Journal
  I-Tasser: Yang Zhang. Roy et al (2010) Nature Methods
  Threading
  Folds 66% of sequences <200 aa long with low homology to PDB
  Just submit your sequence and wait… (some days)
  Outputs are predicted structures (PDB format)
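Since the predicted structures come back in PDB format, a first sanity check is to read the C-alpha trace from the file. The sketch below relies on the fixed-column layout of PDB `ATOM` records (atom name in columns 13-16, residue name 18-20, chain 22, residue number 23-26, x/y/z in columns 31-54). The three sample records and their coordinates are made up for illustration; for real work a library such as Biopython is a more robust choice.

```python
from typing import List, Tuple

# Made-up ATOM records in PDB fixed-column format (illustration only).
SAMPLE_PDB = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "ATOM     10  CA  LYS A   2      24.000  27.500   5.100  1.00 11.20           C",
])

CaAtom = Tuple[str, int, str, Tuple[float, float, float]]


def parse_ca_atoms(pdb_text: str) -> List[CaAtom]:
    """Return (chain, residue number, residue name, (x, y, z)) for each C-alpha atom."""
    atoms: List[CaAtom] = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  "):
            continue  # skip HETATM, TER, REMARK, and other record types
        if line[12:16].strip() != "CA":
            continue  # keep only C-alpha atoms
        chain = line[21]
        residue_number = int(line[22:26])
        residue_name = line[17:20]
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        atoms.append((chain, residue_number, residue_name, xyz))
    return atoms


ca_atoms = parse_ca_atoms(SAMPLE_PDB)
for chain, number, name, xyz in ca_atoms:
    print(chain, number, name, xyz)
```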
3D structure prediction I-Tasser Roy et al (2010) Nature Methods
3D structure prediction QUARK
3D structure prediction MODbase
  Andrej Sali http://modbase.compbio.ucsf.edu/
  Pieper et al (2011) Nucleic Acids Research