Presentation is loading. Please wait.

Presentation is loading. Please wait.

Bioinformatics 2 -- lecture 20 Protein design -- the state of the art.

Similar presentations


Presentation on theme: "Bioinformatics 2 -- lecture 20 Protein design -- the state of the art."— Presentation transcript:

1 Bioinformatics 2 -- lecture 20 Protein design -- the state of the art

2 Protein folding/ protein design folding sequence structure design

3 Sequence space maps to structure space..as many-to-one. sequence families fold

4 Site-directed mutagenesis, minimization (J. Wells, 1980's-90’s) Coiled coils, helix bundles (DeGrado, 1980's-90's) Binary patterning (Hecht, 1990’s) Extreme protein stabilization (Mayo, 1990's) Binding pocket design (Hellinga, 2000) New fold design (Kuhlman & Baker, 2002-4) Protein-protein interface design (Gray & Baker, 2004) Open source protein design algorithm EGAD (Pokala, 2005) Novel enzyme by design (Baker, 2008) Experimental approaches: in vitro evolution phage display Short history of protein design Computational approaches: Dead-End elimination binary patterning

5 Minimizing proteins (Bing Li et al, Science, 1995) rational design ANP = atrial natriuretic peptide

6 Binary patterning in proteins (Kamtekar et al, Science, 1993) rational design

7

8 Computational protein design using Dead-End Elimination Select positions for mutating. Select allowed amino acids at those positions. For the selected amino acids, try all sidechain orientations (rotamers). Chose the set of rotamers that gives the lowest “energy”

9 Sidechain Rotamers Sidechain conformations fall into three classes called rotational isomers, or rotamers. A random sampling of Phenylalanine sidechains, w/backbone superimposed

10 Sidechain rotamers CG H H H O=CO=C N CA CB CG H H H O=CO=C N CA CB CG H H H O=CO=C N CA CB "m" "p" "t" -60° gauche 180° anti/trans+60° gauche 1-4 interactions differ greatly in energy depending on the moieties involved.

11 Rotamer stability is dependent on the backbone  angles W sidechain is shown here lying over Thr backbone Rotamers of W*:  P|  =-140,  =160 P|  =-60,  =-40 p-90+60-900.3720.079 p90+60+900.2380.005 t-105180-1050.0330.251 t90180900.0210.268 m0-6550.0380.124 m95-65950.1830.203

12 Rotamer Libraries Rotamer libraries have been compiled by clustering the sidechains of each amino acid over the whole database. Each cluster is a representative conformation (or rotamer), and is represented in the library by the best sidechain angles (chi angles), the "centroid" angles, for that cluster. Two commonly used rotamer libraries: *Jane & David Richardson: http://kinemage.biochem.duke.edu/databases/rotamer.php Roland Dunbrack: http://dunbrack.fccc.edu/bbdep/index.php *rotamers of W on the previous page are from the Richardson library.

13 sidechain prediction Desmet et al, Nature v.356, pp339-342 (1992) Given the sequence and only the backbone atom coordinates, accurately model the positions of the sidechains. fine lines = true structure thick lines = sidechain predictions using the method of Desmet et al.

14 Theoretical complexity of sequence design Estimated number of sidechain rotamers: R=193 Typical small protein length: L=100 residues Sequence complexity: 20 100 = 1.3*10 130 Rotamer complexity: 193 100 = 3.6*10 228 Complexity of DEE algorithm: O( R 2 L 2 ) = 3.6*10 8

15 Dead end elimination theorem Each residue is numbered (i or j) and each residue has a set of rotamers (r, s or t). So, the notation i r means "choose rotamer r for position i". The total energy is the sum of the three components: NOTE: E global ≥ E GMEC for any choice of rotamers. E global = E template +  i E(i r ) +  i  j E(i r,j s ) where r and s are any choice of rotamers. fixed-fixed fixed-movable movable-movable

16 Dead end elimination theorem If i g is in the GMEC and i t is not, then we can separate the terms that contain i g or i t and re-write the inequality. E(i r ) +  j min s E(i r j s ) > E(i t ) +  j max s E(i t,j s ) E GMEC = E template + E(i g ) +  j E(i g,j g ) +  j E(j g ) +  j  k E(j g,k g ) E notGMEC = E template + E(i t ) +  j E(i t,j g ) +  j E(j g ) +  j  k E(j g,k g )...is less than... E(i r ) +  j E(i r j s ) > E(i g ) +  j E(i g,j s ) Canceling all terms in black, we get: So, if we find two rotamers i r and i t, and: Then i r cannot possibly be in the GMEC.

17 Dead end elimination theorem E(i r ) +  j min s E(i r j s ) > E(i t ) +  j max s E(i t,j s ) DEE theorem can be translated into plain English as follows: If the "worst case scenario" for t is better than the "best case scenario" for r, then you always choose t.

18 DEE algorithm E(r 1 ) 11 351 55 -225 05 000 001 1250 430 35 155 11 -200 250 50 0124 053 100 r 1 r 2 E(r 1,r 2 ) 1 2 3 213 a b c a b c a b c abcabcabc 0050000012 0 0 5 0 0 0 0 0 E(r 2 ) Find two columns (rotamers) within the same residue, where one is always better than the other. Eliminate the rotamer that can always be beat. (repeat until only 1 rotamer per residue) E global = E template +  i E(i r ) +  i  j E(i r,j s ) x x x x x x

19 DEE algorithm E(r 1 ) 11 351 55 -225 05 000 001 1250 430 35 155 11 -200 250 50 0124 053 100 r 1 r 2 E(r 1,r 2 ) 1 2 3 213 a b c a b c a b c abcabcabc 0050000012 0 0 5 0 0 0 0 0 E(r 2 ) Find two columns (rotamers) within the same residue, where one is always better than the other. Eliminate the rotamer that can always be beat. (repeat until only 1 rotamer per residue) E global = E template +  i E(i r ) +  i  j E(i r,j s )

20 Sequence design using DEE abcabc 1 2 3 -111 351 55-1 -2 1 5 2 0 5 -1 2 0 0 0 3 0 0 1 1 12 5 0 -3 4 3 0 1 -135 155 11-1 -200 250 5-10 223 0124 053 100 1-31 r1r1 r2r2 E(r 1,r 2 ) 1 2 3 21 3 abcabc abcabc abababab abcabcabcabcab a b 00500 000122 0 5 0 12 2 E(r 2 ) E(r 1 ) Asp Leu “Rotamers” within the DEE framework can have different atoms. i.e. they can be different amino acids. Using DEE, we choose the best set of rotamers. Now we have the sequence of the lowest energy structure. In the example, we have D or L at position 3.

21 Proteins can be made super-stable [GdnHCl] Folded natural seq designed seq 8M Malakauskas SM and Mayo SL (1998) “Design, Structure, and Stability of a Hyperthermophilic Protein Variant.” Nature Struct. Biol., 5, p.470. some amazing accomplishments in protein design 4M

22 Computationally re-designed proteins are consistently stable Dantas et al., J. Mol. Biol. (2003) 332, 449–460 some amazing accomplishments in protein design

23 Distinct conformational states can be stabilized.  M  2 integrin I domain in 2 conformations 2 crystal structures are known. They differ in the highlighted region. Shimaoka et al designed sequences for each form, open and closed. The two designs were shown to have different physiological properties. Shimaoka, M., Shifman, J. M., Takagi, J., Mayo, S. L., Springer, T. A. (2000) “Computational design of an integrin I domain stabilized in the high affinity conformation.” Nature Struc. Biol. 7(8), 674-678. some amazing accomplishments in protein design PDB codes: 1an5 1n9z

24 New binding sites can be designed Looger, L. L., Dwyer, M. A., Smith, J. J. & Hellinga, H. W. Nature 423, 185–190 (2003). Used to bind arabinose, now it binds seratonin. some amazing accomplishments in protein design

25 Re-designing a binding site Recent success of sequence design

26 DEE with alternative sequences and ligands abcabc L 2 3 -111 351 55-1 -2 1 5 2 0 5 -1 2 0 0 0 3 0 0 1 1 12 5 0 -3 4 3 0 1 -135 155 11-1 -200 250 5-10 223 0124 053 100 1-31 r2r2 E(r 1,r 2 ) L 2 3 21 3 abcabc abcabc abababab a bc a b 00500 000122 0 5 0 12 2 E(r 2 ) E(r 1 ) Asp Leu Each alternative ligand position is another “rotamer”. Ligand conformers. r1r1

27 An appropriate binding site was found Looger, L. L., Dwyer, M. A., Smith, J. J. & Hellinga, H. W. Nature 423, 185–190 (2003). The native ligand (arabinose) is approximately the same size as the targeted ligand (seratonin).

28 A space was carved out for the ligand Looger, L. L., Dwyer, M. A., Smith, J. J. & Hellinga, H. W. Nature 423, 185–190 (2003). All sidechains in the binding site were truncated to alanines, and a space was defined (yellow) for the new ligand. Lots of possible ligand orientations were made. Ligand orientations were treated like rotamers in DEE!

29 Sidechains were chosen using DEE Looger, L. L., Dwyer, M. A., Smith, J. J. & Hellinga, H. W. Nature 423, 185–190 (2003). Ligand position, along with sidechain identity and sidechain rotamer are chosen simultaneously.

30 H-bonding is key Successful designs have all H-bond donors and acceptors satisfied.

31 New folds can be designed Kuhlman et al.Science, v.302(5649), 1364-1368 (2003) New proteins can be designed that have never been seen before. The designs are accurate (compare red and blue above) and they are highly stable. some amazing accomplishments in protein design

32 Designing an enzyme An enzyme must bind the transition state tighter than the ground states High-resolution crystal structure closely matches the computational design. “Kemp elimination catalysts by computational enzyme design” Röthlisberger, Khersonsky,... David Baker. Nature 19 March 2008

33 In-class exercise: redesign an active site Make cocaine! (Use Builder. See diagram) Bind it to a G-protein coupled receptor (adenergic receptor) PDB code 1BAK Avoid collisions with backbone. Ignore sidechains. Bury it deep for maximum affinity. Use rotamer explorer to modify the sequence of the binding site. Two considerations The shape of the site must fit the ligand The protein must still fold. (changes should be similar sidechains if possible)


Download ppt "Bioinformatics 2 -- lecture 20 Protein design -- the state of the art."

Similar presentations


Ads by Google