10. Protein interface Alanine Scanning and Design, continued

Interface manipulation and design
Computational alanine scanning; computational binding prediction; interface design; specificity switch; negative design; multistate design; de novo design; design of multiprotein assemblies: cages and layers

Negative design
Problem: optimization for a given fold or interaction does not guarantee that alternative folds or interactions are not more favorable for the sequence. Solubility: prevent aggregation. Compactness: prevent molten globule states. Specificity: negative design prevents alternative conformations and interactions.

Design of Homo-dimeric coiled-coils (Havranek & Harbury NSB 2003)
Negative design against the heterodimer. Sequence 2 is better than Sequence 1: it is specific for the homodimer, even though its energy is higher.
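The point of this slide can be made with a few numbers. The sketch below is illustrative only: the energies are invented (arbitrary units, lower is more favorable), and `specificity_gap` is a hypothetical helper, not part of any design package.

```python
def specificity_gap(target_energy, competitor_energies):
    """Energy gap between the best competing state and the target state.

    A positive gap means the target state (here the homodimer) is the most
    favorable state available to the sequence.
    """
    return min(competitor_energies) - target_energy


# Invented energies for two candidate sequences.
sequences = {
    "sequence_1": {"homodimer": -12.0, "heterodimer": -14.0},
    "sequence_2": {"homodimer": -10.0, "heterodimer": -6.0},
}

for name, states in sequences.items():
    gap = specificity_gap(states["homodimer"], [states["heterodimer"]])
    print(name, "homodimer energy:", states["homodimer"], "gap:", gap)
```

In this toy case Sequence 1 has the lower homodimer energy but a negative gap (it prefers the heterodimer), while Sequence 2 has a higher homodimer energy and a positive gap, which is the sense in which it is the better, specific design.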

Multispecificity design: binding to many partners Humphris & Kortemme (2007) PLoS CB
What restrictions does evolution place on protein binding? How is promiscuity achieved? The approach: design proteins with a shared interface through which they bind several partners (not at once, of course). Such designs let us explore an evolutionary question: was the evolutionary pressure the same for all these interactions, or is the sequence of the binder a compromise that can bind all partners, while alternative sequences could have bound individual partners more strongly? For each position in the interface, compare

Multispecificity design: binding to many partners
Protocol: 1. Assemble a dataset of 20 proteins with solved complex structures. 2. Redesign the interface sequence using a genetic algorithm. The dataset comprises 20 different proteins, each known to bind several partners through a shared interface, with the pairwise interactions solved by crystallography. The design strategy compares the optimal sequence for binding each single partner with the optimal sequence for binding all of them: the design either considers one state at a time, or evaluates the binding energy of each proposed sequence to all partners (as in negative design, except that here the purpose is to lower the overall energy of all states rather than to lower one relative to the others).
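The difference between single-constraint and multi-constraint optimization can be sketched on a toy problem. Everything here is an assumption made for illustration: the score table `TOY_SCORES` is invented, the interface has only two positions, and exhaustive enumeration replaces the genetic algorithm used in the study, because the toy search space has just four sequences.

```python
from itertools import product

# Invented per-position binding scores (lower is better) for a two-position
# interface and two partners. Partner B sterically rejects S at position 0.
TOY_SCORES = {
    "partner_A": [{"G": -1.0, "S": -3.0}, {"R": -4.0, "K": -2.0}],
    "partner_B": [{"G": -1.0, "S": 5.0}, {"R": -4.0, "K": -3.0}],
}
ALPHABET = [("G", "S"), ("R", "K")]  # allowed residues at each position


def binding_score(sequence, partner):
    """Sum the toy per-position scores of `sequence` against one partner."""
    return sum(table[aa] for table, aa in zip(TOY_SCORES[partner], sequence))


def single_constraint_fitness(sequence, partner):
    """Only one partner (one state) is considered."""
    return binding_score(sequence, partner)


def multi_constraint_fitness(sequence):
    """All partners count: the total energy over all states is lowered."""
    return sum(binding_score(sequence, p) for p in TOY_SCORES)


def best_sequence(fitness):
    """Exhaustive search; stands in for the genetic algorithm on this toy."""
    return min(product(*ALPHABET), key=fitness)


best_for_a = best_sequence(lambda s: single_constraint_fitness(s, "partner_A"))
best_multi = best_sequence(multi_constraint_fitness)
print("single (partner_A):", "".join(best_for_a))
print("multi:", "".join(best_multi))
```

Optimizing for partner A alone picks the larger residue at the first position, while the multi-constraint optimum settles for glycine there because the other partner cannot accommodate anything larger.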

Multi-faceted binding in Hub protein RAN
The second row in the table lists the native sequence (red), and the following rows list the sequences predicted to be optimal in each simulation: multi-constraint (second sequence, MULTI) and single-constraint (third through seventh sequences, SINGLE). Plus signs denote that the wild-type amino acid was recovered as optimal. The number and percentage of interface residues identical to native are shown for each simulation in the rightmost column. Grey shading denotes positions that are not at the interface with the given partner in that structure. At position 74, the design simulations predict that three of Ran's binding partners prefer side chains larger than the wild-type glycine (these form hydrogen bonds). However, tight steric constraints for binding the remaining two partners make glycine the "optimal" compromise for this interface position. At position 76, Arg is predicted to be highly shared among all partners: it is recovered by almost every single-constraint simulation, where it mediates an inter-chain hydrogen bonding network. Humphris & Kortemme (2007) PLoS CB

Nature gains multispecificity by two strategies
Group I: small shared interface; using multiple constraints gives little improvement in sequence recovery. Group II: large shared interface; multiple constraints improve sequence recovery.
The number of residues recovered as identical to native is plotted for each of the 20 proteins. The size of the shared interface is shown for each protein in red. For roughly half the dataset (group II, pink shading), sequence recovery from the multi-constraint simulations (black) significantly outperformed the average single-constraint recovery (grey). The remaining proteins (group I, blue shading) showed similar native recovery regardless of whether sequences were optimized with respect to one or all characterized partners. Error bars represent the best and worst native sequence recovery in a single-constraint optimization. Humphris & Kortemme (2007) PLoS CB

Difference in binding contribution
Group I: single-constraint performs as well as multi-constraint. Group II: multi-constraint performs better than single-constraint. "Tradeoff value": the improvement in energy of the single-partner design compared with the multi-partner design. Highly shared residues: residues with low tradeoff values.
The tradeoff at each interface position in the dataset was estimated by the per-residue difference between the scores of the amino acids chosen when each partner was optimized alone and those chosen when all binding partners were considered in the optimization. This parameter measures the energy lost in binding to a partner when the interface has to "settle" for a different residue. The percentage of interface sites displaying the lowest level (0–0.5) of tradeoff value is shown for all 20 proteins. Such positions are predicted to be highly shared, in that no partner considered had to "give up" potential gain so that other partners could fulfill their optimal interactions. Blue and pink shading denotes whether each protein was assigned to group I or II. Humphris & Kortemme (2007) PLoS CB
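A minimal sketch of how such a tradeoff value could be computed from per-residue scores. The scores are invented, and taking the maximum loss over partners as the value for a position is an assumption of this sketch; the slide does not say how partners are aggregated.

```python
def position_tradeoff(single_scores, multi_scores):
    """Largest per-partner score loss at one interface position.

    single_scores[p]: score of the residue chosen when partner p is optimized alone.
    multi_scores[p]:  score against partner p of the residue chosen for all partners.
    Lower scores are better, so the loss for partner p is multi - single.
    """
    return max(multi_scores[p] - single_scores[p] for p in single_scores)


def fraction_highly_shared(tradeoffs, upper=0.5):
    """Fraction of positions whose tradeoff falls in the lowest bin (0 to `upper`)."""
    shared = [t for t in tradeoffs if 0.0 <= t <= upper]
    return len(shared) / len(tradeoffs)


# Invented scores for three interface positions and two partners (A and B).
positions = [
    ({"A": -2.0, "B": -2.0}, {"A": -2.0, "B": -2.0}),   # same residue is optimal for all
    ({"A": -3.0, "B": -1.0}, {"A": -1.0, "B": -1.0}),   # partner A gives up 2.0
    ({"A": -1.5, "B": -1.0}, {"A": -1.25, "B": -1.0}),  # small loss of 0.25
]
tradeoffs = [position_tradeoff(single, multi) for single, multi in positions]
print(tradeoffs, fraction_highly_shared(tradeoffs))
```

Under these assumptions two of the three toy positions fall in the 0–0.5 bin and would count as highly shared.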

Difference in binding contribution
Low compromise: ovomucoid inhibitor. Medium compromise: CheY. High compromise: Ran.
Looking at the roles these proteins play in the cell, proteins with a low percentage of shared residues (most positions require some compromise) need to bind many different partners, e.g. ubiquitin (which binds many E3 ligases), whereas proteins with a high percentage of shared residues (low compromise) bind strongly to a small number of partners. Humphris & Kortemme (2007) PLoS CB

What next? De novo design of interaction (Fleishman 2011, Science; Fleishman 2011, JMB)
Aim: design a new interaction from scratch. System: a high-affinity binder to the constant (conserved) region of influenza hemagglutinin (1918 pandemic). Such a binder could help toward a general vaccine and the eradication of influenza. A broadly neutralizing antibody is known (CR6261).
The goal was not to use an existing interaction as a template, but to create a completely new interaction. The target chosen was hemagglutinin from the influenza virus, which is responsible for binding to the host cell membrane and for fusion with the endosome membrane. Antibodies against this protein already exist, but most bind to its head, a region that mutates frequently and so escapes the immune system. The stem region has a conserved, accessible patch that only 2 antibodies are known to bind. This was chosen as the patch against which a protein partner would be designed. It was reasonable to assume that such a binder can be obtained, since evolution occasionally shows convergence: two very different proteins that bind their targets in different ways.

Overview of approach (Fleishman 2011, Science)
The strategy was to design an interface that has both high shape complementarity and a core region of highly optimized, hot spot-like residue interactions. First, hotspot residues on the interface are generated or identified; then a scaffold protein that can accommodate these hotspots is selected from a large protein set and grafted onto them to yield a binder.

1. Hotspot library design
Hotspots from a known interface. More general: individual residues mapped onto the surface. Dock single amino acids onto defined surface patches of the target: HS1, HS2, HS3. Create libraries (inverse rotamer approach).
There are two possible approaches to constructing the hotspot library. One is to start from a known interaction, in this case hemagglutinin bound to the CR6261 antibody, and use the hotspots of the antibody (detected by alanine scanning; here a Tyr was taken from the antibody). If no starting structure exists, it is possible to dock single amino acids onto the target structure and identify where they have good interaction energies. Three hotspot residues that lie close together in space were identified. When the hotspots are created by docking, inverse rotamers can be built (start at the end of the side chain and build it back up to the Cα atom) for several low-energy conformations of each docked hotspot.

2. Find shape complementary scaffolds
Search a set of 865 proteins that are easy to express. Use PatchDock to find shape complementarity. Refine with RosettaDock, with constraints to match as many hotspots as possible. Filter: buried surface area > 1000 Å², binding energy < -15 REU, shape complementarity > 0.65. Replace all interface residues in the scaffold with Ala (except Gly and Pro) to increase the chance of a match.
In this step, a protein that matches both the hotspot positions and the surface of the hemagglutinin is searched for. For each hotspot, an attempt is made to refine the scaffold and the side chains so that they fit without distorting bond angles. This is followed by filtering, to make sure that shape complementarity is maintained and the binding energy is significant. The scaffold's interface side chains are removed so that they do not clash with the hotspots or the protein partner: they will be redesigned in any case, and removing them keeps them from interfering with the detection of a good backbone scaffold.
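The filtering step amounts to three threshold tests applied to every docked scaffold. The sketch below assumes that the "< -15 REU" threshold refers to the computed binding energy; the `DockedModel` class and the candidate values are invented for illustration and are not Rosetta output.

```python
from dataclasses import dataclass


@dataclass
class DockedModel:
    name: str
    buried_area: float            # buried surface area, in square angstroms
    binding_energy: float         # in Rosetta Energy Units (REU); lower is better
    shape_complementarity: float  # dimensionless; higher is better


def passes_filters(model, min_area=1000.0, max_energy=-15.0, min_sc=0.65):
    """Apply the three thresholds from the slide; a model must pass all of them."""
    return (
        model.buried_area > min_area
        and model.binding_energy < max_energy
        and model.shape_complementarity > min_sc
    )


# Invented candidates, for illustration only.
candidates = [
    DockedModel("model_a", 1250.0, -18.2, 0.70),
    DockedModel("model_b", 1400.0, -12.0, 0.72),  # binding energy too weak
    DockedModel("model_c", 900.0, -20.0, 0.68),   # interface too small
]
kept = [m.name for m in candidates if passes_filters(m)]
print(kept)
```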

3. Incorporate hotspot residues
Replace matching positions on the scaffold with hotspot residues from the library. For each position near a hotspot in the scaffold, and for each rotamer in the library: 1. attach the scaffold to the hotspot; 2. optimize the rigid-body (RB) orientation. Applied to: HS1 -> HS2 (two-residue strategy); HS3 -> HS1 & HS2 (three-residue strategy).
The next step makes sure that the roughly matching hotspots and scaffold match precisely. For each hotspot position, a rotamer is chosen, the scaffold is aligned to it, and the rigid-body orientation is optimized to eliminate steric clashes while maintaining the hotspot's interaction. If the attempt fails, an alternative rotamer is tested in the same manner. This is applied to the second hotspot given the first for a binder based on 2 hotspots, and to two hotspots given the third for a binder based on 3 hotspots. At the end of the step the scaffold matches all 2 or 3 hotspots.

4. Design scaffold residues around hotspots
Several rounds of design and minimization. Reduce the number of mutations: residues with an improvement of < 0.5 REU are reverted to wt.
In this step, to minimize divergence from the native sequence (and so obtain the expected fold with higher probability), any mutation that improved the energy by less than 0.5 Rosetta Energy Units (REU) was reverted to the residue found in the original wt protein.
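The reversion rule can be sketched as a simple pass over the designed sequence. The sequences and per-mutation energy gains below are invented, and how the gain of an individual mutation is measured is not specified on the slide, so `gains` is treated as a given input.

```python
def revert_marginal_mutations(wild_type, designed, gains, min_gain=0.5):
    """Return the designed sequence with weakly beneficial mutations reverted.

    gains[i] is the energy improvement (in REU, positive means better)
    attributed to the mutation at position i. Mutations that gain less than
    `min_gain` are set back to the wild-type residue.
    """
    final = []
    for i, (wt, des) in enumerate(zip(wild_type, designed)):
        if des != wt and gains.get(i, 0.0) < min_gain:
            final.append(wt)   # not worth the divergence from native
        else:
            final.append(des)  # unchanged position or a mutation worth keeping
    return "".join(final)


# Invented example: three mutations, one of which gains only 0.2 REU.
wild_type = "MKTAYIA"
designed = "MRTFYLA"
gains = {1: 0.2, 3: 1.8, 5: 0.7}
final_sequence = revert_marginal_mutations(wild_type, designed, gains)
print(final_sequence)
```

In this example only the mutation at index 1 is reverted; the other two exceed the 0.5 REU threshold and are kept.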

5. Results: 2/88 bound with medium affinity
88 designs, derived from 79 different protein scaffolds, with an average of 11 mutations. Importance of structural genomics: it provides good scaffolds. Experimental assessment by yeast display, which allows fast validation of many candidates. Specificity of binding was assessed by competition with the CR6261 neutralizing antibody.
The low average number of mutations may mean that, in general, it is not that hard to take two unrelated proteins and obtain a new interface between them with only a small number of mutations (11 here). To screen the 88 designs quickly, yeast surface display was used, a method in which yeast cells express the protein to be tested on their surface. For proteins detected as binders by yeast display, competitive binding against the antibody, which is known to bind the intended patch, was used to check that binding to hemagglutinin indeed occurs at that patch.

6. Proof: crystal structure

7. What next? Affinity maturation with yeast surface display
Express the protein of interest on the yeast surface. Rapidly identify binding partners. Fast in vitro evolution. Simultaneous detection of expression and binding. Figure labels: streptavidin, biotin, phycoerythrin.
The gene encoding the protein to be expressed is fused to the Aga2 protein, which is localized to the yeast cell surface; this removes the need for protein purification before in vitro binding assays. A cMyc tag, detected by an antibody, reports that the protein is expressed and displayed. The partner, in this case the hemagglutinin (HA), is tagged with biotin: if the two bind, washing will not remove the HA, and the biotin will be bound by fluorescent streptavidin. The system also allows in vitro evolution: isolated genes that encode binders can be amplified with inserted mutations.

Affinity maturation: few mutations increase affinity dramatically, and identify weaknesses of the computational approach
The improvement in binding by in vitro evolution both gives a final product that is much more potent than the computationally designed one (22 and 38 nM, as shown by SPR, ELISA and co-elution on a size-exclusion column) and highlights the weak spots of the computational design, which can then be fixed by learning from the experimental results. Figure: binding titrations of the two designed binders to HA as measured by yeast surface display. Red circles represent the affinity-matured design; blue squares, the scaffold protein from which the design is derived; and black crosses, the design in the presence of 750 nM inhibitory CR6261 Fab.

8. What can we improve? Steric interactions, salt bridges, solvation
The first row shows the differences between the designed and the evolved protein for the first binder, HB36, and the second row shows them for the second binder, HB80. From these differences it is possible to learn what Rosetta did not model properly. Repulsive interactions: A60V (A), M26T (B); e.g. a very small backbone move allows a Val to be accommodated instead of Ala without clashes, and M26T showed that the interaction within the monomer was not emphasized enough, since T has a better steric fit to Y40. Electrostatics: N64K (C), N36K (D); e.g. Rosetta did not recognize that K64 binds more strongly than N64, since electrostatics is not modelled explicitly in score12. Solvation: D47S (E), D12G (F); e.g. D47 interacts with an Ile on HA, and the cost of its desolvation was higher than estimated by Rosetta.

Generalization of de novo design (Fleishman 2011, JMB)
Protocol tested on a benchmark of interactions.
The next work was an attempt to generalize the approach. A benchmark of interactions was used to test the protocol with two types of generated hotspots: either hotspots from the native interaction, only diversified, or a combination of these diversified wt hotspots and de novo generated hotspots (from the single-residue docking approach). The table shows the different interactions, the hotspot generation method used, the identity of the hotspots, the structure RMSd, the sequence identity (rather low, which hints that only a small number of residues are really critical for a fold), the rank of the design closest to native among the different binding clusters (all conformations that cluster within 4 Å), and whether additional scaffolds were found other than the native structure (an alternative structure does not always exist). The ranking suggests that Rosetta's energy function can distinguish the binding state from alternative complexes (it is almost always ranked first).

Incorporate negative design
Observation: restricted side chain plasticity in the interfaces of native complexes. Hotspot residues are prepositioned: they do not change upon binding and are stabilized in the monomer. This might prevent non-native interactions by restricting side chain conformation: negative design in Nature.
Aim: reproduce this in our designs. Stabilize hotspots within the monomer; good internal as well as binding energy; design in clusters. A critical feature for successful designs!
The next challenge is to make sure that the hotspots interact as planned with the receptor and also interact favorably within the designed monomer. Hotspots were observed to be stabilized by interactions with other residues within a monomer, and maximizing binding energy without considering the monomer's stability leads to the design of proteins that have alternative conformations.

Challenges ahead: challenging interfaces in nature
Networks of hydrogen bonds and waters. Strand pairing. Sheet interactions. Antibodies: considerable loop flexibility allows creation of binding partners using Y/S alone.
The diversity of protein interface characteristics observed in nature suggests future challenges for computational design. In the HA work, a hydrophobic helix was designed to bind a hydrophobic groove. The greater challenges include designing a sophisticated water-mediated hydrogen bond network, as seen in barnase-barstar. (C) Strand pairings at an interface feature regular repeats of polar atoms. (D) Imitating an antibody interface that features long loops will require precise backbone conformational sampling and scoring methods. Loops provide a rich diversity of backbone conformations, such that binding can occur using only tyrosine and serine side chains (5). (E) The quaternary structure of an antibody is stabilized by a sheet-sheet interface.

Interface design - summary
Binding prediction: the effect of point mutations is effectively predicted. Prediction of the binding specificity of different protein pairs is difficult. Polar effects are modeled less well than hydrophobic interactions.
Design of binding: creation of specificity switches is difficult, but possible; e.g. the design of new, specific Im7-E7 binding pairs was partially achieved. Combine computational design with experimental refinement (e.g. in vitro evolution). Negative design can be important to achieve binding specificity. De novo design of an interaction has been achieved!

Multiprotein complex design: cages and layers
Natural large protein assemblies. Biotechnology goal: efficient design of protein assemblies. Applications: vaccine design, molecular delivery agents. These are multiprotein complexes that enclose a cavity and, if controlled, can release their cargo where it is needed.

Design strategies: fusion of natural oligomers; interface design
One way to get cages is to take two different existing interfaces and combine them on one protein. Another way is to redesign an interface.

Goal: Design cage-like protein nanomaterials
Cage-like architectures can be generated from sets of three-fold rotational symmetry axes, which allows the use of protein trimers with C3 symmetry as building blocks. The number of new protein-protein interfaces that must be designed is reduced because of the natural interface within the oligomer. Furthermore, the energetic contribution of each designed interaction is multiplied by the symmetry of the building block, which reduces the number of distinct new interactions required to overcome the entropic cost of self-assembly. E.g. octahedral symmetry with trimer building blocks. King, Sheffer, et al. Science 2012
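As a small worked example of the geometry, the number of trimers needed for a cage follows from the order of the symmetry group. The sketch assumes one subunit per asymmetric unit, so that the subunit count equals the order of the rotation group; the tetrahedral and icosahedral entries are added for comparison and are not discussed on the slide.

```python
# Orders of the rotation groups of the three polyhedral point symmetries.
ROTATION_GROUP_ORDER = {"tetrahedral": 12, "octahedral": 24, "icosahedral": 60}


def trimer_cage_composition(symmetry):
    """Subunit and trimer counts for a cage with C3 trimers on the three-fold axes.

    Assumes one subunit per asymmetric unit, so the subunit count equals the
    order of the rotation group.
    """
    subunits = ROTATION_GROUP_ORDER[symmetry]
    return {"subunits": subunits, "trimers": subunits // 3}


for name in ROTATION_GROUP_ORDER:
    print(name, trimer_cage_composition(name))
```

For the octahedral case this gives 24 subunits arranged as 8 trimers, all related by symmetry, so a single designed interface is repeated throughout the cage.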

Computational method: Rosetta matdes_dock
Symmetric docking of protein building blocks in a target symmetric architecture. Design of low-energy protein-protein interfaces between the building blocks.

Step 1. Symmetrical docking
Input: single subunit + symmetry definition file. Symmetrical arrangement of the building block at the origin. The full space of contacting symmetric configurations is sampled. Configurations with the highest number of Cβ contacts are selected for interface design. King, Sheffer, … Baker. Science 2012
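Ranking docked configurations by Cβ contacts can be sketched as a pairwise distance count. The coordinates are invented, and the 10 Å cutoff is an assumption borrowed from the interface definition in the design step; the slide does not state the cutoff used during docking. This is not the matdes_dock implementation.

```python
import math


def count_cb_contacts(cb_a, cb_b, cutoff=10.0):
    """Count pairs of Cb atoms, one from each building block, within `cutoff` angstroms."""
    return sum(1 for a in cb_a for b in cb_b if math.dist(a, b) <= cutoff)


# Invented Cb coordinates (angstroms) for two docked configurations,
# each given as (building block 1, building block 2).
configurations = {
    "config_1": ([(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
                 [(8.0, 0.0, 0.0), (30.0, 0.0, 0.0)]),
    "config_2": ([(0.0, 0.0, 0.0)],
                 [(50.0, 0.0, 0.0)]),
}

contacts = {name: count_cb_contacts(a, b) for name, (a, b) in configurations.items()}
best = max(contacts, key=contacts.get)
print(contacts, "best:", best)
```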

Step 2. Interface design King, Sheffer, et al. Science 2012
Form a grid of configurations centered at the docked configuration. Select the amino acids to be changed according to specific criteria (I to IV below). Design only these residues (RosettaDesign). Test mutations using Foldit. Select models according to interface size, shape complementarity and binding energy. Minimize side chains at the designed positions.
Selection criteria: I) The position must be at an inter-building-block interface, defined as a position whose beta carbon is within a user-defined cutoff distance of a beta carbon in a different building block (in this study the default 10 Å cutoff was used). II) The position must not be making contacts within the oligomeric building block (any atoms within 5 Å), with the exception that a residue that makes intra-building-block contacts but has a high clash score in Rosetta is selected for design. III) The residue must have a nonzero solvent accessible surface area in the monomeric state. IV) Interface proline and glycine residues are not designed, but proline side chains are allowed to repack. King, Sheffer, et al. Science 2012
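The four selection criteria combine into a single yes/no test per position, sketched below. The `Position` fields, the example values and the `high_clash` threshold are hypothetical: the slide says only that the clash score must be "high", without giving a number.

```python
from dataclasses import dataclass


@dataclass
class Position:
    residue: str                # one-letter amino acid code
    cb_dist_other_block: float  # distance to nearest Cb in another building block (angstroms)
    intra_block_contact: bool   # any atom within 5 angstroms inside the oligomer
    clash_score: float          # hypothetical per-residue clash score
    monomer_sasa: float         # solvent accessible surface area in the monomer


def is_designable(pos, cb_cutoff=10.0, high_clash=2.0):
    """Return True if the position satisfies criteria I to IV."""
    at_interface = pos.cb_dist_other_block <= cb_cutoff           # criterion I
    free_or_clashing = (not pos.intra_block_contact
                        or pos.clash_score > high_clash)          # criterion II
    exposed = pos.monomer_sasa > 0.0                              # criterion III
    not_pro_gly = pos.residue not in ("P", "G")                   # criterion IV
    return at_interface and free_or_clashing and exposed and not_pro_gly


# Invented positions, one per case.
examples = [
    Position("A", 7.5, False, 0.0, 35.0),   # passes all criteria
    Position("L", 14.0, False, 0.0, 20.0),  # fails I: too far from the other block
    Position("V", 6.0, True, 0.1, 12.0),    # fails II: intra-block contact
    Position("F", 6.0, True, 5.0, 12.0),    # passes II through the clash exception
    Position("S", 8.0, False, 0.0, 0.0),    # fails III: buried in the monomer
    Position("G", 5.0, False, 0.0, 40.0),   # fails IV: glycine is not designed
]
results = [is_designable(p) for p in examples]
print(results)
```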

Experimental methods: PAGE
Determine the size of each protein (PAGE; non-denaturing conditions). Shift in apparent size relative to the corresponding wt trimer -> self-assembly to the designed size. King, Sheffer, et al. Science 2012

Experimental methods: Negative-stain electron microscopy
Particles of the expected size (~13 nm). Resemblance to projections along the symmetry axes.

EM and crystal structure are similar

Extension to multi-component systems
King, Bale, Sheffer, … & Baker. Nature 2014

Layer design Shane, … & Baker. Science 2015
Layers with different types of symmetry were designed. They are useful as platforms that present different molecules.

Design of cages and layers: Conclusions
Symmetric docking & interface design: a simple strategy, generally applicable to a broad range of symmetric materials. Building blocks: naturally occurring oligomeric proteins here, but de novo designed ones are possible too. Designed, self-assembling protein materials: a basis for advanced functional materials and custom-designed molecular machines.

Using Rosetta - the day after
Servers: Robetta, ROSIE, Backrub, Design, FlexPepDock, PIPER-FlexPepDock.
Robetta: structure prediction, interface alanine scanning, DNA interface residue scanning and creation of fragment libraries. Backrub: modelling small movements in proteins that do not propagate along the whole protein, such as those seen in high-resolution structures. Design: planning new proteins by suggesting sequences that fit an existing fold or a new one. ROSIE: a platform for running a number of different Rosetta protocols, e.g. RNA structure prediction, protein-protein docking, increasing the net charge on a protein surface, modelling a symmetric homo-multimer, and detecting dominant linear stretches in protein interfaces.
