Presentation is loading. Please wait.

Presentation is loading. Please wait.

ROTAMER OPTIMIZATION FOR PROTEIN DESIGN THROUGH MAP ESTIMATION AND PROBLEM-SIZE REDUCTION Hong, Lippow, Tidor, Lozano-Perez. JCC. 2008. Presented by Kyle.

Similar presentations


Presentation on theme: "ROTAMER OPTIMIZATION FOR PROTEIN DESIGN THROUGH MAP ESTIMATION AND PROBLEM-SIZE REDUCTION Hong, Lippow, Tidor, Lozano-Perez. JCC. 2008. Presented by Kyle."— Presentation transcript:

1 ROTAMER OPTIMIZATION FOR PROTEIN DESIGN THROUGH MAP ESTIMATION AND PROBLEM-SIZE REDUCTION Hong, Lippow, Tidor, Lozano-Perez. JCC. 2008. Presented by Kyle Roberts

2 Problem Statement  Protein structure prediction  Homology modeling  Side-chain placement  Protein design problems  Given backbone and energy function find minimum energy side-chain conformation  The Global Minimum Energy Conformation (GMEC) problem

3 Current Approaches  Dead-End Elimination (DEE)  Branch-and-bound method ( Leach, Lemon. Proteins 1998 )  Linear Programming  Dynamic Programming  Approximate Methods (SCMF, MC, BP)

4 New Approach: BroMAP  Branch-and-bound method with new subproblem- pruning method  Focus on dense networks where all residues interact with one another  Attack smaller sub-problems separately  Can utilize DEE during sub-problems

5 General Idea 1

6 1 32 1 highlow

7 General Idea 1 32 45 2 1 highlow

8 General Idea 1 32 45 76 2 1 3 highlow

9 General Idea 1 32 45 76 98 2 1 3 4 highlow

10 General Idea 1 32 45 76 98 2 1 3 4 5

11 1 32 4 1011 5 76 98 2 1 3 4 5 6 7 8 910 11

12 General Idea 1 32 4 1011 5 76 98 2 1 3 4 5 6 7 8 910 11 -160, -200 -140, -180 -140, -200-100 -170 -100, -200 -60, -162 -60, -200 50, -200 50, -152 -100, -162 -60, -130

13 Recursive Steps 1. Select subproblem from queue 2. If easily solved, solve the subproblem 1. Update the minimum energy (U) seen and return 3. Compute lower bound (LB) and upper bound (UB) on minimum energy for subproblem 1. If UB is less than U, set U to the UB 4. Prune subproblem if LB is greater than U 5. Exclude ineligible conformations from search (DEE) 6. Pick one residue, and split rotamers into two groups 7. Add child subproblems to the queue and return

14 2. Solving Subproblems  Use DEE/A* to solve subproblems directly  Goldstein singles  Singles using split flags  Logical singles-pairs elimination  Goldstein’s condition with one magic bullet  Logical singles-pairs elimination  Do unification if possible  (Small enough: <200,000 rotamers) N.A. Pierce, J.A. Spriet, J. Desmet, S.L. Mayo. JCC. 2000.

15 3. Bounding Subproblems  Tree-reweighted max-product algorithm (TRMP)  Relatively low computation cost  Can be used to compute lower-bounds for parts of the conformational space efficiently  (Discussed later)

16 subproblem rotamer 4/5. Prune Subproblem and Rotamers  Subproblem can be pruned if the current global upper bound (U) is lower than subproblem’s lower bound Energy Figures taken from: Hong, Lippow, Tidor, Lozano-Pérez. JCC. 2008 unless otherwise stated

17 Representing Problem as Graph 1 2 3 1 2 3

18

19 Subproblem Splitting  Split rotamers at a given position into two groups (high lower bounds and low lower bounds)  Splitting position is selected to so that maximum and minimum rotamer lower bounds is large

20 Subproblem Selection  Use depth first search to choose which subproblem to expand  This leads to quickly finding a good upper bound in order to allow additional pruning

21 Summary 1 32 4 1011 5 76 98 2 1 3 4 5 6 7 8 910 11 1. Direct solution by DEE 2. Lower/Upper bound subproblem 3. Problem-size reduction: 1. DEE 2. Elimination by TRMP bounds 4. Prune subproblem if possible 5. Split at one position

22 MAP Estimation  Maximum a-posteriori (MAP) estimation problem:  Find a MAP assignment x* such that  We can convert this to the GMEC problem if  Maximizing the probability => minimizing energy where Energy of conformation x number of residue positions

23 Max Marginals find MAP Estimation  The max marginal, µ i, is defined as the maximum of p(x) when one position x i is constrained to a given rotamer:  For any tree distribution p(x) can be factorized into:

24 Max Marginal Example  Consider 3 residue positions with 2 rotamers each 1 3 2 P({1,1,1}) = 1/50 P({1,1,0}) = 4/50 P({1,0,1}) = 16/50 P({0,1,1}) = 4/50 P({0,0,0}) = 1/50 P({1,0,0}) = 4/50 P({0,0,1}) = 4/50 P({0,1,0}) = 16/50

25 Max Marginal Example Cont. P({1,1,1}) = 1/50 P({1,1,0}) = 4/50 P({1,0,1}) = 16/50 P({0,1,1}) = 4/50 P({0,0,0}) = 1/50 P({1,0,0}) = 4/50 P({0,0,1}) = 4/50 P({0,1,0}) = 16/50 The same logic applies to the 2 nd and 3 rd residue positions

26 Max Marginal Example Cont. P({1,1,1}) = 1/50 P({1,1,0}) = 4/50 P({1,0,1}) = 16/50 P({0,1,1}) = 4/50 P({0,0,0}) = 1/50 P({1,0,0}) = 4/50 P({0,0,1}) = 4/50 P({0,1,0}) = 16/50

27 Using Max Marginal for MAP Assignment  “Maximum value of p(x) can be obtained simply by finding the maximum value of each µ i (x i ) and µ ij (x i, x i )”

28 Max-Product Algorithm Find max marginal µ s Wainwright, Jaakkola, Willsky. Statistics and Computing. 2004

29 Max-Product Algorithm Node t passes message to all of its neighbors S Wainwright, Jaakkola, Willsky. Statistics and Computing. 2004

30 Max-Product Doesn’t work for Cycles  The algorithm can produce the exact max-marginals for tree-distributions  Even with exact max-marginals this might not give a MAP solution  The protein design problem is dense, so there are going to be lots of cycles in the graph  For general cyclic distributions there is no known method that efficiently computes max-marginals  We will use pseudo-max-marginals instead

31 Pseudo-max-marginals  Break a cyclic distribution into a convex combination of distributions over a set of spanning trees  Then the pseudo-max-marginals ν = {v i, v ij } are defined by construction:  A given tree distribution is  So total probability is

32 Pseudo-Max Marginals 1) ρ-reparameterization 2) Tree consistency Maximal Stars

33 Pseudo-max marginal example 1 3 2

34 Tree-reweighted max-product algorithm (TRMP)  Edge-based reparameterization update algorithm to find pseudo max-marginals  Maintains the ρ -reparameterization criteria  Upon convergence, satisfies the “tree-consistency condition”, that the pseudo-max-marginals converge to the max-marginals of each tree distribution

35

36

37

38 Bounding GMEC with TRMP  First require that pseudo-max-marginals obey normal form (i.e. they are all ≤1)

39 Bounding conformation with given rotamer

40 Back to Pseudo-Max-Marginal Example  Bound Energy:  Bound Rotamer: P({1,1,1}) = 1/98 P({1,1,0}) = 16/98 P({1,0,1}) = 16/98 P({0,1,1}) = 16/98 P({0,0,0}) = 1/98 P({1,0,0}) = 16/98 P({0,0,1}) = 16/98 P({0,1,0}) = 16/98 1 3 2

41 Summary of Bounding Subproblems  Started with max-marginals, but they didn’t work for cycles so moved to pseudo-max-marginals  Break up full cycle graph into stars, and then use TRMP to find pseudo-max-marginals  Since pseudo-max-marginals are in normal form and tree consistent, we can use them to bound the actual

42 Results  Test cases:  FN3: 94-residue B-sheet  D44.1 and D1.3: Antibodies that bind hen egg-white lysozyme  EPO: Human erythropoeitin complexed with receptor  Ran DEE/A* and BroMAP (their algorithm) and allowed 7 days to finish  DEE/A* solved 51 cases, BroMAP solved 65 out of 68 total cases

43 No.BroDEET-BrF-BrSkewF-UbLeafRdctnRC%DE%A*%A*%TR 22.6 E 33.1 E 431250.900.4930.72.123642.80.356.3 32.4 E 32.3 E 431260.930.4927.72.553246.20.652.6 42.8 E 31.3 E 423 1033.73.01043.90.355.5 52.7 E 32.1 E 426 10.5527.43.12037.20.462.2 91.2 E 24.8 E 2331027.61.9308.974.117.0 104.6 E 21.3 E 313100.750.3726.91.02747.670.414.4 115.7 E 33.5 E 4109170.810.3626.20.856633.878.911.2 152.9 E 23.5 E 200NA0 094.60.44.7 231.5 E 22.6 E 200NA0 086.7012.6 243.2 E 23.1 E 2441025.34.33062.315.121.6 252.9 E 21.2 E 300NA0 089.6010.4 261.4 E 31.7 E 311 10.8929.21.65046.10.453.2 334.1 E 22.1 E 313 1027.92.43034.74.559.8 341.1 E 33.7 E 319 1030.02.32032.22.764.8 352.8 E 34.1 E 421 1028.73.03050.70.648.6 364.6 E 32.3 E 425 1027.93.39053.20.745.9 372.5 E 2 00NA0 076.02.421.2 442.2 E 23.8 E 1860.710.5428.21.87178.275.514.1 458.8 E 22.0 E 2881026.25.16848.623.825.4

44

45 INT CORE CORE++ Solved By Limited DEE BroMAP and DEE BroMAP only Nothing

46

47 Conclusions  Exact solution approach for large, dense protein design problems  Solved harder problems faster than DEE/A* and solved some that DEE/A* couldn’t  Performance advantage:  Smaller search trees  Can perform additional elimination and informed branching from inexpensive lower bounds

48 Comparison to DACS  Both partition on residue position in order to increase pruning and reduce A* search tree  BroMAP makes an informed partition  BroMAP uses intermediate information to prune partial conformations  BroMAP loses ability to enumerate “in order”


Download ppt "ROTAMER OPTIMIZATION FOR PROTEIN DESIGN THROUGH MAP ESTIMATION AND PROBLEM-SIZE REDUCTION Hong, Lippow, Tidor, Lozano-Perez. JCC. 2008. Presented by Kyle."

Similar presentations


Ads by Google