ROTAMER OPTIMIZATION FOR PROTEIN DESIGN THROUGH MAP ESTIMATION AND PROBLEM-SIZE REDUCTION Hong, Lippow, Tidor, Lozano-Perez. JCC. 2008. Presented by Kyle.


ROTAMER OPTIMIZATION FOR PROTEIN DESIGN THROUGH MAP ESTIMATION AND PROBLEM-SIZE REDUCTION Hong, Lippow, Tidor, Lozano-Perez. JCC Presented by Kyle Roberts

Problem Statement
- Protein structure prediction
- Homology modeling
- Side-chain placement
- Protein design problems
- Given a backbone and an energy function, find the minimum-energy side-chain conformation
- The Global Minimum Energy Conformation (GMEC) problem

Current Approaches
- Dead-End Elimination (DEE)
- Branch-and-bound method (Leach, Lemon. Proteins 1998)
- Linear Programming
- Dynamic Programming
- Approximate Methods (SCMF, MC, BP)

New Approach: BroMAP
- Branch-and-bound method with a new subproblem-pruning method
- Focus on dense networks where all residues interact with one another
- Attack smaller subproblems separately
- Can utilize DEE during subproblems

General Idea
[Figure sequence; recoverable labels: "high", "low"]

Recursive Steps
1. Select subproblem from queue
2. If easily solved, solve the subproblem
   a. Update the minimum energy (U) seen and return
3. Compute lower bound (LB) and upper bound (UB) on minimum energy for subproblem
   a. If UB is less than U, set U to the UB
4. Prune subproblem if LB is greater than U
5. Exclude ineligible conformations from search (DEE)
6. Pick one residue, and split rotamers into two groups
7. Add child subproblems to the queue and return
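The recursive steps can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the lower bound is a naive "minimize every term independently" bound standing in for TRMP, the upper bound is a greedy guess, step 5 (DEE) is omitted, and the names `energy`, `lower_bound`, `branch_and_bound` and the data layout (`single[i][r]`, `pair[(i, j)][r][s]`) are assumptions made for the example.

```python
import itertools

def energy(x, single, pair):
    """Total energy of conformation x: single terms plus pairwise terms."""
    e = sum(single[i][r] for i, r in enumerate(x))
    e += sum(tab[x[i]][x[j]] for (i, j), tab in pair.items())
    return e

def lower_bound(cands, single, pair):
    """Naive bound (NOT TRMP): minimize each energy term independently."""
    lb = sum(min(single[i][r] for r in c) for i, c in enumerate(cands))
    lb += sum(min(tab[r][s] for r in cands[i] for s in cands[j])
              for (i, j), tab in pair.items())
    return lb

def branch_and_bound(single, pair):
    n = len(single)
    best_e, best_x = float("inf"), None            # U and its conformation
    stack = [[list(range(len(single[i]))) for i in range(n)]]
    while stack:
        cands = stack.pop()                        # step 1 (depth-first order)
        # step 3a: a greedy conformation gives an upper bound for this subproblem
        guess = tuple(min(c, key=lambda r, i=i: single[i][r])
                      for i, c in enumerate(cands))
        e = energy(guess, single, pair)
        if e < best_e:
            best_e, best_x = e, guess
        if all(len(c) == 1 for c in cands):        # step 2: trivially solved
            continue
        if lower_bound(cands, single, pair) > best_e:   # steps 3-4: prune
            continue
        # step 6: pick one position and split its rotamers by lower bound
        i = max(range(n), key=lambda k: len(cands[k]))
        def rot_lb(r):
            return lower_bound(cands[:i] + [[r]] + cands[i + 1:], single, pair)
        ranked = sorted(cands[i], key=rot_lb)
        half = len(ranked) // 2
        low, high = ranked[:half], ranked[half:]
        # step 7: push "high" first so the "low" child is expanded first
        stack.append(cands[:i] + [high] + cands[i + 1:])
        stack.append(cands[:i] + [low] + cands[i + 1:])
    return best_e, best_x

# Toy data: 3 positions, 3 rotamers each (values are arbitrary).
single = [[0.0, 1.0, 2.0], [1.5, 0.2, 0.7], [0.3, 0.9, 0.1]]
pair = {(i, j): [[float((3 * r + 5 * s + i + 2 * j) % 7 - 3) for s in range(3)]
                 for r in range(3)]
        for (i, j) in [(0, 1), (0, 2), (1, 2)]}
gmec_energy, gmec = branch_and_bound(single, pair)
```

Because a subproblem is discarded only when its lower bound exceeds the best energy seen so far, the search stays exact; a tighter bound simply prunes more.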

2. Solving Subproblems
- Use DEE/A* to solve subproblems directly
- Goldstein singles
- Singles using split flags
- Logical singles-pairs elimination
- Goldstein’s condition with one magic bullet
- Logical singles-pairs elimination
- Do unification if possible
- (Small enough: <200,000 rotamers)
N.A. Pierce, J.A. Spriet, J. Desmet, S.L. Mayo. JCC

3. Bounding Subproblems
- Tree-reweighted max-product algorithm (TRMP)
- Relatively low computational cost
- Can be used to compute lower bounds for parts of the conformational space efficiently
- (Discussed later)

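4/5. Prune Subproblem and Rotamers
- A subproblem can be pruned if the current global upper bound (U) is lower than the subproblem’s lower bound
- The same test applied to the lower bound of a single rotamer eliminates that rotamer
[Figure: energy axis showing lower bounds for a subproblem and for a rotamer]
Figures taken from: Hong, Lippow, Tidor, Lozano-Pérez. JCC unless otherwise stated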

Representing Problem as Graph

Subproblem Splitting
- Split rotamers at a given position into two groups (high lower bounds and low lower bounds)
- The splitting position is selected so that the difference between the maximum and minimum rotamer lower bounds is large
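A small sketch of the splitting rule, assuming per-rotamer lower bounds are already available. The function name `choose_split`, the dictionary layout, and the choice to cut the ranked rotamers at the midpoint are assumptions for illustration; the slides only say that the groups are "high" and "low".

```python
def choose_split(rotamer_lb):
    """rotamer_lb maps position -> {rotamer: lower bound}.

    Returns (position, low_group, high_group), picking the position whose
    rotamer lower bounds are most spread out.
    """
    # Only positions with at least two rotamers can be split.
    splittable = [p for p, lbs in rotamer_lb.items() if len(lbs) > 1]

    def spread(pos):
        vals = rotamer_lb[pos].values()
        return max(vals) - min(vals)

    pos = max(splittable, key=spread)
    ranked = sorted(rotamer_lb[pos], key=rotamer_lb[pos].get)
    half = len(ranked) // 2          # midpoint cut: an assumption
    return pos, ranked[:half], ranked[half:]

# Hypothetical lower bounds for two positions.
bounds = {
    0: {"a": -10.0, "b": -9.5},
    1: {"a": -12.0, "b": -3.0, "c": -11.0, "d": 4.0},
}
pos, low, high = choose_split(bounds)
```

Here position 1 is chosen because its bounds span 16 energy units against 0.5 at position 0, so the "low" child is much more likely to contain the GMEC and the "high" child is more likely to be pruned.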

Subproblem Selection
- Use depth-first search to choose which subproblem to expand
- This quickly finds a good upper bound, which allows additional pruning

Summary
1. Direct solution by DEE
2. Lower/Upper bound subproblem
3. Problem-size reduction:
   a. DEE
   b. Elimination by TRMP bounds
4. Prune subproblem if possible
5. Split at one position

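MAP Estimation
- Maximum a posteriori (MAP) estimation problem:
- Find a MAP assignment $x^*$ such that $x^* \in \arg\max_x p(x)$
- We can convert this to the GMEC problem if $p(x) = \frac{1}{Z} \exp\{-e(x)\}$
- Maximizing the probability => minimizing energy
- where $e(x)$ is the energy of conformation $x = (x_1, \dots, x_n)$, $n$ is the number of residue positions, and $Z$ is the normalizing constant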
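The equivalence can be checked numerically on a tiny example. This sketch assumes the Boltzmann-style form $p(x) \propto \exp\{-e(x)\}$ with no temperature factor; the energy tables and the names `conf_energy`, `gmec_x`, `map_x` are made up for illustration.

```python
import itertools
import math

# Hypothetical energies: 2 positions, 2 rotamers each.
single_e = [[0.0, 1.0], [0.5, 0.0]]
pair_e = {(0, 1): [[2.0, 0.0], [0.0, 3.0]]}

def conf_energy(x):
    """Energy of conformation x as single terms plus pairwise terms."""
    e = sum(single_e[i][r] for i, r in enumerate(x))
    e += sum(tab[x[i]][x[j]] for (i, j), tab in pair_e.items())
    return e

confs = list(itertools.product(range(2), repeat=2))
Z = sum(math.exp(-conf_energy(x)) for x in confs)        # normalizing constant
prob = {x: math.exp(-conf_energy(x)) / Z for x in confs}

gmec_x = min(confs, key=conf_energy)   # minimum-energy conformation
map_x = max(confs, key=prob.get)       # maximum-probability assignment
```

Since $\exp\{-e\}$ is strictly decreasing in $e$, the two searches return the same conformation, here `(0, 1)` with energy 0.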

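Max-Marginals to Find the MAP Estimate
- The max-marginal $\mu_i$ is defined as the maximum of $p(x)$ when one position $x_i$ is constrained to a given rotamer: $\mu_i(x_i) = \kappa_i \max_{\{x' \mid x'_i = x_i\}} p(x')$, where $\kappa_i$ is a positive normalization constant; the pairwise max-marginal $\mu_{ij}(x_i, x_j)$ is defined in the same way with two positions constrained
- Any tree distribution $p(x)$ can be factorized into: $p(x) \propto \prod_{i \in V} \mu_i(x_i) \prod_{(i,j) \in E} \frac{\mu_{ij}(x_i, x_j)}{\mu_i(x_i)\,\mu_j(x_j)}$, where $V$ and $E$ are the nodes and edges of the tree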

Max Marginal Example
- Consider 3 residue positions with 2 rotamers each
P({1,1,1}) = 1/50   P({1,1,0}) = 4/50   P({1,0,1}) = 16/50   P({0,1,1}) = 4/50
P({0,0,0}) = 1/50   P({1,0,0}) = 4/50   P({0,0,1}) = 4/50   P({0,1,0}) = 16/50

Max Marginal Example Cont.
- The same logic applies to the 2nd and 3rd residue positions
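The max-marginals for this table can be computed by direct enumeration. The sketch below uses the probabilities from the slide, takes the scaling constant to be 1, and uses zero-based positions; the helper name `max_marginal` is an assumption.

```python
from fractions import Fraction

# Joint distribution from the slide, keyed by (x1, x2, x3).
P = {
    (1, 1, 1): Fraction(1, 50), (1, 1, 0): Fraction(4, 50),
    (1, 0, 1): Fraction(16, 50), (0, 1, 1): Fraction(4, 50),
    (0, 0, 0): Fraction(1, 50), (1, 0, 0): Fraction(4, 50),
    (0, 0, 1): Fraction(4, 50), (0, 1, 0): Fraction(16, 50),
}

def max_marginal(P, i, value):
    """Largest P(x) over all assignments x with x[i] fixed to value."""
    return max(p for x, p in P.items() if x[i] == value)

mu = {(i, v): max_marginal(P, i, v) for i in range(3) for v in (0, 1)}

# Assignments that attain the overall maximum probability.
p_max = max(P.values())
map_assignments = sorted(x for x, p in P.items() if p == p_max)
```

In this table every single-position max-marginal equals 16/50, because the two most probable assignments, {1,0,1} and {0,1,0}, together cover both rotamers at each position. Single-position max-marginals alone therefore cannot tell the two apart.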

Using Max Marginal for MAP Assignment
- “Maximum value of p(x) can be obtained simply by finding the maximum value of each µ_i(x_i) and µ_ij(x_i, x_j)”

Max-Product Algorithm
- Find the max-marginals µ_s
Wainwright, Jaakkola, Willsky. Statistics and Computing. 2004

Max-Product Algorithm
- Node t passes a message to each of its neighbors s
Wainwright, Jaakkola, Willsky. Statistics and Computing. 2004
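A minimal sketch of max-product message passing on a tree, written recursively with memoized messages. It omits the per-message normalization used in practice, so it returns unnormalized max-marginals; the function name `max_product_tree` and the potential layout (`node_pot[s][x_s]`, `edge_pot[(s, t)][x_s][x_t]`) are assumptions for the example.

```python
import math

def max_product_tree(n_states, edges, node_pot, edge_pot):
    """Unnormalized max-marginals of a tree-structured pairwise model."""
    nodes = range(len(n_states))
    nbrs = {s: [] for s in nodes}
    for s, t in edges:
        nbrs[s].append(t)
        nbrs[t].append(s)

    def psi(s, t, xs, xt):
        # Look up the edge table in whichever orientation it was stored.
        if (s, t) in edge_pot:
            return edge_pot[(s, t)][xs][xt]
        return edge_pot[(t, s)][xt][xs]

    memo = {}

    def message(t, s):
        """Message from node t to neighbor s, as a list indexed by x_s."""
        if (t, s) not in memo:
            incoming = [message(u, t) for u in nbrs[t] if u != s]
            memo[(t, s)] = [
                max(psi(s, t, xs, xt) * node_pot[t][xt]
                    * math.prod(m[xt] for m in incoming)
                    for xt in range(n_states[t]))
                for xs in range(n_states[s])
            ]
        return memo[(t, s)]

    return [[node_pot[s][xs] * math.prod(message(t, s)[xs] for t in nbrs[s])
             for xs in range(n_states[s])]
            for s in nodes]

# Chain 0 - 1 - 2 with weight 4 when neighbors differ and 1 when they agree.
differ = [[1.0, 4.0], [4.0, 1.0]]
chain_mu = max_product_tree(
    n_states=[2, 2, 2],
    edges=[(0, 1), (1, 2)],
    node_pot=[[1.0, 1.0]] * 3,
    edge_pot={(0, 1): differ, (1, 2): differ},
)
```

With these potentials the chain reproduces the 3-position example table up to its normalizing constant of 50: every entry of `chain_mu` is 16, matching the max-marginal 16/50 at every position.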

Max-Product Doesn’t Work for Cycles
- The algorithm can produce the exact max-marginals for tree distributions
- Even with exact max-marginals this might not give a MAP solution
- The protein design problem is dense, so there are going to be lots of cycles in the graph
- For general cyclic distributions there is no known method that efficiently computes max-marginals
- We will use pseudo-max-marginals instead

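Pseudo-max-marginals
- Break a cyclic distribution into a convex combination of distributions over a set of spanning trees
- Then the pseudo-max-marginals $\nu = \{\nu_i, \nu_{ij}\}$ are defined by construction:
- A given tree distribution is $p^T(x; \nu) = \prod_{i \in V} \nu_i(x_i) \prod_{(i,j) \in E(T)} \frac{\nu_{ij}(x_i, x_j)}{\nu_i(x_i)\,\nu_j(x_j)}$
- So the total probability is $p(x) \propto \prod_T \left[ p^T(x; \nu) \right]^{\rho(T)}$, where $\rho(T)$ is the weight of spanning tree $T$ in the combination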

Pseudo-Max Marginals
1) ρ-reparameterization
2) Tree consistency
[Figure label: Maximal Stars]

Pseudo-max marginal example
[Figure: graph on nodes 1, 2, 3]

Tree-reweighted max-product algorithm (TRMP)
- Edge-based reparameterization update algorithm to find pseudo-max-marginals
- Maintains the ρ-reparameterization criterion
- Upon convergence, satisfies the “tree-consistency condition”: the pseudo-max-marginals converge to the max-marginals of each tree distribution

Bounding GMEC with TRMP
- First require that the pseudo-max-marginals obey normal form (i.e. they are all ≤ 1)

Bounding conformation with given rotamer

Back to Pseudo-Max-Marginal Example
- Bound Energy:
- Bound Rotamer:
P({1,1,1}) = 1/98   P({1,1,0}) = 16/98   P({1,0,1}) = 16/98   P({0,1,1}) = 16/98
P({0,0,0}) = 1/98   P({1,0,0}) = 16/98   P({0,0,1}) = 16/98   P({0,1,0}) = 16/

Summary of Bounding Subproblems
- Started with max-marginals, but they didn’t work for cycles, so moved to pseudo-max-marginals
- Break up the full cyclic graph into stars, and then use TRMP to find pseudo-max-marginals
- Since pseudo-max-marginals are in normal form and tree consistent, we can use them to bound the actual

Results
- Test cases:
- FN3: 94-residue β-sheet
- D44.1 and D1.3: antibodies that bind hen egg-white lysozyme
- EPO: human erythropoietin complexed with receptor
- DEE/A* and BroMAP (the authors’ algorithm) were each run with a 7-day limit
- DEE/A* solved 51 cases; BroMAP solved 65 of the 68 total cases

[Results table; column headers: No., Bro, DEE, T-Br, F-Br, Skew, F-Ub, Leaf, Rdctn, RC, %DE, %A*, %A*, %TR. Cell values are not recoverable from the transcript.]

[Figure: test cases grouped as INT, CORE, CORE++; legend "Solved By": Limited DEE, BroMAP and DEE, BroMAP only, Nothing]

Conclusions
- Exact solution approach for large, dense protein design problems
- Solved harder problems faster than DEE/A* and solved some that DEE/A* couldn’t
- Performance advantage:
- Smaller search trees
- Can perform additional elimination and informed branching from inexpensive lower bounds

Comparison to DACS
- Both partition on residue position in order to increase pruning and reduce the A* search tree
- BroMAP makes an informed partition
- BroMAP uses intermediate information to prune partial conformations
- BroMAP loses the ability to enumerate “in order”