The Accelerated Weighted Ensemble


The Accelerated Weighted Ensemble
Greatly Improved Protein Folding Statistics Using WorkQueue and Condor
Jeff Kinnison & Dr. Jesus A. Izaguirre

Studying a New Protein
HP24stab
- Subdomain of the villin headpiece
- Two-helical supersecondary structure
- 24 amino acids (406 atoms)
- Discovered in 2015; little kinetic information available

Problems with Traditional MD
Computationally Expensive
- Molecular force fields perform expensive operations on all atoms
- Timescales of interest quickly become intractable as protein size grows
- GPU resources to increase efficiency are not always readily available
Events of Interest Are Rare
- Protein folding occurs on O(ns) to O(ms) timescales
- There is no guarantee that a folding event will occur in a given simulation
With these two issues, it is difficult to generate enough data to make statistically significant kinetic approximations.

Accelerated Weighted Ensemble (AWE)
1. Simulate a number of models for a short time
2. Resample to maintain the number of models in each state
3. Repeat until fluxes converge
Additionally, assign each state to a macrostate (folded, transition, unfolded) and track macrostate transitions to account for non-Markovian behavior.
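The simulate/resample loop above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the AWE code: the "simulation" is a one-dimensional random walk standing in for a short MD run, the states are equal-width bins on [0, 1], and the resampling step is a simple weight-proportional draw rather than whatever split/merge procedure AWE actually uses. The names `propagate`, `cell_of`, `resample`, and `awe_iteration` are hypothetical.

```python
import random
from collections import defaultdict

def propagate(x, rng, step=0.1):
    # Stand-in for a short MD run: one Gaussian random-walk step, clamped to [0, 1].
    return min(1.0, max(0.0, x + rng.gauss(0.0, step)))

def cell_of(x, n_cells):
    # Map a coordinate in [0, 1] to the index of its bin (state).
    return min(int(x * n_cells), n_cells - 1)

def resample(walkers, n_cells, target, rng):
    # Group weighted walkers (x, weight) by the cell they currently occupy.
    by_cell = defaultdict(list)
    for x, w in walkers:
        by_cell[cell_of(x, n_cells)].append((x, w))
    new = []
    for members in by_cell.values():
        total = sum(w for _, w in members)
        xs = [x for x, _ in members]
        ws = [w for _, w in members]
        # Draw `target` walkers in proportion to weight; each copy carries an
        # equal share, so the total weight in the cell is unchanged.
        for x in rng.choices(xs, weights=ws, k=target):
            new.append((x, total / target))
    return new

def awe_iteration(walkers, n_cells, target, rng):
    # Step 1: simulate every walker briefly. Step 2: resample per cell.
    moved = [(propagate(x, rng), w) for x, w in walkers]
    return resample(moved, n_cells, target, rng)

rng = random.Random(0)
walkers = [(0.05, 0.1) for _ in range(10)]  # all weight starts in the first cell
for _ in range(20):  # Step 3 (fixed count here instead of a convergence test)
    walkers = awe_iteration(walkers, n_cells=5, target=10, rng=rng)
total_weight = sum(w for _, w in walkers)
```

In this sketch, resampling keeps every occupied cell populated with the same number of walkers while conserving the total probability, which is how rarely visited states stay sampled without biasing the statistics.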

AWE Partition
[Figure panels: Free Energy Surface of HP24stab; Partition Following Transition Pathway]
The partition in AWE is based on existing kinetic data, approximating the correct weights.

Distributing Simulations with WorkQueue
- Each simulation is independent, so simulations can be run in parallel to increase efficiency
- WorkQueue allows scaling to the number of simulations in a particular AWE run
- AWE includes task cloning to overcome bottlenecks caused by slow workers
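The idea behind task cloning can be sketched with the Python standard library. This is not the WorkQueue API and not AWE's implementation: a thread pool stands in for remote workers, `simulate` is a hypothetical placeholder whose `delay` argument imitates a fast or slow worker, and `first_finished` is an illustrative helper name.

```python
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def simulate(walker_id, delay):
    # Placeholder for one independent short simulation; `delay` imitates
    # how long the worker it landed on takes to finish.
    time.sleep(delay)
    return walker_id

def first_finished(pool, walker_id, delays):
    # Submit one clone of the same task per entry in `delays`.
    clones = [pool.submit(simulate, walker_id, d) for d in delays]
    # Return as soon as any clone completes; the straggler is not waited on.
    done, pending = wait(clones, return_when=FIRST_COMPLETED)
    for clone in pending:
        clone.cancel()  # best effort: a clone that is already running keeps going
    return done.pop().result()

with ThreadPoolExecutor(max_workers=4) as pool:
    start = time.monotonic()
    result = first_finished(pool, walker_id=7, delays=[0.5, 0.01])
    elapsed = time.monotonic() - start
```

Because the clones are identical and independent, whichever finishes first supplies the result, so one slow worker no longer holds up the resampling step that needs every simulation in the iteration.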

Preliminary Trajectory Data
We created the AWE partition by collecting trajectory data using traditional MD on GPU. Each trajectory took 4 days to complete. Of the 36 trajectories collected, 19 were valid and only 9 contained folding events.
[Figure: Folding first passage times for the nine original trajectories that folded.]

AWE Setup
Two Systems
- 1000-cell partition
- 100-cell partition
- 10 models per state
MD Parameters
- T = 325 K
- Langevin dynamics with implicit solvent (λ = 0.91 ps^-1)
- Amber03 force field
- 250 ps simulation time
WorkQueue
- Maintained a factory requesting between 100 and 1000 workers
- All simulations run on 4-core workers
- Used Condor workers only to prevent AWE workers from taking over the cluster
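As a rough worked example of what these settings imply, the aggregate simulated time per AWE iteration can be estimated from the numbers above. The sketch assumes every cell is occupied by its full complement of 10 models in every iteration, which the slides do not state; `aggregate_us` is a hypothetical helper name.

```python
PS_PER_US = 1_000_000  # picoseconds per microsecond

def aggregate_us(cells, models_per_cell, ps_per_model):
    # Total simulated time in one iteration, in microseconds,
    # if every cell holds its full set of models.
    return cells * models_per_cell * ps_per_model / PS_PER_US

fine = aggregate_us(1000, 10, 250)    # 1000-cell system
coarse = aggregate_us(100, 10, 250)   # 100-cell system
print(f"1000-cell: {fine} us per iteration, {1000 * 10} simulations")
print(f"100-cell:  {coarse} us per iteration, {100 * 10} simulations")
```

Under that assumption, one iteration of the 1000-cell system consists of 10,000 independent 250 ps simulations (2.5 µs in aggregate), and one iteration of the 100-cell system consists of 1,000 (0.25 µs in aggregate).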

AWE Condor Usage

AWE Condor Usage
[Figure panels: 100-Cell Partition Simulations Per Day; 1000-Cell Partition Simulations Per Day]
By leveraging WorkQueue and Condor, we were able to run O(10k) simulations per day.

AWE Results
Started with 19 microseconds of traditional MD trajectory data containing nine folding events, computed over one month.

Conclusion
- Both the coarse and fine partitions converged in one-sixth the time needed to generate the original trajectories and generated several orders of magnitude more folding events.
- By leveraging WorkQueue and Condor, AWE is able to quickly generate reliable approximations of protein kinetic properties.

Acknowledgements
We would like to thank Dr. Douglas Thain and the Cooperative Computing Lab students for making WorkQueue available and helping to integrate it with AWE. All computations were run on compute nodes provided by the Notre Dame Center for Research Computing.

References
Hocking, H. G.; Häse, F.; Madl, T.; Zacharias, M.; Rief, M.; Žoldák, G. A Compact Native 24-Residue Supersecondary Structure Derived from the Villin Headpiece Subdomain. Biophys. J. 2015, 108, 678–686.
Huber, G. A.; Kim, S. Weighted-ensemble Brownian dynamics simulations for protein association reactions. Biophys. J. 1996, 70, 97.
Bhatt, D.; Zhang, B. W.; Zuckerman, D. M. Steady-state simulations using weighted ensemble path sampling. J. Chem. Phys. 2010, 133, 014110.
Abdul-Wahid, B.; Yu, L.; Rajan, D.; Feng, H.; Darve, E.; Thain, D.; Izaguirre, J. A. Folding Proteins at 500 ns/hour with Work Queue. E-Science (e-Science), 2012 IEEE 8th International Conference on. 2012; pp 1–8.