Data-Driven Time-Parallelization in the AFM Simulation of Proteins
L. Ji, H. Nymeyer, A. Srinivasan, and Y. Yu, Florida State University


Data-Driven Time-Parallelization in the AFM Simulation of Proteins
L. Ji, H. Nymeyer, A. Srinivasan, and Y. Yu, Florida State University
Aim: Simulate for long time spans
Solution features: Use data from prior simulations to parallelize the time domain
Acknowledgments: NSF, ORNL, NERSC, NCSA

Outline
Background
– Limitations of Conventional Parallelization
Time Parallelization
– Other Time Parallelization Approaches
– Data-Driven Time-Parallelization
– Nano-Mechanics Application
Time Parallelization of AFM Simulation of Proteins
– Prediction
– Experimental Results: scaled to an order of magnitude more processors when combined with conventional parallelization
Conclusions and Future Work

Background
Molecular dynamics
– In each time step, the forces of atoms on each other are modeled using some potential
– After the forces are computed, update the positions
– Repeat for the desired number of time steps (a sketch of this loop follows below)
Time step size is ~10⁻¹⁵ seconds, due to physical and numerical considerations
– The desired time range is much larger
– A million time steps are required to reach 10⁻⁹ s: ~500 hours of computing for ~40K atoms using GROMACS
MD uses an unrealistically large pulling speed
– 1 to 10 m/s, instead of the experimentally realistic ~10⁻⁵ m/s or slower
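The per-step structure described above can be made concrete with a minimal velocity-Verlet sketch in Python. This is an illustration only, not GROMACS internals; force(x) is a hypothetical user-supplied routine for the chosen potential.

```python
# Minimal velocity-Verlet MD loop (sketch, not production code).
# force(x) is a hypothetical routine returning forces for the chosen
# potential; x and v are (N, 3) float arrays, mass is scalar or (N, 1).
import numpy as np

def md_run(x, v, force, mass, dt=1e-15, n_steps=1_000_000):
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # first half-step velocity update
        x += dt * v                # position update
        f = force(x)               # force evaluation: the dominant cost
        v += 0.5 * dt * f / mass   # second half-step velocity update
    return x, v
```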

Limitations of Conventional Parallelization
Results on IBM Blue Gene
– Scaling becomes inefficient once the time per iteration drops to ~10 ms
If we want to simulate to a ms
– Time step of 1 fs ⇒ 10¹² iterations ⇒ ~10¹⁰ s ≈ 300 years at 10 ms/iteration
If we scaled to 10 μs per iteration
– ~4 months of computing time (a quick arithmetic check follows below)
[Results shown: NAMD, 327K-atom ATPase with PME (IPDPS 2006); NAMD, 92K-atom ApoA1 with PME (IPDPS 2006); IBM Blue Matter, 43K-atom Rhodopsin (Tech Report 2005); Desmond, 92K-atom ApoA1 (SC 2006)]
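The two wall-clock estimates above follow from simple arithmetic; a quick check:

```python
# Back-of-the-envelope check of the estimates above.
dt = 1e-15                          # 1 fs time step
iters = 1e-3 / dt                   # iterations to reach 1 ms: 1e12
print(iters * 10e-3 / 3.156e7)      # at 10 ms/iteration: ~317 years
print(iters * 10e-6 / 86400 / 30)   # at 10 us/iteration: ~3.9 months
```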

Time Parallelization
Other Time Parallelization Approaches
– Dynamic Iterations / Waveform Relaxation: slow convergence
– Parareal Method: related to shooting methods; not shown to be effective in realistic settings
Data-Driven Time-Parallelization
– Nano-Mechanics Application: tensile test on a carbon nanotube
– Achieved a granularity of 13.6 μs/iteration in one simulation

Other Time Parallelization Approaches
Special case: Picard iterations
– Ex: dy/dt = y, y(0) = 1 becomes dy_{n+1}/dt = y_n(t), y_0(t) = 1 (a numerical illustration follows below)
In general
– dy/dt = f(y, t), y(0) = y_0 becomes dy_{n+1}/dt = g(y_n, y_{n+1}, t), y_0(t) = y_0, where g(u, u, t) = f(u, t)
– g(y_n, y_{n+1}, t) = f(y_n, t): Picard
– g(y_n, y_{n+1}, t) = f(y_{n+1}, t): converges in one iteration
– Jacobi, Gauss-Seidel, and SOR versions of g can be defined
Many improvements
– Ex: DIRM combines the above with reduced-order modeling
[Figure: waveform relaxation variants: exact solution vs. iterates N = 1, 2, 3, 4]
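A small numerical illustration of the Picard special case above, for dy/dt = y, y(0) = 1 on [0, 4]. The grid and iteration count are illustrative; note the slow convergence the slide mentions.

```python
# Picard (waveform relaxation) iterations for dy/dt = y, y(0) = 1:
# y_{n+1}(t) = 1 + integral_0^t y_n(s) ds, evaluated by the trapezoid rule.
import numpy as np

t = np.linspace(0.0, 4.0, 401)
y = np.ones_like(t)                       # initial guess y_0(t) = 1
for n in range(4):                        # iterates N = 1..4
    integrand = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    y = 1.0 + np.concatenate(([0.0], np.cumsum(integrand)))
# Error vs. the exact solution e^t is still large after 4 iterations,
# illustrating the slow convergence of plain Picard/waveform relaxation.
print(np.max(np.abs(y - np.exp(t))))
```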

Data-Driven Time Parallelization
– Each processor simulates a different time interval
– The initial state is obtained by prediction using prior data (except for processor 0)
– Verify that the predicted end state is close to the one computed by MD
– Prediction is based on dynamically determining a relationship between the current simulation and those in a database of prior results
– If the time interval is sufficiently large, the communication overhead is small (a schematic sketch follows below)
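A schematic of this predict-and-verify structure. This is a sketch of the idea, not the authors' code: predict_state, md_simulate, is_equivalent, database, initial_state, and interval are hypothetical stand-ins for the database lookup, the MD run over one interval, and the equivalence test defined later.

```python
# Schematic only: each rank simulates one time interval from a predicted
# start state, then checks the prediction against the verified end state
# of the previous interval. All helpers are hypothetical stand-ins.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

start = initial_state() if rank == 0 else predict_state(database, rank)
end = md_simulate(start, interval)          # one MD interval per rank
if rank > 0:
    verified = comm.recv(source=rank - 1)   # true end of prior interval
    if not is_equivalent(verified, start):  # prediction failed:
        end = md_simulate(verified, interval)  # redo from verified state
if rank < comm.Get_size() - 1:
    comm.send(end, dest=rank + 1)           # pass end state onward
```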

Nano-Mechanics Application
Carbon nanotube tensile test
– Pull the CNT
– Determine the stress-strain response and the yield strain (when the CNT starts breaking) using MD
– Use dimensionality reduction for prediction (a PCA sketch follows below)
[Figures: dominant modes u_1 (blue) and u_2 (red) for z, while u_1 (green) for x is not "significant"; speedup plot with red line for ideal speedup, blue for v = 0.1 m/s, green for v = 1 m/s predicted using v = 10 m/s; stress-strain at 450 K with blue exact and red on 200 processors]
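One common way to realize the dimensionality reduction mentioned above is principal component analysis over prior trajectories; the dominant modes play the role of u_1 and u_2 in the figures. A sketch follows; the file name and the choice of two modes are illustrative assumptions.

```python
# Sketch: extract dominant modes from prior trajectories via SVD/PCA and
# use their amplitudes as the low-dimensional representation for
# prediction. "prior_runs.npy" and the mode count are illustrative.
import numpy as np

prior = np.load("prior_runs.npy")       # (n_snapshots, n_coords) coords
mean = prior.mean(axis=0)
_, _, Vt = np.linalg.svd(prior - mean, full_matrices=False)
modes = Vt[:2]                          # dominant modes, e.g. u_1 and u_2
amps = (prior - mean) @ modes.T         # amplitude time series per mode

def reconstruct(a):
    """Map low-dimensional amplitudes a back to full coordinates."""
    return mean + a @ modes
```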

Problems with Multiple Time Scales
– Fine-scale computations (such as MD) are more accurate, but more time-consuming
– Much of the detail at the finer scale is unimportant, but some of it matters
[Figure: a simple schematic of multiple time scales]

Time-Parallelization of AFM Simulation of Proteins
Example system: the muscle protein titin
– Around 40K atoms, mostly water
– Na⁺ and Cl⁻ added for charge neutrality
– NVT conditions, Langevin thermostat, 400 K
– Force constant on the springs: 400 kJ/(mol·nm²)
– GROMACS used for the MD simulations

Verification of Prediction
Definition of the equivalence of two states
– Atoms vibrate around their mean positions
– Consider two states equivalent if their differences are within the normal range of fluctuations (a sketch of such a test follows below)
[Figures: mean position and displacement from the mean; differences between trajectories that differ only in the random number sequence]
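A minimal sketch of such a fluctuation-based equivalence test. The comparison quantity and the tolerance factor k are illustrative assumptions, not the paper's exact criterion.

```python
# Sketch: two states are "equivalent" if each atom's displacement from
# its mean position differs by less than k times the normal thermal
# fluctuation observed in conventional runs. k is illustrative.
import numpy as np

def is_equivalent(x_pred, x_md, mean_pos, fluct_std, k=3.0):
    """Compare displacements from the mean, atom by atom; all (N, 3)."""
    d_pred = np.linalg.norm(x_pred - mean_pos, axis=1)
    d_md = np.linalg.norm(x_md - mean_pos, axis=1)
    return np.all(np.abs(d_pred - d_md) <= k * fluct_std)
```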

Prediction
Use prior results with a higher pulling velocity
– Trajectories with different random number sequences
– Predict based on the prior result closest to the current state, using either only the last verified state or several recent verified states
Fit the fluctuation parameters to the log-Weibull distribution, with density (1/b) e^{(a-x)/b - e^{(a-x)/b}}, location a, and scale b (a fitting sketch follows below)
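The log-Weibull density above is the Gumbel (maximum) distribution, which scipy exposes as gumbel_r with loc = a and scale = b, so the fit can be sketched as follows. The data here is synthetic, purely for illustration.

```python
# Fitting the log-Weibull (Gumbel) density
# (1/b) exp((a-x)/b - exp((a-x)/b)) via scipy's gumbel_r, whose pdf has
# exactly this form with loc = a and scale = b. Synthetic sample data.
import numpy as np
from scipy.stats import gumbel_r

samples = gumbel_r.rvs(loc=0.5, scale=0.1, size=1000, random_state=0)
a, b = gumbel_r.fit(samples)        # maximum-likelihood location, scale
print(f"a = {a:.3f}, b = {b:.3f}")
```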

Speedup
[Figures: speedup on the Xeon/Myrinet cluster at NCSA; speedup with combined spatial (8-way) and time parallelization]
– One time interval is 10K time steps, ~5 hours of sequential computing time
– The parallel overheads, excluding prediction errors, are relatively insignificant
– The above results use the last verified state to choose the prior run; using several verified states parallelized almost perfectly on 32 processors

Validation
[Figure: comparison of spatially parallel runs, time-parallel runs, the mean (spatial) against time-parallel runs, and experimental data]

Typical Differences
RMSD (computed as sketched below)
– Solid: between an exact run and a time-parallel run
– Dashed: between conventional runs using different random number sequences
Force
– Dashed: time-parallel runs
– Solid: conventional runs
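For reference, the RMSD used in such trajectory comparisons can be computed as below. This is a bare sketch; no structural alignment or fitting step is shown.

```python
# Root-mean-square deviation between two conformations (sketch; assumes
# the structures are already aligned in the same frame).
import numpy as np

def rmsd(xa, xb):
    """RMSD between conformations xa and xb, each an (N, 3) array."""
    return np.sqrt(np.mean(np.sum((xa - xb) ** 2, axis=1)))
```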

Conclusions and Future Work
Conclusions
– Data-driven time parallelization promises an order-of-magnitude improvement in speed when combined with conventional parallelization
Future Work
– Better prediction
– Satisfying detailed balance