Flexibility and Interoperability in a Parallel MD code
Robert Brunner, Laxmikant Kale, Jim Phillips
University of Illinois at Urbana-Champaign

Contributors
Principal investigators
–Laxmikant Kale, Klaus Schulten, Robert Skeel
Development team
–Milind Bhandarkar, Robert Brunner, Attila Gursoy, Neal Krawetz, Ari Shinozaki, …

Middle layers
Applications
Parallel Machines
“Middle Layers”: Languages, Tools, Libraries

Molecular Dynamics
Collection of [charged] atoms, with bonds
Newtonian mechanics
At each time-step
–Calculate forces on each atom: bonded, and non-bonded (electrostatic and van der Waals)
–Calculate velocities and advance positions
1 femtosecond time-step, millions needed!
Thousands of atoms (1,000 - 100,000)
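
The per-step loop above can be sketched in a few lines. This is only an illustration, not NAMD's integrator: the slides do not name the integration scheme, so velocity Verlet is assumed here, and `force_fn`, `velocity_verlet_step`, and `steps_needed` are hypothetical names in arbitrary toy units.

```python
def velocity_verlet_step(pos, vel, mass, force_fn, dt):
    """Advance one time step; pos and vel are lists of (x, y, z) tuples."""
    forces = force_fn(pos)
    # Half-kick the velocities with the current forces.
    half_vel = [tuple(v[k] + 0.5 * dt * f[k] / m for k in range(3))
                for v, f, m in zip(vel, forces, mass)]
    # Advance positions using the half-step velocities.
    new_pos = [tuple(p[k] + dt * hv[k] for k in range(3))
               for p, hv in zip(pos, half_vel)]
    # Recompute forces at the new positions and finish the velocity update.
    # (A production code would cache these forces for the next step.)
    new_forces = force_fn(new_pos)
    new_vel = [tuple(hv[k] + 0.5 * dt * f[k] / m for k in range(3))
               for hv, f, m in zip(half_vel, new_forces, mass)]
    return new_pos, new_vel


def steps_needed(simulated_seconds, dt_seconds=1e-15):
    """Number of time steps needed to cover a simulated interval."""
    return round(simulated_seconds / dt_seconds)
```

With a 1 femtosecond step, `steps_needed(1e-9)` shows why "millions needed" applies: one simulated nanosecond already takes a million steps.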

Further MD
Use of cut-off radius to reduce work
–8 - 14 Å
–Faraway charges ignored!
80-95% of the work is non-bonded force computation
Some simulations need faraway contributions
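
The effect of a cut-off radius can be illustrated with a brute-force pair search. This is a sketch under stated assumptions: `pairs_within_cutoff` is a hypothetical helper, and the O(N²) double loop is for clarity only; it is not how a parallel MD code would enumerate neighbours.

```python
import math


def pairs_within_cutoff(positions, cutoff):
    """Return index pairs (i, j) with i < j whose separation is at most `cutoff`."""
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            # Pairs farther apart than the cut-off are simply skipped,
            # which is what "faraway charges ignored" means in practice.
            if math.dist(positions[i], positions[j]) <= cutoff:
                pairs.append((i, j))
    return pairs
```

For example, with atoms at x = 0, 10 and 30 Å and a 12 Å cut-off, only the first two atoms interact.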

NAMD Design Objectives
Performance
Scalability
–To small and large numbers of processors
–To small and large molecular systems
Modifiable and extensible design
–Ability to incorporate new algorithms
–Reusing new libraries without re-implementation
–Experimenting with alternate strategies

Force Decomposition
Distribute force matrix to processors
Matrix is sparse, non-uniform
Each processor has one block
Communication: N/sqrt(P)
Ratio: sqrt(P)
Better scalability (can use 100+ processors)
Hwang, Saltz, et al.: 6% on 32 PEs, 36% on 128 processors
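
One way to read the two figures on this slide is as a communication-to-computation ratio. The slide does not say what "Ratio" refers to, so this reading, and the assumption that per-processor computation scales as N/P (reasonable when a cut-off is used), are both labeled as assumptions here:

```latex
% N atoms, P processors; per-processor costs up to constant factors.
% T_comp \propto N/P is an assumption, not stated on the slide.
\[
  T_{\text{comm}} \propto \frac{N}{\sqrt{P}}, \qquad
  T_{\text{comp}} \propto \frac{N}{P}
  \quad\Longrightarrow\quad
  \frac{T_{\text{comm}}}{T_{\text{comp}}} \propto \frac{N/\sqrt{P}}{N/P} = \sqrt{P}.
\]
```

Under this reading the ratio grows with the processor count but is independent of N, so communication eventually dominates as P increases.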

Spatial Decomposition

Spatial decomposition modified

Implementation
Multiple objects per processor
–Different types: patches, pairwise forces, bonded forces
–Each may have its data ready at different times
–Need ability to map and remap them
–Need prioritized scheduling
Charm++ supports all of these

Charm++
Data driven objects
Object groups:
–Global object with a “representative” on each PE
Asynchronous method invocation
Prioritized scheduling
Mature, robust, portable
http://charm.cs.uiuc.edu

Data driven execution
[Figure: scheduler and message queue]
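
The scheduler-plus-message-queue idea can be sketched as a toy in Python. This is not the Charm++ API (Charm++ is a C++ system); `Scheduler`, `send`, and `run` are hypothetical names, and the convention that a smaller priority value runs first is an assumption made for the example.

```python
import heapq
import itertools


class Scheduler:
    """Toy message-driven scheduler with a prioritized message queue."""

    def __init__(self):
        self._queue = []
        self._counter = itertools.count()  # tie-breaker: FIFO within a priority

    def send(self, obj, method, *args, priority=0):
        """Asynchronous invocation: enqueue the message and return immediately."""
        heapq.heappush(self._queue,
                       (priority, next(self._counter), obj, method, args))

    def run(self):
        """Deliver queued messages, smallest priority value first."""
        while self._queue:
            _, _, obj, method, args = heapq.heappop(self._queue)
            getattr(obj, method)(*args)
```

Nothing executes at `send` time; an object's method runs only when the scheduler picks its message, which is what lets work proceed in whatever order data becomes available.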

Object oriented design
Two top level classes:
–Patches: cubes containing atoms
–Computes: force calculation
Home patches and proxy patches
–Home patch sends coordinates to proxies, and receives forces from them
–Each compute interacts with local patches only
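
The home/proxy relationship can be sketched as follows. The class and method names (`HomePatch`, `ProxyPatch`, `publish`, `return_forces`) are hypothetical and do not come from the NAMD source; coordinates and forces are reduced to one scalar per atom, and direct method calls stand in for messages between processors.

```python
class HomePatch:
    """Owns the atoms of one cube of space and their accumulated forces."""

    def __init__(self, coords):
        self.coords = list(coords)
        self.forces = [0.0] * len(self.coords)
        self.proxies = []

    def publish(self):
        """Start a step: clear forces and send coordinates to every proxy."""
        self.forces = [0.0] * len(self.coords)
        for proxy in self.proxies:
            proxy.coords = list(self.coords)

    def accumulate(self, partial):
        """Add partial forces returned by a proxy."""
        self.forces = [f + p for f, p in zip(self.forces, partial)]


class ProxyPatch:
    """Read-only copy of a home patch, local to the computes that need it."""

    def __init__(self, home):
        self.home = home
        self.coords = []
        home.proxies.append(self)

    def return_forces(self, partial):
        """Send forces computed against this copy back to the home patch."""
        self.home.accumulate(partial)
```

Because a compute only ever touches patches (home or proxy) on its own processor, the compute code itself contains no communication.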

Compute hierarchy
Many compute subclasses:
–Allow reuse of coordination code
–Reuse of bookkeeping tasks
–Easy to add new types of force objects
Example: steered molecular dynamics
–Implementor focuses on the new force functionality
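
A minimal sketch of how such a hierarchy lets a new force type reuse the shared code. All names here are hypothetical, patches are plain dictionaries, and the steering force is assumed to be a harmonic pull toward a target in one dimension; the slides do not give the actual steered-MD force.

```python
class Compute:
    """Base class: holds the coordination and bookkeeping common to all computes."""

    def __init__(self, patches):
        self.patches = patches

    def do_work(self):
        # Shared bookkeeping: visit each local patch and deposit forces.
        for patch in self.patches:
            partial = [self.force_on(x) for x in patch["coords"]]
            patch["forces"] = [f + p for f, p in zip(patch["forces"], partial)]

    def force_on(self, x):
        raise NotImplementedError("subclasses supply the force law")


class SteeringCompute(Compute):
    """Hypothetical steered-MD force: harmonic pull toward `target`."""

    def __init__(self, patches, target, k):
        super().__init__(patches)
        self.target = target
        self.k = k

    def force_on(self, x):
        return -self.k * (x - self.target)
```

The subclass only defines `force_on`; iteration over patches and force accumulation are inherited unchanged.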

Multi-paradigm programming
Long-range electrostatic interactions
–Some simulations require this feature
–Contributions of faraway atoms can be computed infrequently
–PVM based library, DPMTA, developed at Duke by John Board, et al.
Patch life cycle
–Better expressed as a thread
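
Why a patch's life cycle reads naturally as a thread can be shown with a Python generator standing in for a user-level thread. This is only an analogy: Converse threads are not Python generators, and `patch_life_cycle` and the logged event names are hypothetical.

```python
def patch_life_cycle(steps, log):
    """Sequential view of a patch: send coordinates, wait for forces, integrate."""
    for step in range(steps):
        log.append(("send_coords", step))
        forces = yield  # suspend here until the forces for this step arrive
        log.append(("integrate", step, forces))
```

A driver resumes the generator whenever forces are ready:

```python
log = []
patch = patch_life_cycle(2, log)
next(patch)          # runs up to the first wait
patch.send(1.5)      # forces for step 0 arrive; patch integrates and sends again
```

The loop body states the per-step order directly, whereas a purely message-driven version would split it across separate callbacks.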

Converse
Supports multi-paradigm programming
Provides portability
Makes it easy to implement RTS for new paradigms
Several languages/libraries:
–Charm++, threaded MPI, PVM, Java, md-perl, pc++, nexus, Path, Cid, CC++, ...

Namd2 with Converse

Separation of concerns
Different developers, with different interests and knowledge, can contribute effectively
–Separation of communication and parallel logic
–Threads to encapsulate “life-cycle” of patches
–Adding a new integrator, improving performance, and trying new MD ideas can be done modularly and independently

Load balancing
Collect timing data for several cycles
Run heuristic load balancer
–Several alternative ones
Re-map and migrate objects accordingly
–Registration mechanisms facilitate migration
Needs a separate talk!
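
The measure-then-remap cycle can be sketched with one simple heuristic. The slides mention several alternative balancers without describing them, so the greedy strategy below (heaviest object first, onto the currently least-loaded processor) is an assumption chosen for illustration, and `greedy_rebalance` is a hypothetical name.

```python
import heapq


def greedy_rebalance(object_loads, num_procs):
    """Map objects to processors from measured per-object times.

    object_loads: dict of object id -> measured time over recent cycles.
    Returns a dict of object id -> processor index.
    """
    heap = [(0.0, proc) for proc in range(num_procs)]  # (total load, processor)
    heapq.heapify(heap)
    mapping = {}
    # Place the heaviest objects first so small ones can even out the totals.
    for obj, load in sorted(object_loads.items(),
                            key=lambda item: item[1], reverse=True):
        total, proc = heapq.heappop(heap)
        mapping[obj] = proc
        heapq.heappush(heap, (total + load, proc))
    return mapping
```

Objects whose new processor differs from their current one would then be migrated; that step depends on the runtime and is not shown.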

Performance: size of system

Performance: various machines

Speedup

Conclusion
Multi-domain decomposition works well for dynamically evolving, or irregular apps
–When supported by data driven objects (Charm++), user level threads, callbacks
Multi-paradigm programming is effective!
Object oriented parallel programming:
–Promotes reuse
–Good performance
Measurement based load balancing
