Welcome and Introduction “State of Charm++” Laxmikant Kale 6th Annual Workshop on Charm++ and Applications.


Parallel Programming Laboratory
[Overview diagram of the lab]
– Grants: DOE/ORNL; NSF ITR Chemistry: Car-Parrinello MD, QM/MM; NASA: Computational Cosmology and Visualization; DOE HPC-Colony: Services and Interfaces for Large Computers; DOE CSAR: Rocket Simulation; NSF ITR CSE: ParFUM applications; NIH Biophysics: NAMD; NSF + Blue Waters: BigSim
– Enabling projects: Charm++ and Converse; AMPI (Adaptive MPI); fault tolerance (checkpointing, fault recovery, processor evacuation); load balancing (centralized, distributed, hybrid); Faucets (dynamic resource management for grids); ParFUM (supporting unstructured meshes, computational geometry); Projections (performance analysis); higher-level (deterministic) parallel languages; BigSim (simulating big machines and networks)

A Glance at History
– 1987: Chare Kernel arose from parallel Prolog work; dynamic load balancing for state-space search, Prolog, ...
– Charm
– Position paper: application-oriented yet CS-centered research
– NAMD: 1994, 1996
– Charm++ in almost its current form: chare arrays, measurement-based dynamic load balancing
– 1997: Rocket Center, a trigger for AMPI
– 2001: era of ITRs: Quantum Chemistry collaboration; Computational Astronomy collaboration: ChaNGa
– 2008: multicore meets PetaFLOPs

PPL Mission and Approach
To enhance performance and productivity in programming complex parallel applications:
– Performance: scalable to thousands of processors
– Productivity: of human programmers
– Complex: irregular structure, dynamic variations
Approach: application-oriented yet CS-centered research
– Develop enabling technology for a wide collection of apps
– Develop, use, and test it in the context of real applications
How?
– Develop novel parallel programming techniques
– Embody them in easy-to-use abstractions
– So application scientists can use advanced techniques with ease
– Enabling technology is reused across many apps

Migratable Objects (aka Processor Virtualization)
Programmer: [over]decomposition into virtual processors. Runtime: assigns VPs to processors. This enables adaptive runtime strategies. Implementations: Charm++, AMPI.
Benefits:
– Software engineering: the number of virtual processors can be controlled independently; separate VPs for different modules
– Message-driven execution: adaptive overlap of communication; predictability (automatic out-of-core); asynchronous reductions
– Dynamic mapping: heterogeneous clusters (vacate, adjust to speed, share); automatic checkpointing; change the set of processors used; automatic dynamic load balancing; communication optimization
[Diagram: user view and system view]

Adaptive Overlap and Modules
SPMD and message-driven modules. (From A. Gursoy, Simplified Expression of Message-Driven Programs and Quantification of Their Impact on Performance, Ph.D. thesis, April 1994; and Modularity, Reuse, and Efficiency with Message-Driven Libraries, Proc. of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, 1995.)

Realization: Charm++'s Object Arrays
A collection of data-driven objects:
– with a single global name for the collection
– each member addressed by an index ([sparse] 1D, 2D, 3D, tree, string, ...)
– mapping of element objects to processors handled by the system
[Diagram, user's view: elements A[0], A[1], A[2], A[3], A[..]]

Realization: Charm++'s Object Arrays (continued)
A collection of data-driven objects:
– with a single global name for the collection
– each member addressed by an index ([sparse] 1D, 2D, 3D, tree, string, ...)
– mapping of element objects to processors handled by the system
[Diagram: user's view of the array A[0], A[1], A[2], A[3], A[..]; system view with elements such as A[3] and A[0] placed on processors by the runtime]
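To make the abstraction concrete, here is a minimal sketch of a 1D chare array in Charm++ source form. The module name, file names, the Main/Block class names, and the 8x over-decomposition factor are illustrative, not from the talk; the *.decl.h/*.def.h headers are generated by compiling the interface (.ci) file with charmc.

    // arrayhello.ci -- Charm++ interface file (illustrative)
    mainmodule arrayhello {
      mainchare Main {
        entry Main(CkArgMsg*);
        entry [reductiontarget] void done();
      };
      array [1D] Block {
        entry Block();
        entry void doStep(const CkCallback &cb);
      };
    };

    // arrayhello.C
    #include "arrayhello.decl.h"

    class Main : public CBase_Main {
     public:
      Main(CkArgMsg* m) {
        delete m;
        // Over-decomposition: create more elements than processors; the runtime
        // maps them to PEs and may migrate them later.
        int nElems = 8 * CkNumPes();
        CProxy_Block blocks = CProxy_Block::ckNew(nElems);
        blocks.doStep(CkCallback(CkReductionTarget(Main, done), thisProxy));  // broadcast
      }
      void done() { CkPrintf("all elements reported in\n"); CkExit(); }
    };

    class Block : public CBase_Block {
     public:
      Block() {}
      Block(CkMigrateMessage*) {}     // constructor used when an element migrates in
      void doStep(const CkCallback &cb) {
        CkPrintf("element %d running on PE %d\n", thisIndex, CkMyPe());
        contribute(cb);               // empty reduction: signals completion to Main
      }
    };

    #include "arrayhello.def.h"

Roughly: charmc arrayhello.ci; charmc -o arrayhello arrayhello.C; ./charmrun +p4 ./arrayhello. The same binary can be launched on any number of PEs because the element count, not the PE count, defines the decomposition.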

AMPI: Adaptive MPI
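Since the slide itself carried only a figure, here is a hedged sketch of what using AMPI looks like: an unmodified MPI program is compiled with the AMPI wrappers and launched with more virtual ranks than physical processors. The file name and rank counts below are illustrative; wrapper names such as ampicxx and the +vp option follow the AMPI documentation, but exact spellings vary by Charm++ version.

    // hello_ampi.C -- an ordinary MPI program; under AMPI each "rank" is a
    // migratable user-level thread (a virtual processor), not an OS process.
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char** argv) {
      MPI_Init(&argc, &argv);
      int rank = 0, nranks = 0;
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &nranks);   // reports virtual ranks, not cores
      std::printf("virtual rank %d of %d\n", rank, nranks);
      MPI_Barrier(MPI_COMM_WORLD);
      MPI_Finalize();
      return 0;
    }

    // Build and run (illustrative):
    //   ampicxx -o hello_ampi hello_ampi.C
    //   ./charmrun +p8 ./hello_ampi +vp64     # 64 virtual ranks on 8 physical PEs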

So, What's New?

A large number of collaborations and grants got started.

New Projects
Those of you who attended last year may remember that a major concern was that many grants were ending. Good news: we have secured adequate funding.
New grants, projects, and collaborations:
– DOE/FAST-OS
– ORNL/DOE
– NSF: BigSim
– NASA: Cosmology
– IACAT
– UPCRC
– Blue Waters: NCSA/NSF

Blue Waters
NSF Track 1 system at NCSA/UIUC, headed by Thom Dunning and Rob Pennington. Over a year and a half of effort: pre-proposal, proposal, contracts and sub-contract.
PPL participation:
– NAMD scaling
– BigSim deployment: early application development, performance prediction

An IBM Power7-based machine. It is an exciting machine; let me tell you about some details.

IACAT
Institute for Advanced Computing Applications and Technologies, headed by Thom Dunning. Three themes were funded this year. PPL is participating in and co-leading a theme on petascale applications.
– UIUC/CS computer scientists: Vikram Adve, Ralph Johnson, Sanjay Kale, David Padua, ...
– Application scientists: Duane Johnson (co-lead), David Ceperley, Klaus Schulten

BigSim
Title: Performance Prediction for Petascale Machines and Applications. Funded by NSF.
Goal: develop a parallel simulator that can be used for petascale machines consisting of over a million processors.

Colony Project Overview
Title: Services and Interfaces to Support Systems with Very Large Numbers of Processors
Collaborators:
– Lawrence Livermore National Laboratory: Terry Jones
– University of Illinois at Urbana-Champaign: Laxmikant Kale, Celso Mendes, Sayantan Chakravorty
– IBM: Jose Moreira, Andrew Tauferner, Todd Inglett; Haifa Lab: Gregory Chockler, Eliezer Dekel, Roie Melamed
Topics:
– Parallel resource instrumentation framework
– Scalable load balancing
– OS mechanisms for migration
– Processor virtualization for fault tolerance
– Single system management space
– Parallel awareness and coordinated scheduling of services
– Linux OS for cellular architecture
– Overlay networks

Terascale Impostors
Support: NASA. Participants: Prof. Thomas Quinn (Dept. of Astronomy, Univ. of Washington), Prof. Orion Lawlor (Dept. of CS, Univ. of Alaska).
Goal: use innovations from parallel computing and from graphics computing to enable interactive rendering of very large cosmological datasets.
Example: screenshot of an 800-million-particle dataset. The client runs Salsa at Univ. of Washington, connected to a server running on 256 processors of Lemieux at PSC. The GUI allows moving interactively across the space and renders a frame every few seconds, but this is well below the "fusion frequency" required to get a 3D "feel" of the simulation. (Source: Tom Quinn)

QM/MM = NAMD + OpenAtom + ORNL LCF
Scalable Atomistic Modeling Tools with Chemical Reactivity for Life Sciences: M. Tuckerman (NYU), G. Martyna (IBM), L. Kale (UIUC), K. Schulten (UIUC), J. Dongarra (UTK). A two-year DOE grant.
Objectives:
– Tune ORNL
– Tune ORNL
– Combine OpenAtom with NAMD
– ATLAS for Cray Baker
– Integrate an optimized collective communication library with Charm++

UPCRC
An Intel- and Microsoft-funded center involving 20 faculty from CS/ECE at UIUC.
– Co-directors: Marc Snir, Wen-Mei Hwu
– Research director: Sarita Adve
PPL participation (being decided):
– Managed execution: adaptive runtime system
– Our past work is also relevant to deterministic languages to simplify programming, domain-specific frameworks (ParFUM), and applications!

Our Applications Achieved Unprecedented Speedups

Applications and Charm++
[Diagram: applications and Charm++ exchange issues and techniques & libraries, which in turn benefit other applications]
Synergy between computer science research and biophysics has been beneficial to both.

Charm++ and Applications
Synergy between computer science research and biophysics has been beneficial to both.

Parallel Objects, Adaptive Runtime System, Libraries and Tools
The enabling CS technology of parallel objects and intelligent runtime systems has led to several collaborative applications in CSE. Develop abstractions in the context of full-scale applications.
[Diagram of applications: crack propagation, space-time meshes, computational cosmology, rocket simulation, protein folding, dendritic growth, quantum chemistry (LeanCP), NAMD molecular dynamics (STMV virus simulation)]

NAMD on the XT4 at ORNL

Charm++'s "Projections" analysis tool
Time intervals on the x axis; activity summed across processors on the y axis. Apo-A1 on BlueGene/L, 1024 processors: 94% efficiency. Shallow valleys, high peaks, nicely overlapped PME.
Color key: green: communication; red: integration; blue/purple: electrostatics; turquoise: angle/dihedral; orange: PME.

ChaNGa

CSE: Wave Propagation and Dynamic Fracturing
Support: UIUC-CSE fellowship. Participants: Prof. Glaucio Paulino (Dept. of Civil and Environmental Engineering), Kyoungsoo Park (CEE grad student, CSE fellow).
Goal: integrate ParFUM and TopS and apply them to the study of graded materials.
[Figures: simulation snapshot (source: Kyoungsoo Park); scaling on NCSA's Abe]

Load Balancing on Very Large Machines
Existing load balancing strategies don't scale on extremely large machines. Consider an application with 1M objects on 64K processors.
Centralized:
– object load data are sent to processor 0
– integrated into a complete object graph
– migration decisions are broadcast from processor 0
– requires a global barrier
Distributed:
– load balancing among neighboring processors
– builds a partial object graph
– migration decisions are sent to the neighbors
– no global barrier
Topology-aware:
– on 3D torus/mesh topologies
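As a concrete illustration of how an application hooks into this machinery, below is a hedged sketch of AtSync-style measurement-based load balancing from inside a chare array element. The Block class, the iteration structure, and the every-100-steps policy are illustrative; usesAtSync, AtSync(), ResumeFromSync(), and pup() are the standard Charm++ hooks, and the balancer (centralized such as GreedyLB, or hierarchical such as HybridLB) is chosen at launch with +balancer.

    // Inside a Charm++ array element (assumes Block and its entry method
    // iterate() are declared in the .ci file; names are illustrative).
    class Block : public CBase_Block {
      int step;
     public:
      Block() : step(0) {
        usesAtSync = true;                    // opt in to AtSync load balancing
      }
      Block(CkMigrateMessage*) {}
      void pup(PUP::er &p) {                  // serialize state so the object can migrate
        CBase_Block::pup(p);
        p | step;
      }
      void iterate() {
        // ... one timestep of real work would go here ...
        if (++step % 100 == 0)
          AtSync();                           // let the runtime measure loads and migrate
        else
          thisProxy[thisIndex].iterate();     // continue with the next step
      }
      void ResumeFromSync() {                 // called after (possible) migration
        thisProxy[thisIndex].iterate();
      }
    };

    // Launch (illustrative): ./charmrun +p1024 ./app +balancer GreedyLB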

Fault Tolerance
Automatic checkpointing:
– migrate objects to disk
– in-memory checkpointing as an option
– automatic fault detection and restart
Proactive fault tolerance:
– responds to an "impending fault" signal
– migrates objects to other processors
– adjusts processor-level parallel data structures
Scalable fault tolerance:
– when one processor out of 100,000 fails, the other 99,999 shouldn't have to roll back to their checkpoints!
– sender-side message logging
– latency tolerance helps mitigate costs
– restart can be sped up by spreading the failed processor's objects across other processors
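A hedged sketch of how an application asks for these checkpoints: CkStartCheckpoint() writes object state to disk and CkStartMemCheckpoint() performs the in-memory (double checkpoint) variant, each resuming via a callback. The surrounding Main::pulse() method, the useInMemory flag, the mainProxy readonly, and the directory name are illustrative scaffolding, not from the talk.

    // Fragment from a main chare (illustrative scaffolding around the real calls).
    void Main::pulse() {
      // Continue at Main::resume() once the checkpoint completes.
      CkCallback cont(CkIndex_Main::resume(), mainProxy);
      if (useInMemory)
        CkStartMemCheckpoint(cont);            // double in-memory checkpoint
      else
        CkStartCheckpoint("ckpt_dir", cont);   // write object state under ckpt_dir/
    }
    // Restarting from a disk checkpoint (illustrative):
    //   ./charmrun +p64 ./app +restart ckpt_dir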

Fault Tolerance with Fast Recovery
Recovery time is the time for a crashed processor to regain its pre-crash state; in existing solutions it is roughly the time between the last checkpoint and the crash. Aim: lower recovery time.
Advantages of a lower recovery time:
– lower execution time
– tolerates a higher rate of faults
Reduce the work on the recovering processor by distributing it among others; the other processors cannot themselves be re-executing, so we should not roll back all processors. Object-based virtualization is used to divide the work.
Leverage message logging and object-based virtualization to obtain fast recovery.

Virtualization and Message Logging
Charm++ objects are the communicating entities. After a crash, objects must process messages in the same sequence as before. A sender-side pessimistic protocol is modified to work with objects. At restart, the objects of the recovering processor are distributed among other processors to parallelize the restart.
Message meta-data: SN (sequence number) and TN (ticket number) identify a message uniquely before and after a crash; the receiver processes messages in increasing, consecutive order of TN.
[Diagram: sender and receiver exchanging a message carrying SN, TN, and message data] (Sayantan Chakravorty)
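To see why the ticket numbers give deterministic replay, here is a toy, self-contained C++ sketch (not the Charm++ implementation) of the receiver-side rule: hold back any message that arrives out of order and process tickets in increasing, consecutive order. The Message and ReceiverObject types are invented for illustration.

    #include <cstdint>
    #include <functional>
    #include <map>
    #include <utility>

    struct Message {
      uint64_t sn = 0;   // sender-assigned sequence number (unique per sender)
      uint64_t tn = 0;   // receiver-assigned ticket number (defines processing order)
      // ... payload would go here ...
    };

    class ReceiverObject {
      uint64_t nextTn = 1;                 // next ticket we are allowed to process
      std::map<uint64_t, Message> held;    // early arrivals, keyed by ticket number
      std::function<void(const Message&)> process;
     public:
      explicit ReceiverObject(std::function<void(const Message&)> p)
          : process(std::move(p)) {}
      void deliver(const Message& m) {
        held[m.tn] = m;                    // hold back until its turn comes
        // Drain every message whose ticket is the next expected one, so the
        // processing order is identical before and after a crash.
        for (auto it = held.find(nextTn); it != held.end(); it = held.find(nextTn)) {
          process(it->second);
          held.erase(it);
          ++nextTn;
        }
      }
    };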

Load Balancing and Fault Tolerance
Load balance is important for scaling applications to large numbers of processors, so load balancing needs to work with message logging:
– effect of object migration on reliability
– crashes during the load balancing step
– load balancing and the fast restart protocol (fast restart creates a load imbalance of its own)

Performance: Progress
A 2D stencil in Charm++ with VPs on 512 processors; load balance/checkpoint every 200 timesteps.
Recovery with in-memory checkpointing takes 280 s and 204 s; with fast restart, 68.6 s and 65 s. (Sayantan Chakravorty)

BigSim
Simulating very large parallel machines using smaller parallel machines.
Reasons:
– predict performance on future machines
– predict performance obstacles for future machines
– do performance tuning on existing machines that are difficult to get allocations on
Idea: an emulation run using many virtual processors (AMPI) produces traces; a detailed machine simulation then uses those traces.

Objectives and Simulation Model
Objectives: develop techniques to facilitate the development of efficient petascale applications, based on performance prediction of applications on large simulated parallel machines.
Simulation-based performance prediction:
– focus on the Charm++ and AMPI programming models
– performance prediction based on PDES
– supports varying levels of fidelity: processor prediction, network prediction
– modes of execution: online and post-mortem

Big Network Simulation
Simulate network behavior: packetization, routing, contention, etc. Incorporated with post-mortem simulation; switches are connected in a torus network.
[Diagram: the BigSim emulator produces BG log files (tasks & dependencies); POSE timestamp correction yields timestamp-corrected tasks, which feed BigNetSim]

Projections: Performance Visualization

Architecture of BigNetSim

Performance Prediction (contd.)
Predicting the time of sequential code:
– user-supplied time for every code block
– wall-clock measurements on the simulating machine, scaled by a suitable multiplier
– hardware performance counters for floating-point, integer, branch instructions, etc.; cache performance and memory footprint are approximated by the percentage of memory accesses and the cache hit/miss ratio
– instruction-level simulation (not implemented)
Predicting network performance:
– no contention: time based on topology and other network parameters
– back-patching: modifies communication time using the amount of communication activity
– network simulation: models the network entirely
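A toy, self-contained sketch of the simplest of these modes: time a sequential block on the simulating machine and scale it by a user-chosen multiplier. The function name and the 0.5 ratio are invented for illustration; BigSim's actual interfaces are not shown here.

    #include <chrono>
    #include <cstdio>

    // User-chosen ratio of target-core time to host-core time (an assumption,
    // not something measured by this snippet).
    static const double kTargetOverHost = 0.5;   // pretend the target is 2x faster

    template <typename Fn>
    double predictedSeconds(Fn&& block) {
      auto t0 = std::chrono::steady_clock::now();
      block();                                   // run the real sequential work once
      auto t1 = std::chrono::steady_clock::now();
      double host = std::chrono::duration<double>(t1 - t0).count();
      return host * kTargetOverHost;             // time charged to the simulated target
    }

    int main() {
      double t = predictedSeconds([] {
        volatile double x = 0;
        for (int i = 0; i < 10000000; ++i) x += i * 1e-9;
      });
      std::printf("predicted target time: %f s\n", t);
      return 0;
    }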

Multicore/SMP Issues
Charm++ has supported multicore/SMP builds for a long time, but practically no one was using them: it was always faster to use a separate process for each core. That is not sustainable as the number of cores increases; for example, ChaNGa needs a shared software cache, and unnecessary communication (e.g., in multicasts) must be avoided. During the last year we paid attention to the anatomy of these performance problems.

Multicore/SMP Issues (contd.)
Charm++ advantage: the user program involves no locking, and the Charm model minimizes false sharing.
Issues identified in the RTS:
– locks: strength reduction
– locks vs. fences
– false sharing
– performance of k-way sends and replies
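Since false sharing is easier to see in code than in prose, here is a small generic C++ demonstration (not Charm++ RTS code): two counters that share a cache line slow each other down, while alignas(64) padding gives each its own line. The iteration count and the 64-byte line size are assumptions.

    #include <atomic>
    #include <chrono>
    #include <cstdio>
    #include <thread>

    struct Unpadded { std::atomic<long> a{0}, b{0}; };   // likely share one cache line
    struct Padded {                                      // force each onto its own line
      alignas(64) std::atomic<long> a{0};
      alignas(64) std::atomic<long> b{0};
    };

    template <typename C>
    double bumpMs(C& c) {
      auto t0 = std::chrono::steady_clock::now();
      std::thread t1([&] { for (int i = 0; i < 20000000; ++i) c.a.fetch_add(1, std::memory_order_relaxed); });
      std::thread t2([&] { for (int i = 0; i < 20000000; ++i) c.b.fetch_add(1, std::memory_order_relaxed); });
      t1.join(); t2.join();
      return std::chrono::duration<double, std::milli>(std::chrono::steady_clock::now() - t0).count();
    }

    int main() {
      Unpadded u; Padded p;
      std::printf("unpadded: %.1f ms  padded: %.1f ms\n", bumpMs(u), bumpMs(p));
      return 0;
    }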

Domain-Specific Frameworks
Motivation: reduce the tedium of parallel programming for commonly used paradigms and parallel data structures; encapsulate parallel data structures and algorithms; provide an easy-to-use interface; use them to build concurrently composable parallel modules.
Frameworks:
– Unstructured meshes: ParFUM: generalized ghost regions; used in Rocfrac and Rocflu at the rocket center and outside CSAR; fast collision detection
– Multiblock framework: structured meshes; automates communication
– AMR: common to both of the above
– Particles: multiphase flows; MD; tree codes

Other Ongoing Projects
– Parallel debugger
– Scalable performance analysis
– Automatic out-of-core execution
– Challenges of exploiting multicores
– New collaborations being explored

Summary and Messages
We at PPL have advanced migratable-objects technology. We are committed to supporting applications, and we grow our base of reusable techniques via such collaborations.
Try using our technology: AMPI, Charm++, Faucets, ParFUM, ...; available on the web at charm.cs.uiuc.edu.

Workshop Overview
System progress talks: Adaptive MPI; BigSim: performance prediction; scalable performance analysis; fault tolerance; Cell processor; Grid; multi-cluster applications.
Applications: molecular dynamics, quantum chemistry, computational cosmology, structural mechanics.
Tutorials: Charm++, AMPI, ParFUM, BigSim.
Keynote: Rick Stevens, "Exascale Computational Science". Invited talk: Bill Gropp, "The Evolution of MPI".