Parallel Objects: Virtualization & In-Process Components


1 Parallel Objects: Virtualization & In-Process Components
Orion Sky Lawlor, Univ. of Illinois at Urbana-Champaign, POHLL-2002

2 Introduction
Parallel programming is hard:
- Communication takes time: message startup cost, bandwidth & contention
- Synchronization, race conditions
- Parallelism breaks abstractions: data structures get flattened, control is handed off between modules
- Harder than serial programming

3 Motivation
Parallel applications are either:
Embarrassingly Parallel:
- Trivial, 1 RA-week effort
- E.g. Monte Carlo, parameter sweep
- Communication totally irrelevant to performance

4 Motivation
Parallel applications are either:
Embarrassingly Parallel, or
Excruciatingly Parallel:
- Massive, 1+ RA-year effort
- E.g. “pure” MPI codes, ≥10k lines
- Communication and synchronization totally determine performance

5 Motivation
Parallel applications are either:
Embarrassingly Parallel,
Excruciatingly Parallel, or
“We’ll be done in 6 months…”:
- Several parallel libraries, codes, and groups; dynamic & adaptive
- E.g. multiphysics simulation

6 Serial Solution: Abstract!
Build layers of software:
- High-level: libc, C++ STL, …
- Mid-level: OS kernel
  - Silently schedules processes
  - Keeps the CPU busy even when some processes block
  - Allows a process to ignore other processes
- Low-level: assembler

7 Parallel Solution: Abstract!
The middle layers are missing:
- High-level: ScaLAPACK, POOMA, …
- Mid-level: ? A parallel “kernel” that would
  - Silently schedule components
  - Keep the CPU busy even when some components block
  - Allow a component to ignore other components
- Low-level: MPI

8 The missing middle layer:
- Provides dynamic computation/communication overlap, even across separate modules
- Handles inter-module handoff
- Pipelines communication
- Improves cache utilization (smaller components)
- Provides a nice layer for advanced features, like process migration

9 Examples: Multiprogramming

10 Examples: Pipelining

11 Middle Layer: Implementation
Real OS processes/threads:
- Robust, reliable, implemented
- High performance penalty
- No parallel features (migration!)
Converse/Charm++:
- In-process components: efficient
- Piles of advanced features
- AMPI, an MPI interface to Charm++
- Application frameworks

12 Charm++
Parallel library for object-oriented C++ applications:
- Messaging via method calls on communication “proxy” objects (see the sketch below)
- Methods called by the scheduler: the system determines who runs next
- Multiple objects per processor
- Object migration fully supported, even with broadcasts and reductions
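
A minimal sketch of what this looks like in Charm++ source, assuming the standard Charm++ toolchain; the module, chare, and method names (hello, Hello, greet) are illustrative, not taken from the talk. Entry methods are declared in a .ci interface file; calling one on a proxy object sends a message, and the scheduler invokes the method on whichever processor currently holds the target object.

    // hello.ci -- interface file; charmc generates hello.decl.h / hello.def.h from it
    mainmodule hello {
      mainchare Main   { entry Main(CkArgMsg *m); };
      array [1D] Hello { entry Hello(); entry void greet(int from); };
    };

    // hello.C -- implementation
    #include "hello.decl.h"

    class Main : public CBase_Main {
    public:
      Main(CkArgMsg *m) {
        CProxy_Hello arr = CProxy_Hello::ckNew(16);  // 16 objects, mapped by the runtime
        arr[0].greet(-1);   // asynchronous method call through the proxy
      }
    };

    class Hello : public CBase_Hello {
    public:
      Hello() {}
      Hello(CkMigrateMessage *m) {}   // constructor used when the object migrates
      void greet(int from) {
        CkPrintf("element %d greeted by %d\n", thisIndex, from);
        if (thisIndex + 1 < 16) thisProxy[thisIndex + 1].greet(thisIndex);
        else CkExit();
      }
    };

    #include "hello.def.h"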

13 Mapping Work to Processors
[Figure: the user view of many parallel objects vs. the system implementation, which maps those objects onto physical processors]

14 AMPI
MPI interface, implemented on Charm++:
- Multiple “virtual processors” per physical processor
- Implemented as user-level threads: very fast context switching
- MPI_Recv blocks only the virtual processor, not the physical one
- All the benefits of Charm++
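
A sketch of what this means for ordinary MPI code: the ring exchange below is plain MPI and needs no source changes to run under AMPI, but each rank becomes a user-level thread, so the blocking MPI_Recv suspends only that virtual processor while the scheduler keeps other virtual processors on the same physical CPU busy. The launch line in the comment (charmrun with a +vp count larger than the processor count) is illustrative of typical AMPI usage, not a command from the talk.

    // ring.C -- plain MPI; under AMPI each rank is a user-level thread.
    // Illustrative launch: ./charmrun +p4 ./ring +vp64  (64 virtual processors on 4 CPUs)
    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
      int rank, size, token = 0;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      if (size > 1) {
        if (rank == 0) {
          MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
          // Blocks only this virtual processor; others on the same CPU keep running.
          MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          std::printf("token=%d after one trip around %d ranks\n", token, size);
        } else {
          MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          token++;
          MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
        }
      }
      MPI_Finalize();
      return 0;
    }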

15 Application Frameworks
- Domain-specific interfaces: unstructured grids, structured grids, particle-in-cell
- Provide a natural interface for application scientists (Fortran!)
- “Encapsulate” communication (see the sketch below)
- Built on Charm++
- The most popular interfaces to Charm++
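
A purely illustrative C++ sketch of what “encapsulating” communication means here; none of these names (Chunk, framework_exchange_ghosts, user_compute_timestep) come from the actual frameworks, which expose Fortran-callable interfaces. The point is the division of labor: the framework owns partitioning and the ghost exchange between chunks, while the application scientist writes only serial, per-partition physics.

    #include <vector>
    #include <cstdio>

    // One mesh partition; the runtime could migrate whole chunks between processors.
    struct Chunk {
      std::vector<double> values;   // local values plus ghost copies of neighbors
    };

    // Hypothetical framework call: fills ghost entries by messaging neighbor chunks.
    // Stubbed out here; in a real framework all communication lives behind this call.
    void framework_exchange_ghosts(Chunk &c) { /* messaging hidden from the user */ }

    // Application scientist's code: pure serial arithmetic on one chunk.
    void user_compute_timestep(Chunk &c) {
      for (std::size_t i = 1; i + 1 < c.values.size(); ++i)
        c.values[i] = 0.5 * (c.values[i - 1] + c.values[i + 1]);
    }

    int main() {
      Chunk c;
      c.values.assign(16, 1.0);
      for (int step = 0; step < 10; ++step) {
        framework_exchange_ghosts(c);   // communication, encapsulated by the framework
        user_compute_timestep(c);       // physics, written by the application scientist
      }
      std::printf("done: %zu values\n", c.values.size());
      return 0;
    }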

16 Charm++ Features: Migration
Automatic load balancing:
- Balance load by migrating objects
- Application-independent: built-in data collection (CPU, network)
- Pluggable “strategy” modules
Adaptive job scheduler:
- Shrink/expand a parallel job by migrating objects
- Dramatic utilization improvement
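
A sketch of the piece an application supplies so its objects can be migrated, assuming standard Charm++ conventions; the class and field names are illustrative. The object describes its state once in a pup (pack/unpack) routine, and that same routine is used to size, pack, and unpack the object whenever the load balancer or job scheduler moves it.

    // patch.C -- migratable chare array element (illustrative)
    #include "patch.decl.h"   // generated by charmc from the .ci interface file (not shown)
    #include "pup_stl.h"      // pup operators for STL containers
    #include <vector>

    class Patch : public CBase_Patch {
      std::vector<double> data;   // per-object state
      int step;
    public:
      Patch() : step(0) {}
      Patch(CkMigrateMessage *m) {}     // invoked on the destination processor

      void pup(PUP::er &p) {            // one routine sizes, packs, and unpacks
        CBase_Patch::pup(p);            // let the runtime pup array-element internals
        p | step;
        p | data;
      }
    };

    #include "patch.def.h"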

17 Examples: Load Balancing
[Figure: 1. adaptive refinement, 2. load balancer invoked, 3. chunks migrated]

18 Examples: Expanding Job

19 Examples: Virtualization

20 Conclusions
Parallel applications need something like a “kernel”:
- A neutral party to mediate CPU use: significant utilization gains
- Easy to put good tools in the kernel: work migration support, load balancing
Consider using Charm++

