
1 Charm++ Data-driven Objects L. V. Kale

2 Parallel Programming
Decomposition: what to do in parallel
Mapping:
–Which processor executes each task
Scheduling (sequencing)
–On each processor
Machine dependent expression
–Express the above decisions for the particular parallel machine
The parallel objects model of Charm++ automates mapping, scheduling, and machine dependent expression

3 Shared objects model
Basic philosophy:
–Let the programmer decide what to do in parallel
–Let the system handle the rest: which processor executes what, and when
–With some override control for the programmer, when needed
Basic model:
–The program is a set of communicating objects
–Objects only know about other objects (not processors)
–System maps objects to processors, and may remap the objects dynamically, e.g. for load balancing
Shared objects, not shared memory
–In between the “shared nothing” of message passing and the “shared everything” of SAS (shared address space)
–Additional information sharing mechanisms
–“Disciplined” sharing

4 Charm++
Charm++ programs specify parallel computations consisting of a number of “objects”
How do they communicate?
–By invoking methods on each other, typically asynchronously
–Also by sharing data using “specifically shared variables”
What kinds of objects?
–Chares: singleton objects
–Chare arrays: generalized collections of objects
–Advanced: Chare groups (used by library writers and the system)

5 Data Driven Execution in Charm++
[Figure labels: Scheduler, Message Q, Objects]
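The figure labels suggest a loop in which a scheduler takes messages from a queue and runs the target object's method. A minimal sketch of that idea in plain C++ follows. It is not the Charm++ runtime: the names Message, Object, and runScheduler are hypothetical, and a single processor with a FIFO queue is assumed.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical names throughout; this is not the Charm++ runtime API.
struct Message {
    int targetObject;  // index of the destination object on this processor
    int payload;       // data carried by the message
};

struct Object {
    int received = 0;  // running sum of payloads delivered to this object

    // "Entry method": runs only when a message for this object is dequeued.
    void entry(int payload, std::deque<Message>& queue, int self, int numObjects) {
        received += payload;
        // Forward a smaller payload to the next object until it reaches 1.
        if (payload > 1) queue.push_back({(self + 1) % numObjects, payload - 1});
    }
};

// The scheduler: repeatedly pick a message and invoke the target's method.
// Returns the number of messages processed.
int runScheduler(std::vector<Object>& objects, std::deque<Message>& queue) {
    int processed = 0;
    const int n = static_cast<int>(objects.size());
    while (!queue.empty()) {
        Message m = queue.front();
        queue.pop_front();
        assert(m.targetObject >= 0 && m.targetObject < n);
        objects[m.targetObject].entry(m.payload, queue, m.targetObject, n);
        ++processed;
    }
    return processed;
}
```

No object ever blocks waiting for data: an object runs only when a message addressed to it reaches the front of the queue, which is the sense in which execution is data driven.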

6 Need for Proxies
Consider:
–Object x of class A wants to invoke method f of object y of class B
–x and y are on different processors
–What should the syntax be? y->f(…) doesn’t work because y is not a local pointer
Needed:
–Instead of “y” we must use an ID that is valid across processors
–Method invocation should use this ID
–Some part of the system must pack the parameters and send them
–Some part of the system on the remote processor must invoke the right method on the right object with the parameters supplied

7 Charm++ solution: proxy classes
Classes with remotely invokable methods
–inherit from the “chare” class (system defined)
–entry methods can only have one parameter: a subclass of message
For each chare class D
–which has methods that we want to remotely invoke
–the system will automatically generate a proxy class CProxy_D
–Proxy objects know where the real object is
–Methods invoked on the proxy simply put the data in an “envelope” and send it out to the destination
Each chare object has a proxy
–CProxy_D thisProxy; // thisProxy inherited from “CBase_D”
–Also you can get a proxy for a chare when you create it: CProxy_D myNewChare = CProxy_D::ckNew(arg);

8 Chare creation and method invocation
CProxy_D x = CProxy_D::ckNew(25);
x.f(5,7);
Sequential equivalent:
y = new D(25);
y->f(5,7);
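What a proxy does behind a call like x.f(5,7) can be sketched in plain C++. This is a hand-written illustration, not the code Charm++ generates: ProxyD, Envelope, network, objectsById, and deliverAll are hypothetical names, and the "remote processor" is simulated by a queue in the same process.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <queue>
#include <vector>

// All names here are hypothetical; real Charm++ generates CProxy_ classes.
struct D {                      // the "real" object
    int total = 0;
    void f(int a, int b) { total += a * b; }
};

struct Envelope {
    int objectId;               // ID that is valid on every processor
    int methodId;               // which entry method to run (0 means f)
    std::vector<char> data;     // packed parameters
};

std::queue<Envelope> network;   // stands in for the interconnect
std::map<int, D> objectsById;   // objects living on the "remote" processor

class ProxyD {
    int id_;
public:
    explicit ProxyD(int id) : id_(id) {}

    // Looks like a method call, but only packs the arguments and sends them.
    void f(int a, int b) const {
        Envelope e{id_, 0, std::vector<char>(2 * sizeof(int))};
        std::memcpy(e.data.data(), &a, sizeof(int));
        std::memcpy(e.data.data() + sizeof(int), &b, sizeof(int));
        network.push(e);
    }
};

// Receiver side: unpack each envelope and invoke the right method
// on the right object with the parameters supplied.
void deliverAll() {
    while (!network.empty()) {
        Envelope e = network.front();
        network.pop();
        assert(e.methodId == 0 && e.data.size() == 2 * sizeof(int));
        int a = 0, b = 0;
        std::memcpy(&a, e.data.data(), sizeof(int));
        std::memcpy(&b, e.data.data() + sizeof(int), sizeof(int));
        objectsById.at(e.objectId).f(a, b);
    }
}
```

The call on the proxy returns before D::f has run, which mirrors the asynchronous invocation described in the slides; the method executes only when deliverAll processes the envelope.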

9 Chares (Data driven Objects)
Regular C++ classes,
–with some methods designated as remotely invokable (called entry methods)
Creation of an instance of chare class C
–CProxy_C myChareProxy = CProxy_C::ckNew(args);
–To create an instance of C on a specified processor “pe”: CProxy_C::ckNew(args, pe);
–CProxy_C: a proxy class generated by Charm++ for the chare class C declared by the user

10 Remote method invocation
Proxy classes:
–For each chare class C, the system generates a proxy class CProxy_C
–Global: in the sense of being valid on all processors
–thisProxy (analogous to this) gets you your own proxy
–You can send proxies in messages
–Given a proxy p, you can invoke methods: p.method(msg);

11
CProxy_main mainProxy;

main::main(CkArgMsg * m) // execution begins here; m carries argc/argv
{
  int i = 0;
  for (i=0; i<100; i++)
    new CProxy_piPart();
  responders = 100;
  count = 0;
  mainProxy = thisProxy; // readonly initialization
}

void main::results(int pcount)
{
  count += pcount;
  if (0 == --responders) {
    cout << "pi=: " << 4.0*count/100000 << endl;
    CkExit(); // exit the program
  }
}

12
piPart::piPart()
{
  // declarations..
  srand48((long) this);
  mySamples = 100000/100;
  // i < mySamples, so the 100 parts together draw exactly the
  // 100000 samples that main::results divides by
  for (i=0; i<mySamples; i++) {
    x = drand48();
    y = drand48();
    if ((x*x + y*y) <= 1.0)
      localCount++;
  }
  mainProxy.results(localCount);
  delete this;
}
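The arithmetic behind the pi program can be checked with a sequential sketch in plain C++. The function names countHits and estimatePi are hypothetical, and std::mt19937 with per-part seeds is an assumption made here for reproducibility; the slides use srand48/drand48. Each call to countHits plays the role of one piPart, and the accumulation loop plays the role of main::results.

```cpp
#include <cassert>
#include <random>

// Counts samples that fall inside the unit quarter circle.
// Plays the role of one piPart (hypothetical helper, not Charm++ code).
int countHits(int samples, unsigned seed) {
    assert(samples >= 0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    int hits = 0;
    for (int i = 0; i < samples; i++) {
        double x = unit(gen);
        double y = unit(gen);
        if (x * x + y * y <= 1.0) hits++;
    }
    return hits;
}

// Accumulates the per-part counts, as main::results does, and scales by 4
// because the quarter circle covers pi/4 of the unit square.
double estimatePi(int parts, int samplesPerPart) {
    long count = 0;
    for (int p = 0; p < parts; p++) {
        count += countHits(samplesPerPart, 1000u + static_cast<unsigned>(p));
    }
    return 4.0 * count / (static_cast<double>(parts) * samplesPerPart);
}
```

Calling estimatePi(100, 1000) uses the same 100 parts and 100000 total samples as the slides, so the result should land close to 3.14, with the usual Monte Carlo error of roughly 0.01 at this sample size.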

13 Generation of proxy classes
How does Charm++ generate the proxy classes?
–Needs help from the programmer
–Name the classes and methods that can be remotely invoked
–Declare these in a special “charm interface” file (pgm.ci)
–Include the generated code in your program
pgm.ci:
  mainmodule PiMod {
    mainchare main {
      entry main();
      entry results(int pc);
    };
    chare piPart {
      entry piPart(void);
    };
  };
Generates PiMod.decl.h and PiMod.def.h
pgm.h:
  #include "PiMod.decl.h"
  ..
pgm.C:
  …
  #include "PiMod.def.h"

14 Charm++ Data Driven Objects
Message classes
Asynchronous method invocation
Prioritized scheduling
Object Arrays
Object Groups:
–global object with a “representative” on each PE
Information sharing abstractions
–readonly data
–accumulators
–distributed tables

15 Object Arrays
A collection of chares,
–with a single global name for the collection, and
–each member addressed by an index
–Mapping of element objects to processors handled by the system
[Figure: User’s view: A[0] A[1] A[2] A[3] A[..]; System view: A[3] A[0]]
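The split between the user's view (indices) and the system view (processors) can be illustrated with a small location table in plain C++. This is only a sketch: ArrayLocations is a hypothetical class, the round-robin initial placement is an assumption, and the real runtime's bookkeeping is more involved.

```cpp
#include <cassert>
#include <map>

// Hypothetical location table mapping array indices to processors.
class ArrayLocations {
public:
    ArrayLocations(int numElements, int numPes) : numPes_(numPes) {
        assert(numElements >= 0 && numPes > 0);
        // Initial placement: round-robin (an assumption for this sketch).
        for (int i = 0; i < numElements; i++) home_[i] = i % numPes;
    }

    // The system looks up where element `index` currently lives;
    // user code only ever names the index.
    int peOf(int index) const { return home_.at(index); }

    // Remapping (e.g. for load balancing) changes the table, not the index.
    void migrate(int index, int destPe) {
        assert(destPe >= 0 && destPe < numPes_);
        home_.at(index) = destPe;
    }

private:
    int numPes_;
    std::map<int, int> home_;
};
```

Because senders address A[3] by index, moving A[3] to another processor requires no change in the sending code; only the table entry changes.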

16 Introduction
Elements are parallel objects like chares
Elements are indexed by a user-defined data type: [sparse] 1D, 2D, 3D, tree, ...
Send messages to index, receive messages at element
Reductions and broadcasts across the array
Dynamic insertion, deletion, migration, and everything still has to work!
Interfaces with automatic load balancer

17 1D Declare & Use
In the interface (.ci) file:
  module m {
    array [1D] Hello {
      entry Hello(void);
      entry void SayHi(int HiData);
    };
  };
In the .C file:
  //Create an array of Hello’s with 4 elements:
  int nElements=4;
  CProxy_Hello p = CProxy_Hello::ckNew(nElements);
  //Have element 2 say “hi”
  p[2].SayHi(12345);

18 1D Definition
class Hello:public CBase_Hello{
public:
  Hello(void) {
    … thisProxy …
    … thisIndex …
  }
  void SayHi(int m) {
    if (m <1000)
      thisProxy[thisIndex+1].SayHi(m+1);
  }
  Hello(CkMigrateMessage *m) {}
};
Callout: Inherited from ArrayElement1D

19 3D Declare & Use
  module m {
    array [3D] Hello {
      entry Hello(void);
      entry void SayHi(int HiData);
    };
  };
  CProxy_Hello p= CProxy_Hello::ckNew();
  for (int i=0;i<800000;i++)
    p(x(i),y(i),z(i)).insert();
  p.doneInserting();
  p(12,23,7).SayHi( 34);

20 3D Definition
class Hello:public CBase_Hello{
public:
  Hello(void) {
    ... thisProxy ...
    ... thisIndex.x, thisIndex.y, thisIndex.z ...
  }
  void SayHi(int HiData) { ... }
  Hello(CkMigrateMessage *m) {}
};

21 Pup Routine
void pup(PUP::er &p) {
  // Call our superclass’s pup routine:
  ArrayElement3D::pup(p);
  p|myVar1;
  p|myVar2;
  ...
}
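Why a single pup routine is enough for sizing, packing, and unpacking can be shown with a miniature in plain C++. This is a sketch under stated assumptions, not the real PUP::er: Pupper, Element, and migrateCopy are hypothetical names, and only trivially copyable fields are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical miniature of a pup-style traversal; not the real PUP::er class.
class Pupper {
public:
    enum Mode { Sizing, Packing, Unpacking };

    explicit Pupper(Mode mode, std::vector<char>* buf = nullptr)
        : mode_(mode), buf_(buf) {}

    std::size_t size() const { return offset_; }

    // One traversal step serves all three modes.
    // Assumes T is trivially copyable (int, double, plain structs).
    template <typename T>
    void bytes(T& value) {
        if (mode_ != Sizing) {
            assert(buf_ != nullptr && offset_ + sizeof(T) <= buf_->size());
            if (mode_ == Packing)
                std::memcpy(buf_->data() + offset_, &value, sizeof(T));
            else
                std::memcpy(&value, buf_->data() + offset_, sizeof(T));
        }
        offset_ += sizeof(T);  // in Sizing mode this is the only effect
    }

private:
    Mode mode_;
    std::vector<char>* buf_;
    std::size_t offset_ = 0;
};

template <typename T>
Pupper& operator|(Pupper& p, T& value) {
    p.bytes(value);
    return p;
}

struct Element {
    int myVar1 = 0;
    double myVar2 = 0.0;
    // Written once; used for sizing, packing, and unpacking alike.
    void pup(Pupper& p) { p | myVar1; p | myVar2; }
};

// Simulated migration: size the object, pack it, then unpack into a new one.
Element migrateCopy(Element& src) {
    Pupper sizer(Pupper::Sizing);
    src.pup(sizer);
    std::vector<char> buffer(sizer.size());
    Pupper packer(Pupper::Packing, &buffer);
    src.pup(packer);
    Element dst;
    Pupper unpacker(Pupper::Unpacking, &buffer);
    dst.pup(unpacker);
    return dst;
}
```

The design point is that the list of fields appears in exactly one place, so the packing and unpacking orders cannot drift apart when a field is added.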

22 Generalized “arrays”: Declare & Use
  module m {
    array [Foo] Hello {
      entry Hello(void);
      entry void SayHi(int data);
    };
  };
  CProxy_Hello p= CProxy_Hello::ckNew();
  for (...)
    p[CkArrayIndexFoo(..)].insert();
  p.doneInserting();
  p[CkArrayIndexFoo(..)].SayHi(..);

23 General Definition
class Hello:public CBase_Hello {
public:
  Hello(void) {
    ... thisIndex ...
  }
  ...
};
class CkArrayIndexFoo: public CkArrayIndex {
  Bar b; //char b[8]; float b[2];..
public:
  CkArrayIndexFoo(...) {
    ...
    nInts=sizeof(b)/sizeof(int);
  }
};

24 Collective ops
Broadcast message SayHi:
  p.SayHi(data);
Reduce x across all elements:
  contribute(sizeof(x),&x,CkReduction::sum_int,cb);
Where do reduction results go? To a “callback” function, named cb above:
  // Call some function foo with fooData when done:
  CkCallback cb(foo,fooData);
  // Broadcast the results to my method “bar” when done:
  CkCallback cb(CkIndex_MyArray::bar,thisProxy);
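The pattern of contributing to a reduction and receiving the result in a callback can be sketched in plain C++. SumReduction is a hypothetical class written for illustration; it is not the Charm++ contribute() or CkCallback API, and it assumes every element contributes exactly once.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical sketch of a sum reduction with a completion callback.
class SumReduction {
public:
    SumReduction(int expected, std::function<void(int)> callback)
        : remaining_(expected), callback_(std::move(callback)) {
        assert(expected > 0);
    }

    // Each element calls this once; the callback fires after the last one.
    void contribute(int x) {
        assert(remaining_ > 0);
        sum_ += x;
        if (--remaining_ == 0) callback_(sum_);
    }

private:
    int remaining_;                      // contributions still outstanding
    int sum_ = 0;                        // partial result so far
    std::function<void(int)> callback_;  // where the final result goes
};
```

No contributor waits for the result: each one hands in its value and moves on, and whoever registered the callback is notified once the last contribution arrives.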

25 Migration support
Delete element i:
  p[i].destroy();
Migrate to processor destPe:
  migrateMe(destPe);
Enable load balancer: by creating a load balancing object
Provide pack/unpack functions: each object that needs this provides a “pup” method (pup is a single abstraction that allows data traversal for determining size, packing, and unpacking)

26 Object Groups
A group of objects (chares)
–with exactly one representative on each processor
–A single proxy for the group as a whole
–invoke methods in a branch (asynchronously), all branches (broadcast), or in the local branch
–creation: agroup = CProxy_C::ckNew(msg)
–remote invocation: p.methodName(msg); // p.methodName(msg, peNum);
–local branch: p.ckLocalBranch()->f(….);

27 Information sharing abstractions
Observation:
–Information is shared in several specific modes in parallel programs
Other models support only a limited set of modes:
–Shared memory: everything is shared: sledgehammer approach
–Message passing: messages are the only method
Charm++: identifies and supports several modes
–Readonly / writeonce
–Tables (hash tables)
–Accumulators
–Monotonic variables

28 Compiling Charm++ programs
Need to define an interface specification file
–mod.ci for each module mod
–Contains declarations that the system uses to produce proxy classes
–These produced classes must be included in your mod.C file
–See examples provided on the class web site
More information:
–Manuals, example programs, papers: http://charm.cs.uiuc.edu/
These slides are currently at:
–http://charm.cs.uiuc.edu/presentations/charmTutorial/

29 Fortran 90 version
Quick implementation on top of Charm++
How to use:
–follow example program, with the same basic concepts
–Only use object arrays, for now (the most useful construct)
–Object groups can be implemented in C++, if needed

30 Further Reading
More information:
–Manuals, example programs, papers: http://charm.cs.uiuc.edu
These slides are currently at:
–http://charm.cs.uiuc.edu/kale/cse320

