1 Charm++ Data-driven Objects
L. V. Kale

2 Parallel Programming
Decomposition: what to do in parallel
Mapping:
–Which processor executes each task
Scheduling (sequencing)
–On each processor
Machine dependent expression
–Express the above decisions for the particular parallel machine
The parallel objects model of Charm++ automates mapping, scheduling, and machine dependent expression

3 Shared objects model
Basic philosophy:
–Let the programmer decide what to do in parallel
–Let the system handle the rest: which processor executes what, and when
–With some override control for the programmer, when needed
Basic model:
–The program is a set of communicating objects
–Objects only know about other objects (not processors)
–The system maps objects to processors, and may remap them dynamically, e.g. for load balancing
Shared objects, not shared memory:
–In between the “shared nothing” of message passing and the “shared everything” of SAS (shared address space)
–Additional information sharing mechanisms
–“Disciplined” sharing

4 Charm++
Charm++ programs specify parallel computations consisting of a number of “objects”
How do they communicate?
–By invoking methods on each other, typically asynchronously
–Also by sharing data using “specifically shared variables”
What kinds of objects?
–Chares: singleton objects
–Chare arrays: generalized collections of objects
–Advanced: chare groups (used by library writers and the system)

5 Data Driven Execution in Charm++
[Figure labels: Scheduler, Message Q, Objects]
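The figure labels suggest a per-processor scheduler that takes messages from a queue and runs the target object's method. The following is a minimal sketch of that idea in plain C++, not Charm++ itself; `Scheduler`, `Counter`, and `demoScheduler` are hypothetical names introduced only for illustration, and a FIFO queue is assumed.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for a chare: it only runs when a message reaches it.
struct Counter {
    int total = 0;
    void add(int value) { total += value; }
};

// Hypothetical per-processor scheduler with a FIFO queue of pending invocations.
class Scheduler {
public:
    // Asynchronous send: enqueue the invocation and return immediately.
    void send(std::function<void()> invocation) {
        queue_.push(std::move(invocation));
    }
    // Deliver messages one at a time until the queue is empty.
    int run() {
        int delivered = 0;
        while (!queue_.empty()) {
            std::function<void()> next = std::move(queue_.front());
            queue_.pop();
            next();  // the invoked method may enqueue further messages
            ++delivered;
        }
        return delivered;
    }
private:
    std::queue<std::function<void()>> queue_;
};

// Sends messages to one object and returns its state after the queue drains.
int demoScheduler() {
    Scheduler sched;
    Counter c;
    sched.send([&c] { c.add(1); });
    sched.send([&c, &sched] {
        c.add(2);
        sched.send([&c] { c.add(4); });  // a method that triggers another message
    });
    assert(c.total == 0);  // nothing has executed yet: send only enqueues
    int delivered = sched.run();
    assert(delivered == 3);
    return c.total;
}
```

Calling `demoScheduler()` runs all three invocations in arrival order; the object never blocks waiting for data, it is simply scheduled when a message for it is picked from the queue.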

6 Need for Proxies
Consider:
–Object x of class A wants to invoke method f of object y of class B
–x and y are on different processors
–What should the syntax be? y->f(…) doesn’t work, because y is not a local pointer
Needed:
–Instead of “y” we must use an ID that is valid across processors
–Method invocation should use this ID
–Some part of the system must pack the parameters and send them
–Some part of the system on the remote processor must invoke the right method on the right object with the parameters supplied

7 Charm++ solution: proxy classes
Classes with remotely invocable methods
–inherit from the “chare” class (system defined)
–entry methods can have only one parameter: a subclass of message
For each chare class D
–which has methods that we want to invoke remotely
–the system automatically generates a proxy class CProxy_D
–proxy objects know where the real object is
–methods invoked on the proxy simply put the data in an “envelope” and send it to the destination
Each chare object has a global ID
–CkChareID thishandle; // thishandle inherited from “chare”
–You can also get the ID of a chare when you create it: CProxy_D *p = new CProxy_D(msgPtr);
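To make the proxy idea concrete, here is a small simulation in plain C++. It is only a sketch under stated assumptions: `GlobalId`, `Envelope`, `Processor`, and `ProxyD` are hypothetical types invented for this illustration and are not the classes Charm++ generates; "processors" are just entries in a vector.

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <vector>

// Hypothetical global ID: meaningful on every simulated processor.
struct GlobalId { int pe; int slot; };

// Hypothetical envelope: the packed parameters plus the target object slot.
struct Envelope { int slot; int a; int b; };

// The "real" object, living on exactly one processor.
class D {
public:
    int sum = 0;
    void f(int a, int b) { sum += a + b; }
};

// A simulated processor: its local objects and an inbox of envelopes.
struct Processor {
    std::map<int, D> objects;
    std::deque<Envelope> inbox;
    // Remote side: unpack each envelope and invoke the method on the right object.
    void deliverAll() {
        while (!inbox.empty()) {
            Envelope e = inbox.front();
            inbox.pop_front();
            objects.at(e.slot).f(e.a, e.b);
        }
    }
};

// The proxy: same method name as D, but it only packs and sends.
class ProxyD {
public:
    ProxyD(std::vector<Processor>& machine, GlobalId id)
        : machine_(machine), id_(id) {}
    void f(int a, int b) { machine_[id_.pe].inbox.push_back({id_.slot, a, b}); }
private:
    std::vector<Processor>& machine_;
    GlobalId id_;
};

// Invokes f(5, 7) through a proxy and returns the remote object's state.
int demoProxy() {
    std::vector<Processor> machine(2);
    GlobalId id{1, 0};             // object lives on processor 1, slot 0
    machine[1].objects[0] = D{};
    ProxyD x(machine, id);
    x.f(5, 7);                     // returns at once; nothing has run yet
    assert(machine[1].objects.at(0).sum == 0);
    machine[1].deliverAll();       // the destination processes its inbox
    return machine[1].objects.at(0).sum;
}
```

The caller only ever touches `ProxyD`, never a pointer to `D`, which mirrors the point of the slide: the ID inside the proxy stays valid no matter which processor holds it.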

8 Chare creation and method invocation
    Msg *m = new Msg();
    m->arg = 25;
    CProxy_D *x = new CProxy_D(m);
    Msg2 *m2 = new Msg2();
    m2->a = 5; m2->b = 7;
    x->f(m2);
Alternatively:
    x->f(new Msg2(5,7));
Sequential equivalent:
    y = new D(25);
    y->f(5,7);

9 Chares (Data driven Objects)
Regular C++ classes,
–with some methods designated as remotely invocable (called entry methods)
–entry methods have only one parameter, of type message
Creation of an instance of chare class C:
–CProxy_C *p = new CProxy_C(msg);
–To create an instance of C on a specified processor “pe”: new CProxy_C(msg, pe);
–CProxy_C: a proxy class generated by Charm for the chare class C declared by the user

10 Messages
A user-defined C++ class
–inherits from a system-defined class
Messages can be communicated to others as parameters
–Has regular data fields
Declaration: normal C++,
–inherit from a system-defined class
Creation: (just usual C++)
–MsgType *m = new MsgType;

11 Remote method invocation
Proxy classes:
–For each chare class C, the system generates a proxy class (C : CProxy_C)
Each chare has a global ID (ChareID)
–Global: in the sense of being valid on all processors
–thishandle (analogous to this) gets you the ChareID
–You can send thishandle in messages
–Given a handle h, you can create a proxy:
    CProxy_C p(h);    // or q = new CProxy_C(h)
    p.method(msg);    // or q->method(msg);

12
    CkChareID mainhandle;

    main::main(CkArgMsg *m)   // Execution begins here; m carries argc/argv
    {
      int i = 0;
      for (i=0; i<100; i++)
        new CProxy_piPart();
      responders = 100;
      count = 0;
      mainhandle = thishandle; // readonly initialization
    }

    void main::results(DataMsg *msg)
    {
      count += msg->count;
      if (0 == --responders) {
        CkPrintf("pi=: %f \n", 4.0*count/100000);
        CkExit();   // Exit scheduler after method returns
      }
    }

13
    piPart::piPart()
    {
      // declarations..
      CProxy_main mainproxy(mainhandle);
      srand48((long) this);
      mySamples = 100000/100;
      // i < mySamples: exactly mySamples points, so 100 parts total 100000
      for (i = 0; i < mySamples; i++) {
        x = drand48();
        y = drand48();
        if ((x*x + y*y) <= 1.0)
          localCount++;
      }
      DataMsg *result = new DataMsg;
      result->count = localCount;
      mainproxy.results(result);
      delete this;
    }
Or, building and sending the message in one step:
    mainproxy.results(new DataMsg(localCount));
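For comparison, the decomposition used in the pi example can be sketched sequentially in standard C++. This is an illustrative assumption-laden version, not the Charm++ program: `countHits` and `estimatePi` are hypothetical names, `std::mt19937` replaces `drand48`, and each part is seeded from its index rather than from an object address.

```cpp
#include <cassert>
#include <random>

// Work of one hypothetical "part": count random points that fall inside
// the unit quarter circle.
long countHits(unsigned seed, long samples) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    long hits = 0;
    for (long i = 0; i < samples; ++i) {
        double x = unit(gen);
        double y = unit(gen);
        if (x * x + y * y <= 1.0) ++hits;
    }
    return hits;
}

// Sequential stand-in for main: hand each part its share of the samples,
// accumulate the replies, and compute pi once every part has responded.
double estimatePi(int parts, long totalSamples) {
    long perPart = totalSamples / parts;
    long count = 0;
    int responders = parts;
    for (int p = 0; p < parts; ++p) {
        count += countHits(static_cast<unsigned>(p) + 1u, perPart);
        --responders;  // mirrors the --responders bookkeeping in results()
    }
    assert(responders == 0);  // all parts have replied
    return 4.0 * static_cast<double>(count)
               / static_cast<double>(perPart * parts);
}
```

A call such as `estimatePi(100, 100000)` matches the 100 parts and 100000 samples of the slides. In the parallel version the loop body becomes independent objects and the accumulation becomes the `results` entry method.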

14 Generation of proxy classes
How does charm generate the proxy classes?
–Needs help from the programmer
–Name the classes and methods that can be remotely invoked
–Declare this in a special “charm interface” file (pgm.ci)
–Include the generated code in your program
pgm.ci:
    mainmodule PiMod {
      message DataMsg;
      mainchare main {
        entry main();
        entry results(DataMsg *);
      };
      chare piPart {
        entry piPart(void);
      };
    };
Generates PiMod.decl.h and PiMod.def.h
pgm.h:
    #include "PiMod.decl.h"
    ..
pgm.C:
    …
    #include "PiMod.def.h"

15 Charm++
Data Driven Objects
Message classes
Asynchronous method invocation
Prioritized scheduling
Object Arrays
Object Groups:
–global object with a “representative” on each PE
Information sharing abstractions
–readonly data
–accumulators
–distributed tables

16 Object Arrays
A collection of chares,
–with a single global name for the collection, and
–each member addressed by an index
–Mapping of element objects to processors handled by the system
[Figure: User’s view: A[0] A[1] A[2] A[3] A[..]; System view: A[3] A[0]]

17 Introduction
Elements are parallel objects like chares
Elements are indexed by a user-defined data type: [sparse] 1D, 2D, 3D, tree, ...
Send messages to index, receive messages at element
Reductions and broadcasts across the array
Dynamic insertion, deletion, migration, and everything still has to work!
Interfaces with automatic load balancer

18 1D Declare & Use
In the interface (.ci) file:
    module m {
      message HiMsg;
      array [1D] Hello {
        entry Hello(void);
        entry void SayHi(HiMsg *);
      };
    };
In the .C file:
    CProxy_Hello p = CProxy_Hello::ckNew();
    for (int i=12;i<73;i+=7)
      p[i].insert();
    p.doneInserting();
    p[12].SayHi(new HiMsg(...));

19 1D Definition
    class Hello : public ArrayElement1D {
    public:
      Hello(void) {
        ... thisArrayID ...   // inherited from ArrayElement1D
        ... thisIndex ...     // inherited from ArrayElement1D
      }
      void SayHi(HiMsg *m) {
        ...
      }
      Hello(CkMigrateMessage *m) {}
    };

20 3D Declare & Use
    module m {
      message HiMsg;
      array [3D] Hello {
        entry Hello(void);
        entry void SayHi(HiMsg *);
      };
    };

    CProxy_Hello p = CProxy_Hello::ckNew();
    for (int i=0;i<800000;i++)
      p(x(i),y(i),z(i)).insert();
    p.doneInserting();
    p(12,23,7).SayHi(new HiMsg(...));

21 3D Definition
    class Hello : public ArrayElement3D {
    public:
      Hello(void) {
        ... thisArrayID ...
        ... thisIndex.x, thisIndex.y, thisIndex.z ...
      }
      void SayHi(HiMsg *m) {
        ...
      }
      Hello(CkMigrateMessage *m) {}
    };

22 3D Definition
    class Hello : public ArrayElement3D {
    public:
      Hello(void) {
        ... thisArrayID ...
        ... thisIndex.x, .y, .z ...
      }
      void SayHi(HiMsg *m) {
        ...
      }
      Hello(CkMigrateMessage *m) {}
      void pup(PUP::er &p) {
        ArrayElement3D::pup(p);
        p(myVar1); p(myVar2);
        ...
      }
    };

23 Generalized “arrays”: Declare & Use
    module m {
      message HiMsg;
      array [Foo] Hello {
        entry Hello(void);
        entry void SayHi(HiMsg *);
      };
    };

    CProxy_Hello p = CProxy_Hello::ckNew();
    for (...)
      p[CkArrayIndexFoo(..)].insert();
    p.doneInserting();
    p[CkArrayIndexFoo(..)].SayHi(..);

24 General Definition
    class Hello : public ArrayElementT {
    public:
      Hello(void) {
        ... thisIndex ...

    class CkArrayIndexFoo : public CkArrayIndex {
      Bar b; // char b[8]; float b[2]; ..
    public:
      CkArrayIndexFoo(...) {
        ...
        nInts = sizeof(b)/sizeof(int);
      }
    };

25 Collective ops
Broadcast message SayHi:
    p.SayHi(new HiMsg(...));
Reduce x across all elements:
    contribute(sizeof(x), &x, CkReduction::sum_int);
Where do reduction results go? To a reduction “client” function, registered by the caller (typically as soon as the array is created):
    CProxy_A a = CProxy_A::ckNew();
    a.setReductionClient(clientFunction, (void *) refData);
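The contribute-then-client pattern can be illustrated with a small plain C++ sketch. It is a simulation under assumptions, not the Charm++ reduction machinery: `SumReduction` and `demoReduction` are hypothetical names, the element count is known up front, and everything runs in one address space.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical sum reduction: collects one contribution per element and
// calls the registered client exactly once, when the last one arrives.
class SumReduction {
public:
    SumReduction(int expected, std::function<void(int)> client)
        : remaining_(expected), client_(std::move(client)) {}

    // Called once by each element with its local value.
    void contribute(int value) {
        assert(remaining_ > 0);  // no contributions after completion
        sum_ += value;
        if (--remaining_ == 0) client_(sum_);
    }
private:
    int remaining_;
    int sum_ = 0;
    std::function<void(int)> client_;
};

// Elements 0..elements-1 each contribute their own index; the client
// stores the reduced value so the caller can inspect it.
int demoReduction(int elements) {
    int result = -1;  // stays -1 if the client is never invoked
    SumReduction red(elements, [&result](int total) { result = total; });
    for (int i = 0; i < elements; ++i) red.contribute(i);
    return result;
}
```

The elements never learn the result directly; only the client function sees it, which is why the slide stresses registering the client as soon as the array is created.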

26 Migration support
Delete element i:
    p[i].destroy();
Migrate to processor destPe:
    migrateMe(destPe);
Enable load balancer: by creating a load balancing object
Provide pack/unpack functions: each object that needs this provides a “pup” method (pup is a single abstraction that allows data traversal for determining size, packing and unpacking)
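The idea of one traversal serving sizing, packing, and unpacking can be sketched in standard C++ as follows. This is a simplified illustration and not the real `PUP::er` class: `Pupper`, `Particle`, and `migrate` are hypothetical, and only trivially copyable fields are supported.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical pack/unpack helper: the same operator() call either measures,
// writes, or reads a field, depending on the mode.
class Pupper {
public:
    enum Mode { Sizing, Packing, Unpacking };

    explicit Pupper(Mode mode, std::vector<char>* buffer = nullptr)
        : mode_(mode), buffer_(buffer) {}

    template <typename T>
    void operator()(T& field) {
        static_assert(std::is_trivially_copyable<T>::value,
                      "this sketch only handles plain data fields");
        if (mode_ == Packing) {
            std::memcpy(buffer_->data() + offset_, &field, sizeof(T));
        } else if (mode_ == Unpacking) {
            std::memcpy(&field, buffer_->data() + offset_, sizeof(T));
        }
        offset_ += sizeof(T);  // sizing mode only advances the offset
    }

    std::size_t size() const { return offset_; }
private:
    Mode mode_;
    std::vector<char>* buffer_;
    std::size_t offset_ = 0;
};

// An object that can migrate: it lists its fields once, in pup().
struct Particle {
    int id = 0;
    double mass = 0.0;
    void pup(Pupper& p) { p(id); p(mass); }
};

// Simulated migration: size, pack into a buffer, then unpack into a fresh
// object, as would happen on the destination processor.
Particle migrate(Particle original) {
    Pupper sizer(Pupper::Sizing);
    original.pup(sizer);

    std::vector<char> buffer(sizer.size());
    Pupper packer(Pupper::Packing, &buffer);
    original.pup(packer);

    Particle copy;
    Pupper unpacker(Pupper::Unpacking, &buffer);
    copy.pup(unpacker);
    assert(unpacker.size() == buffer.size());  // every packed byte was consumed
    return copy;
}
```

Because the field list lives in a single method, adding a member to `Particle` means touching one line, and the size, pack, and unpack steps cannot drift out of sync.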

27 Object Groups
A group of objects (chares)
–with exactly one representative on each processor
–a single ID for the group as a whole
–invoke methods in a branch (asynchronously), all branches (broadcast), or in the local branch
–creation: groupId = new CProxy_C(msg)
–remote invocation:
    CProxy_C p(groupId);
    p.methodName(msg); // p.methodName(msg, peNum);
    p.LocalBranch->f(….);

28 Information sharing abstractions
Observation:
–Information is shared in several specific modes in parallel programs
Other models support only a limited set of modes:
–Shared memory: everything is shared (the sledgehammer approach)
–Message passing: messages are the only method
Charm++ identifies and supports several modes:
–Readonly / writeonce
–Tables (hash tables)
–Accumulators
–Monotonic variables

29 Compiling Charm++ programs
Need to define an interface specification file
–mod.ci for each module mod
–Contains declarations that the system uses to produce proxy classes
–These produced classes must be included in your mod.C file
–See examples provided on the class web site
More information:
–Manuals, example programs, papers: http://charm.cs.uiuc.edu
These slides are currently at:
–http://charm.cs.uiuc.edu/kale/cse320

30 Fortran 90 version
Quick implementation on top of Charm++
How to use:
–Follow the example program, with the same basic concepts
–Only use object arrays, for now (the most useful construct)
–Object groups can be implemented in C++, if needed

31 Further Reading
More information:
–Manuals, example programs, papers: http://charm.cs.uiuc.edu
These slides are currently at:
–http://charm.cs.uiuc.edu/kale/cse320

