
1 Distributed Galois Andrew Lenharth 2/27/2015

2 Goals
An implementation of the operator formulation for distributed memory
– Ideally forward-compatible where possible
Both a simple programming model and a fast implementation
– Like Galois, may need restrictions or structure for highest performance

3 Overview
PGAS (using fat pointers)
Implicit, asynchronous communication
Default execution mode:
– Galois compatible
– Implicit locking and data movement
– Pluggable schedulers
– Speculative execution
All D-Galois programs are valid Galois programs
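The PGAS layer's fat pointers can be pictured as a (host, local address) pair that resolves locally when the object is owned by the dereferencing host and goes through the runtime otherwise. The sketch below is illustrative Python; `FatPointer`, `Host`, and the `network` map are hypothetical names, not the actual Galois API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FatPointer:
    host: int        # which host owns the object
    local_addr: int  # address/index within that host's local heap

class Host:
    def __init__(self, host_id):
        self.id = host_id
        self.heap = {}  # local_addr -> object

    def deref(self, ptr, network):
        if ptr.host == self.id:
            return self.heap[ptr.local_addr]            # local: resolve directly
        return network[ptr.host].heap[ptr.local_addr]   # remote: fetch via runtime

# Two simulated hosts; host 1 owns an object at local address 7.
network = {0: Host(0), 1: Host(1)}
network[1].heap[7] = "node-data"
p = FatPointer(host=1, local_addr=7)
remote_value = network[0].deref(p, network)  # host 0 dereferences a remote pointer
```

In a real runtime the remote branch would issue an asynchronous fetch through the directory rather than reach into another host's memory.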

4 Support
Galois implementation: User Code, User Context, Graph, Parallel Loop, Contention Manager, Memory Management, Statistics, Topology, Scheduler, Barrier, Termination, etc.

5 Support
Distributed Galois implementation: User Code, User Context, Graph, Parallel Loop, Contention Manager, Memory Management, Statistics, Topology, Scheduler, Barrier, Termination, etc., plus Network, Directory, and Remote Store.

6 Current Status
Working implementation of the baseline
– Asynchronous, speculative

7 Interesting Problems
Livelock
Asynchronous directory
Abstractions for building data structures
Network hardware
Network software
Remote updates
Scheduling

8 Solved: Livelock
Source: object state transitions are more complex, are asynchronous, and may require multiple steps (hence interruptible)
Solution: a scheme that ensures forward progress of at least one host
Alternative: if this happens a lot for your application, coordinated scheduling may be more appropriate (or relaxed consistency)
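One way to picture a forward-progress guarantee: when several hosts repeatedly steal a contended object from each other before any of them finishes, grant the object by a fixed global order so some host always completes its transaction. The lowest-host-id policy below is an illustrative assumption, not the scheme Galois actually uses.

```python
def arbitrate(requesting_hosts):
    """Deterministically grant a contended object to one requester.
    Policy (assumption): lowest host id wins; any fixed total order
    on hosts gives the same livelock-freedom guarantee."""
    return min(requesting_hosts)

# Hosts 3, 1, and 2 keep interrupting each other's multi-step state
# transitions on the same object; deterministic arbitration ensures
# host 1 holds the object long enough to finish before it moves again.
winner = arbitrate({3, 1, 2})
```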

9 Asynchronous Directory
Source: communication threads and workers interleave access to the directory (and directly to objects stored in the directory)
Solution: mostly just a pain.

10 Abstractions for Building Data Structures
Source: distributed data structures are hard (so are shared-memory data structures).
Solution: a set of abstractions.
Federated object: a different instance on each host/thread; pointers resolve locally. Federation is bootstrapped by the runtime. Federated objects have no notion of exclusive behavior.
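A federated object can be sketched as one logical name backed by a separate instance per host, with every lookup resolving to the caller's local instance and no cross-host exclusivity. The registry dict below stands in for the runtime's bootstrapping step; all names are hypothetical.

```python
class Federated:
    """One logical object, one independent instance per host (assumption:
    a simple dict registry stands in for runtime-bootstrapped federation)."""
    def __init__(self, factory, num_hosts):
        self.instances = {h: factory() for h in range(num_hosts)}

    def local(self, host_id):
        # Pointers to a federated object always resolve to the local instance.
        return self.instances[host_id]

# A per-host counter: host 0's updates are invisible to host 1, by design.
counters = Federated(factory=lambda: {"count": 0}, num_hosts=2)
counters.local(0)["count"] += 5
```

Because there is no exclusive ownership, updates to a federated object never trigger directory traffic; aggregation across hosts, if needed, is a separate explicit step.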

11 Remote Updates
Directory synchronization is really bad when not needed (essential when needed)
Many algorithms have an update-and-schedule behavior for their neighbors
Treat this behavior as a task type
– Multiple task types per loop
– Quite similar to nested parallelism

12 Remote Updates – PageRank

Original operator:
  self.value += self.residual
  for n : neighbors
    n.residual += f(self.residual)
    schedule (operator type on) {n}

With an update task type:
  self.value += self.residual
  for n : neighbors
    schedule (update type on) {n, f(self.residual)}

With a new operator for updates:
  self.residual += update
  schedule (operator type on) {self}
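The two-task-type PageRank formulation on this slide can be simulated with a single worklist carrying both "operator" and "update" tasks. This is a hedged sketch: the contribution function `f`, its 0.85 factor, and the residual reset after a relaxation are standard residual-PageRank assumptions, not taken from the slide.

```python
from collections import deque

# Tiny two-node graph: node 0 points at node 1.
nodes = {0: {"value": 0.0, "residual": 1.0, "nbrs": [1]},
         1: {"value": 0.0, "residual": 0.0, "nbrs": []}}

def f(residual):
    return 0.85 * residual  # hypothetical neighbor-contribution function

worklist = deque([("operator", 0, None)])
while worklist:
    kind, nid, payload = worklist.popleft()
    node = nodes[nid]
    if kind == "operator":
        node["value"] += node["residual"]
        for n in node["nbrs"]:
            # Instead of locking n (possibly remote) through the directory,
            # ship a small update task carrying only the delta.
            worklist.append(("update", n, f(node["residual"])))
        node["residual"] = 0.0  # assumption: residual is consumed on relax
    else:  # "update": apply the delta locally, reschedule the operator
        node["residual"] += payload
        worklist.append(("operator", nid, None))
```

In a distributed run, update tasks destined for a remote node would travel over the network and execute on the node's owner, avoiding directory synchronization entirely.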

13 Scheduling
Source: imagine SSSP using the existing (host-unaware) schedulers on distributed memory
Need a scheduler with a way to anchor work to a data-structure element
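Anchoring work to a data-structure element can be sketched as one priority queue per host, with each work item routed to the host that owns its graph element. The modulo `owner()` partition below is an illustrative assumption.

```python
import heapq

NUM_HOSTS = 2

def owner(node_id):
    return node_id % NUM_HOSTS  # assumption: simple modulo partitioning

# One priority worklist per host, so SSSP-style tasks run where the data lives.
queues = {h: [] for h in range(NUM_HOSTS)}

def schedule(priority, node_id):
    heapq.heappush(queues[owner(node_id)], (priority, node_id))

schedule(3, 4)  # node 4 is owned by host 0
schedule(1, 5)  # node 5 is owned by host 1
```

A host-unaware scheduler would instead let any host pull either task, forcing the graph element (and its distance label) to migrate to wherever the task happens to run.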

14 Network hardware


17 Networks
Small asynchronous messages are bad for throughput
Scale-free graphs stress throughput
Large messages are bad for latency
Find the optimal point
– Sometimes latency is critical

18 Nagle's Algorithm
If you don't have a large message, wait a while to accumulate more data
Bad for latency
Also keeps MPI in its broken behavior range
Also requires O(P) memory for communication buffers (assuming direct pointwise channels)
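The Nagle-style trade-off described here, buffering small messages per destination and flushing on a size or time threshold, can be sketched as follows. The thresholds and class name are illustrative; a real pointwise scheme keeps one such buffer per peer, which is where the O(P) memory cost comes from.

```python
import time

class AggregatingChannel:
    """Buffer small messages to one peer; flush when the buffer is big
    enough or stale enough (Nagle-style aggregation, illustrative only)."""
    def __init__(self, flush_bytes=4096, flush_secs=0.001):
        self.buf = []
        self.size = 0
        self.flush_bytes = flush_bytes
        self.flush_secs = flush_secs
        self.last_flush = time.monotonic()
        self.sent = []  # stand-in for the wire

    def send(self, msg: bytes):
        self.buf.append(msg)
        self.size += len(msg)
        stale = time.monotonic() - self.last_flush >= self.flush_secs
        if self.size >= self.flush_bytes or stale:
            self.flush()

    def flush(self):
        if self.buf:
            self.sent.append(b"".join(self.buf))  # one large message out
        self.buf, self.size = [], 0
        self.last_flush = time.monotonic()

# Two 3- and 5-byte sends cross the 8-byte threshold and go out as one message.
ch = AggregatingChannel(flush_bytes=8, flush_secs=60)
ch.send(b"abc")
ch.send(b"defgh")
```

The latency cost is visible in the time threshold: a lone small message sits in the buffer until the timer fires, which is exactly the behavior the slide flags as bad for latency-critical work.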

19 Communication pattern


21 Software Routing
Pros: a single communication channel
– Scales with hosts
– Aggregates all messages
Cons: 2 hops (or more)

