Elixir : A System for Synthesizing Concurrent Graph Programs


1 Elixir : A System for Synthesizing Concurrent Graph Programs
Dimitrios Prountzos (1), Roman Manevich (2), Keshav Pingali (1)
(1) The University of Texas at Austin
(2) Ben-Gurion University of the Negev

2 Goal
Allow the programmer to easily implement correct and efficient parallel graph algorithms.
Graph algorithms are ubiquitous: social network analysis, computer graphics, machine learning, …
They are difficult to parallelize due to their irregular nature.
The best algorithm and implementation are usually platform dependent and input dependent, so it must be easy to experiment with different solutions.
Focus: fixed graph structure. Only the labels on nodes and edges change, and each activity touches a fixed number of nodes.

3 Example: Single-Source Shortest-Path
Problem formulation: compute the shortest distance from source node S to every other node.
Many algorithms: Bellman-Ford (1957), Dijkstra (1959), Chaotic relaxation (Miranker 1969), Delta-stepping (Meyer et al. 1998).
Common structure: each node has a label dist with the known shortest distance from S.
Key operation: relax-edge(u,v).
[Figure: example weighted graph with nodes A to G, illustrating relax-edge on edge (A,C)]
if dist(A) + W_AC < dist(C) then dist(C) = dist(A) + W_AC
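The relax-edge operation can be sketched in a few lines of Python. This is only an illustration: the function name `relax_edge`, the dictionary-based `dist` labels, and the edge weights are assumptions made for the example, not taken from the slides.

```python
import math

def relax_edge(dist, u, v, w):
    """Relax edge (u, v) with weight w; return True if dist[v] was lowered."""
    if dist[u] + w < dist[v]:
        dist[v] = dist[u] + w
        return True
    return False

# Hypothetical labels and weights: only the source S has a known distance.
dist = {"S": 0, "A": math.inf, "C": math.inf}
changed_sa = relax_edge(dist, "S", "A", 2)     # lowers dist["A"]
changed_ac = relax_edge(dist, "A", "C", 1)     # lowers dist["C"]
changed_again = relax_edge(dist, "A", "C", 1)  # guard no longer holds
```

Every algorithm listed on this slide applies this same operation; they differ only in the order in which edges are relaxed.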

4 Dijkstra’s Algorithm
Scheduling of relaxations:
Use a priority queue of nodes, ordered by label dist.
Iterate over nodes u in priority order.
On each step: relax all neighbors v of u, that is, apply relax-edge to all (u,v).
[Figure: example graph with the priority queue contents shown across animation steps; entries shown include <B,5>, <C,3>, <E,6>, <D,7>]
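A minimal Python sketch of this schedule, assuming non-negative edge weights and an adjacency-list graph. The graph below is hypothetical (the figure on the slide cannot be recovered from the transcript), and skipping stale queue entries is a common implementation choice rather than something stated on the slide.

```python
import heapq
import math

def dijkstra(graph, source):
    """Process nodes in priority order of dist, relaxing all outgoing edges."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0
    queue = [(0, source)]  # priority queue of (dist, node) pairs
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale entry: u was already reached with a smaller label
        for v, w in graph[u]:
            # relax-edge(u, v)
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                heapq.heappush(queue, (dist[v], v))
    return dist

# Hypothetical example graph: node -> list of (neighbor, weight).
example_graph = {
    "S": [("A", 2), ("B", 5)],
    "A": [("C", 1)],
    "B": [("C", 7)],
    "C": [("D", 4), ("E", 3)],
    "D": [],
    "E": [],
}
shortest = dijkstra(example_graph, "S")
```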

5 Chaotic Relaxation
Scheduling of relaxations:
Use an unordered set of edges.
Iterate over edges (u,v) in any order.
On each step: apply relax-edge to edge (u,v).
[Figure: example graph with an unordered worklist of edges, e.g. (C,D) (B,C) (S,A) (C,E)]
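The unordered schedule can be sketched as a worklist of edges drawn in arbitrary order. The sketch below is an assumption-laden illustration: the graph is hypothetical, a seeded random generator stands in for "any order", and re-adding the outgoing edges of a node whose label changed is one simple way to find newly active edges.

```python
import math
import random

def chaotic_relaxation(edges, source, rng):
    """Relax edges from an unordered worklist until no edge is active."""
    nodes = {n for u, v, _ in edges for n in (u, v)}
    dist = {n: math.inf for n in nodes}
    dist[source] = 0
    out_edges = {n: [] for n in nodes}
    for u, v, w in edges:
        out_edges[u].append((u, v, w))
    worklist = list(edges)
    while worklist:
        # Pick an arbitrary edge: swap a random element to the end and pop it.
        i = rng.randrange(len(worklist))
        worklist[i], worklist[-1] = worklist[-1], worklist[i]
        u, v, w = worklist.pop()
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w
            # Lowering dist[v] may activate the edges leaving v.
            worklist.extend(out_edges[v])
    return dist

# Hypothetical edge list: (source node, destination node, weight).
EDGES = [("S", "A", 2), ("S", "B", 5), ("A", "C", 1),
         ("B", "C", 7), ("C", "D", 4), ("C", "E", 3)]
results = [chaotic_relaxation(EDGES, "S", random.Random(seed)) for seed in range(5)]
```

With non-negative weights the final labels are the same for every processing order; only the amount of work differs.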

6 Insights Behind Elixir
A parallel graph algorithm has two parts:
Operators: what should be done.
Schedule: how it should be done (unordered/ordered algorithms).
The schedule orders activity processing (static schedule, dynamic schedule) and identifies new activities (operator delta).
Reference: “TAO of parallelism”, PLDI 2011.

7 Insights Behind Elixir
Dijkstra-style algorithm:
q = new PrQueue
q.enqueue(SRC)
while (! q.empty) {
  a = q.dequeue
  for each e = (a,b,w) {
    if dist(a) + w < dist(b) {
      dist(b) = dist(a) + w
      q.enqueue(b)
    }
  }
}
[Diagram: Parallel Graph Algorithm = Operators + Schedule; the schedule orders activity processing (static schedule, dynamic schedule) and identifies new activities]

8 Contributions
Language: operators/schedule separation. Allows exploration of the implementation space.
Operator delta inference: a precise delta is required for efficient fixpoint computations.
Automatic parallelization: inserts synchronization to execute operators atomically, avoids data races and deadlocks, and specializes the parallelization based on scheduling constraints.
[Diagram: Parallel Graph Algorithm = Operators + Schedule; the schedule orders activity processing (static schedule, dynamic schedule) and identifies new activities; synchronization]

9 SSSP in Elixir
Graph type:
Graph [ nodes(node : Node, dist : int)
        edges(src : Node, dst : Node, wt : int) ]
Operator:
relax = [ nodes(node a, dist ad)
          nodes(node b, dist bd)
          edges(src a, dst b, wt w)
          bd > ad + w ] ➔
        [ bd = ad + w ]
Fixpoint statement:
sssp = iterate relax ≫ schedule

10 Operators
The relax operator from slide 9 has three parts:
Redex pattern: nodes(node a, dist ad), nodes(node b, dist bd), edges(src a, dst b, wt w)
Guard: bd > ad + w
Update: bd = ad + w
[Figure: edge from a to b with weight w; labels (ad, bd) before and (ad, ad+w) after the update, which applies if bd > ad + w]

11 Fixpoint Statement
sssp = iterate relax ≫ schedule
iterate relax: apply the operator until a fixpoint is reached.
schedule: the scheduling expression.
(Graph type and relax operator as on slide 9.)
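The meaning of "apply operator until fixpoint" can be illustrated with a deliberately naive Python loop that sweeps over all edges until no guard holds. This is not how Elixir executes `iterate`; the function names, the sweep order, and the graph are assumptions for the example, and a real schedule would pick edges far more selectively.

```python
import math

def relax(dist, edge):
    """Operator: the guard bd > ad + w enables the update bd = ad + w."""
    a, b, w = edge
    if dist[b] > dist[a] + w:
        dist[b] = dist[a] + w
        return True
    return False

def iterate(operator, dist, edges):
    """Apply the operator to every edge repeatedly until nothing changes."""
    changed = True
    while changed:
        changed = False
        for edge in edges:
            if operator(dist, edge):
                changed = True
    return dist

# Hypothetical graph: (src, dst, wt) triples; only the source starts at 0.
edges = [("C", "E", 3), ("C", "D", 4), ("A", "C", 1),
         ("B", "C", 7), ("S", "B", 5), ("S", "A", 2)]
dist = {n: math.inf for n in "SABCDE"}
dist["S"] = 0
iterate(relax, dist, edges)
```

At the fixpoint the guard is false on every edge, which is exactly the condition that all dist labels are shortest distances.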

12 Scheduling Examples
sssp = iterate relax ≫ schedule
(Graph type and relax operator as on slide 9.)
Locality-enhanced label-correcting schedule: group b ≫ unroll 2 ≫ approx metric ad
Dijkstra-style schedule: metric ad ≫ group b
The Dijkstra-style schedule corresponds to the following loop:
q = new PrQueue
q.enqueue(SRC)
while (! q.empty) {
  a = q.dequeue
  for each e = (a,b,w) {
    if dist(a) + w < dist(b) {
      dist(b) = dist(a) + w
      q.enqueue(b)
    }
  }
}

13 Operator Delta Inference
[Diagram: Parallel Graph Algorithm = Operators + Schedule; the schedule orders activity processing (static schedule, dynamic schedule) and identifies new activities]

14 Identifying the Delta of an Operator
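[Figure: relax1 is applied to edge (a,b); question marks on the neighboring edges ask which of them may become active as a result]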

15 Delta Inference Example
[Figure: relax1 on edge (a,b) with weight w1; relax2 on edge (c,b) with weight w2]
Query program passed to the SMT solver:
assume (da + w1 < db)
assume ¬(dc + w2 < db)
db_post = da + w1
assert ¬(dc + w2 < db_post)
Result: (c,b) does not become active.

16 Delta Inference Example – Active
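[Figure: relax1 on edge (a,b) with weight w1; relax2 on edge (b,c) with weight w2]
Query program passed to the SMT solver:
assume (da + w1 < db)
assume ¬(db + w2 < dc)
db_post = da + w1
assert ¬(db_post + w2 < dc)
Result: the assertion can fail, so (b,c) may become active. Apply relax on all outgoing edges (b,c) such that dc > db + w2 and c ≠ a.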
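The two queries can be mimicked with a bounded brute-force search in Python. This is only a stand-in for the SMT solver: the solver reasons symbolically over all integers, whereas the sketch checks a small, arbitrarily chosen range of values, so a `None` result here is evidence rather than proof. The function name and the value range are assumptions of the example, and the c ≠ a side condition is not modeled.

```python
from itertools import product

VALUES = range(0, 6)  # small search domain, chosen only for illustration

def find_activation(neighbor_guard):
    """Look for labels where relax1 on (a,b) turns an inactive neighbor active.

    neighbor_guard(db, dc, w2) is the neighbor edge's guard, written as a
    function of b's label so it can be evaluated before and after relax1.
    """
    for da, db, dc, w1, w2 in product(VALUES, repeat=5):
        if not (da + w1 < db):          # assume: relax1 is enabled on (a,b)
            continue
        if neighbor_guard(db, dc, w2):  # assume: neighbor is inactive before
            continue
        db_post = da + w1               # effect of relax1
        if neighbor_guard(db_post, dc, w2):
            # The assertion is violated: the neighbor became active.
            return {"da": da, "db": db, "dc": dc, "w1": w1, "w2": w2}
    return None

# Incoming edge (c,b): guard dc + w2 < db.
incoming = find_activation(lambda db, dc, w2: dc + w2 < db)
# Outgoing edge (b,c): guard db + w2 < dc.
outgoing = find_activation(lambda db, dc, w2: db + w2 < dc)
```

The search finds no activation for the incoming edge, because lowering db can only make dc + w2 < db harder to satisfy, and it finds a witness for the outgoing edge, which is why the delta of relax consists of the outgoing edges of b.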

17 System Architecture
Algorithm spec → Elixir (synthesize code, insert synchronization) → C++ program → Galois/OpenMP parallel runtime.
The Galois/OpenMP parallel runtime provides a parallel thread pool, graph implementations, and worklist implementations.

18 Experiments
Explored dimensions:
Grouping: statically group multiple instances of an operator.
Unrolling: statically unroll operator applications by a factor K.
Dynamic scheduler: choose a different policy/implementation for the dynamic worklist.
...
Compare against hand-written parallel implementations.

19 SSSP Results
Group + Unroll improve locality.
[Figure: SSSP results by implementation variant]
Setup: 24-core Intel, 2 GHz. Input: USA Florida road network (1 M nodes, 2.7 M edges).

20 Breadth-First Search Results
[Figures: BFS results on two inputs]
Scale-free graph: 1 M nodes, 8 M edges.
USA road network: 24 M nodes, 58 M edges.

21 Conclusion
Graph algorithm = Operators + Schedule.
Elixir language: imperative operators + declarative schedule.
Allows exploring the implementation space.
Automated reasoning for efficiently computing fixpoints.
Correct-by-construction parallelization.
Performance competitive with hand-parallelized code.

22 Thank You!

23 Backup Slides

24 Related Work
DSL synthesis: SPIRAL [Püschel et al. IEEE’05], Pochoir [Tang et al. SPAA’11], Green-Marl [Hong et al. ASPLOS’12]
Synthesis from logical specifications: [Itzhaky et al. OOPSLA’10], [Srivastava et al. POPL’10]
Sketching [Solar-Lezama et al. PLDI’08], Paraglide [Vechev et al. PLDI’08]
Term and graph rewriting: PROGRES [Schürr’99], GrGen [Geiß’06], GP [Plump’09]
Finite differencing [Paige’82]

25 Read paper for…
Full scheduling language
Parallelizing ordered iterations
Automatic reasoning to enable level-parallel execution
Specialization of dynamic scheduler
Synchronization details
Synthesis procedures

26 Influence Patterns
[Figure: all influence patterns between an edge (a,b) and an edge (c,d), covering the node-overlap cases a=c, a=d, b=c, b=d and their combinations]

