Published by Victoria Trible. Modified over 2 years ago.

1
Bo Hong
Electrical and Computer Engineering Department, Drexel University
bohong@coe.drexel.edu
http://www.ece.drexel.edu/faculty/bohong

2
[Figure: flow network with source s, sink t, and intermediate vertices a, b, c, d; edge capacities 8, 6, 4, 5, 4, 6, 3, 7, 4]
Find: maximum flow from s to t
Subject to:
○ edge capacity constraints
○ zero net flow for u ∈ V − {s,t}

3
Sequential Algorithms
Augmenting Path
○ Ford-Fulkerson, pseudo-polynomial
○ Edmonds and Karp, O(|V|∙|E|^2)
○ Dinitz, O(|V|^2∙|E|)
Preflow Push
○ Karzanov, O(|V|^3)
Push-Relabel
○ Goldberg, O(|V|^2∙|E|); with dynamic trees, O(|V|∙|E|∙log(|V|^2/|E|))

Parallel Algorithms
○ Shiloach et al., O(|V|^2∙log|V|) with a |V|-processor PRAM
○ Goldberg, O(|V|^2∙log|V|) with a |V|-processor PRAM
○ Anderson et al., global relabeling
○ Bader et al., gap relabeling

4
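[Figure: the example network on vertices s, t, a, d, b, c, annotated with flow values]
Excess flow: the net flow into a vertex, e.g. e(c) = 5
[Figure: the example network with vertices drawn at their heights]
Every vertex has an integer-valued height, e.g. h(c) = 2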

5
[Figure: two copies of the example network on vertices s, t, a, d, b, c]

Lift: applicable when e(c) > 0 and, for all x, c_f(c,x) > 0 implies h(x) ≥ h(c)
Actions:
○ Lock v
○ v = lowest such vertex x
○ h(c) = h(v) + 1
○ Unlock v

Push: applicable when e(a) > 0 and there exists c_f(a,v) > 0 with h(v) = h(a) − 1
Actions:
○ Lock a and v
○ a->v still pushable?
○ d = min(e(a), c_f(a,v))
○ e(a) = e(a) − d
○ e(v) = e(v) + d
○ c_f(a,v) = c_f(a,v) − d
○ c_f(v,a) = c_f(v,a) + d
○ Unlock a and v

6
Locks protect shared accesses
○ P1: Lock; x ← x+1; Unlock
○ P2: Lock; x ← x+1; Unlock
[Figure: timeline of two threads, each performing Read x, Increase 1, Update x]

But locks are expensive
[Figure: lock acquisition time T (µs) versus number of processors (1 to 16), ideal and actual curves]

7
SMP computer with multiple processors sharing the memory
○ Multi-processor systems
○ Multi-core systems
Supports atomic 'fetch-and-add' instruction
Supports sequential consistency

Example:
○ P1 executes x ← x+c_1, …, x ← x+c_2
○ P2 executes x ← x+c_3, …, x ← x+c_4
Eventual result: x ← x+c_1+c_2+c_3+c_4, no matter how exactly the instructions were interleaved.
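The interleaving claim above can be checked on a small model. The sketch below is only an illustration: it treats each fetch-and-add as one indivisible step (it does not exercise real hardware atomics), the increment values are made up, and the helper names `interleavings` and `run` are hypothetical. It enumerates every schedule that preserves each processor's program order and collects the final values of x.

```python
from itertools import combinations

def interleavings(p1, p2):
    """Yield every merge of p1 and p2 that keeps each program's own order."""
    total = len(p1) + len(p2)
    for slots in combinations(range(total), len(p1)):
        chosen = set(slots)          # positions in the schedule taken by P1
        it1, it2 = iter(p1), iter(p2)
        yield [next(it1) if i in chosen else next(it2) for i in range(total)]

def run(schedule, x=0):
    """Apply the increments in schedule order; each step models one atomic fetch-and-add."""
    for c in schedule:
        x += c
    return x

P1 = [1, 2]      # hypothetical c_1, c_2
P2 = [10, 20]    # hypothetical c_3, c_4
finals = {run(s, x=5) for s in interleavings(P1, P2)}
print(finals)    # {38}: all six schedules end with the same x
```

The result is schedule-independent only because each increment is indivisible; if a read and the matching write could be separated by another processor's update, some schedules would lose increments.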

8
[Figure: two copies of the example network on vertices s, t, a, d, b, c]

Lift: applicable when e(c) > 0 and, for all x, c_f(c,x) > 0 implies h(x) ≥ h(c)
Actions:
○ v = lowest such vertex x
○ h(c) = h(v) + 1

Push: applicable when e(a) > 0 and there exists c_f(a,x) > 0 and h(x) […]

9
Initialize h(u), e(u), and f(u,v):
○ h(s) = |V|
○ h(u) = 0 for u ∈ V − {s}
○ f(s,u) = c(s,u)
○ e(u) = c(s,u)
○ f(u,v) = 0, otherwise

While there exist applicable push or lift operations:
○ execute the push or lift operations asynchronously

[Figure: the example network on vertices s, t, a, d, b, c]
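The initialization above might be sketched as follows. This is a minimal illustration, not the author's code: the function name `init_preflow`, the dictionary-of-edges representation, the use of residual capacities c_f in place of f, and the four-vertex example graph are all assumptions. Recording a negative excess at the source is also a bookkeeping choice made here, not something the slide specifies.

```python
def init_preflow(num_vertices, capacity, s):
    """Return (h, e, cf) after saturating every edge leaving the source s.

    capacity maps (u, v) -> c(u, v); cf holds residual capacities c_f(u, v).
    """
    h = [0] * num_vertices
    h[s] = num_vertices                  # h(s) = |V|, h(u) = 0 otherwise
    e = [0] * num_vertices
    cf = {}
    for (u, v), c in capacity.items():   # zero flow: c_f(u, v) = c(u, v)
        cf[(u, v)] = cf.get((u, v), 0) + c
        cf.setdefault((v, u), 0)
    for (u, v), c in capacity.items():
        if u == s:                       # f(s, v) = c(s, v)
            cf[(s, v)] -= c
            cf[(v, s)] += c
            e[v] += c                    # e(v) = c(s, v)
            e[s] -= c                    # assumption: track flow out of s as negative excess
    return h, e, cf

# Hypothetical example: 0 = s, 1 = a, 2 = b, 3 = t
capacity = {(0, 1): 3, (0, 2): 2, (1, 2): 1, (1, 3): 2, (2, 3): 3}
h, e, cf = init_preflow(4, capacity, s=0)
print(h, e)
```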

10
Each thread (P1, P2) runs the following loop for its vertex u:

while e(u) > 0
    e' = e(u)
    h' = ∞
    for each (u,v) s.t. c_f(u,v) > 0
        if h(v) < h'
            h' = h(v)
            v' = v
    if h(u) > h'
        d = min(e', c_f(u,v'))
        c_f(u,v') = c_f(u,v') − d
        c_f(v',u) = c_f(v',u) + d
        e(u) = e(u) − d
        e(v') = e(v') + d
    else
        h(u) = h' + 1
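To see the per-vertex loop run end to end, here is a single-threaded Python simulation. It is a sketch under several assumptions: the names `discharge` and `max_flow` and the example graph are hypothetical, vertices are processed one at a time instead of asynchronously, and the four updates that a threaded version would perform with atomic fetch-and-add are plain assignments here. The residual updates follow the push definition given earlier in the deck (c_f(u,v') decreases by d, c_f(v',u) increases by d).

```python
import math

def discharge(u, h, e, cf, adj):
    """Run the per-vertex loop for u until its excess is gone."""
    while e[u] > 0:
        e_prime = e[u]
        h_prime, v_prime = math.inf, None
        for v in adj[u]:                      # find the lowest residual neighbour
            if cf[(u, v)] > 0 and h[v] < h_prime:
                h_prime, v_prime = h[v], v
        if h[u] > h_prime:                    # push d units from u to v'
            d = min(e_prime, cf[(u, v_prime)])
            cf[(u, v_prime)] -= d             # would be atomic in a threaded version
            cf[(v_prime, u)] += d
            e[u] -= d
            e[v_prime] += d
        else:                                 # lift u just above its lowest neighbour
            h[u] = h_prime + 1

def max_flow(n, capacity, s, t):
    """Sequential simulation; capacity maps (u, v) -> c(u, v)."""
    adj = [[] for _ in range(n)]
    cf = {}
    for (u, v), c in capacity.items():
        for a, b in ((u, v), (v, u)):
            if (a, b) not in cf:
                cf[(a, b)] = 0
                adj[a].append(b)
        cf[(u, v)] += c
    h = [0] * n
    h[s] = n
    e = [0] * n
    for v in adj[s]:                          # saturate every edge out of s
        c = cf[(s, v)]
        cf[(s, v)] = 0
        cf[(v, s)] += c
        e[v] += c
        e[s] -= c
    active = True
    while active:                             # repeat until no vertex has excess
        active = False
        for u in range(n):
            if u not in (s, t) and e[u] > 0:
                discharge(u, h, e, cf, adj)
                active = True
    return e[t]

# Hypothetical example: 0 = s, 1 = a, 2 = b, 3 = t
capacity = {(0, 1): 3, (0, 2): 2, (1, 2): 1, (1, 3): 2, (2, 3): 3}
print(max_flow(4, capacity, s=0, t=3))        # 5
```

In the second test graph used below, `{(0, 1): 5, (1, 2): 2}`, vertex 1 cannot forward all of its excess, so the loop lifts it above the source and pushes the remaining 3 units back.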

11
[Figure: timeline of P1 and P2 executing the per-vertex loop concurrently; the steps of the two threads may interleave in either order]

12
As long as c_f(u,v) and e(u) are updated atomically, we always have h(u) ≤ h(v) + 1 for any c_f(u,v) > 0, no matter how the threads are interleaved.

13
If any e(u) > 0, then the algorithm will not terminate
○ Property of the push and lift operations

If the algorithm terminates, then there is no path from s to t in the residual graph
○ Proof by contradiction: if such a path exists, then the invariant property of function f has to be broken

If the algorithm terminates, it finds a maximum flow
○ Termination implies all e(u) = 0, meaning this is a feasible flow
○ There is no path from s to t, so by the max-flow min-cut theorem it has to be a maximum flow

14
For any u s.t. e(u) > 0, there exists a path from u to s in the residual graph
○ Property of network flow

The height of any vertex is less than 2|V| − 1
○ The longest path can have at most |V| vertices

The total number of lift operations is bounded by 2|V|^2 − |V|
○ Bounded by the height of vertices

The total number of saturated pushes is bounded by (2|V| − 1)∙|E|
○ Bounded by the total number of lift operations

The total number of unsaturated pushes is bounded by 4|V|^2∙|E|
○ Bounded by the number of lift and saturated pushes

Therefore the algorithm terminates with O(|V|^2∙|E|) operations
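For a sense of scale, the three bounds above can be evaluated for a concrete graph size. The helper below is just arithmetic on the stated formulas; the function name `operation_bounds` is hypothetical, and the sizes |V| = 6 and |E| = 9 are taken from the example network drawn earlier in the deck (six vertices, nine capacity labels).

```python
def operation_bounds(num_vertices, num_edges):
    """Return the stated upper bounds as (lifts, saturated pushes, unsaturated pushes)."""
    lifts = 2 * num_vertices ** 2 - num_vertices
    saturated = (2 * num_vertices - 1) * num_edges
    unsaturated = 4 * num_vertices ** 2 * num_edges
    return lifts, saturated, unsaturated

lifts, saturated, unsaturated = operation_bounds(6, 9)
print(lifts, saturated, unsaturated)       # 66 99 1296
print(lifts + saturated + unsaturated)     # 1461
```

The unsaturated-push term dominates, which is where the overall O(|V|^2∙|E|) figure comes from.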

15
The algorithm terminates when e(u) = 0 for all u ∈ V − {s,t}
○ e(u) = 0 at a single thread is insufficient to terminate the thread

An elegant solution:
○ The net flow out of source s decreases monotonically
○ The net flow into sink t increases monotonically
○ When the two values become equal, we must have e(u) = 0 for all u ∈ V − {s,t}, a necessary and sufficient termination condition
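The termination test described above might look like the following sketch. It assumes a bookkeeping convention that the slides do not spell out: the excess array stores the net flow out of the source as a negative value at index s, so the excesses always sum to zero. The function name `has_terminated` and the sample arrays are hypothetical.

```python
def has_terminated(e, s, t):
    """True when the net flow out of s equals the net flow into t.

    Assumes e[s] holds minus the net flow out of the source, so sum(e) == 0.
    Every other vertex has e[u] >= 0, so equality forces all of them to 0.
    """
    return -e[s] == e[t]

# Hypothetical snapshots for a 4-vertex network (0 = s, 3 = t)
print(has_terminated([-5, 3, 2, 0], s=0, t=3))   # False: vertices 1 and 2 still hold excess
print(has_terminated([-5, 0, 0, 5], s=0, t=3))   # True: everything that left s reached t
```

Only two values are inspected, so a thread does not need to scan every vertex to decide whether to stop.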

16
Execution results on a 2-way SMP with 3.2 GHz Intel Xeon processors
4-thread results obtained when hyper-threading was enabled
[Figure, two panels: "Comparison Against Classical Lock-Based Algorithm" and "Scalability of the Lock-Free Algorithm"]

17
Developed a lock-free multi-threaded algorithm for the max-flow problem
○ has the same complexity bound as existing parallel algorithms
○ eliminates lock usage, thereby improving thread-level parallelism
○ 20% improvement over existing lock-based parallel algorithms
Results indicate the effectiveness of the algorithmic method in reducing synchronization overheads

Future work
○ Load balancing across the threads: vertex-to-thread assignment, static, dynamic, or hybrid?
○ Optimize cache usage
○ Reduce the number of operations via global and gap relabeling
○ What if edge capacities are floating-point?
