
1 Some Tricks for XMT Programming with Reductions and Linear Recurrences
Jonathan Berry, Scalable Algorithms Department, Sandia National Laboratories
July 24, 2008
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.

2 Recall the PageRank and Community Detection Discussions
We saw that the way loops are parallelized is crucial:
– We scaled once the compiler merged the loops in our rank accumulation method and removed a reduction from the resulting single loop.
– The strong scaling stopped if this wasn’t accomplished.
The kernel of our facility location-based community detection approaches also requires the removal of reductions and the processing of linear recurrences.

3 Reductions: Code Carefully
Consider adding up the absolute values of integers. Attempt 1:

    int total = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] < 0) {
            total += -v[i];
        } else {
            total += v[i];
        }
    }

The compiler has trouble dealing with this branched loop body, and the reduction isn’t removed.

4 Reductions: Code Carefully
Consider adding up the absolute values of integers. Attempt 2:

    int total = 0;
    for (int i = 0; i < n; i++) {
        int incr = (v[i] < 0) * -v[i] + (v[i] >= 0) * v[i];
        total += incr;
    }

This loop body has no branches, and the reduction is removed correctly.
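As a quick check that a branch-free increment of this kind really computes an absolute value, the following sketch places the branched and branch-free formulations side by side in plain C. The function names `sum_abs_branched` and `sum_abs_branchless` are illustrative and not from the talk, and the accumulator is widened to `long` as a precaution of this sketch. Nothing here is XMT-specific; whether a particular compiler removes the reduction still has to be confirmed from that compiler's own feedback.

```c
#include <stddef.h>

/* Branched form: the if/else sits inside the accumulation loop. */
long sum_abs_branched(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0) {
            total += -(long)v[i];
        } else {
            total += v[i];
        }
    }
    return total;
}

/* Branch-free form: a comparison evaluates to 0 or 1 in C, so exactly
   one of the two products is nonzero for each element. */
long sum_abs_branchless(const int *v, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        long x = v[i];
        long incr = (x < 0) * -x + (x >= 0) * x;
        total += incr;
    }
    return total;
}
```

Both functions return the same total for any input; the only difference is the shape of the loop body presented to the compiler.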

5 Reductions: Code Carefully
Consider a conditional reduction. Attempt 1:

    int max = 0;
    for (int i = 0; i < n; i++) {
        if (mask[i] && v[i] > max) {
            max = v[i];
        }
    }

The compound conditional expression with short-circuit evaluation can turn off reduction removal (and has in my experience).

6 Reductions: Code Carefully
Consider a conditional reduction. Attempt 2:

    int max = 0;
    for (int i = 0; i < n; i++) {
        int candidate = mask[i] * v[i];
        if (candidate > max) {
            max = candidate;
        }
    }

This works! The reduction is removed from the loop.
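The precomputation trick for the conditional maximum can be sketched in plain C as follows. The function names are hypothetical, and the sketch assumes every `mask[i]` is exactly 0 or 1. Under that assumption the two forms agree, because `max` starts at 0 and never decreases, so a masked-out candidate of 0 can never replace it. A mask holding other nonzero values would scale the candidate and break the equivalence.

```c
#include <stddef.h>

/* Compound form: a short-circuit && guards the update. */
int masked_max_compound(const int *v, const int *mask, size_t n) {
    int max = 0;
    for (size_t i = 0; i < n; i++) {
        if (mask[i] && v[i] > max) {
            max = v[i];
        }
    }
    return max;
}

/* Precomputed form: fold the mask into the candidate first, leaving a
   single comparison in the conditional. Assumes mask[i] is 0 or 1. */
int masked_max_precomputed(const int *v, const int *mask, size_t n) {
    int max = 0;
    for (size_t i = 0; i < n; i++) {
        int candidate = mask[i] * v[i];
        if (candidate > max) {
            max = candidate;
        }
    }
    return max;
}
```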

7 Linear Recurrences
The compiler will generate efficient code to parallelize linear recurrences, but you must keep the structure simple. Suppose that you want to condition the additive term upon some test. Attempt 1:

    /* f[0] is assumed to be initialized before the loop */
    for (int i = 1; i < n; i++) {
        if (v[i] < 0) {
            f[i] = f[i-1] + -v[i];
        } else {
            f[i] = f[i-1] + v[i];
        }
    }

As with the branched reduction, the branches in this loop body keep it from matching the simple recurrence template.

8 Linear Recurrences
Suppose that you want to condition the additive term upon some test. Attempt 2:

    /* f[0] is assumed to be initialized before the loop */
    for (int i = 1; i < n; i++) {
        int incr = (v[i] < 0) * -v[i] + (v[i] >= 0) * v[i];
        f[i] = f[i-1] + incr;
    }

This works! The linear recurrence is parallelized. We saw some nastier examples in the discussion, but they reduce to the same rule: compute the increment, then match the simple template.
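For intuition about why a recurrence of the form f[i] = f[i-1] + incr can be parallelized at all, here is a sketch of a blocked prefix sum in plain C. It is a generic textbook technique, not a description of the code the XMT compiler actually generates. The names `abs_increment`, `prefix_serial`, `prefix_blocked`, `MAX_BLOCKS`, and the `blocks` parameter are all assumptions made for this illustration, and `f[0]` is taken to be set by the caller.

```c
#include <stddef.h>

#define MAX_BLOCKS 64  /* arbitrary cap chosen for this sketch */

/* Branch-free increment |x|. */
static long abs_increment(int x) {
    long y = x;
    return (y < 0) * -y + (y >= 0) * y;
}

/* Simple template: f[i] = f[i-1] + incr, with f[0] set by the caller. */
void prefix_serial(const int *v, long *f, size_t n) {
    for (size_t i = 1; i < n; i++) {
        f[i] = f[i - 1] + abs_increment(v[i]);
    }
}

/* Same result computed block by block. The outer loops of passes 1 and 3
   touch disjoint data, so their iterations are independent of one another. */
void prefix_blocked(const int *v, long *f, size_t n, size_t blocks) {
    if (n < 2) return;
    if (blocks < 1) blocks = 1;
    if (blocks > MAX_BLOCKS) blocks = MAX_BLOCKS;
    size_t len = (n - 1 + blocks - 1) / blocks;  /* block length, rounded up */
    long start[MAX_BLOCKS];

    /* Pass 1: total increment contributed by each block. */
    for (size_t b = 0; b < blocks; b++) {
        size_t lo = 1 + b * len;
        size_t hi = (lo + len < n) ? lo + len : n;
        long sum = 0;
        for (size_t i = lo; i < hi; i++) sum += abs_increment(v[i]);
        start[b] = sum;
    }

    /* Pass 2: short serial scan turning block totals into starting values. */
    long running = f[0];
    for (size_t b = 0; b < blocks; b++) {
        long block_total = start[b];
        start[b] = running;
        running += block_total;
    }

    /* Pass 3: each block fills its slice of f from its own starting value. */
    for (size_t b = 0; b < blocks; b++) {
        size_t lo = 1 + b * len;
        size_t hi = (lo + len < n) ? lo + len : n;
        long prev = start[b];
        for (size_t i = lo; i < hi; i++) {
            prev += abs_increment(v[i]);
            f[i] = prev;
        }
    }
}
```

The decomposition relies on each increment depending only on `v[i]`, never on `f[i-1]`, which is the property the "compute the increment" rule preserves.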

9 Acknowledgements Thanks to John Feo (Microsoft Research, formerly Cray) for suggesting the precomputation trick in the case of conditional reduction with compound boolean expressions.

