Introduction to CUDA Programming


1 Introduction to CUDA Programming
Scan Algorithm Explained. Andreas Moshovos, Winter 2009.

2 Reading
You are strongly encouraged to read the following, as it contains a more formal treatment of the algorithm, plus an overview of various applications of scan:
Guy E. Blelloch. “Prefix Sums and Their Applications”. In John H. Reif (Ed.), Synthesis of Parallel Algorithms, Morgan Kaufmann,

3 Two phases
Up-Sweep: essentially a reduction. Produces many partial results.
Down-Sweep: propagates the partial results to all relevant elements.

4 Up-Sweep: just a reduction
Input:        1 2 2 5 6 3 8 2 4 1 5 2 7 9 3 5
After step 1: 1 3 2 7 6 9 8 10 4 5 5 7 7 16 3 8
After step 2: 1 3 2 10 6 9 8 19 4 5 5 12 7 16 3 24
After step 3: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 36
After step 4: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 65
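The trace above can be reproduced with a short sequential sketch. It is not CUDA code: the inner loop stands in for the threads that would run concurrently at each step, and the function name `up_sweep` and the power-of-two length are assumptions made for illustration.

```python
def up_sweep(values):
    """Sequential simulation of the up-sweep (reduction) phase.

    Assumes len(values) is a power of two. Returns the array contents
    after every step, starting with the untouched input.
    """
    a = list(values)
    n = len(a)
    trace = [list(a)]
    stride = 1
    while stride < n:
        # Each iteration of this inner loop is independent of the others,
        # so on a GPU each one could be handled by a separate thread.
        for right in range(2 * stride - 1, n, 2 * stride):
            a[right] += a[right - stride]
        trace.append(list(a))
        stride *= 2
    return trace


data = [1, 2, 2, 5, 6, 3, 8, 2, 4, 1, 5, 2, 7, 9, 3, 5]
for step, row in enumerate(up_sweep(data)):
    print(step, row)
```

Row 0 is the input and row 4 is the array after the last step, with the total (65) in the last slot.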

5 Up-Sweep: now let’s see this as a tree
Leaves:  1 2 2 5 6 3 8 2 4 1 5 2 7 9 3 5
Level 1: 3 7 9 10 5 7 16 8
Level 2: 10 19 12 24
Level 3: 29 36
Root:    65
Notice that only some of these nodes are left in our array; the rest were partial results that have been overwritten:
1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 65

6 Up-Sweep
So, this is what’s left. Nodes without values don’t exist anymore; they were partial results.
Leaves:  1 2 6 8 4 5 7 3
Level 1: 3 9 5 16
Level 2: 10 12
Level 3: 29
Root:    65
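One way to see which tree node each array slot holds after the up-sweep: slot i stores the sum of a block of inputs ending at i, whose length is 2^k, where k is the number of trailing 1 bits in i. The sketch below checks this against the example; the helper name `covered_block` is hypothetical and not part of the original slides.

```python
def covered_block(i):
    """Return (first, last) input indices summed into slot i after the up-sweep.

    The block length is 2**k, where k is the number of trailing 1 bits of i.
    """
    length = 1
    while i & length:
        length <<= 1
    return i - length + 1, i


data = [1, 2, 2, 5, 6, 3, 8, 2, 4, 1, 5, 2, 7, 9, 3, 5]
after_up_sweep = [1, 3, 2, 10, 6, 9, 8, 29, 4, 5, 5, 12, 7, 16, 3, 65]
for i, value in enumerate(after_up_sweep):
    first, last = covered_block(i)
    # Every slot must equal the sum of the input block it covers.
    assert value == sum(data[first:last + 1])
    print(i, (first, last), value)
```

Even slots cover only themselves (the surviving leaves), while slot 7 covers inputs 0 to 7 and slot 15 covers the whole array.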

7 Down-Sweep
For the second phase we need to think of the edges in reverse, and of the empty nodes as placeholders for partial results.
(Figure: the tree of remaining nodes from slide 6.)

8 Down-Sweep
Now let’s view the tree as a collection of subtrees.
The root of each subtree, where it’s still present, contains the reduction of all subtree elements, i.e., the sum of all subtree elements.
(Figure: the tree of remaining nodes from slide 6.)

9 Down-Sweep
Let’s focus on the rightmost subtree.
(Figure: the tree of remaining nodes from slide 6.)

10 Down-Sweep
Before the last step of the down-sweep phase, the yellow element will contain the sum (57) of all elements to the left of the subtree.
Subtree nodes: 3 57
The last step will take the following two actions:
3 + 57 = 60 goes on the rightmost element. This is the sum of all elements including 3 but excluding the rightmost one.
3 is overwritten with 57. This is the sum of all elements left of 3.

11 Down-Sweep
In terms of the array stored in memory, the aforementioned actions look like this:
Before:  3 57
After:  57 60
In the figure, the dark arrows represent addition and the red dotted arrow represents a move.
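The two actions of a single down-sweep step (one move, one addition) can be sketched as a small in-place helper. The name `down_sweep_step` and its index parameters are assumptions for illustration; the point is that the left value must be saved before it is overwritten.

```python
def down_sweep_step(a, left, right):
    """Apply one down-sweep step to slots `left` and `right`, in place."""
    left_sum = a[left]      # reduction of the left child, saved before the move
    a[left] = a[right]      # move: the left child inherits the root's value
    a[right] += left_sum    # addition: the right child also accounts for the left child


pair = [3, 57]
down_sweep_step(pair, 0, 1)
print(pair)  # [57, 60]
```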

12 Down-Sweep
Let’s now focus on the rightmost subtree that contains the last four nodes. It is processed at the step before the subtree we just discussed.
Subtree nodes: 7 3 (leaves), 16

13 Down-Sweep
Before the second-to-last step of the down-sweep phase, the green element will contain the sum (41) of all elements to the left of the subtree.
Subtree nodes: 7 3 (leaves), 16, 41 (root)

14 Down-Sweep
The actions that will be taken at this step are:
16 + 41 = 57 will be written as the root of the rightmost subtree. As we saw before, this is the sum of all elements left of the rightmost subtree.
41 will replace 16. This is the sum of all elements left of the subtree rooted by 16.
Subtree nodes after this step: 7 3 (leaves), 41, 57

15 Down-Sweep
In terms of the array stored in memory, the aforementioned actions look like this:
Before: 7 16 3 41
After:  7 41 3 57
In the figure, the dark arrows represent addition and the red dotted arrow represents a move.

16 Down-Sweep
Now let’s go a step back and look at the complete right subtree (in green).
Subtree nodes: 4 5 7 3 (leaves), 5 16, 12

17 Down-Sweep
Before this step, the root node will contain the sum (29) of all elements of the left subtree.
Subtree nodes: 4 5 7 3 (leaves), 5 16, 12, 29 (root)

18 Down-Sweep
As before, we’ll do two things:
29 + 12 = 41, and this becomes the root of the rightmost subtree. This should be the sum of all elements to the left of that subtree for the next step (which we saw previously).
29 replaces 12, for the same reason: 29 is the sum of all elements left of the subtree rooted by what was 12.
Subtree nodes after this step: 4 5 7 3 (leaves), 5 16, 29, 41
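The subtree view lends itself to a recursive sketch: handle the root of a subtree, then treat its two halves as smaller subtrees. This is only an illustration of the reasoning on the slides (a GPU version would process all subtrees of a level at once); the function name and the inclusive `lo`/`hi` bounds are assumptions.

```python
def down_sweep_subtree(a, lo, hi):
    """Down-sweep the subtree stored in a[lo..hi] (inclusive), in place.

    Precondition: a[hi], the subtree root, holds the sum of everything to
    the left of slot lo, and the other slots hold the partial sums left by
    the up-sweep. Assumes hi - lo + 1 is a power of two.
    """
    if lo == hi:
        return
    mid = (lo + hi) // 2    # slot holding the root of the left subtree
    left_sum = a[mid]
    a[mid] = a[hi]          # left subtree has the same elements to its left
    a[hi] += left_sum       # right subtree additionally has the left subtree to its left
    down_sweep_subtree(a, lo, mid)
    down_sweep_subtree(a, mid + 1, hi)


# Right half of the example array, with 29 already placed in its root.
right_half = [4, 5, 5, 12, 7, 16, 3, 29]
down_sweep_subtree(right_half, 0, 7)
print(right_half)  # [29, 33, 34, 39, 41, 48, 57, 60]
```

The first call performs the step described above (12 becomes 29, the root becomes 41); the recursive calls then reproduce the steps for the four-node and two-node subtrees.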

19 Down-Sweep
Let’s try to generalize what happens at every step of the down-sweep phase.
Let’s look at step 1: there is only one subtree, shown in purple.
(Figure: the tree of remaining nodes from slide 6.)

20 Down-Sweep
Before we process this tree as described before, the root node must contain the sum of all elements to the left of the tree.
There are no such elements, hence the root must be 0.
(Figure: the tree from slide 6 with the root value 65 removed.)

21 Down-Sweep
Now repeat the steps we saw before:
0 + 29 = 29, and this becomes the root of the right subtree.
29 gets replaced by 0.

22 Down-Sweep
In terms of the array stored in memory, the aforementioned actions look like this:
Before: 1 3 2 10 6 9 8 29 4 5 5 12 7 16 3 0
After:  1 3 2 10 6 9 8 0 4 5 5 12 7 16 3 29
In the figure, the dark arrows represent addition and the red dotted arrow represents a move.
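Putting both phases together gives an exclusive scan. The following is a minimal sequential sketch under the assumption that the input length is a power of two; the name `exclusive_scan` is illustrative, and each inner loop stands in for the threads that would run in parallel at that step.

```python
from itertools import accumulate


def exclusive_scan(values):
    """Exclusive prefix sum via up-sweep and down-sweep (sequential simulation)."""
    a = list(values)
    n = len(a)
    if n == 0:
        return a
    if n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two length")

    # Up-sweep: a reduction that leaves partial sums in the array.
    stride = 1
    while stride < n:
        for right in range(2 * stride - 1, n, 2 * stride):
            a[right] += a[right - stride]
        stride *= 2

    # Nothing lies to the left of the whole tree, so its root becomes 0.
    a[n - 1] = 0

    # Down-sweep: at every subtree root, one move and one addition.
    stride = n // 2
    while stride >= 1:
        for right in range(2 * stride - 1, n, 2 * stride):
            left = right - stride
            a[left], a[right] = a[right], a[left] + a[right]
        stride //= 2
    return a


data = [1, 2, 2, 5, 6, 3, 8, 2, 4, 1, 5, 2, 7, 9, 3, 5]
result = exclusive_scan(data)
print(result)
# Cross-check against a straightforward sequential prefix sum.
assert result == [0] + list(accumulate(data))[:-1]
```

For the example input, the last two slots of the result are 57 and 60, matching the final step discussed earlier.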

