Data Parallel Pattern 6c.1


1 Data Parallel Pattern 6c.1
ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson, Oct 22, 2012

2 Data Parallel Computations
Same operation performed on different data elements simultaneously, i.e., in parallel. Fully synchronous: all processes operate in synchronism. Particularly convenient because:
• Ease of programming (essentially only one program).
• Can scale easily to larger problem sizes.
• Many numeric and some non-numeric problems can be cast in a data parallel form.
Used in vector supercomputer designs in the 1970s. Versions appear in Intel processors as the SSE extensions. Currently used as the basis of GPU operation, see later.

3 Example
To add the same constant to each element of an array:

for (i = 0; i < n; i++)
   a[i] = a[i] + k;

The statement a[i] = a[i] + k; could be executed simultaneously by multiple processors, each using a different index i (0 <= i < n). Vector supercomputers were designed to operate this way, with the single instruction multiple data (SIMD) model.

4 Using forall construct for data parallel pattern
Could use forall to specify data parallel operations:

forall (i = 0; i < n; i++)
   a[i] = a[i] + k;

However, forall is more general – it states that the n instances of the body can be executed simultaneously or in any order (not necessarily at the same time). We shall see that a GPU implementation of data parallel patterns does not necessarily allow all instances to execute at the same time. Note forall does imply synchronism at its end – all instances must complete before continuing, which will be true in GPUs.

5 Data Parallel Example Prefix Sum Problem
Given a list of numbers, x0, x1, …, xn-1, compute all the partial summations, i.e.:
x0 + x1
x0 + x1 + x2
x0 + x1 + x2 + x3
x0 + x1 + x2 + x3 + x4
…
Can also be defined with associative operations other than addition. Widely studied, with practical applications in areas such as processor allocation, data compaction, sorting, and polynomial evaluation.

6 Data parallel method for prefix sum operation
[Figure: data parallel prefix sum steps]

7 Parallel code using forall notation
Sequential code:

for (j = 0; j < log(n); j++)       // at each step
   for (i = 2^j; i < n; i++)       // accumulate sum
      x[i] = x[i] + x[i - 2^j];

Parallel code using forall notation:

for (j = 0; j < log(n); j++)       // at each step
   forall (i = 0; i < n; i++)      // accumulate sum
      if (i >= 2^j)
         x[i] = x[i] + x[i - 2^j];

8 Matrix Multiplication
Easy to make a data parallel version – change the for's to forall's:

forall (i = 0; i < n; i++)          // for each row of A
   forall (j = 0; j < n; j++) {     // for each column of B
      c[i][j] = 0;
      for (k = 0; k < n; k++)
         c[i][j] = c[i][j] + a[i][k] * b[k][j];
   }

Here the data parallel definition is extended to multiple sequential operations on data items – each instance of the body is a separate thread, and within each instance the operations are executed in sequential order.

9 We will explore the data parallel pattern using GPUs for high performance computing, see next.
Questions so far?

