1 5.1 Pipelined Computations. 2 Problem divided into a series of tasks that have to be completed one after the other (the basis of sequential programming).

Slides:



Advertisements
Similar presentations
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Advertisements

Lecture 19: Parallel Algorithms
Linear Algebra Applications in Matlab ME 303. Special Characters and Matlab Functions.
1 5.1 Pipelined Computations. 2 Problem divided into a series of tasks that have to be completed one after the other (the basis of sequential programming).
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
8 Algorithms Foundations of Computer Science ã Cengage Learning.
Numerical Algorithms ITCS 4/5145 Parallel Computing UNC-Charlotte, B. Wilkinson, 2009.
1 Parallel Algorithms II Topics: matrix and graph algorithms.
SOLVING SYSTEMS OF LINEAR EQUATIONS. Overview A matrix consists of a rectangular array of elements represented by a single symbol (example: [A]). An individual.
Chapter 2 Matrices Finite Mathematics & Its Applications, 11/e by Goldstein/Schneider/Siegel Copyright © 2014 Pearson Education, Inc.
Numerical Algorithms Matrix multiplication
Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers Chapter 11: Numerical Algorithms Sec 11.2: Implementing.
Lesson 8 Gauss Jordan Elimination
Numerical Algorithms • Matrix multiplication
Goldstein/Schnieder/Lay: Finite Math & Its Applications, 9e 1 of 86 Chapter 2 Matrices.
1 Lecture 25: Parallel Algorithms II Topics: matrix, graph, and sort algorithms Tuesday presentations:  Each group: 10 minutes  Describe the problem,
Lecture 21: Parallel Algorithms
Computer Science 1620 Multi-Dimensional Arrays. we used arrays to store a set of data of the same type e.g. store the assignment grades for a particular.
Pipelined Computations Divide a problem into a series of tasks A processor completes a task sequentially and pipes the results to the next processor Pipelining.
CS 584. Dense Matrix Algorithms There are two types of Matrices Dense (Full) Sparse We will consider matrices that are Dense Square.
COMPE575 Parallel & Cluster Computing 5.1 Pipelined Computations Chapter 5.
Advanced Topics in Algorithms and Data Structures 1 Two parallel list ranking algorithms An O (log n ) time and O ( n log n ) work list ranking algorithm.
Finite Mathematics & Its Applications, 10/e by Goldstein/Schneider/SiegelCopyright © 2010 Pearson Education, Inc. 1 of 86 Chapter 2 Matrices.
Section 8.4 Insertion Sort CS Insertion Sort  Another quadratic sort, insertion sort, is based on the technique used by card players to arrange.
Ch. 8 & 9 – Linear Sorting and Order Statistics What do you trade for speed?
1 Calculating Polynomials We will use a generic polynomial form of: where the coefficient values are known constants The value of x will be the input and.
Exercise problems for students taking the Programming Parallel Computers course. Janusz Kowalik Piotr Arlukowicz Tadeusz Puzniakowski Informatics Institute.
HOW TO SOLVE IT? Algorithms. An Algorithm An algorithm is any well-defined (computational) procedure that takes some value, or set of values, as input.
Linear Systems Gaussian Elimination CSE 541 Roger Crawfis.
Chapter 3: The Fundamentals: Algorithms, the Integers, and Matrices
Array Processing Simple Program Design Third Edition A Step-by-Step Approach 7.
CS 584 l Assignment. Systems of Linear Equations l A linear equation in n variables has the form l A set of linear equations is called a system. l A solution.
Copyright © 2007 Pearson Addison-Wesley. All rights reserved. A. Levitin “ Introduction to the Design & Analysis of Algorithms, ” 2 nd ed., Ch. 1 Chapter.
ΑΡΙΘΜΗΤΙΚΕΣ ΜΕΘΟΔΟΙ ΜΟΝΤΕΛΟΠΟΙΗΣΗΣ 4. Αριθμητική Επίλυση Συστημάτων Γραμμικών Εξισώσεων Gaussian elimination Gauss - Jordan 1.
Slides prepared by Rose Williams, Binghamton University ICS201 Lecture 19 : Recursion King Fahd University of Petroleum & Minerals College of Computer.
Pipelined Computations P0P1P2P3P5. Pipelined Computations a s in s out a[0] a s in s out a[1] a s in s out a[2] a s in s out a[3] a s in s out a[4] for(i=0.
CSCI-455/552 Introduction to High Performance Computing Lecture 13.
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
CS 361 – Chapters 8-9 Sorting algorithms –Selection, insertion, bubble, “swap” –Merge, quick, stooge –Counting, bucket, radix How to select the n-th largest/smallest.
 2008 Pearson Education, Inc. All rights reserved JavaScript: Control Statements I.
Chun-Yuan Lin OpenMP-Programming training-5. “Type 2” Pipeline Space-Time Diagram.
Decomposition Data Decomposition – Dividing the data into subgroups and assigning each piece to different processors – Example: Embarrassingly parallel.
UNIT-I INTRODUCTION ANALYSIS AND DESIGN OF ALGORITHMS CHAPTER 1:
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers 2nd ed., by B. Wilkinson & M
Dense Linear Algebra Sathish Vadhiyar. Gaussian Elimination - Review Version 1 for each column i zero it out below the diagonal by adding multiples of.
Data Structures Using C++
M180: Data Structures & Algorithms in Java Sorting Algorithms Arab Open University 1.
8.1 8 Algorithms Foundations of Computer Science  Cengage Learning.
1. Searching The basic characteristics of any searching algorithm is that searching should be efficient, it should have less number of computations involved.
Chapter 9: Sorting1 Sorting & Searching Ch. # 9. Chapter 9: Sorting2 Chapter Outline  What is sorting and complexity of sorting  Different types of.
chap10 Chapter 10 Recursion chap10 2 Recursive Function recursive function The recursive function is a kind of function that calls.
Numerical Algorithms Chapter 11.
Prof. U V THETE Dept. of Computer Science YMA
Linear Equations.
Pipelined Computations
CS4230 Parallel Programming Lecture 12: More Task Parallelism Mary Hall October 4, /04/2012 CS4230.
Lecture 22: Parallel Algorithms
Decomposition Data Decomposition Functional Decomposition
Numerical Algorithms • Parallelizing matrix multiplication
Pipelined Computations
Introduction to High Performance Computing Lecture 12
Pipeline Pattern ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson, 2012 slides5.ppt Oct 24, 2013.
Pipelined Pattern This pattern is implemented in Seeds, see
Pipeline Pattern ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson, 2012 slides5.ppt March 20, 2014.
Pipeline Pattern ITCS 4/5145 Parallel Computing, UNC-Charlotte, B. Wilkinson slides5.ppt August 17, 2014.
Dense Linear Algebra (Data Distributions)
Data Parallel Pattern 6c.1
Presentation transcript:

1 5.1 Pipelined Computations

2 Problem divided into a series of tasks that have to be completed one after the other (the basis of sequential programming). Each task executed by a separate process or processor.

3 Example Add all the elements of array a to an accumulating sum: for (i = 0; i < n; i++) sum = sum + a[i]; The loop could be “unfolded” to yield sum = sum + a[0]; sum = sum + a[1]; sum = sum + a[2]; sum = sum + a[3]; sum = sum + a[4];.

4 Pipeline for an unfolded loop

5 Another Example Frequency filter - Objective to remove specific frequencies (f 0, f 1, f 2, f 3, etc.) from a digitized signal, f(t). Signal enters pipeline from left:

6 Where pipelining can be used to good effect Assuming problem can be divided into a series of sequential tasks, pipelined approach can provide increased execution speed under the following three types of computations: 1. If more than one instance of the complete problem is to be Executed 2. If a series of data items must be processed, each requiring multiple operations 3. If information to start next process can be passed forward before process has completed all its internal operations

Six Stage Instruction Pipeline 7

Instruction Cycle State Diagram 8

Timing Diagram for Instruction Pipeline Operation 9

10 “Type 1” Pipeline Space-Time Diagram

11 Alternative space-time diagram

12 “Type 2” Pipeline Space-Time Diagram

13 “Type 3” Pipeline Space-Time Diagram Pipeline processing where information passes to next stage before previous state completed.

14 If the number of stages is larger than the number of processors in any pipeline, a group of stages can be assigned to each processor:

15 Computing Platform for Pipelined Applications Multiprocessor system with a line configuration

16 Example Pipelined Solutions (Examples of each type of computation)

17 Pipeline Program Examples Adding Numbers Type 1 pipeline computation

18 Basic code for process Pi : recv(&accumulation, P i-1 ); accumulation = accumulation + number; send(&accumulation, P i+1 ); except for the first process, P 0, which is send(&number, P 1 ); and the last process, P n-1, which is recv(&number, P n-2 ); accumulation = accumulation + number;

19 SPMD program if (process > 0) { recv(&accumulation, P i-1 ); accumulation = accumulation + number; } if (process < n-1) send(&accumulation, P i+1 ); The final result is in the last process. Instead of addition, other arithmetic operations could be done.

20 Pipelined addition numbers Master process and ring configuration

Sorting Numbers Insertion Sort Based on the technique used by card players to arrange a hand of cards – Player keeps the cards that have been picked up so far in sorted order – When the player picks up a new card, he makes room for the new card and then inserts it in its proper place 21

Sorting Numbers Insertion Sort Algorithm For each array element from the second to the last (nextPos = 1) – Insert the element at nextPos where it belongs in the array, increasing the length of the sorted subarray by 1 22

23 Sorting Numbers A parallel version of insertion sort.

24 Pipeline for sorting using insertion sort Type 2 pipeline computation

25 The basic algorithm for process P i is recv(&number, P i-1 ); if (number > x) { send(&x, P i+1 ); x = number; } else send(&number, P i+1 ); With n numbers, number ith process is to accept = n - i. Number of passes onward = n - i - 1 Hence, a simple loop could be used.

26 Insertion sort with results returned to master process using bidirectional line configuration

27 Insertion sort with results returned

Prime Number Generation Sieve of Eratosthenes Next_Prime = 2 ====> Mark multiples of 2 as non-prime Next_Prime = 3 ====> Mark multiples of 3 as non-prime.

29 Prime Number Generation Sieve of Eratosthenes Series of all integers generated from 2. First number, 2, is prime and kept. All multiples of this number deleted as they cannot be prime. Process repeated with each remaining number. The algorithm removes non-primes, leaving only primes. Type 2 pipeline computation

30 The code for a process, P i, could be based upon recv(&x, P i-1 ); /* repeat following for each number */ recv(&number, P i-1 ); if ((number % x) != 0) send(&number, P i+1 ); Each process will not receive the same number of numbers and is not known beforehand. Use a “terminator” message, which is sent at the end of the sequence: recv(&x, P i-1 ); for (i = 0; i < n; i++) { recv(&number, P i-1 ); If (number == terminator) break; If((number % x) != 0) send(&number, P i+1 ); }

31 Gaussian Elimination (GE) for solving Ax=b … for each column i … zero it out below the diagonal by adding multiples of row i to later rows for i = 1 to n-1 … for each row j below row i for j = i+1 to n … add a multiple of row i to row j tmp = A(j,i); for k = i to n A(j,k) = A(j,k) - (tmp/A(i,i)) * A(i,k) After i=1After i=2After i=3After i=n-1 …

32 Solving a System of Linear Equations Upper-triangular form where a’s and b’s are constants and x’s are unknowns to be found.

33 Back Substitution First, unknown x 0 is found from last equation; i.e., Value obtained for x 0 substituted into next equation to obtain x 1 ; i.e., Values obtained for x 1 and x 0 substituted into next equation to obtain x 2 : and so on until all the unknowns are found.

34 Pipeline Solution First pipeline stage computes x 0 and passes x 0 onto the second stage, which computes x 1 from x 0 and passes both x 0 and x 1 onto the next stage, which computes x 2 from x 0 and x 1, and so on. Type 3 pipeline computation

35 The ith process (0 < i < n) receives the values x 0, x 1, x 2, …, x i-1 and computes x i from the equation:

36 Sequential Code Given constants a i,j and b k stored in arrays a[ ][ ] and b[ ], respectively, and values for unknowns to be stored in array, x[ ], sequential code could be x[0] = b[0]/a[0][0]; /* computed separately */ for (i = 1; i < n; i++) { /*for remaining unknowns*/ sum = 0; For (j = 0; j < i; j++) sum = sum + a[i][j]*x[j]; x[i] = (b[i] - sum)/a[i][i]; }

37 Parallel Code Pseudo-code of process P i (1 < i < n) of could be for (j = 0; j < i; j++) { recv(&x[j], P i-1 ); send(&x[j], P i+1 ); } sum = 0; for (j = 0; j < i; j++) sum = sum + a[i][j]*x[j]; x[i] = (b[i] - sum)/a[i][i]; send(&x[i], P i+1 ); Now have additional computations to do after receiving and resending values.

38 Pipeline processing using back substitution