Simple Parallel Algorithms (Fall 2008)



Scalar Product of Two Vectors
Let $a = (a_1, a_2, \ldots, a_n)$ and $b = (b_1, b_2, \ldots, b_n)$ be two vectors. The scalar product of the two vectors is given by
$a \cdot b = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$.
It can be computed in O(log n) time with O(n) PEs in the EREW PRAM model. Using the divide-and-conquer approach, the problem can be solved in O(log n) time using O(n/log n) processors.

Algorithm of Scalar Product
Algorithm Scalar-Product
Input: Arrays a[1:n] and b[1:n].
Output: The value of the scalar product, stored in c[1].
BEGIN
  For i = 1 to n do in parallel
    c[i] = a[i] × b[i];
  End-parallel
  p = n / 2;
  While p > 0 do
    For i = 1 to p do in parallel
      c[i] = c[i] + c[i+p];
    End-parallel
    p = ⌊p/2⌋;
  End-While
END.
As written, the halving loop assumes that n is a power of 2; for other values of n some products are never added into c[1].
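The reduction above can be simulated sequentially to check the number of parallel steps. The following is a minimal sketch, not the slides' own code: the function name `parallel_scalar_product` is hypothetical, each inner `for` loop stands in for one "do in parallel" step, and the halving uses a ceiling so that the sketch also works when n is not a power of 2 (an assumption that departs slightly from the pseudocode).

```python
def parallel_scalar_product(a, b):
    """Simulate the tree reduction; return (scalar product, parallel steps)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Step 1: all n products are formed in a single parallel step.
    c = [x * y for x, y in zip(a, b)]
    steps = 1
    size = len(c)
    while size > 1:
        half = (size + 1) // 2  # ceiling, so odd sizes lose no element
        # One parallel step: PE i adds c[i + half] into c[i].
        # Reads touch indices >= half and writes touch indices < half,
        # so no cell is accessed by two PEs (exclusive read and write).
        for i in range(size - half):
            c[i] += c[i + half]
        size = half
        steps += 1
    return (c[0] if c else 0), steps


if __name__ == "__main__":
    value, steps = parallel_scalar_product([1, 2, 3], [4, 5, 6])
    print(value, steps)
```

The step count grows as 1 + ⌈log2 n⌉, which matches the O(log n) bound stated for O(n) PEs.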

Matrix Multiplication
If A is a matrix of order m × n and B is a matrix of order n × p, the product C = A × B is defined and is of order m × p. The entry in the ith row and jth column of C, C(i, j), is the scalar product of the ith row of A and the jth column of B. That is,
$C(i, j) = \sum_{k=1}^{n} A(i, k)\, B(k, j)$.

Algorithm of Matrix Multiplication
Algorithm Matrix-Multiply
Input: The matrices A and B.
Output: The product matrix C.
BEGIN
  For i = 1 to m do in parallel
    For j = 1 to p do in parallel
      Evaluate C(i, j);
  End-parallel
END.

Evaluate C(i, j)
Begin
  For k = 1 to n do in parallel
    T(k) = A(i, k) × B(k, j);
  End-parallel
  r = n / 2;
  While r > 0 do
    For k = 1 to r do in parallel
      T(k) = T(2k-1) + T(2k);
    End-parallel
    r = ⌊r/2⌋;
  End-While
  C(i, j) = T(1);
End
Within each parallel step all PEs read their operands before any PE writes. As in the scalar product, the halving loop assumes that n is a power of 2.

Complexity Analysis
The algorithm runs in O(log n) time using O(mpn) processors. In particular, when A and B are square matrices, it runs in O(log n) time using O(n^3) processors. With the divide-and-conquer approach, the problem can be solved in O(log n) time using O(n^3/log n) processors. The algorithm needs the CREW PRAM model.
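A sequential simulation may help to see why every entry C(i, j) needs only about log n rounds. This is a sketch under stated assumptions: the names `evaluate_entry` and `matrix_multiply` are hypothetical, matrices are lists of rows, each list comprehension stands in for one parallel step, and odd-length rounds are padded with a zero (an addition that is not in the pseudocode).

```python
def evaluate_entry(A, B, i, j):
    """Compute C(i, j) by a pairwise (tree) summation of the n products."""
    # One parallel step: T(k) = A(i, k) * B(k, j) for all k.
    T = [A[i][k] * B[k][j] for k in range(len(B))]
    if not T:
        return 0
    while len(T) > 1:
        if len(T) % 2:
            T.append(0)  # pad so that every element has a partner
        # One parallel step: a new list models "read everything, then write".
        T = [T[2 * k] + T[2 * k + 1] for k in range(len(T) // 2)]
    return T[0]


def matrix_multiply(A, B):
    """All m * p entries are independent, so they could run in parallel."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("inner dimensions do not match")
    p = len(B[0]) if B else 0
    return [[evaluate_entry(A, B, i, j) for j in range(p)]
            for i in range(len(A))]


if __name__ == "__main__":
    print(matrix_multiply([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]]))
```

Row i of A is read by all p evaluations of that row of C at the same time, which is one way to see why concurrent reads are needed.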

Partial Sums
Let A(1:n) be an array of numbers. The partial sums of the array are defined by
$PS(i) = \sum_{k=1}^{i} A(k)$, for $1 \le i \le n$.
Algorithm Sequential-Partial-Sum
Input: Array A(1:n).
Output: Partial sums in PS(1:n).
BEGIN
  PS(1) = A(1);
  For i = 2 to n do
    PS(i) = PS(i-1) + A(i);
  End-For
END.

Parallel Processing of Partial Sums
1st Stage:
• Let S(i, j) denote the value of the jth node, counted from left to right, at level h - i of the binary tree, where h is the height of the binary tree.
• Initially, S(0, j) = A(j). All the values S(i, j) can be computed by the parallel sum algorithm.
2nd Stage:
• The partial sums will be PS(0, 1), PS(0, 2), …, PS(0, n).
• These values are determined by a traversal from the root to the leaves of the binary tree.

Example of Partial Sum
1st Stage: each S(i, j) is determined bottom-up.
[Figure: the array A(i) indexed by i, with the binary tree of S(i, j) values built on top of it.]

2nd Stage: PS(i, j) is evaluated top-down.
[Figure: the binary tree of PS(i, j) values.]

Parallel Algorithm of Partial Sums
Algorithm Parallel-Partial-Sum
Input: Array A(1:n).
Output: Partial sums PS(0, 1:n).
BEGIN
  1st Stage:
  For i = 1 to n do in parallel
    S(0, i) = A(i);
  End-parallel
  p = n;
  For i = 1 to (log n) do
    p = p / 2;
    For j = 1 to p do in parallel
      S(i, j) = S(i-1, 2j) + S(i-1, 2j-1);
    End-parallel
  End-For

  2nd Stage:
  PS(log n, 1) = S(log n, 1);
  p = 1;
  For i = (log n) - 1 down to 0 do
    p = 2p;
    For j = 1 to p do in parallel
      Case
        j = 1: PS(i, j) = S(i, j);
        j even: PS(i, j) = PS(i+1, j/2);
        Else: PS(i, j) = PS(i+1, (j-1)/2) + S(i, j);
      End-Case
    End-parallel
  End-For
END.
O(log n) time, O(n) PEs in the CREW PRAM model.
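The two stages can be traced with a short sequential simulation. This is only a sketch: the function name `parallel_partial_sums` is hypothetical, n is assumed to be a power of 2, the tables use 1-based column indices to mirror S(i, j) and PS(i, j), and each inner loop over j stands in for one parallel step.

```python
def parallel_partial_sums(A):
    """Simulate the bottom-up and top-down stages; return PS(0, 1..n)."""
    n = len(A)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes n is a power of 2")
    h = n.bit_length() - 1  # log2(n), the height of the binary tree
    # Column 0 is unused so that S[i][j] matches S(i, j) in the text.
    S = [[0] * (n + 1) for _ in range(h + 1)]
    PS = [[0] * (n + 1) for _ in range(h + 1)]

    # 1st stage: sums of sibling pairs, level by level towards the root.
    for j in range(1, n + 1):
        S[0][j] = A[j - 1]
    p = n
    for i in range(1, h + 1):
        p //= 2
        for j in range(1, p + 1):  # one parallel step
            S[i][j] = S[i - 1][2 * j - 1] + S[i - 1][2 * j]

    # 2nd stage: prefix sums pushed from the root down to the leaves.
    PS[h][1] = S[h][1]
    p = 1
    for i in range(h - 1, -1, -1):
        p *= 2
        for j in range(1, p + 1):  # one parallel step
            if j == 1:
                PS[i][j] = S[i][j]
            elif j % 2 == 0:
                PS[i][j] = PS[i + 1][j // 2]
            else:
                PS[i][j] = PS[i + 1][(j - 1) // 2] + S[i][j]
    return PS[0][1:]


if __name__ == "__main__":
    print(parallel_partial_sums([1, 2, 3, 4, 5, 6, 7, 8]))
```

In the second stage, the even node j and the odd node j + 1 both read the same PS(i+1, j/2) in the same step, which is where the concurrent read comes from.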

Binomial Coefficients
The binomial coefficient is given by
$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$.
The problem here is to find all the binomial coefficients
$\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$.
They can be represented in the form of a triangle.

They can also be put in the form of a square table. If this two-dimensional array is represented by P, then we observe that
$P(i, j) = \binom{i+j-1}{j}$.

Using the fact that
$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$,
we get P(i, j) = P(i-1, j) + P(i, j-1). Repeating on P(i-1, j), we get
$P(i, j) = P(i, j-1) + P(i-1, j-1) + P(i-2, j)$.
The value of P(i, j) is reached by adding all the cells of the previous column up to the ith row, such that
$P(i, j) = \sum_{k=1}^{i} P(k, j-1)$.
Also, [equation missing].

Example of Binomial Coefficients
Suppose n = 6. Find all values of $\binom{6}{k}$, $0 \le k \le 6$.
[Table: initial values of P(i, 0), with rows i and columns j.]

[Table: values of P(i, 1), with rows i and columns j.]

The lower-left to top-right diagonal entries give the values of $\binom{6}{k}$.
[Table: final values of P(i, j), with rows i and columns j.]

Parallel Algorithm
Algorithm Parallel-Binomial-Coefficients
Input: A positive integer n.
Output: The binomial coefficients $\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$.
BEGIN
  For i = 1 to n+1 do in parallel
    P(i, 0) = 1;
  End-parallel
  For j = 1 to n do
    Find the partial sums of the (j-1)th column entries using the Parallel-Partial-Sum algorithm and store them in the jth column. That is: $P(i, j) = \sum_{k=1}^{i} P(k, j-1)$.
  End-For
  Output results in P(n+1, 0), P(n, 1), P(n-1, 2), …, P(1, n);
END.
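The column-by-column construction can be checked with a few lines of Python. In this sketch, `itertools.accumulate` is used as a sequential stand-in for the Parallel-Partial-Sum algorithm, and the function name `binomial_coefficients` is hypothetical.

```python
from itertools import accumulate


def binomial_coefficients(n):
    """Return [C(n, 0), ..., C(n, n)] from repeated column prefix sums."""
    column = [1] * (n + 1)  # P(1..n+1, 0) = 1
    table = [column]        # table[j][i - 1] holds P(i, j)
    for _ in range(n):
        # Column j is the running sum of column j - 1.
        column = list(accumulate(column))
        table.append(column)
    # Read the anti-diagonal P(n+1, 0), P(n, 1), ..., P(1, n).
    return [table[j][n - j] for j in range(n + 1)]


if __name__ == "__main__":
    print(binomial_coefficients(6))
```

The loop over columns is inherently sequential, so with an O(log n) partial-sum routine per column the total time is O(n log n).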

Complexity Analysis
The algorithm runs in O(n log n) time using O(n) processors. It can be implemented in the CREW PRAM model, because it uses the Partial Sum algorithm. Can it be faster?

Approximate String Matching
This is the problem of string matching that allows errors.
Edit distance: we may delete, insert, or substitute single characters (a substitution replaces a character with a different one) in either string.
Application areas: text retrieval, computational biology, signal processing, pattern recognition.

Example: SURGERY and SURVEY.
Edit distance = 2 (substitute V for G, then delete the second R).
Time complexity: O(nm).

Algorithm
If i = 0: C[0, j] = 0, for 0 ≤ j ≤ n;
If j = 0: C[i, 0] = i, for 0 ≤ i ≤ m;
Otherwise:
  if (X_i = Y_j) then C[i, j] = C[i-1, j-1];
  else C[i, j] = 1 + min(C[i-1, j], C[i, j-1], C[i-1, j-1]);
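The recurrence can be run sequentially to fill the table. This is a sketch under assumptions: the function name `approximate_match_table` is hypothetical, X is taken to be the pattern of length m (rows) and Y the text of length n (columns), and the row-by-row loop order is just one valid order of evaluation.

```python
def approximate_match_table(pattern, text):
    """Fill C[0..m][0..n] with the recurrence for approximate matching."""
    m, n = len(pattern), len(text)
    C = [[0] * (n + 1) for _ in range(m + 1)]  # row 0: C[0][j] = 0
    for i in range(1, m + 1):
        C[i][0] = i                            # column 0: C[i][0] = i
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if pattern[i - 1] == text[j - 1]:
                C[i][j] = C[i - 1][j - 1]
            else:
                C[i][j] = 1 + min(C[i - 1][j], C[i][j - 1], C[i - 1][j - 1])
    return C


if __name__ == "__main__":
    table = approximate_match_table("SURVEY", "SURGERY")
    for row in table:
        print(row)
```

Because row 0 is all zeros, a match may start at any position of the text; the last row then gives, for each end position j, the smallest number of edits over all substrings ending at j.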

Parallel Processing
Strings: SURGERY (columns) and SURVEY (rows). Every cell of row i of the table is labelled with processor P_i:

          S    U    R    G    E    R    Y
     P0   P0   P0   P0   P0   P0   P0   P0
S    P1   P1   P1   P1   P1   P1   P1   P1
U    P2   P2   P2   P2   P2   P2   P2   P2
R    P3   P3   P3   P3   P3   P3   P3   P3
V    P4   P4   P4   P4   P4   P4   P4   P4
E    P5   P5   P5   P5   P5   P5   P5   P5
Y    P6   P6   P6   P6   P6   P6   P6   P6

O(m) time; O(n) PEs in the CREW PRAM model.

Example: SURGERY and SURVEY.
[Figure: worked example for the two strings.]

Euler Circuit
The Euler circuit of a given tree can be represented as a list of directed arcs. In the Euler circuit, whenever we travel along the arc <v, p(v)>, we have just traversed the vertex v, where p(v) is the parent of v.

Example of Euler Circuit
[Figure: an example tree, its Euler circuit as a list of arcs, and the adjacency list of the tree with columns v and adj(v).]

Algorithm of Euler Circuit
Algorithm Parallel-Euler-Circuit
Input: A tree T represented by its adjacency list with some additional pointers.
Output: Successor(<u, v>) for every arc <u, v>.
BEGIN
  For every arc <u, v> do in parallel
    Successor(<u, v>) = <v, w>, where w occurs next to u in the ordered list of vertices adjacent to v. If u appears last in the list of vertices adjacent to v, then w is the first node in the list.
  End-parallel
END.
O(1) time, O(n) PEs in the EREW PRAM model.
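The successor rule can be tried on a small tree. This is a sketch with hypothetical names (`euler_successors`, `euler_circuit`) and a made-up four-vertex tree, not the example from the slides; the dictionary `adj` plays the role of the ordered adjacency lists, and the outer loops stand in for the single parallel step over all arcs.

```python
def euler_successors(adj):
    """Map every arc (u, v) to its successor (v, w) in the Euler circuit."""
    succ = {}
    for v, neighbours in adj.items():
        for idx, u in enumerate(neighbours):
            # w follows u in adj(v), wrapping around to the first entry.
            w = neighbours[(idx + 1) % len(neighbours)]
            succ[(u, v)] = (v, w)
    return succ


def euler_circuit(adj, start):
    """Follow successor pointers from the arc `start` until it recurs."""
    succ = euler_successors(adj)
    circuit = [start]
    while succ[circuit[-1]] != start:
        circuit.append(succ[circuit[-1]])
    return circuit


if __name__ == "__main__":
    # Hypothetical tree: 1 is adjacent to 2 and 3, and 3 is adjacent to 4.
    tree = {1: [2, 3], 2: [1], 3: [1, 4], 4: [3]}
    print(euler_circuit(tree, (1, 2)))
```

Each arc's successor depends only on one adjacency list entry and its neighbour in that list, which is why one PE per arc suffices and the step takes constant time.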

Post Order Traversal Method
The post order traversal method is an order in which to visit the nodes of a tree. The post order traversal of a tree T with root r consists of the post order traversals of the subtrees of r from left to right, followed by the root r.
For the example tree, the post order traversal is: 1, 2, 3, 6, 7, 11, 10, 8, 9, 5, 4.

Post Order Numbering
The post order numbering is the function which gives the rank of each vertex in the post order traversal sequence. For the previous example, the post order numbering, post(v), is given by:

Vertex v   1   2   3   4    5    6   7   8   9   10   11
post(v)    1   2   3   11   10   4   5   8   9   7    6

Steps of Post Order Numbering
Step 1: For every arc <u, v>, if u is the parent of v, then assign the weight 0 to <u, v>; otherwise, assign the weight 1 to <u, v>.
Step 2: Perform the prefix sum of the weights of the arcs, in the list order specified by the successor function of the Euler circuit.
Step 3: For every vertex v other than the root, post(v) is the prefix sum of the arc <v, p(v)>.
Step 4: The post order numbering of the root is n, where n is the number of vertices in the tree.

Post Order Numbering as Prefix Sum
[Figure: prefix sums of the arc weights along the Euler circuit.]

Parallel Algorithm
Algorithm Parallel-Post-Order-Numbering
Input: A tree T with root r represented by its Euler circuit.
Output: For every vertex v, the post order numbering post(v).
BEGIN
  For every arc <u, v> do in parallel
    If u = p(v), assign the weight 0,
    Else assign the weight 1;
  End-parallel
  Find the prefix sum of the list of weights specified by the successor function;
  For every vertex v ≠ r do in parallel
    post(v) = prefix sum of the arc <v, p(v)>;
  End-parallel
  post(r) = n;
END.
O(log n) time, O(n) PEs in the CREW model.
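The whole pipeline can be checked against the post order sequence quoted earlier (1, 2, 3, 6, 7, 11, 10, 8, 9, 5, 4). The sketch below makes several assumptions: the adjacency lists are a guess at a tree consistent with that sequence, since the original figure is not available; the function names are hypothetical; and the circuit walk, the parent computation, and `itertools.accumulate` are sequential stand-ins for the parallel list prefix sum.

```python
from itertools import accumulate


def euler_successors(adj):
    """Successor((u, v)) = (v, w), with w next after u in adj(v), cyclically."""
    succ = {}
    for v, neighbours in adj.items():
        for idx, u in enumerate(neighbours):
            succ[(u, v)] = (v, neighbours[(idx + 1) % len(neighbours)])
    return succ


def post_order_numbering(adj, root):
    """Return {vertex: post(vertex)} from weighted prefix sums on the circuit."""
    if not adj[root]:
        return {root: 1}
    succ = euler_successors(adj)
    first = (root, adj[root][0])
    tour = [first]
    while succ[tour[-1]] != first:
        tour.append(succ[tour[-1]])
    # The first arc that enters a vertex comes from its parent.
    parent = {}
    for u, v in tour:
        if v != root and v not in parent:
            parent[v] = u
    # Weight 0 for arcs <p(v), v>, weight 1 for arcs <v, p(v)>.
    weights = [0 if parent.get(v) == u else 1 for u, v in tour]
    prefix = list(accumulate(weights))
    post = {root: len(adj)}
    for (u, v), total in zip(tour, prefix):
        if parent.get(u) == v:  # arc <u, p(u)>
            post[u] = total
    return post


if __name__ == "__main__":
    # Assumed adjacency lists, consistent with the quoted traversal.
    tree = {1: [4], 2: [4], 3: [4], 4: [1, 2, 3, 5], 5: [4, 6, 7, 8, 9],
            6: [5], 7: [5], 8: [5, 10], 9: [5], 10: [8, 11], 11: [10]}
    post = post_order_numbering(tree, 4)
    print(sorted(post, key=post.get))
```

Sorting the vertices by post(v) reproduces the traversal order, since each upward arc <v, p(v)> is crossed exactly when the subtree of v has been finished.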