PRAM Models Advanced Algorithms & Data Structures Lecture Theme 13 Prof. Dr. Th. Ottmann Summer Semester 2006.

2 In the PRAM model, processors communicate by reading from and writing to the shared memory locations. The power of a PRAM depends on the kind of access to the shared memory locations. Classification of the PRAM model

3 In every clock cycle: In the Exclusive Read Exclusive Write (EREW) PRAM, each memory location can be accessed by only one processor. In the Concurrent Read Exclusive Write (CREW) PRAM, multiple processors can read from the same memory location, but only one processor can write to it. Classification of the PRAM model

4 In the Concurrent Read Concurrent Write (CRCW) PRAM, multiple processors can read from or write to the same memory location. Classification of the PRAM model

5 It is easy to allow concurrent reading. However, concurrent writing gives rise to conflicts. If multiple processors write to the same memory location simultaneously, it is not clear what is written to the memory location. Classification of the PRAM model

6 In the Common CRCW PRAM, all the processors must write the same value. In the Arbitrary CRCW PRAM, one of the processors arbitrarily succeeds in writing. In the Priority CRCW PRAM, processors have priorities associated with them and the highest priority processor succeeds in writing. Classification of the PRAM model
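The three write-conflict rules above can be simulated sequentially. The following is a minimal sketch, not part of the lecture; the helper name resolve_write and the convention that a lower processor id means higher priority are assumptions of this sketch.

```python
# Sketch: how one shared memory cell resolves simultaneous writes
# under the three CRCW policies (Common, Arbitrary, Priority).

def resolve_write(requests, policy):
    """requests: list of (processor_id, value) pairs targeting one cell.
    Assumption of this sketch: lower processor_id = higher priority."""
    if not requests:
        return None
    if policy == "common":
        values = {v for _, v in requests}
        # Common CRCW is only defined when all writers agree
        assert len(values) == 1, "Common CRCW requires identical values"
        return values.pop()
    if policy == "arbitrary":
        return requests[0][1]    # any single writer may succeed
    if policy == "priority":
        return min(requests)[1]  # highest-priority (lowest id) processor wins
    raise ValueError(policy)

print(resolve_write([(3, 7), (1, 7), (2, 7)], "common"))    # 7
print(resolve_write([(3, 9), (1, 4), (2, 6)], "priority"))  # 4
```

Note that under the Arbitrary policy any of the submitted values would be a legal outcome; the sketch simply picks the first request.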

7 The EREW PRAM is the weakest and the Priority CRCW PRAM is the strongest PRAM model. The relative powers of the different PRAM models are as follows: EREW ≤ CREW ≤ Common CRCW ≤ Arbitrary CRCW ≤ Priority CRCW. Classification of the PRAM model

8 An algorithm designed for a weaker model can be executed within the same time and work complexities on a stronger model. Classification of the PRAM model

9 We say model A is less powerful than model B if either: the time complexity for solving a problem is asymptotically less in model B than in model A; or, if the time complexities are the same, the processor or work complexity is asymptotically less in model B than in model A. Classification of the PRAM model

10 An algorithm designed for a stronger PRAM model can be simulated on a weaker model either with asymptotically more processors (work) or with asymptotically more time. Classification of the PRAM model

11 Adding n numbers on a PRAM: the numbers are added pairwise along a balanced binary tree, halving the number of partial sums in each of the log n parallel steps.

12 This algorithm works on the EREW PRAM model as there are no read or write conflicts. We will use this algorithm to design a matrix multiplication algorithm on the EREW PRAM. Adding n numbers on a PRAM
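The EREW addition algorithm can be simulated sequentially; each pass of the inner loop below corresponds to one parallel time step in which the active processors work on disjoint cells, so no read or write conflicts occur. The function name and array layout are assumptions of this sketch.

```python
# Sequential simulation of EREW parallel addition: in round t, the
# processor at position i adds the element 2^t positions to its right.
# After log2(n) rounds the total sits in cell 0.

def parallel_sum(a):
    a = list(a)
    n = len(a)                # assume n is a power of two, as on slide 13
    step = 1
    while step < n:
        # all additions in this loop happen in ONE parallel time step,
        # and each touches cells no other "processor" touches
        for i in range(0, n, 2 * step):
            a[i] += a[i + step]
        step *= 2
    return a[0]

print(parallel_sum([1, 2, 3, 4, 5, 6, 7, 8]))  # 36
```

The while loop runs log n times, matching the O(log n) parallel time bound used in the matrix multiplication that follows.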

13 For simplicity, we assume that n = 2^p for some integer p. Matrix multiplication

14 Each c i,j = a i,1 b 1,j + a i,2 b 2,j + … + a i,n b n,j can be computed in parallel. We allocate n processors for computing c i,j. Suppose these processors are P 1, P 2,…,P n. In the first time step, processor P m computes the product a i,m x b m,j. We now have n numbers, and we use the addition algorithm to sum these n numbers in log n time. Matrix multiplication
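A sequential simulation of this scheme might look as follows; it is a sketch only, with each inner pass standing for one parallel step of the n processors assigned to a single entry c i,j.

```python
# Sketch of the CREW matrix multiplication: for each c[i][j], n
# "processors" form the products a[i][m]*b[m][j] in one time step,
# then reduce them with the log-n tree addition from the earlier slide.

def matmul_pram(A, B):
    n = len(A)                        # assume n is a power of two
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # time step 1: processor P_m computes a[i][m] * b[m][j]
            t = [A[i][m] * B[m][j] for m in range(n)]
            # log n further steps: pairwise tree reduction
            step = 1
            while step < n:
                for k in range(0, n, 2 * step):
                    t[k] += t[k + step]
                step *= 2
            C[i][j] = t[0]
    return C

print(matmul_pram([[1, 0], [0, 1]], [[2, 3], [4, 5]]))  # [[2, 3], [4, 5]]
```

In the simulation all n^2 entries are handled one after another, whereas on the PRAM they run concurrently, which is where the O(n^3) processor count comes from.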

15 Computing each c i,j takes n processors and log n time. Since there are n 2 such c i,j s, we need O( n 3 ) processors overall and O(log n ) time. The processor requirement can be reduced to O( n 3 / log n ). Exercise ! Hence, the work complexity is O( n 3 ). Matrix multiplication

16 However, this algorithm requires concurrent read capability. Note that, each element a i,j (and b i,j ) participates in computing n elements from the C matrix. Hence n different processors will try to read each a i,j (and b i,j ) in our algorithm. Matrix multiplication


18 Hence our algorithm runs on the CREW PRAM and we need to avoid the read conflicts to make it run on the EREW PRAM. We will create n copies of each of the elements a i,j (and b i,j ). Then one copy can be used for computing each c i,j. Matrix multiplication

19 Creating n copies of a number in O (log n ) time using O ( n ) processors on the EREW PRAM. In the first step, one processor reads the number and creates a copy. Hence, there are two copies now. In the second step, two processors read these two copies and create four copies. Matrix multiplication

20 Since the number of copies doubles in every step, n copies are created in O(log n ) steps. Though we need n processors, the processor requirement can be reduced to O ( n / log n ). Exercise ! Matrix multiplication
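The doubling scheme can be sketched as below; the function name and list layout are assumptions, and the assertion checks the O(log n) step bound stated above.

```python
# Sketch of doubling broadcast on the EREW PRAM: each parallel step,
# every existing copy is read by exactly one processor and duplicated,
# so the number of copies doubles until n copies exist. No cell is
# read by two processors in the same step.

import math

def make_copies(x, n):
    copies = [x]                  # one copy after the initial read
    steps = 0
    while len(copies) < n:
        # each current copy is read exclusively and written out once more
        copies = copies + copies[: n - len(copies)]
        steps += 1
    assert steps <= math.ceil(math.log2(n))   # O(log n) parallel steps
    return copies

print(make_copies(42, 8))  # [42, 42, 42, 42, 42, 42, 42, 42]
```

With one such broadcast per element of A and B, every processor in the multiplication algorithm gets a private copy to read, which removes the read conflicts discussed on the next slide.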

21 Since there are n 2 elements in the matrix A (and in B ), we need O ( n 3 / log n ) processors and O (log n ) time to create n copies of each element. After this, there are no read conflicts in our algorithm. The overall matrix multiplication algorithm now takes O (log n ) time and O ( n 3 / log n ) processors on the EREW PRAM. Matrix multiplication

22 The memory requirement is of course much higher for the EREW PRAM. Matrix multiplication