Published by Hanna Mallard. Modified over 2 years ago.

1
Slides for Parallel Programming Techniques & Applications Using Networked Workstations & Parallel Computers, 2nd ed., by B. Wilkinson & M. Allen, © 2004 Pearson Education Inc. All rights reserved.

Partitioning and Divide-and-Conquer Strategies
Chapter 4

2
Divide and Conquer
Divide the problem into sub-problems of the same form as the larger problem.
Further divide into still smaller sub-problems (usually done by recursion).
Recursive divide and conquer is amenable to parallelization because separate processes can be used for the divided parts.
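
As a minimal sketch of the recursive pattern described on this slide, the function below sums a list by splitting it in half. The name `dc_sum` and the choice of summation as the problem are illustrative assumptions, not taken from the slides; the code runs sequentially and only marks where separate processes could take over.

```python
def dc_sum(numbers):
    """Sum a list by recursively splitting it in half (illustrative sketch)."""
    if len(numbers) == 0:
        return 0
    if len(numbers) == 1:
        return numbers[0]
    mid = len(numbers) // 2
    # The two halves are independent sub-problems of the same form.
    # In a parallel version, each recursive call could be handed to a
    # separate process; here both calls run in the same process.
    left = dc_sum(numbers[:mid])
    right = dc_sum(numbers[mid:])
    return left + right
```

For example, `dc_sum([1, 2, 3, 4, 5, 6, 7, 8])` splits into two lists of four, then four lists of two, and so on, before the partial results are combined on the way back up.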

3
Divide-and-Conquer Using a Hypercube

4
Partial summation
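
The figure for this slide is not available in the transcript. One plausible reading of partial summation, sketched below under that assumption, is that the list is partitioned into p parts, each part is summed independently, and the partial sums are then added. The names `partial_sums` and `partitioned_sum` are hypothetical, and Python threads stand in for the processors.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sums(numbers, p):
    """Split numbers into p contiguous parts and sum each part separately."""
    size = (len(numbers) + p - 1) // p  # ceiling of n/p elements per part
    parts = [numbers[i * size:(i + 1) * size] for i in range(p)]
    # Each part is an independent task; threads stand in for processors.
    with ThreadPoolExecutor(max_workers=p) as pool:
        return list(pool.map(sum, parts))

def partitioned_sum(numbers, p):
    """Combine the p partial sums into the final result."""
    return sum(partial_sums(numbers, p))
```

With the numbers 1 to 8 and p = 4, the parts are [1, 2], [3, 4], [5, 6] and [7, 8].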

5
Bucket sort
The interval is divided into m regions, and one "bucket" is assigned to hold the numbers that fall within each region.
The numbers in each bucket are sorted using a sequential sorting algorithm.
Sequential sorting time complexity: O(n log(n/m)).
Works well if the original numbers are uniformly distributed across a known interval, say 0 to a - 1.
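
A minimal sequential sketch of the idea, assuming integer inputs in the interval 0 to a - 1 and m equal-width buckets. The function name and the use of Python's built-in `sorted` for the per-bucket sort are illustrative choices, not taken from the slides.

```python
def bucket_sort(numbers, m, a):
    """Sort integers in the range 0..a-1 using m equal-width buckets."""
    buckets = [[] for _ in range(m)]
    for x in numbers:
        # Integer arithmetic maps 0..a-1 onto bucket indices 0..m-1.
        buckets[x * m // a].append(x)
    result = []
    for bucket in buckets:
        # Each bucket is sorted with a sequential sorting algorithm.
        result.extend(sorted(bucket))
    return result
```

With a uniform distribution, each bucket receives about n/m numbers, which is where the log(n/m) factor in the sorting cost comes from.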

6
Parallel version of bucket sort
Simple approach: assign one processor for each bucket.
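
The simple approach might be sketched as follows. It assumes that every processor scans the whole list and keeps only the numbers in its own region; the slide does not spell this out, so treat it as an assumption. Threads stand in for processors, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def sort_one_bucket(task):
    """Work done by processor k: select its numbers, then sort them."""
    numbers, k, p, a = task
    # Assumption: processor k examines all n numbers and keeps its region.
    mine = [x for x in numbers if x * p // a == k]
    return sorted(mine)

def simple_parallel_bucket_sort(numbers, p, a):
    """One bucket per processor; integers assumed to lie in 0..a-1."""
    tasks = [(numbers, k, p, a) for k in range(p)]
    with ThreadPoolExecutor(max_workers=p) as pool:
        sorted_buckets = list(pool.map(sort_one_bucket, tasks))
    # Buckets are ordered by region, so concatenation gives a sorted list.
    return [x for bucket in sorted_buckets for x in bucket]
```

In this sketch each processor still touches all n numbers during selection, which is the cost the improved version aims to reduce.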

7
Parallel bucket sort - improved version
Requires all-to-all scatter.
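
A sequential simulation of the improved version, as a sketch only. It assumes the list is first partitioned so that each processor holds about n/p numbers, each processor splits its part into p small buckets, and an all-to-all exchange then delivers small bucket j to processor j. Names are hypothetical and no real message passing takes place.

```python
def improved_parallel_bucket_sort(numbers, p, a):
    """Simulate the three phases for integers in 0..a-1 on p processors."""
    size = (len(numbers) + p - 1) // p  # about n/p numbers per processor
    parts = [numbers[i * size:(i + 1) * size] for i in range(p)]

    # Phase 1: processor i fills small[i][j] with its numbers bound for j.
    small = [[[] for _ in range(p)] for _ in range(p)]
    for i, part in enumerate(parts):
        for x in part:
            small[i][x * p // a].append(x)

    # Phase 2: all-to-all exchange; processor j gathers small[i][j] for all i.
    big = [[x for i in range(p) for x in small[i][j]] for j in range(p)]

    # Phase 3: each processor sorts its own big bucket; concatenate in order.
    return [x for bucket in big for x in sorted(bucket)]
```

Each processor now examines only its own n/p numbers in phase 1 instead of the full list.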

8
The "all-to-all" scatter routine actually transfers the rows of an array to columns: it transposes a matrix.
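
The transpose view can be sketched in a few lines. Here `send[i][j]` is taken to be the block that processor i sends to processor j; this indexing convention and the function name are assumptions made for illustration.

```python
def all_to_all(send):
    """Return recv where recv[j][i] == send[i][j] (a matrix transpose)."""
    p = len(send)
    return [[send[i][j] for i in range(p)] for j in range(p)]
```

For two processors, `all_to_all([["a0", "a1"], ["b0", "b1"]])` gives `[["a0", "b0"], ["a1", "b1"]]`: row i of the input becomes column i of the output.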

9
All-to-All scatter on a 3-dim Hypercube
T_par = p log p

10
Tim's version of All-to-All scatter on a 3-dim Hypercube
(Reading the listings, xij appears to be the item that started on Pj and is destined for Pi.)

Step 1 (P0 exchanges with P1):
P0: x00, x10, x20, x30, x40, x50, x60, x70
P1: x01, x11, x21, x31, x41, x51, x61, x71
P0 sends x10, x30, x50, x70.

Step 2 (P0 exchanges with P2):
P0: x00, x01, x20, x21, x40, x41, x60, x61
P2: x02, x03, x22, x23, x42, x43, x62, x63
P0 sends x20, x21, x60, x61.

Step 3 (P0 exchanges with P4):
P0: x00, x01, x02, x03, x40, x41, x42, x43
P4: x04, x05, x06, x07, x44, x45, x46, x47
P0 sends x40, x41, x42, x43.

T_par = p log p
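
The exchange pattern can be simulated as below. This is a sketch under assumptions: p is a power of two, each item is a tuple `(dest, src)`, and in step d every node swaps with the neighbour whose number differs in bit d, passing on the items whose destination differs from the node in that bit. The function name and the `steps` parameter are hypothetical.

```python
def hypercube_all_to_all(p, steps=None):
    """Simulate all-to-all scatter on a hypercube of p = 2**k nodes.

    Returns held[node], the sorted list of (dest, src) items on each node
    after the given number of steps (all log2(p) steps by default).
    """
    dims = p.bit_length() - 1  # log2(p); assumes p is a power of two
    if steps is None:
        steps = dims
    # Initially node src holds one item for every destination.
    held = [[(dest, src) for dest in range(p)] for src in range(p)]
    for d in range(steps):
        bit = 1 << d
        new_held = [[] for _ in range(p)]
        for node in range(p):
            for dest, src in held[node]:
                if (dest & bit) == (node & bit):
                    new_held[node].append((dest, src))        # keep
                else:
                    new_held[node ^ bit].append((dest, src))  # send to partner
        held = [sorted(items) for items in new_held]
    return held
```

In this simulation each node passes on p/2 items in each of the log p steps, which is consistent with the p log p figure on the slide.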

11
Time Complexity Analysis for Bucket Sort

Sequential Algorithm:
1. Each element is put into one of the m buckets: O(n).
2. Sort the elements in each bucket: m * ((n/m) log(n/m)) = n log(n/m). If m = O(n), then this step takes O(n) time.
Overall: T_seq = O(n). (How is this possible? Answer: the uniform distribution assumption.)

Parallel Algorithm (m = p):
1. Each PE partitions its n/p elements into p = m sub-buckets: O(n/p).
2. Using the all-to-all personalized broadcast algorithm, each PE sends its sub-buckets to the appropriate PEs. An all-to-all scatter of 1-unit messages can be done in O(p log p) time on a p-processor hypercube. Each sub-bucket holds about n/p^2 elements, so this step takes (n/p^2) * p log p = (n/p) log p time.
3. Each PE sorts its own bucket using the sequential bucket sort algorithm. If the distribution is uniform and the n/p elements can be put into n/p buckets, then sorting can be done in O(n/p) time. (Otherwise, it is O((n/p) log(n/p)).)

Total time: T_par = T(1) + T(2) + T(3) = n/p + (n/p) log p + n/p = O((n/p) log p)
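
To make the total concrete, the sketch below evaluates the three terms of the parallel cost for given n and p. It assumes base-2 logarithms and drops all constant factors, so the result is a unit-free operation count for comparison only, not a predicted running time. The function name is hypothetical.

```python
import math

def parallel_bucket_sort_cost(n, p):
    """Evaluate n/p + (n/p) log p + n/p with log taken base 2 (assumed)."""
    partition = n / p                   # step 1: split into p sub-buckets
    scatter = (n / p) * math.log2(p)    # step 2: all-to-all on the hypercube
    local_sort = n / p                  # step 3: sort under uniform assumption
    return partition + scatter + local_sort
```

For n = 1024 and p = 8 the terms are 128, 384 and 128, for a total of 640, with the scatter step dominating.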

12
Next: Pipelining + Prime Number Sieve
