Sorting Algorithms CS 524 – High-Performance Computing.


1 Sorting Algorithms CS 524 – High-Performance Computing

2 Sorting
CS 524 (Au 2004/05) - Asim Karim @ LUMS
- Sorting is the task of arranging an unordered collection (sequence) of elements into monotonically increasing (or decreasing) order
- Sorting transforms an unordered sequence $S = \{a_1, a_2, a_3, \ldots, a_n\}$ into the sequence $S' = \{a'_1, a'_2, a'_3, \ldots, a'_n\}$, where $a'_i \le a'_j$ for $1 \le i \le j \le n$ and $S'$ is a permutation of $S$
- Sorting algorithms can be categorized as internal ($S$ fits in main memory) or external ($S$ does not fit in main memory)
  - We study internal algorithms only
- Sorting algorithms can also be categorized as comparison-based or noncomparison-based
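The definition above can be turned into a small check. This is only an illustrative sketch; the function name `is_valid_sort` is an assumption and does not come from the slides. It tests the two conditions in the definition: the output is in nondecreasing order, and it is a permutation of the input.

```python
from collections import Counter

def is_valid_sort(original, result):
    """Return True if `result` is a sorted permutation of `original`."""
    # Condition 1: a'_i <= a'_j for i <= j (adjacent pairs are enough to check).
    ordered = all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    # Condition 2: S' is a permutation of S (same elements, same multiplicities).
    same_elements = Counter(original) == Counter(result)
    return ordered and same_elements

print(is_valid_sort([3, 1, 2], [1, 2, 3]))  # True
print(is_valid_sort([3, 1, 2], [1, 2, 2]))  # False: not a permutation
```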

3 Data Storage on Parallel Computers
Storage of input and output sequences
- Where? On one processor, or distributed among the processors?
- How? What is the order of the data distribution with respect to the order of the processors?

4 Compare-Exchange on Parallel Computers
- One element per processor: $a_i$ on $P_i$ and $a_j$ on $P_j$
- Compare-exchange between two processors $P_i$ and $P_j$ requires a communication and a comparison operation
- A parallel system with as many processors as there are elements would deliver poor performance. Why?

5 Compare-Split on Parallel Computers (1)

6 Compare-Split on Parallel Computers (2)
- Each processor has $n/p$ elements of the sequence
- Initially, processor $P_i$ has block $A_i$
- After sorting, the blocks of elements are ordered such that $A'_i \le A'_j$ for $i \le j$, and the union of the $A_i$ equals the union of the $A'_i$
- Compare-split
  - Each processor sends its block to the other (each block is already sorted locally)
  - Each processor merges the two blocks of elements
  - Each processor splits the merged elements and retains the appropriate half: the lower-ranked processor keeps the smaller elements, the other keeps the larger elements
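The compare-split steps above can be sketched in a few lines. This is a single-process simulation, not message-passing code: the two "processors" are just two Python lists, and the name `compare_split` is an assumption chosen for illustration.

```python
from heapq import merge

def compare_split(low_block, high_block):
    """Simulate one compare-split between two processors.

    Both blocks are assumed to be sorted already. Returns the block kept by
    the lower-ranked processor and the block kept by the higher-ranked one.
    """
    # Each processor would merge its own block with the received block.
    merged = list(merge(low_block, high_block))
    keep = len(low_block)
    # The lower-ranked processor keeps the smaller part, the other the rest.
    return merged[:keep], merged[keep:]

low, high = compare_split([1, 6, 8, 11, 13], [2, 7, 9, 10, 12])
print(low)   # [1, 2, 6, 7, 8]
print(high)  # [9, 10, 11, 12, 13]
```

Both blocks stay sorted after the operation, so compare-split can be applied repeatedly without re-sorting locally.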

7 Sorting Network (1)
- A sorting network is a specialized interconnection network that can perform many comparisons simultaneously, thus improving sorting performance significantly
- Key component of the sorting network: the comparator
  - Increasing comparator
  - Decreasing comparator
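A comparator takes two inputs and produces them in a fixed order on its two outputs. The sketch below models the two comparator types as plain functions; the function names are assumptions made for illustration, and a real sorting network would wire many such comparators together in columns.

```python
def increasing_comparator(x, y):
    """Output the smaller input first and the larger input second."""
    return (x, y) if x <= y else (y, x)

def decreasing_comparator(x, y):
    """Output the larger input first and the smaller input second."""
    return (x, y) if x >= y else (y, x)

print(increasing_comparator(7, 3))  # (3, 7)
print(decreasing_comparator(7, 3))  # (7, 3)
```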

8 Sorting Network (2)

9 Bubble Sort
- Complexity: $O(n^2)$
- Bubble sort is difficult to parallelize. Why?
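For reference, a minimal sequential bubble sort is sketched below (the slide's own listing did not survive in this transcript, so this is a standard textbook version, not necessarily the one shown in the lecture). Note that within each pass the compare-exchange at position `j + 1` uses the result of the one at position `j`, which hints at why the algorithm resists parallelization.

```python
def bubble_sort(seq):
    """Return a sorted copy of `seq` using bubble sort, O(n^2) comparisons."""
    a = list(seq)
    n = len(a)
    for last in range(n - 1, 0, -1):
        # One pass: the largest remaining element bubbles up to index `last`.
        for j in range(last):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

print(bubble_sort([3, 2, 3, 8, 5, 6, 4, 1]))  # [1, 2, 3, 3, 4, 5, 6, 8]
```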

10 Odd-Even Transposition Sort (1)

11 Odd-Even Transposition Sort (2)

12 Parallel Implementation: $p = n$
- Data partitioning: each processor $P_i$ has one element $a_i$
- Computation and communication: during each phase, the odd- or even-numbered processors perform a compare-exchange with their right neighbors
- Performance
  - On a linear array
  - On a crossbar
  - On a bus
- Not cost optimal
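The one-element-per-processor formulation can be simulated sequentially as follows. This is a sketch under the assumption that list index `i` stands for processor $P_i$; all compare-exchanges inside one phase touch disjoint pairs, so on a real machine they would run at the same time. The function name is illustrative only.

```python
def odd_even_transposition_sort(seq):
    """Return a sorted copy of `seq` using n phases of odd-even transposition."""
    a = list(seq)
    n = len(a)
    for phase in range(n):
        # Alternate between pairs (0,1),(2,3),... and pairs (1,2),(3,4),...
        start = phase % 2
        for i in range(start, n - 1, 2):
            # Compare-exchange between neighbor "processors" i and i + 1.
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(odd_even_transposition_sort([3, 2, 3, 8, 5, 6, 4, 1]))
# [1, 2, 3, 3, 4, 5, 6, 8]
```

With $n$ processors each phase takes constant compute time, so the parallel run time is proportional to the $n$ phases.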

13 Parallel Implementation: $p < n$
- Data partitioning: each processor $P_i$ has $n/p$ elements in the block $A_i$
- Computation and communication: sort $A_i$ locally (using merge sort or quicksort). Then execute $p$ phases ($p/2$ odd and $p/2$ even), performing compare-split operations with the right neighboring processor
- Performance
  - On a linear array
  - On a crossbar
  - On a bus
- Cost optimal on a linear array and a crossbar when $p = O(\log n)$. Not cost optimal on a bus
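The block version replaces each compare-exchange with a compare-split. The following is a single-process simulation that assumes $p$ divides $n$; the names `compare_split` and `block_odd_even_sort` are illustrative, and the built-in `sorted` stands in for the local merge sort or quicksort.

```python
from heapq import merge

def compare_split(low_block, high_block):
    """Merge two sorted blocks; return (smaller part, larger part)."""
    merged = list(merge(low_block, high_block))
    keep = len(low_block)
    return merged[:keep], merged[keep:]

def block_odd_even_sort(seq, p):
    """Sort `seq` on p simulated processors; returns the list of p blocks."""
    n = len(seq)
    if n % p != 0:
        raise ValueError("this sketch assumes p divides n")
    size = n // p
    # Local sort of each block A_i.
    blocks = [sorted(seq[k * size:(k + 1) * size]) for k in range(p)]
    # p phases, alternating between the two neighbor pairings.
    for phase in range(p):
        for i in range(phase % 2, p - 1, 2):
            blocks[i], blocks[i + 1] = compare_split(blocks[i], blocks[i + 1])
    return blocks

print(block_odd_even_sort([9, 1, 8, 2, 7, 3, 6, 4, 5, 0, 11, 10], p=4))
```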

14 Shellsort (1)
- Odd-even transposition sort moves elements one position at a time
  - If a sequence has only a few unordered elements, and they are far from their correct positions, then odd-even (OE) sort will take a long time to sort the sequence
- Shellsort can move elements over longer distances. It has two phases:
  - In the first phase, blocks that are far apart are compare-split
  - In the second phase, an odd-even transposition sort is conducted. This continues as long as blocks are changing positions

15 Shellsort (2)

16 Shellsort (3)
Initially, each processor sorts its block of elements locally
First phase
1. Compare-split $P_i$ ($i < p/2$) with $P_{p-i-1}$ (reverse-order compare-split)
2. The processors are partitioned into two groups: one group has the first $p/2$ processors and the other has the next $p/2$ processors. Compare-split (in reverse order) within each group.
3. Go to 1. Repeat $\log p$ times.
Second phase
- Perform OE sort until no changes occur
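The two phases can be sketched as a single-process simulation. The code assumes $p$ is a power of two and divides $n$; the function names are illustrative and not taken from the lecture. In the second phase the loop runs one pairing of each parity per iteration and stops only when neither pairing changed any block, since a single unchanged phase does not by itself prove the blocks are in order.

```python
from heapq import merge

def compare_split(low_block, high_block):
    """Merge two sorted blocks; return (smaller part, larger part)."""
    merged = list(merge(low_block, high_block))
    keep = len(low_block)
    return merged[:keep], merged[keep:]

def parallel_shellsort(seq, p):
    """Simulate the two-phase parallel shellsort on p processors."""
    size = len(seq) // p  # assumes p is a power of two and divides len(seq)
    blocks = [sorted(seq[k * size:(k + 1) * size]) for k in range(p)]

    # First phase: log p steps of reverse-order compare-splits within groups.
    group = p
    while group > 1:
        for start in range(0, p, group):
            for k in range(group // 2):
                lo, hi = start + k, start + group - 1 - k
                blocks[lo], blocks[hi] = compare_split(blocks[lo], blocks[hi])
        group //= 2

    # Second phase: odd-even compare-splits until nothing changes.
    changed = True
    while changed:
        changed = False
        for parity in (0, 1):
            for i in range(parity, p - 1, 2):
                new_lo, new_hi = compare_split(blocks[i], blocks[i + 1])
                if new_lo != blocks[i]:
                    changed = True
                blocks[i], blocks[i + 1] = new_lo, new_hi
    return blocks

print(parallel_shellsort([9, 1, 8, 2, 7, 3, 6, 4, 5, 0, 11, 10], p=4))
```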

17 Shellsort (4)
Performance
- On a linear array
- On a crossbar
- On a bus

18 Quicksort (1)

19 Quicksort (2)
Recursive divide-and-conquer algorithm with an average complexity of $O(n \log n)$

20 Quicksort (3)
- Partitioning a sequence of length $n$ has a complexity of $O(n)$
- The selection of the pivot significantly affects the overall complexity of quicksort
  - In the worst case, where a sequence of length $n$ is partitioned into subsequences of length 1 and $n-1$, the overall complexity becomes $O(n^2)$
  - On average, the complexity is $O(n \log n)$
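A minimal sequential quicksort is sketched below to make the $O(n)$ partitioning step concrete. It uses the Lomuto partition scheme with the last element as pivot, which is one common choice and not necessarily the one used in the lecture. With this pivot rule an already sorted input triggers the unbalanced worst case described above.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around pivot a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):          # single O(n) scan over the subsequence
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]        # place the pivot between the two parts
    return i

def quicksort(a, lo=0, hi=None):
    """Sort list `a` in place."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        q = partition(a, lo, hi)
        quicksort(a, lo, q - 1)      # elements <= pivot
        quicksort(a, q + 1, hi)      # elements > pivot

data = [3, 2, 1, 5, 8, 4, 3, 7]
quicksort(data)
print(data)  # [1, 2, 3, 3, 4, 5, 7, 8]
```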

21 Parallelizing Quicksort
A naïve formulation
- Start with one process, which does the initial partitioning. Then assign one of the subproblems (the recursion) to another process. Repeat for each subsequence until no further partitioning is possible.
- Not cost-optimal (Why?)
Analysis

22 Message-Passing Parallel Formulation
- Data partitioning: each processor $P_i$ has a block $A_i$ of $n/p$ elements
- Computation and communication
  - Select a pivot
  - Broadcast the pivot to all processors
  - Locally rearrange the block $A_i$ into sub-blocks $S_i$ (smaller than the pivot) and $L_i$ (larger than the pivot)
  - Combine the $S_i$ and $L_i$ from all processors into $S$ and $L$
  - Assign $S$ to one group of processors and $L$ to the other
  - Recursively perform these operations until a sub-block is assigned to one processor only. Then each processor sorts its set locally
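The steps above can be simulated in one process, with a list of blocks standing in for the processors of a group. Several details are assumptions of this sketch rather than statements from the slides: the pivot is the first element of the first nonempty block, elements equal to the pivot go into $S$, processors are divided between $S$ and $L$ in proportion to their sizes, and the helper names are invented for illustration.

```python
def split_evenly(items, parts):
    """Cut `items` into `parts` contiguous chunks of near-equal size."""
    q, r = divmod(len(items), parts)
    chunks, start = [], 0
    for k in range(parts):
        end = start + q + (1 if k < r else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

def parallel_quicksort(blocks):
    """blocks[k] is the block of processor k; returns the final blocks."""
    p = len(blocks)
    if p == 1:
        return [sorted(blocks[0])]           # one processor: sort locally
    total = sum(len(b) for b in blocks)
    if total == 0:
        return [[] for _ in blocks]
    pivot = next(b[0] for b in blocks if b)  # select and "broadcast" a pivot
    small, large = [], []
    for block in blocks:                     # local rearrangement, then combine
        small.extend(x for x in block if x <= pivot)
        large.extend(x for x in block if x > pivot)
    # Give each part a share of the processors, at least one per part.
    p_small = min(p - 1, max(1, round(p * len(small) / total)))
    left = parallel_quicksort(split_evenly(small, p_small))
    right = parallel_quicksort(split_evenly(large, p - p_small))
    return left + right

blocks = [[7, 2, 9], [4, 4, 1], [8, 3, 6], [5, 0, 2]]
print(parallel_quicksort(blocks))
```

Because each recursive call works on a strictly smaller group of processors, the recursion ends after at most $p - 1$ levels even when a pivot splits the data badly; a poor pivot shows up as load imbalance between the final blocks instead.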


