Design and Analysis of Algorithms


1 Design and Analysis of Algorithms
Lecture # 08 Muhammad Nasir Department of Computer Science COMSATS University Islamabad, Lahore Campus

2 Merge Sort Divide: Split A[p..r] into two subarrays A[p..q] and A[q+1..r], where q is the halfway point of A[p..r]. Conquer: Recursively sort the two subarrays A[p..q] and A[q+1..r]. Combine: Merge the two sorted subarrays A[p..q] and A[q+1..r] to produce a single sorted subarray A[p..r].

3 Merge Sort
MERGE-SORT(A, p, r)
  if p < r then              ⇨ check for base case
    q ← ⌊(p + r)/2⌋          ⇨ Divide
    MERGE-SORT(A, p, q)      ⇨ Conquer
    MERGE-SORT(A, q+1, r)    ⇨ Conquer
    MERGE(A, p, q, r)        ⇨ Combine
Initial call: MERGE-SORT(A, 1, n)
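The pseudocode above can be sketched as runnable Python. This is a minimal translation, not from the slides: it uses 0-based indexing instead of the slide's 1-based A[1..n], and the helper name merge is ours.

```python
def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) in place."""
    left, right = a[p:q + 1], a[q + 1:r + 1]   # copy the two sorted runs
    i = j = 0
    for k in range(p, r + 1):
        # take from left while it has the smaller (or only) remaining item
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1

def merge_sort(a, p, r):
    """Sort a[p..r] (inclusive) in place."""
    if p < r:                      # base case: a single element is sorted
        q = (p + r) // 2           # divide at the halfway point
        merge_sort(a, p, q)        # conquer the left half
        merge_sort(a, q + 1, r)    # conquer the right half
        merge(a, p, q, r)          # combine the two sorted halves

data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]
```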

4 Merge Function Pseudocode
Complexity of Merge Function? O(n)
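The merge pseudocode on this slide is an image in the transcript, so here is a standalone sketch of the MERGE step (our 0-based indexing): it merges a[p..q] with a[q+1..r] in a single pass, so for n = r - p + 1 elements it does O(n) work.

```python
def merge(a, p, q, r):
    """Merge sorted runs a[p..q] and a[q+1..r] into a[p..r]: one O(n) pass."""
    left, right = a[p:q + 1], a[q + 1:r + 1]   # copy the two sorted runs
    i = j = 0
    for k in range(p, r + 1):                  # exactly n = r - p + 1 steps
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1

a = [2, 4, 5, 7, 1, 2, 3, 6]   # two sorted halves back to back
merge(a, 0, 3, 7)
print(a)  # [1, 2, 2, 3, 4, 5, 6, 7]
```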

5 Merge Sort Analysis Let T(n) = running time on a problem of size n.
If the problem size is small enough (say, n ≤ c for some constant c), we have a base case; the brute-force solution takes constant time: Θ(1). Otherwise, suppose we divide into a subproblems, each 1/b the size of the original. Let the time to divide a size-n problem be D(n). There are a subproblems to solve, each of size n/b; each subproblem takes T(n/b) time, so we spend aT(n/b) time solving subproblems. Let the time to combine solutions be C(n).

6 Merge Sort Analysis T(n) = Θ(1) if n ≤ c, and T(n) = aT(n/b) + D(n) + C(n) otherwise. This is called a recurrence relation.
A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs; solving it gives the running time of a recursive algorithm. For merge sort, a = b = 2, D(n) = Θ(1), and C(n) = Θ(n), so T(n) = 2T(n/2) + Θ(n).

7 Methods to Solve a Recurrence Relation
Substitution method: guess a bound, then use mathematical induction to prove the guess correct.
Recursion-tree method: convert the recurrence into a tree whose nodes represent the costs incurred at the various levels of the recursion.
Master method: provides bounds for recurrences of the form T(n) = aT(n/b) + f(n), where a ≥ 1, b > 1, and f(n) is a given function.
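As a worked example (standard material, not on this slide): merge sort's recurrence T(n) = 2T(n/2) + Θ(n) has a = 2, b = 2, and f(n) = Θ(n). Since n^(log_b a) = n^(log_2 2) = n matches f(n), case 2 of the master method applies, giving T(n) = Θ(n log n).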

8 Merge Sort Analysis
[Recursion-tree diagram: level 0 merges n items, level 1 merges two runs of n/2 items, level 2 merges four runs of n/4, and so on down to runs of size 1.]
Each level requires O(n) operations. Tree height: log₂ n. O(n) operations per level and O(log₂ n) levels ⇒ O(n log₂ n).

9 Merge Sort Analysis Worst case: O(n log₂ n). Average case: O(n log₂ n).
Performance is independent of the initial order of the array items. Advantage: merge sort is an extremely fast algorithm. Disadvantage: merge sort requires a second array as large as the original array.

10 Quicksort Quicksort is another divide-and-conquer algorithm. It is based on the idea of partitioning (splitting) the list around a pivot or split value and then sorting each partition.

11 Quicksort Sort an array A[p..r].
Divide: partition the array A into two subarrays A[p..q] and A[q+1..r], such that each element of A[p..q] is smaller than or equal to each element of A[q+1..r]. We need to find the index q that partitions the array.

12 Quicksort Conquer: recursively sort A[p..q] and A[q+1..r] using quicksort. Combine: trivial, since the subarrays are sorted in place; no additional work is required to combine them, and the entire array is now sorted.

13 Quicksort
QUICKSORT(A, p, r)
  if p < r then
    q ← PARTITION(A, p, r)
    QUICKSORT(A, p, q)
    QUICKSORT(A, q+1, r)
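A runnable sketch of this scheme (ours, 0-based): the recursive calls on A[p..q] and A[q+1..r] match a Hoare-style PARTITION, where the returned index q splits the range without placing the pivot at a final position. The helper name hoare_partition is an assumption; the slides only say PARTITION.

```python
def hoare_partition(a, p, r):
    """Rearrange a[p..r] so every element of a[p..q] <= every element of
    a[q+1..r], and return q. The pivot is the first element."""
    pivot = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > pivot:    # scan left past elements on the right side
            j -= 1
        i += 1
        while a[i] < pivot:    # scan right past elements on the left side
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]   # both out of place: swap them
        else:
            return j

def quicksort(a, p, r):
    if p < r:
        q = hoare_partition(a, p, r)
        quicksort(a, p, q)       # note: q, not q - 1, with this partition
        quicksort(a, q + 1, r)

data = [6, 1, 4, 9, 0, 3, 5, 2, 7, 8]
quicksort(data, 0, len(data) - 1)
print(data)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```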

14 Quicksort Choosing PARTITION(): there are different ways to do this, and each has its own advantages and disadvantages.

15 Quicksort First the list is partitioned around a pivot value. The pivot can be chosen from the beginning, end, or middle of the list. [Array diagram; pivot value 5.]

16 Quicksort The pivot is swapped to the last position, and the remaining elements are compared starting at the ends. [Array diagram with low and high indices; pivot value 5.]

17 Quicksort Then the low index moves right until it is at an element that is larger than the pivot value (i.e., it is on the wrong side). [Array diagram; pivot value 5.]

18 Quicksort Then the high index moves left until it is at an element that is smaller than the pivot value (i.e., it is on the wrong side). [Array diagram; pivot value 5.]

19 Quicksort Then the two values are swapped and the index values are updated. [Array diagram; pivot value 5.]

20 Quicksort This continues until the two index values pass each other. [Array diagram; pivot value 5.]

21 Quicksort This continues until the two index values pass each other. [Array diagram; high and low have now crossed.]

22 Quicksort Then the pivot value is swapped into position. [Array diagram; the pivot is now in its final position.]

23 Quicksort Recursively quicksort the two parts: quicksort the left part, then quicksort the right part. [Array diagram.]

24 Partitioning Algorithm
Original input: S = {6, 1, 4, 9, 0, 3, 5, 2, 7, 8}. Pick the first element as pivot. Have two 'iterators', i and j: i starts at the first element and moves forward; j starts at the last element and moves backward. While (i < j): move i to the right till we find a number greater than the pivot; move j to the left till we find a number smaller than the pivot; if (i < j), swap(S[i], S[j]). The effect is to push larger elements to the right and smaller elements to the left. Finally, swap the pivot with S[j], the rightmost element not larger than the pivot.

25 Partitioning Algorithm Illustrated
[Step-by-step diagram: i and j move inward, swapping out-of-place element pairs, until i and j cross; then the pivot is swapped into place.]

26 Partitioning Pseudocode
PARTITION(A, p, r)
  pivot ← A[p]
  i ← p + 1
  j ← r
  while true
    while i ≤ r and A[i] < pivot do i ← i + 1
    while A[j] > pivot do j ← j - 1
    if i ≥ j then break
    swap A[i] ↔ A[j]
    i ← i + 1; j ← j - 1
  swap A[p] ↔ A[j]    ⇨ place the pivot in its final position
  return j
(With this partition the pivot ends at index j in its final position, so the recursive calls are QUICKSORT(A, p, j-1) and QUICKSORT(A, j+1, r).)
What is the running time of PARTITION()? Each element is examined a constant number of times, so PARTITION() runs in O(n) time.
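A runnable sketch of this two-pointer partition (ours, 0-based, pivot = first element): partition returns the pivot's final index, with smaller elements on its left and larger ones on its right.

```python
def partition(a, p, r):
    """Partition a[p..r] around pivot a[p]; return the pivot's final index."""
    pivot = a[p]
    i, j = p + 1, r
    while True:
        while i <= r and a[i] < pivot:   # scan right past small elements
            i += 1
        while a[j] > pivot:              # scan left past large elements
            j -= 1
        if i >= j:                       # pointers have crossed: done
            break
        a[i], a[j] = a[j], a[i]          # both on the wrong side: swap
        i += 1; j -= 1
    a[p], a[j] = a[j], a[p]              # put the pivot in its final slot
    return j

s = [6, 1, 4, 9, 0, 3, 5, 2, 7, 8]
q = partition(s, 0, len(s) - 1)
print(q, s)  # 6 [5, 1, 4, 2, 0, 3, 6, 9, 7, 8]
```

Every element of s[:q] is at most the pivot 6, and every element of s[q+1:] is at least 6, so quicksorting s[p..q-1] and s[q+1..r] finishes the job.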

27 Quick Sort: Best Case Analysis
Assuming that keys are random and uniformly distributed, the best-case running time occurs when the pivot is the median: partition splits the array into two subarrays of size n/2 at each step. [Recursion tree: nodes contain problem sizes n; n/2, n/2; n/4 × 4; n/8 × 8; ...; each level contributes n partition comparisons.] T(n) = 2T(n/2) + cn ⇒ T(n) = cn log n + n = O(n log n).

28 Quick Sort: Worst Case Analysis
Assuming that keys are random and uniformly distributed, the worst-case running time occurs when the pivot is the smallest (or largest) element every time: one of the two subarrays has zero elements at each step, so the height of the recursion tree is n. Total cost:
T(n) = T(n-1) + cn
T(n-1) = T(n-2) + c(n-1)
T(n-2) = T(n-3) + c(n-2)
...
T(2) = T(1) + 2c
Summing these, T(n) = T(1) + c(2 + 3 + ... + n) = O(n²).
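To see that the telescoping sum above is quadratic, we can expand the recurrence numerically. This sketch assumes c = 1 and T(1) = 1, which are our choices, not the slide's.

```python
def T(n, c=1):
    """Iteratively expand T(n) = T(n-1) + c*n with base case T(1) = 1."""
    total = 1                    # T(1) = 1
    for k in range(2, n + 1):    # one c*k term per level of the recursion
        total += c * k
    return total

# The sum 1 + 2 + ... + n telescopes to n(n+1)/2, i.e. O(n^2):
print(T(100))  # 5050, which equals 100*101/2
```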

29 Pivot Selection
Left-most element: if the first element is chosen, there is a possibility of bad partitioning.
Right-most element: if the last element is chosen, there is likewise a possibility of bad partitioning.
Median element: an element median to the left-most and right-most elements: median = (L+R)/2. The choice of a median pivot reduces the chance of a bad partition.
Random element: use a function Random() to get a random pivot every time. Random selection minimizes the chance of bad partitioning.

30 Picking the Pivot How would you pick one?
Strategy 1: pick the first element in S. Works only if the input is random. What if input S is sorted, or even mostly sorted? All the remaining elements would go into either S1 or S2: terrible performance! Why worry about sorted input? Remember, quicksort is recursive, so sub-problems could be sorted; plus, mostly sorted input is quite frequent.

31 Picking the Pivot (contd.)
Strategy 2: pick the pivot randomly. Would usually work well, even for mostly sorted input, unless the random number generator is not quite random; plus, random number generation is an expensive operation.
Strategy 3: median-of-three partitioning. Ideally, the pivot should be the median of the input array S (the element in the middle of the sorted sequence), which would divide the input into two almost equal partitions. Unfortunately, it is harder to calculate the median quickly without sorting first, so we find an approximate median: pivot = median of the left-most, right-most, and center elements of array S. This solves the problem of sorted input.

32 Picking the Pivot (contd.)
Example: Median-of-three Partitioning Let input S = {6, 1, 4, 9, 0, 3, 5, 2, 7, 8} Left = 0 and S[left] = 6 Right = 9 and S[right] = 8 center = (left + right)/2 = 4 and S[center] = 0 Pivot = Median of S[left], S[right], and S[center] = median of 6, 8, and 0 = S[left] = 6
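The example above can be checked with a small sketch (the function name median_of_three is ours):

```python
def median_of_three(s, left, right):
    """Return the median of s[left], s[center], s[right] as the pivot value."""
    center = (left + right) // 2
    return sorted([s[left], s[center], s[right]])[1]   # middle of the three

s = [6, 1, 4, 9, 0, 3, 5, 2, 7, 8]
print(median_of_three(s, 0, len(s) - 1))  # 6: the median of 6, 0, and 8
```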

33 Quick Sort: Final Comments
What happens when the array contains many duplicate elements? Quicksort is not stable. What happens when the size of the array is small? For small arrays (N ≤ 50), insertion sort is faster than quicksort due to quicksort's recursive function-call overhead, so use a hybrid algorithm: quicksort, switching to insertion sort when N ≤ 50. However, quicksort is usually O(n log₂ n), and its constants are so good that it is generally the fastest algorithm known; most real-world sorting is done by quicksort. For optimum efficiency, the pivot must be chosen carefully; median-of-three is a good technique for choosing the pivot. However, no matter what you do, some bad cases can be constructed where quicksort runs in O(n²) time.
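The hybrid idea can be sketched as follows. This is our illustration, not the slide's code: CUTOFF = 16 is an arbitrary choice (the slide suggests around 50), and the random-pivot Hoare partition is one of several reasonable PARTITION choices.

```python
import random

CUTOFF = 16   # switch to insertion sort at or below this size (our choice)

def insertion_sort(a, p, r):
    """Sort the small range a[p..r] in place."""
    for i in range(p + 1, r + 1):
        key, j = a[i], i - 1
        while j >= p and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key

def randomized_partition(a, p, r):
    """Hoare partition with a random pivot moved to the front."""
    k = random.randint(p, r)
    a[p], a[k] = a[k], a[p]          # random pivot guards against sorted input
    pivot = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > pivot:
            j -= 1
        i += 1
        while a[i] < pivot:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j

def hybrid_quicksort(a, p, r):
    if p >= r:
        return
    if r - p + 1 <= CUTOFF:
        insertion_sort(a, p, r)      # small range: insertion sort is faster
    else:
        q = randomized_partition(a, p, r)
        hybrid_quicksort(a, p, q)
        hybrid_quicksort(a, q + 1, r)

data = random.sample(range(1000), 200)
hybrid_quicksort(data, 0, len(data) - 1)
print(data == sorted(data))  # True
```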

34 Improving Quicksort The real liability of quicksort is that it can run in O(n²) time. The worst case arises when the input array is presorted or reverse sorted. The book discusses two solutions: randomize the input array, or pick a random pivot element. How will these solve the problem? By ensuring that no particular input can be chosen to make quicksort run in O(n²) time.

