Mergesort
Dale Roberts, Lecturer
Computer Science, IUPUI
E-mail: droberts@cs.iupui.edu
Department of Computer and Information Science, School of Science, IUPUI

Why Does It Matter?
Run time (nanoseconds)                   $1.3 N^3$      $10 N^2$       $47 N \log_2 N$   $48 N$
Time to solve a problem of size
    1,000                                1.3 seconds    10 msec        0.4 msec          0.048 msec
    10,000                               22 minutes     1 second       6 msec            0.48 msec
    100,000                              15 days        1.7 minutes    78 msec           4.8 msec
    1 million                            41 years       2.8 hours      0.94 seconds      48 msec
    10 million                           41 millennia   1.7 weeks      11 seconds        0.48 seconds
Max size problem solved in one
    second                               920            10,000         1 million         21 million
    minute                               3,600          77,000         49 million        1.3 billion
    hour                                 14,000         600,000        2.4 billion       76 billion
    day                                  41,000         2.9 million    50 billion        1,800 billion
N multiplied by 10, time multiplied by   1,000          100            10+               10

Orders of Magnitude
    Meters per second   Imperial units     Example
    $10^{-10}$          1.2 in / decade    Continental drift
    $10^{-8}$           1 ft / year        Hair growing
    $10^{-6}$           3.4 in / day       Glacier
    $10^{-4}$           1.2 ft / hour      Gastro-intestinal tract
    $10^{-2}$           2 ft / minute      Ant
    $1$                 2.2 mi / hour      Human walk
    $10^{2}$            220 mi / hour      Propeller airplane
    $10^{4}$            370 mi / min       Space shuttle
    $10^{6}$            620 mi / sec       Earth in galactic orbit
    $10^{8}$            62,000 mi / sec    1/3 speed of light
    Seconds             Equivalent
    $1$                 1 second
    $10$                10 seconds
    $10^{2}$            1.7 minutes
    $10^{3}$            17 minutes
    $10^{4}$            2.8 hours
    $10^{5}$            1.1 days
    $10^{6}$            1.6 weeks
    $10^{7}$            3.8 months
    $10^{8}$            3.1 years
    $10^{9}$            3.1 decades
    $10^{10}$           3.1 centuries
    ...                 forever
    $10^{21}$           age of universe
    Powers of 2
    $2^{10}$            thousand
    $2^{20}$            million
    $2^{30}$            billion

Impact of Better Algorithms
Example 1: N-body simulation.
    Simulate gravitational interactions among N bodies.
    Physicists want N = number of atoms in the universe.
    Brute-force method: $N^2$ steps.
    Appel (1981): $N \log N$ steps, enables new research.
Example 2: Discrete Fourier Transform (DFT).
    Breaks down waveforms (sound) into periodic components.
    Foundation of signal processing: CD players, JPEG, analyzing astronomical data, etc.
    Grade-school method: $N^2$ steps.
    Runge-König (1924), Cooley-Tukey (1965). FFT algorithm: $N \log N$ steps, enables new technology.

Mergesort
Mergesort (divide-and-conquer):
    Divide array into two halves.
    Recursively sort each half.
    Merge two halves to make sorted whole.
Example:
    input     A L G O R I T H M S
    divide    A L G O R | I T H M S
    sort      A G L O R | H I M S T
    merge     A G H I L M O R S T

Merging
Merge.
    Keep track of smallest element in each sorted half.
    Insert smallest of two elements into auxiliary array.
    Repeat until done.
Trace for the sorted halves A G L O R and H I M S T:
    step   smallest   auxiliary array
    1      A          A
    2      G          A G
    3      H          A G H
    4      I          A G H I
    5      L          A G H I L
    6      M          A G H I L M
    7      O          A G H I L M O
    8      R          A G H I L M O R          (first half exhausted)
    9      S          A G H I L M O R S
    10     T          A G H I L M O R S T      (second half exhausted)

Implementing Mergesort
mergesort (see Sedgewick Program 8.3):
    Item aux[MAXN];                          /* uses scratch array */
    void mergesort(Item a[], int left, int right)
    {
        int mid = (right + left) / 2;
        if (right <= left) return;
        mergesort(a, left, mid);
        mergesort(a, mid + 1, right);
        merge(a, left, mid, right);
    }

Implementing Merge (Idea 0)
    /* Merge sorted a[0..N-1] and sorted b[0..M-1] into c[0..N+M-1]. */
    void mergeAB(Item c[], Item a[], int N, Item b[], int M)
    {
        int i, j, k;
        for (i = 0, j = 0, k = 0; k < N+M; k++)
        {
            if (i == N) { c[k] = b[j++]; continue; }    /* a is exhausted */
            if (j == M) { c[k] = a[i++]; continue; }    /* b is exhausted */
            c[k] = (less(a[i], b[j])) ? a[i++] : b[j++];
        }
    }

Implementing Mergesort
merge (see Sedgewick Program 8.2):
    void merge(Item a[], int left, int mid, int right)
    {
        int i, j, k;
        /* copy to temporary array (second half is copied in reverse order) */
        for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
        for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];
        /* merge two sorted sequences, working inward from both ends of aux */
        for (k = left; k <= right; k++)
            if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
            else a[k] = aux[j--];
    }

Mergesort Demo
The demo is a dynamic representation of the algorithm in action, sorting an array a containing a permutation of the integers 1 through N. For each i, the array element a[i] is depicted as a black dot plotted at position (i, a[i]). Thus, the end result of each sort is a diagonal of black dots going from (1, 1) at the bottom left to (N, N) at the top right. Each time an element is moved, a green dot is left at its old position. Thus the moving black dots give a dynamic representation of the progress of the sort and the green dots give a history of the data-movement cost.
The auxiliary array used in the merging operation is shown to the right of the array a[], going from (N+1, 1) to (2N, 2N).

Computational Complexity
Framework to study efficiency of algorithms. Example = sorting.
    MACHINE MODEL = count fundamental operations (here: number of comparisons).
    UPPER BOUND = algorithm to solve the problem (worst case): $N \log_2 N$ from mergesort.
    LOWER BOUND = proof that no algorithm can do better: $N \log_2 N - N \log_2 e$.
    OPTIMAL ALGORITHM: lower bound ~ upper bound: mergesort.

Decision Tree
Decision tree for sorting three elements a1, a2, a3:
    a1 < a2?
        YES: a2 < a3?
            YES: print a1, a2, a3
            NO:  a1 < a3?
                YES: print a1, a3, a2
                NO:  print a3, a1, a2
        NO:  a2 < a3?
            YES: a1 < a3?
                YES: print a2, a1, a3
                NO:  print a2, a3, a1
            NO:  print a3, a2, a1

Comparison Based Sorting Lower Bound
Theorem. Any comparison-based sorting algorithm must use $\Omega(N \log_2 N)$ comparisons.
Proof.
    Worst case dictated by tree height h.
    N! different orderings.
    One (or more) leaves corresponding to each ordering.
    Binary tree with N! leaves must have height $h \ge \log_2 (N!) \ge N \log_2 N - N \log_2 e$ (by Stirling's formula).
Food for thought. What if we don't use comparisons? Stay tuned for radix sort.

Mergesort Analysis
How long does mergesort take?
    Bottleneck = merging (and copying): merging two files of size N/2 requires N comparisons.
    T(N) = comparisons to mergesort N elements.
    To make the analysis cleaner, assume N is a power of 2.
Claim. $T(N) = N \log_2 N$.
Note: same number of comparisons for ANY file, even one that is already sorted.
We'll prove this several different ways to illustrate standard techniques.

Profiling Mergesort Empirically
Mergesort prof.out:
    void merge(Item a[], int left, int mid, int right)
    {
        int i, j, k;
        for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
        for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];
        for (k = left; k <= right; k++)
            if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
            else a[k] = aux[j--];
    }
    void mergesort(Item a[], int left, int right)
    {
        int mid = (right + left) / 2;
        if (right <= left) return;
        mergesort(a, left, mid);
        mergesort(a, mid+1, right);
        merge(a, left, mid, right);
    }
Striking feature: All numbers SMALL!
Number of comparisons: Theory ~ $N \log_2 N$ = 9,966. Actual = 9,976.

Sorting Analysis Summary
Running time estimates:
    Home PC executes $10^8$ comparisons/second.
    Supercomputer executes $10^{12}$ comparisons/second.
Insertion Sort ($N^2$)
    computer   thousand   million     billion
    home       instant    2.8 hours   317 years
    super      instant    1 second    1.6 weeks
Mergesort ($N \log N$)
    computer   thousand   million     billion
    home       instant    1 sec       18 min
    super      instant    instant     instant
Quicksort ($N \log N$)
    computer   thousand   million     billion
    home       instant    0.3 sec     6 min
    super      instant    instant     instant
Lesson 1: good algorithms are better than supercomputers.
Lesson 2: great algorithms are better than good ones.

Acknowledgements
Sorting methods are discussed in our Sedgewick text. Slides and demos are from our text’s website at princeton.edu. Special thanks to Kevin Wayne for helping to prepare this material.

Extra Slides

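Proof by Picture of Recursion Tree
    level    subproblems              merging cost at this level
    0        $T(N)$                   $N$
    1        2 of $T(N/2)$            $2(N/2) = N$
    2        4 of $T(N/4)$            $4(N/4) = N$
    ...
    k        $2^k$ of $T(N/2^k)$      $2^k (N/2^k) = N$
    ...
    last     $N/2$ of $T(2)$          $(N/2)(2) = N$
There are $\log_2 N$ levels, each costing $N$, for a total of $N \log_2 N$.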

Proof by Telescoping
Claim. $T(N) = N \log_2 N$ (when N is a power of 2).
Proof. For N > 1:
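One way to write out the telescoping argument is sketched below. It assumes the recurrence $T(N) = 2\,T(N/2) + N$ with $T(1) = 0$, which follows from the analysis slide (two half-size sorts plus N comparisons to merge). Dividing by N makes the terms collapse one level at a time:

```latex
\begin{align*}
\frac{T(N)}{N} &= \frac{2\,T(N/2)}{N} + 1 \\
               &= \frac{T(N/2)}{N/2} + 1 \\
               &= \frac{T(N/4)}{N/4} + 1 + 1 \\
               &\;\;\vdots \\
               &= \frac{T(N/N)}{N/N} + \underbrace{1 + \cdots + 1}_{\log_2 N} \\
               &= \log_2 N
\end{align*}
```

The last step uses $T(1) = 0$. Multiplying both sides by N gives $T(N) = N \log_2 N$.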

Mathematical Induction
Mathematical induction. Powerful and general proof technique in discrete mathematics.
To prove a theorem true for all integers $N \ge 0$:
    Base case: prove it to be true for N = 0.
    Induction hypothesis: assume it is true for arbitrary N.
    Induction step: show it is true for N + 1.
Claim: $0 + 1 + 2 + 3 + \dots + N = N(N+1)/2$ for all $N \ge 0$.
Proof (by mathematical induction):
    Base case (N = 0): $0 = 0(0+1)/2$.
    Induction hypothesis: assume $0 + 1 + 2 + \dots + N = N(N+1)/2$.
    Induction step: $0 + 1 + \dots + N + (N+1) = (0 + 1 + \dots + N) + (N+1) = N(N+1)/2 + (N+1) = (N+2)(N+1)/2$.

Proof by Induction
Claim. $T(N) = N \log_2 N$ (when N is a power of 2).
Proof (by induction on N).
    Base case: N = 1.
    Inductive hypothesis: $T(N) = N \log_2 N$.
    Goal: show that $T(2N) = 2N \log_2 (2N)$.
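A sketch of the inductive step, assuming the recurrence $T(2N) = 2\,T(N) + 2N$ with $T(1) = 0$ (two sorts of size N plus 2N comparisons to merge). The base case holds because $1 \cdot \log_2 1 = 0$.

```latex
\begin{align*}
T(2N) &= 2\,T(N) + 2N \\
      &= 2N \log_2 N + 2N \\
      &= 2N \left( \log_2 (2N) - 1 \right) + 2N \\
      &= 2N \log_2 (2N)
\end{align*}
```

The second line applies the inductive hypothesis, and the third uses $\log_2 N = \log_2 (2N) - 1$.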

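Proof by Induction
What if N is not a power of 2? T(N) satisfies the following recurrence:
    $T(N) = 0$ if $N = 1$; otherwise $T(N) = T(\lceil N/2 \rceil) + T(\lfloor N/2 \rfloor) + N$.
Claim. $T(N) \le N \lceil \log_2 N \rceil$.
Proof. See supplemental slides.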

Proof by Induction
Claim. $T(N) \le N \lceil \log_2 N \rceil$.
Proof (by induction on N).
    Base case: N = 1.
    Define $n_1 = \lfloor N/2 \rfloor$, $n_2 = \lceil N/2 \rceil$.
    Induction step: assume true for 1, 2, ..., N - 1.
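The remaining algebra might run as follows. This is a sketch that assumes the recurrence $T(N) = T(n_1) + T(n_2) + N$ for $N > 1$ with $T(1) = 0$, where $n_1 = \lfloor N/2 \rfloor$ and $n_2 = \lceil N/2 \rceil$, so that $n_1 \le n_2$ and $n_1 + n_2 = N$.

```latex
\begin{align*}
T(N) &\le T(n_1) + T(n_2) + N \\
     &\le n_1 \lceil \log_2 n_1 \rceil + n_2 \lceil \log_2 n_2 \rceil + N \\
     &\le n_1 \lceil \log_2 n_2 \rceil + n_2 \lceil \log_2 n_2 \rceil + N \\
     &= N \lceil \log_2 n_2 \rceil + N \\
     &\le N \left( \lceil \log_2 N \rceil - 1 \right) + N \\
     &= N \lceil \log_2 N \rceil
\end{align*}
```

The step from the fourth to the fifth line uses $n_2 = \lceil N/2 \rceil \le 2^{\lceil \log_2 N \rceil} / 2$, which gives $\lceil \log_2 n_2 \rceil \le \lceil \log_2 N \rceil - 1$ for $N \ge 2$.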
