
1 Mergesort, Analysis of Algorithms. John von Neumann and ENIAC (1945).

2 Why Does It Matter?

Run time (nanoseconds)               1.3 N³          10 N²          47 N log₂ N     48 N
Time to solve a problem of size
  1,000                              1.3 seconds     10 msec        0.4 msec        0.048 msec
  10,000                             22 minutes      1 second       6 msec          0.48 msec
  100,000                            15 days         1.7 minutes    78 msec         4.8 msec
  million                            41 years        2.8 hours      0.94 seconds    48 msec
  10 million                         41 millennia    1.7 weeks      11 seconds      0.48 seconds
Max size problem solved in one
  second                             920             10,000         1 million       21 million
  minute                             3,600           77,000         49 million      1.3 billion
  hour                               14,000          600,000        2.4 trillion    76 trillion
  day                                41,000          2.9 million    50 trillion     1,800 trillion
N multiplied by 10,
time multiplied by                   1,000           100            10+             10
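To see where one entry comes from: at one operation per nanosecond, the 1.3 N³ algorithm on a problem of size N = 10,000 needs

\[ 1.3 \times (10^4)^3 \text{ ns} \;=\; 1.3 \times 10^{12} \text{ ns} \;\approx\; 1300 \text{ seconds} \;\approx\; 22 \text{ minutes}, \]

which is the corresponding entry in the table.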

3 Orders of Magnitude

Speed (meters per second)   Imperial units       Example
10⁻¹⁰                       1.2 in / decade      Continental drift
10⁻⁸                        1 ft / year          Hair growing
10⁻⁶                        3.4 in / day         Glacier
10⁻⁴                        1.2 ft / hour        Gastro-intestinal tract
10⁻²                        2 ft / minute        Ant
1                           2.2 mi / hour        Human walk
10²                         220 mi / hour        Propeller airplane
10⁴                         370 mi / min         Space shuttle
10⁶                         620 mi / sec         Earth in galactic orbit
10⁸                         62,000 mi / sec      1/3 speed of light

Seconds     Equivalent
1           1 second
10²         1.7 minutes
10³         17 minutes
10⁴         2.8 hours
10⁵         1.1 days
10⁶         1.6 weeks
10⁷         3.8 months
10⁸         3.1 years
10⁹         3.1 decades
10¹⁰        3.1 centuries
10²¹        forever (age of universe)

Powers of 2: 2¹⁰ ≈ thousand, 2²⁰ ≈ million, 2³⁰ ≈ billion.

4 Impact of Better Algorithms

Example 1: N-body simulation.
- Simulate gravitational interactions among N bodies.
  - Physicists want N = # atoms in universe.
- Brute-force method: N² steps.
- Appel (1981): N log N steps, enables new research.

Example 2: Discrete Fourier Transform (DFT).
- Breaks down waveforms (sound) into periodic components.
  - Foundation of signal processing: CD players, JPEG, analyzing astronomical data, etc.
- Grade-school method: N² steps.
- Runge-König (1924), Cooley-Tukey (1965). FFT algorithm: N log N steps, enables new technology.

5 Mergesort

Mergesort (divide-and-conquer):
- Divide array into two halves.

    input    A L G O R I T H M S
    divide   A L G O R   I T H M S

6 Mergesort

Mergesort (divide-and-conquer):
- Divide array into two halves.
- Recursively sort each half.

    input    A L G O R I T H M S
    divide   A L G O R   I T H M S
    sort     A G L O R   H I M S T

7 Mergesort

Mergesort (divide-and-conquer):
- Divide array into two halves.
- Recursively sort each half.
- Merge two halves to make sorted whole.

    input    A L G O R I T H M S
    divide   A L G O R   I T H M S
    sort     A G L O R   H I M S T
    merge    A G H I L M O R S T

8 Mergesort Analysis

How long does mergesort take?
- Bottleneck = merging (and copying).
  - Merging two files of size N/2 requires N comparisons.
- T(N) = comparisons to mergesort N elements.
  - To make the analysis cleaner, assume N is a power of 2.

Claim. T(N) = N log₂ N.
- Note: same number of comparisons for ANY file, even one that is already sorted.
- We'll prove this several different ways to illustrate standard techniques (the recurrence behind the proofs is written out below).
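Written out, the recurrence behind this claim, using the slide's assumptions (N a power of 2, N comparisons to merge, T(1) = 0):

\[ T(N) \;=\; \underbrace{2\,T(N/2)}_{\text{sort two halves}} \;+\; \underbrace{N}_{\text{merge}}, \qquad T(1) = 0. \]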

9 Proof by Picture of Recursion Tree

                                  cost per level
T(N)                                   N
T(N/2)    T(N/2)                       2 (N/2)
T(N/4)    ...       T(N/4)             4 (N/4)
...                                    ...
T(N/2^k)  ...                          2^k (N/2^k)
...                                    ...
T(2)      ...       T(2)               N/2 (2)

The tree has log₂ N levels, each contributing N, for a total of N log₂ N.
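Summing the per-level costs over the log₂ N levels of the recursion tree gives the claimed total:

\[ \sum_{k=0}^{\log_2 N - 1} 2^k \cdot \frac{N}{2^k} \;=\; \sum_{k=0}^{\log_2 N - 1} N \;=\; N \log_2 N. \]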

10 Proof by Telescoping

Claim. T(N) = N log₂ N (when N is a power of 2).
Proof. For N > 1, divide the recurrence through by N and telescope (the steps are sketched below).
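A sketch of the standard telescoping steps, assuming T(1) = 0 and N a power of 2:

\[ \frac{T(N)}{N} \;=\; \frac{T(N/2)}{N/2} + 1 \;=\; \frac{T(N/4)}{N/4} + 1 + 1 \;=\; \cdots \;=\; \frac{T(1)}{1} + \log_2 N \;=\; \log_2 N, \]

so T(N) = N log₂ N.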

11 Mathematical Induction

Mathematical induction.
- Powerful and general proof technique in discrete mathematics.
- To prove a theorem true for all integers N ≥ 0:
  - Base case: prove it is true for N = 0.
  - Induction hypothesis: assume it is true for an arbitrary N.
  - Induction step: show it is true for N + 1.

Claim. 0 + 1 + 2 + 3 + ... + N = N(N+1)/2 for all N ≥ 0.
Proof. (by mathematical induction)
- Base case (N = 0): 0 = 0(0+1)/2.
- Induction hypothesis: assume 0 + 1 + 2 + ... + N = N(N+1)/2.
- Induction step:
  0 + 1 + ... + N + (N+1) = (0 + 1 + ... + N) + (N+1)
                          = N(N+1)/2 + (N+1)
                          = (N+2)(N+1)/2.

12 Proof by Induction

Claim. T(N) = N log₂ N (when N is a power of 2).
Proof. (by induction on N)
- Base case: N = 1, and T(1) = 0 = 1 · log₂ 1.
- Inductive hypothesis: T(N) = N log₂ N.
- Goal: show that T(2N) = 2N log₂ (2N). (The chain of equalities is sketched below.)
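One way to carry out the induction step, using the recurrence T(2N) = 2 T(N) + 2N (sort two halves of size N, then merge with 2N comparisons):

\[ T(2N) \;=\; 2\,T(N) + 2N \;=\; 2N \log_2 N + 2N \;=\; 2N\big(\log_2 (2N) - 1\big) + 2N \;=\; 2N \log_2 (2N). \]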

13 Proof by Induction

What if N is not a power of 2?
- T(N) satisfies the following recurrence (written out below).

Claim. T(N) ≤ N ⌈log₂ N⌉.
Proof. See supplemental slides.
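A sketch of that recurrence for general N (split into halves of size ⌈N/2⌉ and ⌊N/2⌋, plus at most N comparisons to merge):

\[ T(N) \;\le\; \begin{cases} 0 & \text{if } N = 1,\\[2pt] T(\lceil N/2 \rceil) + T(\lfloor N/2 \rfloor) + N & \text{if } N > 1. \end{cases} \]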

14 Computational Complexity

Framework to study efficiency of algorithms. Example: sorting.
- MACHINE MODEL = count fundamental operations.
  - Count number of comparisons.
- UPPER BOUND = algorithm to solve the problem (worst case).
  - N log₂ N from mergesort.
- LOWER BOUND = proof that no algorithm can do better.
  - N log₂ N − N log₂ e.
- OPTIMAL ALGORITHM: lower bound ~ upper bound.
  - Mergesort.

15 Decision Tree

Decision tree for sorting three elements a1, a2, a3 (each internal node is one comparison; each leaf prints one ordering):

a1 < a2?
  YES: a2 < a3?
    YES: print a1, a2, a3
    NO:  a1 < a3?
      YES: print a1, a3, a2
      NO:  print a3, a1, a2
  NO:  a2 < a3?
    YES: a1 < a3?
      YES: print a2, a1, a3
      NO:  print a2, a3, a1
    NO:  print a3, a2, a1
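As an illustration (not from the slides), the decision tree translates directly into straight-line C code; sort3 below is a hypothetical helper that prints three values in sorted order using at most three comparisons, mirroring the tree above.

#include <stdio.h>

/* Print three values in sorted order by walking the decision tree:
   each if/else corresponds to one internal comparison node. */
void sort3(int a1, int a2, int a3)
{
    if (a1 < a2) {
        if (a2 < a3)      printf("%d %d %d\n", a1, a2, a3);
        else if (a1 < a3) printf("%d %d %d\n", a1, a3, a2);
        else              printf("%d %d %d\n", a3, a1, a2);
    } else {
        if (a2 < a3) {
            if (a1 < a3)  printf("%d %d %d\n", a2, a1, a3);
            else          printf("%d %d %d\n", a2, a3, a1);
        } else            printf("%d %d %d\n", a3, a2, a1);
    }
}

int main(void)
{
    sort3(3, 1, 2);   /* prints: 1 2 3 */
    return 0;
}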

16 Comparison-Based Sorting Lower Bound

Theorem. Any comparison-based sorting algorithm must use Ω(N log₂ N) comparisons.
Proof. Worst case dictated by tree height h.
- N! different orderings.
- One (or more) leaves corresponding to each ordering.
- Binary tree with N! leaves must have height h ≥ log₂(N!) ≥ N log₂ N − N log₂ e (by Stirling's formula).

Food for thought. What if we don't use comparisons? Stay tuned for radix sort.
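Spelling out the height bound via Stirling's formula (using N! ≥ (N/e)^N):

\[ h \;\ge\; \log_2 (N!) \;\ge\; \log_2\!\left(\tfrac{N}{e}\right)^{\!N} \;=\; N \log_2 N - N \log_2 e \;=\; \Omega(N \log_2 N). \]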

17 Extra Slides

18 Proof by Induction

Claim. T(N) ≤ N ⌈log₂ N⌉.
Proof. (by induction on N)
- Base case: N = 1, and T(1) = 0.
- Define n1 = ⌊N/2⌋, n2 = ⌈N/2⌉.
- Induction step: assume true for 1, 2, ..., N − 1 (the chain of inequalities is sketched below).
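One standard way to finish the induction step (a sketch; the key fact is ⌈log₂ n2⌉ ≤ ⌈log₂ N⌉ − 1, which holds because n2 = ⌈N/2⌉ ≤ 2^(⌈log₂ N⌉ − 1)):

\[ \begin{aligned} T(N) &\le T(n_1) + T(n_2) + N \\ &\le n_1 \lceil \log_2 n_1 \rceil + n_2 \lceil \log_2 n_2 \rceil + N \\ &\le (n_1 + n_2)\,\lceil \log_2 n_2 \rceil + N \\ &\le N\,(\lceil \log_2 N \rceil - 1) + N \;=\; N \lceil \log_2 N \rceil. \end{aligned} \]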

19 Implementing Mergesort

mergesort (see Sedgewick Program 8.3); uses a scratch array aux.

Item aux[MAXN];

void mergesort(Item a[], int left, int right)
{
    int mid = (right + left) / 2;
    if (right <= left) return;
    mergesort(a, left, mid);
    mergesort(a, mid + 1, right);
    merge(a, left, mid, right);
}

20 Implementing Mergesort

merge (see Sedgewick Program 8.2): copy to temporary array, then merge the two sorted sequences.

void merge(Item a[], int left, int mid, int right)
{
    int i, j, k;

    /* copy to temporary array (second half in reverse order) */
    for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
    for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];

    /* merge the two sorted sequences */
    for (k = left; k <= right; k++)
        if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
        else                          a[k] = aux[j--];
}
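A minimal driver showing how the two routines fit together; the definitions of Item, ITEMless, and MAXN here are illustrative stand-ins for Sedgewick's item interface, not code from the slides.

#include <stdio.h>

/* Illustrative stand-ins for Sedgewick's item interface (assumptions for this sketch). */
typedef int Item;
#define ITEMless(A, B) ((A) < (B))
#define MAXN 1000

Item aux[MAXN];

/* merge and mergesort exactly as on slides 19-20 */
void merge(Item a[], int left, int mid, int right)
{
    int i, j, k;
    for (i = mid+1; i > left; i--) aux[i-1] = a[i-1];
    for (j = mid; j < right; j++) aux[right+mid-j] = a[j+1];
    for (k = left; k <= right; k++)
        if (ITEMless(aux[i], aux[j])) a[k] = aux[i++];
        else a[k] = aux[j--];
}

void mergesort(Item a[], int left, int right)
{
    int mid = (right + left) / 2;
    if (right <= left) return;
    mergesort(a, left, mid);
    mergesort(a, mid + 1, right);
    merge(a, left, mid, right);
}

int main(void)
{
    Item a[10] = { 'A','L','G','O','R','I','T','H','M','S' };
    mergesort(a, 0, 9);                 /* sort indices 0..9 */
    for (int k = 0; k < 10; k++)
        printf("%c", (char) a[k]);      /* prints AGHILMORST */
    printf("\n");
    return 0;
}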

21 Profiling Mergesort Empirically

Profile the merge and mergesort code of the previous two slides on an input of size N = 1,000 (mergesort prof.out).

Striking feature: all the numbers are SMALL!

Number of comparisons:
- Theory: ~ N log₂ N = 9,966
- Actual: 9,976

22 Sorting Analysis Summary

Running time estimates:
- Home PC executes 10⁸ comparisons/second.
- Supercomputer executes 10¹² comparisons/second.

Input size     Insertion sort (N²)          Mergesort (N log N)        Quicksort (N log N)
               home         super           home        super          home        super
thousand       instant      instant         instant     instant        instant     instant
million        2.8 hours    1 second        1 sec       instant        0.3 sec     instant
billion        317 years    1.6 weeks       18 min      instant        6 min       instant

Lesson 1: good algorithms are better than supercomputers.
Lesson 2: great algorithms are better than good ones.

