
1 Analysis of Algorithms

2 Time and Space
 to analyze an algorithm means:
  developing a formula for predicting how fast an algorithm is, based on the size of the input (time complexity), and/or
  developing a formula for predicting how much memory an algorithm requires, based on the size of the input (space complexity)
 usually time is our biggest concern
 most algorithms require a fixed amount of space
These slides are based on Dr. David Matuszek’s lectures.

3 What Does “Size of the Input” Mean?
 if we are searching an array, the “size” of the input could be the size of the array
 if we are merging two arrays, the “size” could be the sum of the two array sizes
 if we are computing the nth Fibonacci number, or the nth factorial, the “size” is n
 we choose the “size” to be a parameter that determines the actual time (or space) required
 it is usually obvious what this parameter is
 sometimes we need two or more parameters

4 Characteristic Operations
 in computing time complexity, one good approach is to count characteristic operations
 what a “characteristic operation” is depends on the particular problem
  if searching, it might be comparing two values
  if sorting an array, it might be:
   comparing two values
   swapping the contents of two array locations
   both of the above
 sometimes we just look at how many times the innermost loop is executed
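
As a small illustration (our addition, not from the slides), a hypothetical search method can be instrumented to tally its characteristic operation:

static long comparisons = 0;   // running count of the characteristic operation

static boolean search(int x, int[] a) {
    for (int i = 0; i < a.length; i++) {
        comparisons++;              // count one comparison
        if (x == a[i]) return true;
    }
    return false;
}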

5 Exact Values
 it is sometimes possible, in assembly language, to compute exact time and space requirements
  we know exactly how many bytes and how many cycles each machine instruction takes
 for a problem with a known sequence of steps (factorial, Fibonacci), we can determine how many instructions of each type are required
 however, often the exact sequence of steps cannot be known in advance
  the steps required to sort an array depend on the actual numbers in the array (which we do not know in advance)

6 Higher-Level Languages
 in a higher-level language (such as Java), we do not know how long each operation takes
  which is faster, x < 10 or x <= 9 ?
  we don’t know exactly what the compiler does with this
  the compiler almost certainly optimizes the test anyway (replacing the slower version with the faster one)
 in a higher-level language we cannot do an exact analysis
  our timing analyses will use major oversimplifications
  nevertheless, we can get some very useful results

7 Average, Best, and Worst Cases
 usually we would like to find the average time to perform an algorithm
 however,
  sometimes the “average” isn’t well defined
   example: sorting an “average” array
   time typically depends on how out of order the array is
   how out of order is the “average” unsorted array?
  sometimes finding the average is too difficult
 often we have to be satisfied with finding the worst (longest) time required
  sometimes this is even what we want (say, for time-critical operations)
 the best (fastest) case is seldom of interest

8 Constant Time
 constant time means there is some constant k such that this operation always takes k nanoseconds
 a Java statement takes constant time if:
  it does not include a loop
  it does not include calling a method whose time is unknown or is not a constant
 if a statement involves a choice (if or switch) among operations, each of which takes constant time, we consider the statement to take constant time
  this is consistent with worst-case analysis

9 Linear Time
 we may not be able to predict to the nanosecond how long a Java program will take, but we do know some things about timing:
for (i = 0, j = 1; i < n; i++) {
    j = j * i;
}
 this loop takes time k*n + c, for some constants k and c:
  k: how long it takes to go through the loop once (the time for j = j * i, plus loop overhead)
  n: the number of times through the loop (we can use this as the “size” of the problem)
  c: the time it takes to initialize the loop
 the total time k*n + c is linear in n
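
As a rough empirical check (our addition, not in the original deck), one can time the loop for several values of n and watch the roughly linear growth; exact numbers will vary with the machine and JIT warm-up:

public static void main(String[] args) {
    for (int n = 1_000_000; n <= 8_000_000; n *= 2) {
        long start = System.nanoTime();
        int j = 1;
        for (int i = 0; i < n; i++) {
            j = j * i;   // the loop body from the slide
        }
        long elapsed = System.nanoTime() - start;
        // j is printed so the JIT cannot discard the loop entirely
        System.out.println("n=" + n + "  time=" + elapsed + " ns  (j=" + j + ")");
    }
}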

10 Constant Time vs. Linear Time
 constant time is (usually) better than linear time
 suppose we have two algorithms to solve a task:
  algorithm A takes 5000 time units
  algorithm B takes 100*n time units
 which is better?
  clearly, algorithm B is better if our problem size is small, that is, if n < 50
  algorithm A is better for larger problems, with n > 50
 so B is better on small problems that are quick anyway
 but A is better for large problems, where it matters more
 we usually care most about very large problems
  but not always!
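
To make the crossover concrete, here is a quick sketch (our addition; the cost formulas come from the slide) that evaluates both:

static long costA(long n) { return 5000; }      // algorithm A: constant time
static long costB(long n) { return 100 * n; }   // algorithm B: linear time

public static void main(String[] args) {
    for (long n : new long[] { 10, 50, 100, 1000 }) {
        System.out.println("n=" + n + "  A=" + costA(n) + "  B=" + costB(n));
    }
    // output shows B cheaper for n < 50, a tie at n = 50, A cheaper beyond
}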

11 The Array subset Problem
 Suppose you have two sets, represented as unsorted arrays:
int[] sub = { 7, 1, 3, 2, 5 };
int[] superset = { 8, 4, 7, 1, 2, 3, 9 };
(super is a reserved word in Java, so we call the second array superset)
and you want to test whether every element of the first set (sub) also occurs in the second set (superset):
System.out.println(subset(sub, superset));
 (the answer in this case should be false, because sub contains the integer 5, and superset does not)
 we are going to write method subset and compute its time complexity (how fast it is)
 let’s start with a helper function, member, to test whether one number is in an array

12 member
static boolean member(int x, int[] a) {
    int n = a.length;
    for (int i = 0; i < n; i++) {
        if (x == a[i]) return true;
    }
    return false;
}
 if x is not in a, the loop executes n times, where n = a.length
  this is the worst case
 if x is in a, the loop executes n/2 times on average
 either way, linear time is required: k*n + c

13 subset
static boolean subset(int[] sub, int[] superset) {
    int m = sub.length;
    for (int i = 0; i < m; i++) {
        if (!member(sub[i], superset)) return false;
    }
    return true;
}
 the loop (and the call to member) will execute:
  m = sub.length times, if sub is a subset of superset
   this is the worst case, and the one we’re most interested in
  fewer than sub.length times (but we don’t know how many)
   we would need to figure this out in order to compute average time complexity
 the worst case is a linear number of times through the loop
 but the loop body doesn’t take constant time, since it calls member, which takes linear time
 (a complete, runnable version appears below)
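
Putting the pieces together, here is a minimal, self-contained test harness (our addition; the arrays and expected output come from slide 11):

public class SubsetDemo {
    static boolean member(int x, int[] a) {
        for (int i = 0; i < a.length; i++) {
            if (x == a[i]) return true;
        }
        return false;
    }

    static boolean subset(int[] sub, int[] superset) {
        for (int i = 0; i < sub.length; i++) {
            if (!member(sub[i], superset)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] sub = { 7, 1, 3, 2, 5 };
        int[] superset = { 8, 4, 7, 1, 2, 3, 9 };
        System.out.println(subset(sub, superset));                 // false: 5 is not in superset
        System.out.println(subset(new int[] { 7, 1 }, superset));  // true
    }
}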

14 Analysis of Array subset Algorithm
 we have seen that the loop in subset executes m = sub.length times (in the worst case)
 also, the loop in subset calls member, which executes in time linear in n = superset.length
 hence, the execution time of the array subset method is m*n, along with assorted constants
  we go through the loop in subset m times, calling member each time
  we go through the loop in member n times
 if m and n are similar, this is roughly quadratic, i.e., n²

15 What About the Constants?
 an added constant, f(n) + c, becomes less and less important as n gets larger
 a constant multiplier, k*f(n), does not get less important, but...
  improving k gives a linear speedup (cutting k in half cuts the time required in half)
  improving k is usually accomplished by careful code optimization, not by better algorithms
  we aren’t that concerned with only linear speedups!
 bottom line: forget the constants!

16 Simplifying the Formulae
 throwing out the constants is one of two things we do in analysis of algorithms
  by throwing out constants, we simplify 12n² to just n²
 our timing formula is a polynomial, and may have terms of various orders (constant, linear, quadratic, cubic, etc.)
  we usually discard all but the highest-order term
  we simplify n² + 3n + 5 to just n²

17 Big O Notation
 when we have a polynomial that describes the time requirements of an algorithm, we simplify it by:
  throwing out all but the highest-order term
  throwing out all the constants
 if an algorithm takes 12n³ + 4n² + 8n + 35 time, we simplify this formula to just n³
 we say the algorithm requires O(n³) time
 we call this Big O notation
 (more accurately, it’s Big Θ, but we’ll leave that for Turpin’s class)

18 Big O for subset Algorithm
 recall that, if n is the size of the set, and m is the size of the (possible) subset:
  we go through the loop in subset m times, calling member each time
  we go through the loop in member n times
 hence, the actual running time should be k*(m*n) + c, for some constants k and c
 we say that subset takes O(m*n) time

19 Can We Justify Big O Notation?
 big O notation is a huge simplification; can we justify it?
 it only makes sense for large problem sizes
 for sufficiently large problem sizes, the highest-order term swamps all the rest!
 consider R = x² + 3x + 5 as x varies:

        x              x²       3x   5               R
        0               0        0   5               5
       10             100       30   5             135
      100          10,000      300   5          10,305
    1,000       1,000,000    3,000   5       1,003,005
   10,000     100,000,000   30,000   5     100,030,005
  100,000  10,000,000,000  300,000   5  10,000,300,005
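
The same table can be generated directly; this short sketch (our addition) also prints the fraction of R contributed by the highest-order term:

public static void main(String[] args) {
    for (long x = 10; x <= 100_000; x *= 10) {
        long r = x * x + 3 * x + 5;
        double share = (double) (x * x) / r;  // fraction of R due to the x² term
        System.out.printf("x=%,d  R=%,d  x²/R=%.5f%n", x, r, share);
    }
}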

20 [graph: y = x² + 3x + 5]

21 [graph: y = x² + 3x + 5]

22 Common Time Complexities
BETTER
 O(1)        constant time
 O(log n)    log time
 O(n)        linear time
 O(n log n)  log linear time
 O(n²)       quadratic time
 O(n³)       cubic time
 O(2ⁿ)       exponential time
WORSE
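
To get a feel for these growth rates, the following sketch (our addition) tabulates a few of them for small values of n:

public static void main(String[] args) {
    System.out.printf("%6s %10s %10s %14s%n", "n", "n log n", "n^2", "2^n");
    for (int n : new int[] { 10, 20, 40, 80 }) {
        double nLogN = n * Math.log(n) / Math.log(2);  // log base 2
        System.out.printf("%6d %10.0f %10d %14.3e%n",
                          n, nLogN, (long) n * n, Math.pow(2, n));
    }
}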

23 Simple Sorting Algorithms

24 Bubble Sort
 compare each element (except the last one) with its neighbor to the right
  if they are out of order, swap them
  this puts the largest element at the very end
  the last element is now in the correct and final place
 compare each element (except the last two) with its neighbor to the right
  if they are out of order, swap them
  this puts the second largest element next to last
  the last two elements are now in their correct and final places
 compare each element (except the last three) with its neighbor to the right
 continue as above until you have no unsorted elements on the left

25 Example of Bubble Sort
[step-by-step animation omitted]

26 Code for Bubble Sort
public static void bubbleSort(int[] a) {
    int outer, inner;
    for (outer = a.length - 1; outer > 0; outer--) {  // counting down
        for (inner = 0; inner < outer; inner++) {     // bubbling up
            if (a[inner] > a[inner + 1]) {            // if out of order...
                int temp = a[inner];                  // ...then swap
                a[inner] = a[inner + 1];
                a[inner + 1] = temp;
            }
        }
    }
}

27 Analysis of Bubble Sort
for (outer = a.length - 1; outer > 0; outer--) {
    for (inner = 0; inner < outer; inner++) {
        if (a[inner] > a[inner + 1]) {
            // code for swap omitted
        }
    }
}
 let n = a.length = size of the array
 the outer loop is executed n-1 times (call it n, that’s close enough)
 each time the outer loop is executed, the inner loop is executed
  the inner loop executes n-1 times at first, linearly dropping to just once
  on average, the inner loop executes about n/2 times for each execution of the outer loop
 in the inner loop, the comparison is always done (constant time); the swap might be done (also constant time)
 the result is n * n/2 * k, that is, O(n²/2 * k) = O(n²)
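
The n * n/2 estimate can be confirmed by counting the comparisons directly; this instrumented version (our addition, based on the code from slide 26) returns exactly n*(n-1)/2 for an array of length n:

public static long countBubbleComparisons(int[] a) {
    long comparisons = 0;
    for (int outer = a.length - 1; outer > 0; outer--) {
        for (int inner = 0; inner < outer; inner++) {
            comparisons++;                      // one comparison per inner iteration
            if (a[inner] > a[inner + 1]) {      // swap if out of order
                int temp = a[inner];
                a[inner] = a[inner + 1];
                a[inner + 1] = temp;
            }
        }
    }
    return comparisons;  // (n-1) + (n-2) + ... + 1 = n*(n-1)/2
}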

28 Loop Invariants
 you run a loop in order to change things
 oddly enough, what is usually most important in understanding a loop is finding an invariant: that is, a condition that doesn’t change
 in bubble sort, we put the largest elements at the end, and once we put them there, we don’t move them again
  the variable outer starts at the last index in the array and decreases to 0
  our invariant is: every element to the right of outer is in the correct place
   for all j > outer, if i < j, then a[i] <= a[j]
 when this is combined with the loop exit test, outer == 0, we know that all elements of the array are in the correct place
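
Invariants can be checked mechanically; here is a sketch (our addition) that instruments the bubbleSort from slide 26 with assertions, enabled by running java with the -ea flag:

public static void bubbleSort(int[] a) {
    for (int outer = a.length - 1; outer > 0; outer--) {
        for (int inner = 0; inner < outer; inner++) {
            if (a[inner] > a[inner + 1]) {
                int temp = a[inner];
                a[inner] = a[inner + 1];
                a[inner + 1] = temp;
            }
        }
        // invariant: every element to the right of outer is in its final place
        for (int j = outer; j < a.length - 1; j++) {
            assert a[j] <= a[j + 1] : "right segment not sorted at " + j;
        }
        for (int i = 0; i < outer; i++) {
            assert a[i] <= a[outer] : "a[" + i + "] exceeds a[outer]";
        }
    }
}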

29 Selection Sort
 given an array of length n:
  search elements 0 through n-1 and select the smallest
   swap it with the element in location 0
  search elements 1 through n-1 and select the smallest
   swap it with the element in location 1
  search elements 2 through n-1 and select the smallest
   swap it with the element in location 2
  search elements 3 through n-1 and select the smallest
   swap it with the element in location 3
  continue in this fashion until there’s nothing left to search

30 Example and Analysis of Selection Sort
 selection sort might swap an array element with itself; this is harmless, and not worth checking for
 analysis:
  the outer loop executes n-1 times
  the inner loop executes about n/2 times on average (from n to 2 times)
  work done in the inner loop is constant (compare two array elements, and possibly update min)
  time required is roughly (n-1)*(n/2)
 you should recognize this as O(n²)

31 Code for Selection Sort
public static void selectionSort(int[] a) {
    int outer, inner, min;
    for (outer = 0; outer < a.length - 1; outer++) {
        min = outer;
        for (inner = outer + 1; inner < a.length; inner++) {
            if (a[inner] < a[min]) {
                min = inner;
            }
            // Invariant: for all i,
            // if outer <= i <= inner, then a[min] <= a[i]
        }
        // a[min] is least among a[outer]..a[a.length - 1]
        int temp = a[outer];
        a[outer] = a[min];
        a[min] = temp;
        // Invariant: for all i <= outer,
        // if i < j then a[i] <= a[j]
    }
}

32 Invariants for Selection Sort
 for the inner loop:
  this loop searches through the array, incrementing inner from its initial value of outer+1 up to a.length-1
  as the loop proceeds, min is set to the index of the smallest number found so far
  our invariant is: for all i such that outer <= i <= inner, a[min] <= a[i]
 for the outer (enclosing) loop:
  the loop counts up from outer = 0
  each time through the loop, the minimum remaining value is put in a[outer]
  our invariant is: for all i <= outer, if i < j then a[i] <= a[j]

33 Insertion Sort
 the outer loop of insertion sort is:
for (outer = 1; outer < a.length; outer++) { ... }
 the invariant is that all the elements to the left of outer are sorted with respect to one another:
for all i < outer, j < outer, if i < j then a[i] <= a[j]
 this does not mean they are all in their final correct place; the remaining array elements may need to be inserted
 when we increase outer, a[outer-1] becomes part of the sorted region to its left; we must keep the invariant true by inserting a[outer-1] into its proper place
 this means:
  finding the element’s proper place
  making room for the inserted element (by shifting over other elements)
  inserting the element
 (a code sketch follows below)
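
The deck does not give code for insertion sort, so here is a minimal sketch (our addition) consistent with the outer loop and invariant above; the temp variable matches the diagram on the next slide:

public static void insertionSort(int[] a) {
    for (int outer = 1; outer < a.length; outer++) {
        // invariant: a[0..outer-1] is sorted
        int temp = a[outer];          // the next element to be inserted
        int inner = outer - 1;
        while (inner >= 0 && a[inner] > temp) {
            a[inner + 1] = a[inner];  // shift right to make room
            inner--;
        }
        a[inner + 1] = temp;          // insert into its proper place
    }
}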

34 One Step of Insertion Sort
[diagram: the sorted region, the next element to be inserted (copied into temp), and the sorted elements less than 10 that stay in place]

35 Analysis of Insertion Sort
 we run once through the outer loop, inserting each of n elements; this is a factor of n
 on average, there are n/2 elements already sorted
  the inner loop looks at (and moves) half of these
  this gives a second factor of n/4
 hence, the time required for an insertion sort of an array of n elements is proportional to n²/4
 discarding constants, we find that insertion sort is O(n²)

36 Summary
 bubble sort, selection sort, and insertion sort are all O(n²)
 we can do much better than this with somewhat more complicated sorting algorithms (wait for Jeff’s class)
 within O(n²):
  bubble sort is very slow, and should probably never be used for anything
  selection sort is intermediate in speed
  insertion sort is usually the fastest of the three; in fact, for small arrays (say, 10 or 15 elements), insertion sort is faster than more complicated sorting algorithms
 selection sort and insertion sort are good enough for small arrays

